Lazy Michelson storage

Michelson storage is a typed value which is serialized as is and stored on disk, then deserialized on demand when the contract is called. The downside is that this takes time linear in the size of the storage, even when only a small part of the storage is needed. This is largely addressed through the use of big_map which, unlike any other Michelson type, is not serialized into a single blob but is instead represented directly within the context. The values stored in a big_map are fetched from the context and deserialized as they are used. One might say that big_maps are deserialized lazily while the rest of the storage is deserialized eagerly.

This could be generalized to the rest of Michelson storage by introducing a lazy constructor. For instance, the type:

lazy (pair int (lazy string))

would represent a value that isn’t deserialized at all until the storage is accessed. If the contract never accesses the storage, it never needs to be deserialized.

Assuming the storage is accessed, the pair constructor would be deserialized, along with the int that constitutes the first element of the pair. However, the second element of the pair, the string, would not be deserialized until it is accessed, since it is marked as lazy.
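
To make the behaviour above concrete, here is a minimal Python sketch. It is only an illustration of the idea, not how the protocol would implement it: the Lazy class, the decode and lazy helpers, the decode_count counter and the use of JSON as a stand-in serialization format are all assumptions made up for this example.

```python
import json


class Lazy:
    """Hypothetical wrapper: keeps serialized bytes and decodes them on first access."""

    decode_count = 0  # number of decodes performed so far, for illustration only

    def __init__(self, raw: bytes):
        self._raw = raw
        self._decoded = False
        self._value = None

    def force(self):
        """Decode the underlying bytes once, then return the cached value."""
        if not self._decoded:
            Lazy.decode_count += 1
            self._value = decode(json.loads(self._raw))
            self._decoded = True
        return self._value


def decode(node):
    """Decode eagerly, but stop at any sub-value that is marked lazy."""
    if isinstance(node, dict) and "lazy" in node:
        # Leave the sub-value in serialized form until someone forces it.
        return Lazy(node["lazy"].encode())
    if isinstance(node, list):
        return tuple(decode(item) for item in node)
    return node


def lazy(value) -> dict:
    """Serialize `value` on its own and mark the result as lazy."""
    return {"lazy": json.dumps(value)}


# Stored form of a value of type: lazy (pair int (lazy string))
storage = Lazy(json.dumps([42, lazy("a long string")]).encode())
```

Nothing is decoded when storage is created. Calling storage.force() decodes the pair and its int, but the second component comes back as another Lazy; only calling force() on that component decodes the string.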

The lazy keyword would only exist in the storage’s type definition, not in Michelson code. As such, it would be completely transparent for scripts.

If we have such a keyword, why not use it everywhere? Simply because it is less space-efficient. Writing a serialized structure in one place in the context takes far less space than writing its different pieces in different places in the context. It may also cost more gas to retrieve each individual part of the storage than to retrieve a single blob in one go. Therefore, the lazy keyword should be reserved for parts of the storage which can grow meaningfully large but which, for some reason, do not map well to a big_map structure.
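
The trade-off can be sketched with a toy cost model in Python. Everything here is an assumption for illustration: ENTRY_OVERHEAD is an arbitrary placeholder for the fixed cost of one context entry, JSON stands in for the real serialization, and the field values are made up. It says nothing about actual storage or gas costs.

```python
import json

# Assumed fixed cost, in bytes, of one context entry (key, hash, tree node).
# The value 32 is an arbitrary placeholder, not a measured figure.
ENTRY_OVERHEAD = 32


def written(parts):
    """Bytes written when each element of `parts` becomes its own context entry."""
    return sum(ENTRY_OVERHEAD + len(part) for part in parts)


fields = [7, "alice", "x" * 10_000]  # two small fields and one large one

# Layout A: the whole storage serialized as a single blob.
blob = [json.dumps(fields).encode()]
# Layout B: every field stored separately, as if each one were marked lazy.
split = [json.dumps(field).encode() for field in fields]

blob_total = written(blob)
split_total = written(split)

# Bytes that must be read and decoded to access only the first field.
blob_read = len(blob[0])
split_read = len(split[0])
```

Under these assumptions the split layout writes more bytes in total, because it pays the per-entry overhead three times, but it reads far fewer bytes when only the small first field is needed. Marking only the large field as lazy would sit between the two.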


I have just opened a similar feature request here:
