Introducing Tezos Cache

Marigold · October 15, 2021, 2:09am

Tezos now has an active in-memory cache mechanism that can store recently used information.

As in any large-scale system, a proper cache can greatly increase Tezos’ efficiency and execution speed. But, even better, the new cache in protocol Hangzhou is fully integrated with the protocol, allowing gas discounts whenever the cache is used. This will drive gas costs down, especially for frequently used contracts.

The Size Limit

In the real world, no resource is unlimited, so a trimming policy of some sort is required to keep the size of the cache reasonable. The protocol Hangzhou uses an LRU (least-recently-used) trimming policy. That is, the most recently used data is kept in memory, up to a size limit. Whenever the size limit is exceeded, the least recently used data in the cache is removed. (Don’t worry, it’s still in the storage!)
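To make the trimming policy concrete, here is a minimal Python sketch of a size-bounded cache that evicts the least recently used entry first. It is only an illustration of the idea, not the protocol's implementation: the class name, the explicit `entry_size` argument, and the sample keys are all assumptions made for this example.

```python
from collections import OrderedDict


class LRUCache:
    """Size-bounded cache that evicts the least recently used entry first."""

    def __init__(self, size_limit):
        self.size_limit = size_limit   # maximum total size, in bytes
        self.size = 0                  # current total size, in bytes
        self.entries = OrderedDict()   # key -> (value, entry_size), least recent first

    def get(self, key):
        if key not in self.entries:
            return None                # miss: the caller falls back to storage
        self.entries.move_to_end(key)  # mark as most recently used
        return self.entries[key][0]

    def put(self, key, value, entry_size):
        if key in self.entries:
            self.size -= self.entries.pop(key)[1]
        self.entries[key] = (value, entry_size)
        self.size += entry_size
        # Trim until the cache fits again; evicted data still lives in storage.
        while self.size > self.size_limit and self.entries:
            _, (_, evicted_size) = self.entries.popitem(last=False)
            self.size -= evicted_size


cache = LRUCache(size_limit=100)
cache.put("KT1-a", "script a", 40)
cache.put("KT1-b", "script b", 40)
cache.get("KT1-a")                  # "KT1-a" is now the most recently used
cache.put("KT1-c", "script c", 40)  # total would be 120, so "KT1-b" is evicted
print(list(cache.entries))          # ['KT1-a', 'KT1-c']
```

Note that reading an entry counts as a use, which is why "KT1-a" survives here even though it was inserted first.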

Persistence Across Blocks

To be understood by the protocol, the cache must be persistent across blocks. In other words, it must be deterministic for every node in each cycle and be repopulatable from the context. To achieve this, the cache is split into two layers:

the in-memory cache, which lives in memory; and
the in-storage cache, which is kept in storage so that it can be passed around the network as part of the context.

The in-memory cache holds all the cached data in memory and can be used during block production. The in-storage cache, by contrast, avoids duplicating data that is already in storage: it stores only the unique keys that map the data in memory to the data in storage. In this way, the in-storage cache contains only the minimal amount of data needed to rebuild the in-memory cache from disk.

Whenever a block is produced, the in-memory cache is finalized and a portion of it is written to storage as the latest in-storage cache. The in-storage cache is then sent to the Tezos network as part of the consensus context, so that every Tezos node receives it and can repopulate an in-memory cache.

Between blocks, a Tezos node either already has the in-memory cache from producing the latest block, or can use the latest in-storage cache to repopulate it. In this way, the cache persists across blocks without consuming too much storage space.
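The two layers can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the protocol's data structures: the `storage` dictionary stands in for the context on disk, and `load`, `snapshot` and `repopulate` are hypothetical helper names chosen for this example.

```python
# Hypothetical stand-in for the on-disk context: key -> raw contract code.
storage = {"KT1-a": "code a", "KT1-b": "code b", "KT1-c": "code c"}


def load(key):
    """Stand-in for the expensive step (reading and parsing from storage)."""
    return ("parsed", storage[key])


def snapshot(in_memory):
    """In-storage layer: keep only the keys, in order, not the cached values."""
    return list(in_memory)


def repopulate(in_storage_keys):
    """Rebuild the in-memory layer from the keys plus the data already in storage."""
    return {key: load(key) for key in in_storage_keys}


# The block producer ends up with some in-memory cache...
producer_cache = {"KT1-c": load("KT1-c"), "KT1-a": load("KT1-a")}
# ...and only the keys travel with the context.
keys = snapshot(producer_cache)
# Any other node can rebuild the same in-memory cache from those keys.
other_node_cache = repopulate(keys)
print(other_node_cache == producer_cache)  # True
```

Because only keys are stored, the in-storage layer stays small no matter how large the cached values are.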

The Contract Cache

A deployed, well-typed contract needs to be parsed and type-checked whenever it is called. This costs both time and gas, which makes caching the result worthwhile. In fact, just caching the parsed form of a contract can reduce its execution time and thus increase the overall TPS (transactions per second). In addition to the generalized cache infrastructure described above, the protocol Hangzhou also includes a powerful contract cache.

Conclusion

Currently, the new cache mechanism in the protocol Hangzhou can store the most important Tezos object, the Michelson contract, and therefore can improve Tezos performance in both execution throughput and gas consumption. Additionally, the generalized implementation of cache paves the way for caching other frequently-used types of information such as bigmaps and tickets in the future.

For contract developers, there are handy new RPCs for inspecting the cached contracts.

context/cache/contracts/all returns the list of contracts in the cache.
context/cache/contracts/size returns an overapproximation of the cache size (in bytes).
context/cache/contracts/size_limit returns the maximal cache size (in bytes). When this size is reached, the cache removes the least recently used entries.
context/cache/contract_rank gives the number of contracts that are less recently used than the one provided as argument.
scripts/script_size gives the size of the script and its storage when stored in the cache.
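As a rough illustration of how the read-only RPCs above might be queried, here is a small Python sketch. It rests on several assumptions that should be checked against your node's RPC reference: the node address is a made-up local one, the paths are assumed to live under the `/chains/main/blocks/head` prefix, and the three `context/cache/contracts/*` RPCs are assumed to answer plain GET requests with JSON. The helper names are hypothetical.

```python
import json
from urllib.request import urlopen

NODE = "http://localhost:8732"             # hypothetical local node address
BLOCK_PREFIX = "/chains/main/blocks/head"  # assumed prefix for block-relative RPCs


def cache_rpc_url(name, node=NODE):
    """Build the URL for one of the context/cache/contracts/* RPCs."""
    return f"{node}{BLOCK_PREFIX}/context/cache/contracts/{name}"


def cache_rpc(name, node=NODE):
    """GET the RPC and decode the JSON reply (needs a reachable node)."""
    with urlopen(cache_rpc_url(name, node)) as response:
        return json.load(response)


print(cache_rpc_url("size_limit"))
# http://localhost:8732/chains/main/blocks/head/context/cache/contracts/size_limit
```

With a running node, `cache_rpc("size")` and `cache_rpc("size_limit")` could then be compared to see how full the cache is.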

Protocol developers and interested friends can read the Hangzhou changelog to learn more about the current cache design.