Michelson REPL in a Jupyter notebook

m-kus · March 12, 2020, 2:21pm

We are glad to present a new addition to our developer toolset: a Read-Eval-Print-Loop (REPL) environment for Michelson, the native smart contract language of Tezos. The product is built on Jupyter, a popular data-science tool, and on PyTezos, our Python SDK for Tezos. Combining these two ingredients made it possible to implement the REPL in a short time. Note that our solution is based on a reimplementation of the Michelson interpreter, and its correctness is not guaranteed, so we do not recommend using this tool for critical tasks.

Acknowledgements

We want to thank the Tezos Foundation for supporting this project, and Nomadic Labs for the new structured Michelson documentation and unit test set, which saved us a lot of time and effort.

What is the purpose?

We are known as Jupyter notebook evangelists and advocate for the interactive approach in general. Still, there are several areas for which it is best suited.

Prototyping

Some people use TDD, and some begin with interfaces and high-level abstractions. I personally prefer to make a few sketches before writing any production code. A Jupyter notebook allows extremely fast prototyping in practically any language (including compiled ones) right in your browser. Michelson fits this concept very well once you add a few helper commands, which we did.

Learning

In our experience, the toughest thing in Michelson is understanding what you did wrong when the type checker is yelling at you. Several tools can ease your life, such as the IntelliJ and Emacs plugins and the Try-Michelson playground, which show you how the stack type changes. However, you don’t see the whole picture. In addition, some Michelson instructions are quite complicated, and it’s almost impossible to keep all the stack transformations in your head.

That’s why we decided to focus on stack visualization and verbose logging. We believe that this “execute and check the result” approach is a perfect way to organize the learning process, and can help to achieve better results in a shorter time.

Future plans

Together with our integration testing engine, contract deployment & interaction SDK, and chain analysis tooling (all of which are also adapted for use in Jupyter), we can now cover the whole lifecycle of a smart contract. Our goal is to build an express Tezos developer course from a series of interactive notebooks, which can serve as learning material, as boilerplate for hackathon attendees, or simply as documentation.

REPL features

Although Michelson is a purely functional language, several concepts in its virtual machine are tightly coupled with the on-chain execution environment. Nevertheless, we tried to reproduce the behavior of the reference interpreter as closely as possible without complicating the interface.

Debug helpers

In the Michelson REPL we introduced DUMP, a helper for visualizing the whole stack, and PRINT, a printf-like helper that uses placeholders to return only the specified stack elements.
Moreover, detailed logging of all operations is enabled by default: each instruction is broken down into a sequence of atomic stack operations. Logging can be disabled using the DEBUG switch.

To clear the whole stack you can use the DROP_ALL helper.
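
The debug helpers described above can be pictured with a small Python model. This is only an illustrative sketch, not the REPL's implementation: the class name ToyStack, its method names, and the {0}-style placeholder syntax are assumptions made for the example and may differ from what PRINT actually accepts.

```python
import re


class ToyStack:
    """Illustrative stack model (hypothetical); index 0 is the top element."""

    def __init__(self):
        self.items = []

    def push(self, value):
        # New elements go on top of the stack.
        self.items.insert(0, value)

    def dump(self):
        # Analogue of DUMP: return every element, top first.
        return list(self.items)

    def print_items(self, template):
        # Analogue of PRINT: replace each {i} with the i-th element from the top.
        # The {i} placeholder syntax is an assumption of this sketch.
        return re.sub(
            r"\{(\d+)\}",
            lambda match: str(self.items[int(match.group(1))]),
            template,
        )

    def drop_all(self):
        # Analogue of DROP_ALL: clear the whole stack.
        self.items.clear()


stack = ToyStack()
stack.push(1)
stack.push("hello")
line = stack.print_items("top={0}, next={1}")
```

The point of the placeholder form is that you can inspect a couple of elements deep in the stack without printing everything.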

You can also load Michelson code from a local file using the INCLUDE statement.

Blockchain-agnostic instructions

All Michelson instructions that only read and write the stack work as usual. This includes control structures, stack manipulation, arithmetic operations, boolean operations, cryptographic operations and operations on data structures (except big maps).

Dealing with Big Maps

Working with big maps is a bit tricky, as they are represented on the stack as an integer pointer rather than an array of key-value pairs. When you call EMPTY_BIG_MAP, a new temporary big map (one with a negative index) is allocated. You can use the MEM, GET, and UPDATE instructions as usual, but to see the big map state, call the BIG_MAP_DIFF helper while the big map is on top of the stack.
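
To make the pointer idea concrete, here is a minimal Python sketch of how such a context could be modelled. It is not the REPL's code: ToyContext and its methods are hypothetical, and starting temporary pointers at -1 and counting down is an assumption of the example.

```python
class ToyContext:
    """Illustrative model (hypothetical): big maps live outside the stack."""

    def __init__(self):
        self.big_maps = {}      # pointer -> dict of key/value pairs
        self.next_temp_id = -1  # assumption: temporary pointers are -1, -2, ...

    def empty_big_map(self):
        # Analogue of EMPTY_BIG_MAP: allocate a temporary map, return its pointer.
        pointer = self.next_temp_id
        self.next_temp_id -= 1
        self.big_maps[pointer] = {}
        return pointer

    def update(self, pointer, key, value):
        # Analogue of UPDATE: a value of None removes the key.
        if value is None:
            self.big_maps[pointer].pop(key, None)
        else:
            self.big_maps[pointer][key] = value
        return pointer

    def get(self, pointer, key):
        # Analogue of GET: None when the key is absent.
        return self.big_maps[pointer].get(key)

    def mem(self, pointer, key):
        # Analogue of MEM: membership test.
        return key in self.big_maps[pointer]

    def big_map_diff(self, pointer):
        # Analogue of BIG_MAP_DIFF: show the contents behind the pointer.
        return sorted(self.big_maps[pointer].items())


context = ToyContext()
pointer = context.empty_big_map()  # only this integer would sit on the stack
context.update(pointer, "a", 1)
```

Because the stack only holds the integer, a plain stack dump cannot show the entries, which is why a separate helper is needed to look behind the pointer.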

Mocking blockchain data

Values that are pushed onto the stack with the following instructions can be overwritten with the PATCH helper: AMOUNT, BALANCE, CHAIN_ID, NOW, SENDER, SOURCE.
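
As a rough mental model, patching can be thought of as a layer of overrides on top of default values. The Python sketch below illustrates that idea only; ToyEnvironment, its methods, and the default values are hypothetical and say nothing about how PATCH is implemented or what its syntax is.

```python
# The six instructions the post lists as patchable.
PATCHABLE = {"AMOUNT", "BALANCE", "CHAIN_ID", "NOW", "SENDER", "SOURCE"}


class ToyEnvironment:
    """Illustrative model (hypothetical): overrides layered over defaults."""

    def __init__(self, defaults):
        self.defaults = dict(defaults)
        self.patches = {}

    def patch(self, name, value):
        # Analogue of PATCH: only the listed instructions can be overridden.
        if name not in PATCHABLE:
            raise ValueError(f"{name} cannot be patched")
        self.patches[name] = value

    def read(self, name):
        # What the instruction would push: the patched value if present.
        return self.patches.get(name, self.defaults[name])


# Placeholder defaults chosen for the example, not the REPL's real defaults.
environment = ToyEnvironment({"AMOUNT": 0, "NOW": 0})
environment.patch("AMOUNT", 100)
```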

By default, the CONTRACT instruction does not perform type checking when casting.

The SELF instruction uses a dummy address (the same as the reference interpreter) and requires the parameter section to be defined.

Internal operations

SET_DELEGATE just pushes the content of an internal operation onto the stack.

CREATE_CONTRACT and TRANSFER_TOKENS also decrease BALANCE if called with a non-zero amount. CREATE_CONTRACT calculates the originated contract address the same way the reference interpreter does.
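
The balance side effect can be sketched in a few lines of Python. This is an illustrative model under stated assumptions: ToyAccount, the dictionary field names, and the destination string are hypothetical, and the sketch does not check for insufficient balance.

```python
class ToyAccount:
    """Illustrative model (hypothetical) of the balance side effect."""

    def __init__(self, balance):
        self.balance = balance

    def transfer_tokens(self, parameter, amount, destination):
        # Analogue of TRANSFER_TOKENS: a non-zero amount lowers BALANCE.
        self.balance -= amount
        # The operation content is only returned (pushed), never applied.
        return {
            "kind": "transaction",
            "amount": amount,
            "destination": destination,
            "parameter": parameter,
        }


account = ToyAccount(balance=1000)
# "placeholder-address" is a made-up destination for the example.
operation = account.transfer_tokens(None, 300, "placeholder-address")
```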

Unlike in the real environment, these operations will not be applied, and accordingly, the metadata section with the results will not appear.

Simulating contract execution

On-chain, a contract execution starts by pushing a pair of parameter and storage values onto the stack. It ends by reading the list of spawned operations and the resulting storage, allocating big maps, and generating the big map diff.

In our REPL, we tried to reproduce this workflow and introduced several helpers that determine the start and the end of the contract execution.

First of all, you need to define parameter and storage types. Optionally, you can also define the code section. Then you can either use the RUN helper (if code is defined) or BEGIN / COMMIT statements. RUN and COMMIT return a list of spawned operations, resulting storage, and big map diff. As a side effect, you can reuse allocated big maps and thus emulate edge cases such as copy and remove.
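
The workflow above can be sketched as a toy Python model. It only illustrates the shape of the begin/commit cycle: the functions begin, commit, and run and the sample contract are hypothetical stand-ins, and big map diffs are left out for brevity.

```python
def begin(stack, parameter, storage):
    # Analogue of BEGIN: push the (parameter, storage) pair.
    stack.append((parameter, storage))


def commit(stack):
    # Analogue of COMMIT: expect a single (operations, storage) pair.
    if len(stack) != 1:
        raise ValueError("expected exactly one element on the stack")
    operations, storage = stack.pop()
    return operations, storage


def run(code, parameter, storage):
    # Analogue of RUN: begin, execute the code section, commit.
    stack = []
    begin(stack, parameter, storage)
    code(stack)
    return commit(stack)


def add_to_storage(stack):
    # Hypothetical contract body: add the parameter to the storage
    # and emit no operations.
    parameter, storage = stack.pop()
    stack.append(([], parameter + storage))


result = run(add_to_storage, 2, 40)
```

Splitting the cycle into begin and commit is what lets you execute the body step by step across several cells instead of all at once.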

Accessing on-chain data

Sometimes it is convenient to access the blockchain data right from the notebook. The RESET helper allows us to specify the network we should bind to. After that, the behavior of the following instructions changes: CHAIN_ID, NOW, and CONTRACT (type checking is enabled). The INCLUDE helper can now also load code from the blockchain.

The coolest thing is that you can now access real big maps by pointer, right from your pseudo-contract. If you load the contract source from the network, a special variable Current is initialized with the current contract storage.

Jupyter workflow

The Michelson REPL keeps an internal context that includes the stack, patched blockchain instructions, allocated big maps, the origination index, and several more variables. This context is shared across all cells. It can be reset using the RESET helper or by restarting the kernel. When a runtime error occurs during cell execution, the context is rolled back to its previous state. This can be a bit counterintuitive, because stack-language REPLs usually drop the whole stack. On the other hand, it is handy because you can treat a cell as an atomic transaction.
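
The rollback behavior can be illustrated with a snapshot-and-restore pattern in Python. This is a sketch of the idea only, not the kernel's code: ToyKernel, execute_cell, and the instruction helpers are hypothetical names, and a deep copy is just one simple way to take a snapshot.

```python
import copy


class ToyKernel:
    """Illustrative model (hypothetical): each cell runs as a transaction."""

    def __init__(self):
        self.context = {"stack": [], "big_maps": {}}

    def execute_cell(self, instructions):
        # Snapshot the shared context before running the cell.
        snapshot = copy.deepcopy(self.context)
        try:
            for instruction in instructions:
                instruction(self.context)
        except Exception:
            # On a runtime error, restore the state from before the cell.
            self.context = snapshot
            return False
        return True


def push(value):
    # Build an instruction that pushes a value on top of the stack.
    def instruction(context):
        context["stack"].insert(0, value)
    return instruction


def fail(context):
    # Stand-in for an instruction that raises a runtime error.
    raise RuntimeError("runtime error")


kernel = ToyKernel()
first_ok = kernel.execute_cell([push(1)])
second_ok = kernel.execute_cell([push(2), fail])
```

After the second cell fails, the stack still holds only the value pushed by the first cell, so the partial effect of the failed cell is not visible.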

Try it out

We’ve prepared an interactive tutorial demonstrating the basic REPL functionality. The quickest way to try it is to run this notebook online via the Binder service.
Alternatively, you can check out the rendered version.
More installation options are available at https://github.com/baking-bad/michelson-kernel

Feel free to ask any questions in our Telegram chat @baking_bad_chat or Slack channel #baking-bad.
Take care and be well!