Problems with Concurrency

Concurrency

Introduction

Concurrency is a notoriously hard problem in Computer Science. In a concurrent system, you usually want to run multiple pieces of a program in a dynamic order, which have access to some shared memory. Those conditions violate many assumptions that are made when reasoning about classical programs. Reasoning about the evaluation order and memory ownership (who can modify some piece of memory) require careful analysis, rather than relying on simple patterns.

But however hard concurrency in regular systems is, blockchains make it much harder:

  • Other programs are adversarial! With regular concurrency, you are the one writing all the pieces that will be called together. In blockchains, people will actively try to break your contract by writing contracts that interact in unexpected ways with it. This makes writing a safe smart-contract more similar to a scheduler than a concurrent program.
  • Programs have to be right from the get go and errors are unforgiving. You can not rely on tests, looking for errors and patching them as they come.

Because of this, concurrency is expectedly a pain point for most blockchains, That’s why I believe it is worth its own discussion in Tezos.

Current Status

Message Passing vs Direct Call

In Ethereum, contracts can call each other’s methods, like a regular function call. This infamously led to a lot of problems, to which different mitigations (rather than full solutions, due to the not-self-improving nature of Ethereum) have been developed.

Tezos aimed to avoid this by using a message passing architecture. Concretely, it means that a contract can not directly call another contract, unboundedly yielding its control flow. This makes moot a lot of Ethereum mitigations by default, such as:

  • Finishing your state updated before calling another contract
  • Adding a self-lock so that other contracts can not unexpectedly call you while you call them

Subtleties

On top of this simple message passing architecture, Tezos has a few more features/quirks related to the interaction between smart-contracts.

Breadth First Search (BFS)

Tezos’ ordering of contract calls is quite unusual. Let’s consider this following chain of execution:

Current Call Next Calls
… [ A ]
Call to A (which call B and C) [ B ; C ]
Call to B (which call B1 and B2) [ C ; B1 ; B2 ]
Call to C (which call C1) [ B1 ; B2 ; C1 ]
Call to B1 (which call nothing) [ B2 ; C1 ]
Call to B2 (which call nothing) [ C1 ]
Call to C1 (which call nothing) [ ]

If we conceive B1 and B2 as being part of the execution of B, and C1 and C2 being part of the execution of C, then we can see that the execution of B and C are interweaved. This makes reasoning about their interactions quite hard, as you can not enforce the complete execution of B before that of C. This guarantee of the order of execution is usually a given in paradigms based on regular function/method calls.

(This is similar to a BFS of the call graph. The usual order is DFS)
Graphical Representation of the calls of the #BFS section

SOURCE / SENDER

Tezos’ has two ways to authenticate transfers’ emitters in smart-contracts:

  • SENDER, which returns the address that called the current-smart contract
  • SOURCE, which returns the address that created the initial operation that ultimately led to the smart-contract call

I do not know when you would use SOURCE, and my recommendation has usually been to always use SENDER. Indeed, relying on SOURCE for authentication makes for a ripe target of usurpation (the source calls A, which then calls B and makes it seem like the source meant to call B).

Theoretically, all the results of the execution of a smart-contract are known in advance. But practically, it is much easier to check who you directly call than the whole list of called contracts.

Problems

Closed World Assumption (CWA)

The first way to reason about smart-contracts is to use some kind of Closed-World Assumption (CWA). Concretely, we could reason about writing smart-contracts imagining that only the contracts of interest are running. This lets use reuse patterns from Cooperative Multitasking.

Cyclic Call Graph

In the example from the BFS section, reasoning about what happens is fairly easy. Each contract processes its inputs, changes its state, and then calls some other contract. There are no cycles (ie, a path where A calls B, B calls C, C calls ..., and ... calls A back).

However, recursie calls happen quite regularly. For instance, when a contract queries some information from another contract. Let’s consider the example where a contract B holds the balances of contracts, and contract A needs to know its balance. It will often look like this:

There, it’s not so good anymore. There is a cycle (A calls B and B calls A). Everything has become much more complicated:

  • A has to maintain internal state before B calls it back
  • A has to authenticate each caller. For instance, making sure that B is the one calling it back. This is further complicated by the fact that B can be decided dynamically (which means it has to be part of the internal state)

We can see here that even with a Closed World Assumption, reasoning about multiple smart-contracts that you control is not straightforward. This makes such interactions error-prone.

Open World Assumption

With an Open World Assumption, we are getting into the adversarial realm. We have seen that even with the CWA, interactions can be hard to reason about. However, in an open world, everything becomes much harder. If we continue with our previous example (A and B), then, we must also take into account that:

  • Other contracts can call A between the moment A calls Get Balance and B calls back A with Call with Balance. This means A should manage multiple pieces of internal state (one for each call to Main Entry Point). Authentication also becomes harder if B is dynamic and changes between calls to Main Entry Point.
  • Other contracts can call B between the moment A calls Get Balance and B calls back A with Call with Balance. This means that the balance of A could change between the moment A asks for it and A receives an answer. Depending on this needed invariant, this might cause small issues (A should keep track of this internally) to big ones (A can not keep track of this and B does not maintain the invariant A cares about). Or even crucial ones: all of this is too complicated for some smart-contract writer, who then messes up something.

The Open World Assumption is actually even more unforgiving. For instance, if a contract gives 1000tz to someone revealing a hash-preimage, that contract should factor a non-trivial commit mechanism. Else, bakers could simply censor the reveal operation and take the 1000tz themselves.
More attacks (on Ethereum) are described here. The one we mentioned is also common with Ethereum and is called Front Running.

Unknown Unknowns

Having such a document that people have to learn by heart, and to which more is added to as attacks are discovered, is exactly where I do not want Tezos to end. Whenever you are thinking with an Open World Assumption, you are actually never really safe. You only operate following guidelines, hoping that they cover all attacks.

In upgradable systems, it’s actually not that big of a problem (except for 0-days). You can just upgrade your software whenever new attacks are discovered. This is not possible with blockchains.

Sharding

If part of this complication stems from using a Message Passing Style and BFS, then why not move Direct Call Style and DFS? The answer is two-fold:

  • They hid and hide a lot of the complexity inherent to concurrency. This complexity however does not disappear, and makes it susceptible to bugs. It would be similar to moving to an untyped language, because the typer is too merciless.
  • They are likely totally incompatible with sharding. While Direct Call Style and DFS rely on strong synchronicity assumptions, Message Passing Style make calls asynchronous and BFS can be seen as a way to deterministically randomize the order of arrivals of messages. As such, the latters are more compatible with the assumptions of sharding than the formers.

Suggestions

I will discuss here some solutions. They are not necessarily mutually exclusive, and none I believe is a silver bullet.

Going The Ethereum Way

It was discussed before, but we could copy Ethereum.

Concretely, this would mean using a Direct Call Style and DFS. However, it is still easy to shoot yourself in the foot (and thus requires following a lot of ad-hoc rules, added over time), and is not compatible with sharding.

Read Only Calls

A simple mitigation of the problem would be to use Direct Call Style only in a limited setting.

I have identified one limited setting that seemed both safe and useful. It is described at length in Read-Only Calls. It consists of a direct call to another contract in order to get some data out of it (no mutation being possible).

I am waiting for feedback from other core developers before merging it or ditching it.

Stamps

Another mitigation would be to expand on the SENDER mechanism. This gets us something close to Capability Based Systems. It is been discussed on the Contracts Signature post (which I prefer to call “stamps”). I have worked on a WIP MR (which already features a prototype and a test). I am waiting for more feedback before putting more deep work into it.

I will likely make another post to explain the precise problems it solves, the design space, and the choices that I have made.

Framework

A proper solution would be to design a sound framework in which one can confidently write concurrent smart-contracts. Whether it should be implemented as an amendment, in high-level languages, or both, remains to be decided.

I have knowledge of different approaches that could be used to design such a framework:

  • Academic literature. Beyond some basic courses, I do not know much about PI calculus and Join calculus. At first sight, they do not seem amenable to blockchain environments. I am open to changing my mind in light of evidence that they actually are. I have heard that an INRIA researcher was working with Nomadic Labs on this.
  • Programming Languages. Some programming languages tackled similar problems (lockless concurrency, capability based security, etc.). I had a look in the past at Pony, which is about a completely different setting. More recently, I have read about ERights, which introductory material is incomplete and has not been updated in a while.
  • Layer 2 solutions. Some layer 2 faced similar problems. A fully general framework must account for the fact that different operations on a smart-contract might or might not be incompatible. This is linked to ownership of memory pieces, which is a main theme in some Layer 2 solution.

I by no means consider this research exhaustive. I do not have time to do a comprehensive overview, but follow pointers people link to me.
What would be good would be to have more people collaborate and research on this issue.


Conclusion

I hope this served as some introductory material about various aspects of concurrency on Tezos.
In the meanwhile, before better solutions are developed, I recommend to smart-contract developers to avoid complex interactions between smart-contracts as much as possible.

If you happen to necessarily need complex interactions between smart-contracts, please reach out here.
If you want to participate in an effort to alleviate one of these problems, please reach out! I would love more people working on core matters.

7 Likes

I think the problem we are trying to solve is to guarantee atomicity when performing an atomic transactional operation involving multiple contracts.

Current Tezos implementation guarantees atomicity of an operation with a single contract. Contract entry point (P, S) -> S depends on the contract current state and produces new updated state atomically.

A contract can spawn a new transaction with involves multiple contracts by creating multiple invocations (operations) for other contracts to be executed later ( (P,S) -> (operations, S) ). Breadth First Search (BFS) guarantees that the sequence of operations will not be interleaved with other operations which can alter the state of involved contracts. List of operations returned by a contract may be considered as a transaction that modifies a global state (state of multiple contracts) from one consistent state to another consistent state.

Things become problematic if we want to create an operation which depends on the state of the more than one contract. (SA, SB …, S) -> S. We want to guarantee that the multiple contract states, which are input for such operation, are consistent. We also want to guarantee that issued operations operate on the same input.

There are two discussed approaches to obtain states of other contracts:

  1. Readonly calls (views) which return immediately (Adding Read-Only Calls)
  2. Callback style view entry point design pattern (https://gitlab.com/tzip/tzip/blob/master/proposals/tzip-4/tzip-4.md#view-entrypoints)

But both of them do not guarantee transaction consistency.

Let’s consider an example of two way transfer:

There are three contracts A, B, C representing accounts with some balances that have nat %deposit and nat %withdraw entry points. A manager contract M has a nat %transfer entry point that withdraws specified amount from account A, splits it and deposit split amounts to accounts B and C.

Current Call Next Calls
... [ M(transfer) ]
Call to M transfer [A(withdraw), B(deposit), C[deposit]
Call to A withdraw [B(deposit), C[deposit]
Call to B deposit [C(deposit)]
Call to C deposit (which calls D1) [D1]
Call to D1 []

In the beginning and in the end of execution of the sequence [A, B, C] all account are in consistent state. Some operation D1 is executed after the transaction [A, B, C] finishes.

Now, let’s assume that contract B %deposit entry point wants to create another operation that depends on the state of contracts A and C (SA, SC, SB) -> (operations, SB).

If it we use proposed readonly views ( SA = VIEW “balance” A; SC = VIEW “balance” B), it will read inconsistent state (the contract A has already changed its state, but the contract C has not).

If we use callback style view entry points, the problem seems to be solved. The state of contracts A and C is inspected after the transaction [A, B, C] finishes:

Current Call Next Calls
... [M(transfer)]
Call to A withdraw [B(deposit), C[deposit]
Call to B deposit [C(deposit), A1(get_balance), C1(get_balance)]
Call to C deposit (which calls D1) [A1(get_balance), C1(get_balance), D1]
Call to A1 balance [C1(get_balance), D1, B1(return_balance_a])
Call to C1 balance [D1, B1(return_balance_a], B2(return_balance_c)]
Call to D1 [B1(return_balance_a], B2(return_balance_c)]
Call to B1 return_balance_a [B2(return_balance_c)]
Call to B2 return_balance_c [F(with balance a, c)]
Call to F with balance a, c []

But the callback style has drawbacks mentioned above:

  1. There is no guarantee that get_balance operation would not inject other operations and/or make and adversarial call (directly or indirectly) to the contract B while it collects responses from the contracts A and C.
  2. The developer needs to write boiler plate code to maintain an intermediate state while gathering responses from view calls.

The solution proposed below tries to address weak points of both mentioned approaches.

  1. Contracts may implement readonly view points as proposed in Adding Read-Only Calls
  2. Such readonly view points cannot be invoked from another contract code directly. Instead, a contract may create a delayed transaction (special kind of operation).
  3. To create a delayed transaction, a contract provides
  • A lambda which gets parameters of other contracts view results and a contract state, and return list of operations and updated state.
  • A list of other contracts views which correspond to view parameters of the delayed transaction lambda

let delayed_tx : ( a: va, c : vc, b : B) -> (operation list) * B = (a, c, b) ->

  let new_b = …

  let op = …

  [op], new_b

let contract_b_entry(p : P, state : B) : (operation list) * B =

  let view_a = VIEW “balance” a_address in

  let view_c = VIEW “balance” b_address in

  let delayed : operation = create_delayed_transaction view_a, view_b delayed_tx in

  [delayed], state

  1. When Tezos node executes such delayed transaction, it fetches a state of the contract and queries all specified views. After that, it evaluates transaction lambda and updates contract state as usual.
  2. If a delayed transaction produces other operations they are inserted into the beginning of the operation queue. They are considered a part of the whole delayed transaction to guarantee that they operate on the same global state reflected by input views.

In modified example %deposit to an account B created delayed transaction which invoke contract F with parameters based on balances of accounts A and C:

Current Call Next Calls
... [ M(transfer) ]
Call to M transfer [A(withdraw), B(deposit), C[deposit]
Call to A withdraw [B(deposit), C[deposit]
Call to B deposit (which creates delayed tx) [C(deposit), B1(delayed(view_a, view_c))]
Call to C deposit (which calls F1) [B1(delayed(view_a, view_c))], F1]
Call to B1 which calls F [F2; F1]
Call to F2 [F1]
Call to F1 []

My name is Margarita Capretto and I am doing research with prof. Cesar
Sanchez at the IMDEA Software Institute.

We are working on different lines of applications of formal methods
for Tezos smart contracts, including (1) verification of Ligo
contracts and (2) runtime verification for monitoring contract
execution (for example for preventing pitfalls dynamically).

While writing some smart contracts and trying to prove some properties
about them, we thought about some extensions to Tezos:

  1. Give to the contracts access to the call graph: this way smart
    contracts (or wrapper code inserted for the purpose of monitoring)
    would be able to differentiate between reentrant calls and external
    calls and change their behavior accordingly. As far as we
    understand, it is not possible to write a smart contract in Tezos
    that rejects only reentrant calls.

  2. Once all emitted operations are executed, return the control to the
    contracts to perform read-only operations and for example
    decided to accept or reject the whole transaction. With this, for
    example, a smart contract or a wrapper monitoring code can check
    how much of its balance was expended overall in the transaction and
    accept or reject a transaction based on that. We envision a simple
    monitoring language that allows to express constraints on the
    effect of the transaction (balanced used, reentrancy, etc) and
    reject depending on violations of the constraint. This enables
    writing super-imposition code that works (provably correct) for
    every contract monitored.

We can discuss particular examples if you wish.