Maintaining internal state between callbacks

murbard · February 10, 2020, 7:06pm

Michelson contracts do not treat internal transactions as function calls but rather explicitly return a list of operations to be executed. This has a few benefits and a few drawbacks.

Pros:

Guards against reentrancy bugs
Conceptually close to what cross-shard communication might look like

Cons:

Adds complexity when data is needed from other contracts
A few constructs are impossible (e.g. Marble flash lending on Ethereum)

Views are supposed to address some of the complexity of getting read-only data from another contract but, in some cases, you need to be able to update those contracts.

For instance, suppose that a contract X owns a basket of FA1.2 tokens and provides the following functionality: when you call it with a secret code, you receive 10% of that basket. In this example the callback is read-only and would be addressed by views, but that is only for clarity of exposition; there are examples with stateful operations.

When you call X, X needs to find out its balance in each of the tokens it holds, so that it can pay out 10% of each. The way this would work right now is that you’d want to have two entrypoints:

type entrypoint = 
| RedeemCode of string
| ProcessCallback of nat

Normally, RedeemCode would be called first and would send a getBalance call to the first token contract in which X might hold a balance, passing ProcessCallback as the callback. The issue is that, when the callback is called, we need to remember:

whether we are in a valid session (a code is being redeemed)
who to send the tokens to
which balance we are currently getting

The only way to do so, right now, would be to store all of this information in the contract’s storage so that it is preserved between calls.
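
To make the bookkeeping concrete, here is a rough Python simulation of the storage-based pattern. It is only a sketch under assumptions: the class names, the `storage` dictionary, the string owner "X" and the toy breadth-first `run` loop are all made up for illustration and are not Michelson or the FA1.2 API.

```python
from collections import deque

class Token:
    """Toy FA1.2-like ledger with an asynchronous getBalance (hypothetical)."""
    def __init__(self, balances):
        self.balances = dict(balances)

    def get_balance(self, owner, callback):
        # Returns an operation ("call `callback` with the balance"), not a value.
        return [(callback, (self.balances.get(owner, 0),))]

    def transfer(self, src, dst, amount):
        assert self.balances.get(src, 0) >= amount
        self.balances[src] -= amount
        self.balances[dst] = self.balances.get(dst, 0) + amount
        return []

class BasketContract:
    """Contract X: pays 10% of each token balance to whoever redeems the code."""
    def __init__(self, tokens, code):
        self.tokens = tokens
        self.code = code
        # These fields exist only so that state survives between callbacks.
        self.storage = {"in_session": False, "recipient": None, "token_index": 0}

    def redeem_code(self, sender, code):
        assert code == self.code and not self.storage["in_session"]
        self.storage = {"in_session": True, "recipient": sender, "token_index": 0}
        return [(self.tokens[0].get_balance, ("X", self.process_callback))]

    def process_callback(self, balance):
        s = self.storage
        assert s["in_session"]  # reject callbacks outside a redemption
        token = self.tokens[s["token_index"]]
        ops = [(token.transfer, ("X", s["recipient"], balance // 10))]
        s["token_index"] += 1
        if s["token_index"] < len(self.tokens):
            nxt = self.tokens[s["token_index"]]
            ops.append((nxt.get_balance, ("X", self.process_callback)))
        else:
            # Last token handled: close the session.
            s["in_session"] = False
            s["recipient"] = None
        return ops

def run(ops):
    """Apply operations breadth-first; each one may emit further operations."""
    queue = deque(ops)
    while queue:
        fn, args = queue.popleft()
        queue.extend(fn(*args))

token_a = Token({"X": 1000})
token_b = Token({"X": 50})
x = BasketContract([token_a, token_b], code="secret")
run([(x.redeem_code, ("alice", "secret"))])
```

Every field in `storage` here is written at least once per redemption, which is the overhead the rest of this post is about.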

On top of being cumbersome, this is highly inefficient: gas costs for writing data to storage are quite high, and they will be incurred for every single write even though none of the writes may actually be committed to disk.

If the getBalance function were to provide an extra field for information, and pass it back to us, this would be a lot easier than having to store the data. This can be done explicitly today, but is not part of the standard.

One possible extension of Michelson would be to attach data to an entry point. For instance, if we have an entry point %callback_foo of type (pair x y), we might pass an entry point %callback_foo x of type y. I’m using pair in this example but my preference has long been to allow mini stacks as types, so we might have %callback_foo taking type x :: y (meaning an x stacked on a y) and pass %callback_foo x taking type y.
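
As a loose analogy only, this is what binding x to an entry point looks like with Python's `functools.partial`. The names `get_balance`, `callback_foo` and the shape of `x` are assumptions for illustration, and the callback is invoked synchronously here just to keep the sketch short.

```python
from functools import partial

def get_balance(ledger, owner, callback):
    # Token side: it only knows that `callback` expects a nat (here, an int).
    return callback(ledger.get(owner, 0))

def callback_foo(x, y):
    # Entry point of type (pair x y): x is our own context, y is the balance.
    recipient, token_name = x
    return {"to": recipient, "token": token_name, "amount": y // 10}

ledger = {"X": 1000}

# "%callback_foo x": the same entry point with x already attached,
# so what we hand to the token only expects a y.
bound = partial(callback_foo, ("alice", "tokenA"))
transfer = get_balance(ledger, "X", bound)
```

The context travels with the callback, so nothing needs to be written to storage between the two calls.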

Another, largely equivalent approach would be to use an ephemeral storage, a storage per contract that explicitly does not persist between calls.

tzemanovic · February 11, 2020, 12:47pm

I think partial application can be used to achieve this with some limitations. Say for the %callback_foo : (pair x y), the caller can construct lambda (pair x y) (pair x y) and APPLY it on x to attach data to it, getting lambda y (pair x y) that would be passed to the other contract. The other contract would EXEC on y to obtain the pair x y for the callback.
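
A small Python sketch of that flow, under the assumption that `apply` below behaves roughly like Michelson's APPLY on a lambda taking a pair; `other_contract`, `identity` and the ledger are hypothetical names for illustration.

```python
from typing import Callable, Tuple

X = str  # context we want to get back (assumed here: the recipient)
Y = int  # data produced by the other contract (assumed here: the balance)

def apply(f: Callable[[Tuple[X, Y]], Tuple[X, Y]], x: X) -> Callable[[Y], Tuple[X, Y]]:
    """Fix the first component of the pair argument, like APPLY on a lambda."""
    return lambda y: f((x, y))

def identity(pair: Tuple[X, Y]) -> Tuple[X, Y]:
    # Plays the role of lambda (pair x y) (pair x y).
    return pair

received = []

def callback_foo(pair: Tuple[X, Y]) -> None:
    # %callback_foo : (pair x y)
    received.append(pair)

def other_contract(ledger, owner, wrap, callback):
    # EXEC the lambda on y, then call back with the resulting (pair x y).
    callback(wrap(ledger.get(owner, 0)))

other_contract({"X": 1000}, "X", apply(identity, "alice"), callback_foo)
```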

murbard · February 11, 2020, 11:04pm

That’s a good point, we may not need partial application of entry points since we have it in lambdas already.

If I understand correctly, what you’re suggesting is that instead of the pattern being that a contract calls you back with a piece of data, the pattern would be that you pass it a lambda, it executes the lambda on the data it wants to send to you, and sends you the result of the lambda.

I think we’re missing a bit of polymorphism: we don’t want the contract we call to have to know the type of x. All it should care about is that the type of the entry point and the return type of the lambda are the same.

tzemanovic · February 12, 2020, 10:01am

That’s right, but possibly this proposal for dynamic type might help with that. Another possible limitation with this approach though is that the other contract might manipulate x for the callback.
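
To illustrate the manipulation concern, here is a hedged Python sketch in which the called contract rewrites x after running the lambda. All names (`honest_token`, `malicious_token`, "mallory") are invented for the example.

```python
def apply(f, x):
    # Stand-in for APPLY: returns a function of y that yields f((x, y)).
    return lambda y: f((x, y))

def callback_foo(pair):
    recipient, balance = pair
    return (recipient, balance // 10)  # who gets paid, and how much

def honest_token(balance, wrap, callback):
    # Runs the lambda and forwards its result untouched.
    return callback(wrap(balance))

def malicious_token(balance, wrap, callback):
    # Runs the lambda, then swaps the context before calling back.
    _x, y = wrap(balance)
    return callback(("mallory", y))

wrap = apply(lambda pair: pair, "alice")
honest = honest_token(1000, wrap, callback_foo)
tampered = malicious_token(1000, wrap, callback_foo)
```

Because the callee builds the final argument itself, nothing forces it to keep the x it was given.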

murbard · February 12, 2020, 3:03pm

Yes, it can be manipulated, which is why allowing partial applications in entry-points seems like the way to go.

tom · February 16, 2020, 4:31am

Somewhat off-topic, but…

Your example is interesting to me, despite getBalance being read-only.

Suppose getBalance is a ‘view’ (synchronous read-only call) and people try to implement “transfer 10%”. How will they do it?

I think, very likely, they will maintain no “internal state”. They will merely call getBalance and emit a (FA1.2) %transfer call, setting the amount to 10% of the returned balance.

Hooray, complexity addressed? I don’t think so. Not at all.

This implementation may be vulnerable to a kind of “reentrancy” attack: if a contract sender is able to redeem N times, they may emit the N redeem operations all at once. Due to the breadth-first handling of internal operations, the last getBalance will happen before the first %transfer. The net result will be a transfer of N*10% instead of the desired 1-(90%^N).
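
A toy Python simulation of this ordering, with N = 3 and a starting balance of 1000. The queue-based `run_bfs` loop, the function names and the integer arithmetic are assumptions made for the sketch, not the actual protocol code.

```python
from collections import deque

balances = {"X": 1000, "attacker": 0}

def redeem(sender):
    # With a synchronous view the balance is read while handling redeem...
    amount = balances["X"] // 10
    # ...but the transfer is only emitted, to be applied later.
    return [(transfer, ("X", sender, amount))]

def transfer(src, dst, amount):
    balances[src] -= amount
    balances[dst] += amount
    return []

def batch(n):
    # The attacking contract emits n redeem operations at once.
    return [(redeem, ("attacker",)) for _ in range(n)]

def run_bfs(ops):
    queue = deque(ops)
    while queue:
        fn, args = queue.popleft()
        queue.extend(fn(*args))

def sequential(total, n):
    # What n redemptions would pay if each saw the previous transfer.
    paid = 0
    for _ in range(n):
        paid += (total - paid) // 10
    return paid

run_bfs([(batch, (3,))])
intended = sequential(1000, 3)
```

All three redeems read a balance of 1000 before any transfer is applied, so the attacker ends with 300 instead of the 271 that three sequential redemptions would give.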

With asynchronous getBalance, I think the author of a “transfer 10%” behavior is likely to prevent such obvious problems. Being forced to maintain internal state, they will probably disallow multiple concurrent redemption.

However, it is still not completely clear to me what should be the specification of a “transfer 10%” behavior, and how to satisfy the specification. The redeem->getBalance->processCallback->transfer might generally happen concurrently with other operations by other contracts which may change the relevant balance.

Some ideas for satisfying a very strong specification (ignoring potential liveness issues…):

Forking FA1.2:

Add support for atomic “transfer 10%” as a new entrypoint. Great, if possible. But maybe missing the point.
Add an %assertBalance entrypoint, which fails if the balance is not equal to the provided amount. After doing an initial getBalance and receiving the callback, emit [%assertBalance, %transfer] in order to effectively do a CAS, asserting that the balance did not change immediately before the transfer is processed.
Alternatively, extend %transfer to optionally allow a direct CAS, as if it is %assertBalanceAndTransfer.
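
A minimal Python sketch of the [%assertBalance, %transfer] idea, assuming that a failing operation reverts the whole batch. `assert_balance` is the proposed entrypoint, not part of FA1.2 today, and `apply_all_or_nothing` is a hypothetical stand-in for that rollback behaviour.

```python
class Token:
    def __init__(self, balances):
        self.balances = dict(balances)

    def assert_balance(self, owner, expected):
        # Proposed entrypoint: fail unless the balance is exactly `expected`.
        if self.balances.get(owner, 0) != expected:
            raise RuntimeError("balance changed since getBalance")

    def transfer(self, src, dst, amount):
        if self.balances.get(src, 0) < amount:
            raise RuntimeError("insufficient balance")
        self.balances[src] -= amount
        self.balances[dst] = self.balances.get(dst, 0) + amount

def apply_all_or_nothing(token, ops):
    # Any failure restores the ledger to its state before the batch.
    snapshot = dict(token.balances)
    try:
        for fn, args in ops:
            fn(*args)
        return True
    except RuntimeError:
        token.balances = snapshot
        return False

# Case 1: nothing touches the balance between getBalance and the transfer.
quiet = Token({"X": 1000})
observed = 1000
ok = apply_all_or_nothing(quiet, [
    (quiet.assert_balance, ("X", observed)),
    (quiet.transfer, ("X", "alice", observed // 10)),
])

# Case 2: another transfer slips in first, so the assertion fails.
busy = Token({"X": 1000})
raced = apply_all_or_nothing(busy, [
    (busy.transfer, ("X", "bob", 500)),
    (busy.assert_balance, ("X", 1000)),
    (busy.transfer, ("X", "alice", 100)),
])
```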

Not forking FA1.2:

Implement assertBalance using getBalance. After doing an initial getBalance, emit [%getBalance, %transfer], but this time set the callback for the %getBalance to your own implementation of assertBalance, maintaining the original balance value in internal state.
What else?

murbard · February 17, 2020, 12:41am

That’s an excellent point and, in my opinion, a fundamental problem with views, not with FA1.2. It seems that they simply don’t interact well with how operations are applied.

galfour · February 17, 2020, 2:04pm

I do not understand the post.

Why would X call itself to know its balance here? Can’t it just read it from its storage?

I have trouble parsing this sentence.

That still seems dangerous (for reasons similar to the solution proposed with internal contract signatures), because other contracts can call you to mess with you before you’re finished processing stuff.
Possibly, having a heap of operations sharing the same operation order / storage / signatures is not the right structure.

tom:

This implementation may be vulnerable to a kind of “reentrancy” attack: if a contract sender is able to redeem N times, they may emit the N redeem operations all at once. Due to the breadth-first handling of internal operations, the last getBalance will happen before the first %transfer. The net result will be a transfer of N*10% instead of the desired 1-(90%^N).

With asynchronous getBalance, I think the author of a “transfer 10%” behavior is likely to prevent such obvious problems. Being forced to maintain internal state, they will probably disallow multiple concurrent redemption.

However, it is still not completely clear to me what should be the specification of a “transfer 10%” behavior, and how to satisfy the specification. The redeem->getBalance->processCallback->transfer might generally happen concurrently with other operations by other contracts which may change the relevant balance.

So. I actually do not think asynchronous getBalance + disabling of concurrent tx would solve the problem. Imagine there is a smart contract that will predictably call another contract, which will in turn call this contract with this entry point; you could imagine some logic that lets you call it first in order to block them.

I do not believe it solves the problem, but that it obfuscates it.

On the other hand, I have no idea what a good solution would look like. With a BFS order for the evaluation of txs spawning txs (which is what we have right now):

When designing a smart contract, you have to keep in mind that you might asynchronously get multiple txs in different states. Do you maintain a map (tx, internal_state_management) to deal with this?
When calling a smart contract, you have to understand the semantics of this entanglement well. For instance, if only one tx is enabled at a time, and you’re a smart contract relaying A and B, then if you relay some operations of A and B at the same time, could the first one being called prevent the other from having its call evaluated?

My immediate reaction would be to change the BFS for a DFS, but a less immediate reaction is that an async / message-based model does not solve reentrancy issues (it might make them easier to tackle?) and that I clearly do not have a deep grasp of all the implications of the current model.
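
For reference, a small Python sketch of how the two orders differ on the batched-redeem example from earlier in the thread. The operation names and the `children` table are hypothetical; this only shows traversal order, not real operation semantics.

```python
from collections import deque

# Which operations each operation emits (assumed shape of the example).
children = {
    "batch": ["redeem1", "redeem2"],
    "redeem1": ["transfer1"],
    "redeem2": ["transfer2"],
    "transfer1": [],
    "transfer2": [],
}

def bfs(root):
    order, queue = [], deque([root])
    while queue:
        op = queue.popleft()
        order.append(op)
        queue.extend(children[op])
    return order

def dfs(root):
    order, stack = [], [root]
    while stack:
        op = stack.pop()
        order.append(op)
        # Push in reverse so the first emitted operation runs first.
        stack.extend(reversed(children[op]))
    return order

bfs_order = bfs("batch")
dfs_order = dfs("batch")
```

Under BFS both redeems run before either transfer, while under DFS each transfer runs right after its own redeem.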

I might spend some time trying to see how the academic literature might apply here. Even though locks seem way too constraining, they might be good (if I interact with some contract, there is a lock on it). Possibly lockless concurrency will prove more fruitful.
Or I will realize there is a simpler solution that does not need to involve what I just mentioned.

If anyone has any pointers, please contact me.

murbard · February 17, 2020, 2:59pm

Different contract. Imagine a dex that decides to spend 10% of its balance.

I’m suggesting partial application of values to entry-points.

Sure, but that’s orthogonal to whether or not it’s ephemeral. I think you’re assuming this thread was about the point Tom raised, but it wasn’t. This isn’t meant to address that issue.

They might create a “get_balance_pending” lock, but that doesn’t mean the balance wouldn’t get updated in the meantime, through some other contracts, just that they wouldn’t update it. In essence you have a lock on your own contract, but not a lock on the contract you’re getting the callback from.

murbard · February 17, 2020, 6:23pm

I suggest we move the discussion about asynchronous calls / views / reentrancy into its own thread.