Proposal Idea: Add Possibility to Verify Source For Tezos Smart Contracts

Current Status:
Project XY deploys his smart contracts (be it Ligo, SmartPy or Archetype). Everything works great. Users can see the compiled Michelson on Better Call Dev and even via thebsite simply fork it.

But the option to verify the contract source code(with the ABI and ByteCode) on etherscan so that after veryfing the solidity code is displayed for everyone is not possible on Tezos.

Quote from Alexander Eichhorn: @Alex
This level of verification does not (yet) exist in Tezos. Although the Michelson bytecode can have type annotations there is still a huge gap between actual source (smartpy, etc) and Michelson. I’m not aware of any Tezos compiler that adds version info into the bytecode (there is a more or less well used field in the tzip 16 metadata, but few contracts actually use tzip16 or add the compiler version there). As explorer we’d be interested to link original source to a published contract, but there’s no easy way for us to locate/identify the original source or automate the verification process.
Only thing you have for free in Tezos is the standardized ABI interface and type info for the contract’s storage. Both are explicitly published as Micheline structures together with the code of the contract (see parameters and storage sections in contract scripts)

What users cant do(especially non code/non Michelson savy users) is to verify the contract source before interacting. Why is this so important?

Well because…

  • the advertised contract code may be different from what runs on the blockchain.
  • smart contract verification enables investigating what a contract does through the higher-level language it is written in, without having to read Michelson which is a stack-based language.
  • to provide additional transparency
  • it adds ultimately to the “trustless” ethos - and smart contracts are designed to be “trustless”. So a user shouldnt trust but be able to verify the smart contract source code
  • verified smart contracts can be more easy used/changed by others (becuase it is not only Michelson thats available) = this is a benefit for the whole ecosystem

Why is Michelson not sufficient enough for that matter?
Michelson forms the lowest level of a Tezos smart contract. The Michelson code is what will be deployed on the Tezos network. However, while reading or writing a Michelson code is accessible for small smart contracts, it can become pretty tedious and complicated to use or read, for more complex smart contracts.

Most of the people know this feature from the Ethereum Blockchain. This is an excerpt from the Ethereum documentation how the process works on Ethereum:

Deploying a smart contract on Ethereum requires sending a transaction with a data payload (compiled bytecode) to a special address. The data payload is generated by compiling the source code, plus the constructor arguments of the contract instance appended to the data payload in the transaction. Compilation is deterministic, meaning it always produces the same output (i.e., contract bytecode) if the same source files, and compilation settings (e.g. compiler version, optimizer) are used.

Verifying a smart contract basically involves the following steps:

1. Input the source files and compilation settings to a compiler.
2. Compiler outputs the bytecode of the contract
3. Get the bytecode of the deployed contract at a given address
4. Compare the deployed bytecode with the recompiled bytecode. If the codes match, the contract gets verified with the given source code and compilation settings.
5. Additionally, if the metadata hashes at the end of the bytecode match, it will be a full match.

So this basically looks like this
grafik

I am (unfortunately) not competent enough to judge if this is a good way but I suppose that the Core developers might know if this is a feasible solution to enable this on Tezos too. Or if another approach is needed. It would be great if Core developers would take this on or someone else like for example @G_B_Fefe

Once this possibility is added to Tezos Block Explorers like Tzstas or Better Call Dev etc can add this source verification option so users can easy verify the contract. Maybe also other tools will emerge from this possibility like there are on Ethereum (e.g. Sourcify or Tenderly)

3 Likes

It’s a good idea.

It’s more of an ecosystem coordination issue than a dev issue. You need some sort of standardized pipeline where the inputs are:

  • source code
  • a compiler (smartpy, ligo…), potentially in docker container format
  • some metadata, such as the options passed to the compiler

and the output is some michelson and micheline structure that can be mapped to a KT1 address.

You also need a standardized storage format for verified source code and metadata, maybe in git. It would also take care of deduplication (store the source for similar contracts only once).

1 Like

Thank you for your comment. Ecosystem coordination issue sounds like a bigger hurdle than a dev issue, at least it seems to me but I am hopeful we will see this feature on Tezos in the coming months :slight_smile:

So this can be achieved via a new TZIP and not necessarily an change/update in the protocol itself?

Again than you for the comment nicholasochem. I am still interested to hear other peoples thoughts that are for sure more clever than me :smile:

Please share your opinion and thoughts people :slight_smile: @trolleps @tulpenhaendler @simonmcl @codecrafting @rognierbenoit @Jordan @tom @germanD @iguerNL @rafoo and also others please!
Thank you very much :slight_smile:

1 Like

I didn’t thought a lot about it. But I guess that this could be achieved, e.g., via a dedicated section in Michelson (let’s call it source, origin, info, or whatever) like parameter, storage, code sections.

Such a section could contain the hash of the source code, information about the compilation toolchain, … as well as the hash of the michelson smart contract (without this new section).

But I guess that this should also be applied to global constants.

1 Like

We plan to implement something like this for SmartPy. To reproduce the Michelson code deterministically the following elements will have to be stored:

  • the compiler version
  • any flags given to the compiler
  • all the SmartPy source files (there may be several ones, due to imports)

Then there is the question of how to archive and actually run different compiler versions. Tools like Nix and Docker come to mind, though we would probably take into account indexers’ suggestions (@Alex?).

What I’m not clear about is whether TZIP-16 metadata is the right tool for this. A contract’s metadata may change. So anyone able to modify it could replace the SmartPy code, as long as it compiles to the same Michelson code. For example, one could rename local variables inside an entrypoint or refer to a different compiler version. Should this be considered a problem? Certainly something to be aware of. On the other hand, an on-chain element source as suggested above by @iguerNL, would presumably be immutable.

A slightly different approach would be not to have any link from the contract at all, but submitting the source code to indexers directly. The relationship with the deployed contract would be established by comparing (hashes of) the compilation result and deployed contracts. An advantage of this approach is that source code could be added post hoc, including for contracts already deployed today.

1 Like

I don’t understand the use case. If I have the hash of the source code I still need the source code to check the hash so how does that differ from compiling the code and checking that the output is the given Michelson script?

Thank you for taking your time! This sounds good although my technical background is a bit limited, with the hash of the source code one could then add later the source code and the hashes would be compared to see if it matches?

Thats great news!

I agree I think this should be the end goal and the preferred way? - but as I said you guys have way more technical knowledge!

Thats also an interesting approach. I dont know which one fits the best as a final solution for implementation.

Ok I guess I understand it a bit wrong what @iguerNL proposed? :slight_smile: Thanks for taking your time here rafoo and roland!

To go a bit further, I would add also :

  • sourceCodeHash + compiler version + sourceCodeSignature
  • michelsonSignature

we this I can verify that :

  • compiled source code Hash produces same has Michelson code Hash <= to trust sourcecode <-> Michelson correspondance
  • sourceCode is signed by known tz address <= to trust the editor
  • deployed Michelson code is deployed by known tz address <= to trust the deployer
2 Likes

The process of giving users the ability to do this would indeed be great. As someone else mentioned there needs to be a lot of agreement and standards between dev teams to make something like this happen

… then, what everyone forgets is tooling. There are still many things that rely on third parties and hacky code to achieve. If you want to forge / parse operations outside of the CLI you have to rely on something like Taquito, which is built with type script. Doesn’t work well for native desktop / mobile apps. I see something like the above feature being quite challenging for every team to implement until bits and pieces of the CLI / Octez are provided as standalone binaries in something like c++ or rust, so that in can be imported in any language

2 Likes
1 Like

Any chance to get this in the next few months?