Tezos Public Finance Dataset Integrated into Google BigQuery

Tezos Public Finance Dataset Integrated into Google BigQuery

Tezos Commons is excited to share that we have completed an integration of a Tezos dataset into Google BigQuery!

It is hosted by Google Cloud as a public dataset under ‘Financial services’, has been listed in the Google Cloud Marketplace, and can be found here.

Google Cloud graciously hosts the dataset for public access, you pay only for the queries that you perform on the data. However, you can query this dataset for free, up to 1TB per month, every month!

Google BigQuery makes it simple to query on-chain data in a simple, organized manner on the cloud using standard SQL syntax. You can think of it like a massive indexer that allows you to perform powerful data analysis. You are also able to perform the same or similar queries on multiple blockchains at once and compare them, or track inter-chain activity on interoperable tokens such as TzBTC.

While Tezos nodes do allow basic queries for retrieving data about addresses and transactions, even basic analyses are not supported. The Tezos data needs to be loaded into a data warehouse to derive business relevant insights. For example, this chart shows the growth of the Tezos community (funded accounts) over time, queried in BigQuery and visualized with Google Data Studio:

As you can imagine, there are countless ways that you can utilize this to gather and analyze data. Some possibilities:

  • Monitor gas costs over time
  • Monitor a smart contract’s calls and activity
  • Map to a double-entry book data structure that enables multi-chain meta-analyses
  • Integration with conventional financial record processing systems
  • Determine the most active/popular tokens, contracts, or applications on Tezos
  • Query all transactions that adhere to specific conditions
  • Audit trails

A query to get the total amount of transactions ever executed on the Tezos blockchain would be as simple as this:

#standardSQL
SELECT COUNT(*) as count
FROM public-data-finance.crypto_tezos.transaction_operations

Tezos Commons currently loads Tezos data into BigQuery daily, via new Tezos ETL (Extract, Transform, Load) tools, Airflow DAGs, and a Kubernetes node. All of which are open source and can be found here:

Please feel free to contribute or use them for your own projects!

It has been a pleasure to work with Allen Day and the rest of the Google Cloud team on this project, as well as the founder of the Blockchain ETL project Evgeny Medvedev and his team at Helix Technologies, whom we recommend to any project looking to implement their own ETL and BigQuery integration.

We look forward to continuing to improve this project and its capabilities going forward.

Other relevant links:

6 Likes