Blockchain Interoperability Part III: Storage Proofs, Powering new cross-chain use cases

In Part II of our three-part series on interoperability, we explored Consensus Proofs as an emerging trust-minimized way to facilitate bridging between blockchains.

In this article we’ll explore Storage Proofs, which take the trust-minimized verification concept and extend it to checking transactions in older, historical blocks. Being able to verify past transactions and user activity in this manner unlocks a multitude of cross-chain use cases.

In Part II we covered Consensus Proofs, a trust-minimized approach for bridging funds across blockchains. Because bridge users typically want their transactions to settle without delay, Consensus Proofs are useful: they continuously verify the latest state of the blockchain as it moves forward.

This concept of trust-minimized bridging can also be applied in the other direction, reaching back into the past and using zero-knowledge proofs to verify transactions and data in older blocks. These ‘Historical Storage Proofs’ enable a different range of cross-chain use cases and in this article we’ll cover what they are, how they work and the players building in this space.

Retrieving Historical Data

Historical blockchain data is useful for a variety of reasons. It can prove asset ownership, user behavior and transaction history, which can then be fed into on-chain smart contracts or applications. As of writing there are over 18 million blocks written to Ethereum. Smart contracts can only access the hashes of the latest 256 blocks directly (roughly the last 50 minutes at 12-second block times), so ‘historical data’ refers to anything outside of the last 256 blocks.
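As a rough check of how much wall-clock time the 256-block window covers, the arithmetic can be sketched as below. The 12-second block time is an assumption based on Ethereum's post-Merge slot time; it is not stated in the text above.

```python
# Rough size of the window of blocks a smart contract can reach directly.
BLOCK_WINDOW = 256    # blocks accessible on-chain (from the text above)
SLOT_SECONDS = 12     # assumption: post-Merge Ethereum slot time

window_minutes = BLOCK_WINDOW * SLOT_SECONDS / 60
print(f"{window_minutes:.1f} minutes")  # 51.2 minutes
```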

To access historical data today, protocols typically query archive node providers, i.e. third parties like Infura, Alchemy or other indexers. This means trusting and relying on them and their data.

Historical data

This data can, however, be retrieved in a more trust-minimized fashion, through the use of Storage Proofs.

Storage Proofs are zero-knowledge proofs which allow verification of historical data stored on a blockchain. More specifically, Storage Proofs can be used to prove that a specific state existed at a particular block in the past. This method doesn’t require trust in a third party or oracle - instead, trust is built into the Storage Proof.

How do Storage Proofs help to validate that some data existed in older, historical blocks? This involves verifying two things:

Step 1. Check that the specific block is part of the blockchain’s canonical history, i.e. the block is a valid part of the source chain’s history
Step 2. Check that the specific data is part of the block, i.e. the piece of information, such as a specific transaction, is part of that block (this can be proven with a Merkle inclusion proof)
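To make Step 2 more concrete, here is a minimal sketch of a Merkle inclusion proof over a plain binary Merkle tree. It is a simplification under stated assumptions: Ethereum itself commits to state and transactions with Merkle-Patricia tries and Keccak-256, whereas this sketch uses a binary tree and SHA-256 as a stand-in. The function names are hypothetical.

```python
import hashlib

def h(data: bytes) -> bytes:
    # Assumption: SHA-256 stands in for Keccak-256 to keep the sketch stdlib-only.
    return hashlib.sha256(data).digest()

def build_levels(leaves: list[bytes]) -> list[list[bytes]]:
    """Return every tree level, hashed leaves first and the root level last."""
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Odd-sized level: duplicate the last node so every node has a sibling.
            level = level + [level[-1]]
            levels[-1] = level
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def make_proof(levels: list[list[bytes]], index: int) -> list[bytes]:
    """Collect the sibling hash at each level, from the leaves up to the root."""
    proof = []
    for level in levels[:-1]:
        proof.append(level[index ^ 1])
        index //= 2
    return proof

def verify(leaf: bytes, index: int, proof: list[bytes], root: bytes) -> bool:
    """Recompute the root from the leaf and its sibling path, then compare."""
    node = h(leaf)
    for sibling in proof:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root

# Toy "block" containing three transactions.
txs = [b"tx-a", b"tx-b", b"tx-c"]
levels = build_levels(txs)
root = levels[-1][0]
proof = make_proof(levels, 2)
print(verify(b"tx-c", 2, proof, root))  # True
```

The verifier only needs the root, the leaf and a path whose length grows logarithmically with the number of leaves, which is why inclusion proofs stay small even for large blocks.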

After receiving and validating the proof, the receiver (e.g. a smart contract on the destination chain) can be confident that the data is valid and can execute a corresponding set of instructions. This concept can be extended further: it is possible to run additional off-chain computations with the verified data and then generate another zero-knowledge proof to attest to both the data and the computation.

In short, Storage Proofs allow for retrieval of historical on-chain data in a trust-minimized fashion. This is important because as we outlined in Part I, we see web3 becoming more multi-chain and multi-layer in the coming years. The emergence of multiple layer 1s, rollups and appchains means that users’ on-chain activity will likely be split across multiple chains. This puts even greater emphasis on the need for trust-minimized interoperability solutions that maintain the composability of a user’s assets, identity and transaction history across multiple domains. This is something Storage Proofs can help solve.

What are some of the use cases?

A Storage Proof allows a smart contract to check any historical transaction or data as a condition precedent. This provides a lot of flexibility in cross-chain application design.

To begin with, the Storage Proof can attest to any historical data on the source blockchain, e.g.

Account balances and token ownership
User transaction activity or inactivity
Historical prices that assets transacted at over a specified period of time
Real time asset balances in liquidity pools across different chains

Then, the proofs can be sent to the destination chain to unlock a range of cross-chain use cases:

Enabling users to vote on governance proposals on a lower cost L2
Allowing NFT holders to access new NFT mints or community benefits on new chains
Rewarding users (e.g. through airdrops) based on a user’s history & interaction with specific dApps
Offering loans with interest rates tailored to a user’s aggregated transaction & credit history
Triggering account recovery for dormant accounts
Computing a historical TWAP for future swaps
Computing a more accurate AMM swap price based on liquidity pools on multiple chains

Storage Proofs essentially allow applications to query & port a user’s on-chain activity & history across multiple chains to feed into a smart contract or application on another chain.

Storage Proof - Use cases

Let’s go through a detailed example of how they work.

A detailed example of how Storage Proofs work

Let’s consider ‘X’, which is a DeFi protocol with tokens on Ethereum. A governance proposal is coming up and they want to facilitate on-chain voting on a lower-cost destination chain. Users can only vote if they held X tokens on Ethereum at a specific point in time that we’ll call ‘the snapshot’, e.g. block #17,000,000.

How is this done on-chain currently?

The current way of doing this would be to query an archive node to obtain the full list of eligible token holders at block #17,000,000. The list is then stored by a DAO administrator in a smart contract on the destination chain to determine who can vote. This approach has some limitations:

The voter list can be very large & changes every snapshot, which makes it costly to store and update on-chain for every vote proposal;
There is implicit trust in the archive node provider and the data they provide; and
The administrating DAO members have to be trusted to not tamper with the voter list

How do Storage Proofs solve this?

As we have explained in Part II, costly computations can be offloaded to an off-chain zero-knowledge prover.

The zk prover will generate a succinct proof that is sent to the destination chain for verification. For the DAO voter eligibility example above:

The prover generates a zero-knowledge proof that attests to the fact that block #17,000,000 is part of Ethereum's history (Step 1 from above).
After proving the validity of the block, we can prove that the user held DAO tokens when this block was finalized (Step 2 from above) using Merkle inclusion proofs.

Verifying historical data to enable cross-chain voting

The proof is then sent to the smart contract on the destination chain for verification. If verification succeeds, the smart contract on the L2 allows the user to vote.

This approach achieves a few things. It eliminates the need:

To trust archive node providers;
For the protocol to maintain costly on-chain voter lists; and
For the user to move their assets to the destination chain

What is the setup required for a Storage Proof?

So far, we have abstracted away some of the complexity around Storage Proofs. Using them, however, requires an elaborate initial setup by service providers to ensure that they can be used without trusting the provider. Two things are generated and stored on-chain as part of this process:

A zero-knowledge proof of the entire chain (‘zk commitment’): Service providers group all the historical blocks on the source chain into consecutive fixed-size ‘chunks’ (using a Merkle tree) and generate a zero-knowledge proof for each chunk to validate the grouping. These proofs are then recursively combined until they obtain a final zero-knowledge proof which is a ‘zk commitment’ to the entire chain. This attests that the provider has indexed the entire history of the chain correctly.
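The chunking described above can be illustrated without any zero-knowledge machinery by writing out, in plain code, the kind of statement each proof would attest to. The sketch below assumes that "validating the grouping" means checking that every header points at its predecessor through a parent hash, and that combining proofs means checking that adjacent chunks join up. The `Header` type, the toy encoding, the chunk size and SHA-256 (in place of Keccak-256 over RLP-encoded headers) are all assumptions for illustration, not any provider's actual design.

```python
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class Header:
    number: int
    parent_hash: bytes
    payload: bytes  # stand-in for the remaining header fields

    def hash(self) -> bytes:
        # Toy encoding hashed with SHA-256 (assumption, see lead-in).
        encoded = self.number.to_bytes(8, "big") + self.parent_hash + self.payload
        return hashlib.sha256(encoded).digest()

def chunk_is_linked(chunk: list[Header]) -> bool:
    """Per-chunk statement: each header references the hash of the one before it."""
    return all(
        cur.number == prev.number + 1 and cur.parent_hash == prev.hash()
        for prev, cur in zip(chunk, chunk[1:])
    )

def chunks_are_linked(chunks: list[list[Header]]) -> bool:
    """Combined statement: every chunk is valid and adjacent chunks join up."""
    if not all(chunk_is_linked(chunk) for chunk in chunks):
        return False
    return all(
        nxt[0].number == prev[-1].number + 1 and nxt[0].parent_hash == prev[-1].hash()
        for prev, nxt in zip(chunks, chunks[1:])
    )

def make_chain(length: int) -> list[Header]:
    """Build a toy chain in which every header commits to its parent."""
    headers, parent = [], b"\x00" * 32
    for number in range(length):
        header = Header(number, parent, f"block-{number}".encode())
        headers.append(header)
        parent = header.hash()
    return headers

CHUNK_SIZE = 4  # hypothetical chunk size
chain = make_chain(12)
chunks = [chain[i:i + CHUNK_SIZE] for i in range(0, len(chain), CHUNK_SIZE)]
print(chunks_are_linked(chunks))  # True
```

In a real system these checks would run inside a proving circuit, so that a verifier only checks one succinct proof rather than re-hashing every header.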

Illustration of 'zk commitment' to the entire history of Ethereum

Merkle Mountain Range: Providers also store the Keccak Merkle roots of groupings of block hashes (chunks) of the source chain in an on-chain data structure called a Merkle Mountain Range (MMR). This data structure is used because it is easy to query and update, and it enables a provider to efficiently prove that a given block exists in the history of a chain. The MMR is created using Keccak256 hashes, Poseidon hashes or both. Poseidon hashes are more zero-knowledge friendly, which allows computation to be done on historical data, after which the validity of both the data and the computation can be proven in zero-knowledge.

Illustration of a Merkle Mountain Range (MMR)

As new blocks are added to the source chain, the service provider updates the ‘zk commitment’ and the MMR periodically (e.g., every hour or day) to keep up with the chain. This ensures that past blocks, and therefore the historical data they contain, are always linked to one of the 256 blocks currently accessible from the EVM.
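The append-only update of a Merkle Mountain Range can be sketched as follows. This is a minimal illustration, assuming a common MMR construction in which only the peaks of the perfect subtrees are kept and then "bagged" into a single root; SHA-256 stands in for Keccak-256 or Poseidon, and the class and method names are hypothetical. Real implementations differ in details such as the bagging order or whether the size is mixed into the root.

```python
import hashlib

def h(left: bytes, right: bytes) -> bytes:
    # Assumption: SHA-256 as a stand-in for Keccak-256 or Poseidon.
    return hashlib.sha256(left + right).digest()

class MerkleMountainRange:
    """Append-only accumulator that stores only the peaks of its perfect subtrees."""

    def __init__(self) -> None:
        self.peaks: list[tuple[int, bytes]] = []  # (height, hash), left to right
        self.size = 0

    def append(self, leaf_hash: bytes) -> None:
        height, node = 0, leaf_hash
        # Merge while the rightmost peak has the same height as the new node.
        while self.peaks and self.peaks[-1][0] == height:
            _, left = self.peaks.pop()
            node = h(left, node)
            height += 1
        self.peaks.append((height, node))
        self.size += 1

    def root(self) -> bytes:
        """Fold the peaks from right to left into a single commitment."""
        if not self.peaks:
            raise ValueError("empty MMR has no root")
        acc = self.peaks[-1][1]
        for _, peak in reversed(self.peaks[:-1]):
            acc = h(peak, acc)
        return acc

mmr = MerkleMountainRange()
for number in range(11):
    block_hash = hashlib.sha256(f"block-{number}".encode()).digest()
    mmr.append(block_hash)

# 11 = 0b1011, so the peaks cover 8, 2 and 1 leaves respectively.
print([height for height, _ in mmr.peaks])  # [3, 1, 0]
```

Appending touches at most a logarithmic number of peaks, which is one reason the structure suits a commitment that has to be extended as the source chain grows.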

In the diagram below we set out more detail on how the setup is achieved:

Bringing it all together, here’s how Storage Proofs are used (after the setup is completed) in the context of the DAO voting example we presented earlier:

The service provider creates and stores the ‘zk commitment’ to the entire chain (i.e. Ethereum’s history) and the MMR on the destination chain
The provider allows applications to query historical data on-chain or off-chain via an API
The voting dApp on the destination chain sends a query to the provider smart contract to find out whether a user held DAO tokens at block #17,000,000 on Ethereum
The provider checks two things:
- That the queried block is part of Ethereum's canonical history (Step 1. from above); the provider then generates a zero-knowledge proof of block inclusion via Merkle Mountain Range
- That the user held DAO tokens in block #17,000,000 (Step 2. from above); the provider then generates another zero-knowledge proof that the user held DAO tokens inside the block
The provider aggregates the proofs generated above into a single zero-knowledge proof
The aggregated ZK proof is then sent back to the voting dApp smart contract on the destination chain, which verifies it and, on successful verification, allows the user to vote
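The flow above can be mimicked end to end with a toy model of the destination-chain contract. This is only a sketch under strong assumptions: plain Merkle paths replace the aggregated zero-knowledge proof, SHA-256 replaces Keccak-256 or Poseidon, a two-leaf tree replaces both the MMR and Ethereum's state trie, and `VotingContract`, `fold` and the position-based `snapshot_block` index are hypothetical names invented for illustration.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()  # stand-in hash (assumption)

def fold(leaf: bytes, index: int, siblings: list[bytes]) -> bytes:
    """Recompute a Merkle root from a leaf and its sibling path."""
    node = h(leaf)
    for sibling in siblings:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node

class VotingContract:
    """Toy destination-chain contract that gates votes on two inclusion checks."""

    def __init__(self, history_root: bytes, snapshot_block: int) -> None:
        self.history_root = history_root      # commitment to the source chain's history
        self.snapshot_block = snapshot_block  # position of the snapshot block in that history
        self.votes: dict[str, tuple[bool, int]] = {}

    def vote(self, voter: str, balance: int, support: bool, state_root: bytes,
             state_index: int, state_path: list[bytes], block_path: list[bytes]) -> None:
        if voter in self.votes:
            raise ValueError("already voted")
        # Step 1: the snapshot block (represented by its state root) is in the history.
        if fold(state_root, self.snapshot_block, block_path) != self.history_root:
            raise ValueError("block is not part of the committed history")
        # Step 2: the (voter, balance) pair is in that block's state.
        leaf = voter.encode() + balance.to_bytes(32, "big")
        if fold(leaf, state_index, state_path) != state_root:
            raise ValueError("balance is not part of the block's state")
        if balance == 0:
            raise ValueError("no tokens held at the snapshot")
        self.votes[voter] = (support, balance)

# Toy state at the snapshot block: two token holders.
leaf_alice = b"alice" + (100).to_bytes(32, "big")
leaf_bob = b"bob" + (5).to_bytes(32, "big")
state_root = h(h(leaf_alice) + h(leaf_bob))

# Toy history with two blocks; the snapshot block sits at position 1.
other_block_root = h(b"some-other-block")
history_root = h(h(other_block_root) + h(state_root))
block_path = [h(other_block_root)]

contract = VotingContract(history_root, snapshot_block=1)
contract.vote("alice", 100, True, state_root, 0, [h(leaf_bob)], block_path)
print(contract.votes)  # {'alice': (True, 100)}
```

The contract never stores a voter list. It holds a single commitment and checks each voter's claim on demand, which mirrors the benefit described in the DAO voting example.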

Teams building in the space

Several players are building to enable smart contract access to historical on-chain data in a trust-minimized manner.

Axiom, now live on Ethereum, aims to provide smart contracts on Ethereum with access to historical Ethereum data via zk-based Storage Proofs. The team is also enhancing capabilities for off-chain computation on top of historical data, and proving the correctness of this data and computation in zero-knowledge.

Relic Protocol has a similar technical approach to Axiom and is live on Ethereum and zkSync Era. Relic uses Merkle inclusion proofs to demonstrate data inclusion (as opposed to Axiom's approach of proving Merkle inclusion in zero-knowledge).

Herodotus is channeling its efforts towards the provisioning of Ethereum's historical data for L2s. The testnet implementations are available today on Starknet and zkSync Era. With an OP foundation grant, we think we know where the Herodotus team is heading next.

Lagrange Labs has introduced fully updatable proofs through their recent ZK MapReduce (ZKMR) innovation. It uses a new vector commitment called Recproofs, which extends the concept of updatability to computation over data.

Teams working on Storage Proofs

Conclusion

In this part, we covered how Storage Proofs allow verification of historical on-chain data without the need to trust a third party. This makes them a valuable tool for on-chain composition and cross-chain interoperability.

As Total Value Locked continues to migrate from Ethereum to Layer 2 ecosystems, we anticipate the rise of more expressive applications that leverage historical on-chain data through Storage Proofs.

While zero-knowledge technology is becoming faster and cheaper, the cost of generating Storage Proofs constantly to keep up with a chain's state remains a challenge. The profitability of such services will depend on the volume of queries generated by querying applications.

Despite the challenges, the significance of Consensus Proofs and Storage Proofs powered by zero-knowledge technology cannot be overstated. We are excited to see how these technologies will be used to build a more trust-minimized multi-chain future.