Vitalik: Deeper dive on cross-L2 reading for wallets and other use cases - Cookies Research

Article TL;DR

Make it easier to read L1 from L2, L2 from L1, or L2 from another L2

Necessary to implement asset / keystore separation architecture
Can be used to optimize reliable cross-L2 calls

Goal

When L2s become more mainstream → Users will have assets across multiple L2s (and possibly L1)

When smart contract wallets become mainstream → Keys needed to access some account are going to change over time (old keys will no longer be valid)

Once both of the above occur → User needs a way to change the keys that can access many accounts living in many different places → Without making an extremely high no. of txns

In other words: a user controls smart contract wallets (accounts) on several networks with the same keys → An architecture is needed that lets them change those keys everywhere without creating a complex UX

Need a way to handle counterfactual addresses which are:

Addresses that have not yet been ‘registered’ in any way on-chain → But need to receive and securely hold funds
Needed by everyone: When using Ethereum for the first time → A user generates an ETH address that others can transfer assets to → This address is not registered on-chain yet
Registering on-chain: Requires paying txn fees → Which requires the wallet to already hold some ETH
- We can think of this as a sort of ‘kick-start’ mechanism → The address gets registered once it starts making txns → Active wallet instead of passive wallet (purely receiving, not making txns and paying gas fees)
All EOAs start off as counterfactual addresses

Possible for smart contract wallets to be counterfactual addresses

CREATE2: Allows an ETH address to be filled only by a smart contract whose code matches a particular hash
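A minimal sketch of why a CREATE2 address can receive funds before any code exists there: the address is derived from the deployer, a salt, and the hash of the initcode, so only that exact code can ever be deployed to it. The factory address, salt, and initcode bytes below are hypothetical placeholders. Note the loud assumption: Ethereum uses Keccak-256, while this sketch uses `hashlib.sha3_256` as a stand-in (the two differ in padding), so the printed value is not a real Ethereum address.

```python
import hashlib

def stand_in_hash(data: bytes) -> bytes:
    # ASSUMPTION: stand-in only. Ethereum uses Keccak-256, which is NOT the
    # same function as hashlib.sha3_256, so outputs here are illustrative.
    return hashlib.sha3_256(data).digest()

def create2_address(deployer: bytes, salt: bytes, initcode: bytes, h=stand_in_hash) -> bytes:
    """Last 20 bytes of h(0xff ++ deployer ++ salt ++ h(initcode))."""
    if len(deployer) != 20 or len(salt) != 32:
        raise ValueError("deployer must be 20 bytes and salt 32 bytes")
    return h(b"\xff" + deployer + salt + h(initcode))[12:]

factory = bytes.fromhex("11" * 20)   # hypothetical wallet factory address
salt = (0).to_bytes(32, "big")       # hypothetical salt
wallet_initcode = b"wallet code + initial verification key A"  # placeholder bytes

addr = create2_address(factory, salt, wallet_initcode)
other = create2_address(factory, salt, b"different code")
print(addr.hex())    # address that commits to wallet_initcode
print(addr != other) # different code can never occupy the same address
```

Because the initcode is part of the hash, the initial verification key is baked into the address, which is exactly why later key changes cannot be reflected in it.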

Challenges of Smart Contract Wallets

Possibility of access keys changing

Smart contract wallets have an address (unique identifier for the wallet) → Generated based on initcode → Initcode can only contain the initial verification key (password for wallet)

Current verification key (might be different from the initial verification key) → Stored inside wallet’s storage

Wallet storage is not automatically propagated to other L2s

User might have multiple addresses on different L2s → Some may be counterfactual addresses that the L2 does not know about yet (since they are not registered) → Changing the access keys for all of them becomes a challenge

Solution: Asset / Keystore Separation Architecture

2 main components:

Keystore Contract

Can be on L1 or on an L2

Stores verification key for all wallets owned by user

Stores rules for changing key
Wallet Contract

Exist on L1 and on many L2s

Read cross-chain to retrieve the verification key stored in the keystore contract

Asset/Keystore Separation Architecture

Summary: User has a main contract (keystore contract) that holds the verification key and the rules for changing it, and separate contracts (wallet contracts) on different chains that read across chains to get the current key from the main contract

Implementing Asset / Keystore Separation Architecture

Light version: Check only to update keys

Each wallet stores the verification key locally

Each wallet contains a function that can be called to check a cross-chain proof of the keystore’s current state → Update its locally stored verification key to match

When a wallet is used for the first time on a specific L2: Necessary to call the function to get current verification key from keystore

Upside:

(a) Minimizes use of cross-chain proofs (expensive)

(b) Since all funds can only be spent with current keys → Wallet security is maintained
Heavy version: Check for every txn

Cross-chain proof showing key currently in keystore is necessary for every txn

Upside:

(a) Less systemic complexity

(b) Keystore updating is cheap

Downside:

(a) Expensive per txn

(b) Not easily compatible with ERC-4337 → Does not currently support cross-contract reading of mutable objects during validation

Summary:

Light version checks a cross-chain proof only when a wallet's key needs updating, reducing the reliance on cross-chain proofs but incurring higher gas costs for key changes (every initialized wallet has to be updated)
Heavy version checks the keystore for every transaction, which is cheaper for keystore updates but requires more engineering effort to optimize cross-chain proof costs and may face compatibility challenges with certain standards
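The trade-off between the two versions can be sketched with a toy model. Everything here is an assumption made for illustration: the "state root" is just a SHA-256 hash of the key, `verify_cross_chain_proof` stands in for a real (expensive) proof check, and comparing raw keys stands in for signature verification.

```python
import hashlib

def keystore_root(key: bytes) -> bytes:
    # Stand-in for the keystore chain's state root: a hash of the current key.
    return hashlib.sha256(key).digest()

def verify_cross_chain_proof(claimed_key: bytes, trusted_root: bytes) -> bool:
    # Stand-in for an expensive cross-chain proof verification.
    return keystore_root(claimed_key) == trusted_root

class LightWallet:
    """Stores the key locally; a proof is checked only when syncing."""
    def __init__(self):
        self.local_key = None
        self.proofs_checked = 0

    def sync_key(self, claimed_key: bytes, trusted_root: bytes) -> None:
        self.proofs_checked += 1
        if not verify_cross_chain_proof(claimed_key, trusted_root):
            raise ValueError("invalid keystore proof")
        self.local_key = claimed_key

    def authorize(self, signer_key: bytes) -> bool:
        return self.local_key is not None and signer_key == self.local_key

class HeavyWallet:
    """Stores nothing; every txn carries a proof of the current key."""
    def __init__(self):
        self.proofs_checked = 0

    def authorize(self, signer_key: bytes, claimed_key: bytes, trusted_root: bytes) -> bool:
        self.proofs_checked += 1
        return verify_cross_chain_proof(claimed_key, trusted_root) and signer_key == claimed_key

old_key, new_key = b"key-A", b"key-B"
light, heavy = LightWallet(), HeavyWallet()
light.sync_key(old_key, keystore_root(old_key))  # first use on this L2

root = keystore_root(new_key)                    # key was rotated in the keystore
light_stale = light.authorize(old_key)           # light wallet has not been updated yet
heavy_rejects_old = not heavy.authorize(old_key, old_key, root)
light.sync_key(new_key, root)                    # the per-wallet update txn
for _ in range(3):
    light.authorize(new_key)
    heavy.authorize(new_key, new_key, root)
print(light.proofs_checked, heavy.proofs_checked)
```

In this model the light wallet checks a proof only at sync time, while the heavy wallet checks one on every call, and the light wallet keeps honoring its locally stored key until its update function is called.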

What Does Cross-chain Proof Look Like?

Scenario: Keystore is on Linea, wallet is on Kakarot

Full proof of the keys to the wallet consists of
- Proof proving current Linea state root → Given the current Ethereum state root that Kakarot knows
- Proof proving the current keys in the keystore → Given current Linea state root
2 challenges for implementation
- What kind of proofs to use
- How does the L2 learn the recent L1 (Ethereum) state root + How does the L1 learn the L2 state root → What is the delay for this
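The two-step proof above can be sketched with plain Merkle branches. This is a toy under stated assumptions: a binary SHA-256 tree stands in for Ethereum's and Linea's actual state tries, and the leaf contents and positions are made up.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def root_and_proof(leaves, index):
    """Return (root, sibling path) for leaves[index]; len(leaves) must be a power of two."""
    layer = [h(leaf) for leaf in leaves]
    proof = []
    while len(layer) > 1:
        proof.append(layer[index ^ 1])  # sibling at this level
        layer = [h(layer[i] + layer[i + 1]) for i in range(0, len(layer), 2)]
        index //= 2
    return layer[0], proof

def verify(root, leaf, index, proof):
    node = h(leaf)
    for sibling in proof:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root

# Linea state (toy): the keystore entry is leaf 1.
linea_leaves = [b"account-0", b"keystore:key-A", b"account-2", b"account-3"]
linea_root, key_proof = root_and_proof(linea_leaves, 1)

# Ethereum state (toy): the Linea state root is stored at leaf 2.
eth_leaves = [b"contract-0", b"contract-1", linea_root, b"contract-3"]
eth_root, linea_proof = root_and_proof(eth_leaves, 2)

def verify_keystore_key(eth_root, linea_root, linea_index, linea_proof,
                        key_leaf, key_index, key_proof):
    # Step 1: Linea root is in the Ethereum state the wallet chain knows.
    # Step 2: the key is in the Linea state committed to by that root.
    return (verify(eth_root, linea_root, linea_index, linea_proof)
            and verify(linea_root, key_leaf, key_index, key_proof))

ok = verify_keystore_key(eth_root, linea_root, 2, linea_proof,
                         b"keystore:key-A", 1, key_proof)
forged = verify_keystore_key(eth_root, linea_root, 2, linea_proof,
                             b"keystore:key-EVIL", 1, key_proof)
print(ok, forged)
```

The wallet on Kakarot only needs to trust `eth_root`; everything else is supplied by the user and checked against it.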

Potential Proof Schemes

Merkle proofs
General-purpose zk-SNARKs
Special-purpose proofs (e.g. with KZG)
Verkle proofs (between KZG and zk-SNARKs for infrastructure workload and cost)
No proofs and rely on direct state reading

Evaluation of Proof Schemes

Aggregation <> Cross-chain Proofs

Aggregation: Aggregate all proofs supplied by users within each block into a big meta-proof that combines all of them

Possible for SNARKs and KZG

Not practical for Merkle branches (they can be combined, but it is not worth the cost)

Aggregation is only worth it when the scheme has a substantial number of users → Realistically it’s okay for v1 to leave aggregation out

Direct State Reading

Only for L2 reading L1
Modify L2s → Let them make static calls to contracts on L1 directly
Can be done with an opcode or precompile → Allows calls into L1 → Provide destination address, gas, and calldata → Return output
- This essentially allows the L2 to understand what the L1 state is by specifying the information that they require
Static calls → Cannot change any L1 state
If keystore is on L1 + L2s integrate L1 static-call functionality → No proofs are required at all
If L2s don’t integrate L1 static-calls, or if the keystore is on an L2 (which it may eventually have to be, once L1 gets too expensive for users to use) → Proofs will be required
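A rough model of what such a static-call precompile might look like from the L2 side. All names here (`l1_static_call`, `L1_CONTRACTS`, the flat gas charge, the keystore lookup) are hypothetical; no real L2 interface is being described. The read-only guarantee is modelled by handing the callee a `MappingProxyType` view of storage.

```python
from types import MappingProxyType

L1_CALL_COST = 2_000  # hypothetical flat gas charge per call

def keystore_get_key(storage, calldata: bytes) -> bytes:
    # Hypothetical L1 keystore "contract": look up the key for a user id.
    return storage.get(calldata, b"")

# Toy L1 state: address -> (code, storage). Purely illustrative.
L1_CONTRACTS = {
    "0xKeystore": (keystore_get_key, {b"alice": b"key-A"}),
}

def l1_static_call(destination: str, gas: int, calldata: bytes) -> bytes:
    """Provide destination address, gas, and calldata; get the output back."""
    if gas < L1_CALL_COST:
        raise RuntimeError("out of gas")
    code, storage = L1_CONTRACTS[destination]
    # Static call: the callee only sees a read-only view of L1 storage.
    return code(MappingProxyType(storage), calldata)

current_key = l1_static_call("0xKeystore", gas=5_000, calldata=b"alice")
print(current_key)
```

With the keystore on L1, a wallet on the L2 would call something like this during validation instead of verifying a proof.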

How Does L2 Learn the Recent Ethereum State Root?

All L2s have some functionality to access the recent L1 state
This functionality is needed to process messages coming in from L1 to L2 (most notably deposits)
If an L2 has a deposit feature → Use that L2 as-is to move L1 state roots into a contract on L2
- Have a contract on L1 call the BLOCKHASH opcode → Pass it to L2 as deposit message
Full block header can be received + state root extracted on the L2 side
It is, however, better for every L2 to have an explicit way to access either the full recent L1 state or recent L1 state roots directly
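The deposit-based route can be sketched as follows: the L2 contract trusts only the block hash delivered by the deposit message, then accepts a user-supplied header if it hashes to that value and reads the state root out of it. Assumptions: a real header is RLP-encoded and hashed with Keccak-256; here a JSON dict and SHA-256 are stand-ins, and the header fields are made up.

```python
import hashlib
import json

def block_hash(header: dict) -> bytes:
    # Stand-in for keccak256(rlp(header)); JSON + SHA-256 is an assumption.
    return hashlib.sha256(json.dumps(header, sort_keys=True).encode()).digest()

def state_root_from_header(header: dict, trusted_block_hash: bytes) -> str:
    """Accept the header only if it matches the hash received via deposit."""
    if block_hash(header) != trusted_block_hash:
        raise ValueError("header does not match the deposited block hash")
    return header["state_root"]

# L1 side: a contract reads BLOCKHASH and sends it to L2 as a deposit message.
l1_header = {"number": 100, "parent_hash": "0xabc", "state_root": "0xroot100"}
deposited_hash = block_hash(l1_header)

# L2 side: anyone can supply the full header; only the hash is trusted.
root = state_root_from_header(l1_header, deposited_hash)
print(root)
```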

Methods for L2s to Read L1s

Main challenge with optimizing how L2s receive recent L1 state roots → Simultaneously achieving safety and low latency

L2s implement ‘direct reading of L1’ functionality in a lazy way → Only reading finalized L1 state roots → Delay will normally be 15 mins
- In extreme case of inactivity leaks → Delay can be several weeks
L2s can be designed to read much more recent L1 state roots
- But if L1 reverts (which can happen during inactivity leaks, even with single slot finality) → L2 needs to be able to revert as well → Technically challenging from a software engineering perspective
- Optimism has this capability
Use deposit bridge to bring L1 state roots into L2
- Simple economic viability might require a long time between deposit updates
Oracles are not acceptable
- Wallet key management: Very security-critical low-level functionality
- Should depend on at most a few pieces of very simple, cryptographically trustless low-level infrastructure

Methods for L1s to Read L2s

Optimistic rollups: State roots take 1 week to reach L1
- Because of fraud proof delay
zk-rollups: State roots take a few hours to reach L1
- Still slow due to proving times and economic limits
Pre-confirmations from sequencers, attesters, etc.
- Not an acceptable solution for L1 reading L2
- Level of security of L2 → L1 communication must be absolute
- Only state roots that L1 should trust → State roots that have been accepted as final by L2’s state-root-holding contract on L1

Duration for L1 <> L2 Communication

Some of the methods mentioned above for trustless cross-chain operations are unacceptably slow for many DeFi use cases → These need faster bridges with more imperfect security models
For the use case of updating wallet keys → Longer delays are more acceptable → It’s not the txns that are getting delayed by hours → It’s the key changes
- Just have to keep the old keys around longer
If user is changing keys because the keys are stolen → There is a significant period of vulnerability → Can be mitigated by having a freeze function
Best latency-minimizing solution → For L2s to implement direct reading of L1 state roots in an optimal way → Each L2 block (or state root computation log) contains a pointer to the most recent L1 block → If L1 reverts → L2 reverts as well
Keystore contracts should be placed either on mainnet or on L2s that are zk-rollups and can quickly commit to L1
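The "pointer to the most recent L1 block" idea can be illustrated with a toy reorg handler. The block structure and the `revert_l2` helper are hypothetical; a real node would track this inside its derivation pipeline.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class L2Block:
    number: int
    l1_block_hash: str  # pointer to the most recent L1 block this L2 block saw

def revert_l2(l2_chain, canonical_l1_hashes):
    """Keep L2 blocks up to the first one whose L1 pointer was reorged out."""
    kept = []
    for block in l2_chain:
        if block.l1_block_hash not in canonical_l1_hashes:
            break  # this block and everything built on it must revert
        kept.append(block)
    return kept

l2_chain = [
    L2Block(1, "l1-a"),
    L2Block(2, "l1-a"),
    L2Block(3, "l1-b"),
    L2Block(4, "l1-c"),
]

# L1 reorgs: blocks l1-b and l1-c are replaced by l1-b2.
canonical_l1 = {"l1-a", "l1-b2"}
surviving = revert_l2(l2_chain, canonical_l1)
print([b.number for b in surviving])
```

Only L2 blocks that depend solely on still-canonical L1 blocks survive, which is the property that lets an L2 read very recent L1 state roots safely.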

For chains that hold wallets with keystores that are rooted on Ethereum / L2: How much connection to Ethereum is needed?

Answer: Not that much
Not limited to rollup → Wallets can be held on L3 / validium too
As long as keystores are either on L1 / zk-rollup
Requirement: Chain needs to have direct access to Ethereum state roots + Technical and social commitment to be willing to reorg if Ethereum reorgs + hard fork if Ethereum hard forks
Research problem: Identify to what extent it is possible for a chain to have this form of connection to multiple other chains
- Node operators and community will have double the technical and political dependencies
- Sounds similar to spillover effects from leveraging the same set of resources for multiple applications (similar to the concept of rehypothecation)
- End of the day: Can use the technique to connect to a few other chains → But at increasing cost

Preserving Privacy

If a keystore manages multiple wallets, we want to make sure that:
- It is not publicly known that those wallets are all connected to each other
- Social recovery guardians don’t learn what the other managed addresses are
A few issues
- Merkle proofs cannot be used → They do not preserve privacy
- KZG / SNARKs: Proof needs to provide a blinded version of the verification key without revealing location of verification key
- Aggregation: Aggregator should not learn the location in plaintext
  - It should receive blinded proofs
- Light version cannot be used
  - It creates a privacy leak
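  - Reason: In the light version, every initialized wallet has to be updated with its own on-chain txn after a key change → These txns are publicly visible
  - If many wallets get updated at the same time due to an update procedure → Timing leaks the information that those wallets are likely related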
- SNARKs: Proofs are information-hiding by default
  - Aggregator produces recursive SNARK to prove SNARKs → Currently, this process is quite slow
- Direct reading L1 from L2: Does not preserve privacy
