Stateless Clients: A Path to Decentralization in Ethereum - YQ

As Ethereum usage grows, running a full node demands more storage, compute, and bandwidth. Fewer people can afford to run one, which reduces the decentralization of the network. At the same time, Ethereum struggles to scale with transaction demand, leading to network congestion and high gas fees.

Stateless clients, proposed by Vitalik Buterin in 2017, offer a potential solution to both the decentralization and the scaling challenges facing Ethereum. The key idea is to reduce the storage and bandwidth required to run a full node, making it feasible for more people to participate and decentralize the network. This essay takes an in-depth look at how stateless clients work and at their potential benefits and drawbacks.

What is the Ethereum State?

To understand stateless clients, we first need to understand the concept of "state" in Ethereum. The Ethereum state refers to the current status of all accounts, contracts, balances, nonces, and storage in the Ethereum world. It can be thought of as a database that stores all relevant information about the Ethereum network at a given point in time.

The state is persisted in a Merkle Patricia trie, which is essentially a modified Merkle tree that stores key-value pairs. The root hash of this trie summarizes the entire state. After each new block, the state updates based on the transactions in that block. The new state root hash is included in the block header.
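
To make the idea of a root hash that "summarizes the entire state" concrete, here is a minimal sketch. It makes several simplifying assumptions: a plain binary Merkle tree instead of the hexary Merkle Patricia trie, SHA-256 instead of Keccak-256, and accounts reduced to a name and a balance. The names `state_root` and `leaf_hash` are hypothetical helpers, not part of any Ethereum client.

```python
import hashlib

EMPTY = hashlib.sha256(b"").digest()  # fixed padding node for odd-sized levels

def h(data: bytes) -> bytes:
    # SHA-256 is a stand-in; Ethereum itself uses Keccak-256.
    return hashlib.sha256(data).digest()

def leaf_hash(key: str, value: int) -> bytes:
    return h(f"leaf:{key}={value}".encode())

def state_root(state: dict[str, int]) -> bytes:
    """Root of a binary Merkle tree over the key-value pairs, sorted by key."""
    level = [leaf_hash(k, v) for k, v in sorted(state.items())]
    if not level:
        return EMPTY
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [EMPTY]
        # Hash each adjacent pair to form the next level up.
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

state = {"alice": 100, "bob": 50, "carol": 7}
root_before = state_root(state)

# A "block" that moves 10 from alice to bob changes two leaves,
# and therefore every hash on their paths up to the root.
state["alice"] -= 10
state["bob"] += 10
root_after = state_root(state)

print(root_before.hex())
print(root_after.hex())
```

The two printed roots differ even though only two balances changed, which is what lets a single 32-byte value in the block header commit to the whole state.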

As more accounts, contracts, and transactions are added over time, the Ethereum state grows larger and larger. Today, a synced full node needs on the order of 1TB of disk, with the state itself accounting for hundreds of gigabytes and growing by tens of gigabytes per year. This growing state underlies the issues with decentralization.

Why State Growth Causes Problems

The increasing Ethereum state size causes several key problems:

Longer sync times for new nodes - A new node that syncs by processing all historic state changes takes an extremely long time to catch up. Syncing from genesis currently takes multiple days, up to weeks, on consumer hardware. This is a major barrier to spinning up new nodes and letting more participants join the network.
Increased hardware requirements - Larger state requires more storage, memory, and processing power to store, access, and update. This blocks less well-resourced users from running nodes. At a minimum, running a fully synced Ethereum node now requires an SSD with 1-2TB of capacity. This is out of reach for many potential node operators.
More bandwidth usage - Blocks themselves carry only transactions and a state root, not the state, but a node that joins the network or falls behind must download state from its peers, and those peers must serve it. As the state grows, so does this sync traffic. More bandwidth translates to higher costs for node operators.
Slower block verification - Reading and updating a larger state makes block verification slower, limiting transaction throughput. Each transaction requires multiple storage reads and writes to update balances, nonces, contract state, etc. As the state grows, the trie gets deeper and less of it fits in memory, so each read or write costs more disk access and fewer transactions can be processed per second.
Permanent storage costs - Once data is added to the state, it must be stored forever. This creates unbounded state growth. There is currently no mechanism to actively delete old and unused state data. So the state retention costs increase indefinitely as long as Ethereum continues operating.

Stateless Clients Explained

Stateless clients provide a way to verify new blocks without needing access to the full Ethereum state. They rely on cryptographic proofs called "witnesses", which prove that the state values a block reads and writes are consistent with a known state root, so the client never needs the underlying state data.

Here's how stateless clients work at a high level:

The client stores only block headers and state roots, not full state data. Block headers contain metadata like the root hash of the state trie after that block is processed.
When verifying a new block, the client receives a "witness" along with the block. The witness is a set of Merkle proofs for the state values that the block's transactions touch.
The witness covers only the specific values needed to process those transactions, for example the account balances read or the contract storage slots updated.
The client checks the proofs against the last known state root, which authenticates the pre-state values, and then executes the transactions against them.
If execution succeeds and the resulting state root matches the one in the block header, the client accepts the block and adopts the new root. That root is then used to verify the next block.
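
The steps above can be sketched in a few dozen lines. This is an illustration under stated assumptions, not how any production client does it: the tree is a binary Merkle tree rather than a Merkle Patricia trie, SHA-256 stands in for Keccak-256, a "block" is a single balance change on one account, and `make_witness`, `root_from_witness`, and `verify_block` are hypothetical names.

```python
import hashlib

EMPTY = hashlib.sha256(b"").digest()  # fixed padding node for odd-sized levels

def h(data: bytes) -> bytes:
    # SHA-256 is a stand-in; Ethereum itself uses Keccak-256.
    return hashlib.sha256(data).digest()

def leaf_hash(key: str, value: int) -> bytes:
    return h(f"leaf:{key}={value}".encode())

def build_levels(state: dict[str, int]) -> list[list[bytes]]:
    """All levels of a binary Merkle tree over the sorted accounts, leaves first."""
    level = [leaf_hash(k, v) for k, v in sorted(state.items())]
    levels = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [EMPTY]
        levels.append(level)
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    levels.append(level)
    return levels

def state_root(state: dict[str, int]) -> bytes:
    return build_levels(state)[-1][0]

def make_witness(state: dict[str, int], key: str) -> dict:
    """Run by a stateful node: the value plus the sibling hashes on its path."""
    levels = build_levels(state)
    index = sorted(state).index(key)
    siblings = []
    for level in levels[:-1]:
        sib = index ^ 1
        siblings.append((level[sib], sib < index))  # (hash, sibling_is_left)
        index //= 2
    return {"key": key, "value": state[key], "siblings": siblings}

def root_from_witness(key: str, value: int, siblings: list) -> bytes:
    node = leaf_hash(key, value)
    for sib, sib_is_left in siblings:
        node = h(sib + node) if sib_is_left else h(node + sib)
    return node

def verify_block(prev_root: bytes, header_root: bytes, witness: dict, delta: int) -> bool:
    """Run by the stateless client, which holds only prev_root."""
    key, value, siblings = witness["key"], witness["value"], witness["siblings"]
    # 1. Authenticate the pre-state value against the last known root.
    if root_from_witness(key, value, siblings) != prev_root:
        return False
    # 2. Execute the (toy) transaction; balances may not go negative.
    new_value = value + delta
    if new_value < 0:
        return False
    # 3. Recompute the post-state root along the same path and compare
    #    it with the root claimed in the block header.
    return root_from_witness(key, new_value, siblings) == header_root

# Stateful proposer side
state = {"alice": 100, "bob": 50, "carol": 7, "dave": 0}
prev_root = state_root(state)
witness = make_witness(state, "alice")
state["alice"] -= 30                      # the block's only state change
header_root = state_root(state)

# Stateless client side: only roots, the witness, and the transaction
accepted = verify_block(prev_root, header_root, witness, delta=-30)
forged = dict(witness, value=1000)        # a lie about alice's balance
rejected = verify_block(prev_root, header_root, forged, delta=-30)
print(accepted, rejected)
```

The client never sees bob, carol, or dave, only the hashes on alice's path. A real witness has to cover every value a block touches and share the overlapping parts of their paths, which this single-account sketch leaves out.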

By using witnesses to verify state instead of storing the full state locally, stateless clients gain several advantages:

Very fast sync time - no need to replay historic state changes. A stateless client can sync almost instantly with just the block headers.
Low storage requirements - state roots are only 32 bytes. Instead of hundreds of GB of state, only block headers are needed.
Less bandwidth - there is no full state to download. Only blocks, headers, and witnesses are transferred.
Quick verification - witnesses contain only small, relevant subsets of the state. Only the accounts and storage slots a block touches are proved.
Easy light client support - light clients can easily verify proofs. The light client model is very compatible with stateless verification.

Challenges with Stateless Clients

While stateless clients enable some major benefits, there are also significant technical challenges to overcome:

Witness size - witnesses could be too large to transmit efficiently. If full Merkle proofs are used, the witness for a block may be much larger than the block itself.
Witness creation - generating optimal witnesses is complex for block proposers. Proposers must assemble the right proof fragments to verify each transaction.
No witness incentives - providing witnesses earns no direct rewards. Unlike block production, there is no built-in incentive structure for witness creation.
Temporary data - witnesses prove state at one point in time, requiring regeneration. Witnesses cannot be reused as the state progresses.
State storage - someone still needs to maintain the full state to produce witnesses. Stateless verification relies on stateful witness generation.
Complex applications - some contracts may rely on large state subsets, bloating witnesses. For example, contracts that update many storage slots per transaction.
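
A rough calculation shows why witness size is the first concern on this list. The sketch below assumes a balanced tree, 32-byte hashes, and proofs that carry every sibling hash at every level. The account count and the number of touched accounts are made-up inputs for illustration, not measurements of mainnet.

```python
HASH_BYTES = 32

def tree_depth(num_leaves: int, branching: int) -> int:
    """Levels needed for a balanced tree of this width to hold num_leaves."""
    depth, capacity = 0, 1
    while capacity < num_leaves:
        capacity *= branching
        depth += 1
    return depth

def merkle_proof_bytes(num_leaves: int, branching: int) -> int:
    """One proof: (branching - 1) sibling hashes at each level."""
    return tree_depth(num_leaves, branching) * (branching - 1) * HASH_BYTES

NUM_ACCOUNTS = 2**28      # hypothetical state size, about 268 million leaves
TOUCHED = 1_000           # hypothetical number of values one block touches

hexary = merkle_proof_bytes(NUM_ACCOUNTS, 16)   # width of the Patricia trie
binary = merkle_proof_bytes(NUM_ACCOUNTS, 2)
block_witness = TOUCHED * hexary                # upper bound, no shared nodes

print(hexary, binary, block_witness)
```

Under these assumptions a single hexary proof is a few kilobytes, and a block touching a thousand values needs a witness of a few megabytes before any deduplication. Real witnesses share the upper levels of the tree between proofs, so the true figure is lower, but the order of magnitude explains the concern.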

Possible Solutions

Researchers have proposed various solutions to address these challenges:

Verkle trees - special data structures to reduce witness sizes. Verkle trees use succinct cryptographic commitments to minimize proof size.
Witness caches - proposers could maintain recent witnesses to reuse. Caching witnesses that are likely to be relevant again amortizes creation costs.
Protocol incentives - reward mechanisms for providing useful witnesses. New incentive structures could compensate witness creation.
Intermediate state roots - track roots over time to avoid regenerating proofs. Maintaining partial roots could allow witness fragments to be reused.
State rent - require payments to maintain state long term, pruning unused state. Rent forces cleanup of stale storage to limit proof size.
Partitioned witness model - split state handling between proposers and verifiers. Have some dedicated proposer nodes generate witnesses.
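
To see why Verkle trees head this list, compare the per-level cost of a proof. In a Merkle tree, each level contributes all sibling hashes, so wider trees mean bigger proofs. With a vector commitment, each level contributes roughly one commitment regardless of width, so a wide and shallow tree becomes attractive. The sketch below assumes a balanced tree, 32-byte hashes and commitments, and a hypothetical state of 2**28 leaves. It ignores the opening proof that accompanies the commitments, which adds a roughly constant amount.

```python
ITEM_BYTES = 32  # assumed size of one hash or one commitment

def tree_depth(num_leaves: int, branching: int) -> int:
    depth, capacity = 0, 1
    while capacity < num_leaves:
        capacity *= branching
        depth += 1
    return depth

def merkle_path_bytes(num_leaves: int, branching: int) -> int:
    # Every level needs all (branching - 1) sibling hashes.
    return tree_depth(num_leaves, branching) * (branching - 1) * ITEM_BYTES

def vector_commitment_path_bytes(num_leaves: int, branching: int) -> int:
    # Every level needs one commitment; the opening proof is not counted.
    return tree_depth(num_leaves, branching) * ITEM_BYTES

N = 2**28  # hypothetical number of leaves
for width in (2, 16, 256):
    print(width, merkle_path_bytes(N, width), vector_commitment_path_bytes(N, width))
```

In this model, widening a Merkle tree from 16 to 256 children makes each proof roughly ten times larger, while the commitment-based path shrinks to about a hundred bytes.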

There are tradeoffs between these approaches, and further research is needed to find the best implementations. Fortunately, the rapid innovation happening in zero-knowledge cryptography could open up new possibilities for efficient stateless clients.

Potential Impact

If the technical obstacles can be overcome, stateless clients could significantly advance Ethereum:

Faster syncs and verification to support higher transaction throughput. Stateless validation could drastically speed up block processing.
Reduced resource requirements to run nodes, improving decentralization. Laptops and hobbyists could realistically run full nodes.
Better support for light clients like mobile wallets. State proofs are highly compatible with the light client model.
Smoother introduction of sharding, with stateless verification between shards. Cross-shard transactions can utilize efficient state proofs.
Ability to delete and prune old state data that is no longer useful. State growth can be actively managed instead of unbounded.
More flexibility for node operators to customize state based on needs. Nodes could tailor state retention policies to use cases.
Transition to a model where computation and bandwidth matter more than storage. Architecture shifts towards a more cloud-friendly model.

There are also some potential risks, like increased vulnerability to DDoS attacks and blockchain history only being reliably stored by a few node operators. However, cryptographic proofs could reduce these risks. Overall, stateless clients are one of the most promising approaches to overcoming Ethereum's current limitations.

Conclusion

Ethereum's growing state size poses challenges for decentralization as adoption increases. Stateless clients present a way out by enabling nodes to verify transactions without the full blockchain state. This could eventually allow mobile phones to run Ethereum nodes, greatly increasing decentralization.