With the Merge now behind us, Ethereum’s next big milestone will be the Shanghai hard fork, estimated to occur in March 2023. The upgrade is likely to include EIP-4844, an improvement to the protocol that could significantly boost the throughput of Layer 2s and bring transaction costs down to a fraction of a penny.
EIP-4844, also known as Proto-danksharding, is an exciting proposal for Ethereum that has the potential to deliver big scalability gains, especially for rollups, while also laying the groundwork that’s necessary to implement sharding. The proposal, named after two of its authors, Dankrad Feist and Protolambda, introduces several new components to Ethereum and begins to address the main scaling bottleneck faced today: data availability.
But what does the term data availability even mean? How does EIP-4844 achieve greater scalability? And what are the new changes that need to be introduced to make this a reality?
The current Ethereum roadmap relies on rollups to push scalability forward until sharding is implemented. Rollups are innovative in that they take storage and execution of a blockchain off-chain, reducing the burden on Ethereum’s Layer 1 (L1) while also realizing savings for execution and storing the ‘state’.
Basically, the sequencer of a rollup creates a single transaction on L1 that bundles the transactions, state roots and commitments from the Layer 2 (L2) and posts this information via calldata. Rollup nodes can therefore process lots of transactions off-chain and then record the result on the blockchain with just one on-chain transaction.
Rollups are becoming increasingly popular because of the much cheaper gas fees as compared to mainnet. But while the transaction costs on the cheapest rollups are currently under 10 cents, the fees are not as low as they could be. L2s can do even better!
As rollups become even more popular, vast amounts of data will be generated, causing the size of the blockchain to grow. This data includes the balances of accounts and the state and code of different smart contracts, all of which needs to be available so that anyone can verify that the rollup is functioning as it should, all account balances are correct, the off-chain system is solvent, and so on.
Since only one honest party is needed for rollups to be secure, the data needs to be posted onto Ethereum’s L1 so it can be accessed or downloaded by all those who need it in ample time, allowing anyone to reconstruct the correct state of a rollup. However, the data burden will eventually make it more difficult to run a node, which reduces decentralization.
Right now, rollups post data on L1 to make it easier for new nodes joining the network to derive the chain, but posting that data can be expensive. Rollup operators post data onto L1 through calldata (which refers to all data passed to a smart contract at the time of execution), and this accounts for 90% or more of the transaction costs on L2s.
Any kind of data can be passed as calldata, including the data required to execute other transactions, but the Ethereum Virtual Machine (EVM) is not built to process rollup data efficiently or at low cost. Rollups use this method simply to utilize L1 as storage, but gas is charged for using calldata because the EVM assumes that this data will be processed by a smart contract, which it isn’t.
Techniques such as compression have been implemented to reduce the size of calldata, but this is just a short-term win. The histogram below shows that calldata per block most often falls in the 20 to 50 kB range. Given that calldata is charged at 16 gas per non-zero byte (and 4 gas per zero byte), the costs of transacting on rollups can only be reduced so much as the amount of data and activity on L2s continues to grow.
Source: Dune Analytics
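To make the calldata pricing concrete, here is a minimal sketch of how the data cost of a rollup batch could be estimated. It assumes the post-EIP-2028 prices of 16 gas per non-zero byte and 4 gas per zero byte, and it ignores the base transaction cost and any execution gas. The batch contents and the use of zlib are purely illustrative and are not how any particular rollup encodes its data.

```python
import zlib

# Calldata prices since EIP-2028 (assumed here; execution gas and the
# 21,000 gas base transaction cost are deliberately left out).
GAS_PER_NONZERO_BYTE = 16
GAS_PER_ZERO_BYTE = 4


def calldata_gas(data: bytes) -> int:
    """Return the gas charged for posting `data` as calldata."""
    zero_bytes = data.count(0)
    nonzero_bytes = len(data) - zero_bytes
    return zero_bytes * GAS_PER_ZERO_BYTE + nonzero_bytes * GAS_PER_NONZERO_BYTE


# Hypothetical, highly repetitive batch used only to show the effect of compression.
raw_batch = b"transfer(alice,bob,1);" * 500
compressed_batch = zlib.compress(raw_batch)

print("raw bytes:", len(raw_batch), "gas:", calldata_gas(raw_batch))
print("compressed bytes:", len(compressed_batch), "gas:", calldata_gas(compressed_batch))
```

Compression shrinks the payload, but every remaining byte is still paid for at execution-layer prices, which is why it only goes so far.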
Why can’t we just make calldata cheaper?
This is exactly what a different proposal (EIP-4488) sets out to do, but this approach has one major downside. Every single byte of calldata must be stored by Ethereum nodes indefinitely. With the continued growth of L2s, this slowly but surely increases the storage requirements for running a node that syncs with the network, which hurts decentralization.
To overcome this problem, EIP-4844 makes the data that rollups post less expensive and in turn enables higher capacity for these scaling solutions.
In the current state of Ethereum, all nodes have to download and execute everything. With rollups, nodes only download the data and do not have to execute all transactions that have ever happened. Eventually, with Danksharding, nodes will only have to sample a part of the data to be confident that all of it is available.
Introduced in February 2022, EIP-4844 is a stepping stone towards introducing the infrastructure required for the final stage mentioned above, where nodes no longer have to download all the data. While the proposal adds more data capacity for rollups, it still requires everyone to download everything.
We can think of EIP-4844 as adding another separate layer (although strictly speaking it is part of the consensus layer) to Ethereum: the data layer, which permits improved data availability.
This data availability layer is strictly for the rollups that want to post data on to L1. The advantage of this modular approach to blockchains is that the layers shown in the diagram above can be scaled independently of each other, allowing different teams in Ethereum to specialize in their respective areas.
By introducing this dedicated data layer, as well as a new transaction format, rollups are provided with a cheaper alternative to submitting calldata. The immediate benefit of EIP-4844 to users is that the capacity of rollups can be increased by 10 to 100 times, reducing fees significantly without sacrificing decentralization.
Other than making some short-term scalability gains, the new blob transaction format also sets the stage for Danksharding without actually sharding the transactions.
But how does this new transaction type lead to lower rollup costs?
What is a Blob Transaction?
Rather than posting data to L1 as calldata, the new blob transactions allow data to be stored much more efficiently using large fixed-size blobs. To cap the growth in the size of blocks at a maximum of 2MB, there’s a limit of 16 blobs that can be included per block.
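The arithmetic behind these limits can be sketched as follows. The 16-blob limit is the figure quoted above; the blob layout of 4096 field elements of 32 bytes each comes from the EIP-4844 specification. The helper `blobs_needed` is hypothetical and simplifies things by treating every byte of a blob as usable payload, which slightly overstates real capacity because each field element must stay below the curve modulus.

```python
# Blob layout per the EIP-4844 specification.
FIELD_ELEMENTS_PER_BLOB = 4096
BYTES_PER_FIELD_ELEMENT = 32
BLOB_SIZE_BYTES = FIELD_ELEMENTS_PER_BLOB * BYTES_PER_FIELD_ELEMENT  # 128 KiB

# Per-block limit quoted in this article (a draft-era parameter that may change).
MAX_BLOBS_PER_BLOCK = 16
MAX_BLOB_BYTES_PER_BLOCK = MAX_BLOBS_PER_BLOCK * BLOB_SIZE_BYTES  # 2 MiB


def blobs_needed(payload_len: int) -> int:
    """Hypothetical helper: number of fixed-size blobs needed for a payload.

    Simplification: assumes all 32 bytes of each field element carry payload.
    """
    if payload_len < 0:
        raise ValueError("payload length must be non-negative")
    return -(-payload_len // BLOB_SIZE_BYTES)  # ceiling division


print("blob size in bytes:", BLOB_SIZE_BYTES)
print("max blob bytes per block:", MAX_BLOB_BYTES_PER_BLOCK)
print("blobs for a 300 kB batch:", blobs_needed(300_000))
```

Because blobs are fixed-size, a rollup pays for whole blobs, so a batch that barely spills into an extra blob pays for the unused remainder too.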
A blob transaction is just a transaction that pays the fee and includes a commitment to the blob, which makes it possible to efficiently prove that any piece of data belongs to that blob. The data itself is separated into something called a ‘sidecar’. The data then gets posted through the data layer rather than as part of L1 execution, which means lower fees for users since blob space is not as expensive as L1 block space.
Similar to EIP-4444, which is a mechanism for pruning historical data from Ethereum nodes, EIP-4844 basically says that the data only needs to be available for a sufficient amount of time (one month) for honest actors to join the network, get the full state and challenge the sequencer, after which it can be pruned.
So while introducing this new transaction type may increase the average block size, the blob data doesn’t need to be stored forever by the beacon nodes, since the main purpose of the data is to support fraud proofs. Long-term storage can be left to other decentralized protocols.
While part of the beacon chain and fully downloaded by all consensus nodes, these blobs are not accessible from the EVM (only commitments to the blobs are). Essentially, future sharding will only require changes to the beacon node/consensus layer, while freeing up the execution layer to work on other initiatives in parallel.
The sidecar has a different life cycle than calldata since it is stored in the consensus layer and nodes can prune it after a certain amount of time. After the dispute period has passed, a standard node has no reason to keep that data, so it can be pruned. This is one of the main reasons blob space is priced cheaper than Ethereum’s block space. The blob sidecar eventually ends up with L2 verifiers, who use it to sync the L2 and permissionlessly reconstruct the chain.
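The sidecar life cycle can be pictured with a small in-memory sketch. Everything here is hypothetical: `SidecarStore`, its methods and the 30-day window (standing in for the "one month" mentioned earlier) are illustrative names and values, not part of any Ethereum client. The point is simply that a node keeps the commitment-to-blob mapping for a bounded time and then drops the blob.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# Assumed retention window, standing in for the "one month" quoted in the text.
RETENTION_SECONDS = 30 * 24 * 60 * 60


@dataclass
class SidecarStore:
    """Hypothetical store keeping blob sidecars for a limited time only."""

    retention: int = RETENTION_SECONDS
    _blobs: Dict[bytes, Tuple[int, bytes]] = field(default_factory=dict)

    def put(self, commitment: bytes, blob: bytes, now: int) -> None:
        """Store a blob under its commitment, remembering when it arrived."""
        self._blobs[commitment] = (now, blob)

    def get(self, commitment: bytes) -> Optional[bytes]:
        """Return the blob if it is still held, otherwise None."""
        entry = self._blobs.get(commitment)
        return entry[1] if entry is not None else None

    def prune(self, now: int) -> int:
        """Drop every blob older than the retention window; return the count."""
        expired = [c for c, (seen, _) in self._blobs.items()
                   if now - seen > self.retention]
        for commitment in expired:
            del self._blobs[commitment]
        return len(expired)


store = SidecarStore()
store.put(b"commitment-1", b"rollup batch data", now=0)
print(store.get(b"commitment-1") is not None)   # still available inside the window
print(store.prune(now=RETENTION_SECONDS + 1))   # number of blobs pruned after it
```

Anyone who needs the data for longer, such as an L2 verifier or an archival service, would have to copy it out before the window closes.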
Why is blob space not as expensive as L1 block space?
One major change to Ethereum with EIP-4844 is the introduction of a multi-dimensional fee market based on EIP-1559. With the addition of the new blob transaction type, there are now two resources that will have separate gas prices and limits: gas and blobs.
As a result, these distinct resources will be priced independently, leading to separate fee markets for data and execution. Effectively, this means that blob transactions will no longer compete for gas with other transactions on Ethereum’s L1. That competition is what causes inefficient pricing for rollups today when they submit calldata.
The blob fee adjustment mechanism will work similarly to EIP-1559, targeting an average of 1MB of blob data per block (up from just 50-100 kB of data that Ethereum blocks can carry currently). And since rollups are expected to be the main users of this separate fee market, the costs of transacting should drop massively, which should also foster competition between different rollup solutions.
But to introduce this new transaction type, a ceremony must take place for the new cryptography that enables commitments to blobs: the KZG ceremony.
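As a rough illustration of a separate, EIP-1559-style fee market for blobs, the sketch below raises the blob base fee when a block carries more than the target number of blobs and lowers it when it carries fewer. This is not the formula in the EIP itself, which defines its own update rule. The target of 8 blobs is derived from the 1MB figure above, the maximum of 16 from the 2MB limit, and the 1/8 maximum change per block is borrowed from EIP-1559 as an assumption.

```python
# Parameters derived from the figures in this article (assumptions, not spec values).
TARGET_BLOBS_PER_BLOCK = 8    # about 1MB of blob data
MAX_BLOBS_PER_BLOCK = 16      # about 2MB of blob data
MAX_CHANGE_DENOMINATOR = 8    # EIP-1559 style: at most +/- 12.5% per block
MIN_BLOB_BASE_FEE = 1


def next_blob_base_fee(base_fee: int, blobs_used: int) -> int:
    """Illustrative EIP-1559-style update of the blob base fee."""
    if not 0 <= blobs_used <= MAX_BLOBS_PER_BLOCK:
        raise ValueError("blobs_used is outside the per-block limit")
    delta = (base_fee * (blobs_used - TARGET_BLOBS_PER_BLOCK)
             // TARGET_BLOBS_PER_BLOCK // MAX_CHANGE_DENOMINATOR)
    return max(MIN_BLOB_BASE_FEE, base_fee + delta)


def total_fee(gas_used: int, gas_price: int, blobs: int, blob_base_fee: int) -> int:
    """Two independent resources, each with its own price."""
    return gas_used * gas_price + blobs * blob_base_fee


fee = 1000
for used in (16, 16, 8, 0):
    fee = next_blob_base_fee(fee, used)
    print("blobs in block:", used, "next blob base fee:", fee)
```

The key property is that the blob price reacts only to blob demand, so a spike in ordinary L1 gas usage does not by itself make rollup data more expensive.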
The KZG Ceremony
Blob transactions are very similar to EIP-1559 transactions, but they rely on a cryptographic primitive that is new to the network: the KZG commitment scheme.
The KZG commitment scheme is required since the EVM can only access the commitment for each blob (instead of the full data). What this new cryptography does is allow you to submit small pieces of the data, rather than the entire blob, to a fraud proof and convince the verifier that those pieces match the data that was committed to.
The KZG ceremony will kick off during Devcon 6 in a week's time to generate the parameters required for this new cryptography. Ceremonies like this, also known as a trusted setup, have been used in the past to add consensus or privacy improvements to a protocol.
The process basically involves a group of contributors who each generate a secret and mix it into a shared set of parameters. The final output is then used every time the cryptographic protocol is run. For the final output to be secure, just one contributor needs to be honest and discard their secret rather than publish it.
The KZG ceremony aims to have the largest number of contributors out of any trusted setup by making it possible to participate from anywhere in the world from your browser, instead of requiring a complicated software package.
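The mixing step can be illustrated with a toy "powers of tau" sketch. The real ceremony works over the BLS12-381 elliptic curve groups. Here, integers modulo a small prime stand in for those groups so the example runs with the standard library alone, which means it offers no actual security. The names `initial_setup`, `contribute` and `expected_setup` are hypothetical.

```python
import secrets

# Toy group: integers modulo a Mersenne prime (a stand-in for BLS12-381, NOT secure).
P = 2**61 - 1
G = 3            # fixed base element
ORDER = P - 1    # exponents can be reduced modulo the group order


def initial_setup(n_powers: int) -> list:
    """Start from tau = 1, so every entry G^(tau^i) is simply G."""
    return [G] * n_powers


def contribute(srs: list, secret: int) -> list:
    """Mix a contributor's secret s in: entry i becomes entry_i^(s^i).

    If the entries encoded tau before, they encode tau * s afterwards.
    """
    return [pow(elem, pow(secret, i, ORDER), P) for i, elem in enumerate(srs)]


def expected_setup(tau: int, n_powers: int) -> list:
    """What the setup would look like if someone knew tau directly."""
    return [pow(G, pow(tau, i, ORDER), P) for i in range(n_powers)]


srs = initial_setup(4)
alice_secret = secrets.randbelow(ORDER - 2) + 2
bob_secret = secrets.randbelow(ORDER - 2) + 2
srs = contribute(srs, alice_secret)
srs = contribute(srs, bob_secret)

# The combined secret is the product of every contribution.
print(srs == expected_setup(alice_secret * bob_secret, 4))
```

The final parameters encode the product of all secrets, so reconstructing it requires every contributor's secret. That is why one honest participant who discards theirs is enough.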
After Devcon 6, there’ll be a two-month public participation period where anyone can add their randomness to the ceremony and make the setup as close as possible to being trustless. Once the final output is ready, Ethereum clients will implement it ready for the Shanghai upgrade in March 2023.
EIP-4844 kick-starts some major changes to Ethereum and introduces most of the groundwork needed to implement Danksharding, such as adding a new data layer, a new transaction format, a multi-dimensional fee market, and new cryptography.
Taken together, these changes reduce the data costs associated with rollups and open up big gains for scalability. But Proto-danksharding is just a starter for Ethereum’s scalability, with Danksharding being the main course where a lot of long-term improvements come into play.
Where next for EIP-4844? The devnet is currently running and can be accessed here. Following the devnet, a testnet will soon be launched to illustrate how it works in practice.
Once any issues have been ironed out, the EIP-4844 update is expected to go live in March 2023. So in just under six months’ time, rollup users should be enjoying the benefits of much lower fees, supporting Ethereum’s scaling effort while Danksharding is being worked out.