Ming Wu

Posted on Jun 05, 2022. Read on Mirror.xyz.

Rethinking the Architecture of Decentralized Storage (1.)

Arweave and Filecoin provide permanent data storage in a decentralized way. They are places where users can store their data without worrying about manipulation by a malicious central party. However, the current user experience is still very poor. First, no matter how much data users upload, a transaction may need tens of minutes to complete and reach finality. This severely limits the types of applications that can be built on such systems. Some applications have to resort to Layer2 solutions to work around the issue, e.g., everPay on Arweave. Second, it is hard for such systems to integrate smoothly into the token ecosystems of existing Layer1 blockchains. For example, both Filecoin and Arweave have their own wallet and account address schemes. This raises a barrier for users who are only familiar with the MetaMask wallet and Ethereum account addresses. Furthermore, since the ledger in Arweave does not support (EVM-compatible) smart contracts, developers are forced to reinvent the wheel and rebuild token systems on top of Arweave's storage infrastructure through a mechanism called SmartWeave. Such efforts include RedStone, Verto, and everPay. However, compared with the already flourishing token ecosystems on existing blockchains like Ethereum, BSC, and Solana, it is questionable whether it makes sense to rebuild such things on a storage infrastructure designed to handle large amounts of non-financial data.

These issues stem mainly from the limitations of the ledger system inside these infrastructures. Take Arweave as an example: it consists of a decentralized ledger that uses consensus to process transactions and a storage network incentivized by that consensus. Each mining node contributes to consensus and storage at the same time, and the logic of both is mixed in the same codebase. The Arweave ledger employs standard Nakamoto consensus, similar to Bitcoin and Ethereum, so it is no wonder that it suffers from long transaction confirmation latency. This sounds ridiculous, since blockchain technology has evolved for many years and consensus latency has improved significantly, from an hour to seconds. Unfortunately, the Arweave ledger also has no built-in smart contracts, which significantly hinders the development of a token ecosystem on it. I believe this is because, when the Arweave project was created, its main target was storage rather than a token economy.

The transaction latency of different blockchains.
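
As a rough, back-of-the-envelope illustration of where a figure like "an hour" comes from, confirmation latency under Nakamoto-style consensus can be approximated as the block interval multiplied by the number of confirmations a user waits for. The sketch below uses Bitcoin's 10-minute target block interval and the common 6-confirmation convention; the "fast chain" numbers are purely hypothetical placeholders, not measurements of any real blockchain.

```python
def confirmation_latency_seconds(block_interval_s, confirmations):
    """Approximate wait until a transaction is buried under `confirmations` blocks.

    This is a simplification: it ignores mempool delays and block-time variance.
    """
    return block_interval_s * confirmations


# Bitcoin: ~600 s target block interval, 6 confirmations by convention.
bitcoin_latency = confirmation_latency_seconds(600, 6)  # 3600 seconds, i.e. one hour

# Hypothetical fast-finality chain (assumed values, for comparison only).
fast_chain_latency = confirmation_latency_seconds(2, 1)

print(f"Bitcoin-style: {bitcoin_latency / 60:.0f} minutes")
print(f"Hypothetical fast chain: {fast_chain_latency} seconds")
```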

How, then, can we address these issues? One way is for the Arweave team to redesign the ledger system, reimplement it in the same codebase, and roll it out through a hard-fork upgrade. However, building an advanced Layer1 blockchain ledger is far from trivial. It requires a huge amount of funding and an excellent engineering team. It is quite uncertain whether the Arweave team will have enough enthusiasm to invest in this direction. It also does not make sense to rely on a single team to deliver every kind of technological advance. So, does it sound hopeless? No, we have another option.

If we look at existing decentralized storage systems carefully, we can see two separate pieces of logic. One is the storage network, which is in charge of maintaining the huge amount of data; the other is the consensus-based ledger, which is responsible for rewarding the nodes in the storage network for their contribution to data maintenance. It is not necessary to implement these two pieces of logic monolithically in a single system. The ledger logic already exists in many blockchain systems in the industry. The storage network can therefore communicate with a smart contract on such a blockchain through a well-defined interface and let the smart contract pay incentives to the storage nodes by issuing ERC20 tokens.

The architecture of modular decentralized storage, with the storage network grafted onto an existing Layer1 blockchain.
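
Purely to make the separation of the two pieces of logic concrete, here is a minimal in-memory sketch. It is not a real implementation: `Ledger`, `InMemoryLedger`, and `StorageNode` are hypothetical names invented for this illustration, the "ledger" is a plain Python object standing in for a smart contract on some Layer1 chain, and rewarding a node as soon as it commits a digest is a deliberate simplification (a real system would need some form of storage proof).

```python
import hashlib
from dataclasses import dataclass, field
from typing import Protocol


class Ledger(Protocol):
    """Hypothetical interface the storage network expects from any ledger."""

    def commit(self, data_root: str, node_id: str) -> None: ...
    def balance_of(self, node_id: str) -> int: ...


@dataclass
class InMemoryLedger:
    """Stand-in for a token-issuing smart contract on an existing Layer1 chain."""

    reward_per_commit: int = 10  # assumed reward, in token units
    commitments: dict = field(default_factory=dict)  # data_root -> node_id
    balances: dict = field(default_factory=dict)     # node_id -> token balance

    def commit(self, data_root: str, node_id: str) -> None:
        if data_root in self.commitments:
            return  # already committed: no double reward
        self.commitments[data_root] = node_id
        self.balances[node_id] = self.balances.get(node_id, 0) + self.reward_per_commit

    def balance_of(self, node_id: str) -> int:
        return self.balances.get(node_id, 0)


@dataclass
class StorageNode:
    """Storage-network side: keeps the bulk data, sends only a digest to the ledger."""

    node_id: str
    ledger: Ledger
    blobs: dict = field(default_factory=dict)  # data_root -> raw bytes

    def store(self, data: bytes) -> str:
        data_root = hashlib.sha256(data).hexdigest()
        self.blobs[data_root] = data                   # large data stays off-chain
        self.ledger.commit(data_root, self.node_id)    # only the digest is committed
        return data_root
```

Because `StorageNode` depends only on the `Ledger` interface, swapping in a different ledger would not require touching the storage logic, which is the point of the modular design.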

I am not going to explain how to implement this in this article. It is conceivably doable. I just want to argue that it is the right architecture for a decentralized storage system and to show the advantages that this loosely coupled, modular design brings. First, with this modular architecture, the storage network no longer relies on any specific Layer1 blockchain. It can flexibly choose which blockchain ledger to connect to. It can therefore easily benefit from any technological progress on Layer1 blockchains in the coming years, e.g., higher transaction throughput and lower latency. In addition, the storage network can choose a blockchain ledger that has a prosperous token ecosystem and integrate into it smoothly, without the need for heavy cross-chain bridges. Second, the storage network can even connect to multiple Layer1 blockchain ledgers at the same time and commit different data to different ledgers. Because the commits to different ledgers are independent and run in parallel, data-commit throughput can scale linearly with the number of ledgers. Third, since the tokens used to incentivize the storage nodes are ERC20 tokens issued by the smart contract, users can interact with the decentralized storage system directly from a MetaMask wallet, as long as the Layer1 ledger is EVM-compatible. This will greatly expand the range of users that can be reached.
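
The multi-ledger point can be illustrated with a small sketch. It assumes one possible routing rule, hashing each piece of data and taking the digest modulo the number of ledgers, so that every commit is assigned to exactly one ledger. The function names and the ledger names are hypothetical, and the throughput helper encodes only the idealized case in which ledgers are fully independent.

```python
import hashlib


def pick_ledger(data_root, ledger_names):
    """Deterministically map a hex digest to one of the available ledgers."""
    index = int(data_root, 16) % len(ledger_names)
    return ledger_names[index]


def route_commits(blobs, ledger_names):
    """Group data digests into one independent commit queue per ledger."""
    queues = {name: [] for name in ledger_names}
    for blob in blobs:
        data_root = hashlib.sha256(blob).hexdigest()
        queues[pick_ledger(data_root, ledger_names)].append(data_root)
    return queues


def aggregate_throughput(per_ledger_tps):
    """Idealized total commit throughput: the sum over independent ledgers."""
    return sum(per_ledger_tps)


# Hypothetical ledger names, for illustration only.
ledgers = ["ledger-a", "ledger-b", "ledger-c"]
queues = route_commits([f"blob-{i}".encode() for i in range(9)], ledgers)
for name, roots in queues.items():
    print(name, len(roots), "commits")
```

Each queue can then be submitted to its ledger without coordinating with the others. The linear scaling holds only as long as the storage network itself is not the bottleneck.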