From CEX to CCEX with Summa Part 2 - 0x7ec1

This post was written by Enrico Bottazzi

Special thanks to Yi-Hsiu Chen (Coinbase), Shashank Agrawal (Coinbase), Stenton Mayne (kn0x1y), Michelle Lai and Kostas Chalkias (Mysten Labs) for review and discussion.

Part 1 introduces the main concepts behind the Summa protocol.

Part 2 dives into a full Proof of Solvency flow.

Part 2 - Summa Protocol

This section analyzes the flow of an Exchange performing Proof of Solvency using Summa.

The core idea to enable the transition from CEX to CCEX is to build ZK programs that enforce the constraints that define the Solvency of an Exchange. The proof generated by the Exchange running these programs is then verified either automatically by a smart contract or by a user.

Each step describes in detail the core cryptographic components of the protocol imposing these cryptographic constraints. When possible, benchmarks related to the performance of such components are also provided.

The actors involved in the Protocol are the Exchange, the users of the Exchange, and the Summa Smart Contract (SSC) (to be deployed on a EVM-compatible blockchain).

The Smart Contract, and therefore the protocol, supports any cryptocurrency beyond ETH or ERC20.

The protocol is made of 2 macro-phases:

AddressOwnership: In this phase, the Exchange is required to provide a signature proof that they control a specific set of addresses and submit it to the SSC. This phase happens asynchronously to a Proof of Solvency Round.
Proof of Solvency Round: In this phase, the Exchange needs to prove its solvency at a specific timestamp t

Leveraging zkSNARKs enforces computational integrity guarantee while protecting sensitive data of the Exchange which is used as input to the proofs. Summa is designed to protect Exchanges’ business intelligence data such as:

The number of its users
The individual balances of these users
The aggregated balances of any group of users
The aggregated balances of the whole user base, namely the total amount deposited on the Exchange
The pattern of changes of these data across time

In the following example, we’ll describe an Exchange performing¹ a full Proof of Solvency flow involving multiple currencies at once.

AddressOwnership

In this phase, the Exchange has to prove ownership of a certain set of addresses. This information is later used in the Proof of Solvency round run at t to infer Exchange owns 0x123 -> 0x123 owns 20 ETH at t -> Exchange owns 20 ETH at t

The Exchange needs to sign an off-chain arbitrary message like "these funds belong to XYZExchange" for each of these addresses, and then submit these signatures, together with the addresses and the message, to the SSC.

The SSC operates optimistically by storing the triples {signature, address, message} within its state without performing any verification of their correctness. Any external actor can verify the correctness of those signatures and, if anything wrong is spotted, kick off a dispute.

The ownership of an address can be proven only once and "reused" across any number of Proof of Solvency rounds (although providing it at each round would decrease the likelihood of a friend attack). If the Exchange moves the funds to a new address for any reason, this procedure has to be run again only for this new address.

This phase happens asynchronously to a Proof of Solvency round.

Up to now, only crypto addresses have been taken into account. But what if the Exchange is holding reserves in fiat currencies? In that case, the Proof of Ownership of these assets can still be carried out by, inevitably, having to trust the bank. In such a scenario, the bank would need to sign a certificate that attests that XYZExchange holds x$ in their bank. This certificate can be used during a Proof of Solvency Round (next section).

Proof of Solvency Round

In this phase, both the assets and the liabilities are snapshotted at a specific timestamp t to kick off a Proof of Solvency Round. Within a round, the Exchange needs to provide a ZK ProofOfSolvency that constrains their assets to be greater than their liabilities at t. Furthermore, the Exchange is required to generate a ProofOfInclusion for each user, which proves that the user has been accounted for correctly within the liabilities tree.

1. Snapshot

In order for a Proof of Solvency round to start, the Exchange has to snapshot the state of its liabilities and its assets at a specific timestamp t.

For the liabilities, it means building a Merkle Sum Tree² out of the database containing the users' entries at t . The logic of building the Merkle Sum Tree is the one previously described.

For the assets, it means fetching, from an archive node, the balances of the addresses controlled by the Exchange, as proven in AddressOwnership, at the next available block at t for each asset involved in the Proof of Solvency.

This operation happens entirely locally on the Exchange premises. No ZK program is involved at this step. No data is shared with the public.

Building the Merkle Sum Tree doesn’t require auditing or oversight. Any malicious operation that the Exchange can perform here, such as:

Adding users with negative balances
Excluding users
Understating users’ balances

will be detected when the Proof of Inclusion (step 3) is handed over to individual users for verification.

Note that the protocol doesn’t have to worry if the Exchange is adding fake users to the Merkle Sum Tree. Each user added to the tree would increase the liabilities of the Exchange which is against their interest. This is true as long as 1) the user balance is not negative and 2) the accumulated sum of the balances doesn’t overflow the prime field. These constraints are enforced at step 3.

2. Proof of Solvency

In order to prove³ their Solvency at time t, the Exchange needs to provide cryptographic proof that constraints the assets controlled by the Exchange at t to be greater than the liabilities at t.

It is necessary to avoid the liabilities denominated in a cryptocurrency being backed by assets denominated in a different currency. This may result in a solvency “only on paper”, that may crash due to a lack of liquidity or rate volatility. Because of this, each asset is compared against the total liabilities denominated in the same cryptocurrency.

The Proof of Solvency is generated leveraging the following ZK Circuit.

inputs

The private inputs penultimate_level_left_hash, penultimate_level_left_balances[], penultimate_level_right_hash and penultimate_level_right_balances[] represent the two nodes in the penultimate level of the Merkle sum tree and can be extracted from the Merkle Sum Tree data structure build in the previous step.
The public input ccex_asset_sums[] represents the amount of assets owned by the Exchange for each cryptocurrency that is part of the Round as per the assets Snapshot performed in the previous step.

constraints

Perform a hashing of the penultimate nodes of the tree to get the Root Node (root_hash and root_balances[]). root_balances[] is an intermediary value that represents an array of the balances stored in the Root Node. In particular, Summa uses Poseidon hashing, which is a very efficient hashing algorithm when used inside zkSNARKs.
Checks that the liability sums for each cryptocurrency in the root_balances[] are less than the respective ccex_asset_sums[] passed as input

In the example, the Exchange is generating a Proof of Solvency for multiple assets N_ASSETS, therefore the length of the arrays penultimate_level_left_balances[], penultimate_level_right_balances[], ccex_asset_sums[] , and root_balances[] is equal to N_ASSETS.

(public) output

root_hash of the Merkle Sum Tree

After the proof is being generated locally by the Exchange, it is sent for verification to the SSC along with its public inputs ccex_asset_sums[], root_hash and the timestamp.

SSC verifies the validity of the proof. On successful verification, the Contract stores the public inputs.

The immutability of a Smart Contract guarantees that people have consistent views of such information. If the same data were published on a centralized server, these would be subject to modifications from a malicious exchange. This attack is described in Broken Proofs of Solvency in Blockchain Custodial Wallets and Exchanges, Chalkias, Chatzigiannis, Ji - 2022 - Paragraph 4.4.

At this point, it's worth noting that no constraints on ccex_asset_sums[] are performed within the Circuit nor within the Smart Contract. Instead, Summa adopts an optimistic approach in which these data are accepted as they are. As in the case of AddressOwnership, external actors can kick off a dispute if the Exchange is trying to claim ownership of assets in excess of what they actually own.

Spotting a malicious exchange is very straightforward: it only requires checking whether the assets controlled by the Exchange Addresses at the next available block at timestamp t match ccex_asset_sums[].

3. Proof of Inclusion

Up to this point the Exchange has proven its solvency, but the liabilities could have been calculated maliciously. For example, an Exchange might have arbitrarily excluded "whales" from the liabilities tree to achieve dishonest proof of solvency.

Proof of Inclusion means proving that a user, identified by their username and balances denominated in different currencies, has been accounted for correctly in the liabilities. In practice, it means generating a ZK proof that an entry username -> balanceEth, balanceBTC, ... is included in a Merkle sum tree with a root equal to the one published onc-hain in the previous step.

The Proof of Inclusion is generated⁴ leveraging the following zk Circuit.

inputs

The private inputs username and balances[] represent the data related to the user whose proof of inclusion is being generated.
The private inputs path_indices[], path_element_hashes[] and path_element_balances[][] represent the Merkle Proof for the user leaf node.
The public input leaf_hash is generated by hashing the concatenation of username and balances[].

Note that it would have been functionally the same to expose username and balances[] as public inputs of the circuit instead of leaf_hash but that would have made the proof subject to private data leaks if accessed by an adversarial actor. Instead, by only exposing leaf_hash, a malicious actor that comes across the proof cannot access any user-related data.

constraints

For the first level of the Merkle Tree, leaf_hash and balances represent the current Node while path_element_hashes[0] and path_element_balances[0][] represents the sibling Node.
- Performs the hashing between the current Node and the sibling Node H(leaf_hash, balances[0], ..., balances[n], path_element_hashes[0], path_element_balances[0][0], path_element_balances[0][n]) to get the hash of the next Node. In particular, path_indices[0] is a binary value that indicates the relative position between the current Node and the sibling Node.
- Constrains each value in balances[] and path_element_balances[0][] to be within the range of 14 bytes to avoid overflow and negative values being added to the tree.
- For each currency i performs the sum between balances[i] of the current Node and the path_element_balances[0][i] of the sibling Node to get the balances of the next Node.
For any remaining level j of the Merkle Tree, the next Node from level j-1 represents the current node, while path_element_hashes[j] and path_element_balances[j][] represents the sibling Node
- Performs the hashing between the current Node and the corresponding sibling Node to get the hash of the next Node
- Constrains each balance of current Node and each balance of the corresponding sibling Node to be within the range of 14 bytes to avoid overflow and negative values being added to the tree.
- For each currency i perform the sum between the balances of the current Node and the balances of the sibling Node to get the balances of the next Node.

In the example, the Exchange is generating a Proof of Solvency for multiple assets (N_ASSETS). All the users' information is stored in a Merkle Sum Tree with height LEVELS. path_indices and path_element_hashes are arrays of length LEVELS. path_element_balances is a bidimensional array in which the first dimension is the LEVELS and the second is N_ASSETS .

(public) output

root_hash of the Merkle Sum Tree result of the last level of hashing.

The proof is generated by the Exchange and shared with each individual user. Nothing is recorded on a blockchain in this process. The proof doesn't reveal to the receiving user any information about the balances of any other users, the number of the users of the Exchange or even the aggregated liabilities of the Exchange.

The verification of the π of Inclusion happens locally on the user device. It involves verifying that:

The cryptographic proof is valid
The leaf_hash, public input of the circuit, matches the combination H(username, balance[0], ..., balance[n]) of the user with balances as snapshotted at t
The root_hash, public output of the circuit, matches the one published on-chain in step 3.

If any user finds out that they haven't been included in the Merkle Sum Tree, or have been included with an understated balance, a warning related to the potential non-solvency of the Exchange has to be raised, and a dispute should open.

The rule is simple: if enough users request a Proof of Inclusion and they can all verify it, it becomes evident that the Exchange is not lying or understating its liabilities. If just one user cannot verify their π of Inclusion, it means that the Exchange is lying about its liabilities (and, therefore, its solvency).

At the current state of ZK research, the user has to verify the correct inclusion inside the Merkle Sum Tree in each Proof of Solvency Round. An experimental feature using more advanced Folding Schemes, such as Nova, would allow users to verify their correct inclusion in any round up to the current round with a single tiny proof.

What makes for a good Proof of Solvency

Summa provides the cryptography layer required for an Exchange to run a Proof of Solvency. But that's not all; there are further procedures outside of the Summa protocol that determine the legitimacy of a Proof of Solvency process.

Incentive Mechanism

As explained in the Proof of Inclusion section, the more users verify their correct inclusion in the Liability Tree, the more sound this proof is.

If not enough users verify their Proof of Inclusions, a malicious Exchange can manipulate or discard the liabilities of users and still be able to submit a valid Proof of Solvency without being detected. The probability of this to happen is denoted as the failure probability.

The Failure Probability is common to any Proof of Solvency scheme as described by Generalized Proof of Liabilities, Chalkias and Ji - section 5. A finding of the paper is that within an Exchange of 150𝑀 users, only 0.05% of users verifying inclusion proofs can guarantee an overwhelming chance of detecting an adversarial Exchange manipulating 0.01% of the entries in the Liabilities Tree.

To reduce the Failure Probability, the Exchange is invited to run programs to incentivize users to perform the Proof of Inclusion Verification.

For example, the Exchange can provide a discount trading fee for a limited period of time for each user that successfully performs the Proof Verification. On top of that, the percentage of users that performed such verification can be shared with the public as a metric of the soundness of such a Proof of Solvency process.

Proof of Inclusion Retrieval

The finding related to Failure Probability described in the previous paragraph relies on the hypothesis that the Exchange doesn’t know which users will verify their inclusion proof in advance. Instead, if the Exchange knows this information, they could exclude from the Liabilities Tree those users who won’t verify their correct inclusion. This would lead to a higher failure probability. But how would the Exchange know?

If the process of retrieving the Proof of Inclusion is performed on demand by the user, for example, passing through the Exchange UI, the Exchange gets to know which users are actually performing the verification. If this process is repeated across many rounds, the Exchange can forecast with high probability the users who are not likely to perform verification.

A solution to this issue is to store the proofs on a distributed file system such as IPFS (remember that the proof doesn’t reveal any sensitive data about the Exchange’s user)

Users would fetch the data from a network of nodes. As long as these nodes don’t collude with the Exchange, the Exchange won’t know which proof has been fetched and which has not. An even more exotic solution is to rely on Private Information Retrieval techniques.

This solution necessitates that the Exchange generates all Proofs of Inclusion simultaneously. Even though this operation is infinitely parallelizable, it introduces an overhead for the Exchange when compared to the on-demand proof generation solution. A further cost for Exchange involves the storage of such proofs.

Frequency

Proof of Solvency refers to a specific snapshot in time. Even though the Exchange might result in solvent at t nothing stops them from going insolvent at t+1s. The Exchange can potentially borrow money just to perform the Proof of Solvency and then return it as soon as it is completed.

Increasing the frequency of Proof of Solvency rounds makes such attacks impractical. BitMEX performs it on a bi-weekly basis. While this is already a remarkable achievement, given the technology provided by Summa, this can be performed on a per-minute basis.

From a performance point of view, the main bottleneck is the creation and update of the Merkle Sum Tree. This process can be sped up by parallelization being performed on machines with many cores. Surprisingly, Prover time is not a bottleneck, given that proof can be generated in the orders of seconds (or milliseconds) on any consumer device.

Another solution to avoid such attacks is to enforce proof of solvency in the past. Practically, it means that the Exchange is asked to perform a proof of solvency round related to a randomly sampled past timestamp.

Independent Dispute Resolution Committee

The whole Proof of Solvency flow requires oversight on the actions performed by the Exchange at three steps:

When the Exchange is submitting the AddressOwnership proof, the validity of the signatures must be checked
When the Exchange is submitting the ProofOfSolvency, the validity of the asset_sums used as input must be checked
When the users verifies their ProofOfInclusion, the validity of the user data used as input must be verified.

The action of performing the first two verification might be overwhelming for many users. Instead, a set of committees (independent of Summa and any of the Exchange) might be assigned to perform such verification and raise a flag whenever malicious proof is submitted.

While the first two verifications can be performed by anyone, the last one can only be validated by the user that is receiving the proof itself, since he/she is the only one (beyond the Exchange) that has access to their user data. Note that the Exchange can unilaterally modify the users' data in their database (and even in the interface that the users interact with). Because of that, the resolution of a dispute regarding the correct accounting of a user within the liabilities tree is not a trivial task as described by Generalized Proof of Liabilities, Chalkias and Ji - section 4.2

A solution to this is to bind these data to a timestamped commitment signed by both the User and the Exchange. By signing such data, the user would approve its status. Therefore any non-signed data added to the liabilities tree can be unanimously identified as malicious.

UX

Once a user receives a Proof of Inclusion, there are many ways in which the verification process can be performed. For example, the whole verification can be performed by clicking a magic verify button. Given that the premise of a Proof of Solvency protocol is to not trust the Exchange, it is likely that a user won't trust the black-boxed API that the Exchange provides to verify the proof.

A more transparent way to design the verification flow is to allow the users to fork a repo and run the verification code locally. A similar approach is adopted by both BitMEX and Binance.

While this latter approach is preferred, it also may seem intimidating and time-consuming for many users.

A third way would be to have a user-friendly open-source interface (or, even better, many interfaces) run by independent actors that allow the verification of such proofs. In such a scenario, the soundness of the verification process is guaranteed by the auditability of the code and by the independent nature of the operator, without sacrificing the UX.

Alternatively, the verification function can be exposed as a view function in the Summa Smart Contract. In such case, the benefit would be twofold:

The code running the verification is public so everyone can audit it
There are many existing interfaces, such as Etherscan, that allow users to interact with the Smart Contract and call the function.

Conclusion

This blog post presented the detailed implementation of how an Exchange can start providing Proof of Solvency to their users as a first step towards a fully Cryptographically Constrained Exchange (CCEX). By doing so, the users of the Exchange can benefit from the privacy and performance of a Centralized Exchange (in terms of transaction settlement speed and close to zero fee), while still having cryptographic-based proof that their deposits are covered.

A follow-up blog post will provide a more practical tutorial on how to leverage Summa Backend to perform the operations described before.

The path toward establishing an industry-wide standard for proof of solvency requires the definition of a protocol that is agreed upon by Exchanges, Cryptographers, and Application Developers. The goal is to collaborate with Exchanges during a Beta program to bring Summa to production and, eventually, come up with a EIP to define a standard.

Complete this Google Form if your Exchange (or Custodial Wallet) is interested in joining the program.

Furthermore, if you are interested in sharing feedback or simply entering the community discussion, join the Summa Solvency Telegram Chat.

Summa is made possible because of contributions from JinHwan, Alex Kuzmin, Enrico Bottazzi.

Benchmark Notes:

All the benchmarks related to the round are provided considering an Exchange with 500k users performing a Proof of Solvency for 20 different cryptocurrencies. The benches are run on a MacBook Pro 2023, M2 Pro, 32GB RAM, 12 cores. All the benchmarks can be reproduced here running:

cd zk_prover

cargo bench
461.08s to build a Merkle Sum Tree from scratch. At any subsequent round, the process only requires leaf updates therefore the required time is significantly reduced.
460.20s to generate the proof of 1568 bytes. The proof costs 395579 gas units for onchain verification.
3.61s to generate the proof of 1632 bytes. The verification of the proof takes 6.36ms.