Introduction
With the high transaction fees on Ethereum, scalability has been top-of-mind for many cryptocurrency projects. A popular solution is to separate program execution from the Layer-1 consensus mechanism, by offloading the former to off-chain operators. The latest Ethereum Foundation roadmap proposed by Vitalik even portrays a world where the majority of transactions happen in “Layer 2” (L2) and are only periodically settled on the main Ethereum 2.0 chain.
Even though this idea is gaining traction, it isn’t new. In 2017, approaches like Plasma and state channels were conceived to improve scalability in the face of the increasing usage of dApps like CryptoKitties. But even with newer approaches like rollups, data availability remains a major challenge: separating computation from the consensus layer does allow greater transaction throughput, but it adds complexity to the smart contract execution model, particularly for composable contracts. Moreover, more scalability often comes at the cost of decentralization. [1]
In this post, I argue that data availability is the key consideration when it comes to scalability. I’ll start by defining the concept of data availability. I’ll then discuss in more detail how this problem relates to scaling blockchains. Finally, I’ll survey the solutions in use today, as well as some on the horizon.
What is “Data Availability”?
CIA. For most, the acronym conjures up the image of a well-dressed spy in some foreign locale, playing cloak-and-dagger. But CIA is also an acronym for a key tenet of information security: confidentiality, integrity, availability. Of those, availability is the most relevant today for blockchain scalability.
To understand what data availability means, we have to first define two important terms: state and state transition. Say you have a hand-written ledger where you keep track of who owes you money. What the ledger looks like at any single moment in time is called its “state.”
Say you just won a bet with a friend and want to add an entry to the ledger. Afterward, the ledger will have one additional entry and will therefore differ from before. Writing down a name and an amount is an example of a state transition: the step that moves us from one state to the next.
You may be wondering what this has to do with blockchains, but the system above roughly describes how a computer works at a low level. It reads some data from memory or storage, runs some program based on that data, and then outputs the result, perhaps by displaying it to the user or writing it to disk. In those terms, a computer program takes the prior state and performs a state transition, creating a new state as a result.
If we think about this (admittedly oversimplified) model of a computer, we see that both state and state transitions are essential. How can we run a “program” without data to operate on? And what is the point of a program that can’t move from one state to another (as in our simple ledger)?
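To make this concrete, here is a minimal sketch of the ledger example in Python (the names are mine, purely for illustration): the dictionary is the state, and applying an entry is a state transition.

```python
# A minimal sketch of the ledger example: the dict is the "state",
# and apply_entry is a "state transition" that produces the next state.

def apply_entry(state: dict, name: str, amount: int) -> dict:
    """Record that `name` owes us `amount` more, returning the new state."""
    new_state = dict(state)  # copy the prior state rather than mutating it
    new_state[name] = new_state.get(name, 0) + amount
    return new_state

ledger = {}                                  # the initial (empty) state
ledger = apply_entry(ledger, "Alice", 20)    # Alice owes us 20
ledger = apply_entry(ledger, "Bob", 5)       # Bob owes us 5
print(ledger)                                # {'Alice': 20, 'Bob': 5}
```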
Data Availability and Blockchain Scalability
One way to model a blockchain like Ethereum is as a distributed “world computer.” Instead of a single machine, blockchain networks consist of many machines that keep track of the same global state and agree on how transitions from one state to another should occur. Each new block in the chain is therefore based on consensus between the nodes operating the network. Each new block contains the new ledger state. It also contains the individual transactions that describe the state transition from the prior block to the current one. This way, other nodes can verify that the state transition and new state are valid. Otherwise, they reject it and the block won’t be added to the chain. [2]
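As a rough sketch (hypothetical types and rules, not Ethereum’s actual data structures), a full node’s verification loop looks something like this: re-apply every transaction in the block to its own copy of the prior state and accept the block only if it arrives at the state the block claims.

```python
# A rough sketch of full-node verification (hypothetical structures, not
# Ethereum's): re-execute every transaction against the prior state and
# accept the block only if the claimed new state matches.

def apply_tx(state: dict, tx: dict) -> dict:
    """Apply one transfer transaction, enforcing a 'no overdraft' rule."""
    sender, receiver, amount = tx["from"], tx["to"], tx["amount"]
    if state.get(sender, 0) < amount:
        raise ValueError("invalid transaction: insufficient balance")
    new_state = dict(state)
    new_state[sender] -= amount
    new_state[receiver] = new_state.get(receiver, 0) + amount
    return new_state

def verify_block(prior_state: dict, block: dict) -> bool:
    """Accept the block only if re-executing its transactions reproduces
    the new state the block claims."""
    state = prior_state
    try:
        for tx in block["transactions"]:
            state = apply_tx(state, tx)
    except ValueError:
        return False
    return state == block["claimed_state"]

block = {"transactions": [{"from": "Alice", "to": "Bob", "amount": 5}],
         "claimed_state": {"Alice": 15, "Bob": 5}}
print(verify_block({"Alice": 20}, block))    # True
```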
Blockchains are slow compared to centralized networks because every node has to process every state update, which is inefficient. [3] As we mentioned earlier, there have been different proposals to remove this requirement and separate execution from consensus. Here are a few schemes that have been proposed.
- Channels — Two parties interact off-chain, settling the net result on-chain after some time has passed. In case of a dispute, either party can exit at any time. This has evolved into a fairly robust solution, even with multiple parties (see Lightning Network), but it assumes that all parties generally remain online, and it is difficult to generalize to arbitrary smart contract interactions.
- Plasma — An operator processes transactions into blocks on a separate sidechain. This sidechain is connected to the main chain via a smart contract that manages user deposits and entry into/exit from the sidechain. If the operator misbehaves, users can submit a fraud proof to redeem their deposit on the main chain. Because it is less decentralized, Plasma can enable significantly higher transaction throughput. However, in the event of a mass exit from the Plasma chain, congestion on the main chain may prevent some users from claiming their assets in time.
- Rollups — Similar to Plasma, in a rollup users submit transactions to untrusted operators off-chain. Unlike Plasma, however, rollup operators publish the transaction data and resulting state root of every single block onto the main chain (a rough sketch of what gets posted follows this list). This allows anyone to verify the correctness of the state transition, and it still provides higher throughput because nodes on the main chain do not execute every transaction. However, because rollups store data on-chain, they are somewhat limited in the scalability benefits they provide, about an order of magnitude less than something like Plasma, and the data stored on the main chain grows at a much faster rate.
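Here is the rough sketch promised above of the kind of data a rollup operator posts on-chain for each batch (a hypothetical format, not any specific rollup’s): the transaction data itself plus the state root it claims to produce.

```python
# A rough sketch (hypothetical format, not any specific rollup's) of what an
# operator posts to the main chain for each batch: the transaction data itself
# plus the state root the batch claims to produce. Because the transactions
# are published, anyone can re-execute them and dispute a wrong claim.

import hashlib, json

def state_root(state: dict) -> str:
    """Stand-in for a Merkle/state root: a hash of the serialized state."""
    return hashlib.sha256(json.dumps(state, sort_keys=True).encode()).hexdigest()

def build_batch(prior_state: dict, txs: list, apply_tx) -> dict:
    """Everything published to L1: the raw transactions and the claimed root.
    `apply_tx` is a state-transition function like the one in the earlier
    full-node sketch."""
    state = prior_state
    for tx in txs:
        state = apply_tx(state, tx)
    return {"transactions": txs, "claimed_state_root": state_root(state)}
```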
Each of these approaches shifts the burden of execution from every node to a smaller number of operators, who process transactions off-chain while settlement still happens on the main chain. This improves efficiency and throughput. In a scenario where all actors are honest, any of these approaches could work today. Of course, if that were the case, why would we even be using a blockchain? Any scalability solution has to account for the case where one or more of the participants and/or operators are dishonest, and there are two types of misbehavior that we would like to prevent. The first is a violation of the integrity (remember the “I” in CIA) of state transitions. In other words, we want to make sure that no party can “cheat” the rules of the system by, say, double-spending. The solutions to that problem are generally classified as either cryptographic or crypto-economic.
1. Cryptographic — A cryptographic protocol based on math limits the actions users can take to only what is allowed by the protocol. An example is the zk-rollup, which uses zero-knowledge proofs to prove the validity of the transactions within a given block.
2. Crypto-economic — Some economic incentive (or penalty) encourages all protocol participants to play by the rules. For example, Plasma users can submit fraud proofs to challenge invalid blocks or exits, causing the misbehaving operator to be slashed. Optimistic rollups use fraud proofs in the same way to hold rollup operators accountable (a minimal sketch of such a check follows this list).
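And here is the minimal fraud-proof check mentioned above (hypothetical names, reusing the apply_tx helper from the full-node sketch): the challenger supplies the disputed transition, the contract re-executes it, and the operator is slashed if the claim does not hold.

```python
# A minimal sketch of a fraud-proof check (hypothetical names; apply_tx is the
# helper from the full-node sketch above): re-execute the disputed transaction
# and slash the operator's bond if the claimed result is provably wrong.

def fraud_proof_is_valid(prior_state: dict, tx: dict,
                         claimed_new_state: dict) -> bool:
    """Return True if the operator's claimed result does not hold up."""
    try:
        return apply_tx(prior_state, tx) != claimed_new_state
    except ValueError:
        return True   # the disputed transaction was itself invalid

# Crucially, a challenger can only construct this proof if prior_state and tx
# were actually made available to them.
```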
But a more subtle way an operator can misbehave is by withholding state. This prevents other participants from verifying the validity of a transaction. Without that information, how could we prove wrongdoing and enforce a penalty on the misbehaving operator? It’s like destroying evidence before a trial to escape conviction. So full data availability is critical to ensuring all participants in a decentralized system can hold each other accountable.
This is one reason why rollups have become the scaling solution du jour. Their solution is to store all of the transaction data on Ethereum (though execution is still off-chain). However, this provides more limited benefit compared to the other approaches. ZK rollups, for example, can improve ETH transaction throughput from 15–20 TPS to around 1000–2000. It’s an improvement, but still far from supporting even a fraction of Visa’s transaction volume. Can we do better?
Potential Solutions
There is no “silver bullet” for scalability that simultaneously gives unlimited transaction throughput and full data availability while remaining decentralized. But there are ways to reduce the burden of having to store the entire history of transactions, or that enable more efficient batch verification of state transitions. Below is an incomplete list.
More Efficient Methods for Data Recovery/Storage — Using error-correction techniques like Reed-Solomon codes (and newer approaches like Coded Merkle Trees), we can reconstruct blocks from only a subset of the data. This is the same technique used on CDs, which lets a disc play even when it has a couple of scratches. NEAR Protocol and Polkadot use these methods to help address the problem of data availability on their chains.
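As a toy illustration (not NEAR’s or Polkadot’s actual encoding, and the names are mine), here is the core idea: treat a block’s chunks as the coefficients of a polynomial over a finite field, publish more evaluations than chunks, and recover the whole block from any sufficiently large subset.

```python
# A toy illustration of Reed-Solomon-style erasure coding: the block's k
# chunks become polynomial coefficients, we publish n > k evaluations, and
# any k of them recover the whole block.

P = 2**61 - 1  # a prime modulus; real systems use specific finite fields

def encode(chunks: list, n: int) -> list:
    """Evaluate the polynomial defined by `chunks` at n points (Horner's rule)."""
    def poly_eval(x):
        acc = 0
        for c in reversed(chunks):
            acc = (acc * x + c) % P
        return acc
    return [(x, poly_eval(x)) for x in range(1, n + 1)]

def reconstruct(shares: list, k: int) -> list:
    """Recover the k coefficients from any k shares via Lagrange interpolation."""
    shares = shares[:k]
    coeffs = [0] * k
    for i, (xi, yi) in enumerate(shares):
        basis, denom = [1], 1            # build the i-th Lagrange basis polynomial
        for j, (xj, _) in enumerate(shares):
            if i == j:
                continue
            new_basis = [0] * (len(basis) + 1)
            for d, b in enumerate(basis):          # multiply basis by (x - xj)
                new_basis[d + 1] = (new_basis[d + 1] + b) % P
                new_basis[d] = (new_basis[d] - xj * b) % P
            basis = new_basis
            denom = denom * (xi - xj) % P
        scale = yi * pow(denom, -1, P) % P
        for d, b in enumerate(basis):
            coeffs[d] = (coeffs[d] + scale * b) % P
    return coeffs

block = [42, 7, 99, 1234]                       # a "block" split into 4 chunks
shares = encode(block, n=8)                     # extended to 8 coded shares
assert reconstruct(shares[::2], k=4) == block   # any 4 shares suffice
```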
Data Availability as a Separate Component — We already discussed separating consensus from program execution. What if we could take that same idea of modularization and apply it to data availability as well? This is the idea behind LazyLedger, a blockchain designed purely to be a data availability layer supporting L2 rollups. Alternatively, blockchains like Arweave (which is built for permanent storage) could be used to store data from other chains; Solana is doing this today.
Recursive SNARKs — Zero-knowledge proofs guarantee computational integrity; they don’t provide data availability. However, composing SNARKs recursively CAN reduce the burden on both light clients and full nodes: a single proof can attest that every state transition that ever occurred was valid. This is similar to the cryptographic idea of Incrementally-Verifiable Computation, and is also one of the fundamental ideas underpinning blockchains like Mina.
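To give a feel for the recursion (and only that), here is a conceptual sketch in which the “proof” is just a hash chain, not a real SNARK; the names and structure are mine, and a real recursive SNARK additionally makes the final proof succinctly verifiable on its own.

```python
# A conceptual sketch of incrementally verifiable computation (IVC). The
# "proof" here is just a hash chain, NOT a real SNARK, so it is not succinct;
# it only illustrates the recursion: each new proof folds in the previous
# proof plus one more state transition, so the latest proof implicitly
# vouches for the entire history.

import hashlib

def fold(prev_proof: str, prev_root: str, new_root: str) -> str:
    """One IVC step: a proof covering everything up to new_root."""
    data = "|".join([prev_proof, prev_root, new_root])
    return hashlib.sha256(data.encode()).hexdigest()

proof = "genesis"
roots = ["r0", "r1", "r2", "r3"]          # successive state roots
for prev, new in zip(roots, roots[1:]):
    proof = fold(proof, prev, new)
# With a real recursive SNARK, a light client would verify only `proof`
# against the latest root, in constant time, without replaying r0..r2.
```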
Conclusion
As a former investor, I’ve heard the claim many times that “product x” is the perfect solution for “problem y”. The question that I’ve learned to ask is: “under what assumptions?” It’s rare that a seemingly obvious solution to a long-standing problem hasn’t already been considered. Separating computation from consensus in a distributed network is a natural idea, and there are many ways to ensure the integrity of that computation. But data availability is often where you find the critical assumptions that underpin the security of the entire system. As DeFi applications move from the main Ethereum chain to Layer 2, we as users need to be fully aware of the various trade-offs so we can make informed decisions. And as builders of the next generation of the web, we must strive to balance scalability with decentralization, aware that there is no such thing as a free lunch.
Thanks to Anna Rose, Richard Chen, and Emre Tekisalp for reviewing this post
End Notes
[1] For a deeper discussion on scalability and decentralization, see this excellent post, which proposes a scalability metric that accounts for network centralization.
[2] This description only applies to “full nodes”, or nodes that fully participate in consensus. There is a second category of network nodes called light clients or simplified payment verification (SPV) clients. These clients don’t participate in consensus. Rather, they keep track of a single piece of information per block — the block header — which they can use to verify that a given transaction was included in a given block. Critically, light clients have no way to tell whether the transactions that led to that state were valid; they have to trust that the full node they are connecting to is providing correct data rather than invalid blocks. This is why the Bitcoin community in particular strongly promotes running full nodes: the light-client security model assumes a large, diverse set of full node operators.
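For illustration only (simplified, with names of my own choosing; Ethereum actually uses Merkle-Patricia tries and RLP encoding), a light client checks inclusion against the header’s transaction root roughly like this:

```python
# Simplified illustration of how a light client checks that a transaction is
# included in a block, given only the header's transaction root.

import hashlib

def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def verify_inclusion(tx_hash: bytes, proof: list, tx_root: bytes) -> bool:
    """`proof` is a list of (sibling_hash, sibling_is_left) pairs, leaf to root."""
    node = tx_hash
    for sibling, sibling_is_left in proof:
        node = _h(sibling + node) if sibling_is_left else _h(node + sibling)
    return node == tx_root

# Tiny example: a two-transaction tree.
tx0, tx1 = _h(b"tx0"), _h(b"tx1")
root = _h(tx0 + tx1)
assert verify_inclusion(tx1, [(tx0, True)], root)   # tx0 is the left sibling
```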
[3] For smart contract blockchains like Ethereum, where contract execution happens on-chain, it can also be insecure. See Verifier’s Dilemma