Note that this article is part of the book Real-World Cryptography.
What are ZK-SNARKs and how do they work? This is a question I’ve had for years, and always felt like the resources I found gave no clear intuition as to how all of that stuff worked. So today, after a breakthrough in my own understanding, I thought it would be good to re-share what I’ve learned in a more understandable picture. The intent of this piece is to give you an intuitive way of thinking about these things, and note the gaps that you can fill for yourself (if you want to).
Getting terminology out of the way
The first part of the question is pretty easy to answer. ZK-SNARKs no matter what their funny name might imply, are simply zero-knowledge proofsthat are:
- general purpose
- and succinct
“Huh, what are all these words?” you ask, intimidated by their vagueness.
First, zero-knowledge proofs are cryptographic proofs that you know something, without revealing the something (zero-knowledgeness). That “something” is usually called the witness, but this detail doesn’t matter much. There are a lot of resources about zero-knowledge proofs so I won’t explain much about how they work, or what their exact cryptographic properties are (completeness, soundness, zero-knowledgeness). Zero-knowledge proofs are often seen used to prove that you know the discrete logarithm in some base of an element of some group (e.g. what is x in gˣmod p), or similarly-limited statements.
“Limited yes, but still useful!” yells Schnorr, inventor of the Schnorr signature scheme which is fabricated by taking a zero-knowledge proof of the knowledge of a discrete logarithm, and making it non-interactive. A zero-knowledge proof (or ZKP) is an interactive protocol between a prover (who knows the witness) and a verifier (who has to be convinced). An interactive protocol sucks in the real world, as it often limits the number of potential use-cases of the primitive, and slows down protocols depending on the number of round trips that need to happen between the prover and the verifier. Fortunately, some ZKPs can be constructed without interaction with a verifier. In other words, a prover can simply create a proof, and that proof can be verified by anyone at any point in time later without further help from the prover. When ZKPs are made non-interactive, we simply call them non-interactive zero-knowledge proofs or NIZKs. I talk more about the link between signatures and zero-knowledge proofs here.
ZKPs and NIZKs can also be constructed on much more general statements like “I know an input to some function such that the execution gives some output”, or more specifically “I know a in f(a, b) = c”. If this still doesn’t make sense think about the usual example given to illustrate general-purpose ZKPs: “I know the solution of some specific sudoku puzzle”.
We’re almost there: ZK-SNARKs are general-purpose non-interactive zero-knowledge proofs, and more! They are also succinct, meaning that the proofs they produce are small in size and are fast to verify, which makes them so special they deserve to be called ZK-SNARKs. Not every modern proof system deserves that special classification, for example STARKs don’t 🙁
As a recap:
- zero-knowledge proof: a cryptographic proof that you know something, without revealing the something
- non-interactive: a proof that was constructed without the help of a verifier
- general purpose: a proof of a more general statement, like the knowledge of secret inputs or outputs of a program
- succinct: a small proof that is fast to verify
The actual stuff
But not only did you ask, “what are ZK-SNARKs?”, but you also asked about “how do they work?”
And oh boy, this is a complex subject to answer. First and foremost, there are many, many schemes, too many of them, and so I’m not sure exactly how to answer that question. But I have some idea of how some of them work, and so I imagine that most of them follow that pattern, or improve on it. So let me explain…
First and foremost, there are many many zk-SNARK schemes, too many of them really.
But most build on this type of construction:
- A proving system, allowing a prover to prove something to a verifier, which I’ll explain in this post.
- A translation or compilation of a program to something the proving system can prove, which I’ll explain in part 2 of this post.
The first part is not too hard to understand, while the second part sorts of requires a graduate course into the subject.
So let’s take a look at the first part first.
Proving your knowledge of a constrained polynomial
The main idea of zk-SNARKs is that they are all about proving that you know some polynomial f(x) that has some roots.
By roots I mean that the verifier has some values in mind (for example, 1and 2) and the prover must prove that the secret polynomial they have in mind evaluates to 0 for these values (for example, f(1) = f(2) = 0).
By the way, a polynomial that has 1 and 2 as roots (in our example) can be written as f(x)=(x-1)(x-2)h(x) for some polynomial h(x).
(If you’re not convinced try to evaluate that at x=1 and x=2.)
So we say that the prover must prove that they know an f(x) and h(x) such that f(x) = t(x)h(x) for some target polynomial t(x) = (x-1)(x-2) (in the example that 1 and 2 are the roots that the verifier wants to check).
But that’s it, that’s what zk-SNARKs proving systems usually provide: something to prove that you know some polynomial.
I’m repeating this because the first time I learned about that it made no sense to me: how can you prove that you know some secret input to a program, if all you can prove is that you know a polynomial.
Well, that’s why the second part of a zk-SNARK is so difficult: it’s about translating a program into a polynomial. But more on that later.
Back to our proving system, how does one prove that they know such a function f(x)?
Well they just have to prove that they know an h(x) such that you can write f(x) as f(x) = t(x)h(x). Ugh… Not so fast here.
We’re talking about zero-knowledge proofs right?
How can we prove this without giving out f(x)?
The answer is in the following three tricks:
- Homomorphic commitments. A commitment scheme similar to the ones we’ve seen used in other zero-knowledge proofs.
- Bilinear pairings. A mathematical construction that has some interesting properties, more on that later.
- The fact that different polynomials evaluate to different values most of the time.
So let’s go through each of them shall we?
Homomorphic commitments to hide parts of the proof
The first trick is to use commitments to hide the values that we’re sending to the prover.
But not only do we hide them, we also want to allow the verifier to perform some operations on them so that they can verify the proof. Specifically verify that if the prover commits on their polynomial f(x) as well as h(x), then we have
com(f(x)) = com(t(x)) com(h(x)) = com(t(x)h(x))
where the commitment com(t(x)) is computed by the verifier as the agreed constraint on the polynomial.
These operations are called homomorphic operations and we couldn’t have performed them if we had used hash functions as commitment mechanisms.
Thanks to these homomorphic commitment, we can “hide values in the exponent” (for example, for a value v then send the commitment gv mod p) and perform useful computations like equality: observe that if a = bc then gᵃ = gᵇ gᶜ = gᵇ⁺ᶜ.
Wait, this is not what we wanted though… we wanted gᵃ = gᵇᶜ.
Bilinear pairings to improve our homomorphic commitments
gᵃ = (gᵇ)ᶜ = gᵇᶜ gets us there, but only if c is a known value and not a commitment (for example, gᶜ). Unfortunately this is a limitation for our proving protocol, as there will be multiplication operations between commitments. This is where bilinear pairings can be used to unblock us, and this is the sole reason why we use bilinear pairings in a zk-SNARK (really just to be able to multiply the values inside the commitments).
I don’t want to go too deep into what bilinear pairings are, but just know that it is just another tool in our toolkit that:
- Takes two values of our group (formed by the values generated by graised to different powers modulo p) and place them in another group.
- Allows for multiplication by moving stuff from one group to the other, we can multiply things that couldn’t be multiplied previously.
So using e as the typical way of writing a bilinear pairing, we have e(g₁, g₂) = h₃ and we can use it to perform multiplications hidden in the exponent via this one equation:
e(gᵇ, gᶜ) = e(g)ᵇᶜ
Again, we use bilinear pairings to make our commitments not only homomorphic for the addition, but also for the multiplication.
Note that bilinear pairings are used in other places in cryptography, and are slowly making their way as a more common building block: they can be seen in homomorphic encryption schemes (which makes sense after what we’ve seen here) but also signatures schemes like BLS.
Where does the succinctness come from?
Finally, the succinctness of zk-SNARKs comes from the fact that two functions that differ will evaluate to different points most of the time.
What this means for us is that if my f(x) is not really equal to t(x)h(x), meaning that I don’t have a polynomial f(x) that really has the roots we’ve chosen with the verifier, then evaluating f(x) and t(x)h(x) at a random point r will not give out the same result most of the time. In other words for almost all r, we have f(r) ≠ t(r)h(r). This is known as the Schwartz-Zippel lemma, which I pictured in the following illustration.
The Schwartz-Zippel lemma says that two different polynomials of degree n can intersect in at most n points. In other words, two different polynomials will differ in most points.
Knowing this, it is enough to prove that com(f(r)) = com(t(r)h(r)) for some random point r. This is why zk-SNARK proofs are so small: by comparing points in a group you end up comparing much larger polynomials!
But this is also the reason behind the “trusted setup” needed in most zk-SNARK constructions.
If a prover knows the random point r that will be used to check the equality, then they can forge an invalid polynomial that will still verify the equality.
So a trusted setup is about:
1. Creating a random value r.
2. Committing different exponentiation of it
so that they can be used by the prover to compute their polynomial without knowing the point r.
3. Destroying the value r.
Does the second point make sense?
If my polynomial as the prover is f(x) = 3x² + x then all I have to do is compute
to obtain a commitment of my polynomial evaluated at that random point r(without knowing r).
From programs to polynomials
So far the constraints on the polynomial that the prover must find is that it has some roots (some values that evaluate to 0 with our polynomial).
But how do we translate a more general statement into a polynomial knowledge proof?
Typical statements in cryptocurrencies (which are the applications currently making the most use of zk-SNARKs these days) are of the form:
- prove that a value is in the range [0, 264] (this is called a range proof)
- prove that a (secret) value is included in some given (public) Merkle tree
- prove that the sum of some values is equal to the sum of some other values
- and so on…
And herein lies the difficult part.
As I said earlier, converting a program execution into the knowledge of a polynomial “sorts of requires a graduate course into the subject.”
What is going to happen next is:
1. Our program will first get converted into an arithmetic circuit, a circuit made of math!
2. That arithmetic circuit will get converted into a system of equations that are of a certain form (called rank-1 constraint system or R1CS).
3. We’ll then use a trick to convert our system of equations into a polynomial.
This’ll be in part 2 of this series.
Thanks to Alex Pruden for feedback. You can read content like this, and more, in the book Real-World Cryptography.