This repository was archived by the owner on Apr 18, 2025. It is now read-only.

Commit 6f199a1

roynalnaruto, darth-cy, noel2004
authored
Update aggregator doc (#1365)
* Adjust documentation
* Fix typo
* Update README.md To emphase the "continuous" in aggregated data
* doc: more inline documentation for RecursionCircuit
* rephrase comment
* Add more context to comment
* Add more context to comment
* rephrase comment
* rephrase comment and add more context
* nit-pick

---------

Co-authored-by: Ray Gao <[email protected]>
Co-authored-by: Ho <[email protected]>
1 parent 2d73894 commit 6f199a1

3 files changed

+232
-82
lines changed


aggregator/README.md

+136-47
Original file line numberDiff line numberDiff line change
@@ -1,20 +1,24 @@
11
Proof Aggregation
22
-----
33

4-
![Architecture](./figures/architecture.jpg)
4+
# Mechanism
5+
Aggregation allows larger amounts of data to be verified on-chain using fewer proofs.
6+
Currently, chunks (a list of continuous blocks) are first aggregated into a batch, then multiple batches are aggregated using a recursive scheme into a bundle.
7+
The `bundle` is the current apex entity that will be verified on-chain.
8+
9+
<!-- ![Architecture](./figures/architecture.jpg) -->
510
# Params
611
|param|meaning |
712
|:---:|:---|
813
|k | number of valid chunks|
9-
|n | max number of chunks per batch|
14+
|n | max number of chunks per batch (hard-coded)|
1015
|t | number of rounds for the final hash $\lceil32\times n/136\rceil$ |
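
The formula for `t` in the table can be sketched as a small helper. This is only an illustration of the arithmetic: the function name is hypothetical, and `136` is taken to be the Keccak-256 rate in bytes, with each chunk contributing one 32-byte data hash.

```rust
/// Keccak-256 absorbs 136 bytes (the rate) per round.
const KECCAK_RATE_BYTES: usize = 136;

/// Number of rounds `t` for the final hash over `n` 32-byte chunk data hashes,
/// following the table's formula ceil(32 * n / 136).
/// Hypothetical helper, not part of the aggregator crate.
pub fn final_hash_rounds(n: usize) -> usize {
    (32 * n + KECCAK_RATE_BYTES - 1) / KECCAK_RATE_BYTES
}
```

For example, with `n = 10` (the value that was hard-coded in the earlier revision of this README) the input is 320 bytes, which gives `t = 3`.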
1116

12-
Currently `n` is hard coded to `10`.
1317
# Structs
1418

1519
## Chunk
1620

17-
A __chunk__ is a list of continuous blocks. It consists of 5 hashes:
21+
A __chunk__ is a list of L2 `blocks` and will be proven by the `ChunkCircuit` (this is in fact the ZkEVM `SuperCircuit`). It consists of 5 hashes:
1822
- state root before this chunk
1923
- state root after this chunk
2024
- the withdraw root of this chunk
@@ -47,93 +51,119 @@ If $k< n$, $(n-k)$ padded chunks are padded to the list. A padded chunk has the
4751

4852
## Batch
4953

50-
A __batch__ consists of continuous chunks of size `k`. If the input chunks' size `k` is less than `n`, we pad the input with `(n-k)` chunks identical to `chunk[k]`.
54+
A __batch__ is a list of continuous `chunks` of size `k` that will be aggregated using the `BatchCircuit`. If the input chunks' size `k` is less than `n`, we pad the input with `(n-k)` chunks identical to `chunk[k]`. The batch is represented by the preimage fields to the `batch_hash`, which is constructed as:
55+
```
56+
batchHash := keccak256(version || batch_index || l1_message_popped || total_l1_message_popped || batch_data_hash || versioned_hash || parent_batch_hash || last_block_timestamp || z || y)
57+
```
58+
All preimage fields' values are provided to the batch through the `BatchHeader` struct, so it can correctly construct the hash state transition from `parent_batch_hash` to `batch_hash` (for current batch).
59+
60+
Note that there are also implicit constraints between state roots before/after batch and the state roots of the chunks it has aggregated:
61+
```
62+
prev_state_root := c_0.prev_state_root
63+
post_state_root := c_k.post_state_root
64+
```
65+
66+
## BatchHeader
67+
The current schema for batch header is:
68+
69+
|Field|Bytes|Type|Index|Comments|
70+
|:---|:---|:---|:---|:---|
71+
|version | 1 | uint8| 0| The batch version|
72+
|batchIndex | 8 | uint64| 1| The index of the batch|
73+
|l1MessagePopped | 8 | uint64| 9| Number of L1 messages popped in the batch|
74+
|totalL1MessagePopped | 8 | uint64| 17| Number of total L1 messages popped after the batch|
75+
|dataHash | 32 | bytes32| 25| The data hash of the batch|
76+
|blobVersionedHash | 32 | bytes32| 57| The versioned hash of the blob with this batch’s data|
77+
|parentBatchHash | 32 | bytes32| 89| The parent batch hash|
78+
|lastBlockTimestamp | 8 | uint64| 121| The timestamp of the last block in this batch|
79+
|blobDataProof | 64 | bytes64| 129| The blob data proof: z (32), y (32)|
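
To make the byte offsets in the schema concrete, here is a minimal sketch that serializes the fields in table order. The struct name `BatchHeaderSketch` is hypothetical and is not the crate's `BatchHeader` type; big-endian integer encoding is an assumption that the table itself does not state.

```rust
/// Hypothetical mirror of the schema table; not the crate's actual type.
pub struct BatchHeaderSketch {
    pub version: u8,
    pub batch_index: u64,
    pub l1_message_popped: u64,
    pub total_l1_message_popped: u64,
    pub data_hash: [u8; 32],
    pub blob_versioned_hash: [u8; 32],
    pub parent_batch_hash: [u8; 32],
    pub last_block_timestamp: u64,
    /// z (32 bytes) followed by y (32 bytes).
    pub blob_data_proof: [u8; 64],
}

impl BatchHeaderSketch {
    /// 1 + 8 + 8 + 8 + 32 + 32 + 32 + 8 + 64 bytes.
    pub const ENCODED_LEN: usize = 193;

    /// Concatenates the fields in table order.
    /// Assumption: integers are encoded big-endian.
    pub fn encode(&self) -> Vec<u8> {
        let mut out = Vec::with_capacity(Self::ENCODED_LEN);
        out.push(self.version); // index 0
        out.extend_from_slice(&self.batch_index.to_be_bytes()); // index 1
        out.extend_from_slice(&self.l1_message_popped.to_be_bytes()); // index 9
        out.extend_from_slice(&self.total_l1_message_popped.to_be_bytes()); // index 17
        out.extend_from_slice(&self.data_hash); // index 25
        out.extend_from_slice(&self.blob_versioned_hash); // index 57
        out.extend_from_slice(&self.parent_batch_hash); // index 89
        out.extend_from_slice(&self.last_block_timestamp.to_be_bytes()); // index 121
        out.extend_from_slice(&self.blob_data_proof); // index 129
        out
    }
}
```

The encoded length of 193 bytes agrees with the `batch_hash` preimage length listed in the batch circuit statements.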
80+
81+
## Continuous batches
82+
A list of continuous batches $b_1, \dots, b_k$ satisfies
83+
```
84+
b_i.batch_hash == b_{i+1}.parent_batch_hash AND
85+
b_i.post_state_root == b_{i+1}.prev_state_root
86+
```
87+
for $i \in [1, k-1]$.
88+
Unlike chunks aggregation, the last layer of recursive batch aggregation can accept an arbitrary number of batches. There's no explicit upper limit. Instead, the number of rounds of recursion can solely be defined by the latency target on L1 for those batches. As a result, continuous batches are never padded.
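
The continuity condition above can be sketched as a plain check over adjacent pairs. This runs outside any circuit and only illustrates the two equalities; `BatchLink` and `are_continuous` are hypothetical names introduced for this example.

```rust
/// Hypothetical view of the batch fields that take part in the continuity check.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct BatchLink {
    pub parent_batch_hash: [u8; 32],
    pub batch_hash: [u8; 32],
    pub prev_state_root: [u8; 32],
    pub post_state_root: [u8; 32],
}

/// Returns true when every adjacent pair (b_i, b_{i+1}) is linked by both
/// the batch hash and the state root. Lists with fewer than two batches are
/// trivially continuous.
pub fn are_continuous(batches: &[BatchLink]) -> bool {
    batches.windows(2).all(|pair| {
        pair[0].batch_hash == pair[1].parent_batch_hash
            && pair[0].post_state_root == pair[1].prev_state_root
    })
}
```

Because the check only looks at adjacent pairs, it places no bound on the length of the list, which matches the absence of padding for batches.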
89+
90+
## Bundle
91+
A __bundle__ is a list of continuous `batches` that will be aggregated recursively using the `RecursionCircuit`. The __bundle__ is the current apex entity whose proof will be verified on-chain.
5192

5293
# Circuits
5394

5495
## Chunk circuit
5596

56-
Circuit proving the relationship for a chunk is indeed the zkEVM circuit. It will go through 2 layers of compression circuit, and becomes a __snark__ struct. We do not list its details here. Abstractly, a snark circuit has the following properties:
97+
The circuit proving the relationship for a chunk is the zkEVM circuit. Its proof goes through 2 layers of compression circuits and becomes a __snark__ struct. We do not list its details here. Abstractly, a snark circuit has the following properties:
5798
- it takes 44 elements as public inputs
5899
- 12 from accumulators
59100
- 32 from public input hash
60101

102+
## Batch Circuit
61103

62-
![Architecture](./figures/hashes.jpg)
63-
64-
## Aggregation Circuit
65-
66-
We want to aggregate `k` snarks, each from a valid chunk. We generate `(n-k)` padded chunks, and obtain a total of `n` snarks.
67-
68-
In the above example, we have `k = 2` valid chunks, and `2` padded chunks.
69-
70-
The padded snarks are identical the the last valid snark, so the aggregator does not need to generate snarks for padded chunks.
104+
We want to aggregate `k` snarks, each from a valid chunk. We generate `(n-k)` padded chunks (by repeating the last non-padding chunk), and obtain a total of `n` snarks.
105+
Additionally, the batch circuit has to ascertain the correct hash transition from the parent batch.
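
The padding rule can be illustrated with a generic helper that repeats the last valid element until the list has `n` entries. This is a sketch under the assumption that at least one valid chunk exists and that `k <= n`; `pad_chunks` is a hypothetical name, not a function from the aggregator crate.

```rust
/// Pads `valid` (the `k` real chunks) up to length `n` by repeating the last
/// valid chunk. Returns `None` if there is no valid chunk or if `k > n`.
pub fn pad_chunks<T: Clone>(valid: &[T], n: usize) -> Option<Vec<T>> {
    let last = valid.last()?;
    if valid.len() > n {
        return None;
    }
    let mut padded = valid.to_vec();
    padded.resize(n, last.clone());
    Some(padded)
}
```

Since every padded entry is a copy of the last valid one, no additional snark has to be generated for the padded chunks.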
71106

72107
### Configuration
73108

74-
There will be three configurations for Aggregation circuit.
75-
- FpConfig; used for snark aggregation
76-
- KeccakConfig: used to build keccak table
77-
- RlcConfig: used to compute RLC of hash inputs
109+
There are several configuration subcomponents for the batch circuit.
110+
- FpConfig: used for snark aggregation.
111+
- KeccakConfig: used to build keccak table.
112+
- RlcConfig: used to compute RLC of hash inputs.
113+
- BlobDataConfig: used for representing the zstd-encoded form of batch data, with `4096 * 31` rows. Each row is a byte value. An EIP-4844 blob consists of `4096 * 32` bytes, where we set the most-significant byte in each 32-bytes chunk as `0` to guarantee that each 32-bytes chunk is a valid BLS12-381 scalar field element.
114+
- BatchDataConfig: used for representing the raw batch bytes, effectively constructing the random challenge point `z` for the KZG opening proof.
115+
- DecoderConfig: implements an in-circuit zstd-decoder that decodes blob data into batch data.
116+
- BarycentricEvaluationConfig: used for evaluating the interpolated blob polynomial at an arbitrary challenge point `z`, where both `z` and the evaluation `y` are included in the `BatchHeader`.
78117

79118
### Public Input
80-
The public input of the aggregation circuit consists of
119+
The public input of the batch circuit consists of
81120
- 12 elements from accumulator
82-
- 32 elements of `batch_pi_hash`
121+
- 2 elements of `parent_state_root` (split by hi/lo)
122+
- 2 elements of `parent_batch_hash`
123+
- 2 elements of `current_state_root`
124+
- 2 elements of `current_batch_hash`
125+
- 1 element of `chain_id`
126+
- 2 elements of `current_withdraw_root`
83127

84-
### Statements
85-
For snarks $s_1,\dots,s_k,\dots, s_n$ the aggregation circuit argues the following statements.
86-
87-
1. batch_data_hash digest is reused for public input hash. __Static__.
128+
Note that `parent_state_root` is the same as `chunk[0].prev_state_root` and `current_state_root` is the same as `chunk[k].post_state_root`. When these chunk fields are assigned into keccak preimages, their cells are constrained against the public input to ensure equality. If any public input appears in the preimage of the `batch_hash`, its corresponding assigned preimage cells will be equality constrained as well.
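
The "split by hi/lo" convention in the list above can be sketched as follows. A 32-byte value does not fit into a single element of a roughly 254-bit scalar field, so it is carried as two 128-bit halves. The big-endian byte order used here is an assumption, and both names are hypothetical.

```rust
/// Splits a 32-byte word into (hi, lo) 128-bit halves.
/// Assumption: the word is interpreted big-endian, so `hi` is the first 16 bytes.
pub fn split_hi_lo(word: &[u8; 32]) -> (u128, u128) {
    let mut hi = [0u8; 16];
    let mut lo = [0u8; 16];
    hi.copy_from_slice(&word[..16]);
    lo.copy_from_slice(&word[16..]);
    (u128::from_be_bytes(hi), u128::from_be_bytes(lo))
}

/// Total public input elements listed above:
/// accumulator (12) + parent_state_root (2) + parent_batch_hash (2)
/// + current_state_root (2) + current_batch_hash (2) + chain_id (1)
/// + current_withdraw_root (2).
pub const NUM_BATCH_PUBLIC_INPUTS: usize = 12 + 2 + 2 + 2 + 2 + 1 + 2;
```

Summing the listed entries gives 23 public input elements for the batch circuit.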
88129

89-
2. batch_pi_hash used same roots as chunk_pi_hash. __Static__.
90-
```
91-
batch_pi_hash := keccak(chain_id || chunk_1.prev_state_root || chunk_n.post_state_root || chunk_n.withdraw_root || batch_data_hash || z || y || versioned_hash)
92-
```
93-
and `batch_pi_hash` matches public input.
130+
### Statements
131+
For snarks $s_1,\dots,s_k,\dots, s_n$ the batch circuit argues the following statements.
94132

95-
3. batch_data_hash and chunk[i].pi_hash use a same chunk[i].data_hash when chunk[i] is not padded
133+
1. batch_data_hash digest is reused for batch hash. __Static__.
96134

135+
2. batch_data_hash and chunk[i].pi_hash use the same chunk[i].data_hash when chunk[i] is not padded
97136
```
98137
for i in 1 ... n
99138
chunk_pi_hash := keccak(chain_id || prev_state_root || post_state_root || withdraw_root || chunk_data_hash || chunk_txdata_hash)
100139
```
101-
102140
This is done by computing the RLCs of chunk[i]'s data_hash for `i=0..k`, and then checking that each RLC matches the one from the keccak table.
103141

104-
4. chunks are continuous when they are not padded: they are linked via the state roots.
105-
142+
3. chunks are continuous when they are not padded: they are linked via the state roots.
106143
```
107144
for i in 1 ... k-1
108145
c_i.post_state_root == c_{i+1}.prev_state_root
109146
```
110-
111-
5. All the chunks use the same chain id. __Static__.
147+
4. All the chunks use the same chain id. __Static__.
112148
```
113149
for i in 1 ... n
114150
batch.chain_id == chunk[i].chain_id
115151
```
116-
117-
6. The last `(n-k)` chunk[i] are padding
152+
5. The last `(n-k)` chunk[i] are padded chunks
118153
```
119154
for i in 1 ... n:
120155
if is_padding:
121156
chunk[i]'s chunk_pi_hash_rlc_cells == chunk[i-1].chunk_pi_hash_rlc_cells
122157
```
123158
This is done by comparing the `data_rlc` of `chunk_{i-1}` and `chunk_{i}`.
124-
7. the hash input length is correct
125-
- hashes[0] has 200 bytes
126-
- hashes[1..N_SNARKS+1] has 168 bytes input
127-
- batch's data_hash length is 32 * number_of_valid_snarks
128-
8. batch data hash is correct w.r.t. its RLCs
129-
9. is_final_cells are set correctly
159+
6. the hash input length is correct
160+
- hashes[0] has 193 bytes (`batch_hash` preimage)
161+
- hashes[1..N_SNARKS+1] has 168 bytes input (`chunk_pi_hash` preimages)
162+
- batch's data_hash length is 32 * number_of_valid_snarks (`batch_data_hash` preimage)
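
The three preimage lengths above follow from the field widths. The sketch below reproduces the arithmetic; the 8-byte width for `chain_id` is an assumption chosen because it is consistent with the stated 168 bytes, and the constant and function names are hypothetical.

```rust
/// batch_hash preimage: version (1) + batch_index (8) + l1_message_popped (8)
/// + total_l1_message_popped (8) + batch_data_hash (32) + versioned_hash (32)
/// + parent_batch_hash (32) + last_block_timestamp (8) + z (32) + y (32).
pub const BATCH_HASH_PREIMAGE_LEN: usize = 1 + 8 + 8 + 8 + 32 + 32 + 32 + 8 + 32 + 32;

/// chunk_pi_hash preimage: chain_id (assumed 8 bytes) followed by five 32-byte
/// fields: prev_state_root, post_state_root, withdraw_root, chunk_data_hash,
/// chunk_txdata_hash.
pub const CHUNK_PI_HASH_PREIMAGE_LEN: usize = 8 + 5 * 32;

/// batch_data_hash preimage: one 32-byte data hash per valid (non-padded) snark.
pub fn batch_data_hash_preimage_len(number_of_valid_snarks: usize) -> usize {
    32 * number_of_valid_snarks
}
```

Only the last of the three lengths depends on `k`, which is why the final hash is the one that needs dynamic handling.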
130163

131164
### Handling dynamic inputs
132-
133-
134165
![Dynamic_inputs](./figures/hash_table.jpg)
135166

136-
137167
Our keccak table uses $2^{19}$ rows. Each keccak round takes `300` rows. When the number of rounds is less than $2^{19}/300$, the cell manager will fill in the rest of the rows with dummy hashes.
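
As a rough illustration of the capacity described above, the number of rounds that fit in the table and the number of dummy rounds needed to fill it can be computed directly. The names below are hypothetical, and the sketch ignores any rows the real layout may reserve for other purposes.

```rust
/// Rows available in the keccak table.
pub const KECCAK_TABLE_ROWS: usize = 1 << 19;
/// Rows consumed by a single keccak round.
pub const ROWS_PER_KECCAK_ROUND: usize = 300;

/// Largest number of whole rounds that fit in the table.
pub const fn max_keccak_rounds() -> usize {
    KECCAK_TABLE_ROWS / ROWS_PER_KECCAK_ROUND
}

/// Rounds left to be filled with dummy hashes after `used_rounds` real rounds,
/// or `None` if the real rounds already exceed the table capacity.
pub fn dummy_rounds(used_rounds: usize) -> Option<usize> {
    max_keccak_rounds().checked_sub(used_rounds)
}
```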
138168

139169
The only hash that uses a dynamic number of rounds is the last hash.
@@ -158,3 +188,62 @@ For the output of the final data hash
158188

159189
Additional checks for dummy chunk
160190
- if `is_padding` for `i`-th chunk, we constrain `chunk[i]'s chunk_pi_hash_rlc_cells == chunk[i-1].chunk_pi_hash_rlc_cells`
191+
192+
## Recursion Circuit
193+
194+
`RecursionCircuit` aggregates $N$ SNARKs from a generic circuit (called `AppCircuit`). It achieves this aggregation by repeatedly combining each `AppCircuit` SNARK with the SNARK generated in the last round of aggregation (hence the name `recursion`). In each round of recursion, the Recursion circuit verifies a SNARK from the `AppCircuit` and its own SNARK from the “previous” round. For the first round of aggregation, a dummy SNARK is generated to combine with the first `AppCircuit` SNARK. Essentially, we have:
195+
196+
$RC_{Snark}(N) := verify(App_{Snark}(N)) \wedge verify(RC_{Snark}(N-1))$
197+
198+
where $RC$ indicates the Recursion circuit.
199+
With `accumulator_indices`, the Recursion Circuit can merge the `AppCircuit`’s accumulator, in case the `AppCircuit` itself is also a Batch Circuit.
200+
201+
### StateTransition Trait
202+
The `AppCircuit` must follow a layout that `RecursionCircuit` accepts. The layout is described by the `StateTransition` trait, which models data that can transition from a previous state to a current state, with methods indicating the indices of accumulators, states (prev and post) and additional exported PI fields.
203+
204+
```rust
205+
pub trait StateTransition: Sized {
206+
type Input: Clone;
207+
type Circuit: CircuitExt<Fr>;
208+
209+
// Describes the state transition
210+
fn state_transition(&self, round: usize) -> Self::Input;
211+
212+
// Count of the fields used to represent state. The public input consists of twice
213+
// this number as both the previous and current states are included in the public input.
214+
fn num_transition_instance() -> usize;
215+
216+
// Other counts of instance variables
217+
fn num_additional_instance() -> usize;
218+
fn num_instance() -> usize;
219+
fn num_accumulator_instance() -> usize;
220+
221+
// Location of accumulator, state variables and additional exported PIs
222+
fn accumulator_indices() -> Vec<usize>;
223+
fn state_prev_indices() -> Vec<usize>;
224+
fn state_indices() -> Vec<usize>;
225+
fn additional_indices() -> Vec<usize>;
226+
}
227+
```
228+
229+
### Public Inputs
230+
All parts of the $PI$ of the `AppCircuit` are also put into the $PI$ of the recursion circuit. The recursion circuit has a single column of $PI$ with the following layout:
231+
232+
```markdown
233+
`accumulator` | `preprocessed_digest` | `init_states` | `final_states` | `round`
234+
```
235+
236+
- `accumulator` accumulates all of the accumulators from the $N$ $snark_{app}$, all the accumulators exported from the $PI$ of these snarks (if there is any), and accumulators generated by the $N$ steps verification of snarks from recursion circuit.
237+
- `preprocessed_digest` represents the Recursion Circuit itself. There is a unique value for every recursion circuit that can bundle (any number of) snarks from a specified `AppCircuit`.
238+
- `init_states` represent the initial state $S_0$.
239+
- `final_states` represent the final state, along with the exported $PI$ from $S_N$.
240+
- `round` represents the number of batches being bundled recursively, i.e. $N$.
241+
242+
### Statements
243+
To verify the $k_{th}$ snark, we have 3 PIs: one from the current circuit, one from the snark of the $k_{th}$ `AppCircuit`, and one from the snark of the $(k-1)_{th}$ recursion circuit. We name them $PI$, $PI_{app}$ and $PI_{prev}$ respectively. The following equality constraints hold between them:
244+
245+
- if $N > 0$, $PI(preprocessed\_digest) = PI_{prev}(preprocessed\_digest)$: ensures the snark for the previous recursion circuit comes from the same circuit as the current one
246+
- if $N > 0$, $PI(round) = PI_{prev}(round) + 1$: ensures the round number is incremented each round, so that the first snark from the app circuit has round = 0
247+
- $PI_{app}(final\_states) = PI(final\_states)$: passes the PI through transparently to the app circuit
248+
- if $N > 0$, $PI(init\_states) = PI_{prev}(init\_states)$, else $PI(init\_states) = PI_{app}(init\_states)$: propagates the init state, and for the first recursion, the init state part of the PI is passed to the app circuit
249+
- $PI_{app}(init\_states) = PI_{prev}(final\_states)$: the init state part of the PI for the app circuit must be chained with the previous recursion round
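
The equality constraints above can be sketched as an out-of-circuit check. This is a simplification under stated assumptions: field elements are replaced by `u64`, the accumulator is omitted, and the first round is modelled as having no previous PI (`None`), whereas the circuit itself combines the first app snark with a dummy snark. All type and function names are hypothetical.

```rust
/// Hypothetical, simplified PI of the recursion circuit (accumulator omitted).
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct RecursionPi {
    pub preprocessed_digest: u64,
    pub init_states: Vec<u64>,
    pub final_states: Vec<u64>,
    pub round: u64,
}

/// Hypothetical, simplified PI of the app circuit.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct AppPi {
    pub init_states: Vec<u64>,
    pub final_states: Vec<u64>,
}

/// Checks the listed constraints for one round of recursion.
/// `prev` is `None` for the first round (an assumption of this sketch).
pub fn check_round(pi: &RecursionPi, app: &AppPi, prev: Option<&RecursionPi>) -> bool {
    // PI_app(final_states) = PI(final_states) holds in every round.
    if app.final_states != pi.final_states {
        return false;
    }
    match prev {
        // First round: round is 0 and the init state comes from the app circuit.
        None => pi.round == 0 && pi.init_states == app.init_states,
        // Later rounds: same circuit, incremented round, propagated init state,
        // and the app transition chained onto the previous final state.
        Some(prev) => {
            pi.preprocessed_digest == prev.preprocessed_digest
                && pi.round == prev.round + 1
                && pi.init_states == prev.init_states
                && app.init_states == prev.final_states
        }
    }
}
```

In this model, bundling `N` batches amounts to calling `check_round` once per batch, feeding each round's `RecursionPi` in as `prev` for the next.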

aggregator/src/batch.rs

-3
Original file line numberDiff line numberDiff line change
@@ -131,9 +131,6 @@ impl<const N_SNARKS: usize> BatchHeader<N_SNARKS> {
131131
/// - the first k chunks are from real traces
132132
/// - the last (#N_SNARKS-k) chunks are from empty traces
133133
/// A BatchHash consists of 2 hashes.
134-
/// - batch_pi_hash := keccak(chain_id || chunk_0.prev_state_root || chunk_k-1.post_state_root ||
135-
/// chunk_k-1.withdraw_root || batch_data_hash || z || y || versioned_hash)
136-
///
137134
/// - batchHash := keccak256(version || batch_index || l1_message_popped || total_l1_message_popped ||
138135
/// batch_data_hash || versioned_hash || parent_batch_hash || last_block_timestamp || z || y)
139136
/// - batch_data_hash := keccak(chunk_0.data_hash || ... || chunk_k-1.data_hash)
