-
Notifications
You must be signed in to change notification settings - Fork 61
feat: update batched merkle tree with changelogs #1677
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
for i in 0..self.changelog.len() { | ||
let existing = self.changelog[i]; | ||
if existing.hash_chain_index == hash_chain_index && | ||
existing.pending_batch_index == pending_batch_index { | ||
// Replace existing entry | ||
self.changelog[i] = entry; | ||
return; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not necessary, the cyclic vec takes care of overwriting existing values.
// Common implementation that works for both test and non-test modes | ||
|
||
/// Checks if a changelog entry is applicable to the current state | ||
pub fn is_changelog_entry_applicable(&self, entry: &crate::changelog::BatchChangelog) -> bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could it make sense to move all changelog related functions into a separate changelog.rs
file to make the diff as clean as possible?
} | ||
|
||
/// Find all changelog entries that are applicable to the current state | ||
pub fn find_applicable_changelog_entries(&self, current_root: &[u8; 32], current_seq: u64) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably more efficient to .filter
instead of allocating a new vector
let zeroed_entry = crate::changelog::BatchChangelog { | ||
old_root: [0u8; 32], | ||
new_root: [0u8; 32], | ||
leaves_hash_chain: [0u8; 32], | ||
hash_chain_index: 0, | ||
pending_batch_index: 0, | ||
_padding: [0u8; 5], | ||
expected_seq: 0, | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
zeroing out entries is expensive, what about adding a field that signals is_inserted or not?
} | ||
|
||
// 6. Return the batch append event. | ||
Ok(MerkleTreeEvent::BatchAppend(event)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we need to return all events so that the program can emit them via cpi every event must be a separate instruction (we can bundle more than 1 instruction into the noop cpi) -> need to change return type to Vec for all update from queue methods.
-> need to test that photon can deal this
leaves_hash_chain: [0u8; 32], | ||
hash_chain_index: 0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the fields leaves_hash_chain
and hash_chain_index
are likely not useful since we mix changelogs from input and output queue thus we won't have reliable access to the leaves_hash_chain
s in the output queue account.
hash_chain_index: 0, | ||
pending_batch_index: 0, | ||
_padding: [0u8; 5], | ||
expected_seq: 0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should also send the expected sequence number in the instruction data. I am not sure that it is possible to calculate the expected sequence number onchain correctly if we mix both input and output queues in the same changelog.
@@ -50,6 +50,10 @@ pub struct QueueBatches { | |||
impl QueueBatches { | |||
/// Returns the number of ZKP batches contained within a single regular batch. | |||
pub fn get_num_zkp_batches(&self) -> u64 { | |||
// Prevent division by zero in test cases | |||
if self.zkp_batch_size == 0 { | |||
return 1; // Provide a minimal safe value for tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should panic in this case, but the case shouldn't be possible for checks during initialization.
// Before initializing, set the data to a known state | ||
account.data.fill(0); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not necessary
// Parse or create changelog from remaining data | ||
let changelog = if metadata.changelog_capacity > 0 && !remaining_data.is_empty() { | ||
// Try to parse the changelog from the account data | ||
ZeroCopyCyclicVecU64::<crate::changelog::BatchChangelog>::from_bytes(remaining_data)? | ||
} else { | ||
// If we can't parse, return an error - we shouldn't create temporary buffers | ||
// in from_bytes as they won't outlive the function | ||
return Err(ZeroCopyError::Size.into()); | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should launch batched Merkle trees with a placeholder so that we can always assume that at least the length bytes exist in the account.
With this assumption we can just use from_bytes_at
without any conditions to deserialize the changelog zero copy vec.
if !self.is_changelog_entry_applicable(entry) { | ||
return Err(BatchedMerkleTreeError::OldRootMismatch); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't throw an error just skip changelogs that are not applicable.
let mut events = Vec::new(); | ||
let mut applied_any = true; | ||
|
||
// Continue processing as long as we're making progress |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's ok to not apply any changelogs as well.
let mut applied_any = true; | ||
|
||
// Continue processing as long as we're making progress | ||
while applied_any { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the most efficient way is probably to filter changelogs for not inserted, expected seq to greater or equal, sort them ascendingly, and do all this in iterators so that we don't copy any state.
As soon as we sorted sequence numbers are not consecutive we can stop.
We still need to check that the old root matches in every update.
let queue_batches = QueueBatches { | ||
currently_processing_batch_index: 0, | ||
num_batches: NUM_BATCHES as u64, | ||
batch_size: TEST_DEFAULT_BATCH_SIZE, | ||
bloom_filter_capacity: 20_000 * 8, | ||
zkp_batch_size: TEST_DEFAULT_ZKP_BATCH_SIZE, | ||
..Default::default() | ||
}; | ||
|
||
// Default changelog capacity is 2x the number of zkp batches | ||
let default_changelog_capacity = queue_batches.get_num_zkp_batches() * 2; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls revert to avoid unnecessary diff
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, I think it would be good to refactor the changelog specific functions into a separate file and add contained unit tests there to keep the diff in existing files minimal and make the changes easier to grasp.
Additionally, we can probably share the changelog logic between all the update_tree_from_queue
methods.
No description provided.