Skip to content

feat: update batched merkle tree with changelogs #1677

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

sergeytimoshin
Copy link
Contributor

No description provided.

Comment on lines +143 to +151
for i in 0..self.changelog.len() {
let existing = self.changelog[i];
if existing.hash_chain_index == hash_chain_index &&
existing.pending_batch_index == pending_batch_index {
// Replace existing entry
self.changelog[i] = entry;
return;
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not necessary, the cyclic vec takes care of overwriting existing values.

// Common implementation that works for both test and non-test modes

/// Checks if a changelog entry is applicable to the current state
pub fn is_changelog_entry_applicable(&self, entry: &crate::changelog::BatchChangelog) -> bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could it make sense to move all changelog related functions into a separate changelog.rs file to make the diff as clean as possible?

}

/// Find all changelog entries that are applicable to the current state
pub fn find_applicable_changelog_entries(&self, current_root: &[u8; 32], current_seq: u64)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably more efficient to .filter instead of allocating a new vector

Comment on lines +204 to +212
let zeroed_entry = crate::changelog::BatchChangelog {
old_root: [0u8; 32],
new_root: [0u8; 32],
leaves_hash_chain: [0u8; 32],
hash_chain_index: 0,
pending_batch_index: 0,
_padding: [0u8; 5],
expected_seq: 0,
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

zeroing out entries is expensive, what about adding a field that signals is_inserted or not?

}

// 6. Return the batch append event.
Ok(MerkleTreeEvent::BatchAppend(event))
Copy link
Contributor

@ananas-block ananas-block Apr 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to return all events so that the program can emit them via cpi every event must be a separate instruction (we can bundle more than 1 instruction into the noop cpi) -> need to change return type to Vec for all update from queue methods.
-> need to test that photon can deal this

Comment on lines +207 to +208
leaves_hash_chain: [0u8; 32],
hash_chain_index: 0,
Copy link
Contributor

@ananas-block ananas-block Apr 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the fields leaves_hash_chain and hash_chain_index are likely not useful since we mix changelogs from input and output queue thus we won't have reliable access to the leaves_hash_chains in the output queue account.

hash_chain_index: 0,
pending_batch_index: 0,
_padding: [0u8; 5],
expected_seq: 0,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should also send the expected sequence number in the instruction data. I am not sure that it is possible to calculate the expected sequence number onchain correctly if we mix both input and output queues in the same changelog.

@@ -50,6 +50,10 @@ pub struct QueueBatches {
impl QueueBatches {
/// Returns the number of ZKP batches contained within a single regular batch.
pub fn get_num_zkp_batches(&self) -> u64 {
// Prevent division by zero in test cases
if self.zkp_batch_size == 0 {
return 1; // Provide a minimal safe value for tests
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should panic in this case, but the case shouldn't be possible for checks during initialization.

Comment on lines +70 to +72
// Before initializing, set the data to a known state
account.data.fill(0);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not necessary

Comment on lines +380 to +388
// Parse or create changelog from remaining data
let changelog = if metadata.changelog_capacity > 0 && !remaining_data.is_empty() {
// Try to parse the changelog from the account data
ZeroCopyCyclicVecU64::<crate::changelog::BatchChangelog>::from_bytes(remaining_data)?
} else {
// If we can't parse, return an error - we shouldn't create temporary buffers
// in from_bytes as they won't outlive the function
return Err(ZeroCopyError::Size.into());
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should launch batched Merkle trees with a placeholder so that we can always assume that at least the length bytes exist in the account.
With this assumption we can just use from_bytes_at without any conditions to deserialize the changelog zero copy vec.

Comment on lines +161 to +163
if !self.is_changelog_entry_applicable(entry) {
return Err(BatchedMerkleTreeError::OldRootMismatch);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't throw an error just skip changelogs that are not applicable.

let mut events = Vec::new();
let mut applied_any = true;

// Continue processing as long as we're making progress
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's ok to not apply any changelogs as well.

let mut applied_any = true;

// Continue processing as long as we're making progress
while applied_any {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the most efficient way is probably to filter changelogs for not inserted, expected seq to greater or equal, sort them ascendingly, and do all this in iterators so that we don't copy any state.
As soon as we sorted sequence numbers are not consecutive we can stop.
We still need to check that the old root matches in every update.

Comment on lines +55 to +66
let queue_batches = QueueBatches {
currently_processing_batch_index: 0,
num_batches: NUM_BATCHES as u64,
batch_size: TEST_DEFAULT_BATCH_SIZE,
bloom_filter_capacity: 20_000 * 8,
zkp_batch_size: TEST_DEFAULT_ZKP_BATCH_SIZE,
..Default::default()
};

// Default changelog capacity is 2x the number of zkp batches
let default_changelog_capacity = queue_batches.get_num_zkp_batches() * 2;

Copy link
Contributor

@ananas-block ananas-block Apr 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls revert to avoid unnecessary diff

Copy link
Contributor

@ananas-block ananas-block left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, I think it would be good to refactor the changelog specific functions into a separate file and add contained unit tests there to keep the diff in existing files minimal and make the changes easier to grasp.
Additionally, we can probably share the changelog logic between all the update_tree_from_queue methods.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants