
perf(core): snapshots #3211


Draft · wants to merge 112 commits into base: main

Changes from all commits (112 commits)
098369e
chore(core) improve import blocks readability
pablodeymo Jun 2, 2025
4c94c78
Simplifying import_blocks
pablodeymo Jun 2, 2025
6346086
hash_no_commit_with_batch
pablodeymo Jun 2, 2025
61f2a8e
QueryPlan WIP
pablodeymo Jun 3, 2025
47d2d42
small changes
pablodeymo Jun 3, 2025
6918fe4
implement query_plan_apply
juanbono Jun 3, 2025
427d9b6
remove dbg statement
juanbono Jun 4, 2025
de7da8d
Comment out dbg in QueryPlan
pablodeymo Jun 4, 2025
245e2dd
Merge branch 'main' into storage_commit_change
pablodeymo Jun 4, 2025
7e4224e
Merge branch 'main' into storage_commit_change
pablodeymo Jun 4, 2025
f5790e7
remove unnecesary pub
pablodeymo Jun 4, 2025
30fae24
WIP full sync with batch writes
pablodeymo Jun 4, 2025
3e34382
finish migration to batch update for sync
Oppen Jun 4, 2025
e4dd995
Merge branch 'main' into storage_commit_change
pablodeymo Jun 5, 2025
d3d7f73
fixing compilation issue
pablodeymo Jun 5, 2025
8d0eb7a
add logging
Oppen Jun 6, 2025
847fc0f
more logs
Oppen Jun 6, 2025
6d24b58
use millis
Oppen Jun 6, 2025
c8a727c
wip: snapshot poc for account storage only
Oppen Jun 9, 2025
5e9fe65
refactor(core): refactor and implement bulk operations in other datab…
iovoid Jun 9, 2025
c5dd336
Co-authored-by: Juan Bono <[email protected]>
pablodeymo Jun 9, 2025
4ca6a19
returning the Trie apply_account_updates_from_trie_batch
pablodeymo Jun 9, 2025
b7075eb
account info snapshots
Oppen Jun 9, 2025
3c5ead5
todo!s in redb
pablodeymo Jun 9, 2025
3c18f72
changing if let in libmdbx
pablodeymo Jun 9, 2025
33b4f5d
tracing
Oppen Jun 9, 2025
dc5d0bc
fix error caused by apply_account_updates_from_trie_batch returning t…
pablodeymo Jun 9, 2025
8a70c08
Merge branch 'perf/snapshots_poc' into perf/storage_batch_and_snapshots
pablodeymo Jun 9, 2025
cc6d49c
fixing errors
pablodeymo Jun 9, 2025
25a5442
integration working
Oppen Jun 9, 2025
37f055b
try sorting and cursors
Oppen Jun 10, 2025
67856b0
progress on split tables, better sort
Oppen Jun 10, 2025
1ceb899
Revert "progress on split tables, better sort"
pablodeymo Jun 11, 2025
8e70278
Revert "try sorting and cursors"
pablodeymo Jun 11, 2025
180f768
Co-authored-by: Mario Rugiero <[email protected]>
pablodeymo Jun 11, 2025
7fb569a
WIP: re-generate snapshot
pablodeymo Jun 11, 2025
fad160e
Revert "WIP: re-generate snapshot"
pablodeymo Jun 11, 2025
8e2d0e9
WIP: tables for logs in libmdbx
pablodeymo Jun 11, 2025
3f4d650
Encodable for AccountInfoWriteLog
pablodeymo Jun 12, 2025
4be34a6
StorageStateWriteLog finished
pablodeymo Jun 12, 2025
f0d0bcb
WIP: storing log in the db
pablodeymo Jun 12, 2025
947dfd3
fix apply_updates method signature in StoreEngine
pablodeymo Jun 12, 2025
5258edb
WIP apply_updates in StoreEngine
pablodeymo Jun 12, 2025
3cd6690
init tables
Oppen Jun 13, 2025
bca0231
simplify and implement log saving
Oppen Jun 13, 2025
d9ba41e
fix: forgot to commit
Oppen Jun 13, 2025
8f44b09
fix: storage log entries
Oppen Jun 13, 2025
fa38d5a
wip: reorg support
Oppen Jun 13, 2025
311f1a6
commit
Oppen Jun 13, 2025
c92ade1
wip: fixing errors to make it compile
pablodeymo Jun 13, 2025
8fe3b87
fix loops to process the first value
pablodeymo Jun 13, 2025
651ee0d
implement the replay
Oppen Jun 13, 2025
fa93589
detect deletions
Oppen Jun 13, 2025
3e684a1
move comment
Oppen Jun 13, 2025
b677cdf
some blocks might not change storage or account info
Oppen Jun 13, 2025
6c1deb5
apply_fork_choice every 1000 blocks on import
Oppen Jun 13, 2025
345b102
derive debug for blocknumhash
Oppen Jun 14, 2025
079b8d5
reverse new_canonical_blocks
Oppen Jun 17, 2025
faf63d3
missing mut
Oppen Jun 17, 2025
b635586
Merge branch 'main' into perf/storage_batch_and_snapshots
pablodeymo Jun 17, 2025
e817511
fix cli,rs
pablodeymo Jun 17, 2025
a67f90a
clippy
pablodeymo Jun 17, 2025
f1dbc45
rework replay/undo algorithm
Oppen Jun 18, 2025
6a1f1ac
Merge branch 'perf/storage_batch_and_snapshots' of github.com:lambdac…
Oppen Jun 18, 2025
f92887f
Merge branch 'main' into perf/storage_batch_and_snapshots
pablodeymo Jun 18, 2025
a10c34f
Merge branch 'perf/storage_batch_and_snapshots' of github.com:lambdac…
Oppen Jun 18, 2025
bff316e
improve match
pablodeymo Jun 18, 2025
654a4b8
remove unnecessary mut
pablodeymo Jun 18, 2025
c995843
fix unwrap_or
pablodeymo Jun 18, 2025
e18d09e
remove unused use
pablodeymo Jun 18, 2025
2efc16d
fix unused index in enumerate()
pablodeymo Jun 18, 2025
a6233c9
fix clippy warning
pablodeymo Jun 18, 2025
835e633
fix: the new algorithm needs to happen *after* updating the canonical…
Oppen Jun 18, 2025
a734313
fix: always update the snapshot when marking a new chain as canonical
Oppen Jun 18, 2025
d0bad7b
Merge branch 'perf/storage_batch_and_snapshots' of github.com:lambdac…
Oppen Jun 18, 2025
201ec4e
refactor codec structs
pablodeymo Jun 18, 2025
40b8089
table! on top of libmdbx.rs file
pablodeymo Jun 18, 2025
1a3e7d0
refactor aux tables for libmdbx
pablodeymo Jun 18, 2025
bc288ea
conditional compilation of encobable methods
pablodeymo Jun 18, 2025
68b67d1
Merge branch 'main' into perf/storage_batch_and_snapshots
pablodeymo Jun 18, 2025
d7692b6
many fixes
Oppen Jun 18, 2025
d0e3ae4
fix build
Oppen Jun 18, 2025
c47332f
attemt to handle missing ancestor
Oppen Jun 18, 2025
8a0880f
fix codecs for log entries
Oppen Jun 18, 2025
a3a26a8
WIP cleanup and replace_value method in libmdbx
pablodeymo Jun 19, 2025
550dd3e
remove async in replace_value
pablodeymo Jun 19, 2025
f9a64b8
remove debub_assert_eq
pablodeymo Jun 19, 2025
e374a67
replay_writes_until_head: remove unnecesary variable
pablodeymo Jun 19, 2025
01a993c
comment removed
pablodeymo Jun 19, 2025
f2a4a82
Merge branch 'main' into perf/storage_batch_and_snapshots
pablodeymo Jun 19, 2025
340782c
fix an edge case + consistent use of constants
Oppen Jun 19, 2025
d0f990c
Merge branch 'perf/storage_batch_and_snapshots' of github.com:lambdac…
Oppen Jun 19, 2025
2d3249e
fix removed entry detection
Oppen Jun 19, 2025
0a67694
conditionally update the snapshot
Oppen Jun 19, 2025
77f35be
replace_value_or_delete
pablodeymo Jun 19, 2025
fe85651
refactor to use helper
Oppen Jun 19, 2025
e2e971b
just in case respect order of checks
Oppen Jun 19, 2025
3d8cfd6
AccountStorageLogEntry with named fields
pablodeymo Jun 19, 2025
e5702d6
clippy
pablodeymo Jun 19, 2025
6a959cd
refactor: update account storage log handling to use AccountStorageLo…
pablodeymo Jun 19, 2025
47c228e
fix errors
pablodeymo Jun 19, 2025
eeab456
add tracing for snapshot misses
Oppen Jun 23, 2025
7f46643
Merge branch 'perf/storage_batch_and_snapshots' of github.com:lambdac…
Oppen Jun 23, 2025
a4b4441
refactor: add feature flag annotations for account info and storage l…
pablodeymo Jun 23, 2025
c9a8834
Batch updates in account info and storage log updates
pablodeymo Jun 23, 2025
aad0d08
fixes
Oppen Jun 23, 2025
9bc0c45
fix
Oppen Jun 23, 2025
d2bd73c
fix
Oppen Jun 23, 2025
a26ea71
add logging for account removal
Oppen Jun 23, 2025
d92b3d8
try not clearing cache, update immutable_cache
Oppen Jun 24, 2025
7eb5b09
and take removal into account
Oppen Jun 24, 2025
207205f
try also not clearing the main cache
Oppen Jun 24, 2025
1 change: 1 addition & 0 deletions Cargo.lock


7 changes: 6 additions & 1 deletion cmd/ethrex/cli.rs
@@ -376,7 +376,7 @@ pub async fn import_blocks(
utils::read_chain_file(path)
};
let size = blocks.len();
for block in &blocks {
for block in blocks.iter() {
let hash = block.hash();
let number = block.header.number;
info!("Adding block {number} with hash {hash:#x}.");
@@ -405,6 +405,11 @@ pub async fn import_blocks(

// Make head canonical and label all special blocks correctly.
if let Some(block) = blocks.last() {
_ = store
.reconstruct_snapshots_for_new_canonical_chain(block.hash())
.await
.inspect_err(|error| warn!("Failed to reconstruct snapshot: {}", error));

store
.update_finalized_block_number(block.header.number)
.await?;
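The `cli.rs` hunk above makes snapshot reconstruction non-fatal during import: the `Result` is logged via `inspect_err` and then discarded with `_ =`. A minimal sketch of that pattern, with `reconstruct` as a hypothetical stand-in for the store call:

```rust
// Stand-in for store.reconstruct_snapshots_for_new_canonical_chain(...).
fn reconstruct(ok: bool) -> Result<(), String> {
    if ok { Ok(()) } else { Err("missing ancestor".into()) }
}

fn main() {
    // Log the failure, then discard the Result so import continues anyway.
    _ = reconstruct(false)
        .inspect_err(|error| eprintln!("Failed to reconstruct snapshot: {error}"));
    println!("import continues");
}
```

`_ = …` (rather than `let _ = …?`) is what keeps a reconstruction failure from aborting the whole block import.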
11 changes: 5 additions & 6 deletions cmd/ethrex/l2/command.rs
@@ -341,7 +341,7 @@ impl Command {
.await
.expect("Error applying account updates");

let (new_state_root, state_updates, accounts_updates) =
let (new_state_root, state_updates, storage_updates) =
(
account_updates_list.state_trie_hash,
account_updates_list.state_updates,
@@ -350,13 +350,12 @@

let pseudo_update_batch = UpdateBatch {
account_updates: state_updates,
storage_updates: accounts_updates,
blocks: vec![],
receipts: vec![],
code_updates: vec![],
storage_updates,
..Default::default()
};

store.store_block_updates(pseudo_update_batch).await.expect("Error storing trie updates");
let account_updates: Vec<_> = account_updates.values().cloned().collect();
store.store_block_updates(pseudo_update_batch, &[&account_updates]).await.expect("Error storing trie updates");

new_trie = store.open_state_trie(new_state_root).expect("Error opening new state trie");

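The `l2/command.rs` hunk replaces the explicitly-listed empty fields of `UpdateBatch` with `..Default::default()`. A simplified sketch of that struct-update idiom, with a cut-down stand-in for the real `UpdateBatch`:

```rust
// Simplified stand-in for the real UpdateBatch (field types reduced to Vec<u8>).
#[derive(Default, Debug)]
struct UpdateBatch {
    account_updates: Vec<u8>,
    storage_updates: Vec<u8>,
    blocks: Vec<u8>,
    receipts: Vec<u8>,
    code_updates: Vec<u8>,
}

// Only the fields that differ from their defaults are spelled out;
// `..Default::default()` fills the rest with empty vectors.
fn make_batch(account_updates: Vec<u8>, storage_updates: Vec<u8>) -> UpdateBatch {
    UpdateBatch {
        account_updates,
        storage_updates,
        ..Default::default()
    }
}

fn main() {
    let batch = make_batch(vec![1], vec![2]);
    assert!(batch.blocks.is_empty() && batch.receipts.is_empty() && batch.code_updates.is_empty());
    println!("{batch:?}");
}
```

The upside over listing `blocks: vec![]` etc. is that the call site no longer needs editing when new fields (like the log fields added in this PR) gain sensible defaults.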
42 changes: 37 additions & 5 deletions crates/blockchain/blockchain.rs
@@ -345,6 +345,14 @@ impl Blockchain {
let state_updates = apply_updates_list.state_updates;
let accounts_updates = apply_updates_list.storage_updates;
let code_updates = apply_updates_list.code_updates;
let account_info_log_updates = self
.storage
.build_account_info_logs(std::iter::once((block, &account_updates.to_vec())))
.map_err(ChainError::StoreError)?;
let storage_log_updates = self
.storage
.build_account_storage_logs(std::iter::once((block, &account_updates.to_vec())))
.map_err(ChainError::StoreError)?;

// Check state root matches the one in block header
validate_state_root(&block.header, new_state_root)?;
@@ -355,10 +363,11 @@
blocks: vec![block.clone()],
receipts: vec![(block.hash(), execution_result.receipts)],
code_updates,
account_info_log_updates,
storage_log_updates,
};

self.storage
.clone()
.store_block_updates(update_batch)
.await
.map_err(|e| e.into())
@@ -449,6 +458,8 @@ impl Blockchain {
let mut total_gas_used = 0;
let mut transactions_count = 0;

let mut account_updates_acc = Vec::with_capacity(blocks_len);

let interval = Instant::now();
for (i, block) in blocks.iter().enumerate() {
// for the first block, we need to query the store
@@ -484,11 +495,12 @@
total_gas_used += block.header.gas_used;
transactions_count += block.body.transactions.len();
all_receipts.push((block.hash(), receipts));
}

let account_updates = vm
.get_state_transitions()
.map_err(|err| (ChainError::EvmError(err), None))?;
let account_updates = vm
.get_state_transitions()
.map_err(|err| (ChainError::EvmError(err), None))?;
account_updates_acc.push(account_updates);
}

let last_block = blocks
.last()
@@ -497,6 +509,11 @@
let last_block_number = last_block.header.number;
let last_block_gas_limit = last_block.header.gas_limit;

let account_updates = account_updates_acc
.iter()
.flatten()
.cloned()
.collect::<Vec<_>>();
// Apply the account updates over all blocks and compute the new state root
let account_updates_list = self
.storage
@@ -505,6 +522,8 @@
.map_err(|e| (e.into(), None))?
.ok_or((ChainError::ParentStateNotFound, None))?;

let account_updates_per_block = blocks.iter().zip(account_updates_acc.iter());

let new_state_root = account_updates_list.state_trie_hash;
let state_updates = account_updates_list.state_updates;
let accounts_updates = account_updates_list.storage_updates;
@@ -513,12 +532,25 @@
// Check state root matches the one in block header
validate_state_root(&last_block.header, new_state_root).map_err(|e| (e, None))?;

// call helper function to build write logs
let account_info_log_updates = self
.storage
.build_account_info_logs(account_updates_per_block.clone())
.map_err(|e| (e.into(), None))?;

let storage_log_updates = self
.storage
.build_account_storage_logs(account_updates_per_block)
.map_err(|e| (e.into(), None))?;

let update_batch = UpdateBatch {
account_updates: state_updates,
storage_updates: accounts_updates,
blocks,
receipts: all_receipts,
code_updates,
account_info_log_updates,
storage_log_updates,
};

self.storage
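The batch-import hunks above collect account updates per block (`account_updates_acc`), then use the same data twice: flattened into one list for the batched trie update, and zipped with the blocks for `build_account_info_logs` / `build_account_storage_logs`. A sketch of that shape, with string stand-ins for the update types:

```rust
// One flat list for the batched state-trie update, built from per-block updates.
fn flatten_updates<'a>(per_block: &'a [Vec<&'a str>]) -> Vec<&'a str> {
    per_block.iter().flatten().cloned().collect()
}

fn main() {
    // Some blocks may contribute no updates at all (empty inner vec).
    let per_block = vec![vec!["acct_a", "acct_b"], vec![], vec!["acct_c"]];

    // Flattened view: feeds the single batched trie update.
    assert_eq!(flatten_updates(&per_block), ["acct_a", "acct_b", "acct_c"]);

    // Per-block view: zipped with the blocks to build the write logs.
    let blocks = [1u64, 2, 3];
    let pairs: Vec<_> = blocks.iter().zip(per_block.iter()).collect();
    assert_eq!(pairs.len(), 3);
}
```

Keeping both views avoids re-running `get_state_transitions` per block just to attribute updates to the block that produced them.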
20 changes: 11 additions & 9 deletions crates/blockchain/fork_choice.rs
@@ -66,10 +66,10 @@ pub async fn apply_fork_choice(
return Err(InvalidForkChoice::UnlinkedHead);
};

let link_block_number = match new_canonical_blocks.last() {
Some((number, _)) => *number,
None => head.number,
};
let link_block_number = new_canonical_blocks
.last()
.map(|(number, _)| *number)
.unwrap_or(head.number);

// Check that finalized and safe blocks are part of the new canonical chain.
if let Some(ref finalized) = finalized_res {
@@ -99,7 +99,6 @@
}

// Finished all validations.

// Make all ancestors to head canonical.
for (number, hash) in new_canonical_blocks {
store.set_canonical_block(number, hash).await?;
@@ -112,6 +111,11 @@

// Make head canonical and label all special blocks correctly.
store.set_canonical_block(head.number, head_hash).await?;

store
.reconstruct_snapshots_for_new_canonical_chain(head_hash)
.await?;

if let Some(finalized) = finalized_res {
store
.update_finalized_block_number(finalized.number)
@@ -176,10 +180,8 @@
let parent_hash = header.parent_hash;

// Check that the parent exists.
let parent_header = match store.get_block_header_by_hash(parent_hash) {
Ok(Some(header)) => header,
Ok(None) => return Ok(None),
Err(error) => return Err(error),
let Some(parent_header) = store.get_block_header_by_hash(parent_hash)? else {
return Ok(None);
};

if is_canonical(store, block_number, parent_hash).await? {
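The first `fork_choice.rs` hunk is a pure refactor: a `match` on `Option` is replaced by `map`/`unwrap_or`. A behavior-equivalent sketch, with `[u8; 32]` as a stand-in for the block-hash type:

```rust
// Equivalent to the replaced match: take the number of the last
// newly-canonical block, or fall back to the head's number.
fn link_block_number(new_canonical_blocks: &[(u64, [u8; 32])], head_number: u64) -> u64 {
    new_canonical_blocks
        .last()
        .map(|(number, _)| *number)
        .unwrap_or(head_number)
}

fn main() {
    assert_eq!(link_block_number(&[], 7), 7);
    assert_eq!(link_block_number(&[(3, [0; 32]), (5, [0; 32])], 7), 5);
}
```

The later hunk applies the same spirit with `let … else`, collapsing the three-arm `match` on `Result<Option<_>>` into a `?` for the error arm and an early `return Ok(None)` for the missing-header arm.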
40 changes: 34 additions & 6 deletions crates/blockchain/vm.rs
@@ -42,15 +42,43 @@ impl StoreVmDatabase {

impl VmDatabase for StoreVmDatabase {
fn get_account_info(&self, address: Address) -> Result<Option<AccountInfo>, EvmError> {
self.store
.get_account_info_by_hash(self.block_hash, address)
.map_err(|e| EvmError::DB(e.to_string()))
let block_for_current_snapshot = self
.store
.get_block_for_current_snapshot()
.map_err(|e| EvmError::DB(e.to_string()))?;

if Some(self.block_hash) == block_for_current_snapshot {
self.store.get_current_account_info(address)
} else {
tracing::warn!(
"account_info snapshot miss: expected: {:?} got: {:?}",
block_for_current_snapshot,
self.block_hash
);
self.store
.get_account_info_by_hash(self.block_hash, address)
}
.map_err(|e| EvmError::DB(e.to_string()))
}

fn get_storage_slot(&self, address: Address, key: H256) -> Result<Option<U256>, EvmError> {
self.store
.get_storage_at_hash(self.block_hash, address, key)
.map_err(|e| EvmError::DB(e.to_string()))
let block_for_current_snapshot = self
.store
.get_block_for_current_snapshot()
.map_err(|e| EvmError::DB(e.to_string()))?;

if Some(self.block_hash) == block_for_current_snapshot {
self.store.get_current_storage(address, key)
} else {
tracing::warn!(
"account_storage snapshot miss: expected: {:?} got: {:?}",
block_for_current_snapshot,
self.block_hash
);
self.store
.get_storage_at_hash(self.block_hash, address, key)
}
.map_err(|e| EvmError::DB(e.to_string()))
}

fn get_block_hash(&self, block_number: u64) -> Result<H256, EvmError> {
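The `vm.rs` hunk above is the read-side of the snapshot: serve account info and storage from the flat snapshot only when the snapshot was built at exactly the block being executed, and otherwise log a miss and fall back to the (slower) trie lookup. A hedged sketch of that dispatch, with `u64` standing in for block hashes:

```rust
// Decide which backend serves a state read. `snapshot_block` is the block the
// flat snapshot currently reflects; `query_block` is the block being executed.
fn lookup(snapshot_block: Option<u64>, query_block: u64) -> &'static str {
    if Some(query_block) == snapshot_block {
        "snapshot" // O(1) flat-table read
    } else {
        // Snapshot miss: the caller would warn here, then use the trie.
        "trie"
    }
}

fn main() {
    assert_eq!(lookup(Some(42), 42), "snapshot");
    assert_eq!(lookup(Some(42), 41), "trie"); // snapshot lags or leads the query
    assert_eq!(lookup(None, 42), "trie");     // no snapshot built yet
}
```

The `tracing::warn!` on the miss path is what makes the new "snapshot miss" telemetry in this PR actionable: frequent misses mean the replay/undo machinery is not keeping the snapshot aligned with execution.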
2 changes: 1 addition & 1 deletion crates/common/types/account.rs
@@ -25,7 +25,7 @@ pub struct Account {
pub storage: HashMap<H256, U256>,
}

#[derive(Clone, Debug, PartialEq, Serialize, Deserialize, Eq)]
#[derive(Clone, Debug, PartialEq, Serialize, Deserialize, Eq, Hash)]
pub struct AccountInfo {
pub code_hash: H256,
pub balance: U256,
11 changes: 11 additions & 0 deletions crates/networking/p2p/sync.rs
@@ -457,13 +457,24 @@ impl Syncer {
.set_latest_valid_ancestor(failed_block_hash, last_valid_hash)
.await?;

// We also need to use the correct snapshot.
store
.reconstruct_snapshots_for_new_canonical_chain(last_valid_hash)
.await
.map_err(SyncError::Store)?;
// TODO(#2127): Just marking the failing ancestor is enough for the the Missing Ancestors hive test,
// we want to look at a more robust solution in the future if needed.
}

return Err(error.into());
}

// We also need to use the correct snapshot.
store
.reconstruct_snapshots_for_new_canonical_chain(last_block.hash())
.await
.map_err(SyncError::Store)?;

store
.update_latest_block_number(last_block.header.number)
.await?;
30 changes: 28 additions & 2 deletions crates/storage/api.rs
@@ -1,8 +1,9 @@
use bytes::Bytes;
use ethereum_types::{H256, U256};
use ethrex_common::Address;
use ethrex_common::types::{
AccountState, Block, BlockBody, BlockHash, BlockHeader, BlockNumber, ChainConfig, Index,
Receipt, Transaction, payload::PayloadBundle,
AccountInfo, AccountState, Block, BlockBody, BlockHash, BlockHeader, BlockNumber, ChainConfig,
Index, Receipt, Transaction, payload::PayloadBundle,
};
use std::{fmt::Debug, panic::RefUnwindSafe};

@@ -17,6 +18,10 @@ pub trait StoreEngine: Debug + Send + Sync + RefUnwindSafe {
/// Store changes in a batch from a vec of blocks
async fn apply_updates(&self, update_batch: UpdateBatch) -> Result<(), StoreError>;

async fn undo_writes_until_canonical(&self) -> Result<(), StoreError>;

async fn replay_writes_until_head(&self, head: H256) -> Result<(), StoreError>;

/// Add a batch of blocks in a single transaction.
/// This will store -> BlockHeader, BlockBody, BlockTransactions, BlockNumber.
async fn add_blocks(&self, blocks: Vec<Block>) -> Result<(), StoreError>;
@@ -304,6 +309,27 @@ pub trait StoreEngine: Debug + Send + Sync + RefUnwindSafe {
&self,
) -> Result<Option<[H256; STATE_TRIE_SEGMENTS]>, StoreError>;

async fn setup_genesis_flat_account_storage(
&self,
genesis_block_number: u64,
genesis_block_hash: H256,
genesis_accounts: &[(Address, H256, U256)],
) -> Result<(), StoreError>;

async fn setup_genesis_flat_account_info(
&self,
genesis_block_number: u64,
genesis_block_hash: H256,
genesis_accounts: &[(Address, u64, U256, H256, bool)],
) -> Result<(), StoreError>;

fn get_block_for_current_snapshot(&self) -> Result<Option<BlockHash>, StoreError>;

fn get_current_storage(&self, address: Address, key: H256) -> Result<Option<U256>, StoreError>;

fn get_current_account_info(&self, address: Address)
-> Result<Option<AccountInfo>, StoreError>;

/// Sets storage trie paths in need of healing, grouped by hashed address
/// This will overwite previously stored paths for the received storages but will not remove other storage's paths
async fn set_storage_heal_paths(
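The trait additions pair `undo_writes_until_canonical` with `replay_writes_until_head`: on a reorg, logged writes are first unwound back to the canonical chain, then the log is replayed forward to the new head. The sketch below is only an illustration of that two-phase plan under the assumption that both phases walk block-by-block; `plan` and `Direction` are hypothetical names, not part of the store API:

```rust
#[derive(Debug, PartialEq)]
enum Direction { Undo, Replay }

// Plan the write-log operations needed to move a snapshot currently at
// `snapshot_at` onto a head at `head`, given their common canonical ancestor.
fn plan(snapshot_at: u64, canonical: u64, head: u64) -> Vec<(Direction, u64)> {
    let mut steps = Vec::new();
    // Phase 1: undo logged writes from the snapshot back to the ancestor.
    for b in (canonical..snapshot_at).rev() {
        steps.push((Direction::Undo, b + 1));
    }
    // Phase 2: replay logged writes forward from the ancestor to the head.
    for b in canonical..head {
        steps.push((Direction::Replay, b + 1));
    }
    steps
}

fn main() {
    // Snapshot at block 5, fork point at block 3, new head at block 6:
    // undo 5 and 4, then replay 4, 5, 6 on the new branch.
    let steps = plan(5, 3, 6);
    assert_eq!(steps.len(), 5);
    assert_eq!(steps[0], (Direction::Undo, 5));
    // Already canonical and at head: nothing to do.
    assert!(plan(3, 3, 3).is_empty());
}
```

This is why the PR logs both account-info and storage writes per block: without per-block write logs, neither phase could be computed incrementally and the snapshot would have to be rebuilt from the trie.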