Feat(e2e): support multiple aggregators in the e2e tests #2378

Open
wants to merge 30 commits into
base: main
7b05361
feat(e2e): update e2e test command line arguments
jpraynaud Mar 10, 2025
99f5881
feat(e2e): 'MithrilInfrastructure' supports multiple aggregators
jpraynaud Mar 11, 2025
19b4507
refactor(e2e): implement interior mutability for 'Aggregator', 'RunOn…
jpraynaud Mar 12, 2025
6e4b6b9
refactor(relay): support for dialing to peer in relays of e2e test
jpraynaud Mar 13, 2025
4e22900
refactor(e2e): enhance 'MithrilInfrastructure' support for multiple a…
jpraynaud Mar 13, 2025
6906938
feat(e2e): 'RunOnly' supports multiple aggregators
jpraynaud Mar 13, 2025
7e3ac58
feat(e2e): 'Spec' supports multiple aggregators
jpraynaud Mar 13, 2025
7ce4531
feat(e2e): runner supports multiple aggregators
jpraynaud Mar 13, 2025
59233ef
refactor(e2e): enhance naming of aggregators and associated relays
jpraynaud Mar 14, 2025
932360d
refactor(e2e): enhance assertions checks
jpraynaud Mar 14, 2025
40f32af
fix(common): enhance Certificate display implementation
jpraynaud Mar 17, 2025
e9e0d2f
fix(aggregator): integration test for slave uses evolving Mithril sta…
jpraynaud Mar 17, 2025
ed50cc8
fix(common): avoid too low stake in random stake distribution
jpraynaud Mar 17, 2025
b6e18ea
fix(aggregator): slave signer registration stabilization
jpraynaud Mar 17, 2025
f7c754c
feat(relay): implement signer relay modes
jpraynaud Mar 18, 2025
1c7f425
refactor(e2e): use signer relay modes in e2e test
jpraynaud Mar 18, 2025
80c2021
refactor(ci): update e2e tests in CI to use the signer relay modes
jpraynaud Mar 18, 2025
bd12065
refactor(e2e): better naming for aggregators in e2e tests
jpraynaud Mar 18, 2025
8bd1e56
fix(e2e): delegate stakes only from the first aggregator
jpraynaud Mar 18, 2025
be5f9ab
refactor(e2e): remove distinction master/slave aggregator
jpraynaud Mar 19, 2025
4731590
fix(e2e): make genesis bootstrap error retryable
jpraynaud Mar 19, 2025
cad83fb
fixup: feat: update state machine to support slave aggregator mode
jpraynaud Mar 20, 2025
0b21674
fix(e2e): flakiness in the genesis bootstrap of slave aggregators
jpraynaud Mar 20, 2025
42b4d01
refactor(e2e): enhance assertions logs with aggregator name
jpraynaud Mar 20, 2025
752292f
fix(ci): wrong format for next era in some e2e scenarios
jpraynaud Mar 21, 2025
124810c
fix(e2e): era switch done on multiple aggregators
jpraynaud Mar 21, 2025
33ddc53
refactor(aggregator): simplify slave aggregator integration test
jpraynaud Mar 21, 2025
366f910
refactor(e2e): better parameter handling with clap
jpraynaud Mar 21, 2025
618f376
chore(e2e): apply review comments
jpraynaud Mar 21, 2025
9e47514
wip(ci): DO NOT MERGE
jpraynaud Mar 21, 2025
25 changes: 18 additions & 7 deletions .github/workflows/ci.yml
@@ -294,23 +294,34 @@ jobs:
strategy:
fail-fast: false
matrix:
mode: ["std"]
mode: ["master-slave"]
era: ${{ fromJSON(needs.build-ubuntu-X64.outputs.eras) }}
next_era: [""]
cardano_node_version: ["10.1.3", "10.1.4", "10.2.1"]
hard_fork_latest_era_at_epoch: [0]
run_id: ["#1", "#2"]
extra_args: [""]
run_id: ["#1", "#2", "#3", "#4", "#5", "#6", "#7", "#8", "#9", "#10"]
extra_args:
[
"--number-of-aggregators=2 --use-relays --relay-signer-registration-mode=passthrough --relay-signature-registration-mode=p2p",
]

include:
# Include a test for the P2P mode
- mode: "p2p"
# Include a test for partial decentralization with master/slave signer registration and P2P signature registration
- mode: "master-slave"
era: ${{ fromJSON(needs.build-ubuntu-X64.outputs.eras)[0] }}
next_era: ""
cardano_node_version: "10.1.4"
hard_fork_latest_era_at_epoch: 0
run_id: "#1"
extra_args: "--number-of-aggregators=2 --use-relays --relay-signer-registration-mode=passthrough --relay-signature-registration-mode=p2p"
# Include a test for full decentralization with P2P signer registration and P2P signature registration
- mode: "decentralized"
era: ${{ fromJSON(needs.build-ubuntu-X64.outputs.eras)[0] }}
next_era: [""]
next_era: ""
cardano_node_version: "10.1.4"
hard_fork_latest_era_at_epoch: 0
run_id: "#1"
extra_args: "--use-p2p-network"
extra_args: "--number-of-aggregators=2 --use-relays --relay-signer-registration-mode=p2p --relay-signature-registration-mode=p2p"
# Include a test for the era switch without regenesis
- mode: "std"
era: ${{ fromJSON(needs.build-ubuntu-X64.outputs.eras)[0] }}
2 changes: 2 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default.

16 changes: 8 additions & 8 deletions mithril-aggregator/src/runtime/state_machine.rs
@@ -143,7 +143,6 @@ impl AggregatorRuntime {
info!(self.logger, "→ Trying to transition to READY"; "last_time_point" => ?last_time_point);

let can_try_transition_from_idle_to_ready = if self.config.is_slave {
println!("Checking if slave aggregator is at the same epoch as master");
self.runner
.is_slave_aggregator_at_same_epoch_as_master(&last_time_point)
.await?
@@ -265,18 +264,19 @@ impl AggregatorRuntime {
self.runner
.update_stake_distribution(&new_time_point)
.await?;
if self.config.is_slave {
self.runner
.synchronize_slave_aggregator_signer_registration()
.await?;
}
self.runner.inform_new_epoch(new_time_point.epoch).await?;

self.runner.upkeep(new_time_point.epoch).await?;
self.runner
.open_signer_registration_round(&new_time_point)
.await?;
self.runner.update_epoch_settings().await?;
if self.config.is_slave {
self.runner
.synchronize_slave_aggregator_signer_registration()
.await?;
// Needed to recompute epoch data for the next signing round on the slave
self.runner.inform_new_epoch(new_time_point.epoch).await?;
}
Comment on lines 272 to +279
@Alenar (Collaborator), Mar 21, 2025

Can you explain how this change helps stabilize the e2e tests? I'm quite puzzled that we need to call runner.inform_new_epoch twice.

From what I understand, this doesn't affect the methods called between the two inform_new_epoch calls:

  • the runner.upkeep call should not be impacted
  • open_signer_registration_round does nothing on a slave
  • update_epoch_settings should not be impacted, as the data registered by the epoch service (protocol parameters and transactions signing config) doesn't depend on the master aggregator

The functional impacts should be:

  • the epoch service will expose an incorrect list of next_signers in the interval between the two inform_new_epoch calls
  • the epoch service will be ready earlier, since the first inform_new_epoch call is made without needing a round trip to the master aggregator

Is the last point the problem on a fast network? Maybe the synchronizer should be able to "edit" the next signers in the epoch_service instead?
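
To make the two options in this comment concrete, here is a minimal sketch in plain Rust. Everything in it is an assumption made for illustration: `EpochService`, `replace_next_signers`, `inform_twice` and `inform_then_edit` are hypothetical stand-ins, not the real Mithril types or methods, and the signer lists are made-up strings. It only models the ordering concern: what the service exposes between the two `inform_new_epoch` calls, and that an "edit" entry point could reach the same final state.

```rust
// Illustrative model only: `EpochService` is a hypothetical stand-in,
// not the real Mithril epoch service.
#[derive(Debug, Default, PartialEq)]
struct EpochService {
    epoch: Option<u64>,
    next_signers: Vec<String>,
}

impl EpochService {
    /// Stand-in for `inform_new_epoch`: records the epoch and snapshots
    /// whatever signers the local store holds at call time.
    fn inform_new_epoch(&mut self, epoch: u64, local_store: &[String]) {
        self.epoch = Some(epoch);
        self.next_signers = local_store.to_vec();
    }

    /// Hypothetical "edit" entry point that a synchronizer could call.
    fn replace_next_signers(&mut self, signers: &[String]) {
        self.next_signers = signers.to_vec();
    }
}

/// Flow discussed in this PR for a slave: inform, synchronize, inform again.
/// Returns the signers exposed between the two calls and the final state.
fn inform_twice(
    epoch: u64,
    before_sync: &[String],
    after_sync: &[String],
) -> (Vec<String>, EpochService) {
    let mut service = EpochService::default();
    service.inform_new_epoch(epoch, before_sync);
    let exposed_in_between = service.next_signers.clone();
    service.inform_new_epoch(epoch, after_sync);
    (exposed_in_between, service)
}

/// Alternative raised in the comment: inform once, then edit the signers.
fn inform_then_edit(epoch: u64, before_sync: &[String], after_sync: &[String]) -> EpochService {
    let mut service = EpochService::default();
    service.inform_new_epoch(epoch, before_sync);
    service.replace_next_signers(after_sync);
    service
}

fn main() {
    let before_sync = vec!["signer-a".to_string()];
    let after_sync = vec!["signer-a".to_string(), "signer-b".to_string()];

    let (exposed, twice) = inform_twice(5, &before_sync, &after_sync);
    let edited = inform_then_edit(5, &before_sync, &after_sync);

    println!("exposed between the two calls: {:?}", exposed);
    println!("same final state: {}", twice == edited);
}
```

In this toy model both flows still expose the pre-synchronization list for a while; the edit variant only avoids re-running the whole `inform_new_epoch` step. Whether that holds for the real epoch service depends on what else `inform_new_epoch` recomputes, which this sketch does not model.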

self.runner.precompute_epoch_data().await?;
}

@@ -940,7 +940,7 @@ mod tests {
runner
.expect_inform_new_epoch()
.with(predicate::eq(new_time_point_clone.clone().epoch))
.once()
.times(2)
Collaborator

Why do we need to change the number of calls?
The modification to the state_machine seems to concern only the slave mode.
Does it mean we are running a slave?
The test name says it is a master: "idle_new_epoch_detected_and_master_has_transitioned_to_epoch"
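
One way to reason about the expected call count is to replay the call order from the state_machine.rs hunk above with a counting stand-in. The sketch below is an assumption-laden illustration: `CountingRunner`, `transition_to_new_epoch` and `count_inform_calls` are hypothetical names, the other runner calls (`upkeep`, `open_signer_registration_round`, `update_epoch_settings`) are omitted, and it assumes the first `inform_new_epoch` call stays unconditional, as the review comment on that hunk describes.

```rust
/// Hypothetical stand-in for the runner: it only counts calls.
#[derive(Default)]
struct CountingRunner {
    inform_new_epoch_calls: u32,
    synchronize_calls: u32,
}

impl CountingRunner {
    fn inform_new_epoch(&mut self) {
        self.inform_new_epoch_calls += 1;
    }

    fn synchronize_slave_aggregator_signer_registration(&mut self) {
        self.synchronize_calls += 1;
    }
}

/// Replays the call order of the updated transition, as read from the diff:
/// an unconditional `inform_new_epoch`, then, on a slave only,
/// a synchronization followed by a second `inform_new_epoch`.
fn transition_to_new_epoch(runner: &mut CountingRunner, is_slave: bool) {
    runner.inform_new_epoch();
    // upkeep, open_signer_registration_round and update_epoch_settings omitted
    if is_slave {
        runner.synchronize_slave_aggregator_signer_registration();
        runner.inform_new_epoch();
    }
}

/// Number of `inform_new_epoch` calls for one transition in the given mode.
fn count_inform_calls(is_slave: bool) -> u32 {
    let mut runner = CountingRunner::default();
    transition_to_new_epoch(&mut runner, is_slave);
    runner.inform_new_epoch_calls
}

fn main() {
    // prints "master: 1, slave: 2"
    println!(
        "master: {}, slave: {}",
        count_inform_calls(false),
        count_inform_calls(true)
    );
}
```

Under these assumptions, an expectation of two calls is only reachable when `is_slave` is set, so the test would have to be exercising the slave path. This is an inference from the diff, not something the test body shown here confirms.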

.returning(|_| Ok(()));
runner
.expect_update_epoch_settings()