Skip to content

Backfill peer attribution #7762

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 9 commits into
base: unstable
Choose a base branch
from

Conversation

pawanjay176
Copy link
Member

Issue Addressed

Partly addresses #7744

Proposed Changes

Implement similar peer sync attribution like in #7733 for backfill sync.

@pawanjay176 pawanjay176 added the work-in-progress PR is a work-in-progress label Jul 18, 2025
@pawanjay176 pawanjay176 requested a review from jxs as a code owner July 18, 2025 21:48
@pawanjay176 pawanjay176 added syncing fulu Required for the upcoming Fulu hard fork labels Jul 18, 2025
@jimmygchen jimmygchen added fusaka-devnet-3 and removed fulu Required for the upcoming Fulu hard fork labels Jul 22, 2025
@dapplion
Copy link
Collaborator

Can you implement max retries too before adding this to backfill?

 Please enter the commit message for your changes. Lines starting
@pawanjay176 pawanjay176 changed the base branch from unstable to fusaka-devnet-3 July 24, 2025 00:53
@pawanjay176 pawanjay176 removed the work-in-progress PR is a work-in-progress label Jul 24, 2025
@jimmygchen jimmygchen added the ready-for-review The code is ready for review label Jul 24, 2025
@jimmygchen
Copy link
Member

I've tested this running a supernode on fusaka-devnet-2 and it's no longer OOMing during backfill. Memory usage is consistently around 8 - 10G, and active sync requests metric remain low.

Looking great @pawanjay176

@jimmygchen
Copy link
Member

I'll do a round of review shortly.
@dapplion it would be helpful to have your review too 🙏

jimmygchen added a commit that referenced this pull request Jul 24, 2025
Squashed commit of the following:

commit 646033b
Author: Jimmy Chen <[email protected]>
Date:   Thu Jul 24 14:13:55 2025 +1000

    Use self-hosted runners for Fusaka devnet testing.

commit 718e703
Author: Pawan Dhananjay <[email protected]>
Date:   Wed Jul 23 19:41:28 2025 -0700

    lint

commit 5246a20
Author: Pawan Dhananjay <[email protected]>
Date:   Wed Jul 23 19:38:24 2025 -0700

    Add max retry logic

commit d5bcf9f
Merge: 432f68c 9911f34
Author: Pawan Dhananjay <[email protected]>
Date:   Wed Jul 23 17:49:41 2025 -0700

    Merge branch 'unstable' into peer-attribution-backfill

commit 432f68c
Author: Pawan Dhananjay <[email protected]>
Date:   Wed Jul 23 17:49:06 2025 -0700

    Cleanup

     Please enter the commit message for your changes. Lines starting

commit 9bd0f28
Author: Pawan Dhananjay <[email protected]>
Date:   Fri Jul 18 16:46:07 2025 -0500

    Add retries on backfill

commit 0293d0a
Author: Pawan Dhananjay <[email protected]>
Date:   Fri Jul 18 14:26:44 2025 -0500

    Address some of lion's comments from the other PR
Copy link

mergify bot commented Jul 24, 2025

Some required checks have failed. Could you please take a look @pawanjay176? 🙏

@mergify mergify bot added waiting-on-author The reviewer has suggested changes and awaits thier implementation. and removed ready-for-review The code is ready for review labels Jul 24, 2025
@jimmygchen jimmygchen force-pushed the peer-attribution-backfill branch from 65352a6 to 41ad1a9 Compare July 24, 2025 13:07
jimmygchen added a commit that referenced this pull request Jul 24, 2025
Squashed commit of the following:

commit 41ad1a9
Author: Jimmy Chen <[email protected]>
Date:   Thu Jul 24 22:28:27 2025 +1000

    Run devnet sync test on `Kurtosis` runner.

commit 646033b
Author: Jimmy Chen <[email protected]>
Date:   Thu Jul 24 14:13:55 2025 +1000

    Use self-hosted runners for Fusaka devnet testing.

commit 718e703
Author: Pawan Dhananjay <[email protected]>
Date:   Wed Jul 23 19:41:28 2025 -0700

    lint

commit 5246a20
Author: Pawan Dhananjay <[email protected]>
Date:   Wed Jul 23 19:38:24 2025 -0700

    Add max retry logic

commit d5bcf9f
Merge: 432f68c 9911f34
Author: Pawan Dhananjay <[email protected]>
Date:   Wed Jul 23 17:49:41 2025 -0700

    Merge branch 'unstable' into peer-attribution-backfill

commit 432f68c
Author: Pawan Dhananjay <[email protected]>
Date:   Wed Jul 23 17:49:06 2025 -0700

    Cleanup

     Please enter the commit message for your changes. Lines starting

commit 9bd0f28
Author: Pawan Dhananjay <[email protected]>
Date:   Fri Jul 18 16:46:07 2025 -0500

    Add retries on backfill

commit 0293d0a
Author: Pawan Dhananjay <[email protected]>
Date:   Fri Jul 18 14:26:44 2025 -0500

    Address some of lion's comments from the other PR
@jimmygchen jimmygchen changed the base branch from fusaka-devnet-3 to unstable July 24, 2025 13:21
@jimmygchen
Copy link
Member

@pawanjay176 I changed the base branch to unstable because I've already merged this into fusaka-devnet-3 for testing.

@mergify mergify bot added ready-for-review The code is ready for review and removed waiting-on-author The reviewer has suggested changes and awaits thier implementation. labels Jul 24, 2025
let range_req = entry.get_mut();
if let Some(blocks_result) = range_req.responses(&self.chain.spec) {
if let Err(CouplingError::PeerFailure {
action,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like action will only be penalized if it's the last retry. That feels wrong as if the peer did something bad any retry attempt should be penalized regardless of ordering

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ExceededMaxRetries error also gets downscored, but yeah this logic looks pretty contrived, i'm still trying to iterate on it.

@@ -173,7 +173,8 @@ jobs:
# Tests checkpoint syncing to a live network (current fork) and a running devnet (usually next scheduled fork)
checkpoint-sync-test:
name: checkpoint-sync-test-${{ matrix.network }}
runs-on: ubuntu-latest
# Use self-hosted runner for Fusaka devnet testing as GitHub hosted ones aren't able to keep up with the chain.
runs-on: ${{ github.repository == 'sigp/lighthouse' && matrix.network == 'devnet' && fromJson('["self-hosted", "linux", "Kurtosis", "large"]') || 'ubuntu-latest' }}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created a separate PR to fix and test this. #7804

}
}
CouplingError::InternalError(msg) => {
debug!(?batch_id, msg, "Block components coupling internal error");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be warn or error?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants