
eth: add connection manager to drop connections when needed #31476

Open · wants to merge 16 commits into master
Conversation

cskiraly (Collaborator)
WIP: this is still a work in progress for a few reasons:

  • The conditions for dropping a peer should be improved, including dropping inbound peers as well.
  • I've observed a deadlock in one test; it remains to be checked whether the new module caused it.

As of now, Geth disconnects peers only on protocol error or timeout, meaning that once the connection slots are filled, the peer set is largely fixed.

As mentioned in #31321, Geth should occasionally disconnect peers to ensure some churn. What and when to disconnect could depend on:

  • the state of geth (e.g. sync or not)
  • current number of peers
  • peer level metrics

This PR adds very slow churn by randomly dropping peers.
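A minimal sketch of the random-drop idea, not the PR's actual code: pick one drop candidate uniformly at random, while never touching trusted peers. All identifiers here are illustrative.

```go
package main

import (
	"fmt"
	"math/rand"
)

// pickDropCandidate returns the index of a randomly chosen peer to drop,
// or -1 if no peer is eligible. Trusted peers are never dropped.
func pickDropCandidate(peers []string, trusted map[string]bool) int {
	candidates := make([]int, 0, len(peers))
	for i, id := range peers {
		if !trusted[id] {
			candidates = append(candidates, i)
		}
	}
	if len(candidates) == 0 {
		return -1
	}
	return candidates[rand.Intn(len(candidates))]
}

func main() {
	peers := []string{"peer-a", "peer-b", "peer-c"}
	trusted := map[string]bool{"peer-b": true}
	if idx := pickDropCandidate(peers, trusted); idx >= 0 {
		fmt.Println("dropping", peers[idx])
	}
}
```

Uniform selection keeps the churn unbiased; later iterations could weight the choice by peer-level metrics, as the description suggests.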

@cskiraly cskiraly requested review from fjl and zsfelfoldi as code owners March 24, 2025 08:26
@@ -69,6 +69,7 @@ const (
 	DiscUnexpectedIdentity
 	DiscSelf
 	DiscReadTimeout
+	DiscDropped
fjl (Contributor) commented Mar 24, 2025:
We can't add here since these values are defined by the protocol.

cskiraly (Collaborator, Author):

I can use 0x04 (Too many peers), at least for now while we disconnect randomly.
Later, after integrating metrics, we might switch to 0x03 (Useless peer) when dropping peers that are actually useless.
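For context, the devp2p disconnect reason codes are fixed by the protocol, which is why a new code cannot simply be appended. A self-contained sketch of the code space being discussed (values per the devp2p wire protocol):

```go
package main

import "fmt"

// DiscReason mirrors the devp2p disconnect reason codes. These values are
// fixed by the protocol, so new codes like DiscDropped cannot be added.
type DiscReason uint8

const (
	DiscRequested           DiscReason = 0x00
	DiscNetworkError        DiscReason = 0x01
	DiscProtocolError       DiscReason = 0x02
	DiscUselessPeer         DiscReason = 0x03 // candidate for metric-based drops
	DiscTooManyPeers        DiscReason = 0x04 // candidate for random drops
	DiscAlreadyConnected    DiscReason = 0x05
	DiscIncompatibleVersion DiscReason = 0x06
	DiscInvalidIdentity     DiscReason = 0x07
	DiscQuitting            DiscReason = 0x08
	DiscUnexpectedIdentity  DiscReason = 0x09
	DiscSelf                DiscReason = 0x0a
	DiscReadTimeout         DiscReason = 0x0b
)

func main() {
	fmt.Printf("random drop would send reason 0x%02x\n", uint8(DiscTooManyPeers))
}
```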

@fjl fjl changed the title p2p/connmanager: WIP - Add connection manager to drop connections when needed p2p: add connection manager to drop connections when needed Mar 24, 2025
fjl (Contributor) commented Mar 24, 2025:

If you want to track the sync status of the node, this code will have to live outside of package p2p. It'll likely need to be created from eth.Ethereum instead.
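One hedged way to structure such a manager outside of package p2p: instead of reaching into p2p internals, it takes narrow function dependencies that eth.Ethereum could supply at construction time. All names below are illustrative, not the PR's actual API.

```go
package main

import "fmt"

// connManager is a sketch of a manager living in package eth, wired up
// with closures instead of direct p2p dependencies.
type connManager struct {
	syncing   func() bool // e.g. backed by the node's sync status
	peerCount func() int  // current number of connected peers
	dropPeer  func()      // drops one randomly chosen peer
}

// maybeDrop drops a peer only when the node is in sync and all slots are full.
func (cm *connManager) maybeDrop(maxPeers int) bool {
	if cm.syncing() || cm.peerCount() < maxPeers {
		return false
	}
	cm.dropPeer()
	return true
}

func main() {
	dropped := 0
	cm := &connManager{
		syncing:   func() bool { return false },
		peerCount: func() int { return 50 },
		dropPeer:  func() { dropped++ },
	}
	fmt.Println("dropped a peer:", cm.maybeDrop(50))
}
```

Gating on sync status keeps churn from interfering with initial sync, which is why the construction site matters.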

@cskiraly cskiraly changed the title p2p: add connection manager to drop connections when needed eth/connmanager: add connection manager to drop connections when needed Mar 26, 2025
@fjl fjl changed the title eth/connmanager: add connection manager to drop connections when needed eth: add connection manager to drop connections when needed Mar 26, 2025
cskiraly added 15 commits March 28, 2025 13:12
Dropping peers randomly with a slow pace to create some artificial
churn.

Signed-off-by: Csaba Kiraly <[email protected]>
Better positioned in package eth to access relevant data about
connection quality.

Signed-off-by: Csaba Kiraly <[email protected]>
if !syncing {
	// If a drop was already scheduled, Schedule does nothing.
	if cm.maxDialPeers-numDialed <= peerDropThreshold {
		cm.peerDropDialedTimer.Schedule(cm.clock.Now().Add(peerDropInterval))
Contributor:

I would prefer a random schedule. Otherwise it will be possible to predict exactly when the next slot opens.

cskiraly (Collaborator, Author):

Agreed. I also wanted to introduce it to decouple the two timers. Will add it soon.

Contributor:

I'm not so sure about even having two timers; it's not really necessary. We could have just one random timer and make the decision based on the current state.

cskiraly (Collaborator, Author), Mar 29, 2025:
Yes, we could use a single one, but I opted for two separate timers because the two peer pools are well separated:

  • Their limits are defined separately, which basically means there are dedicated slots for dialed and for inbound connections.
  • Their ingress processes are independent: one consists of discovery-based outgoing dials, the other of incoming connections based on our advertised ID.
  • Their fill speeds differ greatly: one is throttled (at most 2x the free dial slots in outstanding dial attempts), while the other is not (it just depends on how often our ENR gets found and how many nodes are actively hunting for dial candidates).
  • Free slots serve a different purpose in each pool: free dial slots just mean we can look for new peers, hoping to get better connected, while free inbound slots give other nodes the possibility to join the system.

We do connect the two pools in our download/fetch/tx propagation scheduling, where AFAIK we currently don't differentiate between dialed and inbound peers, but I would avoid using a single timer when there is value in keeping the two pools separate.

While I'm not planning to set different intervals for the two right now, that might change later. Note that we could even autotune these intervals based on connection-acquisition-rate statistics if we wanted to, although I think that would overcomplicate things for no reason right now.

The simplest implementation would be a single, always-running randomized timer that checks all conditions (including sync status and exclusions) on each tick. But I think we will be moving towards more complex drop conditions, for which the two timers with start/stop seem a better base to me.
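The two-pool design argued for above can be sketched as follows: dialed and inbound connections each get their own limit and their own drop timer, armed only while that pool is (nearly) full. Names and thresholds are illustrative.

```go
package main

import "fmt"

// pool models one of the two separately limited connection pools
// (dialed or inbound), each with its own drop timer.
type pool struct {
	limit     int // dedicated slots for this pool
	count     int // currently connected peers in this pool
	threshold int // arm the drop timer when free slots fall to this level
}

// shouldArmDropTimer reports whether this pool's randomized drop timer
// should be scheduled: only when the pool is (nearly) full.
func (p *pool) shouldArmDropTimer() bool {
	return p.limit-p.count <= p.threshold
}

func main() {
	dialed := pool{limit: 13, count: 13, threshold: 0}
	inbound := pool{limit: 37, count: 20, threshold: 0}
	fmt.Println("arm dialed timer:", dialed.shouldArmDropTimer())
	fmt.Println("arm inbound timer:", inbound.shouldArmDropTimer())
}
```

Keeping the two pools as separate values makes it trivial to give them different intervals or drop conditions later, which is the flexibility the comment is arguing for.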
