-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow Disconnects/Reconnects from TestComm #87
Conversation
nodes := []simplex.NodeID{{1}, {2}, {3}, []byte("lagging")} | ||
net := newInMemNetwork(t, nodes) | ||
startDisconnect := uint64(5) | ||
endDisconnect := uint64(17) // TODO: create test where the lagging node is the leader after replication |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
alternatively we can parametrize the startDisconnect and endDisconnect indices, and run through them in parallel. We don't need that much, I think a startDisconnect that starts from 0 and can go up to 5, and an endDisconnect that starts from, say, 10, and goes up to 20 should be sufficient. Overall 50 tests that can run in parallel.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure if this would work for certain cases.
require.False(t, isLaggingNodeLeader, "the lagging node should not be the leader")
We require this because after the node re-connects it needs to be sent an fcert to kick start replication. I can still create the tests but we'd need to skip when endDisconect+1
is the lagging node. Should I still do it?
replication_test.go
Outdated
missedSeqs := uint64(0) | ||
// normal nodes continue to make progress | ||
for i := startDisconnect; i < endDisconnect; i++ { | ||
bb.triggerNewBlock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we trigger new block when we have a empty round? I don't think we should do that. I know it's a no-op if the control buffer is full, but we're needlessly filling up the control buffer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we need to trigger this so that the node that is disconnected doesn't hang
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hang on what? We wait for a new block without the lock so we are open for new messages.
Can you elaborate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah nevermind, fixed
Signed-off-by: Sam Liokumovich <[email protected]>
No description provided.