RUST-1842 Update prose tests for mongos deprioritization during retryable ops #1397
Merged
21 commits
185616d Assert that events occurred on the same/different mongos instance(s) (JamieTsai1024)
f04d6ee Add condition for CommandEvent::Succeeded (JamieTsai1024)
5a25e6f Temporary logs to verify code paths (JamieTsai1024)
42957d3 Log considered servers for selection (JamieTsai1024)
dde0525 Replace close connection for different_mongos tests (JamieTsai1024)
aac382c Remove close_connection for failpoints on different mongos retryable … (JamieTsai1024)
74b5e44 Rewrite test for retryable reads/writes on different mongos (JamieTsai1024)
21fae26 Fix retry on same mongos tests (JamieTsai1024)
4f354e1 Simplify logs for deprioritization check (JamieTsai1024)
75e6fee Fix clippy lint check (JamieTsai1024)
05cfe21 Replace println with dbg (JamieTsai1024)
ad61ffd Add logs (JamieTsai1024)
038cdfb Only run retry read test (JamieTsai1024)
6eb7a22 Remove debugging logs (JamieTsai1024)
295c958 Add logs to show deprioritization (JamieTsai1024)
1fbaed5 Rename s0 to client (JamieTsai1024)
23253b1 Add comment explaining test rewrite for retry operations on different… (JamieTsai1024)
fee5f26 Move comment (JamieTsai1024)
a2f855a Correct commented explanation (JamieTsai1024)
a4d4c27 Remove debugging logs (JamieTsai1024)
22fadca Remove whitespace (JamieTsai1024)
For future reference, can you add a comment here explaining why we set the failpoints this way rather than with separate clients? Ditto elsewhere.
Done! let me know if you have any suggestions on the explanation!
Some of these details aren't quite accurate - the important distinction to note is that we're using the same client to set the failpoints on each mongos as we are for the find operation. The fundamental problem that we were encountering was a race between server discovery, which happens in the background after a client is created, and the server selection process for find, which was previously happening right after creating the client. Server discovery goes roughly as follows:

1. `client` gets created with two mongos addresses (`localhost:27017` and `localhost:27018`) and stores each of these in its topology with an initial server type of `Unknown`. (`Unknown` servers are not eligible to be selected for operations.)
2. `client` sends a `hello` message to each mongos and waits for a reply.
3. Each mongos replies to the `hello` message with information about itself, and `client` uses this information to update its server type from `Unknown` to `Mongos`.

Executing an operation (in this case, `enable_fail_point`) on each individual mongos forces the client to complete its discovery of that mongos and select it for the operation. This means that once we get to the find operation, `client` has a list of two `Mongos` servers to select from. On the contrary, when we were creating a new client for each call to `enable_fail_point` and then the find operation, each of those clients was restarting the server discovery process from scratch.

The details here can be a little tricky to understand, so let me know if you have any questions about this and we can walk through it in more detail!
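
To make the discovery steps above concrete, here is a minimal toy model in plain Rust. It is only a sketch: `ToyClient`, `complete_hello`, `run_on`, and `selectable` are hypothetical names made up for illustration and are not part of the driver's API, and the real driver performs discovery asynchronously in the background rather than through explicit calls like these.

```rust
use std::collections::BTreeMap;

/// Server types in this toy model; the real driver tracks more variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ServerType {
    Unknown,
    Mongos,
}

/// Hypothetical stand-in for a client's view of the topology (not the driver's API).
struct ToyClient {
    servers: BTreeMap<String, ServerType>,
}

impl ToyClient {
    /// Step 1: a new client knows only the addresses, so every server starts as Unknown.
    fn new(addresses: &[&str]) -> Self {
        let servers = addresses
            .iter()
            .map(|a| (a.to_string(), ServerType::Unknown))
            .collect();
        ToyClient { servers }
    }

    /// Steps 2 and 3: models a completed hello round trip with one mongos.
    fn complete_hello(&mut self, address: &str) {
        if let Some(ty) = self.servers.get_mut(address) {
            *ty = ServerType::Mongos;
        }
    }

    /// Models an operation targeted at one address: it cannot run until
    /// that server has been discovered, so discovery is forced to complete.
    fn run_on(&mut self, address: &str) {
        self.complete_hello(address);
    }

    /// Only servers that are no longer Unknown are eligible for selection.
    fn selectable(&self) -> Vec<&str> {
        self.servers
            .iter()
            .filter(|(_, ty)| **ty != ServerType::Unknown)
            .map(|(addr, _)| addr.as_str())
            .collect()
    }
}

fn main() {
    let addrs = ["localhost:27017", "localhost:27018"];

    // A fresh client used right away: in this model no hello has completed yet.
    let fresh = ToyClient::new(&addrs);
    println!("fresh client candidates: {:?}", fresh.selectable());

    // The same client reused: targeting each mongos first completes discovery.
    let mut client = ToyClient::new(&addrs);
    for addr in &addrs {
        client.run_on(addr);
    }
    println!("reused client candidates: {:?}", client.selectable());
}
```

In this model the fresh client has no candidates, while the reused client can select from both addresses. The fresh-client case shows the worst outcome of the race; in practice a new client might have discovered zero, one, or both mongoses by the time find runs.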
Thanks so much for the detailed explanation, Isabel! I hadn’t fully understood how server discovery works in the background or how using separate clients was restarting that process. I also realize now that some of my original terminology wasn’t quite accurate (e.g., implying it was about a single mongos instead of the client's discovery state), so I appreciate the correction.
I’ve updated the comment to reflect that. Let me know if it looks good now or if I should tweak anything further - would be happy to chat about it more if my understanding is still off!
looks great! thanks for making those changes.