Commit 23253b1

Add comment explaining test rewrite for retry operations on different mongos
1 parent 1fbaed5 commit 23253b1

2 files changed: 26 additions, 0 deletions

src/test/spec/retryable_reads.rs

Lines changed: 13 additions & 0 deletions
@@ -172,6 +172,19 @@ async fn retry_read_different_mongos() {
         );
         return;
     }
+
+    // NOTE: This test places all failpoints on a single mongos server to avoid flakiness caused by
+    // incomplete server discovery.
+    //
+    // In MongoDB versions 4.2 and 4.4, the SDAM process can be slow or non-deterministic,
+    // especially immediately after creating the cluster. The driver may not have sent "hello"
+    // messages to all connected servers yet, which means some mongos instances may still be in
+    // the "Unknown" state and not selectable for retryable reads.
+    //
+    // This caused test failures because the retry logic expected to find a second eligible server,
+    // but the driver was unaware of its existence. By placing all failpoints on a single mongos
+    // host, we ensure that server selection and retries happen within a single fully discovered
+    // router, avoiding issues caused by prematurely filtered or undiscovered servers.
     client_options.hosts.drain(2..);
     client_options.retry_reads = Some(true);
 

src/test/spec/retryable_writes.rs

Lines changed: 13 additions & 0 deletions
@@ -317,6 +317,19 @@ async fn retry_write_different_mongos() {
         );
         return;
     }
+
+    // NOTE: This test places all failpoints on a single mongos server to avoid flakiness caused by
+    // incomplete server discovery.
+    //
+    // In MongoDB versions 4.2 and 4.4, the SDAM process can be slow or non-deterministic,
+    // especially immediately after creating the cluster. The driver may not have sent "hello"
+    // messages to all connected servers yet, which means some mongos instances may still be in
+    // the "Unknown" state and not selectable for retryable writes.
+    //
+    // This caused test failures because the retry logic expected to find a second eligible server,
+    // but the driver was unaware of its existence. By placing all failpoints on a single mongos
+    // host, we ensure that server selection and retries happen within a single fully discovered
+    // router, avoiding issues caused by prematurely filtered or undiscovered servers.
     client_options.hosts.drain(2..);
     client_options.retry_writes = Some(true);
     let hosts = client_options.hosts.clone();
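
Illustrative note (not part of the commit): the failure mode described in the added comments can be sketched with a toy model. Everything below is hypothetical and is not the driver's API: `ServerType`, `Server`, `select_for_retry`, and the host names are invented for illustration. The sketch assumes that a retry prefers a selectable server other than the one that just failed, and falls back to the same server when no other is known.

```rust
// Toy model only: these types are hypothetical and are not the driver's API.
#[derive(Clone, Copy, Debug, PartialEq)]
enum ServerType {
    Unknown, // no successful "hello" reply has been processed yet
    Mongos,
}

#[derive(Clone, Debug)]
struct Server {
    address: &'static str,
    server_type: ServerType,
}

/// Pick a server for a retry: skip Unknown servers, prefer any selectable
/// server other than the one that just failed, and otherwise fall back to
/// the first selectable server (which may be the one that failed).
fn select_for_retry<'a>(servers: &'a [Server], failed: &str) -> Option<&'a Server> {
    let selectable: Vec<&Server> = servers
        .iter()
        .filter(|s| s.server_type != ServerType::Unknown)
        .collect();
    selectable
        .iter()
        .copied()
        .find(|s| s.address != failed)
        .or_else(|| selectable.first().copied())
}

fn main() {
    let mut servers = vec![
        Server { address: "mongos-a:27017", server_type: ServerType::Mongos },
        Server { address: "mongos-b:27017", server_type: ServerType::Unknown },
    ];

    // Discovery incomplete: the second mongos is still Unknown, so the retry
    // lands on the server that just failed.
    let retry = select_for_retry(&servers, "mongos-a:27017").unwrap();
    assert_eq!(retry.address, "mongos-a:27017");

    // Discovery complete: the retry can move to the other mongos.
    servers[1].server_type = ServerType::Mongos;
    let retry = select_for_retry(&servers, "mongos-a:27017").unwrap();
    assert_eq!(retry.address, "mongos-b:27017");
}
```

In this model, a test that asserts "the retry went to a different mongos" passes or fails depending on whether discovery finished before the first attempt, which is the kind of timing dependence the comments describe.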
