Test | Add flaky test quarantine zone #3856

mdaigle · 2025-12-19T20:54:09Z

At the moment, we have a number of flaky tests in our pipelines (a couple of which I'll take credit for) that cause random failures, necessitate pipeline reruns, and cheapen the value of our regression testing suite. When tests fail regularly, we pay less attention when we see a failing test. A common pattern to deal with flaky tests is to establish a "quarantine zone" where these tests can live temporarily until they're improved. The quarantine zone is separate from the main build pipelines and does not block main builds, but runs regularly so that we can keep an eye on how flaky tests are performing.

To establish a quarantine zone, I'm planning to use the "flaky" category introduced in PR #3488. Tests in this category will run in a separate testing stage immediately after the corresponding Unit/Functional/Manual testing stage (e.g. Unit => UnitFlaky => Functional => FunctionalFlaky ...). The flaky test stage will not block the pipeline; its failures will be ignored using the continueOnError property (see "steps.task definition" on Microsoft Learn).
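
For illustration, here is a minimal sketch of what a non-blocking quarantine step could look like in Azure Pipelines YAML. It is not the change made in this PR: the real pipeline runs tests through build.proj and the templates under eng/pipelines. The sketch assumes a plain DotNetCoreCLI@2 task, a hypothetical project glob, and that the test categories are visible to `dotnet test --filter` under the key `category`. Only continueOnError and the "flaky" and "failing" category names come from this discussion.

```yaml
# Sketch only: the task choice, display names, and project glob are assumptions.
steps:
  # Main run: skip quarantined ("flaky") and known-broken ("failing") tests.
  # A failure here fails the job as usual.
  - task: DotNetCoreCLI@2
    displayName: 'Unit tests'
    inputs:
      command: test
      projects: '**/*UnitTests.csproj'   # hypothetical glob
      arguments: '--filter "category!=flaky&category!=failing"'

  # Quarantine run: only tests tagged "flaky".
  # continueOnError lets later steps run even if this step fails.
  - task: DotNetCoreCLI@2
    displayName: 'Unit tests (flaky quarantine)'
    continueOnError: true
    inputs:
      command: test
      projects: '**/*UnitTests.csproj'   # hypothetical glob
      arguments: '--filter "category=flaky"'
```

With continueOnError set, a failing quarantine step should surface as a warning on the run ("succeeded with issues") rather than a failure, so the flaky results stay visible without blocking the build.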

The ActiveIssue tag will remain reserved for tests that cannot pass due to a platform specific bug, pipeline issue, or other limitation that causes the test to always fail.

I started off by tagging our top offenders based on this dashboard. As we discover more flaky tests, the DRI (or anyone) can open a PR adding the flaky tag to the offending tests. Flaky tests can be addressed either via prioritization during sprint planning or when working on a related section of code.

Copilot

Pull request overview

This PR simplifies test filtering to support a quarantine zone for flaky tests by updating the --filter expressions in test execution targets. The changes remove platform-specific category exclusions and standardize filtering to only exclude tests marked as "failing" or "flaky" across all test types (Functional and Manual).

Simplifies test filters to exclude only "failing" and "flaky" categories
Removes platform-specific category filters that are no longer needed
Applies consistent filtering logic across Functional and Manual test targets for both Windows and Unix platforms

build.proj

Copilot

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 12 comments.

eng/pipelines/common/templates/steps/run-all-tests-step.yml

build.proj

eng/pipelines/common/templates/steps/run-all-tests-step.yml

build.proj

Copilot

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 9 comments.

eng/pipelines/common/templates/steps/run-all-tests-step.yml

build.proj

Copilot

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 13 comments.

build.proj

eng/pipelines/common/templates/steps/run-all-tests-step.yml

eng/pipelines/common/templates/steps/build-and-run-tests-netfx-step.yml

eng/pipelines/common/templates/steps/build-and-run-tests-netcore-step.yml

build.proj

eng/pipelines/common/templates/steps/build-and-run-tests-netfx-step.yml

eng/pipelines/common/templates/steps/build-and-run-tests-netcore-step.yml

build.proj

edwardneal · 2025-12-20T12:04:07Z

Can we also include the diagnostic testing please? These seemed to be flaky on ARM64.

paulmedynski

Approved, assuming the Copilot comments will be addressed.

Copilot

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.

eng/pipelines/common/templates/steps/build-and-run-tests-netcore-step.yml

eng/pipelines/common/templates/steps/build-and-run-tests-netfx-step.yml

eng/pipelines/common/templates/steps/build-and-run-tests-netcore-step.yml

eng/pipelines/common/templates/jobs/ci-code-coverage-job.yml

codecov · 2026-01-06T21:38:55Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 69.68%. Comparing base (ec842b1) to head (bddb84e).
⚠️ Report is 18 commits behind head on main.

❗ There is a different number of reports uploaded between BASE (ec842b1) and HEAD (bddb84e).

HEAD has 1 upload less than BASE

Flag	BASE (ec842b1)	HEAD (bddb84e)
addons	1	0

Additional details and impacted files

@@             Coverage Diff             @@
##             main    #3856       +/-   ##
===========================================
- Coverage   90.82%   69.68%   -21.15%     
===========================================
  Files           6      254      +248     
  Lines         316    64067    +63751     
===========================================
+ Hits          287    44642    +44355     
- Misses         29    19425    +19396

Flag	Coverage Δ
addons	`?`
netcore	`69.68% <ø> (?)`

Skip flaky tests. Try removing platform filters which should already be evaluated?

fa22414

Copilot AI review requested due to automatic review settings December 19, 2025 20:54

Copilot started reviewing on behalf of mdaigle December 19, 2025 20:54

Copilot AI reviewed Dec 19, 2025

build.proj (4 comment threads, outdated and resolved)

mdaigle added 3 commits December 19, 2025 14:07

Make filter configurable in build.proj

dba39e7

Add quarantined testing steps.

13b684c

Fix closing quote.

8a4f8e0

Copilot AI review requested due to automatic review settings December 19, 2025 22:40

Copilot started reviewing on behalf of mdaigle December 19, 2025 22:41

Fix closing quote.

4d14a1e

Copilot AI reviewed Dec 19, 2025

mdaigle added 2 commits December 19, 2025 15:00

Fix quote wrapping for filter argument.

ae19c47

Fix continueOnError indent.

29ecebc

Copilot AI review requested due to automatic review settings December 19, 2025 23:20

Copilot started reviewing on behalf of mdaigle December 19, 2025 23:20

Copilot AI reviewed Dec 19, 2025

mdaigle changed the title from "Test | Add quarantine zone" to "Test | Add flaky test quarantine zone" Dec 19, 2025

mdaigle added 2 commits December 19, 2025 16:06

Tag flaky tests.

3413a96

Tag xevent test as flaky.

4c4cf1e

Copilot AI review requested due to automatic review settings December 20, 2025 00:13

Copilot started reviewing on behalf of mdaigle December 20, 2025 00:14

mdaigle marked this pull request as ready for review December 20, 2025 00:14

mdaigle requested a review from a team as a code owner December 20, 2025 00:14

Copilot AI reviewed Dec 20, 2025

paulmedynski self-assigned this Dec 22, 2025

paulmedynski previously approved these changes Dec 22, 2025

mdaigle added 3 commits January 5, 2026 10:02

Fix indentation.

2fe2941

Mark DiagnosticTest as flaky.

c56e1b0

Merge branch 'main' of github.com:dotnet/SqlClient into dev/mdaigle/flaky-test-quarantine

bde6d56

mdaigle added 2 commits January 5, 2026 11:03

Remove stray preprocessor directive.

c846ec8

Don't collect code coverage for flaky tests.

a4a1864

mdaigle dismissed paulmedynski’s stale review via a4a1864 January 5, 2026 21:58

Add preliminary cleanup in case previous runs left files behind.

13661d3

Copilot AI review requested due to automatic review settings January 6, 2026 00:12

Copilot started reviewing on behalf of mdaigle January 6, 2026 00:13

Copilot AI reviewed Jan 6, 2026

mdaigle requested a review from paulmedynski January 6, 2026 18:48

paulmedynski reviewed Jan 6, 2026

eng/pipelines/common/templates/jobs/ci-code-coverage-job.yml (resolved)

Use parameter to control code cov upload.

bddb84e

cheenamalhotra approved these changes Jan 6, 2026

samsharma2700 approved these changes Jan 6, 2026

mdaigle merged commit e243fbe into main Jan 6, 2026
250 checks passed

mdaigle deleted the dev/mdaigle/flaky-test-quarantine branch January 6, 2026 23:09
