xUnit v3 tests hang intermittently on .NET 10 with RuntimeAsync enabled #126325

@ericstj

Description

xUnit v3 test runs hang intermittently on .NET 10 (net10.0) when RuntimeAsync is enabled (the default). The hang occurs after all tests have passed — the test host process stalls indefinitely instead of exiting. Setting DOTNET_RuntimeAsync=0 reliably prevents the hang.

We have been unable to isolate a standalone reproduction outside of the MCP C# SDK test suite, but the issue reproduces in roughly 20% of runs on a 2-vCPU Linux machine and has been observed across multiple CI runs.

Reproduction

Repository: https://github.com/modelcontextprotocol/csharp-sdk (branch: main)

Environment:

  • .NET SDK: 10.0.100-preview.5 (or later preview)
  • OS: Linux (Ubuntu), 2 vCPUs (the constraint increases repro rate)
  • xUnit: v3 3.2.2 (xunit.v3 package), runner xunit.runner.visualstudio 3.1.5
  • Test project: tests/ModelContextProtocol.Tests (~158 test files, ~1865 tests)

Steps:

git clone https://github.com/modelcontextprotocol/csharp-sdk.git
cd csharp-sdk

# Constrain to 2 CPUs to increase repro rate (~20% per run)
taskset -c 0,1 dotnet test tests/ModelContextProtocol.Tests \
  -f net10.0 -c Release \
  --blame-hang-timeout 3min \
  --filter "(Execution!=Manual)"

Expected: Test process exits after all tests complete.
Actual: Process hangs after all tests pass. The --blame-hang-timeout eventually kills it.

Workaround — confirms RuntimeAsync is the trigger:

DOTNET_RuntimeAsync=0 taskset -c 0,1 dotnet test tests/ModelContextProtocol.Tests \
  -f net10.0 -c Release \
  --blame-hang-timeout 3min \
  --filter "(Execution!=Manual)"

With DOTNET_RuntimeAsync=0, no hangs were observed across 6+ runs under the same conditions in which runs with RuntimeAsync enabled hung roughly 1 time in 6.
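Six clean runs are a small sample: at a hang rate of about 1 in 6, six consecutive clean runs would happen by chance roughly a third of the time ((5/6)^6 ≈ 0.33). A larger sample for both configurations would firm up the comparison. The helper below is a minimal sketch for collecting one, not something used in the original investigation. The function name `count_timeouts`, the outer time limit, and the use of GNU `timeout` in place of `--blame-hang-timeout` are assumptions made for illustration.

```shell
#!/bin/sh
# count_timeouts RUNS LIMIT CMD [ARGS...]
# Runs CMD RUNS times under GNU `timeout LIMIT` and prints how many runs were
# killed for exceeding the limit (GNU timeout exits with status 124 in that case).
# Command output is discarded so that only the summary line is printed.
count_timeouts() {
  ct_runs=$1
  ct_limit=$2
  shift 2
  ct_hung=0
  ct_i=1
  while [ "$ct_i" -le "$ct_runs" ]; do
    ct_status=0
    timeout "$ct_limit" "$@" >/dev/null 2>&1 || ct_status=$?
    if [ "$ct_status" -eq 124 ]; then
      ct_hung=$((ct_hung + 1))
    fi
    ct_i=$((ct_i + 1))
  done
  echo "$ct_hung/$ct_runs runs timed out"
}

# Example invocations (not executed here). The dotnet arguments are copied from
# the repro steps; the 10m limit and the run count of 30 are arbitrary choices.
#   count_timeouts 30 10m taskset -c 0,1 dotnet test tests/ModelContextProtocol.Tests \
#     -f net10.0 -c Release --filter "(Execution!=Manual)"
#   count_timeouts 30 10m env DOTNET_RuntimeAsync=0 taskset -c 0,1 dotnet test \
#     tests/ModelContextProtocol.Tests -f net10.0 -c Release --filter "(Execution!=Manual)"
```

`--blame-hang-timeout` is left out of the example invocations so that the outer `timeout` is what ends a hung run. This sketch therefore counts hangs but does not capture dumps.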

Note on standalone repro: We attempted to reproduce this outside the MCP test suite with:

  • A standalone xUnit v3 project containing 2000 async tests with process I/O, channels, IAsyncLifetime, and TaskCompletionSource patterns
  • A console app directly exercising ExecutionContext.Run + async void with GC stress

Neither reproduced the hang (0/10 runs each). The issue appears to require a specific interaction pattern present in the full test suite.

Dump Analysis

We have analyzed multiple crash/hang dumps and found two distinct stuck patterns. Both share the characteristic that all user-level tests have completed successfully.

Pattern 1: async void state machine lost by GC (CI, Ubuntu, Release)

Observed in multiple CI runs on ubuntu-latest (Release configuration).

Symptoms:

  • All ~1865 tests report as PASSED
  • Test host process stalls for minutes after test completion
  • The "stuck" test varies between runs (e.g., CreateAsync_ValidProcessInvalidServer_StdErrCallbackInvoked, CreateAsync_StdErrCallbackThrows_DoesNotCrashProcess)

Dump findings:

  • xUnit's RunTest method uses an async void runTest(object? state) lambda inside ExecutionContext.Run()
  • 34 RunTest state machines are present on the heap, but only 33 corresponding runTest inner state machines exist
  • The missing runTest state machine appears to have been GC'd without completing
  • The Task of the finished TaskCompletionSource in the affected RunTest remains in WaitingForActivation indefinitely; nothing remains that could signal it

Hypothesis: Under RuntimeAsync, the async void state machine created inside ExecutionContext.Run() may not be properly rooted, allowing GC to collect it before completion. Without the state machine running to completion, the TaskCompletionSource that RunTest awaits is never signaled.

Pattern 2: All tasks complete but runner stuck (WSL, latest dump)

Symptoms:

  • All 20 RunTest tasks show RanToCompletion
  • All 13 RunTestCollection tasks show RanToCompletion
  • ProjectAssemblyRunner.Run is stuck awaiting a Task<Object> that remains WaitingForActivation

Dump findings:

  • ThreadPool is alive with 2 idle worker threads
  • No pending ThreadPool work items
  • 102 orphaned MCP session message loops still alive (but these are background tasks that should not block shutdown)
  • The stuck Task<Object> has no continuations or pending work — it is simply never completed

Key xUnit Code Pattern

The relevant xUnit v3 pattern (simplified from XunitTestRunnerBase.RunTest):

async ValueTask<...> RunTest(...)
{
    var finished = new TaskCompletionSource<decimal>();
    
    // async void local function passed to ExecutionContext.Run
    async void runTest(object? state)
    {
        try
        {
            // ... await the test body; `elapsed` stands in for the measured duration ...
            finished.TrySetResult(elapsed);
        }
        catch (Exception ex)
        {
            finished.TrySetException(ex);
        }
    }
    
    ExecutionContext.Run(executionContext, runTest, null);
    
    var executionTime = await finished.Task;  // ← hangs here
}

Under RuntimeAsync, if the runTest state machine is not properly rooted after the first await inside the async void method, it could be collected by the GC, leaving finished permanently in WaitingForActivation.

Environment Details

| Component | Version |
| --- | --- |
| .NET SDK | 10.0.100-preview.5 (or later) |
| Runtime | .NET 10.0 (RuntimeAsync enabled by default) |
| OS | Ubuntu 24.04 (CI), also reproduced under WSL2 |
| CPU constraint | 2 vCPUs (taskset -c 0,1) |
| xUnit | v3 3.2.2 |
| xUnit runner | xunit.runner.visualstudio 3.1.5 |
| Test project | ModelContextProtocol.Tests |

Summary

  • What: xUnit v3 test host hangs after all tests pass on .NET 10
  • When: RuntimeAsync enabled (default on .NET 10)
  • Workaround: DOTNET_RuntimeAsync=0
  • Root cause hypothesis: async void state machine inside ExecutionContext.Run() not properly rooted under RuntimeAsync, allowing GC to collect it mid-execution
  • Repro rate: ~20% on 2-vCPU Linux machine

We have applied DOTNET_RuntimeAsync=0 as a workaround in our CI pipeline (modelcontextprotocol/csharp-sdk). We will revert this workaround once a fix is available.

/cc @dotnet/runtime-async


Tags: area-System.Threading runtime-async
