IPC: handle disconnects and protocol corruption gracefully #8602

Open
Evangelink wants to merge 1 commit into main from dev/amauryleve/ipc-disconnect-resilience

Conversation

@Evangelink
Member

Summary

Hardens the Microsoft.Testing.Platform IPC layer (named-pipe transport between the test host and its in-process clients) so that a peer disconnect or a corrupt or truncated message header no longer takes down the host or client process.

Audit items addressed

Part of the P0+P1 exception-handling audit (items #4#10, IPC).

Changes

NamedPipeServer.cs

  • Replaced ApplicationStateGuard.Unreachable() throws in the header read path with graceful disconnect handling (tolerate short reads, treat mid-header EOF as a normal disconnect).
  • Added bounds check currentMessageSize <= 0 → log warning and return cleanly instead of crashing on a corrupt header.
  • Wrapped server-side WriteAsync/FlushAsync/WaitForPipeDrain in try/catch (IOException/ObjectDisposedException) → set clientDisconnected and exit the loop after resetting buffers, rather than tearing down the host.
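The header handling described above can be sketched as follows. This is a minimal illustration, not the PR's actual code: the `FrameHeader` class and `TryReadMessageSizeAsync` method are hypothetical names, and the sketch reads the header into its own 4-byte buffer, whereas the real server loop shares one buffer between header and payload. It assumes only that the protocol prefixes each message with an `int` size, as the `currentMessageSize` checks imply.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper for illustration; not part of Microsoft.Testing.Platform.
public static class FrameHeader
{
    // Reads the sizeof(int) message-size header, tolerating short reads.
    // Returns null when the peer disconnects mid-header (EOF) or when the
    // decoded size is not positive (protocol corruption).
    public static async Task<int?> TryReadMessageSizeAsync(
        Stream stream, CancellationToken cancellationToken = default)
    {
        byte[] header = new byte[sizeof(int)];
        int read = 0;

        // A single read may return fewer bytes than requested, so keep
        // reading until the whole header has arrived.
        while (read < header.Length)
        {
            int n = await stream
                .ReadAsync(header, read, header.Length - read, cancellationToken)
                .ConfigureAwait(false);

            if (n == 0)
            {
                // EOF before a full header: treat as a normal disconnect.
                return null;
            }

            read += n;
        }

        int size = BitConverter.ToInt32(header, 0);

        // The message size must be positive; anything else is corruption.
        return size > 0 ? size : (int?)null;
    }
}
```

A caller would log a warning and leave the loop when the result is null, instead of throwing.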

NamedPipeClient.cs

  • Symmetric write-side hardening: catches the same IOException/ObjectDisposedException and routes through the existing _environment.Exit(GenericFailure) path used by the read-EOF handler.
  • Short-read tolerance on the response header; bounds check currentMessageSize <= 0 → exit on corruption.
  • Wrapped response Deserialize in try/catch (excluding OperationCanceledException) so protocol corruption exits cleanly instead of bubbling an undecorated deserialization exception.
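The write-side hardening on both ends follows the same shape, roughly as sketched below. `PipeWriteGuard` and `TryWriteAsync` are hypothetical names used only for illustration, and the sketch returns a flag rather than reproducing what the PR does on failure (setting `clientDisconnected` on the server, calling `_environment.Exit` on the client).

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper for illustration; not part of Microsoft.Testing.Platform.
public static class PipeWriteGuard
{
    // Writes and flushes the payload. Returns false when the peer has gone
    // away, so the caller can decide how to shut down instead of crashing.
    public static async Task<bool> TryWriteAsync(
        Stream stream, byte[] payload, CancellationToken cancellationToken = default)
    {
        try
        {
            await stream.WriteAsync(payload, 0, payload.Length, cancellationToken)
                .ConfigureAwait(false);
            await stream.FlushAsync(cancellationToken).ConfigureAwait(false);
            return true;
        }
        catch (Exception ex) when (ex is IOException || ex is ObjectDisposedException)
        {
            // Broken or already-disposed pipe: report a disconnect.
            // OperationCanceledException is deliberately not caught here.
            return false;
        }
    }
}
```

Returning a flag keeps the decision with the caller, which matters here because the server and the client react differently to the same fault.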

Test

  • New regression test NamedPipeServer_InvalidMessageSizeHeader_DoesNotCrashHost sends a message-size header with value 0 (four zero bytes) via a raw NamedPipeClientStream and asserts that the server callback is never invoked and that the host survives instead of throwing ApplicationStateGuard.Unreachable.
  • All existing Microsoft.Testing.Platform.UnitTests IPC tests still pass on net9.0.

Notes

  • No public API changes.
  • Behavior matches the existing GenericFailure exit code used for the read-EOF path on the client side.
  • WaitConnectionAsync FailFast is intentionally preserved; only loop-body IO faults get graceful handling.

Replace ApplicationStateGuard.Unreachable() throws in IPC header/payload reads with graceful disconnect handling. Tolerate short reads, treat mid-header EOF as graceful disconnect, validate currentMessageSize > 0, and catch IOException/ObjectDisposedException during write/flush/drain so a peer disconnect cannot crash the host or client process.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 26, 2026 13:51
Contributor

Copilot AI left a comment

Pull request overview

This PR hardens the Microsoft.Testing.Platform named-pipe IPC transport so disconnects and certain protocol-corruption scenarios don’t crash the host/client process (replacing prior “unreachable” paths with graceful exits).

Changes:

  • Server: tolerate short/EOF header reads, reject non-positive message sizes, and handle write-side disconnects without FailFast.
  • Client: add write-side disconnect handling, tolerate short response headers, and treat response deserialization failures as a generic IPC failure exit.
  • Tests: add a regression test to ensure an invalid (0) message-size header doesn’t crash the host.
Summary per file:

  • test/UnitTests/Microsoft.Testing.Platform.UnitTests/IPC/IPCTests.cs: Adds regression coverage for invalid message-size header handling on the server.
  • src/Platform/Microsoft.Testing.Platform/IPC/NamedPipeServer.cs: Makes the server loop tolerant to mid-header EOF/short reads and write-side disconnects; logs and exits cleanly on certain corrupt headers.
  • src/Platform/Microsoft.Testing.Platform/IPC/NamedPipeClient.cs: Adds write-side disconnect handling, short-read header handling, and exits cleanly on corruption/deserialization errors.

Copilot's findings

  • Files reviewed: 3/3 changed files
  • Comments generated: 4

Comment on lines +175 to +178
if (currentMessageSize <= 0)
{
    // Protocol corruption: message size must be positive. Drop the connection.
    await _logger.LogWarningAsync($"Pipe {PipeName.Name} received invalid message size {currentMessageSize}; closing connection.").ConfigureAwait(false);
Comment on lines +223 to +227
if (currentMessageSize <= 0)
{
    // Protocol corruption: message size must be positive.
    _environment.Exit((int)ExitCode.GenericFailure);
    throw new InvalidOperationException($"Received invalid IPC message size {currentMessageSize}.");
Comment on lines 230 to 232
missingBytesToReadOfCurrentChunk = currentReadBytes - sizeof(int);
missingBytesToReadOfWholeMessage = currentMessageSize;
currentReadIndex = sizeof(int);
Comment on lines +242 to +275
var serverEnvironment = new SystemEnvironment();
bool callbackInvoked = false;

NamedPipeServer server = new(
    pipeNameDescription,
    _ =>
    {
        callbackInvoked = true;
        return Task.FromResult<IResponse>(VoidResponse.CachedInstance);
    },
    serverEnvironment,
    new Mock<ILogger>().Object,
    new SystemTask(),
    _testContext.CancellationToken);

try
{
    Task waitConnection = server.WaitConnectionAsync(_testContext.CancellationToken);

    using (var raw = new System.IO.Pipes.NamedPipeClientStream(".", pipeNameDescription.Name, System.IO.Pipes.PipeDirection.InOut, System.IO.Pipes.PipeOptions.Asynchronous))
    {
        await raw.ConnectAsync(_testContext.CancellationToken);
        await waitConnection;

        // Write a zero-length message size header (invalid per protocol).
        byte[] invalidHeader = BitConverter.GetBytes(0);
        await raw.WriteAsync(invalidHeader, 0, invalidHeader.Length, _testContext.CancellationToken);
        await raw.FlushAsync(_testContext.CancellationToken);
    }

    // The server's internal loop should exit cleanly within the disposal timeout. If the loop
    // crashed via FailFast the test process would be terminated instead of running to completion.
    Assert.IsFalse(callbackInvoked, "Server callback must not run for an invalid message header.");
}
}

currentReadBytes += additionalBytes;
missingBytesToReadOfCurrentChunk = currentReadBytes;
@Evangelink
Member Author

Test Coverage Gaps

The PR introduces 8 new error paths but only tests 1 (server-side zero message size). Missing test scenarios:

Client-side (4 untested paths):

  1. Write-side IOException: Server closes pipe during WriteAsync → verify _environment.Exit(GenericFailure) called
  2. Mid-header disconnect: Server sends 2 bytes then closes → verify _environment.Exit and IOException thrown
  3. Negative message size: Server sends -1 header → verify _environment.Exit and InvalidOperationException thrown
  4. Deserialization failure: Server sends corrupted payload → verify _environment.Exit called

Server-side (4 untested scenarios):

  1. Mid-header disconnect: Client sends 2 bytes then closes → verify graceful loop exit
  2. Negative message size: Client sends -1 header → verify warning logged and loop exits (existing test only covers zero, not negative)
  3. Byte-count overflow: Client declares N bytes but writes >N bytes → verify warning logged and loop exits
  4. Write-side IOException: Client disconnects during server reply → verify graceful loop exit

All 8 scenarios are concrete failing interleavings that the new code explicitly handles. Recommend adding these tests to prevent regressions.
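As a starting point for the mid-header disconnect scenarios, the raw-pipe technique from the existing regression test can be reduced to the sketch below. It uses plain `System.IO.Pipes` streams on both ends rather than the real NamedPipeServer, so it only shows how the interleaving is produced; `MidHeaderDisconnectSketch` and the generated pipe name are hypothetical.

```csharp
using System;
using System.IO.Pipes;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch; a real test would drive NamedPipeServer instead of a raw server stream.
public static class MidHeaderDisconnectSketch
{
    // Returns how many header bytes the server side received before EOF.
    public static async Task<int> RunAsync(CancellationToken cancellationToken = default)
    {
        string pipeName = "sketch-" + Guid.NewGuid().ToString("N");

        using var server = new NamedPipeServerStream(
            pipeName, PipeDirection.InOut, 1, PipeTransmissionMode.Byte, PipeOptions.Asynchronous);
        Task waitConnection = server.WaitForConnectionAsync(cancellationToken);

        using (var client = new NamedPipeClientStream(
            ".", pipeName, PipeDirection.InOut, PipeOptions.Asynchronous))
        {
            await client.ConnectAsync(cancellationToken).ConfigureAwait(false);
            await waitConnection.ConfigureAwait(false);

            // Send only half of the 4-byte size header.
            byte[] partialHeader = { 0x01, 0x00 };
            await client.WriteAsync(partialHeader, 0, partialHeader.Length, cancellationToken)
                .ConfigureAwait(false);
            await client.FlushAsync(cancellationToken).ConfigureAwait(false);
        } // Disposing the client closes its end of the pipe mid-header.

        byte[] header = new byte[sizeof(int)];
        int read = 0;
        while (read < header.Length)
        {
            int n = await server
                .ReadAsync(header, read, header.Length - read, cancellationToken)
                .ConfigureAwait(false);
            if (n == 0)
            {
                break; // EOF: the peer disconnected before the header was complete.
            }

            read += n;
        }

        return read;
    }
}
```

Replacing the two partial bytes with `BitConverter.GetBytes(-1)` gives the negative-size variant of the same setup.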

Generated by Expert Code Review (on open) for issue #8602

// If currentRequestSize is 0, we need to read the message size
if (currentMessageSize == 0)
{
    // We need at least sizeof(int) bytes to parse the message-size header. A pipe read can
Member

This is adding a lot of additional code. Is there a concrete bug that makes it worth adding this?

3 participants