Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve Error Type Accuracy in TaskFailedException FailureDetails with .NET-Isolated app #3070

Merged
merged 5 commits into from
Mar 25, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions Directory.Packages.props
Original file line number Diff line number Diff line change
Expand Up @@ -64,6 +64,7 @@
<PackageVersion Include="Microsoft.CodeAnalysis" Version="3.9.0" />
<PackageVersion Include="Microsoft.CodeAnalysis.Workspaces.MSBuild" Version="3.9.0" />
<PackageVersion Include="Microsoft.Diagnostics.Tracing.TraceEvent" Version="2.0.65" />
<PackageVersion Include="Microsoft.DurableTask.Abstractions" Version="1.8.1" />
<PackageVersion Include="Microsoft.DurableTask.Generators" Version="1.0.0-preview.1" />
<PackageVersion Include="Microsoft.DurableTask.SqlServer.AzureFunctions" Version="1.5.0" />
<PackageVersion Include="Microsoft.Extensions.Configuration" Version="9.0.0" />
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -11,14 +11,20 @@ internal class DurableSerializationException : Exception
// We set the base class properties of this exception to the same as the parent,
// so that methods in the worker after this can still (typically) access the same information vs w/o
// this exception type.
internal DurableSerializationException(Exception fromException) : base(fromException.Message, fromException.InnerException)
internal DurableSerializationException(Exception fromException) : base(CreateExceptionMessage(fromException), fromException.InnerException)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just want to understand why this change was made - the reason that I only did the JSON serialization in ToString() originally was because I didn't want to alter the actual properties of the exception - before this change, if the functions worker needed to access the properties of the user code exception, they were functionally unmodified, and it wasn't until the exception was converted to a string for gRPC transmission that the serialization happened.
Can you help me understand what was the negative effect of this decision and what does serializing the message property improve?

Copy link
Collaborator Author

@nytian nytian Mar 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Customers want more detailed information about exceptions within the FailureDetails of a TaskFailedEvent, including the specific error type. Currently, this detail is missing, and thus cx will always get unknown type inside FailureDetails.

Why this is missing, is because we only send the exception message at DurableSerializationException. For some reason, the ToString() is only called by grpc log, while the grpc worker only pass exception message and inner Failure. (Check class here)

Thus, when DurableSerializationException went to the OutOfProcMiddleware GetFailureDetails, we will fail for TryExtractSerializedFailureDetailsFromException and TrySplitExceptionTypeFromMessage, and cause the type becomes unknown.

I don't believe I've altered the original user exception; I'm only updating DurableSerializationException, which I assume is used for passing errors via gRPC, if I understand your question correctly. My main concern is that serializing the entire failure details might be excessive and redundant—for example, the error trace and inner exception will be included in the message while also being part of the full exception. However, since I suppose TryExtractSerializedFailureDetailsFromException should be used, I'll leave it as is.

Copy link
Contributor

@andystaples andystaples Mar 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood.
I think what I was missing was this:

For some reason, the ToString() is only called by grpc log, while the grpc worker only pass exception message and inner Failure

Very surprising, TBH.

When I was talking about the "properties of the user code exception" I meant the properties of the exception exposed to the ooproc worker - before my change that added DurableSerializationException, the native user code exception type was thrown all the way up until the gRPC logging step. I was trying to be extra careful not to change behavior of the worker, especially for "special case" exception types like OutOfMemoryException.

Anyway, based on the test case (which looks great, btw) and your own experimentation, I'm good to approve this PR. Thanks for investigating this!

{
this.fromException = fromException;
}

public override string ToString()
{
TaskFailureDetails? failureDetails = TaskFailureDetailsConverter.TaskFailureFromException(this.fromException);
return this.Message;
}

// Serilize FailureDetails to JSON
private static string CreateExceptionMessage(Exception ex)
{
TaskFailureDetails? failureDetails = TaskFailureDetailsConverter.TaskFailureFromException(ex);
return JsonConvert.SerializeObject(failureDetails);
}

Expand Down
33 changes: 32 additions & 1 deletion test/e2e/Apps/BasicDotNetIsolated/ActivityErrorHandling.cs
Original file line number Diff line number Diff line change
Expand Up @@ -46,6 +46,22 @@ public static async Task<HttpResponseData> CatchHttpStart(
return await client.CreateCheckStatusResponseAsync(req, instanceId);
}

[Function("CatchActivityExceptionFailureDetails_HttpStart")]
public static async Task<HttpResponseData> CatchFailureDetailsHttpStart(
[HttpTrigger(AuthorizationLevel.Anonymous, "get", "post")] HttpRequestData req,
[DurableClient] DurableTaskClient client,
FunctionContext executionContext)
{
ILogger logger = executionContext.GetLogger("CatchActivityExceptionFailureDetails_HttpStart");

string instanceId = await client.ScheduleNewOrchestrationInstanceAsync(
nameof(CatchActivityExceptionFailureDetails));

logger.LogInformation("Started orchestration with ID = '{instanceId}'.", instanceId);

return await client.CreateCheckStatusResponseAsync(req, instanceId);
}

[Function("RetryActivityException_HttpStart")]
public static async Task<HttpResponseData> RetryHttpStart(
[HttpTrigger(AuthorizationLevel.Anonymous, "get", "post")] HttpRequestData req,
Expand Down Expand Up @@ -96,11 +112,26 @@ public static async Task<string> CatchActivityException(
return output;
}
catch (TaskFailedException ex)
{
{
return ex.Message;
}
}

[Function(nameof(CatchActivityExceptionFailureDetails))]
public static async Task<TaskFailureDetails?> CatchActivityExceptionFailureDetails(
[OrchestrationTrigger] TaskOrchestrationContext context)
{
try
{
await context.CallActivityAsync<string>(nameof(RaiseException), context.InstanceId);
return null;
}
catch (TaskFailedException ex)
{
return ex.FailureDetails;
}
}

[Function(nameof(RetryActivityFunction))]
public static async Task<string> RetryActivityFunction(
[OrchestrationTrigger] TaskOrchestrationContext context)
Expand Down
1 change: 1 addition & 0 deletions test/e2e/Tests/E2ETests.csproj
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@
</PropertyGroup>

<ItemGroup>
<PackageReference Include="Microsoft.DurableTask.Abstractions"/>
<PackageReference Include="Microsoft.Extensions.Configuration" />
<PackageReference Include="Microsoft.Extensions.Configuration.EnvironmentVariables" />
<PackageReference Include="Microsoft.Extensions.Configuration.Json" />
Expand Down
25 changes: 24 additions & 1 deletion test/e2e/Tests/Tests/ErrorHandlingTests.cs
Original file line number Diff line number Diff line change
@@ -1,7 +1,9 @@
// Copyright (c) .NET Foundation. All rights reserved.
// Copyright (c) .NET Foundation. All rights reserved.
// Licensed under the MIT License. See License.txt in the project root for license information.

using System.Net;
using Microsoft.DurableTask;
using Newtonsoft.Json;
using Xunit;
using Xunit.Abstractions;

Expand Down Expand Up @@ -71,6 +73,27 @@ public async Task OrchestratorWithCaughtActivityException_ShouldSucceed()
Assert.Contains("This activity failed", orchestrationDetails.Output);
}

[Fact]
public async Task OrchestratorWithCaughtActivityExceptionFailuredetails_ContainRightErrorType()
{
using HttpResponseMessage response = await HttpHelpers.InvokeHttpTrigger("CatchActivityExceptionFailureDetails_HttpStart", "");

Assert.Equal(HttpStatusCode.Accepted, response.StatusCode);
string statusQueryGetUri = await DurableHelpers.ParseStatusQueryGetUriAsync(response);

await DurableHelpers.WaitForOrchestrationStateAsync(statusQueryGetUri, "Completed", 30);

var orchestrationDetails = await DurableHelpers.GetRunningOrchestrationDetailsAsync(statusQueryGetUri);
// Deserialize the output to FailureDetails
var failureDetails = JsonConvert.DeserializeObject<TaskFailureDetails>(orchestrationDetails.Output);

// Check FailureDetails contains the right error type and error message,
// Here it should be the same one as the activity function Raise Exception throws.
Assert.NotNull(failureDetails);
Assert.Equal("System.InvalidOperationException", failureDetails.ErrorType);
Assert.Equal("This activity failed", failureDetails.ErrorMessage);
}

[Fact]
[Trait("MSSQL", "Skip")] // Durable Entities are not supported in MSSQL/Dotnet Isolated, see https://github.com/microsoft/durabletask-mssql/issues/205
public async Task OrchestratorWithCaughtEntityException_ShouldSucceed()
Expand Down
Loading