-
Notifications
You must be signed in to change notification settings - Fork 275
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve Error Type Accuracy in TaskFailedException FailureDetails with .NET-Isolated app #3070
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some comments
@@ -11,14 +11,20 @@ internal class DurableSerializationException : Exception | |||
// We set the base class properties of this exception to the same as the parent, | |||
// so that methods in the worker after this can still (typically) access the same information vs w/o | |||
// this exception type. | |||
internal DurableSerializationException(Exception fromException) : base(fromException.Message, fromException.InnerException) | |||
internal DurableSerializationException(Exception fromException) : base(CreateExceptionMessage(fromException), fromException.InnerException) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just want to understand why this change was made - the reason that I only did the JSON serialization in ToString() originally was because I didn't want to alter the actual properties of the exception - before this change, if the functions worker needed to access the properties of the user code exception, they were functionally unmodified, and it wasn't until the exception was converted to a string for gRPC transmission that the serialization happened.
Can you help me understand what was the negative effect of this decision and what does serializing the message property improve?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Customers want more detailed information about exceptions within the FailureDetails
of a TaskFailedEvent, including the specific error type. Currently, this detail is missing, and thus cx will always get unknown
type inside FailureDetails.
Why this is missing, is because we only send the exception message at DurableSerializationException. For some reason, the ToString() is only called by grpc log, while the grpc worker only pass exception message and inner Failure. (Check class here)
Thus, when DurableSerializationException went to the OutOfProcMiddleware GetFailureDetails, we will fail for TryExtractSerializedFailureDetailsFromException and TrySplitExceptionTypeFromMessage, and cause the type becomes unknown
.
I don't believe I've altered the original user exception; I'm only updating DurableSerializationException, which I assume is used for passing errors via gRPC, if I understand your question correctly. My main concern is that serializing the entire failure details might be excessive and redundant—for example, the error trace and inner exception will be included in the message while also being part of the full exception. However, since I suppose TryExtractSerializedFailureDetailsFromException
should be used, I'll leave it as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood.
I think what I was missing was this:
For some reason, the ToString() is only called by grpc log, while the grpc worker only pass exception message and inner Failure
Very surprising, TBH.
When I was talking about the "properties of the user code exception" I meant the properties of the exception exposed to the ooproc worker - before my change that added DurableSerializationException, the native user code exception type was thrown all the way up until the gRPC logging step. I was trying to be extra careful not to change behavior of the worker, especially for "special case" exception types like OutOfMemoryException.
Anyway, based on the test case (which looks great, btw) and your own experimentation, I'm good to approve this PR. Thanks for investigating this!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved
@@ -11,14 +11,20 @@ internal class DurableSerializationException : Exception | |||
// We set the base class properties of this exception to the same as the parent, | |||
// so that methods in the worker after this can still (typically) access the same information vs w/o | |||
// this exception type. | |||
internal DurableSerializationException(Exception fromException) : base(fromException.Message, fromException.InnerException) | |||
internal DurableSerializationException(Exception fromException) : base(CreateExceptionMessage(fromException), fromException.InnerException) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood.
I think what I was missing was this:
For some reason, the ToString() is only called by grpc log, while the grpc worker only pass exception message and inner Failure
Very surprising, TBH.
When I was talking about the "properties of the user code exception" I meant the properties of the exception exposed to the ooproc worker - before my change that added DurableSerializationException, the native user code exception type was thrown all the way up until the gRPC logging step. I was trying to be extra careful not to change behavior of the worker, especially for "special case" exception types like OutOfMemoryException.
Anyway, based on the test case (which looks great, btw) and your own experimentation, I'm good to approve this PR. Thanks for investigating this!
BTW - I see the failure in mssql E2E tests - NBD for me. I'll need to investigate this at some point, but definitely not related to these changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Directory.Packages.props
Outdated
@@ -64,6 +64,7 @@ | |||
<PackageVersion Include="Microsoft.CodeAnalysis" Version="3.9.0" /> | |||
<PackageVersion Include="Microsoft.CodeAnalysis.Workspaces.MSBuild" Version="3.9.0" /> | |||
<PackageVersion Include="Microsoft.Diagnostics.Tracing.TraceEvent" Version="2.0.65" /> | |||
<PackageVersion Include="Microsoft.DurableTask.Abstractions" Version="1.8.0" /> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be 1.8.1
, which is the latest version?
Issue describing the changes in this PR
partialy solves #2689
This PR ensures that when orchestrators catch exceptions, the FailureDetails.ErrorType will match the type of the original exception thrown by the activity or sub-orchestration function.
For example, if an activity function throws an InvalidOperationException, the FailureDetails.ErrorType in the resulting TaskFailedException should be "InvalidOperationException" .
Pull request checklist
pending_docs.md
release_notes.md
/src/Worker.Extensions.DurableTask/AssemblyInfo.cs
dev
andmain
branches and will not be merged into thev2.x
branch.