
Conversation

@jpnurmi jpnurmi commented Sep 29, 2025

When chaining signal handlers in AOT mode, detect whether the .NET runtime converts a signal to a managed exception and transfers execution to the managed exception handler. In this case, Sentry Native should abort crash handling because the exception is caught and handled in managed code.

try
{
    // dereferencing null raises a SIGSEGV, which the AOT runtime converts
    // into a managed NullReferenceException
    var s = default(string);
    var c = s.Length;
}
catch (NullReferenceException exception)
{
    // The exception is caught and handled in managed code. In AOT mode,
    // execution should continue normally, without sentry-native's crash
    // handling kicking in.
}

See also:


Note

Detect .NET runtime converting signals to managed exceptions and skip native crash handling; add JIT/AOT tests and changelog entry.

  • Inproc backend (Linux):
    • Add get_stack_pointer and get_instruction_pointer for multiple architectures to read from ucontext_t.
    • When CHAIN_AT_START, compare IP/SP before/after invoking prior handler; if changed, treat as managed exception and abort native handling (see the sketch below).
  • Tests:
    • Refactor JIT runners (run_jit_*) and add AOT runners (run_aot_*), including AOT publish and execution.
    • Update fixture Program.cs to use null-forgiving s! and conditionally rethrow via args ("managed-exception").
    • Add separate test_aot_signals_inproc and rename JIT test; adjust skip reasons/messages.
  • Changelog:
    • Add Unreleased note: fix AOT interop with managed .NET runtimes.

Written by Cursor Bugbot for commit 060b18b.
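
For reference, a minimal sketch of what the IP/SP check summarized above could look like on Linux. The register accesses are standard glibc ucontext_t fields; invoke_previous_handler() and the surrounding handler code are hypothetical placeholders rather than the actual PR code.

#define _GNU_SOURCE /* for REG_RIP / REG_RSP on x86_64 */
#include <stdint.h>
#include <ucontext.h>

static uintptr_t get_instruction_pointer(const ucontext_t *uc)
{
#if defined(__x86_64__)
    return (uintptr_t)uc->uc_mcontext.gregs[REG_RIP];
#elif defined(__aarch64__)
    return (uintptr_t)uc->uc_mcontext.pc;
#else
    return 0; /* architecture not covered by this sketch */
#endif
}

static uintptr_t get_stack_pointer(const ucontext_t *uc)
{
#if defined(__x86_64__)
    return (uintptr_t)uc->uc_mcontext.gregs[REG_RSP];
#elif defined(__aarch64__)
    return (uintptr_t)uc->uc_mcontext.sp;
#else
    return 0;
#endif
}

/* inside the signal handler, with CHAIN_AT_START (illustrative only):
 *
 *   const uintptr_t ip = get_instruction_pointer(uc);
 *   const uintptr_t sp = get_stack_pointer(uc);
 *   invoke_previous_handler(signum, info, uc);
 *   if (ip != get_instruction_pointer(uc) || sp != get_stack_pointer(uc)) {
 *       // the runtime redirected execution to a managed exception handler
 *       return;
 *   }
 */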

@jpnurmi jpnurmi marked this pull request as ready for review September 29, 2025 12:10
@jpnurmi jpnurmi requested a review from supervacuus September 29, 2025 13:53

@supervacuus supervacuus left a comment


Thanks for finding the diff to make the other implementations work, @jpnurmi. Since this deviates significantly, we should either document the change as clearly as possible (for our internal use), making it explicit that we have shifted the focus to the current AOT signal/exception interface, or adapt the tests to cover the relevant area.

Co-authored-by: Mischan Toosarani-Hausberger <[email protected]>
@jpnurmi jpnurmi changed the title from "fix: interop with managed .NET runtimes" to "fix: AOT interop with managed .NET runtimes" Sep 30, 2025

@supervacuus supervacuus left a comment


I wonder if we can isolate the SIGABRT in case of an unhandled exception on AOT/Mono as well.

        SENTRY_DEBUG("runtime converted the signal to a managed "
                     "exception, we do not handle the signal");
        return;
    }
Collaborator

This is absolutely correct, but the only side-effect currently visible is for the logging toggle. Similar to how we "leave" the signal handler before chaining, we must also re-enable logging immediately after "leaving" and disable it again before re-entering, because if it were a managed code exception, we want logging to remain enabled.

We can also move the entire sig_slot assignment down below the CHAIN_AT_START code, to make the path dependencies more obvious.

However, I think both have a lower priority than figuring out the signaling sequence of both runtimes and how we can align them.
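
A rough, purely illustrative sketch of the suggested ordering; every helper name below is hypothetical and stands in for the real sentry-native internals.

#include <signal.h>
#include <stdbool.h>

static volatile sig_atomic_t g_logging_enabled = 1;

static void enable_safe_logging(void) { g_logging_enabled = 1; }
static void disable_safe_logging(void) { g_logging_enabled = 0; }

/* placeholders for the real enter/leave/chaining/detection logic */
static void leave_signal_handler(void) {}
static void enter_signal_handler(void) {}
static void invoke_previous_handler(int sig, siginfo_t *info, void *uc)
{
    (void)sig; (void)info; (void)uc;
}
static bool runtime_handled_signal(const void *uc) { (void)uc; return false; }

static void handle_signal(int signum, siginfo_t *info, void *ucontext)
{
    disable_safe_logging();     /* no logging while we own the signal */

    /* "leave" before chaining, and re-enable logging: if the runtime turns
     * the signal into a managed exception and resumes, logging stays on */
    leave_signal_handler();
    enable_safe_logging();
    invoke_previous_handler(signum, info, ucontext);

    if (runtime_handled_signal(ucontext)) {
        return;                 /* managed exception; nothing left to do */
    }

    /* it really is a native crash: disable logging again and re-enter */
    disable_safe_logging();
    enter_signal_handler();
    /* ... crash handling ... */
}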


jpnurmi commented Oct 1, 2025

I'm trying to fix the scenario where Mono's signal handler detects a managed exception, modifies the context to transfer execution to a managed exception handler, and then returns execution to Sentry Native's signal handler. In this case, Sentry Native needs to detect that Mono wanted execution to continue, and abort crash handling.

In case of a real native crash, though, if we invoke Mono's signal handler first and Mono's native crash handling decides to call _exit(), then Sentry Native misses the crash. 🙁


supervacuus commented Oct 1, 2025

I'm trying to fix the scenario where Mono's signal handler detects a managed exception, modifies the context to transfer execution to a managed exception handler, and then returns execution to Sentry Native's signal handler. In this case, Sentry Native needs to detect that Mono wanted execution to continue, and abort crash handling.

Isn't that what you're trying to do here all along? Or is there yet another difference when you use pure Mono?

In case of a real native crash, though, if we invoke Mono's signal handler first and Mono's native crash handling decides to call _exit(), then Sentry Native misses the crash. 🙁

Were you able to observe this? Because this only happens when crash_chaining is disabled. I cannot imagine that crash or signal chaining is off by default (especially not on Android or Linux).

@supervacuus

Isn't that what you're trying to do here all along? Or is there yet another difference when you use pure Mono?

Btw, if it is the latter, then this is also the reason why I suggested that CLR JIT support can be dropped altogether. When I started this project (which was over a year ago), the primary goal was to determine how much the handler interaction between the various runtime implementations converges. I started with CLR JIT as a baseline. However, if we primarily have downstream usage for another implementation that diverges entirely in signal handling, then we can either drop the current implementation or add another handler strategy.


jpnurmi commented Oct 1, 2025

Were you able to observe this? Because this only happens when crash_chaining is disabled. I cannot imagine that crash or signal chaining is off by default (especially not on Android or Linux).

I tried creating a test case using mcs + mono --aot on Linux. Mono's native crash reporter kicks in when we call Mono's signal handler for a native crash, and execution ends there...

=================================================================
        Native Crash Reporting
=================================================================
Got a SIGSEGV while executing native code. This usually indicates
a fatal error in the mono runtime or one of the native libraries
used by your application.
=================================================================

=================================================================
        Native stacktrace:
=================================================================
        0x62c7d67295fa - mono :
        0x62c7d66c7e8a - mono :
        0x62c7d671cad0 - mono :
        0x728ba68491a6 - /tmp/pytest-of-jpnurmi/pytest-55/cmake0/libcrash.so : native_crash
        0x40961618 - Unknown

=================================================================
        Native Crash Reporting
=================================================================
Got a SIGSEGV while executing native code. This usually indicates
a fatal error in the mono runtime or one of the native libraries
used by your application.
=================================================================

=================================================================
        Native stacktrace:
=================================================================
        0x62c7d67295fa - mono :
        0x62c7d66c7e8a - mono :
        0x62c7d671cad0 - mono :
        0x728ba68491a6 - /tmp/pytest-of-jpnurmi/pytest-55/cmake0/libcrash.so : native_crash
        0x40961618 - Unknown

=================================================================
        Telemetry Dumper:
=================================================================
Pkilling 0x125944111036096x from 0x125944120812288x
Entering thread summarizer pause from 0x125944120812288x
Finished thread summarizer pause from 0x125944120812288x.
Failed to create breadcrumb file (null)/crash_hash_0x3652010b5

Waiting for dumping threads to resume

=================================================================
        Basic Fault Address Reporting
=================================================================
Memory around native instruction pointer (0x728ba68491a6):
0x728ba6849196  ff ff ff f3 0f 1e fa 55 48 89 e5 b8 0a 00 00 00  .......UH.......
0x728ba68491a6  c7 00 64 00 00 00 90 5d c3 f3 0f 1e fa 55 48 89  ..d....].....UH.
0x728ba68491b6  e5 48 83 ec 30 64 48 8b 04 25 28 00 00 00 48 89  .H..0dH..%(...H.
0x728ba68491c6  45 f8 31 c0 48 c7 45 d8 00 40 00 00 48 8b 45 d8  E.1.H.E..@..H.E.

=================================================================
        Managed Stacktrace:
=================================================================
          at <unknown> <0xffffffff>
          at dotnet_signal.Program:native_crash <0x000a7>
          at dotnet_signal.Program:Main <0x000e8>
          at <Module>:runtime_invoke_void_object <0x00091>
=================================================================


jpnurmi commented Oct 1, 2025

Were you able to observe this? Because this only happens when crash_chaining is disabled. I cannot imagine that crash or signal chaining is off by default (especially not on Android or Linux).

I tried creating a test case using mcs + mono --aot on Linux. Mono's native crash reporter kicks in when we call Mono's signal handler for a native crash, and execution ends there...

No wait, it's the newly added IP/SP check that prevents native crash handling, too. 🤦 How the heck do we distinguish between these.......

@supervacuus

No wait, it's the newly added IP/SP check that prevents native crash handling, too. 🤦 How the heck do we distinguish between these.......

I was wary of checking ucontext modifications along the signal chain. I didn't have the time to review the implementation, but I can.


supervacuus commented Oct 1, 2025

No wait, it's the newly added IP/SP check that prevents native crash handling, too. 🤦 How the heck do we distinguish between these.......

Try, as a first step, to switch the order of the handler chain for Mono (and drop your current IP/SP check or even the CHAIN_AT_START strategy entirely). The way it seems to be operating makes more sense if their handlers get installed last. In "old" Mono, there were managed-language-side functions that could (un)install signal handlers at specific points (which could be controlled from sentry-dotnet around the native SDK initialization) to control the chain being:

DFL <- Native SDK <- mono handler

rather than what we have now:

DFL <- mono handler <- Native SDK

Not sure if they are still exposed in the dotnet/runtime mono fork, but we can certainly try to achieve something similar. Then we would have their handler first and might not need an alternative strategy inside our handler; maybe not even for CLR (but one step at a time).
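
For context, a generic sketch of how installation order determines such a chain on Linux; this is not the actual sentry-native or Mono code, just the standard sigaction() pattern where whoever installs last runs first and decides whether to forward to the previously installed handler.

#include <signal.h>
#include <string.h>

static struct sigaction g_previous;

static void chaining_handler(int signum, siginfo_t *info, void *ucontext)
{
    /* ... this handler's own logic would run here ... */

    /* forward to whichever handler was installed before ours */
    if (g_previous.sa_flags & SA_SIGINFO) {
        if (g_previous.sa_sigaction) {
            g_previous.sa_sigaction(signum, info, ucontext);
        }
    } else if (g_previous.sa_handler == SIG_DFL) {
        /* restore the default action and re-raise so it runs on return */
        sigaction(signum, &g_previous, NULL);
        raise(signum);
    } else if (g_previous.sa_handler != SIG_IGN) {
        g_previous.sa_handler(signum);
    }
}

static void install_chaining_handler(int signum)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO | SA_ONSTACK;
    sa.sa_sigaction = chaining_handler;
    /* the previously installed handler (e.g. the runtime's) ends up in
     * g_previous; installing later therefore means running earlier */
    sigaction(signum, &sa, &g_previous);
}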

@jpnurmi jpnurmi marked this pull request as draft October 1, 2025 12:36

jpnurmi commented Oct 2, 2025

Swapping the order of the signal handlers would work. I was able to confirm the theory on Linux, even though I had to patch Mono to either

to make it possible to swap the order in either managed or native code, respectively. However, that's just Linux, which is not relevant for Sentry .NET on Android or iOS. The problem is that there's no such type as Mono.Runtime on either Android or iOS... 🤔

@supervacuus

to make it possible to swap the order in either managed or native code, respectively. However, that's just Linux, which is not relevant for Sentry .NET on Android or iOS. The problem is that there's no such type as Mono.Runtime on either Android or iOS...

We should do this in the Native SDK, similar to how we can change the invocation sequence in the handler; we can construct the signal chain up to a point during the setup of the signal handlers (rather than at signal-handling time). I can follow up on this topic next week.

jpnurmi added a commit to getsentry/sentry-dotnet that referenced this pull request Oct 21, 2025

jpnurmi commented Oct 21, 2025

The main purpose of debugging on Linux was just to understand Mono's behavior. 🙂

Anyway, sentry-dotnet has new integration tests for Android:

I have also temporarily hacked sentry-dotnet's build system to pick sentry-android from a local Maven repo instead of downloading from remotes:

This assumes locally built and published (gradlew publishToMavenLocal) versions of both:

Furthermore, I reverted this old change and switched sentry-dotnet back to CHAIN_AT_START for testing purposes:

With local builds and all above changes temporarily combined in sentry-dotnet's jpnurmi/android-chain-at-start branch, I can confirm that this PR fixes the NullReferenceException test case while the CrashType.Native test case still passes on both arm64 and x86_64.

@jpnurmi jpnurmi marked this pull request as ready for review October 21, 2025 09:38
They are irrelevant for sentry-dotnet or .NET on Android, and there
are no tests checking if they even work.


@supervacuus supervacuus left a comment


Happy to see that the adaptation works downstream ❤️

Please move the log flush (Flush logs in a crash-safe...) below the point where we call sentry__page_allocator_enable() but outside the #ifdef because that should happen on all platforms. There is no reason to flush the logs if we don't know there is a terminal signal to handle, and it also requires the allocator, which isn't safe before we enable the page-allocator.
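
A rough sketch of the requested ordering; only sentry__page_allocator_enable() is taken from the comment above, while the guard macro and the flush helper are placeholders.

/* illustrative only: ordering inside the terminal-signal path */
void sentry__page_allocator_enable(void);
static void flush_logs_crash_safe(void) { /* placeholder */ }

static void on_terminal_signal(void)
{
#ifdef SENTRY_PLATFORM_LINUX /* placeholder guard */
    /* crash-safe allocations are only possible from this point on */
    sentry__page_allocator_enable();
#endif

    /* flush logs only now: we know there is a terminal signal to handle and
     * the allocator is ready; kept outside the #ifdef so it runs on all
     * platforms */
    flush_logs_crash_safe();
}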

Otherwise, I would either document why we hide the behavior for managed exceptions that aren't handled in C# code now (since this has a more limited scope than "all managed exceptions" and essentially hides behavior) or adapt the test handling accordingly.


jpnurmi commented Oct 22, 2025

P.S. These tests were only executed in Release mode, but I've prepared a patch to make them execute in both Debug and Release:

For what it's worth, they did seem to pass locally on both x86_64 and arm64, even though the NullReferenceException leakage only occurs in Release-optimized code. It's good to have Debug mode integration tests, nevertheless, to make sure we capture native crashes as expected, even with the IP/SP check in place.

@supervacuus

P.S. These tests were only executed in Release mode

Yes, I have seen this. Release is more critical because that is where most problems appear. However, it is sensible to track the behavior on Debug too, so we don't surprise users with changing behavior during development.

but I've prepared a patch to make them execute in both Debug and Release:

Perfection 💯

It's good to have Debug mode integration tests, nevertheless, to make sure we capture native crashes as expected, even with the IP/SP check in place.

Agreed, and also because we want to track changes in behavior or have feedback when we add or extend the test dimensions.

@supervacuus

The only thing left now is to add the unhandled-managed-exception run to the AOT test case.


jpnurmi commented Oct 22, 2025

The only thing left now is to add the unhandled-managed-exception run to the AOT test case.

Well, this is interesting. Unwinding the chained SIGABRT on Linux crashes in backtrace when called as a fallback from here:

    // if unwinding from a ucontext didn't yield any results, try again with a
    // direct unwind. this is most likely the case when using `libbacktrace`,
    // since that does not allow to unwind from a ucontext at all.
    if (!frame_count) {
        frame_count = sentry_unwind_stack(NULL, &backtrace[0], MAX_FRAMES);
    }

Not sure how to debug this, because running in a debugger changes the behavior. 🤨


jpnurmi commented Oct 22, 2025

Not sure if skipping this fallback in case of chained signal handlers is the right thing to do, but it helps avoid the crash...
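
For illustration, the skip could look roughly like this; handler_chaining_enabled is a hypothetical flag for this sketch, not an existing sentry-native variable.

    // if unwinding from a ucontext didn't yield any results, try again with a
    // direct unwind -- but skip the fallback when signal-handler chaining is
    // in use, where it has been seen to crash and would mostly describe the
    // handler stack rather than the crash site anyway
    if (!frame_count && !handler_chaining_enabled) {
        frame_count = sentry_unwind_stack(NULL, &backtrace[0], MAX_FRAMES);
    }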


codecov bot commented Oct 22, 2025

Codecov Report

❌ Patch coverage is 67.85714% with 9 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.35%. Comparing base (516c150) to head (ac6877e).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1392      +/-   ##
==========================================
- Coverage   83.48%   83.35%   -0.14%     
==========================================
  Files          58       58              
  Lines        9648     9660      +12     
  Branches     1511     1512       +1     
==========================================
- Hits         8055     8052       -3     
- Misses       1439     1451      +12     
- Partials      154      157       +3     


@supervacuus supervacuus left a comment


It is fair to exclude the non-user-context backtrace fallback. It is questionable whether that stack trace, had it not crashed, would provide any helpful information. I would add a comment explaining why you skip it there.


jpnurmi commented Oct 27, 2025

Looking forward to making this available as an opt-in for starters:

Thanks so much for the help and guidance! ❤️

@jpnurmi jpnurmi merged commit 9895a5c into master Oct 27, 2025
41 checks passed
@jpnurmi jpnurmi deleted the fix/dotnet-interop branch October 27, 2025 16:24