Skip to content

Closing and exporting all spans on panic (when panic=abort)? #542

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
joshtriplett opened this issue May 8, 2021 · 6 comments
Open

Closing and exporting all spans on panic (when panic=abort)? #542

joshtriplett opened this issue May 8, 2021 · 6 comments
Assignees
Labels
A-trace Area: issues related to tracing question Further information is requested release:required-for-stable Must be resolved before GA release, or nice to have before GA.

Comments

@joshtriplett
Copy link

I'm using tracing-opentelemetry. From what I understand, spans rely on Drop implementations to close, and until the span is closed it won't be shipped off. I'm currently using panic=abort for various reasons. I'd love to have a global method that I can call from a panic handler that will close and export all spans before exiting, to make sure enough tracing information has been sent out to debug the panic.

@TommyCpp
Copy link
Contributor

TommyCpp commented May 9, 2021

This could be challenging. The Span holds a weak reference to TracerProvider via Tracer. But not the other way around. So it's gonna be hard to try to find all spans and export them.

But other than the spans that in flight. You also need to try to export the ended spans cached in BatchSpanProcessor. So it maybe makes sense to try to force_flush TracerProvider in the panic handler.

@TommyCpp TommyCpp added the question Further information is requested label May 9, 2021
@TommyCpp TommyCpp added the A-trace Area: issues related to tracing label Jan 9, 2022
@hdost hdost added this to the Tracing API And SDK Stable milestone Mar 2, 2023
@hdost hdost added the release:required-for-stable Must be resolved before GA release, or nice to have before GA. label Mar 2, 2023
@blueforesticarus
Copy link

blueforesticarus commented Mar 17, 2023

Logging active spans on crash, and logging spans that start but never end, are pretty important requirements for a tracing/logging system.

This could be challenging. The Span holds a weak reference to TracerProvider via Tracer. But not the other way around.

This seems like there is a design issue somewhere.

@cijothomas
Copy link
Member

Logging active spans on crash, and logging spans that start but never end, are pretty important requirements for a tracing/logging system.

I am not able to find any other OpenTelemetry implementation which does this automatically. For eg:, in OpenTelemetry .NET, there is this doc which describes one possible way for users to achieve this, but SDK does not do this automatically.
https://github.com/open-telemetry/opentelemetry-dotnet/tree/main/docs/trace/reporting-exceptions#unhandled-exception
Even this suggestion only finishes/exports those Span in that context. Spans in other threads/context remain unstopped and unexported.

@blueforesticarus
Copy link

blueforesticarus commented Mar 17, 2023

Writing things to a file doesn't have the second problem, and if you do things right it doesn't have the first.

I'm using tracing to output to opentelemetry. Tracing has hooks for when span's start, and continues to function in a panic hook. I get 100s of log lines in stdout, but NOTHING ends up sent to jaeger/zipkin via opentelemetry, due to an error 30s into startup.

I don't know whether it is design or implementation, but all the monitoring "solutions" (jaeger, zipkin, opentelemetry, etc) that I have come across do not deal with unfinished spans (which seems to be because the span start doesn't emit anything). This seems to me a basic viability issue (for opentelemetry as a whole).

Not trying to bash the maintainers here (and it seems like a wider issue anyway), just want to point out that

  1. This is a BIG problem
  2. I don't think I'm the only person who will lose a few hours banging their head against a wall thinking "surely there is a way to do this"

@djc
Copy link
Contributor

djc commented Mar 21, 2023

The Span holds a weak reference to TracerProvider via Tracer. But not the other way around.

Conceptually, why is this? Could we do it the other way around?

@cijothomas
Copy link
Member

Related to #1209

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-trace Area: issues related to tracing question Further information is requested release:required-for-stable Must be resolved before GA release, or nice to have before GA.
Projects
None yet
Development

No branches or pull requests

7 participants