apache_beam + beam integration not sending exceptions in GCP #4203

Open
gshoultz42 opened this issue Mar 26, 2025 · 5 comments

@gshoultz42

How do you use Sentry?

Sentry Saas (sentry.io)

Version

2.23.1

Steps to Reproduce

  1. Add Sentry to a Python Dataflow job.
  2. Wrap the body of process on a DoFn in try/except.
  3. Inside the except block, call capture_exception.
  4. Configure the Dataflow job as a streaming job in GCP.
  5. Deploy the Dataflow job to the cloud runner.
  6. Send an event that triggers the exception.

Expected Result

The exception is sent to Sentry.

Actual Result

Sentry does not receive the exception.

@antonpirker
Member

Hey @gshoultz42!
Can you show me your sentry_sdk.init() and also the call to capture_exception(), including the try/except?

You can also:

  • Enable debug output with sentry_sdk.init(debug=True); the console will then show what the Sentry SDK is doing. Look for log messages that contain "Sending envelope".
  • Make sure that wherever the Sentry SDK is running, it can make an HTTP connection to ingest.sentry.io.
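
The connectivity check in the second bullet can be approximated from inside the worker with a plain TCP connection attempt. This is a minimal sketch using only the Python standard library; the helper name can_reach and the default port 443 are assumptions, not part of the Sentry SDK. A successful connect only shows that the host is reachable, not that an event was accepted.

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host name and connects;
        # opening and immediately closing the socket is enough here.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

Calling can_reach("ingest.sentry.io") from the worker should return True if outbound HTTPS traffic to Sentry is allowed.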

Hope this helps

@antonpirker
Member

I just fed your question into the bot on our Discord server, and it has these suggestions: https://discord.com/channels/621778831602221064/1354782809020960779/1354782809020960779 (it's not super helpful, I have to admit, but it is a nice experiment).

@gshoultz42
Author

gshoultz42 commented Mar 27, 2025

Sentry Init

sentry_sdk.init(
    dsn=sentry_dsn,
    environment=env,
    integrations=[BeamIntegration()],
)

Exception call

class ParseEvent(beam.DoFn):
    def process(self, element: bytes, *args, **kwargs):
        try:
            event_json = json.loads(element)
            event_name = event_json["message"]["payload"]["eventName"]
            model = get_model(event_name)
            event = model.model_validate_json(element)
            yield event
        except (ValidationError, KeyError, JSONDecodeError, TypeError) as error:
            logging.exception("Error parsing event: %r", element)
            error_output = {
                "error": repr(error),
                "payload": (
                    event_json
                    if not isinstance(error, JSONDecodeError)
                    else element.decode()
                ),
            }
            yield TaggedOutput("parse_errors", json.dumps(error_output).encode())

@antonpirker
Member

Thanks for the follow up!

The line

logging.exception("Error parsing event: %r", element)

should send an error to Sentry if the logging integration is enabled (the Sentry SDK enables it by default).

If you want the exception itself to show up in Sentry, you can call sentry_sdk.capture_exception(error) in your except block.

Because you handle the exception yourself, it never bubbles up for Sentry to capture.

You could also add a bare raise after the last yield in the except block, so the exception is raised again and eventually captured by Sentry.
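
The first suggestion, reporting the handled exception explicitly, can be sketched as follows. To keep the example runnable without Beam or a Sentry DSN, it is a plain generator rather than a DoFn, and the reporting call is passed in as capture, a hypothetical parameter. In the real ParseEvent.process you would call sentry_sdk.capture_exception(error) at that point. The ("parse_errors", ...) tuple stands in for TaggedOutput, and model validation is left out.

```python
import json
import logging
from json import JSONDecodeError


def parse_event(element: bytes, capture):
    """Plain-generator stand-in for ParseEvent.process.

    `capture` is a hypothetical hook for this sketch; in the DoFn it
    would be sentry_sdk.capture_exception.
    """
    try:
        event_json = json.loads(element)
        event_name = event_json["message"]["payload"]["eventName"]
        yield event_name
    except (KeyError, JSONDecodeError, TypeError) as error:
        logging.exception("Error parsing event: %r", element)
        # The exception is handled here and never propagates,
        # so it has to be reported explicitly.
        capture(error)
        # Stand-in for TaggedOutput("parse_errors", ...).
        yield ("parse_errors", repr(error))
```

With the Sentry SDK initialised, passing sentry_sdk.capture_exception as capture gives the behaviour described above; a bare raise after the last yield would be the alternative.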

@gshoultz42
Author

I wanted to provide an update: I am still looking into this. I enabled debug output in the Sentry SDK but did not see the expected log messages. I am working with our admin to make sure logging is configured correctly in our GCP environment.
