Update handling of user-agent data inferral #4641

Open
mydea opened this issue Apr 3, 2025 · 0 comments

Related to getsentry/sentry-docs#13205

We infer the following data from the sent user-agent in Relay (as far as I can tell):

  • Browser Context
  • Browser Tags
  • Device Context
  • Device Tags
  • (Client) OS Context

As of now, the user-agent is always sent in the event.request object, like this:

"request": {
  "url": "https://docs.sentry.io/product/issues/issue-details/performance-issues/n-one-queries/",
  "headers": [
    [
      "Referer",
      "https://www.google.com/"
    ],
    [
      "User-Agent",
      "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36"
    ]
  ]
},

This makes sense for backend/API events, where the request object describes the incoming request and the user-agent really is a header. It is not correct for browser events, though: when an error happens in a browser SDK, there is no user-agent header as such. In that case we are misusing the event.request object to carry the user agent, so that we can infer the browser/OS/device data that we also want to have for browser SDKs.

For context, sending this via event.request is especially weird because this is then shown in the Sentry UI as follows:

Image

which is really misleading for a browser error; it only makes sense for backend environments.

Because of this, we want to stop sending the user agent in the request context here, and instead send it in a different way that makes more sense for a browser SDK.

(Side note: Other things we use from event.request right now will also need to be moved, see getsentry/sentry-docs#13203, but this is a bit easier to reason about)

For user-agent specifically, we thus propose allowing it to be sent in the browser context:

{
  "contexts": {
    "browser": { "user_agent": "Mozilla/....." }
  }
}

If this is present and the sent browser context contains no other data, it should be prioritized as the source for user-agent data inference:

  1. If contexts.browser.user_agent exists, use this
  2. Else, if request.headers['user-agent'] exists, use this
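The lookup order above can be sketched as follows. This is only an illustration in Python, not Relay's actual code (Relay is written in Rust); the function name `pick_user_agent` is hypothetical, and the sketch assumes headers arrive as `[name, value]` pairs as in the example payload. It covers only the two-step order, not the "no other data" condition.

```python
from typing import Any, Optional


def pick_user_agent(event: dict[str, Any]) -> Optional[str]:
    """Return the user-agent string to infer from, or None (hypothetical helper)."""
    # 1. Prefer the value sent explicitly in the browser context.
    browser = (event.get("contexts") or {}).get("browser") or {}
    user_agent = browser.get("user_agent")
    if user_agent:
        return user_agent

    # 2. Otherwise fall back to the request headers, which are sent as
    #    [name, value] pairs; header names are compared case-insensitively.
    headers = (event.get("request") or {}).get("headers") or []
    for name, value in headers:
        if name.lower() == "user-agent":
            return value

    # Neither source is present, so there is nothing to infer from.
    return None
```

With this order, an event that carries both sources resolves to the `contexts.browser.user_agent` value, and existing SDKs that only send `request.headers` keep working.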

Browser context data inferred in Relay will be merged with any browser context already sent in the event, with the inferred data taking lower priority in the merge. For example, the result could look like this:

{
  "contexts": {
    "browser": {
      "user_agent": "Mozilla/.....",
      "name": "Chrome",
      "version": "101.0.0"
    }
  }
}

If (theoretically) a name or version already exists in the browser context, it should not be overwritten.
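A minimal sketch of that merge rule, again in Python purely for illustration and not taken from Relay. The function name `merge_browser_context` is hypothetical, and treating a `None` value in the sent context as "missing" is an assumption, not something this proposal specifies.

```python
from typing import Any


def merge_browser_context(sent: dict[str, Any], inferred: dict[str, Any]) -> dict[str, Any]:
    """Merge inferred browser data under the sent context (hypothetical helper)."""
    # Start from the inferred values, which have the lower priority.
    merged = dict(inferred)
    # Values sent by the SDK win, so an existing name/version is never overwritten.
    for key, value in sent.items():
        if value is not None:  # assumption: None counts as "not sent"
            merged[key] = value
    return merged
```

For example, merging a sent context of `{"user_agent": "Mozilla/.....", "name": "Custom"}` with inferred data `{"name": "Chrome", "version": "101.0.0"}` keeps `"name": "Custom"` and only adds the missing `version`.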

Why not just use the user agent from the HTTP request?

While this would work in most cases, it does not work with tunneled requests: they may lose the user-agent unless the tunnel implementation specifically forwards it, which we do not currently recommend that users do.
