Datadog not propagating distributed traces with persistent connections enabled #70

Open
hmcfletch opened this issue Jan 16, 2025 · 10 comments

Comments

@hmcfletch

I am running into some issues with my Datadog integration when I enable the persistent plugin.

I am using v1.4.0

I am enabling it like so:

http.plugin(:follow_redirects).plugin(:persistent)

This is what I see for the initial connection request:
Image

  • The pinkish is my wrapper code
  • The purple is the httpx code
  • The light blue is the downstream service I am making the HTTP call to
  • You'll notice the error (openssl::ssl::sslerror: SSL_read: unexpected eof while reading) on the left, which I am assuming is httpx attempting the request on a previous connection that is no longer valid.

This is a span where I am assuming the current connection has been reused (it has a much shorter duration than the above example):
Image
I see my wrapper code and the httpx code, but no downstream service.

Since I am hoping that connections will be constantly reused in our production environment for the performance improvement, this issue is preventing me from adopting httpx more widely (which I'd really like to do).

@HoneyryderChuck
Owner

@hmcfletch thx for the report.

I've pushed a fix in the gh-70 branch, could you give it a try and see if this solves it?

@hmcfletch
Author

I'll give it a go either today or on Monday and let you know. Thanks for the quick response.

@hmcfletch
Author

Just gave things a go. I no longer see the extraneous error span, but now neither of the situations shows me the downstream service. Both the longer and the shorter (in ms) spans look like the second option above.

@HoneyryderChuck
Owner

That's odd. Can you share: your ddtrace gem version; your datadog configuration; your peer service datadog configuration? I suspect some issue on your peer service (distributed tracing turned off), or some bug in the specific version you're using

@HoneyryderChuck
Owner

Enabling debug logs (either via env var or debug/debug_level options) would also help you confirm that the distributed tracing headers are being sent.
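
One way to check a captured log dump for those headers is sketched below. It is only a sketch under assumptions: the header names are the standard Datadog propagation headers plus the W3C `traceparent` header, the helper `propagation_headers_in` is hypothetical, and the sample log lines are made up for illustration rather than copied from real httpx debug output.

```ruby
# Header names used for trace propagation (Datadog style and W3C Trace Context).
PROPAGATION_HEADERS = %w[
  x-datadog-trace-id
  x-datadog-parent-id
  x-datadog-sampling-priority
  traceparent
].freeze

# Hypothetical helper: returns a hash of header name => values found in the log text.
# Assumption: each header shows up as "name: value" somewhere on a line of the dump.
def propagation_headers_in(log_text)
  found = Hash.new { |hash, key| hash[key] = [] }
  log_text.each_line do |line|
    PROPAGATION_HEADERS.each do |name|
      match = line.match(/\b#{Regexp.escape(name)}:\s*(\S+)/i)
      found[name] << match[1] if match
    end
  end
  found
end

# Made-up sample lines; a real dump would be read from a file or captured IO.
sample = <<~LOG
  HEADER: x-datadog-trace-id: 123
  HEADER: x-datadog-parent-id: 456
  HEADER: accept: */*
LOG

headers = propagation_headers_in(sample)
missing = PROPAGATION_HEADERS - headers.keys
puts "found:   #{headers.keys.join(', ')}"
puts "missing: #{missing.join(', ')}"
```

If the trace and parent id headers are missing from the requests that reuse a connection but present on the first request, that would point at the client side rather than the peer service.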

@hmcfletch
Author

Currently using datadog v2.4.0. They renamed it when they bumped it to 2.x.

I am assuming that my downstream setup is working for a couple of reasons:

  • We could see the distributed trace in the first example above.
  • Currently I am using the faraday adapter for the OpenAPI SDK generator with non-persistent connections, and the distributed traces are showing up just fine.

Current Datadog config looks like this:

Datadog.configure do |c|
  # This enables DataDog for production and staging
  c.tracing.enabled = Rails.env.production?

  c.tracing.instrument :active_record, service_name: ENV['DD_SERVICE'] # defaults to adapter name, which is postgres for us
  c.tracing.instrument :active_support, cache_service: ENV['DD_SERVICE'] # defaults to active_support-cache
  c.tracing.instrument :aws, service_name: ENV['DD_SERVICE'] # defaults to aws
  c.tracing.instrument :faraday, service_name: ENV['DD_SERVICE'] # defaults to faraday
  c.tracing.instrument :http, service_name: ENV['DD_SERVICE'] # defaults to net/http
  c.tracing.instrument :httpclient, service_name: ENV['DD_SERVICE'] # defaults to httpclient
  c.tracing.instrument :rest_client, service_name: ENV['DD_SERVICE'] # defaults to rest_client

  # app specific services names
  c.tracing.instrument :pg, service_name: 'pg-my-serice' # defaults to pg
  c.tracing.instrument :redis, service_name: 'redis-my-serice' # defaults to redis

  # Set sampling rates
  
  # logic to setup sampling rules

  c.tracing.sampling.rules = JSON.generate(sampling_rules)
end

I specifically haven't renamed the httpx service yet, to better identify what is happening while I am debugging things.
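
For reference, the config above does not show an `:httpx` entry. A minimal sketch of what enabling it might look like follows, assuming the Datadog adapter that ships with httpx (`httpx/adapters/datadog`) and the 2.x `datadog` gem. The `distributed_tracing` option is assumed to default to true already and is only spelled out here for visibility.

```ruby
require "datadog"
require "httpx/adapters/datadog" # registers the :httpx integration with Datadog

Datadog.configure do |c|
  # No service_name override, so httpx spans keep the integration's default name.
  # Assumption: distributed_tracing already defaults to true; set explicitly for clarity.
  c.tracing.instrument :httpx, distributed_tracing: true
end
```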

@HoneyryderChuck
Owner

Thx for the paste.

I see that you're using the faraday tracing for this, not httpx. So perhaps it's the faraday/datadog combo wrapped around httpx which is causing issues. Would it be possible for you to send a sample request from your environment with logs turned on (via the HTTPX_DEBUG=2 env var or the :debug/:debug_level options) and post the log dump here (with any data you consider PII redacted)?

@HoneyryderChuck
Owner

I've added a test file to the suite testing faraday/datadog/httpx in tandem, where only the faraday datadog integration is turned on. You can see here the result: there are two tests failing related to distributed tracing.

The issue seems to be that, when using faraday, and doing a faraday http request from within an active trace, the faraday span is pushed before the active span, which seems incorrect (it should be the other way around). I've tested it with net/http, and the behaviour is the same. This seems to be a faraday datadog integration issue, which should be reported to them.

@hmcfletch
Author

I think there was some miscommunication, though I am glad this uncovered another bug. I was originally using faraday with net/http, but am wanting to switch to purely httpx. The issues I am seeing are when I am using a purely httpx solution (not with httpx as a faraday adapter). Been sick the last few days, so hopefully I can get some of that debug output to you soon.

@HoneyryderChuck
Owner

HoneyryderChuck commented Jan 24, 2025

I think there was some miscommunication, though I am glad this uncovered another bug. I was originally using faraday with net/http, but am wanting to switch to purely httpx.

I see, so your endgame is to ditch faraday completely and use only httpx, not switching your faraday adapter to httpx. Understood.

I'll need those request log dumps to work then. I'd suggest you switch between both versions of your application, and enable logs on both. This is how you can do it for both variants:

# for faraday with net-http
conn = Faraday.new(...) do |f|
  f.adapter :net_http do |http|
    http.set_debug_output($stdout)
  end
end

# for httpx
# either set env var HTTPX_DEBUG=2 , or:
http = HTTPX.with(debug: $stdout, debug_level: 2)

Get well soon.
