feat: Integrate OpenTelemetry by khvn26 · Pull Request #189 · Flagsmith/flagsmith-common

khvn26 · 2026-03-30T18:59:56Z

Contributes to #182.

Adds OpenTelemetry integration to flagsmith-common, enabling trace and structured log export over OTLP.

Tracing

When OTEL_EXPORTER_OTLP_ENDPOINT is set, ensure_cli_env() configures:

TracerProvider with OTLP/HTTP span export, W3C TraceContext + Baggage propagation
DjangoInstrumentor — creates a root span per HTTP request
Psycopg2Instrumentor — creates child spans for each SQL query with db.system, db.statement, and db.name attributes. SQL commenter is enabled, adding trace context as SQL comments for database-side correlation.
RedisInstrumentor — creates child spans for each Redis command

RouteLoggerMiddleware renames spans using get_route_template (e.g. GET /api/v1/projects/{pk}/) and sets http.route to the same normalised route. URL paths can be excluded from tracing via OTEL_TRACING_EXCLUDED_URL_PATHS.

Structured logs

A custom structlog processor emits each log event as:

An OTLP log record with body, event_name, severity_text, severity_number, and attributes (double-underscore keys are converted to dots, e.g. organisation__id → organisation.id). W3C baggage entries from the current OTel context are copied into log attributes. A flagsmith.event attribute duplicates the event name for backends that don't surface EventName.
A span event on the active span (if one exists) with the same name and attributes, making structlog events visible in trace backends (e.g. SigNoz's Events tab).

A companion processor add_otel_trace_context injects trace_id and span_id from the active span into the structlog event dict for JSON log output.

Configuration

Variable	Description	Default
`OTEL_EXPORTER_OTLP_ENDPOINT`	Base OTLP endpoint (e.g. `http://collector:4318`). If unset, no OTel setup occurs.	(disabled)
`OTEL_SERVICE_NAME`	The `service.name` resource attribute.	`flagsmith-api`
`OTEL_TRACING_EXCLUDED_URL_PATHS`	Comma-separated URL paths to exclude from tracing.	(none)

How to review

Code structure

The entry point is src/common/core/main.py — the if otel_endpoint: block in ensure_cli_env() conditionally imports and wires up OTel. When the env var is unset, no OTel code is imported.

All OTel logic lives in src/common/core/otel.py:

setup_tracing() — context manager that sets the global TracerProvider, configures W3C propagators, and instruments Django, psycopg2, and Redis. Teardown uninstruments in reverse order and shuts down the provider.
make_structlog_otel_processor() — factory that returns a structlog processor. The processor maps event dicts to OTLP log records (attribute key conversion, reserved key filtering, EventName construction, baggage propagation) and attaches them as span events on the active span. The flagsmith.event attribute is a workaround for SigNoz not surfacing OTel's EventName field.
build_tracer_provider() / build_otel_log_provider() — factory functions for the trace and log providers, both using batch processors with OTLP/HTTP export.
add_otel_trace_context — structlog processor that injects trace_id/span_id into the event dict.

src/common/gunicorn/middleware.py has one addition: span.set_attribute("http.route", route_template) — overrides the raw regex from Django's instrumentor with the normalised route template.

pyproject.toml adds opentelemetry-instrumentation-psycopg2, opentelemetry-instrumentation-redis, and redis to the common-core extra.

Testing

Unit tests (tests/unit/common/core/test_otel.py) — test the structlog processor in isolation using InMemoryLogExporter, and verify instrumentor setup/teardown via mocks.
Integration tests (tests/integration/core/test_otel.py) — test the full stack with real Django test client, real database, and InMemorySpanExporter. The tracing tests assert the complete span output (exact span names, kinds, attributes, parent-child relationships, SQL statements) rather than spot-checking individual fields.

The integration tracing tests use a module-scoped fixture because OTel only allows setting the global TracerProvider once per process. Per-test isolation is achieved via the span_exporter fixture which clears the exporter between tests.

codecov-commenter · 2026-03-30T20:07:05Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 96.88%. Comparing base (2e0459d) to head (7657026).
⚠️ Report is 5 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #189      +/-   ##
==========================================
+ Coverage   96.48%   96.88%   +0.40%     
==========================================
  Files          97      101       +4     
  Lines        3552     3978     +426     
==========================================
+ Hits         3427     3854     +427     
+ Misses        125      124       -1

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

emyller

Looks good overall. Left a few questions around, none a blocker I believe. Also, couple non-action feedback:

I think context-manager-based design makes it trippy to understand what's going on.
Both unit and integration tests look very similar in purpose sometimes. Feels duplicated — but I also appreciate both unit and integration test suites are accomplishing their goal here. Still, in practice, are we maybe probing for the same thing twice?

src/common/core/main.py

src/common/core/otel.py

README.md

tests/integration/core/test_otel.py

khvn26 · 2026-04-01T12:10:23Z

@emyller

I think context-manager-based design makes it trippy to understand what's going on.

It's essential to handle graceful shutdowns in a manageable way. Without this, we'll lose spans.

Both unit and integration tests look very similar in purpose sometimes

Fair observation. The difference is that unit tests configure structlog directly while integration tests go through setup_logging().

I inventoried all tests and found three unit tests that were genuinely redundant with integration counterparts.

Dropped all three in 7657026. The remaining unit tests cover edge cases the integration tests don't exercise: kebab-case normalisation, dunder vs single underscore distinction, all 5 severity levels (parametrised), module name fallback when no logger name is given, span event emission, and the factory/lifecycle functions.

emyller

LET'S GO!

khvn26 added 6 commits March 30, 2026 19:58

wip

4628a95

cleaner api, excluded paths

039e8a2

same span names as log_extra, openapi

dad413c

improve instrumentation scope

a4aa83f

add docs

f11be93

add trace_id and span_id to structured logs

d3f811e

khvn26 requested a review from a team as a code owner March 30, 2026 18:59

khvn26 requested review from emyller and removed request for a team March 30, 2026 18:59

fix test

850b26e

nicer fixture

0ffbeaf

khvn26 force-pushed the feat/otel branch from b95c161 to 0ffbeaf Compare March 30, 2026 20:09

khvn26 added 8 commits March 30, 2026 21:16

even nicer fixture

80faba4

add missing tests

241a6d5

add psycopg2, redis instrumenters

065d535

add span events as well

0bc8d45

add flagsmith.event

f61e6a9

use normalised routes for http.route attribute values

56b0b3e

restore severity text

9f3f6a1

fix tests

0a3f1cf

khvn26 mentioned this pull request Mar 31, 2026

OTel: Sampling support #190

Open

emyller reviewed Apr 1, 2026

View reviewed changes

khvn26 added 3 commits April 1, 2026 12:55

fix docs

eabc583

simplify baggage test: remove unnecessary try/finally

bbe9107

remove redundant tests

7657026

khvn26 requested a review from emyller April 1, 2026 12:10

account for empty log bodies

bad5a92

emyller approved these changes Apr 1, 2026

View reviewed changes

khvn26 merged commit 2134b53 into main Apr 1, 2026
4 checks passed

flagsmithdev mentioned this pull request Apr 1, 2026

chore(main): release 3.6.0 #188

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: Integrate OpenTelemetry#189

feat: Integrate OpenTelemetry#189
khvn26 merged 20 commits intomainfrom
feat/otel

khvn26 commented Mar 30, 2026 •

edited

Loading

Uh oh!

codecov-commenter commented Mar 30, 2026 •

edited

Loading

Uh oh!

emyller left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

khvn26 commented Apr 1, 2026

Uh oh!

emyller left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

khvn26 commented Mar 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Tracing

Structured logs

Configuration

How to review

Code structure

Testing

Uh oh!

codecov-commenter commented Mar 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

emyller left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

khvn26 commented Apr 1, 2026

Uh oh!

emyller left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

khvn26 commented Mar 30, 2026 •

edited

Loading

codecov-commenter commented Mar 30, 2026 •

edited

Loading