feat: add native OpenTelemetry instrumentation #6304
Summary: This PR adds native OpenTelemetry tracing around crew, task, agent, tool, LLM, memory, knowledge, A2A, and flow execution, plus context propagation for trace continuity. No exploitable security vulnerabilities were identified in the added code.
Risk: Low risk. The changes introduce observability instrumentation and do not add public endpoints, authentication changes, authorization logic, filesystem access, SQL construction, or new untrusted network/file inputs.
📝 Walkthrough
Adds native OpenTelemetry instrumentation to CrewAI by introducing
Changes
Native OpenTelemetry Instrumentation
Suggested Reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
with operation("failing op"):
    raise RuntimeError("boom")

finished = span_exporter.get_finished_spans()

with operation("doubly recorded"):
    raise RuntimeError("once")

span = span_exporter.get_finished_spans()[0]

with operation("cancelled op"):
    raise asyncio.CancelledError("cancel")

span = span_exporter.get_finished_spans()[0]
Actionable comments posted: 6
🧹 Nitpick comments (1)
lib/crewai/tests/telemetry/test_otel.py (1)
85-91: 📐 Maintainability & Code Quality | 🔵 Trivial
Isolate the private OpenTelemetry internals behind a versioned helper for clarity.
`trace._TRACER_PROVIDER_SET_ONCE._done` and `trace._TRACER_PROVIDER` are undocumented private symbols; they are not part of the public API and relying on them across version upgrades causes stability issues. While the assertion guard `assert actual is _SHARED_PROVIDER` will catch a hard break, it surfaces only as a confusing test failure rather than a clear signal. Wrap this in a small helper function with a comment documenting the supported `opentelemetry-api` version (currently `~1.34.0`) and verify these attributes exist at initialization time, so future maintainers understand the fragility and constraints.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@lib/crewai/tests/telemetry/test_otel.py` around lines 85 - 91, The test code directly accesses undocumented private OpenTelemetry internals (trace._TRACER_PROVIDER_SET_ONCE._done and trace._TRACER_PROVIDER) which are not part of the public API and lack version stability guarantees. Create a small helper function that encapsulates this private attribute access and verify these attributes exist at initialization time, add a documentation comment specifying the supported opentelemetry-api version constraint (currently ~1.34.0), and call this helper from the test instead of directly manipulating the private symbols. This way future maintainers will understand the version fragility and the assertion guard will provide clearer context if something breaks.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@lib/crewai/src/crewai/flow/runtime/__init__.py`:
- Around line 2493-2525: The `HumanFeedbackPending` exception is escaping the
`operation("execute flow", ...)` context manager before being caught by an outer
exception handler, causing the span to be marked as an error even though it
represents expected control flow. Wrap the code inside the operation context
that calls `self._execute_start_method(start_method)` and
`asyncio.gather(*tasks)` in a try-except block that catches
`HumanFeedbackPending` separately, re-raises it after the operation context
closes so the outer exception handler can process it without the operation span
recording it as an error.
- Around line 2852-2867: The issue is that when a sync method returns a
coroutine, the await is happening after the operation span exits, placing part
of the execution outside the tracing context. To fix this, after executing the
sync method in the else block (where asyncio.to_thread is used with ctx.run),
check if the result is a coroutine using asyncio.iscoroutine(result), and if so,
await it while still inside the operation context manager block. This ensures
the complete execution of coroutine-returning sync methods stays within the
"execute flow method" operation span.
In `@lib/crewai/src/crewai/llms/providers/anthropic/completion.py`:
- Around line 301-303: The instrumentation code added to the context manager at
the llm_call_context() call (around lines 301-303) and at line 378-380 is
logically correct but violates Ruff formatting rules, causing CI to fail. Run
the Ruff formatter on the file by executing the command `uv run ruff format
lib/crewai/src/crewai/llms/providers/anthropic/completion.py` to automatically
apply the required formatting changes, then commit and push the formatted output
to unblock CI.
In `@lib/crewai/src/crewai/llms/providers/openai/completion.py`:
- Around line 414-416: The ruff formatter is detecting formatting issues in the
completion.py file that are blocking CI. Run the command to automatically format
the file using ruff, which will resolve the formatting violations detected by
ruff format --check. Specifically, apply ruff formatting to the file containing
the llm_call_context and operation function calls around the "call llm"
operation context, then commit the formatted changes.
In `@lib/crewai/src/crewai/telemetry/otel.py`:
- Line 1: The file lib/crewai/src/crewai/telemetry/otel.py has formatting issues
detected by the Ruff formatter that are blocking CI. Run the command `uv run
ruff format lib/crewai/src/crewai/telemetry/otel.py` to automatically apply the
required formatting standards to the file, or run `uv run ruff format lib/` to
format the entire lib directory. This will resolve the formatting drift and
allow the CI pipeline to pass.
In `@lib/crewai/tests/telemetry/test_otel.py`:
- Around line 567-570: The ruff format check is failing because several
expressions are unnecessarily split across multiple lines when they fit within
the 88 character limit. Collapse the manually wrapped expressions in the
test_otel.py file: the ThreadPoolExecutor pool.submit(...).result() blocks
(found around lines 568-570, 591-593, 612-614, and 635-637) and the
single-argument assert statements (around lines 498-500 and 573-575) should each
be collapsed onto a single line where they fit under 88 characters. Run uv run
ruff format lib/ to automatically fix all formatting issues in the directory.
---
Nitpick comments:
In `@lib/crewai/tests/telemetry/test_otel.py`:
- Around line 85-91: The test code directly accesses undocumented private
OpenTelemetry internals (trace._TRACER_PROVIDER_SET_ONCE._done and
trace._TRACER_PROVIDER) which are not part of the public API and lack version
stability guarantees. Create a small helper function that encapsulates this
private attribute access and verify these attributes exist at initialization
time, add a documentation comment specifying the supported opentelemetry-api
version constraint (currently ~1.34.0), and call this helper from the test
instead of directly manipulating the private symbols. This way future
maintainers will understand the version fragility and the assertion guard will
provide clearer context if something breaks.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 120c8a32-83fb-4d7a-882b-bc510eeca1d9
📒 Files selected for processing (25)
- docs/docs.json
- docs/edge/en/changelog.mdx
- docs/edge/en/observability/opentelemetry.mdx
- lib/crewai/src/crewai/a2a/utils/delegation.py
- lib/crewai/src/crewai/agent/core.py
- lib/crewai/src/crewai/crew.py
- lib/crewai/src/crewai/events/event_bus.py
- lib/crewai/src/crewai/flow/runtime/__init__.py
- lib/crewai/src/crewai/knowledge/knowledge.py
- lib/crewai/src/crewai/llm.py
- lib/crewai/src/crewai/llms/providers/anthropic/completion.py
- lib/crewai/src/crewai/llms/providers/azure/completion.py
- lib/crewai/src/crewai/llms/providers/bedrock/completion.py
- lib/crewai/src/crewai/llms/providers/gemini/completion.py
- lib/crewai/src/crewai/llms/providers/openai/completion.py
- lib/crewai/src/crewai/memory/unified_memory.py
- lib/crewai/src/crewai/task.py
- lib/crewai/src/crewai/tasks/llm_guardrail.py
- lib/crewai/src/crewai/telemetry/__init__.py
- lib/crewai/src/crewai/telemetry/otel.py
- lib/crewai/src/crewai/tools/base_tool.py
- lib/crewai/src/crewai/tools/structured_tool.py
- lib/crewai/src/crewai/utilities/reasoning_handler.py
- lib/crewai/tests/telemetry/test_otel.py
- lib/crewai/tests/telemetry/test_otel_noop.py
with ThreadPoolExecutor(max_workers=2) as pool:
    inner = pool.submit(
        contextvars.copy_context().run, _task
    ).result()
📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win
CI is red on `ruff format`; collapse these manually wrapped expressions.
The lint job fails because several expressions here are wrapped across lines that `ruff format` will collapse onto a single line (each fits well under 88 cols). Representative spots: the `pool.submit(...).result()` blocks at Lines 568-570, 591-593, 612-614, 635-637 and the single-arg `assert` blocks at Lines 498-500 and 573-575. Run `uv run ruff format lib/` to fix all reformatted files.
Source: Pipeline failures
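The `pool.submit(...)` call quoted above passes `contextvars.copy_context().run` rather than the task itself. A minimal standard-library sketch of why that matters: `ThreadPoolExecutor.submit` does not copy the caller's `contextvars` state for the worker, so a value set before submission is only guaranteed to be visible when the callable runs inside a copied context. The `current_span` variable and the body of `_task` are hypothetical stand-ins, not the PR's test code.

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the slot where a tracing library keeps the active span.
current_span = contextvars.ContextVar("current_span", default=None)


def _task():
    # Report whatever "span" is visible from inside the worker thread.
    return current_span.get()


def run_both():
    current_span.set("parent-span")
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Plain submit: on standard CPython builds the worker does not see the
        # caller's context, so this usually yields the default value.
        bare = pool.submit(_task).result()
        # Copy the caller's context and run the task inside that copy.
        inner = pool.submit(contextvars.copy_context().run, _task).result()
    return bare, inner


bare, inner = run_both()
```

Here `inner` carries the value set by the submitting thread, which is the behavior the propagation tests rely on. The collapsed single-line form of the `submit` call is the layout `ruff format` is reported to prefer.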
Open spans directly on the user's thread so that stdlib log records emitted during hot paths like `Crew.kickoff`, `BaseTool.run`, and `LLM.call` carry the active trace context and correlate with the spans they belong to, a gap the previous metrics-only telemetry could not close. Introduces a `crewai.telemetry.otel` module exposing `operation` and `follows_from`, instruments the execution hot paths, and propagates the active context across every parallel-dispatch site. Depends only on `opentelemetry-api` so provider and exporter choice stays with the host application per the standard OTel library pattern; without an installed SDK the `ProxyTracer` keeps everything as a NoOp.

Co-authored-by: Cursor <cursoragent@cursor.com>
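The log-correlation goal in this commit message can be illustrated with the standard library alone. The sketch below is an analogy, not the PR's implementation: `trace_id_var`, `TraceContextFilter`, and this toy `operation` are hypothetical names, and a real OpenTelemetry setup would read the active span context instead of a hand-rolled `ContextVar`. It shows that a log record emitted while a "span" is open on the same thread can be stamped with that span's identifier.

```python
import contextvars
import io
import logging
from contextlib import contextmanager

# Hypothetical stand-in for the active trace context.
trace_id_var = contextvars.ContextVar("trace_id", default="-")


class TraceContextFilter(logging.Filter):
    """Stamp every record with the trace id visible when it was emitted."""

    def filter(self, record):
        record.trace_id = trace_id_var.get()
        return True


@contextmanager
def operation(name, trace_id):
    # Toy span: set the id on entry, restore the previous value on exit.
    token = trace_id_var.set(trace_id)
    try:
        yield
    finally:
        trace_id_var.reset(token)


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(trace_id)s %(message)s"))
handler.addFilter(TraceContextFilter())

logger = logging.getLogger("otel_sketch")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False

with operation("crew kickoff", trace_id="abc123"):
    logger.info("inside span")
logger.info("outside span")

lines = stream.getvalue().splitlines()
```

The first captured line is prefixed with `abc123` and the second with the default `-`, which mirrors the idea that only records emitted while a span is active on the calling thread pick up its context.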
fdfe593 to
7cc74cb
with operation("paused op", expected_exceptions=(_ExpectedPause,)):
    raise _ExpectedPause("pause")

span = span_exporter.get_finished_spans()[0]
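The test excerpt above exercises an `expected_exceptions` argument on `operation`. As a rough illustration of the idea, assuming nothing about how `crewai.telemetry.otel` actually implements it, the toy context manager below records a span as a plain dict and leaves its status untouched when the raised exception is in the expected set. The `finished_spans` list, the status strings, and the `run` helper are all hypothetical.

```python
from contextlib import contextmanager

# Toy in-memory record of finished spans; the real tests read from an SDK exporter.
finished_spans = []


class _ExpectedPause(Exception):
    """Control-flow signal that should not count as a failure."""


@contextmanager
def operation(name, expected_exceptions=()):
    span = {"name": name, "status": "UNSET", "events": []}
    try:
        yield span
    except expected_exceptions:
        # Expected control flow: leave the status alone and re-raise.
        raise
    except Exception as exc:
        # Anything else is recorded as an error before propagating.
        span["status"] = "ERROR"
        span["events"].append(type(exc).__name__)
        raise
    finally:
        finished_spans.append(span)


def run(name, exc, expected=()):
    # Raise `exc` inside a span, swallow it, and return the finished span.
    try:
        with operation(name, expected_exceptions=expected):
            raise exc
    except type(exc):
        pass
    return finished_spans[-1]


paused = run("paused op", _ExpectedPause("pause"), expected=(_ExpectedPause,))
failed = run("failing op", RuntimeError("boom"))
```

In this sketch the paused span keeps its `UNSET` status while the failing span is marked `ERROR`, and the exception still propagates to the caller in both cases so outer handlers can act on it.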
Address review feedback on the native OpenTelemetry instrumentation
7cc74cb to
4d6ff2c
`test_otel.py`'s `span_exporter` fixture installed an SDK `TracerProvider` once via module-level globals and never restored the default `ProxyTracerProvider`, so `test_otel_noop.py`'s unconfigured-default-state assertions failed whenever the two files ran on the same worker. Install the SDK provider fresh per test and reset the global slot back to `ProxyTracerProvider` in `finally`; `_tracer()` re-resolves on every span so swapping providers between tests is safe.
`Telemetry.set_tracer()` installed crewAI's anonymous SDK
`TracerProvider` into OpenTelemetry's process-global slot, so the first
`Crew` constructed in a test or host application replaced the default
`ProxyTracerProvider` and exfiltrated every host span emitted via
`trace.get_tracer(...)` to crewAI's OTLP endpoint. Keep the provider
local to the `Telemetry` instance and route every anonymous span
through `self.provider.get_tracer("crewai.telemetry")` so the global
slot stays untouched. Mirrors the fix in `crewai_core.telemetry`,
drops the now-dead `set_tracer()` calls in `event_listener.py` and
`crewai_cli.command`, and adds regression coverage that asserts the
provider stays a `ProxyTracerProvider` after constructing a `Crew`.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 99b87e8.
# whole call stays inside the "execute flow method" span
# (enables AgentExecutor pattern).
if asyncio.iscoroutine(result):
    result = await result
Flow method span ends early
Low Severity
The new `execute flow method` span wraps only the method body and auto-awaited coroutine, but `_run_human_feedback_step` runs after that block ends. If feedback fails or raises `HumanFeedbackPending`, the method span is already finished with a non-error status, so traces can show a successful method while the flow is paused or failed in the feedback step.
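The timing issue in this finding can be reproduced in miniature. The sketch below is an assumption-laden toy, not the flow runtime: `operation` here is a dict-recording stand-in, and `sync_method`, `feedback_step`, and `execute` are hypothetical names. It runs a sync callable through `asyncio.to_thread` with a copied context, awaits a returned coroutine while the span is still open, and then compares a failing follow-up step placed outside versus inside the span.

```python
import asyncio
import contextvars
from contextlib import contextmanager

spans = []  # toy record of finished spans, not the PR's exporter


@contextmanager
def operation(name):
    span = {"name": name, "status": "OK"}
    try:
        yield span
    except Exception:
        span["status"] = "ERROR"
        raise
    finally:
        spans.append(span)


async def _inner():
    return "done"


def sync_method():
    # A sync method that hands back a coroutine instead of a value.
    return _inner()


def feedback_step():
    # Hypothetical follow-up step that fails.
    raise RuntimeError("feedback failed")


async def execute(wrap_feedback):
    ctx = contextvars.copy_context()
    try:
        with operation("execute flow method"):
            result = await asyncio.to_thread(ctx.run, sync_method)
            if asyncio.iscoroutine(result):
                result = await result  # awaited while the span is still open
            if wrap_feedback:
                feedback_step()  # failure is seen by the span
        if not wrap_feedback:
            feedback_step()  # runs after the span has already finished
    except RuntimeError:
        pass
    return spans[-1]["status"]


outside = asyncio.run(execute(wrap_feedback=False))
inside = asyncio.run(execute(wrap_feedback=True))
```

With the follow-up step outside the block the recorded status stays `OK`, and with it inside the status becomes `ERROR`. Whether the real feedback step belongs inside the method span, or in a span of its own, is a design decision for the PR author.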


Open spans directly on the user's thread so that stdlib log records emitted during hot paths like `Crew.kickoff`, `BaseTool.run`, and `LLM.call` carry the active trace context and correlate with the spans they belong to, a gap the previous metrics-only telemetry could not close. Introduces a `crewai.telemetry.otel` module exposing `operation` and `follows_from`, instruments the execution hot paths, and propagates the active context across every parallel-dispatch site. Depends only on `opentelemetry-api` so provider and exporter choice stays with the host application per the standard OTel library pattern; without an installed SDK the `ProxyTracer` keeps everything as a NoOp.

Note
Medium Risk
Touches every major execution path (crew/task/agent/LLM/tools/flows) and changes telemetry global-provider behavior; mitigated by NoOp defaults without an SDK and broad new test coverage.
Overview
Adds native OpenTelemetry spans on the host's global tracer via new `crewai.telemetry.otel` (`operation`, `follows_from`), wrapping crew kickoff, tasks, agents, flows (including resume/method execution), LLM calls, tools, memory, knowledge, A2A delegation, guardrails, and agent reasoning.

Anonymous CrewAI OTLP telemetry no longer calls `set_tracer()`; it keeps a private `TracerProvider` and emits only through `_tracer()`, so importing or running CrewAI does not take over the process-wide OTel provider. CLI/`EventListener` stop forcing global tracer installation.

Trace continuity fixes: the event bus re-attaches OTel context when dispatching async handlers via `run_coroutine_threadsafe`; structured tools run sync callables with `contextvars.copy_context()` in the executor so nested spans and log correlation survive thread hops.

Flow HITL pauses can opt out of ERROR spans via `expected_exceptions`; resume opens a dedicated resume flow span (with docs for optional `follows_from` links). Expanded tests cover span nesting, log correlation, context propagation sites, and no-op behavior without an SDK provider.

Reviewed by Cursor Bugbot for commit 99b87e8. Bugbot is set up for automated code reviews on this repo.
Summary by CodeRabbit
Release Notes
New Features
Bug Fixes
Tests