feat: add native OpenTelemetry instrumentation by lucasgomide · Pull Request #6304 · crewAIInc/crewAI

lucasgomide · 2026-06-23T13:04:01Z

Open spans directly on the user's thread so that stdlib log records emitted during hot paths like Crew.kickoff, BaseTool.run, and LLM.call carry the active trace context and correlate with the spans they belong to — a gap the previous metrics-only telemetry could not close. Introduces a crewai.telemetry.otel module exposing operation and follows_from, instruments the execution hot paths, and propagates the active context across every parallel-dispatch site. Depends only on opentelemetry-api so provider and exporter choice stays with the host application per the standard OTel library pattern; without an installed SDK the ProxyTracer keeps everything as a NoOp.

Note

Medium Risk
Touches every major execution path (crew/task/agent/LLM/tools/flows) and changes telemetry global-provider behavior; mitigated by NoOp defaults without an SDK and broad new test coverage.

Overview
Adds native OpenTelemetry spans on the host’s global tracer via new crewai.telemetry.otel (operation, follows_from), wrapping crew kickoff, tasks, agents, flows (including resume/method execution), LLM calls, tools, memory, knowledge, A2A delegation, guardrails, and agent reasoning.

Anonymous CrewAI OTLP telemetry no longer calls set_tracer() — it keeps a private TracerProvider and emits only through _tracer(), so importing or running CrewAI does not take over the process-wide OTel provider. CLI/EventListener stop forcing global tracer installation.

Trace continuity fixes: the event bus re-attaches OTel context when dispatching async handlers via run_coroutine_threadsafe; structured tools run sync callables with contextvars.copy_context() in the executor so nested spans and log correlation survive thread hops.
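The thread-hop half of this fix can be illustrated with the standard library alone. The sketch below is not the PR's code: `current_span` is a hypothetical `ContextVar` standing in for the slot where the active span lives, and `run_tool` is an invented helper. It shows how a snapshot taken with `contextvars.copy_context()` on the caller's thread makes the caller's value visible inside an executor worker.

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the "active span" slot (an assumption for this demo).
current_span = contextvars.ContextVar("current_span", default="<none>")


def read_span():
    """Return whatever span name is visible to the calling thread."""
    return current_span.get()


def run_tool(propagate):
    """Run read_span on a worker thread, optionally carrying the caller's context."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        if propagate:
            # Snapshot is taken here, on the caller's thread.
            ctx = contextvars.copy_context()
            return pool.submit(ctx.run, read_span).result()
        return pool.submit(read_span).result()


token = current_span.set("call tool")
try:
    without_ctx = run_tool(propagate=False)
    with_ctx = run_tool(propagate=True)
finally:
    current_span.reset(token)

print("without copy_context:", without_ctx)
print("with copy_context:   ", with_ctx)  # call tool
```

Without the snapshot, what the worker sees depends on how the interpreter seeds new threads, which is why the explicit `ctx.run` hand-off is the safer choice for nested spans and log correlation.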

Flow HITL pauses can opt out of ERROR spans via expected_exceptions; resume opens a dedicated resume flow span (with docs for optional follows_from links). Expanded tests cover span nesting, log correlation, context propagation sites, and no-op behavior without an SDK provider.
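As a rough illustration of the `expected_exceptions` opt-out, here is a toy context manager that mimics the described behavior without any OpenTelemetry dependency. `FakeSpan`, `toy_operation`, `ExpectedPause`, and `status_after` are all hypothetical names; the real `operation()` signature and status handling may differ.

```python
from contextlib import contextmanager


class FakeSpan:
    """Hypothetical stand-in for a span; records only a name and a status."""

    def __init__(self, name):
        self.name = name
        self.status = "UNSET"


class ExpectedPause(Exception):
    """Control-flow signal, playing the role of HumanFeedbackPending."""


@contextmanager
def toy_operation(name, expected_exceptions=()):
    span = FakeSpan(name)
    try:
        yield span
    except Exception as exc:
        # Expected exceptions still propagate, but leave the status untouched.
        if not isinstance(exc, tuple(expected_exceptions)):
            span.status = "ERROR"
        raise


def status_after(exc, expected=()):
    """Raise exc inside a toy span and report the span's final status."""
    captured = None
    try:
        with toy_operation("demo", expected_exceptions=expected) as span:
            captured = span
            raise exc
    except Exception:
        pass
    return captured.status


print(status_after(RuntimeError("boom")))                         # ERROR
print(status_after(ExpectedPause("pause"), (ExpectedPause,)))     # UNSET
```

The point of the design is that a pause is re-raised to the caller either way; only the error marking on the span changes.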

^{Reviewed by Cursor Bugbot for commit 99b87e8. Bugbot is set up for automated code reviews on this repo.}

Summary by CodeRabbit

Release Notes

New Features
- Added OpenTelemetry instrumentation throughout the execution pipeline, including crew kickoff, task runs, agent reasoning, flow phases, LLM calls, tool usage, knowledge queries, and memory operations.
- Improved trace context propagation across async and thread boundaries for consistent end-to-end observability.
Bug Fixes
- Refined telemetry span behavior for expected/control-flow exceptions so they don’t incorrectly mark spans as errors.
Tests
- Added OpenTelemetry test coverage for span creation, exception-to-status handling, follows-from links, log/trace correlation, and multi-thread/context propagation.
- Added no-op telemetry tests to ensure behavior remains safe when no tracing provider is configured.

corridor-security

Summary: This PR adds native OpenTelemetry tracing around crew, task, agent, tool, LLM, memory, knowledge, A2A, and flow execution, plus context propagation for trace continuity. No exploitable security vulnerabilities were identified in the added code.

Risk: Low risk. The changes introduce observability instrumentation and do not add public endpoints, authentication changes, authorization logic, filesystem access, SQL construction, or new untrusted network/file inputs.

coderabbitai · 2026-06-23T13:05:48Z

📝 Walkthrough

Adds native OpenTelemetry instrumentation to CrewAI by introducing operation() and follows_from() primitives in a new crewai.telemetry.otel module, then wrapping every major execution stage—crew, task, agent, flow, LLM providers, tools, memory, knowledge, guardrail, reasoning, and A2A delegation—in named spans, while propagating OTel context across thread/loop boundaries in the event bus and structured tool executor.

Changes

Native OpenTelemetry Instrumentation

Layer / File(s)	Summary
OTel core primitives: `operation()` and `follows_from()` `lib/crewai/src/crewai/telemetry/otel.py`, `lib/crewai/src/crewai/telemetry/__init__.py`	Adds `_tracer()` late-resolving the global provider, `operation()` context manager with explicit exception handling and new `expected_exceptions` parameter to exempt control-flow exceptions from error recording, and `follows_from()` for causal span links. Updates `__all__` to export both helpers alongside `Telemetry`.
Cross-thread OTel context propagation `lib/crewai/src/crewai/events/event_bus.py`, `lib/crewai/src/crewai/tools/structured_tool.py`	Introduces `_ctx_run_coro` to attach/detach OTel context around async coroutines scheduled onto the background loop thread; applies it to all four dispatch sites in `emit()` and `replay()`. Updates `ainvoke` to propagate `contextvars` into the executor thread via `ctx.run`.
Crew and Task execution spans `lib/crewai/src/crewai/crew.py`, `lib/crewai/src/crewai/task.py`	Wraps `kickoff()`/`akickoff()` in `operation("execute crew", ...)` and `_execute_core()`/`_aexecute_core()` in `operation("execute task", ...)` with crew/task identity attributes.
Agent execution spans `lib/crewai/src/crewai/agent/core.py`	Wraps `execute_task`, `aexecute_task`, `kickoff`, and `kickoff_async` in `operation("execute agent", {agent.role, agent.id})` spans.
Flow runtime spans `lib/crewai/src/crewai/flow/runtime/__init__.py`	Wraps flow resume, kickoff, and per-method execution in `"resume flow"`, `"execute flow"`, and `"execute flow method"` operations, all with `expected_exceptions=(HumanFeedbackPending,)`.
LLM provider call spans `lib/crewai/src/crewai/llm.py`, `lib/crewai/src/crewai/llms/providers/...`	Adds `operation("call llm", {"crewai.llm.model": ...})` alongside `llm_call_context()` for both sync and async call paths in the base `LLM` class and all five provider implementations (Anthropic, Azure, Bedrock, Gemini, OpenAI).
Tool, Knowledge, Memory, and A2A delegation spans `lib/crewai/src/crewai/tools/base_tool.py`, `lib/crewai/src/crewai/knowledge/knowledge.py`, `lib/crewai/src/crewai/memory/unified_memory.py`, `lib/crewai/src/crewai/a2a/utils/delegation.py`	Wraps `BaseTool`/`Tool` run methods in `"call tool"` spans, `Knowledge.query`/`aquery` in `"query knowledge"` spans, `UnifiedMemory.remember`/`recall` in `"remember/recall memory"` spans, and `aexecute_a2a_delegation` in `"a2a delegate"` spans with failure event enrichment.
Guardrail and Reasoning spans `lib/crewai/src/crewai/tasks/llm_guardrail.py`, `lib/crewai/src/crewai/utilities/reasoning_handler.py`	Wraps `LLMGuardrail.__call__` in `operation("guard llm", ...)` and `AgentReasoning.handle_agent_reasoning` planning call in `operation("agent reason", ...)`.
OTel instrumentation integration tests `lib/crewai/tests/telemetry/test_otel.py`	Validates span creation, attributes, error recording, `expected_exceptions` semantics, `follows_from` link construction, nested hot-path trace coherence with shared `trace_id`, log↔trace ID correlation, and context propagation across all cross-thread dispatch patterns.
OTel no-op behavior tests `lib/crewai/tests/telemetry/test_otel_noop.py`	Verifies proxy tracer provider default, non-recording spans when no SDK is installed, and clean kickoff execution without raising or replacing the global provider.

Suggested Reviewers

lorenzejay
greysonlalonde
joaomdmoura

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 53.06% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The PR title accurately describes the primary objective of this comprehensive changeset: adding native OpenTelemetry instrumentation throughout crewAI's execution hot paths.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

+            with operation("failing op"):
+                raise RuntimeError("boom")
+
+        finished = span_exporter.get_finished_spans()


+            with operation("doubly recorded"):
+                raise RuntimeError("once")
+
+        span = span_exporter.get_finished_spans()[0]


+            with operation("cancelled op"):
+                raise asyncio.CancelledError("cancel")
+
+        span = span_exporter.get_finished_spans()[0]


mintlify · 2026-06-23T13:12:29Z

Preview deployment for your docs. Learn more about Mintlify Previews.

Project	Status	Preview	Updated (UTC)
crewai	🟢 Ready	View Preview	Jun 23, 2026, 1:24 PM

coderabbitai

Actionable comments posted: 6

🧹 Nitpick comments (1)

lib/crewai/tests/telemetry/test_otel.py (1)
85-91: 📐 Maintainability & Code Quality | 🔵 Trivial

Isolate the private OpenTelemetry internals behind a versioned helper for clarity.

`trace._TRACER_PROVIDER_SET_ONCE._done` and `trace._TRACER_PROVIDER` are undocumented private symbols, not part of the public API, so they can change or disappear in any opentelemetry-api release. The guard `assert actual is _SHARED_PROVIDER` will catch a hard break, but only as a confusing test failure rather than a clear signal. Wrap the access in a small helper with a comment documenting the supported opentelemetry-api version (currently ~1.34.0), and verify that these attributes exist at initialization time, so future maintainers understand the fragility and constraints.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/crewai/tests/telemetry/test_otel.py` around lines 85 - 91, The test code
directly accesses undocumented private OpenTelemetry internals
(trace._TRACER_PROVIDER_SET_ONCE._done and trace._TRACER_PROVIDER) which are not
part of the public API and lack version stability guarantees. Create a small
helper function that encapsulates this private attribute access and verify these
attributes exist at initialization time, add a documentation comment specifying
the supported opentelemetry-api version constraint (currently ~1.34.0), and call
this helper from the test instead of directly manipulating the private symbols.
This way future maintainers will understand the version fragility and the
assertion guard will provide clearer context if something breaks.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@lib/crewai/src/crewai/flow/runtime/__init__.py`:
- Around line 2493-2525: The `HumanFeedbackPending` exception is escaping the
`operation("execute flow", ...)` context manager before being caught by an outer
exception handler, causing the span to be marked as an error even though it
represents expected control flow. Wrap the code inside the operation context
that calls `self._execute_start_method(start_method)` and
`asyncio.gather(*tasks)` in a try-except block that catches
`HumanFeedbackPending` separately, re-raises it after the operation context
closes so the outer exception handler can process it without the operation span
recording it as an error.
- Around line 2852-2867: The issue is that when a sync method returns a
coroutine, the await is happening after the operation span exits, placing part
of the execution outside the tracing context. To fix this, after executing the
sync method in the else block (where asyncio.to_thread is used with ctx.run),
check if the result is a coroutine using asyncio.iscoroutine(result), and if so,
await it while still inside the operation context manager block. This ensures
the complete execution of coroutine-returning sync methods stays within the
"execute flow method" operation span.

In `@lib/crewai/src/crewai/llms/providers/anthropic/completion.py`:
- Around line 301-303: The instrumentation code added to the context manager at
the llm_call_context() call (around lines 301-303) and at line 378-380 is
logically correct but violates Ruff formatting rules, causing CI to fail. Run
the Ruff formatter on the file by executing the command `uv run ruff format
lib/crewai/src/crewai/llms/providers/anthropic/completion.py` to automatically
apply the required formatting changes, then commit and push the formatted output
to unblock CI.

In `@lib/crewai/src/crewai/llms/providers/openai/completion.py`:
- Around line 414-416: The ruff formatter is detecting formatting issues in the
completion.py file that are blocking CI. Run the command to automatically format
the file using ruff, which will resolve the formatting violations detected by
ruff format --check. Specifically, apply ruff formatting to the file containing
the llm_call_context and operation function calls around the "call llm"
operation context, then commit the formatted changes.

In `@lib/crewai/src/crewai/telemetry/otel.py`:
- Line 1: The file lib/crewai/src/crewai/telemetry/otel.py has formatting issues
detected by the Ruff formatter that are blocking CI. Run the command `uv run
ruff format lib/crewai/src/crewai/telemetry/otel.py` to automatically apply the
required formatting standards to the file, or run `uv run ruff format lib/` to
format the entire lib directory. This will resolve the formatting drift and
allow the CI pipeline to pass.

In `@lib/crewai/tests/telemetry/test_otel.py`:
- Around line 567-570: The ruff format check is failing because several
expressions are unnecessarily split across multiple lines when they fit within
the 88 character limit. Collapse the manually wrapped expressions in the
test_otel.py file: the ThreadPoolExecutor pool.submit(...).result() blocks
(found around lines 568-570, 591-593, 612-614, and 635-637) and the
single-argument assert statements (around lines 498-500 and 573-575) should each
be collapsed onto a single line where they fit under 88 characters. Run uv run
ruff format lib/ to automatically fix all formatting issues in the directory.

📥 Commits

Reviewing files that changed from the base of the PR and between 2eb4e3a and fdfe593.

📒 Files selected for processing (25)

docs/docs.json
docs/edge/en/changelog.mdx
docs/edge/en/observability/opentelemetry.mdx
lib/crewai/src/crewai/a2a/utils/delegation.py
lib/crewai/src/crewai/agent/core.py
lib/crewai/src/crewai/crew.py
lib/crewai/src/crewai/events/event_bus.py
lib/crewai/src/crewai/flow/runtime/__init__.py
lib/crewai/src/crewai/knowledge/knowledge.py
lib/crewai/src/crewai/llm.py
lib/crewai/src/crewai/llms/providers/anthropic/completion.py
lib/crewai/src/crewai/llms/providers/azure/completion.py
lib/crewai/src/crewai/llms/providers/bedrock/completion.py
lib/crewai/src/crewai/llms/providers/gemini/completion.py
lib/crewai/src/crewai/llms/providers/openai/completion.py
lib/crewai/src/crewai/memory/unified_memory.py
lib/crewai/src/crewai/task.py
lib/crewai/src/crewai/tasks/llm_guardrail.py
lib/crewai/src/crewai/telemetry/__init__.py
lib/crewai/src/crewai/telemetry/otel.py
lib/crewai/src/crewai/tools/base_tool.py
lib/crewai/src/crewai/tools/structured_tool.py
lib/crewai/src/crewai/utilities/reasoning_handler.py
lib/crewai/tests/telemetry/test_otel.py
lib/crewai/tests/telemetry/test_otel_noop.py

coderabbitai · 2026-06-23T13:14:20Z

+            with ThreadPoolExecutor(max_workers=2) as pool:
+                inner = pool.submit(
+                    contextvars.copy_context().run, _task
+                ).result()


📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

CI is red on ruff format; collapse these manually wrapped expressions.

The lint job fails because several expressions here are wrapped across lines that ruff format will collapse onto a single line (each fits well under 88 cols). Representative spots: the pool.submit(...).result() blocks at Lines 568-570, 591-593, 612-614, 635-637 and the single-arg assert blocks at Lines 498-500 and 573-575. Run uv run ruff format lib/ to fix all reformatted files.

Source: Pipeline failures

Open spans directly on the user's thread so that stdlib log records emitted during hot paths like `Crew.kickoff`, `BaseTool.run`, and `LLM.call` carry the active trace context and correlate with the spans they belong to — a gap the previous metrics-only telemetry could not close. Introduces a `crewai.telemetry.otel` module exposing `operation` and `follows_from`, instruments the execution hot paths, and propagates the active context across every parallel-dispatch site. Depends only on `opentelemetry-api` so provider and exporter choice stays with the host application per the standard OTel library pattern; without an installed SDK the `ProxyTracer` keeps everything as a NoOp. Co-authored-by: Cursor <cursoragent@cursor.com>

+            with operation("paused op", expected_exceptions=(_ExpectedPause,)):
+                raise _ExpectedPause("pause")
+
+        span = span_exporter.get_finished_spans()[0]


Address review feedback on the native OpenTelemetry instrumentation

`test_otel.py`'s `span_exporter` fixture installed an SDK `TracerProvider` once via module-level globals and never restored the default `ProxyTracerProvider`, so `test_otel_noop.py`'s unconfigured-default-state assertions failed whenever the two files ran on the same worker. Install the SDK provider fresh per test and reset the global slot back to `ProxyTracerProvider` in `finally`; `_tracer()` re-resolves on every span so swapping providers between tests is safe.

`Telemetry.set_tracer()` installed crewAI's anonymous SDK `TracerProvider` into OpenTelemetry's process-global slot, so the first `Crew` constructed in a test or host application replaced the default `ProxyTracerProvider` and exfiltrated every host span emitted via `trace.get_tracer(...)` to crewAI's OTLP endpoint. Keep the provider local to the `Telemetry` instance and route every anonymous span through `self.provider.get_tracer("crewai.telemetry")` so the global slot stays untouched. Mirrors the fix in `crewai_core.telemetry`, drops the now-dead `set_tracer()` calls in `event_listener.py` and `crewai_cli.command`, and adds regression coverage that asserts the provider stays a `ProxyTracerProvider` after constructing a `Crew`.

linear · 2026-06-23T18:55:59Z

CON-290

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

cursor · 2026-06-23T18:58:13Z

+                    # whole call stays inside the "execute flow method" span
+                    # (enables AgentExecutor pattern).
+                    if asyncio.iscoroutine(result):
+                        result = await result


Flow method span ends early

Low Severity

The new execute flow method span wraps only the method body and auto-awaited coroutine, but _run_human_feedback_step runs after that block ends. If feedback fails or raises HumanFeedbackPending, the method span is already finished with a non-error status, so traces can show a successful method while the flow is paused or failed in the feedback step.
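To make the span boundary concrete, here is a minimal sketch of the coroutine-awaiting pattern from the diff fragment above, using a toy span that only logs start and end events. `toy_span`, `sync_method`, `inner_work`, and `execute_method` are hypothetical names; any follow-up step that runs after the `with` block (such as the feedback step this comment describes) would land after the "end" event.

```python
import asyncio
import contextvars
from contextlib import contextmanager

events = []  # ordered log of what happened, for inspection


@contextmanager
def toy_span(name):
    """Hypothetical span: records when the block starts and ends."""
    events.append(f"start {name}")
    try:
        yield
    finally:
        events.append(f"end {name}")


async def inner_work():
    events.append("coroutine body")
    return "done"


def sync_method():
    """A synchronous method that hands back a coroutine instead of a value."""
    events.append("sync body")
    return inner_work()


async def execute_method():
    with toy_span("execute flow method"):
        ctx = contextvars.copy_context()
        result = await asyncio.to_thread(ctx.run, sync_method)
        # Await inside the block so the coroutine's work is covered by the span.
        if asyncio.iscoroutine(result):
            result = await result
    return result


final = asyncio.run(execute_method())
print(final)   # done
print(events)
```

Moving the `await result` line outside the `with` block would put "coroutine body" after "end execute flow method", which is the kind of gap both review comments on this method are concerned with.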

github-actions Bot added the size/XL label Jun 23, 2026

corridor-security Bot reviewed Jun 23, 2026

cursor Bot reviewed Jun 23, 2026

Comment thread lib/crewai/src/crewai/flow/runtime/__init__.py

github-code-quality Bot found potential problems Jun 23, 2026

coderabbitai Bot reviewed Jun 23, 2026

mintlify Bot deployed to staging - docs June 23, 2026 13:24

lucasgomide force-pushed the luzk/otel-instrumentation branch from fdfe593 to 7cc74cb June 23, 2026 14:57

cursor Bot reviewed Jun 23, 2026

Comment thread lib/crewai/src/crewai/tools/base_tool.py

github-code-quality Bot found potential problems Jun 23, 2026

Comment thread lib/crewai/tests/telemetry/test_otel.py

fix: keep coroutine results inside the execute flow method span

4d6ff2c

Address review feedback on the native OpenTelemetry instrumentation

lucasgomide force-pushed the luzk/otel-instrumentation branch from 7cc74cb to 4d6ff2c June 23, 2026 15:05

mintlify Bot deployed to staging - docs June 23, 2026 15:09

mintlify Bot deployed to staging - docs June 23, 2026 15:19

lucasgomide added 2 commits June 23, 2026 13:46

Merge branch 'main' into luzk/otel-instrumentation

99b87e8

cursor Bot reviewed Jun 23, 2026

mintlify Bot deployed to staging - docs June 23, 2026 19:48
