Skip to content

fix(tracing): prevent mutation of user options when tracing is enabled #1273

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 7 commits into
base: develop
Choose a base branch
from

Conversation

Pouyanpi
Copy link
Collaborator

@Pouyanpi Pouyanpi commented Jul 8, 2025

PR Description

When tracing is enabled, the user's original GenerationOptions object was modified, causing instability and crashes when the same options were reused across multiple calls.

Changes:

  • create a copy of user options before modification when tracing is enabled
  • add comprehensive tests to verify options are not mutated
  • ensure tracing still works correctly with the fix

requires #1269

264

@Pouyanpi Pouyanpi added this to the v0.15.0 milestone Jul 8, 2025
@Pouyanpi Pouyanpi self-assigned this Jul 8, 2025
@Pouyanpi Pouyanpi added the bug Something isn't working label Jul 8, 2025
@Pouyanpi Pouyanpi marked this pull request as ready for review July 8, 2025 14:05
@Pouyanpi Pouyanpi requested review from Copilot and tgasser-nv July 8, 2025 14:06
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR ensures that when tracing is enabled, the user's original GenerationOptions remain unchanged by creating a copy before modifying, and it adds tests to verify immutability and correct tracing behavior.

  • Copy user options in generate_async to prevent in-place mutation when tracing is on
  • Add tests for:
    • verifying no mutation of provided GenerationOptions
    • handling options=None gracefully
    • aggressive override behavior when all log options are disabled

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
tests/test_tracing.py Added new async tests to check option immutability, None handling, and overrides
nemoguardrails/rails/llm/llmrails.py Wrapped existing options in a copy when tracing is enabled
Comments suppressed due to low confidence (3)

tests/test_tracing.py:22

  • The import LLMRails is not used in this test file; consider removing it to clean up unused imports.
from nemoguardrails import LLMRails, RailsConfig

tests/test_tracing.py:30

  • GenerationRailsOptions is imported but never used in this test file; consider removing it to avoid unused imports.
    GenerationRailsOptions,

nemoguardrails/rails/llm/llmrails.py:844

  • [nitpick] Consider using a deep copy approach (e.g., copy.deepcopy or Pydantic's model_copy(deep=True)) to ensure nested log objects are fully duplicated and prevent unintended side effects.
                options = GenerationOptions(**options.model_dump())

@Pouyanpi Pouyanpi force-pushed the fix/tracing-otel-api branch 2 times, most recently from 1c92762 to 73d11fc Compare July 8, 2025 14:33
Pouyanpi added 5 commits July 8, 2025 16:40
Refactored the OpenTelemetry tracing adapter to follow OpenTelemetry library best practices. The adapter now uses only the OpenTelemetry API, does not modify global state, and relies on the application to configure the SDK and tracer provider. Removed legacy exporter and span processor configuration from the adapter. Updated tests to reflect the new initialization logic and ensure backward compatibility with old config parameters.
@Pouyanpi Pouyanpi force-pushed the fix/tracing-otel-api branch from 73d11fc to 6609cf2 Compare July 8, 2025 14:40
Pouyanpi added 2 commits July 8, 2025 16:42
Refactored the OpenTelemetry tracing adapter to follow OpenTelemetry library best practices. The adapter now uses only the OpenTelemetry API, does not modify global state, and relies on the application to configure the SDK and tracer provider. Removed legacy exporter and span processor configuration from the adapter. Updated tests to reflect the new initialization logic and ensure backward compatibility with old config parameters.
When tracing is enabled, the system was directly modifying the user's
original GenerationOptions object, causing instability when the same
options were reused across multiple calls.

This fix creates a copy of the options before modification, ensuring
the original user options remain unchanged while maintaining full
tracing functionality
@Pouyanpi Pouyanpi force-pushed the fix/tracing-options-mutation branch from 4b9b666 to b85493c Compare July 8, 2025 14:58
Copy link
Collaborator

@tgasser-nv tgasser-nv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Looks good! Just a couple of nits and test cleanups needed before merging. There are a couple of follow-on items we need to track:

  • Decoupling logging from the giant tangle of LLMRails into its own object.
  • Fixing the behaviour when users turn off tracing and we turn it back on again for them (!)

async def test_tracing_aggressive_override_when_all_disabled():
"""Test that tracing aggressively enables all logging when user disables all options.

When user disables all three tracing related options, tracing still enables
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a user I would not expect this behaviour! If I disable all tracing options I'd expect tracing to be disabled too. No fix required for this PR but let's think about how to clean this up

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably we need to emphasize it in the docs,

enable tracing = enabling all log options

@@ -839,6 +839,12 @@ async def generate_async(
if self.config.tracing.enabled:
if options is None:
options = GenerationOptions()
else:
# create a copy of the options to avoid modifying the original
options = GenerationOptions(**options.model_dump())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this still work if we pass options in as a Dict:

        options: Optional[Union[dict, GenerationOptions]] = None,

Could you add a test to check this is still ok?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Would options.model_copy(deep=True) be clearer here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the first one is indeed a bug.

Yes it is more readable. I'll make both of these changes 👍🏻

messages=[{"role": "user", "content": "hello"}], options=user_options
)

assert user_options.log.activated_rails == original_activated_rails
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Are we checking if none of the activated_rails, llm_calls, or internal_events are True then we enable all three? Could you add that explicitly below rather than checking against the original values?

@Pouyanpi Pouyanpi force-pushed the fix/tracing-otel-api branch from 573a67d to 873d654 Compare July 11, 2025 14:03
Base automatically changed from fix/tracing-otel-api to develop July 11, 2025 14:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants