
Conversation

Collaborator

@Pouyanpi Pouyanpi commented Sep 2, 2025

Description

  • Complete rewrite of RunnableRails implementing the full LangChain Runnable protocol
  • Added comprehensive async/sync support for invoke, batch, and streaming operations
  • Enhanced input/output handling for all LangChain message formats and chaining scenarios
  • Extensive test suite with 14 test files covering batching, streaming, tool calling, and composition

Requires #1364, #1343, and #1289.
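The protocol surface the rewrite targets can be sketched with a minimal stdlib-only stand-in. This is illustrative only, not the actual RunnableRails implementation: the real class wraps a guardrails configuration and LangChain message formats, while `MiniRunnable` and its canned `"checked: ..."` output are invented here to show the invoke/batch/stream shape plus an async variant.

```python
# Minimal sketch of the Runnable protocol surface (invoke, batch, stream,
# and an async variant). Stdlib only; names and bodies are illustrative.
import asyncio
from typing import Any, Iterator, List


class MiniRunnable:
    def invoke(self, input: Any, config=None, **kwargs) -> Any:
        # Stand-in for "run the input through the guardrails".
        return f"checked: {input}"

    def batch(self, inputs: List[Any], config=None, **kwargs) -> List[Any]:
        # Naive batch: invoke each input in order.
        return [self.invoke(i, config, **kwargs) for i in inputs]

    def stream(self, input: Any, config=None, **kwargs) -> Iterator[str]:
        # Naive streaming: yield the final result token by token.
        for token in self.invoke(input, config, **kwargs).split():
            yield token

    async def ainvoke(self, input: Any, config=None, **kwargs) -> Any:
        # Async variant delegating to the sync path on a worker thread.
        return await asyncio.to_thread(self.invoke, input, config, **kwargs)


r = MiniRunnable()
print(r.invoke("hi"))
print(r.batch(["a", "b"]))
print(list(r.stream("hi there")))
print(asyncio.run(r.ainvoke("hi")))
```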

Add example configuration and documentation for using NVIDIA NeMoGuard NIMs, including content moderation, topic control, and jailbreak detection.

Update verbose logging to safely handle cases where log records may not have 'id' or 'task' attributes. This prevents a potential AttributeError and improves the robustness of LLM and prompt log output formatting.
Implements tool call extraction and passthrough functionality in LLMRails:
- Add tool_calls_var context variable for storing LLM tool calls
- Refactor llm_call utils to extract and store tool calls from responses
- Support tool calls in both GenerationResponse and dict message formats
- Add ToolMessage support for langchain message conversion
- Comprehensive test coverage for tool calling integration
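The context-variable mechanism listed above can be sketched with stdlib `contextvars`. The helper names and the response shape here are illustrative assumptions; only the idea of stashing extracted tool calls in a `ContextVar` for downstream readers comes from the commit description.

```python
# Sketch: a ContextVar holds tool calls extracted from an LLM response so
# downstream code can read them without threading them through every call.
import contextvars

tool_calls_var: contextvars.ContextVar = contextvars.ContextVar(
    "tool_calls", default=None
)


def extract_tool_calls(response: dict) -> None:
    """Store any tool calls found on the response in the context variable."""
    tool_calls = response.get("tool_calls")
    if tool_calls:
        tool_calls_var.set(tool_calls)


def get_tool_calls():
    return tool_calls_var.get()


extract_tool_calls(
    {"content": "", "tool_calls": [{"name": "get_weather", "args": {"city": "Paris"}}]}
)
print(get_tool_calls())
```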
@Pouyanpi Pouyanpi added this to the v0.17.0 milestone Sep 2, 2025
@Pouyanpi Pouyanpi self-assigned this Sep 2, 2025
@Pouyanpi Pouyanpi added the enhancement New feature or request label Sep 2, 2025
@Pouyanpi Pouyanpi changed the title Feat/runnable rails feat(runnable_rails): complete rewrite of RunnableRails with full LangChain Runnable protocol support Sep 2, 2025
Contributor

github-actions bot commented Sep 2, 2025

Documentation preview

https://nvidia.github.io/NeMo-Guardrails/review/pr-1366


codecov-commenter commented Sep 2, 2025

Codecov Report

❌ Patch coverage is 73.19347% with 115 lines in your changes missing coverage. Please review.
✅ Project coverage is 71.52%. Comparing base (960e0b8) to head (68f438e).
⚠️ Report is 6 commits behind head on develop.

Files with missing lines Patch % Lines
...uardrails/integrations/langchain/runnable_rails.py 68.33% 114 Missing ⚠️
nemoguardrails/actions/llm/utils.py 98.18% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #1366      +/-   ##
===========================================
- Coverage    71.62%   71.52%   -0.10%     
===========================================
  Files          171      171              
  Lines        17020    17348     +328     
===========================================
+ Hits         12191    12409     +218     
- Misses        4829     4939     +110     
Flag Coverage Δ
python 71.52% <73.19%> (-0.10%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
nemoguardrails/context.py 100.00% <100.00%> (ø)
nemoguardrails/logging/verbose.py 90.90% <100.00%> (+0.39%) ⬆️
nemoguardrails/rails/llm/llmrails.py 90.79% <100.00%> (+0.06%) ⬆️
nemoguardrails/rails/llm/options.py 63.39% <100.00%> (+0.32%) ⬆️
nemoguardrails/actions/llm/utils.py 82.31% <98.18%> (+2.31%) ⬆️
...uardrails/integrations/langchain/runnable_rails.py 71.18% <68.33%> (-18.90%) ⬇️

... and 3 files with indirect coverage changes


… Runnable protocol support

- Implement comprehensive async/sync invoke, batch, and streaming support
- Add robust input/output transformation for all LangChain formats (ChatPromptValue, BaseMessage, dict, string)
- Enhance chaining behavior with intelligent __or__ method handling RunnableBinding and complex chains
- Add concurrency controls, error handling, and configurable blocking messages
- Implement proper tool calling support with tool call passthrough
- Add extensive test suite (14 test files, 2800+ lines) covering all major functionality including batching, streaming, composition, piping, and tool calling
- Reorganize and expand test structure for better maintainability
@Pouyanpi Pouyanpi force-pushed the feat/runnable-rails branch from 7ae2f3a to 68f438e Compare September 9, 2025 09:22
@Pouyanpi Pouyanpi marked this pull request as ready for review September 9, 2025 09:41
@Pouyanpi Pouyanpi requested a review from Copilot September 9, 2025 09:42

@Copilot Copilot AI left a comment


Pull Request Overview

This PR completes a full rewrite of RunnableRails with comprehensive LangChain Runnable protocol support, providing async/sync operations, streaming capabilities, tool calling functionality, and enhanced input/output handling.

  • Implemented complete LangChain Runnable protocol including invoke, batch, stream, and async variants
  • Added tool calling support with proper context variable management across the pipeline
  • Enhanced streaming functionality with proper chunk formatting and state management

Reviewed Changes

Copilot reviewed 26 out of 26 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
tests/utils.py Added streaming support to FakeLLM with _stream and _astream methods
tests/test_tool_calls_context.py New test file for tool calls context variable functionality
tests/test_tool_calling_utils.py New comprehensive tests for tool calling utility functions
tests/test_tool_calling_passthrough_integration.py Integration tests for tool calling in passthrough mode
tests/runnable_rails/*.py 14 new test files covering streaming, batching, composition, and tool calling
nemoguardrails/rails/llm/options.py Added tool_calls field to GenerationResponse model
nemoguardrails/rails/llm/llmrails.py Enhanced to extract and include tool calls in responses
nemoguardrails/logging/verbose.py Fixed potential AttributeError with missing record attributes
nemoguardrails/integrations/langchain/runnable_rails.py Complete rewrite implementing full Runnable protocol
nemoguardrails/context.py Added tool_calls_var context variable
nemoguardrails/actions/llm/utils.py Refactored llm_call with tool calling support and improved message handling
examples/configs/nemoguards/* New example configuration demonstrating NeMoGuard safety rails


Comment on lines +89 to +114
def _stream(self, prompt, stop=None, run_manager=None, **kwargs):
    """Stream the response by breaking it into tokens."""
    if self.exception:
        raise self.exception

    current_i = self.i
    if current_i >= len(self.responses):
        raise RuntimeError(
            f"No responses available for query number {current_i + 1} in FakeLLM. "
            "Most likely, too many LLM calls are made or additional responses need to be provided."
        )

    response = self.responses[current_i]
    self.i = current_i + 1

    if not self.streaming:
        # If streaming is disabled, return single response
        yield response
        return

    tokens = response.split()
    for i, token in enumerate(tokens):
        if i == 0:
            yield token
        else:
            yield " " + token

Copilot AI Sep 9, 2025


[nitpick] The streaming logic is duplicated between _stream and _astream methods. Consider extracting the token splitting logic into a helper method to reduce code duplication.

Copilot uses AI. Check for mistakes.

Comment on lines +57 to +59
id_str = getattr(record, "id", None)
id_display = f"({id_str[:5]}..)" if id_str else ""
console.print(f"[cyan]LLM {title} {id_display}[/]")

Copilot AI Sep 9, 2025


When id_str is None, the slice operation id_str[:5] will fail. The condition should check for None before slicing.


@@ -199,43 +668,281 @@ def invoke(
# If more than one message is returned, we only take the first one.
# This can happen for advanced use cases, e.g., when the LLM could predict
# multiple function calls at the same time. We'll deal with these later.
if isinstance(result, list):
if isinstance(result, list) and len(result) > 0:

Copilot AI Sep 9, 2025


[nitpick] This silently discards multiple results when a list is returned. Consider logging when multiple results are discarded or adding configuration to control this behavior.

Suggested change
if isinstance(result, list) and len(result) > 0:
    if len(result) > 1:
        logger.warning(
            f"Multiple results returned ({len(result)}). Only the first result will be used. "
            "Consider updating your configuration or code to handle multiple results if needed."
        )


Comment on lines +915 to +925
semaphore = asyncio.Semaphore(self.concurrency_limit)

async def process_with_semaphore(input_item):
    async with semaphore:
        return await self.ainvoke(input_item, config, **kwargs)

return await gather_with_concurrency(
    self.concurrency_limit,
    *[process_with_semaphore(input_item) for input_item in inputs],
)


Copilot AI Sep 9, 2025


[nitpick] The concurrency_limit is used both for the semaphore and gather_with_concurrency, creating redundant concurrency control. Consider using only one mechanism or clarifying why both are needed.

Suggested change
return await gather_with_concurrency(
    self.concurrency_limit,
    *[self.ainvoke(input_item, config, **kwargs) for input_item in inputs],
)

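A single-mechanism version of the bounded batch, as the comment proposes, can be sketched with one semaphore plus plain `asyncio.gather`. The `ainvoke` body here is a stand-in, not the real method.

```python
# Sketch: one asyncio.Semaphore bounds in-flight calls; asyncio.gather
# collects results with no second concurrency limiter layered on top.
import asyncio


async def ainvoke(item: int) -> int:
    await asyncio.sleep(0)  # stand-in for real async work
    return item * 2


async def abatch(inputs, concurrency_limit: int = 4):
    semaphore = asyncio.Semaphore(concurrency_limit)

    async def bounded(item):
        async with semaphore:
            return await ainvoke(item)

    # gather preserves input order, so results line up with inputs.
    return await asyncio.gather(*(bounded(i) for i in inputs))


print(asyncio.run(abatch([1, 2, 3])))
```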

@@ -175,15 +226,15 @@ def get_colang_history(
history += f'user "{event["text"]}"\n'
elif event["type"] == "UserIntent":
if include_texts:
history += f' {event["intent"]}\n'
history += f" {event['intent']}\n"

Copilot AI Sep 9, 2025


[nitpick] Using single quotes inside f-strings with double quotes is inconsistent with the rest of the codebase style. Consider using consistent quote style throughout.


else:
history += f'user {event["intent"]}\n'
history += f"user {event['intent']}\n"

Copilot AI Sep 9, 2025


[nitpick] Using single quotes inside f-strings with double quotes is inconsistent with the rest of the codebase style. Consider using consistent quote style throughout.


elif event["type"] == "BotIntent":
# If we have instructions, we add them before the bot message.
# But we only do that for the last bot message.
if "instructions" in event and idx == last_bot_intent_idx:
history += f"# {event['instructions']}\n"
history += f'bot {event["intent"]}\n'
history += f"bot {event['intent']}\n"

Copilot AI Sep 9, 2025


[nitpick] Using single quotes inside f-strings with double quotes is inconsistent with the rest of the codebase style. Consider using consistent quote style throughout.


@@ -352,9 +403,9 @@ def flow_to_colang(flow: Union[dict, Flow]) -> str:
if "_type" not in element:
raise Exception("bla")
if element["_type"] == "UserIntent":
colang_flow += f'user {element["intent_name"]}\n'
colang_flow += f"user {element['intent_name']}\n"

Copilot AI Sep 9, 2025


[nitpick] Using single quotes inside f-strings with double quotes is inconsistent with the rest of the codebase style. Consider using consistent quote style throughout.


elif element["_type"] == "run_action" and element["action_name"] == "utter":
colang_flow += f'bot {element["action_params"]["value"]}\n'
colang_flow += f"bot {element['action_params']['value']}\n"

Copilot AI Sep 9, 2025


[nitpick] Using single quotes inside f-strings with double quotes is inconsistent with the rest of the codebase style. Consider using consistent quote style throughout.

Suggested change
colang_flow += f"bot {element['action_params']['value']}\n"
colang_flow += f"bot {element["action_params"]["value"]}\n"


Labels
enhancement New feature or request runnable
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants