feat: parallel rails execution #1234
base: develop
Conversation
Commits:

- 6ab101ce Revert abc bot changes (Christian Schüller, Thu Jun 19 11:17:10 2025 +0200)
- 6c8ff578 Add unit tests (Christian Schüller, Thu Jun 19 11:07:05 2025 +0200)
- 43e83fe1 Fix event order in processing log (Christian Schüller, Thu Jun 19 10:53:56 2025 +0200)
- b2ac8c2d Fix bug in task cancellation (Christian Schüller, Wed Jun 18 15:44:03 2025 +0200)
- aa53bd4c Handle CancelledError (Christian Schüller, Wed Jun 18 15:30:54 2025 +0200)
- a37c890a Revert bot config changes (Christian Schüller, Wed Jun 18 11:26:54 2025 +0200)
- 2ebeee89 Fix types for Python 3.9 (Christian Schüller, Wed Jun 18 11:00:02 2025 +0200)
- 410ee0e3 Add support for parallel output rails (Christian Schüller, Wed Jun 18 10:17:59 2025 +0200)
- 5760f9a2 Fix event history creation (Christian Schüller, Wed Jun 18 10:13:00 2025 +0200)
- 3b9193a3 Add support for tracing events (Christian Schüller, Tue Jun 17 16:51:03 2025 +0200)
- ba20b5b3 Fix tracing (Christian Schüller, Tue Jun 17 15:39:09 2025 +0200)
- b807f28b Improve parallel task handling (Christian Schüller, Mon Jun 16 13:43:50 2025 +0200)
- 2516d4f0 Add support for rail parameters (Christian Schüller, Fri Jun 13 11:54:36 2025 +0200)
- b70942d feat: poc for parallel rails (Razvan Dinu, Wed Jun 4 15:11:52 2025 +0300)
  The approach is to add a custom action called `run_flows_in_parallel` which can be used for both input and output rails. A small tweak was needed to the logic for starting new flows. Only basic manual testing was done to check that two self-check input flows run in parallel.
Pull Request Overview

This PR adds support for parallel execution of input and output rails via a new `run_flows_in_parallel` action.

- Introduces a `parallel` flag in the rails config to toggle between sequential and parallel processing.
- Implements `_run_flows_in_parallel`, `_run_input_rails_in_parallel`, and `_run_output_rails_in_parallel` in the runtime, and registers them as actions.
- Adds end-to-end tests and test configs that verify success and failure behaviors under parallel rail execution.
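As a rough, non-authoritative sketch of the idea behind running rail flows concurrently (all names here are illustrative, not the PR's actual runtime API): start all rails as tasks, collect results as they finish, and cancel the rest if one rail signals a stop.

```python
import asyncio

# Hypothetical stand-in for a rail flow: returns a dict with a "stop" flag,
# the way an input rail might signal that the message should be blocked.
async def fake_rail(name: str, delay: float, stop: bool = False) -> dict:
    await asyncio.sleep(delay)
    return {"rail": name, "stop": stop}

async def run_flows_in_parallel(flows):
    """Run rail coroutines concurrently; cancel the rest if one signals stop."""
    tasks = {asyncio.create_task(coro): name for name, coro in flows}
    results = {}
    pending = set(tasks)
    while pending:
        done, pending = await asyncio.wait(
            pending, return_when=asyncio.FIRST_COMPLETED
        )
        for task in done:
            result = task.result()
            results[tasks[task]] = result
            if result["stop"]:
                # One rail requested a stop: cancel everything still running
                for p in pending:
                    p.cancel()
                return results
    return results

if __name__ == "__main__":
    results = asyncio.run(
        run_flows_in_parallel(
            [
                ("self check input", fake_rail("self check input", 0.05)),
                ("check blocked input terms", fake_rail("check blocked input terms", 0.05)),
            ]
        )
    )
    print(sorted(results))
```

This is only a sketch of the concurrency pattern; the actual implementation also has to merge events into the processing log and preserve event ordering, which is what several of the commits above address.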
Reviewed Changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.
File | Description
---|---
`tests/test_parallel_rails.py` | New async tests covering parallel-rails success and failure paths
`tests/test_configs/parallel_rails/config.yml` | Bot config with `parallel: True` and lists of rails to run
`tests/test_configs/parallel_rails/blocked_terms.co` | Definitions for blocked-term subflows
`tests/test_configs/parallel_rails/prompts.yml` | Prompt templates for self-check input/output
`tests/test_configs/parallel_rails/actions.py` | Simulated actions with delays to test concurrency
`nemoguardrails/rails/llm/llm_flows.co` | Branches on the `parallel` flag to call the new parallel-run actions
`nemoguardrails/rails/llm/config.py` | Adds the `parallel` field to the `InputRails` model
`nemoguardrails/logging/processing_log.py` | Extends ignored actions to hide parallel-run helpers
`nemoguardrails/colang/v1_0/runtime/runtime.py` | Registers the new parallel-run actions and unpacks history events
`nemoguardrails/colang/v1_0/runtime/flows.py` | Updates flow-start logic to respect explicit `start_flow` events
Comments suppressed due to low confidence (1)
tests/test_configs/parallel_rails/config.yml:30
- The third input rail is duplicated here instead of the intended 'generate user intent' subflow, which will cause the test to fail at index 2. Please replace this entry with the correct flow name.
- check blocked input terms
Co-authored-by: Copilot <[email protected]> Signed-off-by: Christian Schüller <[email protected]>
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

```
@@ Coverage Diff @@
##           develop    #1234      +/-   ##
===========================================
+ Coverage    68.65%   70.38%    +1.73%
===========================================
  Files          161      161
  Lines        15978    16129     +151
===========================================
+ Hits         10969    11353     +384
+ Misses        5009     4776     -233
```
Thank you @schuellc-nvidia, I'm going to review it gradually 👍🏻 Have you considered adding tests that compare sequential and parallel execution? I see that the following test passes, but it would be great to address the differences in their behavior:

```python
@pytest.mark.asyncio
async def test_sequential_rails_execution():
    # Test sequential execution - rails should run one after another and take longer
    config = RailsConfig.from_path(os.path.join(CONFIGS_FOLDER, "parallel_rails"))
    config.rails.input.parallel = False
    config.rails.output.parallel = False
    chat = TestChat(
        config,
        llm_completions=[
            "No",
            "Hi there! How can I assist you with questions about the ABC Company today?",
            "No",
        ],
    )
    chat >> "hi"
    result = await chat.app.generate_async(messages=chat.history, options=OPTIONS)

    # Assert the response is correct
    assert (
        result
        and result.response[0]["content"]
        == "Hi there! How can I assist you with questions about the ABC Company today?"
    )

    # Check that all rails were executed sequentially
    assert result.log.activated_rails[0].name == "self check input"
    assert (
        result.log.activated_rails[1].name == "check blocked input terms $duration=1.0"
    )
    assert len(result.log.activated_rails[1].executed_actions) == 1
    assert (
        result.log.activated_rails[2].name == "check blocked input terms $duration=1.0"
    )
    assert result.log.activated_rails[3].name == "generate user intent"
    assert result.log.activated_rails[4].name == "self check output"
    assert (
        result.log.activated_rails[5].name == "check blocked output terms $duration=1.0"
    )
    assert (
        result.log.activated_rails[6].name == "check blocked output terms $duration=1.0"
    )
    assert len(result.log.activated_rails[5].executed_actions) == 1

    # Sequential execution should take approximately 2 seconds for each set of rails:
    # Input rails: 1s + 1s = 2s (sequential)
    # Output rails: 1s + 1s = 2s (sequential)
    assert (
        result.log.stats.input_rails_duration >= 2
        and result.log.stats.output_rails_duration >= 2
    ), "Sequential rails processing should take longer than parallel processing."
```
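To illustrate the timing expectation in the test above with a tiny self-contained sketch (purely illustrative, not part of the PR): sequential execution of delayed actions takes roughly the sum of the delays, while concurrent execution takes roughly the maximum.

```python
import asyncio
import time

async def fake_rail(duration: float) -> None:
    # Stand-in for a rail action that takes `duration` seconds
    await asyncio.sleep(duration)

async def run_sequential(durations) -> float:
    start = time.perf_counter()
    for d in durations:
        await fake_rail(d)  # one after another: total is about sum(durations)
    return time.perf_counter() - start

async def run_parallel(durations) -> float:
    start = time.perf_counter()
    # concurrently: total is about max(durations)
    await asyncio.gather(*(fake_rail(d) for d in durations))
    return time.perf_counter() - start

if __name__ == "__main__":
    seq = asyncio.run(run_sequential([0.1, 0.1]))
    par = asyncio.run(run_parallel([0.1, 0.1]))
    print(f"sequential: {seq:.2f}s, parallel: {par:.2f}s")
```

With the test config's two 1-second rails, this is why the sequential assertion expects `input_rails_duration >= 2` while the parallel runs measure roughly the duration of a single rail.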
@Pouyanpi Regarding your ask for a comparison to sequential rails: I don't think this should be a unit test of the parallel rails. If anything, this should probably be a new test for sequential rails.
@schuellc-nvidia I totally hear you that our parallel rails tests shouldn't get bogged down in sequential details. The actual intent here is to lock in and document the behavioral delta between the two modes so we catch any future regressions. For example:

```python
# sequential
assert result.log.activated_rails[1].name == "check blocked input terms $duration=1.0"

# parallel
assert result.log.activated_rails[1].name == "check blocked input terms"
```

Happy to move this into its own tests to keep the parallel suite focused while still covering this critical diff. And I think it is important that this difference is addressed 👍🏻
Good catch! The difference was actually an unintended side effect, and the last commit fixed this such that sequential and parallel rail calling behave the same.
@schuellc-nvidia This PR looks good, but I think we're missing vital coverage of actually calling real-world NVCF functions and app models:

- Adapted from the safeguard_ai_virtual_assistant_notebook.ipynb: please set up a config that calls the NVCF Content Safety, Topic Control, and Jailbreak Detect functions as in Step 4 (config.yml).
- Adapted from this blog post: call AlignScore check facts and Llama Guard check output.

For each of the two tests, please measure the change in latency and check that the outputs are the same. Please add those in a comment on the MR, and don't merge if anything breaks or you see suspicious latency numbers.
Rails Configuration Comparison

```yaml
models:
  - type: main
    engine: openai
    model: gpt-3.5-turbo-instruct
  - type: content_safety
    engine: nim
    model: nvidia/llama-3.1-nemoguard-8b-content-safety
  - type: topic_control
    engine: nim
    model: nvidia/llama-3.1-nemoguard-8b-topic-control

rails:
  input:
    parallel: True
    flows:
      - content safety check input $model=content_safety
      - topic safety check input $model=topic_control
      - self check input
  output:
    parallel: True
    flows:
      - content safety check output $model=content_safety
      - self check output
```

Parallel Rails Configuration

> hi there
Input rails duration: 0.43209362030029297
Output rails duration: 0.5509295463562012
Hello! It's nice to meet you. My name is Assistant and I'm here to help you with anything you need. How can I assist you today?
> The weater is nice
Input rails duration: 0.6604671478271484
Output rails duration: 0.6846234798431396
That's great to hear! According to the latest weather report, it's currently sunny with a temperature of 75 degrees Fahrenheit. Is there anything else you would like to know about the weather?
> I want to buy a gun
Input rails duration: 0.37376832962036133
Output rails duration: None
I'm sorry, I can't respond to that.
> I don't like you
Input rails duration: 0.3921196460723877
Output rails duration: None
I'm sorry, I can't respond to that.
> tell me something nasty
Input rails duration: 0.6805319786071777
Output rails duration: None
I'm sorry, I can't respond to that.
> Can you help me?
Input rails duration: 0.633216142654419
Output rails duration: 0.5047521591186523
Of course! I am designed to assist you with anything you need. What can I help you with specifically?
> ok, lets do that
Input rails duration: 0.6267142295837402
Output rails duration: 0.6642739772796631
Sure thing! What would you like to do? I can help you with tasks such as setting reminders, making appointments, or even just providing information on various topics. Just let me know what you need and I'll do my best to assist you.
> Tell me something about your products
Input rails duration: 0.6722187995910645
Output rails duration: 0.5426013469696045
I would be happy to tell you about our products! We have a wide range of products available, including smart home devices, virtual assistants, and productivity tools. Our products are designed to make your life easier and more efficient. Is there a specific product
you would like to know more about?

Sequential Rails Configuration

> hi there
Input rails duration: 1.2866036891937256
Output rails duration: 1.032092809677124
Hello! It's great to meet you. My name is Assistant and I am here to help you with anything you need. How can I assist you today?
> The weater is nice
Input rails duration: 1.0234577655792236
Output rails duration: 1.042959451675415
That's wonderful to hear! According to my sources, the weather in your location is currently sunny with a high of 75 degrees Fahrenheit and a low of 62 degrees Fahrenheit. Would you like me to provide you with any other weather updates or information?
> I want to buy a gun
Input rails duration: 0.5717439651489258
Output rails duration: None
I'm sorry, I can't respond to that.
> I don't like you
Input rails duration: 0.7388792037963867
Output rails duration: None
I'm sorry, I can't respond to that.
> tell me something nasty
Input rails duration: 1.147322177886963
Output rails duration: None
I'm sorry, I can't respond to that.
> Can you help me?
Input rails duration: 1.311119794845581
Output rails duration: 1.0984859466552734
Of course! I am here to help with anything you need. Please let me know what you need assistance with and I will do my best to provide you with the information or support you need.
> ok, lets do that
Input rails duration: 1.8776695728302002
Output rails duration: 1.1284019947052002
Great! What would you like me to help you with? Are you looking for information, recommendations, or something else?
> Tell me something about your products
Input rails duration: 1.169224739074707
Output rails duration: 0.9357724189758301
I would be happy to tell you about our products! We offer a wide range of AI-powered tools and services to help businesses and individuals optimize their workflows, improve efficiency, and reach their goals. Our products include chatbots, virtual assistants, data
analytics tools, and more. Is there a specific product you are interested in learning more about?
Thanks for adding the local performance test values, this looks great.
```python
if has_stop:
    stopped_task_results = task_results[flow_id] + result
    del unique_flow_ids[flow_id]
    # Cancel all remaining tasks
    for pending_task in tasks:
        # Don't include results and processing logs for cancelled or stopped tasks
        if (
            flow_id in unique_flow_ids
            and pending_task != unique_flow_ids[flow_id]
            and not pending_task.done()
        ):
            # Cancel the task if it is not done
            pending_task.cancel()
            del unique_flow_ids[flow_id]
    break
```
I might not have the full context here, but when a flow emits a stop event, the code deletes `unique_flow_ids[flow_id]` and then immediately checks `flow_id in unique_flow_ids` in the cancellation loop. Since we just deleted that entry, this condition will always be False, no? Which would prevent the cancellation code from ever running.

Is this intentional, or should we be cancelling the other tasks when one rail stops? The tests seem to expect cancellation to work, but currently the other rails would continue running even after a stop event.
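A standalone sketch of the pattern flagged above (names are illustrative, not the actual runtime code): deleting the dict entry before the membership check makes the guard unreachable, so nothing is ever cancelled.

```python
# Hypothetical reduction of the suspected dead branch
unique_flow_ids = {"flow-a": "task-a", "flow-b": "task-b"}
tasks = ["task-a", "task-b"]
flow_id = "flow-a"  # the flow that emitted the stop event

del unique_flow_ids[flow_id]  # entry removed here...

cancelled = []
for pending_task in tasks:
    # ...so this check can never be True, and no task is appended
    if flow_id in unique_flow_ids and pending_task != unique_flow_ids.get(flow_id):
        cancelled.append(pending_task)

print(cancelled)  # → []
```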
Good catch. Pushed a fix.
Description

This MR enables parallel input/output rail execution (in Colang 1) through a new action called `run_flows_in_parallel`. It also registers two convenience actions (`run_input_rails_in_parallel`, `run_output_rails_in_parallel`) to simplify the parallel rails calling and to create the necessary tracing events: `StartInputRail`, `InputRailFinished`, `StartOutputRail`, `OutputRailFinished`.

Related Issue(s)
Checklist
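As a rough, non-authoritative sketch of how tracing events like those named in the description could bracket a rail (the event payloads here are hypothetical; the PR's actual tracing structures may differ):

```python
import time

def run_rail_with_tracing(rail_name, rail_fn, events):
    # Hypothetical event shapes, for illustration only
    events.append({"type": "StartInputRail", "flow_id": rail_name})
    started = time.monotonic()
    result = rail_fn()
    events.append(
        {
            "type": "InputRailFinished",
            "flow_id": rail_name,
            "duration": time.monotonic() - started,
        }
    )
    return result

events = []
run_rail_with_tracing("self check input", lambda: "allow", events)
print([e["type"] for e in events])  # → ['StartInputRail', 'InputRailFinished']
```

Emitting a start/finish pair per rail is what lets the log reconstruct per-rail durations even when the rails run concurrently.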