feat: parallel rails execution #1234

Open · wants to merge 10 commits into base: develop

Conversation

schuellc-nvidia
Collaborator

Description

This MR enables parallel input/output rail execution (in Colang 1) through a new action called `run_flows_in_parallel`.

  • Input and output rails can be configured to run in parallel through a new field in the bot config (`parallel: True`).
  • Two additional actions (`run_input_rails_in_parallel`, `run_output_rails_in_parallel`) simplify calling the parallel rails and create the necessary tracing events: `StartInputRail`, `InputRailFinished`, `StartOutputRail`, `OutputRailFinished`.
  • The parallel rail execution mimics the sequential processing as closely as possible. The resulting event history, processing log, and flow context are therefore almost identical and appear in sequential order, even though the event timestamps reveal the concurrent execution of the rails.
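Conceptually, the behavior described above can be sketched with plain asyncio. This is an illustrative sketch only, not the actual `run_flows_in_parallel` implementation; the rail names and delays are made up:

```python
import asyncio
import time

async def rail(name: str, delay: float) -> str:
    # Stand-in for a real rail action (e.g. a self-check LLM call).
    await asyncio.sleep(delay)
    return name

async def run_rails_in_parallel(rail_coros):
    # gather() runs the coroutines concurrently but returns their results
    # in submission order, which is how a sequential-looking history can
    # be produced from concurrent execution.
    return await asyncio.gather(*rail_coros)

async def main():
    start = time.perf_counter()
    results = await run_rails_in_parallel(
        [rail("self check input", 0.1), rail("check blocked input terms", 0.1)]
    )
    return results, time.perf_counter() - start

results, elapsed = asyncio.run(main())
print(results)   # ['self check input', 'check blocked input terms']
print(elapsed)   # ~0.1s rather than 0.2s: the two rails overlapped
```

The result order matches the configured rail order, while the wall-clock time is roughly that of the slowest rail rather than the sum.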

Related Issue(s)

Checklist

  • I've read the CONTRIBUTING guidelines.
  • I've updated the documentation if applicable.
  • I've added tests if applicable.
  • @mentions of the person or team responsible for reviewing proposed changes.

commit 6ab101ce510c52a89f613e183b8ac485f2d98e08
Author: Christian Schüller <[email protected]>
Date:   Thu Jun 19 11:17:10 2025 +0200

    Revert abc bot changes

commit 6c8ff5788bd9de9375a4dcc165a633b540fbb3b1
Author: Christian Schüller <[email protected]>
Date:   Thu Jun 19 11:07:05 2025 +0200

    Add unit tests

commit 43e83fe1df21fa335d3c0afdf7bc90da854594bc
Author: Christian Schüller <[email protected]>
Date:   Thu Jun 19 10:53:56 2025 +0200

    Fix event order in processing log

commit b2ac8c2d4ba0be4dbeb50b076ef0018e1fe1bc53
Author: Christian Schüller <[email protected]>
Date:   Wed Jun 18 15:44:03 2025 +0200

    Fix bug in task cancellation

commit aa53bd4ce857d2dd399a76ef1077dd6e5bfc0418
Author: Christian Schüller <[email protected]>
Date:   Wed Jun 18 15:30:54 2025 +0200

    Handle CancelledError

commit a37c890a40bd13f1263a6d61ef296b5d7ae63e6a
Author: Christian Schüller <[email protected]>
Date:   Wed Jun 18 11:26:54 2025 +0200

    Revert bot config changes

commit 2ebeee89d409bf2d844656176337a887fbf68024
Author: Christian Schüller <[email protected]>
Date:   Wed Jun 18 11:00:02 2025 +0200

    Fix types for Python 3.9

commit 410ee0e3a5eaa3c24749fd291a6d8da4cc3c06d0
Author: Christian Schüller <[email protected]>
Date:   Wed Jun 18 10:17:59 2025 +0200

    Add support for  parallel output rails

commit 5760f9a29cb723f8a9049aef6b8efe45c98e0c23
Author: Christian Schüller <[email protected]>
Date:   Wed Jun 18 10:13:00 2025 +0200

    Fix event history creation

commit 3b9193a391f19e3e22ebe56c4dca23fe416fcbae
Author: Christian Schüller <[email protected]>
Date:   Tue Jun 17 16:51:03 2025 +0200

    Add support for tracing events

commit ba20b5b39f55a366919a8e5bb038ad3dc350b162
Author: Christian Schüller <[email protected]>
Date:   Tue Jun 17 15:39:09 2025 +0200

    Fix tracing

commit b807f28b67c8347593f2b925733ec80a81f67449
Author: Christian Schüller <[email protected]>
Date:   Mon Jun 16 13:43:50 2025 +0200

    Improve parallel task handling

commit 2516d4f030ff6dcfd64d39708eb2bb36aa36bbc0
Author: Christian Schüller <[email protected]>
Date:   Fri Jun 13 11:54:36 2025 +0200

    Add support for rail parameters

commit b70942d
Author: Razvan Dinu <[email protected]>
Date:   Wed Jun 4 15:11:52 2025 +0300

    feat: poc for parallel rails

    The approach is to add a custom action called `run_flows_in_parallel` which can be used for both input and output rails.
    Small tweak needed for the logic for starting new flows.
    Only basic manual testing done to check that two self-check input flows run in parallel.

@Copilot (Copilot AI) left a comment


Pull Request Overview

This PR adds support for parallel execution of input and output rails via a new run_flows_in_parallel action.

  • Introduces a parallel flag in the rails config to toggle between sequential and parallel processing.
  • Implements _run_flows_in_parallel, _run_input_rails_in_parallel, and _run_output_rails_in_parallel in the runtime, and registers them as actions.
  • Adds end-to-end tests and test configs that verify success and failure behaviors under parallel rail execution.

Reviewed Changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.

File | Description
tests/test_parallel_rails.py | New async tests covering parallel-rails success and failure paths
tests/test_configs/parallel_rails/config.yml | Bot config with parallel: True and lists of rails to run
tests/test_configs/parallel_rails/blocked_terms.co | Definitions for blocked-term subflows
tests/test_configs/parallel_rails/prompts.yml | Prompt templates for self-check input/output
tests/test_configs/parallel_rails/actions.py | Simulated actions with delays to test concurrency
nemoguardrails/rails/llm/llm_flows.co | Branches on parallel flag to call new parallel-run actions
nemoguardrails/rails/llm/config.py | Adds parallel field to InputRails model
nemoguardrails/logging/processing_log.py | Extends ignored actions to hide parallel-run helpers
nemoguardrails/colang/v1_0/runtime/runtime.py | Registers the new parallel-run actions and unpacks history events
nemoguardrails/colang/v1_0/runtime/flows.py | Updates flow-start logic to respect explicit start_flow events

Comments suppressed due to low confidence (1)

tests/test_configs/parallel_rails/config.yml:30

  • The third input rail is duplicated here instead of the intended 'generate user intent' subflow, which will cause the test to fail at index 2. Please replace this entry with the correct flow name.
      - check blocked input terms

@codecov-commenter commented Jun 19, 2025

Codecov Report

Attention: Patch coverage is 93.63636% with 7 lines in your changes missing coverage. Please review.

Project coverage is 70.38%. Comparing base (5f9974a) to head (34ea153).
Report is 24 commits behind head on develop.

Files with missing lines Patch % Lines
nemoguardrails/colang/v1_0/runtime/runtime.py 92.47% 7 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #1234      +/-   ##
===========================================
+ Coverage    68.65%   70.38%   +1.73%     
===========================================
  Files          161      161              
  Lines        15978    16129     +151     
===========================================
+ Hits         10969    11353     +384     
+ Misses        5009     4776     -233     
Flag Coverage Δ
python 70.38% <93.63%> (+1.73%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
nemoguardrails/colang/runtime.py 89.74% <100.00%> (+1.86%) ⬆️
nemoguardrails/colang/v1_0/runtime/flows.py 96.01% <100.00%> (+6.08%) ⬆️
nemoguardrails/logging/processing_log.py 100.00% <100.00%> (ø)
nemoguardrails/rails/llm/config.py 90.33% <100.00%> (+0.37%) ⬆️
nemoguardrails/rails/llm/llmrails.py 89.22% <100.00%> (+2.00%) ⬆️
nemoguardrails/colang/v1_0/runtime/runtime.py 86.95% <92.47%> (+1.98%) ⬆️

... and 13 files with indirect coverage changes


@Pouyanpi Pouyanpi added enhancement New feature or request status: in review labels Jun 19, 2025
@Pouyanpi Pouyanpi added this to the v0.15.0 milestone Jun 19, 2025
@Pouyanpi
Collaborator

Thank you @schuellc-nvidia, I'm going to review it gradually 👍🏻

Have you considered adding a test that compares sequential and parallel execution? I see that the following test passes, but it would be great to address the differences in their behavior:

@pytest.mark.asyncio
async def test_sequential_rails_execution():
    # Test sequential execution - rails should run one after another and take longer
    config = RailsConfig.from_path(os.path.join(CONFIGS_FOLDER, "parallel_rails"))
    config.rails.input.parallel = False
    config.rails.output.parallel = False
    chat = TestChat(
        config,
        llm_completions=[
            "No",
            "Hi there! How can I assist you with questions about the ABC Company today?",
            "No",
        ],
    )

    chat >> "hi"
    result = await chat.app.generate_async(messages=chat.history, options=OPTIONS)

    # Assert the response is correct
    assert (
        result
        and result.response[0]["content"]
        == "Hi there! How can I assist you with questions about the ABC Company today?"
    )

    # Check that all rails were executed sequentially
    assert result.log.activated_rails[0].name == "self check input"
    assert (
        result.log.activated_rails[1].name == "check blocked input terms $duration=1.0"
    )
    assert len(result.log.activated_rails[1].executed_actions) == 1
    assert (
        result.log.activated_rails[2].name == "check blocked input terms $duration=1.0"
    )
    assert result.log.activated_rails[3].name == "generate user intent"
    assert result.log.activated_rails[4].name == "self check output"
    assert (
        result.log.activated_rails[5].name == "check blocked output terms $duration=1.0"
    )
    assert (
        result.log.activated_rails[6].name == "check blocked output terms $duration=1.0"
    )
    assert len(result.log.activated_rails[5].executed_actions) == 1

    # Sequential execution should take approximately 2 seconds for each set of rails:
    # Input rails: 1s + 1s = 2s (sequential)
    # Output rails: 1s + 1s = 2s (sequential)
    assert (
        result.log.stats.input_rails_duration >= 2
        and result.log.stats.output_rails_duration >= 2
    ), "Sequential rails processing should take longer than parallel processing."

@Pouyanpi Pouyanpi requested a review from tgasser-nv June 26, 2025 10:23
@schuellc-nvidia
Collaborator Author

@Pouyanpi Regarding your ask for a comparison to sequential rails: I don't think this should be a unit test of the parallel rails. If anything, this should probably be a new test for sequential rails.

@Pouyanpi
Copy link
Collaborator

@Pouyanpi Regarding your ask for a comparison to sequential rails: I don't think this should be a unit test of the parallel rails. If anything, this should probably be a new test for sequential rails.

@schuellc-nvidia I totally hear you that our parallel rails tests shouldn’t get bogged down in sequential details. The actual intent here is to lock in and document the behavioral delta between the two modes so we catch any future regressions:

For example

# sequential
assert result.log.activated_rails[1].name == "check blocked input terms $duration=1.0"
# parallel
assert result.log.activated_rails[1].name == "check blocked input terms"

Happy to move this into its own test to keep the parallel suite focused while still covering this critical diff. And I think it is important that this difference be addressed 👍🏻

@schuellc-nvidia
Collaborator Author

Good catch! The difference was actually an unintended side effect and the last commit fixed this such that sequential and parallel rail calling behave the same.

@tgasser-nv (Collaborator) left a comment

@schuellc-nvidia This PR looks good, but I think we're missing vital coverage of actually calling real-world NVCF functions and App models.

  1. Adapt the config from the safeguard_ai_virtual_assistant_notebook.ipynb: set up a config that calls the NVCF Content Safety, Topic Control, and Jailbreak Detect models as in Step 4 (config.yml).
  2. Adapt the config from this blog post to call AlignScore check facts and Llama Guard check output.

For each of the two tests, please measure the change in latency and check that the outputs are the same. Please add those results in a comment on the MR, and don't merge if anything breaks or you see suspicious latency numbers.

@schuellc-nvidia
Collaborator Author

Rails Configuration Comparison

models:
  - type: main
    engine: openai
    model: gpt-3.5-turbo-instruct

  - type: content_safety
    engine: nim
    model: nvidia/llama-3.1-nemoguard-8b-content-safety
  - type: topic_control
    engine: nim
    model: nvidia/llama-3.1-nemoguard-8b-topic-control

rails:
  input:
    parallel: True
    flows:
      - content safety check input $model=content_safety
      - topic safety check input $model=topic_control
      - self check input
  output:
    parallel: True
    flows:
      - content safety check output $model=content_safety
      - self check output

Parallel Rails Configuration

> hi there
Input rails duration: 0.43209362030029297
Output rails duration: 0.5509295463562012
Hello! It's nice to meet you. My name is Assistant and I'm here to help you with anything you need. How can I assist you today?

> The weater is nice
Input rails duration: 0.6604671478271484
Output rails duration: 0.6846234798431396
That's great to hear! According to the latest weather report, it's currently sunny with a temperature of 75 degrees Fahrenheit. Is there anything else you would like to know about the weather?

> I want to buy a gun
Input rails duration: 0.37376832962036133
Output rails duration: None
I'm sorry, I can't respond to that.

> I don't like you
Input rails duration: 0.3921196460723877
Output rails duration: None
I'm sorry, I can't respond to that.

> tell me something nasty
Input rails duration: 0.6805319786071777
Output rails duration: None
I'm sorry, I can't respond to that.

> Can you help me?
Input rails duration: 0.633216142654419
Output rails duration: 0.5047521591186523
Of course! I am designed to assist you with anything you need. What can I help you with specifically?

> ok, lets do that 
Input rails duration: 0.6267142295837402
Output rails duration: 0.6642739772796631
Sure thing! What would you like to do? I can help you with tasks such as setting reminders, making appointments, or even just providing information on various topics. Just let me know what you need and I'll do my best to assist you.

> Tell me something about your products
Input rails duration: 0.6722187995910645
Output rails duration: 0.5426013469696045
I would be happy to tell you about our products! We have a wide range of products available, including smart home devices, virtual assistants, and productivity tools. Our products are designed to make your life easier and more efficient. Is there a specific product 
you would like to know more about?

Sequential Rails Configuration

> hi there
Input rails duration: 1.2866036891937256
Output rails duration: 1.032092809677124
Hello! It's great to meet you. My name is Assistant and I am here to help you with anything you need. How can I assist you today?

> The weater is nice
Input rails duration: 1.0234577655792236
Output rails duration: 1.042959451675415
That's wonderful to hear! According to my sources, the weather in your location is currently sunny with a high of 75 degrees Fahrenheit and a low of 62 degrees Fahrenheit. Would you like me to provide you with any other weather updates or information?

> I want to buy a gun
Input rails duration: 0.5717439651489258
Output rails duration: None
I'm sorry, I can't respond to that.

> I don't like you
Input rails duration: 0.7388792037963867
Output rails duration: None
I'm sorry, I can't respond to that.

> tell me something nasty
Input rails duration: 1.147322177886963
Output rails duration: None
I'm sorry, I can't respond to that.

> Can you help me?
Input rails duration: 1.311119794845581
Output rails duration: 1.0984859466552734
Of course! I am here to help with anything you need. Please let me know what you need assistance with and I will do my best to provide you with the information or support you need.

> ok, lets do that 
Input rails duration: 1.8776695728302002
Output rails duration: 1.1284019947052002
Great! What would you like me to help you with? Are you looking for information, recommendations, or something else?

> Tell me something about your products
Input rails duration: 1.169224739074707
Output rails duration: 0.9357724189758301
I would be happy to tell you about our products! We offer a wide range of AI-powered tools and services to help businesses and individuals optimize their workflows, improve efficiency, and reach their goals. Our products include chatbots, virtual assistants, data 
analytics tools, and more. Is there a specific product you are interested in learning more about?
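As a rough summary of the latency change tgasser-nv asked for, averaging the input-rail durations reported in the two transcripts above (values rounded to three decimals) gives about a 2x improvement. This is a back-of-the-envelope check, not a rigorous benchmark:

```python
from statistics import mean

# Input-rail durations (seconds) copied from the transcripts above.
parallel = [0.432, 0.660, 0.374, 0.392, 0.681, 0.633, 0.627, 0.672]
sequential = [1.287, 1.023, 0.572, 0.739, 1.147, 1.311, 1.878, 1.169]

speedup = mean(sequential) / mean(parallel)
print(round(speedup, 2))  # → 2.04: roughly half the input-rail latency
```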

@tgasser-nv (Collaborator) left a comment

Thanks for adding the local performance test values, this looks great.

Comment on lines 372 to 387
if has_stop:
    stopped_task_results = task_results[flow_id] + result

    del unique_flow_ids[flow_id]
    # Cancel all remaining tasks
    for pending_task in tasks:
        # Don't include results and processing logs for cancelled or stopped tasks
        if (
            flow_id in unique_flow_ids
            and pending_task != unique_flow_ids[flow_id]
            and not pending_task.done()
        ):
            # Cancel the task if it is not done
            pending_task.cancel()
            del unique_flow_ids[flow_id]
    break
@Pouyanpi (Collaborator) commented Jul 11, 2025

I might not have the full context here, but when a flow emits a stop event, the code deletes `unique_flow_ids[flow_id]` and then immediately checks `if flow_id in unique_flow_ids` in the cancellation loop. Since we just deleted that entry, the condition will always be False, no? That prevents the cancellation code from ever running.

Is this intentional, or should we be cancelling the other tasks when one rail stops? The tests seem to expect cancellation to work, but currently other rails would continue running even after a stop event?
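The issue flagged above can be reproduced in isolation. The following is a hypothetical reduction of the reviewed snippet (variable names mirror it, but this is not the actual runtime code): with delete-before-check, the membership test can never succeed, so the sibling task is never cancelled.

```python
import asyncio

async def pending_rail():
    await asyncio.sleep(3600)  # a rail that would otherwise run for a long time

async def stop_and_cancel(delete_first: bool) -> bool:
    flow_id = "self check input"
    stopping = asyncio.create_task(pending_rail())  # the rail that emitted stop
    sibling = asyncio.create_task(pending_rail())   # a still-running sibling rail
    unique_flow_ids = {flow_id: stopping}
    cancelled_sibling = False

    if delete_first:
        del unique_flow_ids[flow_id]  # the order flagged in the review
    for pending_task in (stopping, sibling):
        if (
            flow_id in unique_flow_ids                # always False if deleted above
            and pending_task is not unique_flow_ids[flow_id]
            and not pending_task.done()
        ):
            pending_task.cancel()
            cancelled_sibling = True
    if not delete_first and flow_id in unique_flow_ids:
        del unique_flow_ids[flow_id]  # fixed order: remove the entry afterwards

    for t in (stopping, sibling):     # clean up so the event loop can exit
        t.cancel()
    await asyncio.gather(stopping, sibling, return_exceptions=True)
    return cancelled_sibling

buggy = asyncio.run(stop_and_cancel(delete_first=True))
fixed = asyncio.run(stop_and_cancel(delete_first=False))
print(buggy, fixed)  # False True: with delete-first, nothing ever gets cancelled
```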

Collaborator Author

Good catch. Pushed a fix.

Labels: enhancement (New feature or request), status: in review
4 participants