feat: parallel rails execution #1234

Open · wants to merge 10 commits into base: develop

Conversation

schuellc-nvidia
Collaborator

Description

This MR enables parallel input/output rail execution (in Colang 1) through a new action called `run_flows_in_parallel`.

  • Input and output rails can be configured to run in parallel through a new field in the bot config (`parallel: True`).
  • Two additional actions (`run_input_rails_in_parallel`, `run_output_rails_in_parallel`) simplify calling the parallel rails and create the necessary tracing events: `StartInputRail`, `InputRailFinished`, `StartOutputRail`, `OutputRailFinished`.
  • The parallel rail execution mimics the sequential processing as closely as possible. The resulting event history, processing log, and flow context are therefore almost identical and appear in sequential order, even though the event timestamps reveal the concurrent execution of the rails.
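Conceptually, the behavior described above can be sketched with plain asyncio. This is an illustrative sketch only, not the actual `run_flows_in_parallel` implementation; the rail names and delays are made up:

```python
import asyncio
import time

async def rail(name: str, delay: float) -> str:
    # Stand-in for a real rail action (e.g. a self-check LLM call).
    await asyncio.sleep(delay)
    return name

async def run_rails_in_parallel(rail_coros):
    # gather() runs the coroutines concurrently but returns their results
    # in submission order, which is how a sequential-looking history can
    # be produced from concurrent execution.
    return await asyncio.gather(*rail_coros)

async def main():
    start = time.perf_counter()
    results = await run_rails_in_parallel(
        [rail("self check input", 0.1), rail("check blocked input terms", 0.1)]
    )
    return results, time.perf_counter() - start

results, elapsed = asyncio.run(main())
print(results)   # ['self check input', 'check blocked input terms']
print(elapsed)   # ~0.1s rather than 0.2s: the two rails overlapped
```

The result order matches the configured rail order, while the wall-clock time is roughly that of the slowest rail rather than the sum.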

Related Issue(s)

Checklist

  • I've read the CONTRIBUTING guidelines.
  • I've updated the documentation if applicable.
  • I've added tests if applicable.
  • @mentions of the person or team responsible for reviewing proposed changes.

commit 6ab101ce510c52a89f613e183b8ac485f2d98e08
Author: Christian Schüller <[email protected]>
Date:   Thu Jun 19 11:17:10 2025 +0200

    Revert abc bot changes

commit 6c8ff5788bd9de9375a4dcc165a633b540fbb3b1
Author: Christian Schüller <[email protected]>
Date:   Thu Jun 19 11:07:05 2025 +0200

    Add unit tests

commit 43e83fe1df21fa335d3c0afdf7bc90da854594bc
Author: Christian Schüller <[email protected]>
Date:   Thu Jun 19 10:53:56 2025 +0200

    Fix event order in processing log

commit b2ac8c2d4ba0be4dbeb50b076ef0018e1fe1bc53
Author: Christian Schüller <[email protected]>
Date:   Wed Jun 18 15:44:03 2025 +0200

    Fix bug in task cancellation

commit aa53bd4ce857d2dd399a76ef1077dd6e5bfc0418
Author: Christian Schüller <[email protected]>
Date:   Wed Jun 18 15:30:54 2025 +0200

    Handle CancelledError

commit a37c890a40bd13f1263a6d61ef296b5d7ae63e6a
Author: Christian Schüller <[email protected]>
Date:   Wed Jun 18 11:26:54 2025 +0200

    Revert bot config changes

commit 2ebeee89d409bf2d844656176337a887fbf68024
Author: Christian Schüller <[email protected]>
Date:   Wed Jun 18 11:00:02 2025 +0200

    Fix types for Python 3.9

commit 410ee0e3a5eaa3c24749fd291a6d8da4cc3c06d0
Author: Christian Schüller <[email protected]>
Date:   Wed Jun 18 10:17:59 2025 +0200

    Add support for  parallel output rails

commit 5760f9a29cb723f8a9049aef6b8efe45c98e0c23
Author: Christian Schüller <[email protected]>
Date:   Wed Jun 18 10:13:00 2025 +0200

    Fix event history creation

commit 3b9193a391f19e3e22ebe56c4dca23fe416fcbae
Author: Christian Schüller <[email protected]>
Date:   Tue Jun 17 16:51:03 2025 +0200

    Add support for tracing events

commit ba20b5b39f55a366919a8e5bb038ad3dc350b162
Author: Christian Schüller <[email protected]>
Date:   Tue Jun 17 15:39:09 2025 +0200

    Fix tracing

commit b807f28b67c8347593f2b925733ec80a81f67449
Author: Christian Schüller <[email protected]>
Date:   Mon Jun 16 13:43:50 2025 +0200

    Improve parallel task handling

commit 2516d4f030ff6dcfd64d39708eb2bb36aa36bbc0
Author: Christian Schüller <[email protected]>
Date:   Fri Jun 13 11:54:36 2025 +0200

    Add support for rail parameters

commit b70942d
Author: Razvan Dinu <[email protected]>
Date:   Wed Jun 4 15:11:52 2025 +0300

    feat: poc for parallel rails

    The approach is to add a custom action called `run_flows_in_parallel` which can be used for both input and output rails.
    Small tweak needed for the logic for starting new flows.
    Only basic manual testing done to check that two self-check input flows run in parallel.

@Copilot (Copilot AI) left a comment


Pull Request Overview

This PR adds support for parallel execution of input and output rails via a new run_flows_in_parallel action.

  • Introduces a parallel flag in the rails config to toggle between sequential and parallel processing.
  • Implements _run_flows_in_parallel, _run_input_rails_in_parallel, and _run_output_rails_in_parallel in the runtime, and registers them as actions.
  • Adds end-to-end tests and test configs that verify success and failure behaviors under parallel rail execution.

Reviewed Changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.

File | Description
tests/test_parallel_rails.py | New async tests covering parallel-rails success and failure paths
tests/test_configs/parallel_rails/config.yml | Bot config with parallel: True and lists of rails to run
tests/test_configs/parallel_rails/blocked_terms.co | Definitions for blocked-term subflows
tests/test_configs/parallel_rails/prompts.yml | Prompt templates for self-check input/output
tests/test_configs/parallel_rails/actions.py | Simulated actions with delays to test concurrency
nemoguardrails/rails/llm/llm_flows.co | Branches on parallel flag to call new parallel-run actions
nemoguardrails/rails/llm/config.py | Adds parallel field to InputRails model
nemoguardrails/logging/processing_log.py | Extends ignored actions to hide parallel-run helpers
nemoguardrails/colang/v1_0/runtime/runtime.py | Registers the new parallel-run actions and unpacks history events
nemoguardrails/colang/v1_0/runtime/flows.py | Updates flow-start logic to respect explicit start_flow events

Comments suppressed due to low confidence (1)

tests/test_configs/parallel_rails/config.yml:30

  • The third input rail is duplicated here instead of the intended 'generate user intent' subflow, which will cause the test to fail at index 2. Please replace this entry with the correct flow name.
      - check blocked input terms

@codecov-commenter commented Jun 19, 2025

Codecov Report

Attention: Patch coverage is 93.63636% with 7 lines in your changes missing coverage. Please review.

Project coverage is 70.38%. Comparing base (5f9974a) to head (34ea153).
Report is 24 commits behind head on develop.

Files with missing lines Patch % Lines
nemoguardrails/colang/v1_0/runtime/runtime.py 92.47% 7 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #1234      +/-   ##
===========================================
+ Coverage    68.65%   70.38%   +1.73%     
===========================================
  Files          161      161              
  Lines        15978    16129     +151     
===========================================
+ Hits         10969    11353     +384     
+ Misses        5009     4776     -233     
Flag Coverage Δ
python 70.38% <93.63%> (+1.73%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
nemoguardrails/colang/runtime.py 89.74% <100.00%> (+1.86%) ⬆️
nemoguardrails/colang/v1_0/runtime/flows.py 96.01% <100.00%> (+6.08%) ⬆️
nemoguardrails/logging/processing_log.py 100.00% <100.00%> (ø)
nemoguardrails/rails/llm/config.py 90.33% <100.00%> (+0.37%) ⬆️
nemoguardrails/rails/llm/llmrails.py 89.22% <100.00%> (+2.00%) ⬆️
nemoguardrails/colang/v1_0/runtime/runtime.py 86.95% <92.47%> (+1.98%) ⬆️

... and 13 files with indirect coverage changes


@Pouyanpi Pouyanpi added enhancement New feature or request status: in review labels Jun 19, 2025
@Pouyanpi Pouyanpi added this to the v0.15.0 milestone Jun 19, 2025
@Pouyanpi
Collaborator

Thank you @schuellc-nvidia, I'm going to review it gradually 👍🏻

Have you considered adding a test that compares sequential and parallel execution? I see that the following test passes, but it would be great to address the differences in their behavior:

@pytest.mark.asyncio
async def test_sequential_rails_execution():
    # Test sequential execution - rails should run one after another and take longer
    config = RailsConfig.from_path(os.path.join(CONFIGS_FOLDER, "parallel_rails"))
    config.rails.input.parallel = False
    config.rails.output.parallel = False
    chat = TestChat(
        config,
        llm_completions=[
            "No",
            "Hi there! How can I assist you with questions about the ABC Company today?",
            "No",
        ],
    )

    chat >> "hi"
    result = await chat.app.generate_async(messages=chat.history, options=OPTIONS)

    # Assert the response is correct
    assert (
        result
        and result.response[0]["content"]
        == "Hi there! How can I assist you with questions about the ABC Company today?"
    )

    # Check that all rails were executed sequentially
    assert result.log.activated_rails[0].name == "self check input"
    assert (
        result.log.activated_rails[1].name == "check blocked input terms $duration=1.0"
    )
    assert len(result.log.activated_rails[1].executed_actions) == 1
    assert (
        result.log.activated_rails[2].name == "check blocked input terms $duration=1.0"
    )
    assert result.log.activated_rails[3].name == "generate user intent"
    assert result.log.activated_rails[4].name == "self check output"
    assert (
        result.log.activated_rails[5].name == "check blocked output terms $duration=1.0"
    )
    assert (
        result.log.activated_rails[6].name == "check blocked output terms $duration=1.0"
    )
    assert len(result.log.activated_rails[5].executed_actions) == 1

    # Sequential execution should take approximately 2 seconds for each set of rails:
    # Input rails: 1s + 1s = 2s (sequential)
    # Output rails: 1s + 1s = 2s (sequential)
    assert (
        result.log.stats.input_rails_duration >= 2
        and result.log.stats.output_rails_duration >= 2
    ), "Sequential rails processing should take longer than parallel processing."

@Pouyanpi Pouyanpi requested a review from tgasser-nv June 26, 2025 10:23
@schuellc-nvidia
Collaborator Author

@Pouyanpi Regarding your ask for a comparison to sequential rails: I don't think this should be a unit test of the parallel rails. If anything, this should probably be a new test for sequential rails.

@Pouyanpi
Copy link
Collaborator

@Pouyanpi Regarding your ask for a comparison to sequential rails: I don't think this should be a unit test of the parallel rails. If anything, this should probably be a new test for sequential rails.

@schuellc-nvidia I totally hear you that our parallel rails tests shouldn’t get bogged down in sequential details. The actual intent here is to lock in and document the behavioral delta between the two modes so we catch any future regressions:

For example

# sequential
assert result.log.activated_rails[1].name == "check blocked input terms $duration=1.0"
# parallel
assert result.log.activated_rails[1].name == "check blocked input terms"

Happy to move this into its own test to keep the parallel suite focused while still covering this critical diff. And I think it is important that this difference be addressed 👍🏻

@schuellc-nvidia
Collaborator Author

Good catch! The difference was actually an unintended side effect and the last commit fixed this such that sequential and parallel rail calling behave the same.

@tgasser-nv (Collaborator) left a comment

@schuellc-nvidia This PR looks good, but I think we're missing vital coverage of actually calling real-world NVCF functions and App models.

  1. Adapt the config from the safeguard_ai_virtual_assistant_notebook.ipynb: set up a config that calls the NVCF Content Safety, Topic Control, and Jailbreak Detect models as in Step 4 (config.yml).
  2. Adapt the config from this blog post to call AlignScore check facts and Llama Guard check output.

For each of the two tests, please measure the change in latency and check that the outputs are the same. Please add those results in a comment on the MR, and don't merge if anything breaks or you see suspicious latency numbers.

@schuellc-nvidia
Collaborator Author

Rails Configuration Comparison

models:
  - type: main
    engine: openai
    model: gpt-3.5-turbo-instruct

  - type: content_safety
    engine: nim
    model: nvidia/llama-3.1-nemoguard-8b-content-safety
  - type: topic_control
    engine: nim
    model: nvidia/llama-3.1-nemoguard-8b-topic-control

rails:
  input:
    parallel: True
    flows:
      - content safety check input $model=content_safety
      - topic safety check input $model=topic_control
      - self check input
  output:
    parallel: True
    flows:
      - content safety check output $model=content_safety
      - self check output

Parallel Rails Configuration

> hi there
Input rails duration: 0.43209362030029297
Output rails duration: 0.5509295463562012
Hello! It's nice to meet you. My name is Assistant and I'm here to help you with anything you need. How can I assist you today?

> The weater is nice
Input rails duration: 0.6604671478271484
Output rails duration: 0.6846234798431396
That's great to hear! According to the latest weather report, it's currently sunny with a temperature of 75 degrees Fahrenheit. Is there anything else you would like to know about the weather?

> I want to buy a gun
Input rails duration: 0.37376832962036133
Output rails duration: None
I'm sorry, I can't respond to that.

> I don't like you
Input rails duration: 0.3921196460723877
Output rails duration: None
I'm sorry, I can't respond to that.

> tell me something nasty
Input rails duration: 0.6805319786071777
Output rails duration: None
I'm sorry, I can't respond to that.

> Can you help me?
Input rails duration: 0.633216142654419
Output rails duration: 0.5047521591186523
Of course! I am designed to assist you with anything you need. What can I help you with specifically?

> ok, lets do that 
Input rails duration: 0.6267142295837402
Output rails duration: 0.6642739772796631
Sure thing! What would you like to do? I can help you with tasks such as setting reminders, making appointments, or even just providing information on various topics. Just let me know what you need and I'll do my best to assist you.

> Tell me something about your products
Input rails duration: 0.6722187995910645
Output rails duration: 0.5426013469696045
I would be happy to tell you about our products! We have a wide range of products available, including smart home devices, virtual assistants, and productivity tools. Our products are designed to make your life easier and more efficient. Is there a specific product 
you would like to know more about?

Sequential Rails Configuration

> hi there
Input rails duration: 1.2866036891937256
Output rails duration: 1.032092809677124
Hello! It's great to meet you. My name is Assistant and I am here to help you with anything you need. How can I assist you today?

> The weater is nice
Input rails duration: 1.0234577655792236
Output rails duration: 1.042959451675415
That's wonderful to hear! According to my sources, the weather in your location is currently sunny with a high of 75 degrees Fahrenheit and a low of 62 degrees Fahrenheit. Would you like me to provide you with any other weather updates or information?

> I want to buy a gun
Input rails duration: 0.5717439651489258
Output rails duration: None
I'm sorry, I can't respond to that.

> I don't like you
Input rails duration: 0.7388792037963867
Output rails duration: None
I'm sorry, I can't respond to that.

> tell me something nasty
Input rails duration: 1.147322177886963
Output rails duration: None
I'm sorry, I can't respond to that.

> Can you help me?
Input rails duration: 1.311119794845581
Output rails duration: 1.0984859466552734
Of course! I am here to help with anything you need. Please let me know what you need assistance with and I will do my best to provide you with the information or support you need.

> ok, lets do that 
Input rails duration: 1.8776695728302002
Output rails duration: 1.1284019947052002
Great! What would you like me to help you with? Are you looking for information, recommendations, or something else?

> Tell me something about your products
Input rails duration: 1.169224739074707
Output rails duration: 0.9357724189758301
I would be happy to tell you about our products! We offer a wide range of AI-powered tools and services to help businesses and individuals optimize their workflows, improve efficiency, and reach their goals. Our products include chatbots, virtual assistants, data 
analytics tools, and more. Is there a specific product you are interested in learning more about?
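As a rough summary of the latency change tgasser-nv asked for, averaging the input-rail durations reported in the two transcripts above (values rounded to three decimals) gives about a 2x improvement. This is a back-of-the-envelope check, not a rigorous benchmark:

```python
from statistics import mean

# Input-rail durations (seconds) copied from the transcripts above.
parallel = [0.432, 0.660, 0.374, 0.392, 0.681, 0.633, 0.627, 0.672]
sequential = [1.287, 1.023, 0.572, 0.739, 1.147, 1.311, 1.878, 1.169]

speedup = mean(sequential) / mean(parallel)
print(round(speedup, 2))  # → 2.04: roughly half the input-rail latency
```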

@tgasser-nv (Collaborator) left a comment

Thanks for adding the local performance test values, this looks great.

Comment on lines 372 to 387
if has_stop:
    stopped_task_results = task_results[flow_id] + result

    del unique_flow_ids[flow_id]
    # Cancel all remaining tasks
    for pending_task in tasks:
        # Don't include results and processing logs for cancelled or stopped tasks
        if (
            flow_id in unique_flow_ids
            and pending_task != unique_flow_ids[flow_id]
            and not pending_task.done()
        ):
            # Cancel the task if it is not done
            pending_task.cancel()
            del unique_flow_ids[flow_id]
    break
@Pouyanpi (Collaborator) commented Jul 11, 2025

I might not have the full context here, but when a flow emits a stop event, the code deletes `unique_flow_ids[flow_id]` and then immediately checks `if flow_id in unique_flow_ids` in the cancellation loop. Since we just deleted that entry, the condition will always be False, no? That prevents the cancellation code from ever running.

Is this intentional, or should we be cancelling the other tasks when one rail stops? The tests seem to expect cancellation to work, but currently other rails would continue running even after a stop event?
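The issue flagged above can be reproduced in isolation. The following is a hypothetical reduction of the reviewed snippet (variable names mirror it, but this is not the actual runtime code): with delete-before-check, the membership test can never succeed, so the sibling task is never cancelled.

```python
import asyncio

async def pending_rail():
    await asyncio.sleep(3600)  # a rail that would otherwise run for a long time

async def stop_and_cancel(delete_first: bool) -> bool:
    flow_id = "self check input"
    stopping = asyncio.create_task(pending_rail())  # the rail that emitted stop
    sibling = asyncio.create_task(pending_rail())   # a still-running sibling rail
    unique_flow_ids = {flow_id: stopping}
    cancelled_sibling = False

    if delete_first:
        del unique_flow_ids[flow_id]  # the order flagged in the review
    for pending_task in (stopping, sibling):
        if (
            flow_id in unique_flow_ids                # always False if deleted above
            and pending_task is not unique_flow_ids[flow_id]
            and not pending_task.done()
        ):
            pending_task.cancel()
            cancelled_sibling = True
    if not delete_first and flow_id in unique_flow_ids:
        del unique_flow_ids[flow_id]  # fixed order: remove the entry afterwards

    for t in (stopping, sibling):     # clean up so the event loop can exit
        t.cancel()
    await asyncio.gather(stopping, sibling, return_exceptions=True)
    return cancelled_sibling

buggy = asyncio.run(stop_and_cancel(delete_first=True))
fixed = asyncio.run(stop_and_cancel(delete_first=False))
print(buggy, fixed)  # False True: with delete-first, nothing ever gets cancelled
```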

Collaborator Author

Good catch. Pushed a fix.

Labels: enhancement (New feature or request), status: in review
4 participants