docs: Output rails are supported with streaming #1007

Merged 1 commit on May 9, 2025
11 changes: 8 additions & 3 deletions docs/user-guides/advanced/streaming.md
@@ -1,9 +1,11 @@
# Streaming

-To use a guardrails configuration in streaming mode, the following must be met:
+If the application LLM supports streaming, you can configure NeMo Guardrails to stream tokens as well.

-1. The main LLM must support streaming.
-2. There are no output rails.
+For information about configuring streaming with output guardrails, refer to the following:
+
+- For configuration, refer to [streaming output configuration](../../user-guides/configuration-guide.md#streaming-output-configuration).
+- For sample Python client code, refer to [streaming output](../../getting-started/5-output-rails/README.md#streaming-output).

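As a minimal sketch of what the updated passage describes, the following builds a streaming-enabled configuration through the Python API. The YAML content and model choice are illustrative assumptions; the streaming output configuration guide linked above documents the actual output-rail streaming options.

```python
# Illustrative sketch, not the project's official example: load a guardrails
# configuration with token streaming enabled. The model and YAML content are
# assumptions; refer to the configuration guide for the output-rail streaming keys.
from nemoguardrails import LLMRails, RailsConfig

yaml_content = """
models:
  - type: main
    engine: openai
    model: gpt-3.5-turbo  # hypothetical model choice

streaming: True
"""

config = RailsConfig.from_content(yaml_content=yaml_content)
rails = LLMRails(config)
```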
## Configuration

@@ -26,6 +28,7 @@ nemoguardrails chat --config=examples/configs/streaming --streaming
### Python API

You can use streaming directly from the Python API in two ways:

1. Simple: receive just the chunks (tokens).
2. Full: receive both the chunks as they are generated and the full response at the end.

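As an illustration of the first (simple) way, the hedged sketch below iterates over chunks with `stream_async`; the collapsed portion of the file and the demo script referenced below contain the complete examples.

```python
import asyncio

from nemoguardrails import LLMRails, RailsConfig

# Sketch of the "simple" way: receive just the chunks (tokens) as they are generated.
# Assumes a config directory with `streaming: True` set; the path is a placeholder.
config = RailsConfig.from_path("./config")
rails = LLMRails(config)

async def consume_chunks():
    history = [{"role": "user", "content": "What can you do for me?"}]
    async for chunk in rails.stream_async(messages=history):
        print(chunk, end="", flush=True)

asyncio.run(consume_chunks())
```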
@@ -73,9 +76,11 @@ For the complete working example, check out this [demo script](https://github.co
### Server API

To make a call to the NeMo Guardrails Server in streaming mode, you have to set the `stream` parameter to `True` inside the JSON body. For example, to get the completion for a chat session using the `/v1/chat/completions` endpoint:

```
POST /v1/chat/completions
```

```json
{
"config_id": "some_config_id",
```
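As a hedged sketch of a client for this endpoint, the snippet below posts a request with `stream` set to true and prints the response text as it arrives; the server URL, port, and exact chunk framing are assumptions.

```python
# Hedged client sketch: post a streaming request to a locally running
# NeMo Guardrails server and print the response text as it arrives.
# The URL, port, and chunk framing are assumptions for illustration.
import requests

payload = {
    "config_id": "some_config_id",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": True,
}

with requests.post(
    "http://localhost:8000/v1/chat/completions",
    json=payload,
    stream=True,
    timeout=60,
) as response:
    response.raise_for_status()
    for chunk in response.iter_content(chunk_size=None, decode_unicode=True):
        if chunk:
            print(chunk, end="", flush=True)
```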