Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
45 changes: 0 additions & 45 deletions assemblies/configuring-the-guardrails-orchestrator-service.adoc

This file was deleted.

52 changes: 52 additions & 0 deletions assemblies/enabling-ai-safety-with-guardrails.adoc
Original file line number Diff line number Diff line change
@@ -0,0 +1,52 @@
:_module-type: ASSEMBLY

ifdef::context[:parent-context: {context}]
[id="enabling-ai-safety-with-guardrails_{context}"]
= Enabling AI safety with Guardrails

The TrustyAI Guardrails Orchestrator service is a tool to invoke detections on text generation inputs and outputs, as well as standalone detections.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to leave this mention of "TrustyAI Guardrails" here? Or remove the "TrustyAI" bit?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So in the spirit of decoipling function from tool. Whenever we have to refer to something concrete, we can reference the tool that does the function.

In this case, the phrasing looks okay here but ti will be important to always communicate "tool x makes function y happen" in this case, maybe a blurb on how the trustyAI orchestratore service helps with enabling AI safety with guardrails.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, a potential updated text might read: " The Guardrails Orchestrator is a service included in TrustyAI to perform detections on text generation inputs and/or outputs." ?


It is underpinned by the open-source project link:https://github.com/foundation-model-stack/fms-guardrails-orchestrator[FMS-Guardrails Orchestrator] from IBM. You can deploy the Guardrails Orchestrator service through a Custom Resource Definition (CRD) that is managed by the TrustyAI Operator.

The following sections describe the Guardrails components, how to deploy them and provide example use cases of how to protect your AI applications using these tools:

Understanding detectors::
Explore the available detector types in the Guardrails framework. Currently supported detectors are:
- The built-in detector: Out-of-the-box guardrailing algorithms for quick setup and easy experimentation.
- Hugging Face detectors: Text classification models for guardrailing, such as link:https://huggingface.co/ibm-granite/granite-guardian-hap-38m[ibm-granite/granite-guardian-hap-38m] or any other text classifier from Hugging Face.
Configuring the Orchestrator::
Configure the Orchestrator to communicate with available detectors and your generation model.

Configuring the Guardrails Gateway::
Define preset guardrail pipelines with corresponding unique endpoints.

Deploying the Orchestrator::
Create a Guardrails Orchestrator to begin securing your Large Language Model (LLM) deployments.

Automatically configuring Guardrails using `AutoConfig`::
Automatically configure Guardrails based on available resources in your namespace.

Monitoring user-inputs to your LLM::
Enable a safer LLM by filtering hateful, profane, or toxic inputs.

Enabling the OpenTelemetry exporter for metrics and tracing::
Provide observability for the security and governance mechanisms of AI applications.

include::modules/guardrails-orchestrator-detectors.adoc[leveloffset=+1]

[role='_additional-resources']
.Additional resources
ifndef::upstream[]
* To learn how to use the built-in detectors with `trustyai_fms` Orchestrator server external provider for Llama Stack to detect PII, see link:{rhoaidocshome}{default-format-url}/monitoring_data_science_models#detecting-pii-by-using-guardrails-with-llama-stack[Detecting personally identifiable information (PII) by using Guardrails with Llama Stack].
endif::[]

include::modules/guardrails-configuring-the-hugging-face-detector-serving-runtime.adoc[leveloffset=+2]
include::modules/guardrails-orchestrator-configmap-parameters.adoc[leveloffset=+1]
include::modules/guardrails-gateway-config-parameters.adoc[leveloffset=+1]
include::modules/guardrails-deploying-the-guardrails-orchestrator-service.adoc[leveloffset=+1]
include::modules/guardrails-auto-config.adoc[leveloffset=+1]
include::modules/guardrails-configuring-the-opentelemetry-exporter.adoc[leveloffset=+1]


ifdef::parent-context[:context: {parent-context}]
ifndef::parent-context[:!context:]
26 changes: 26 additions & 0 deletions assemblies/using-guardrails-for-ai-safety.adoc
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
:_module-type: ASSEMBLY

ifdef::context[:parent-context: {context}]
[id="using-guardrails-for-ai-safety_{context}"]
= Using Guardrails for AI safety
Use the Guardrails tools to ensure the safety and security of your generative AI applications in production.

== Detecting PII and sensitive data
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You've added 'include' statements but not explained what they are?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As such they render in the preview like this:
Image

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm I'm not seeing these raw includes in the preview:
image

What are you using to preview? Or are you intending that I should add a short description under each heading (which I've taken as the action item from this comment since I don't see the rendering issue you're describing in the preview I'm looking at https://opendatahub-documentation--1036.org.readthedocs.build/en/1036/enabling-ai-safety/index.html#using-guardrails-for-ai-safety_safety)

Protect user privacy by identifying and filtering personally identifiable information (PII) in LLM inputs and outputs using built-in regex detectors or custom detection models.

include::modules/detecting-pii-by-using-guardrails-with-llama-stack.adoc[leveloffset=+1]
include::modules/guardrails-filtering-flagged-content-by-sending-requests-to-the-regex-detector.adoc[leveloffset=+1]

== Securing prompts
Prevent malicious prompt injection attacks by using specialized detectors to identify and block potentially harmful prompts before they reach your model.

include::modules/mitigating-prompt-injection-by-using-a-hugging-face-prompt-injection-detector.adoc[leveloffset=+1]

== Moderating and safeguarding content
Filter toxic, hateful, or profane content from user inputs and model outputs to maintain safe and appropriate AI interactions.

include::modules/detecting-hateful-and-profane-language.adoc[leveloffset=+1]
include::modules/guardrails-enforcing-configured-safety-pipelines-for-llm-inference-using-guardrails-gateway.adoc[leveloffset=+1]

ifdef::parent-context[:context: {parent-context}]
ifndef::parent-context[:!context:]
2 changes: 1 addition & 1 deletion assemblies/using-llama-stack-with-trustyai.adoc
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,7 @@ The following sections describe how to work with Llama Stack and provide example

include::modules/using-llama-stack-external-evaluation-provider-with-lm-evaluation-harness-in-TrustyAI.adoc[leveloffset=+1]
include::modules/running-custom-evaluations-with-LMEval-and-llama-stack.adoc[leveloffset=+1]
include::modules/using-guardrails-orchestrator-with-llama-stack.adoc[leveloffset=+1]
include::modules/detecting-pii-by-using-guardrails-with-llama-stack.adoc[leveloffset=+1]


ifdef::parent-context[:context: {parent-context}]
Expand Down
20 changes: 20 additions & 0 deletions enabling-ai-safety.adoc
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
---
layout: docs
title: Enabling AI safety
permalink: /docs/enabling-ai-safety
custom_css: asciidoc.css
---
//:self-managed:
:upstream:
include::_artifacts/document-attributes-global.adoc[]

:doctype: book
:toc: left
:compat-mode:
:context: safety

= Enabling AI safety

include::assemblies/enabling-ai-safety-with-guardrails.adoc[leveloffset=+1]

include::assemblies/using-guardrails-for-ai-safety.adoc[leveloffset=+1]
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@ ifdef::context[:parent-context: {context}]
= Configuring the built-in detector and guardrails gateway
[role='_abstract']

The built-in detectors and guardrails gateway are sidecar containers that you can deploy with the `GuardrailsOrchestrator` service, either individually or together. Use the `GuardrailsOrchestrator` custom resource (CR) to enable them. This example uses the regex built-in detector to demonstrate the process.
The built-in detectors and guardrails gateway are sidecar containers that you can deploy with the `GuardrailsOrchestrator` service, either individually or together. Use the `GuardrailsOrchestrator` custom resource (CR) to enable them.

.Prerequisites
* You have cluster administrator privileges for your {productname-short} cluster.
Expand All @@ -24,30 +24,6 @@ endif::[]
* You have a large language model (LLM) for chat generation or text classification, or both, deployed in your namespace.

.Procedure

. Define a `ConfigMap` object in a YAML file to specify the `regexDetectorImage`. For example, create a YAML file called `regex_image_cm.yaml` with the following content:
+
.Example `regex_gateway_images_cm.yaml`
[source,yaml]
----
apiVersion: v1
kind: ConfigMap
metadata:
name: gorch-regex-gateway-image-config
data:
regexDetectorImage: 'quay.io/repository/trustyai/regex-detector@sha256:efab6cd8b637b9c35d311aaf639dfedee7d28de3ee07b412ab473deadecd3606' <1>
GatewayImage: 'quay.io/repository/trustyai/vllm-orchestrator-gateway@sha256:c511b386d61a728acdfe8a1ac7a16b3774d072dd053718e5b9c5fab0f025ac3b' <2>
----
<1> The regex detector is a sidecar image that provides regex-based detections.
<2> The guardrails gateway is a sidecar image that emulates the vLLM chat completions API and saves preset detector configurations.

. Deploy the `regex_gateway_images_cm.yaml` config map:
+
[source,terminal]
----
$ oc apply -f regex_gateway_images_cm.yaml -n <TEST_NAMESPACE>
----

. Define the guardrails gateway `ConfigMap` object to specify the `detectors` and `routes`. For example, create a YAML file called `detectors_cm.yaml` with the following contents:
+
.Example `detectors_cm.yaml`
Expand All @@ -61,12 +37,10 @@ metadata:
app: fmstack-nlp
data:
config.yaml: |
orchestrator: <1>
host: "localhost"
port: 8032
detectors: <2>
detectors: <1>
- name: regex_language
input: true <3>
server: built_in
input: true <2>
output: true
detector_params:
regex:
Expand All @@ -80,18 +54,17 @@ data:
- $CUSTOM_REGEX
- name: hap
detector_params: {}
routes: <4>
routes: <3>
- name: all
detectors:
- regex_language
- hap
- name: passthrough
detectors:
----
<1> The orchestrator service.
<2> A list of preconfigured regular expressions for common detection actions. These regular expressions detect personal identifying information, such as `email` and `credit-card`.
<3> The detector will be used for both input and output.
<4> The resulting endpoints for the detectors. For example, `pii` is served at `$GUARDRAILS_GATEWAY_URL/pii/v1/chat/completions` and uses the `regex` detector. The `passthrough` preset does not use any detectors.
<1> A list of preconfigured regular expressions for common detection actions. These regular expressions detect personal identifying information, such as `email` and `credit-card`.
<2> The detector will be used for both input and output.
<3> The resulting endpoints for the detectors. For example, `pii` is served at `$GUARDRAILS_GATEWAY_URL/pii/v1/chat/completions` and uses the `regex` detector. The `passthrough` preset does not use any detectors.

. Deploy the guardrails gateway `detectors_cm.yaml` config map:
+
Expand Down
83 changes: 68 additions & 15 deletions modules/deploying-the-guardrails-orchestrator-service.adoc
Original file line number Diff line number Diff line change
Expand Up @@ -26,10 +26,11 @@ ifdef::upstream[]
* You have a large language model (LLM) for chat generation or text classification, or both, deployed in your namespace.


.Procedure
. Define a `ConfigMap` object in a YAML file to specify the `chat_generation` and `detectors` services. For example, create a file named `orchestrator_cm.yaml` with the following content:
.Creating an Orchestrator Configuration
. Define a `ConfigMap` object in a YAML file to provide your Guardrails Orchestrator configuration. Here is an example
version of this file:
+
.Example `orchestrator_cm.yaml`
.Example `orchestrator_configmap.yaml`
[source,yaml]
----
kind: ConfigMap
Expand All @@ -39,28 +40,80 @@ metadata:
data:
config.yaml: |
chat_generation: <1>
service:
hostname: <CHAT_GENERATION_HOSTNAME>
port: the generation service port (for example 8033)

detectors: <2>
regex_language:
service: <2>
hostname: <Chat generation hostname>
port: <Generation service port> (for example 8080)
tls: generation-model-tls
detectors: <3>
built_in: <4>
type: text_contents
service:
hostname: "127.0.0.1"
port: 8080
chunker_id: whole_doc_chunker
default_threshold: 0.5
hap:
some-other-detector:
type: text_contents
service:
hostname: guardrails-detector-ibm-hap-predictor.model-namespace.svc.cluster.local
port: the generation service port (for example 8000)
hostname: <other detector hostname>
port: <detector server port> (for example 8000)
tls: some-other-detector-tls
chunker_id: whole_doc_chunker
default_threshold: 0.5
----
<1> A service for chat generation referring to a deployed LLM in your namespace where you are adding guardrails.
<2> A list of services responsible for running detection of a certain class of content on text spans.
tls: <5>
- generation-model-tls:
cert_path: /etc/tls/<Path 1>/tls.crt
key_path: /etc/tls/<Path 1>/tls.key
ca_path: /etc/tls/ca/service-ca.crt
- some-other-detector-tls:
cert_path: /etc/tls/<Path 2>/tls.crt
key_path: /etc/tls/<Path 2>/tls.key
ca_path: /etc/tls/ca/service-ca.crt
passthrough_headers: <6>
- "authorization"
- "content-type"
----
<1> The `chat_generation` section describes The generation model to guardrail
<2> A service configuration - throughout the orchestrator config, all external services are described via the
service configuration, which contains the following fields:
* `hostname` - The hostname of the service
* `port` - The port of the service
* `tls` *(Optional)* - The name of the TLS configuration (specified later in the configuration) to use for this service. If provided, the orchestrator
will communicate with this service via HTTPS.
<3> The `detectors` section is where the detector servers available to the orchestrator are specified. Provide some unique name for the detector server as the key
to each entry, and then the following values are required:
* `type` - The kind of detector server. For now, the only supported kind within RHOAI is `text_contents`
* `service` - The service configuration for the detector server, see <2> for details. Note, if you want to use the built-in detector, the service configuration should always be

service:
hostname: "127.0.0.1"
port: 8080

* `chunker_id`- The chunker to use for this detector server. For now, the only supported chunker is `whole_doc_chunker`
* `default_threshold`- The threshold to pass to the detector server. The threshold can be used by the detector servers to determine their sensitivity, and recommended values
will vary by detector algorithm. We recommend keeping this at `0.5` as a safe starting point.
<4> Each key in the detector section will define the *name* of the detector server. You'll need to reference these names later, so pick memorable and descriptive names. Here, we've used `built_in` and `some-other-detector` as the names of the two detector servers we are configuring in this example.
<5> The `tls` section defines TLS configurations. The names of these configurations can then be used as values within `service.tls` in your service configurations (see <2>).
A TLS configuration consists of the following fields:
* `cert_path` - The path to a `.crt` file inside the Guardrails Orchestrator container.
* `key_path` - The path to a `.key` file inside the Guardrails Orchestrator container.
* `ca_path` - The path to CA certificate `.crt` file on the Guardrails Orchestrator container. The default Openshift Serving CA will be mounted at `/etc/tls/ca/service-ca.crt`, we recommend using this as your `ca_path`.
+
See the <<tlsSecrets-param,`tlsSecrets`>> section of the GuardrailsOrchestrator Custom Resource to learn how to mount custom TLS files into the Guardrails Orchestrator container.
<6> The `passthrough_headers` section defines which headers from your requests to the Guardrails Orchestrator get sent onwards to
the various services specified in this configuration. If you want to ensure that the Orchestrator can talk to authenticated services, we recommend specifying `"authorization"` and `"content-type"` as `passthrough_headers`.











---

. Deploy the `orchestrator_cm.yaml` config map:
+
Expand Down
Original file line number Diff line number Diff line change
@@ -1,11 +1,11 @@
:_module-type: PROCEDURE

ifdef::context[:parent-context: {context}]
[id="guardrails-orchestrator-hap-scenario_{context}"]
= Monitoring user inputs with the Guardrails Orchestrator service
[id="detecting-hateful-and-profane-language_{context}"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You've updated a number of titles to include the word 'guardrails' at the start of the title actual. This is not in keeping with our Contributor guidelines. However, it is most likely incidental given that we are switching to DITA soon.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to disregard this comment because its a really deep/tedious change at this point to update all the files

= Detecting hateful and profane language
[role='_abstract']

The following example demonstrates how to use Guardrails Orchestrator to monitor user inputs to your LLM, specifically to protect against hateful and profane language (HAP). A comparison query without the detector enabled shows the differences in responses when guardrails is disabled versus enabled.
The following example demonstrates how to use Guardrails Orchestrator to monitor user inputs to your LLM, specifically to detect and protect against hateful and profane language (HAP). A comparison query without the detector enabled shows the differences in responses when guardrails is disabled versus enabled.

.Prerequisites

Expand All @@ -21,7 +21,7 @@ ifdef::cloud-service[]
endif::[]

ifdef::upstream[]
* You have deployed the Guardrails Orchestrator and related detectors. For more information, see link:{odhdocshome}/monitoring_data_science_models/#deploying-the-guardrails-orchestrator-service_monitor[Deploying the Guardrails Orchestrator].
* You have deployed the Guardrails Orchestrator and related detectors. For more information, see link:{odhdocshome}/enabling-ai-safety#deploying-the-guardrails-orchestrator-service_safety[Deploying the Guardrails Orchestrator].
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: " and associated detectors' instead of 'related'

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think related is fine, just my 2cents

endif::[]


Expand Down
Loading