RHOAIENG-37346: Refactor Guardrails for Safety JTBD #1036
@@ -0,0 +1,52 @@
:_module-type: ASSEMBLY

ifdef::context[:parent-context: {context}]
[id="enabling-ai-safety-with-guardrails_{context}"]
= Enabling AI safety with Guardrails

The TrustyAI Guardrails Orchestrator service is a tool to invoke detections on text generation inputs and outputs, as well as standalone detections.
Contributor (Author): Do we want to leave this mention of "TrustyAI Guardrails" here? Or remove the "TrustyAI" bit?

Contributor: So, in the spirit of decoupling function from tool: whenever we have to refer to something concrete, we can reference the tool that does the function. The phrasing looks okay here, but it will be important to always communicate "tool x makes function y happen". In this case, maybe a blurb on how the TrustyAI Orchestrator service helps with enabling AI safety with guardrails.

Contributor: So, a potential updated text might read: "The Guardrails Orchestrator is a service included in TrustyAI to perform detections on text generation inputs and/or outputs."?
It is underpinned by the open-source project link:https://github.com/foundation-model-stack/fms-guardrails-orchestrator[FMS-Guardrails Orchestrator] from IBM. You can deploy the Guardrails Orchestrator service through a Custom Resource Definition (CRD) that is managed by the TrustyAI Operator.
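For orientation, a minimal `GuardrailsOrchestrator` custom resource might look like the following sketch. The resource name, namespace, and referenced ConfigMap name are placeholders, and the full set of supported fields is described in the deployment module later in this assembly:

[source,yaml]
----
apiVersion: trustyai.opendatahub.io/v1alpha1   # API group managed by the TrustyAI Operator (assumed version)
kind: GuardrailsOrchestrator
metadata:
  name: guardrails-orchestrator                # placeholder name
  namespace: my-project                        # placeholder namespace
spec:
  orchestratorConfig: fms-orchestr8-config-nlp # ConfigMap that holds the Orchestrator configuration (placeholder name)
  replicas: 1
----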
The following sections describe the Guardrails components, explain how to deploy them, and provide example use cases that show how to protect your AI applications by using these tools:
Understanding detectors::
Explore the available detector types in the Guardrails framework. Currently supported detectors are:
- The built-in detector: Out-of-the-box guardrailing algorithms for quick setup and easy experimentation.
- Hugging Face detectors: Text classification models for guardrailing, such as link:https://huggingface.co/ibm-granite/granite-guardian-hap-38m[ibm-granite/granite-guardian-hap-38m] or any other text classifier from Hugging Face.

Configuring the Orchestrator::
Configure the Orchestrator to communicate with the available detectors and your generation model; a rough configuration sketch follows this list.

Configuring the Guardrails Gateway::
Define preset guardrail pipelines with corresponding unique endpoints.

Deploying the Orchestrator::
Create a Guardrails Orchestrator to begin securing your Large Language Model (LLM) deployments.

Automatically configuring Guardrails using `AutoConfig`::
Automatically configure Guardrails based on available resources in your namespace.

Monitoring user inputs to your LLM::
Enable a safer LLM by filtering hateful, profane, or toxic inputs.

Enabling the OpenTelemetry exporter for metrics and tracing::
Provide observability for the security and governance mechanisms of AI applications.
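The configuration sketch referenced in the list above gives a rough idea of what the Orchestrator ConfigMap contains. The structure follows the upstream FMS-Guardrails Orchestrator configuration format; the ConfigMap name, service hostnames, ports, and detector name shown here are placeholders, not required values:

[source,yaml]
----
kind: ConfigMap
apiVersion: v1
metadata:
  name: fms-orchestr8-config-nlp   # placeholder; matches the ConfigMap referenced by the GuardrailsOrchestrator resource
data:
  config.yaml: |
    chat_generation:
      service:
        hostname: llm-predictor.my-project.svc.cluster.local  # placeholder generation model service
        port: 8080
    detectors:
      regex:                        # illustrative detector entry
        type: text_contents
        service:
          hostname: "127.0.0.1"     # placeholder; adjust to where the detector is reachable
          port: 8080
        chunker_id: whole_doc_chunker
        default_threshold: 0.5
----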
include::modules/guardrails-orchestrator-detectors.adoc[leveloffset=+1]

[role='_additional-resources']
.Additional resources
ifndef::upstream[]
* To learn how to use the built-in detectors with the `trustyai_fms` Orchestrator server external provider for Llama Stack to detect PII, see link:{rhoaidocshome}{default-format-url}/monitoring_data_science_models#detecting-pii-by-using-guardrails-with-llama-stack[Detecting personally identifiable information (PII) by using Guardrails with Llama Stack].
endif::[]
include::modules/guardrails-configuring-the-hugging-face-detector-serving-runtime.adoc[leveloffset=+2]
include::modules/guardrails-orchestrator-configmap-parameters.adoc[leveloffset=+1]
include::modules/guardrails-gateway-config-parameters.adoc[leveloffset=+1]
include::modules/guardrails-deploying-the-guardrails-orchestrator-service.adoc[leveloffset=+1]
include::modules/guardrails-auto-config.adoc[leveloffset=+1]
include::modules/guardrails-configuring-the-opentelemetry-exporter.adoc[leveloffset=+1]

ifdef::parent-context[:context: {parent-context}]
ifndef::parent-context[:!context:]
@@ -0,0 +1,26 @@
:_module-type: ASSEMBLY

ifdef::context[:parent-context: {context}]
[id="using-guardrails-for-ai-safety_{context}"]
= Using Guardrails for AI safety
Use the Guardrails tools to ensure the safety and security of your generative AI applications in production.

== Detecting PII and sensitive data
Contributor: You've added 'include' statements but not explained what they are?

Contributor (Author): Hmm, I'm not seeing these raw includes in the preview. What are you using to preview? Or are you intending that I should add a short description under each heading? I've taken that as the action item from this comment, since I don't see the rendering issue you're describing in the preview I'm looking at: https://opendatahub-documentation--1036.org.readthedocs.build/en/1036/enabling-ai-safety/index.html#using-guardrails-for-ai-safety_safety
Protect user privacy by identifying and filtering personally identifiable information (PII) in LLM inputs and outputs using built-in regex detectors or custom detection models.
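As a rough sketch of how a preset PII pipeline could be expressed, the following Guardrails Gateway configuration fragment defines a `pii` route backed by the built-in regex detector. The detector and route names, the regex pattern identifiers, and the orchestrator host and port are illustrative assumptions; see the included modules for the exact request and configuration formats:

[source,yaml]
----
orchestrator:
  host: "localhost"      # placeholder; where the Gateway reaches the Orchestrator
  port: 8032
detectors:
  - name: regex          # built-in regex detector (entry shape is an assumption)
    input: true
    output: true
    detector_params:
      regex:
        - email          # illustrative pattern identifiers
        - ssn
routes:
  - name: pii            # exposed as its own guardrailed endpoint
    detectors:
      - regex
  - name: passthrough    # no detections applied
    detectors: []
----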
include::modules/detecting-pii-by-using-guardrails-with-llama-stack.adoc[leveloffset=+1]
include::modules/guardrails-filtering-flagged-content-by-sending-requests-to-the-regex-detector.adoc[leveloffset=+1]
== Securing prompts
Prevent malicious prompt injection attacks by using specialized detectors to identify and block potentially harmful prompts before they reach your model.
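To give a sense of how a prompt injection detector plugs in, the fragment below adds a Hugging Face detector entry to the Orchestrator configuration sketched earlier. The detector name, service hostname, and port are placeholders for whatever your detector deployment ends up using; the included module describes the actual deployment steps:

[source,yaml]
----
detectors:
  prompt-injection:                 # placeholder detector name
    type: text_contents
    service:
      hostname: prompt-injection-detector-predictor.my-project.svc.cluster.local  # placeholder detector service
      port: 8000                    # placeholder port
    chunker_id: whole_doc_chunker
    default_threshold: 0.5
----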
include::modules/mitigating-prompt-injection-by-using-a-hugging-face-prompt-injection-detector.adoc[leveloffset=+1]
== Moderating and safeguarding content
Filter toxic, hateful, or profane content from user inputs and model outputs to maintain safe and appropriate AI interactions.
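For example, and only as an assumption about how such a pipeline might be named, a `hap` route could be added to the Guardrails Gateway configuration sketched earlier, backed by a Hugging Face HAP detector such as ibm-granite/granite-guardian-hap-38m:

[source,yaml]
----
detectors:
  - name: hap            # placeholder name for a Hugging Face HAP detector registered with the Orchestrator
    input: true
    output: true
    detector_params: {}
routes:
  - name: hap            # exposes a dedicated guardrailed endpoint for HAP filtering
    detectors:
      - hap
----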
include::modules/detecting-hateful-and-profane-language.adoc[leveloffset=+1]
include::modules/guardrails-enforcing-configured-safety-pipelines-for-llm-inference-using-guardrails-gateway.adoc[leveloffset=+1]

ifdef::parent-context[:context: {parent-context}]
ifndef::parent-context[:!context:]
@@ -0,0 +1,20 @@
---
layout: docs
title: Enabling AI safety
permalink: /docs/enabling-ai-safety
custom_css: asciidoc.css
---
//:self-managed:
:upstream:
include::_artifacts/document-attributes-global.adoc[]

:doctype: book
:toc: left
:compat-mode:
:context: safety

= Enabling AI safety

include::assemblies/enabling-ai-safety-with-guardrails.adoc[leveloffset=+1]

include::assemblies/using-guardrails-for-ai-safety.adoc[leveloffset=+1]
@@ -1,11 +1,11 @@
:_module-type: PROCEDURE

ifdef::context[:parent-context: {context}]
-[id="guardrails-orchestrator-hap-scenario_{context}"]
-= Monitoring user inputs with the Guardrails Orchestrator service
+[id="detecting-hateful-and-profane-language_{context}"]
Contributor: You've updated a number of titles to include the word 'guardrails' at the start of the actual title. This is not in keeping with our contributor guidelines. However, it is most likely incidental, given that we are switching to DITA soon.

Contributor (Author): I'm going to disregard this comment because it's a really deep/tedious change at this point to update all the files.
+= Detecting hateful and profane language
[role='_abstract']

-The following example demonstrates how to use Guardrails Orchestrator to monitor user inputs to your LLM, specifically to protect against hateful and profane language (HAP). A comparison query without the detector enabled shows the differences in responses when guardrails is disabled versus enabled.
+The following example demonstrates how to use Guardrails Orchestrator to monitor user inputs to your LLM, specifically to detect and protect against hateful and profane language (HAP). A comparison query without the detector enabled shows the differences in responses when guardrails is disabled versus enabled.
.Prerequisites

@@ -21,7 +21,7 @@ ifdef::cloud-service[]
endif::[]

ifdef::upstream[]
-* You have deployed the Guardrails Orchestrator and related detectors. For more information, see link:{odhdocshome}/monitoring_data_science_models/#deploying-the-guardrails-orchestrator-service_monitor[Deploying the Guardrails Orchestrator].
+* You have deployed the Guardrails Orchestrator and related detectors. For more information, see link:{odhdocshome}/enabling-ai-safety#deploying-the-guardrails-orchestrator-service_safety[Deploying the Guardrails Orchestrator].
Contributor: Suggestion: "and associated detectors" instead of "related".

Contributor: I think "related" is fine, just my 2 cents.
endif::[]


Uh oh!
There was an error while loading. Please reload this page.