RHOAIENG-37346: Refactor Guardrails for Safety JTBD #1036
@@ -0,0 +1,52 @@
:_module-type: ASSEMBLY

ifdef::context[:parent-context: {context}]
[id="enabling-ai-safety-with-guardrails_{context}"]
= Enabling AI safety with Guardrails

The TrustyAI Guardrails Orchestrator service is a tool to invoke detections on text generation inputs and outputs, as well as standalone detections.
Contributor (Author): Do we want to leave this mention of "TrustyAI Guardrails" here? Or remove the "TrustyAI" bit?

Contributor: So, in the spirit of decoupling function from tool: whenever we have to refer to something concrete, we can reference the tool that does the function. The phrasing looks okay here, but it will be important to always communicate "tool x makes function y happen". In this case, maybe a blurb on how the TrustyAI Orchestrator service helps with enabling AI safety with guardrails.

Contributor: So, a potential updated text might read: "The Guardrails Orchestrator is a service included in TrustyAI to perform detections on text generation inputs and/or outputs."?
It is underpinned by the open-source project link:https://github.com/foundation-model-stack/fms-guardrails-orchestrator[FMS-Guardrails Orchestrator] from IBM. You can deploy the Guardrails Orchestrator service through a Custom Resource Definition (CRD) that is managed by the TrustyAI Operator.
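For orientation, a minimal `GuardrailsOrchestrator` custom resource might look like the following sketch. The resource name, namespace, and referenced ConfigMap name are placeholders, and the full set of supported fields is described in the deployment module later in this assembly:

[source,yaml]
----
apiVersion: trustyai.opendatahub.io/v1alpha1   # API group managed by the TrustyAI Operator (assumed version)
kind: GuardrailsOrchestrator
metadata:
  name: guardrails-orchestrator                # placeholder name
  namespace: my-project                        # placeholder namespace
spec:
  orchestratorConfig: fms-orchestr8-config-nlp # ConfigMap that holds the Orchestrator configuration (placeholder name)
  replicas: 1
----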
The following sections describe the Guardrails components, explain how to deploy them, and provide example use cases that show how to protect your AI applications by using these tools:
Understanding detectors::
Explore the available detector types in the Guardrails framework. Currently supported detectors are:
- The built-in detector: Out-of-the-box guardrailing algorithms for quick setup and easy experimentation.
- Hugging Face detectors: Text classification models for guardrailing, such as link:https://huggingface.co/ibm-granite/granite-guardian-hap-38m[ibm-granite/granite-guardian-hap-38m] or any other text classifier from Hugging Face.

Configuring the Orchestrator::
Configure the Orchestrator to communicate with the available detectors and your generation model; a rough configuration sketch follows this list.

Configuring the Guardrails Gateway::
Define preset guardrail pipelines with corresponding unique endpoints.

Deploying the Orchestrator::
Create a Guardrails Orchestrator to begin securing your Large Language Model (LLM) deployments.

Automatically configuring Guardrails using `AutoConfig`::
Automatically configure Guardrails based on available resources in your namespace.

Monitoring user inputs to your LLM::
Enable a safer LLM by filtering hateful, profane, or toxic inputs.

Enabling the OpenTelemetry exporter for metrics and tracing::
Provide observability for the security and governance mechanisms of AI applications.
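The configuration sketch referenced in the list above gives a rough idea of what the Orchestrator ConfigMap contains. The structure follows the upstream FMS-Guardrails Orchestrator configuration format; the ConfigMap name, service hostnames, ports, and detector name shown here are placeholders, not required values:

[source,yaml]
----
kind: ConfigMap
apiVersion: v1
metadata:
  name: fms-orchestr8-config-nlp   # placeholder; matches the ConfigMap referenced by the GuardrailsOrchestrator resource
data:
  config.yaml: |
    chat_generation:
      service:
        hostname: llm-predictor.my-project.svc.cluster.local  # placeholder generation model service
        port: 8080
    detectors:
      regex:                        # illustrative detector entry
        type: text_contents
        service:
          hostname: "127.0.0.1"     # placeholder; adjust to where the detector is reachable
          port: 8080
        chunker_id: whole_doc_chunker
        default_threshold: 0.5
----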
include::modules/guardrails-orchestrator-detectors.adoc[leveloffset=+1]

[role='_additional-resources']
.Additional resources
ifndef::upstream[]
* To learn how to use the built-in detectors with the `trustyai_fms` Orchestrator server external provider for Llama Stack to detect PII, see link:{rhoaidocshome}{default-format-url}/monitoring_data_science_models#detecting-pii-by-using-guardrails-with-llama-stack[Detecting personally identifiable information (PII) by using Guardrails with Llama Stack].
endif::[]
include::modules/guardrails-configuring-the-hugging-face-detector-serving-runtime.adoc[leveloffset=+2]
include::modules/guardrails-orchestrator-configmap-parameters.adoc[leveloffset=+1]
include::modules/guardrails-gateway-config-parameters.adoc[leveloffset=+1]
include::modules/guardrails-deploying-the-guardrails-orchestrator-service.adoc[leveloffset=+1]
include::modules/guardrails-auto-config.adoc[leveloffset=+1]
include::modules/guardrails-configuring-the-opentelemetry-exporter.adoc[leveloffset=+1]

ifdef::parent-context[:context: {parent-context}]
ifndef::parent-context[:!context:]
@@ -0,0 +1,26 @@
:_module-type: ASSEMBLY

ifdef::context[:parent-context: {context}]
[id="using-guardrails-for-ai-safety_{context}"]
= Using Guardrails for AI safety
Use the Guardrails tools to ensure the safety and security of your generative AI applications in production.

== Detecting PII and sensitive data
Contributor: You've added 'include' statements but not explained what they are?

Contributor (Author): Hmm, I'm not seeing these raw includes in the preview. What are you using to preview? Or are you intending that I should add a short description under each heading? I've taken that as the action item from this comment, since I don't see the rendering issue you're describing in the preview I'm looking at: https://opendatahub-documentation--1036.org.readthedocs.build/en/1036/enabling-ai-safety/index.html#using-guardrails-for-ai-safety_safety
Protect user privacy by identifying and filtering personally identifiable information (PII) in LLM inputs and outputs using built-in regex detectors or custom detection models.
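As a rough sketch of how a preset PII pipeline could be expressed, the following Guardrails Gateway configuration fragment defines a `pii` route backed by the built-in regex detector. The detector and route names, the regex pattern identifiers, and the orchestrator host and port are illustrative assumptions; see the included modules for the exact request and configuration formats:

[source,yaml]
----
orchestrator:
  host: "localhost"      # placeholder; where the Gateway reaches the Orchestrator
  port: 8032
detectors:
  - name: regex          # built-in regex detector (entry shape is an assumption)
    input: true
    output: true
    detector_params:
      regex:
        - email          # illustrative pattern identifiers
        - ssn
routes:
  - name: pii            # exposed as its own guardrailed endpoint
    detectors:
      - regex
  - name: passthrough    # no detections applied
    detectors: []
----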
include::modules/detecting-pii-by-using-guardrails-with-llama-stack.adoc[leveloffset=+1]
include::modules/guardrails-filtering-flagged-content-by-sending-requests-to-the-regex-detector.adoc[leveloffset=+1]
== Securing prompts
Prevent malicious prompt injection attacks by using specialized detectors to identify and block potentially harmful prompts before they reach your model.
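To give a sense of how a prompt injection detector plugs in, the fragment below adds a Hugging Face detector entry to the Orchestrator configuration sketched earlier. The detector name, service hostname, and port are placeholders for whatever your detector deployment ends up using; the included module describes the actual deployment steps:

[source,yaml]
----
detectors:
  prompt-injection:                 # placeholder detector name
    type: text_contents
    service:
      hostname: prompt-injection-detector-predictor.my-project.svc.cluster.local  # placeholder detector service
      port: 8000                    # placeholder port
    chunker_id: whole_doc_chunker
    default_threshold: 0.5
----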
include::modules/mitigating-prompt-injection-by-using-a-hugging-face-prompt-injection-detector.adoc[leveloffset=+1]
== Moderating and safeguarding content
Filter toxic, hateful, or profane content from user inputs and model outputs to maintain safe and appropriate AI interactions.
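For example, and only as an assumption about how such a pipeline might be named, a `hap` route could be added to the Guardrails Gateway configuration sketched earlier, backed by a Hugging Face HAP detector such as ibm-granite/granite-guardian-hap-38m:

[source,yaml]
----
detectors:
  - name: hap            # placeholder name for a Hugging Face HAP detector registered with the Orchestrator
    input: true
    output: true
    detector_params: {}
routes:
  - name: hap            # exposes a dedicated guardrailed endpoint for HAP filtering
    detectors:
      - hap
----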
include::modules/detecting-hateful-and-profane-language.adoc[leveloffset=+1]
include::modules/guardrails-enforcing-configured-safety-pipelines-for-llm-inference-using-guardrails-gateway.adoc[leveloffset=+1]

ifdef::parent-context[:context: {parent-context}]
ifndef::parent-context[:!context:]
@@ -0,0 +1,20 @@
---
layout: docs
title: Enabling AI safety
permalink: /docs/enabling-ai-safety
custom_css: asciidoc.css
---
//:self-managed:
:upstream:
include::_artifacts/document-attributes-global.adoc[]

:doctype: book
:toc: left
:compat-mode:
:context: safety

= Enabling AI safety

include::assemblies/enabling-ai-safety-with-guardrails.adoc[leveloffset=+1]

include::assemblies/using-guardrails-for-ai-safety.adoc[leveloffset=+1]
@@ -1,11 +1,11 @@
:_module-type: PROCEDURE

ifdef::context[:parent-context: {context}]
-[id="guardrails-orchestrator-hap-scenario_{context}"]
-= Monitoring user inputs with the Guardrails Orchestrator service
+[id="detecting-hateful-and-profane-language_{context}"]
Contributor: You've updated a number of titles to include the word 'guardrails' at the start of the actual title. This is not in keeping with our contributor guidelines. However, it is most likely incidental, given that we are switching to DITA soon.

Contributor (Author): I'm going to disregard this comment because it's a really deep/tedious change at this point to update all the files.
+= Detecting hateful and profane language
[role='_abstract']

-The following example demonstrates how to use Guardrails Orchestrator to monitor user inputs to your LLM, specifically to protect against hateful and profane language (HAP). A comparison query without the detector enabled shows the differences in responses when guardrails is disabled versus enabled.
+The following example demonstrates how to use Guardrails Orchestrator to monitor user inputs to your LLM, specifically to detect and protect against hateful and profane language (HAP). A comparison query without the detector enabled shows the differences in responses when guardrails is disabled versus enabled.
.Prerequisites

@@ -21,7 +21,7 @@ ifdef::cloud-service[]
endif::[]

ifdef::upstream[]
-* You have deployed the Guardrails Orchestrator and related detectors. For more information, see link:{odhdocshome}/monitoring_data_science_models/#deploying-the-guardrails-orchestrator-service_monitor[Deploying the Guardrails Orchestrator].
+* You have deployed the Guardrails Orchestrator and related detectors. For more information, see link:{odhdocshome}/enabling-ai-safety#deploying-the-guardrails-orchestrator-service_safety[Deploying the Guardrails Orchestrator].
Contributor: Suggestion: "and associated detectors" instead of "related".

Contributor: I think "related" is fine, just my 2 cents.
endif::[]


Uh oh!
There was an error while loading. Please reload this page.