# Cleanlab Integration

Cleanlab's state-of-the-art [LLM uncertainty estimator](https://cleanlab.ai/blog/trustworthy-language-model/) scores the trustworthiness of any LLM response, to detect incorrect/hallucinated outputs in real-time.

In question-answering / RAG applications, a high trustworthiness score indicates a correct response, while in general open-ended applications a high score corresponds to a helpful and informative response. Low trustworthiness scores typically flag incorrect/bad outputs, or prompts complex enough that the LLM might output the right response this time but the wrong response when run on the same prompt again (so the response cannot be trusted).

The trustworthiness score is further explained and comprehensively benchmarked in [Cleanlab's documentation](https://help.cleanlab.ai/tlm/).

The `cleanlab trustworthiness` guardrail flow uses a default trustworthiness score threshold of 0.6 to determine whether your LLM output should be allowed (i.e., if the trustworthiness score is below the threshold, the response is flagged as "untrustworthy"). You can easily change this cutoff by adjusting the threshold in the [config](https://github.com/NVIDIA/NeMo-Guardrails/tree/develop/nemoguardrails/library/cleanlab/flows.co). For example, to change the threshold to 0.7, you can override the flow in your config, as sketched below.
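
A minimal sketch of such an override, assuming the `call cleanlab api` action and the `$result.trustworthiness_score` field used by the library's built-in flow (the `bot response untrustworthy` message is a hypothetical placeholder to replace with your own wording):

```
define subflow cleanlab trustworthiness
  """Block any bot response whose trustworthiness score falls below 0.7."""
  $result = execute call cleanlab api

  if $result.trustworthiness_score < 0.7
    bot response untrustworthy
    stop

define bot response untrustworthy
  # Hypothetical refusal message; adapt the wording to your application.
  "I'm not confident in that response, so I won't share it."
```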

Install the Python client to use Cleanlab's trustworthiness score:

```
pip install cleanlab-studio
```

You can get an API key for free by [creating a Cleanlab account](https://tlm.cleanlab.ai/) or experiment with the trustworthiness scores in the [playground](https://chat.cleanlab.ai/chat). Feel free to [email Cleanlab](mailto:support@cleanlab.ai) with any questions.

Lastly, set the `CLEANLAB_API_KEY` environment variable with the API key.
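
For example, in your shell:

```
export CLEANLAB_API_KEY="<your API key>"
```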