
Commit 3953745

address feedback
Signed-off-by: Jonas Mueller <[email protected]>
1 parent f8daae0

1 file changed: +3 -3 lines


docs/user-guides/community/cleanlab.md

@@ -1,12 +1,12 @@
 # Cleanlab Integration

-Cleanlab's state-of-the-art [LLM uncertainty estimator](https://cleanlab.ai/blog/trustworthy-language-model/) scores the trustworthiness of any LLM response, to detect incorrect/hallucinated outputs in real-time.
+Cleanlab's state-of-the-art [LLM uncertainty estimator](https://cleanlab.ai/blog/trustworthy-language-model/) scores the _trustworthiness_ of any LLM response, to detect incorrect outputs and hallucinations in real-time.

-In question-answering / RAG applications: high trustworthiness is indicative of correct responses, while in general open-ended applications, a high score corresponds to the response being helpful and informative. Low trustworthiness scores are typically incorrect/bad outputs, or complex prompts where the LLM might have output the right response this time but may output the wrong response when run on the same prompt again (so it cannot be trusted).
+In question-answering or RAG applications, high trustworthiness is indicative of a correct response. In open-ended chat applications, a high score corresponds to the response being helpful and informative. Low trustworthiness scores indicate outputs that are likely bad or incorrect, or complex prompts where the LLM might have output the right response this time but might output the wrong response when run on the same prompt again (so it cannot be trusted).

 The trustworthiness score is further explained and comprehensively benchmarked in [Cleanlab's documentation](https://help.cleanlab.ai/tlm/).

-The `cleanlab trustworthiness` guardrail flow uses a default trustworthiness score threshold of 0.6 to determine if your LLM output should be allowed or not (i.e., if the trustworthiness score is below the threshold, the response is flagged as "untrustworthy"). You can easily change the cutoff value for the trustworthiness score by adjusting the threshold in the [config](https://github.com/NVIDIA/NeMo-Guardrails/tree/develop/nemoguardrails/library/cleanlab/flows.co). For example, to change the threshold to 0.7, you can add the following flow to your config:
+The `cleanlab trustworthiness` guardrail flow uses a default trustworthiness score threshold of 0.6 to determine if your LLM output should be allowed or not. When the trustworthiness score falls below the threshold, the corresponding LLM response is flagged as _untrustworthy_. You can easily change the cutoff value for the trustworthiness score by adjusting the threshold in the [config](https://github.com/NVIDIA/NeMo-Guardrails/tree/develop/nemoguardrails/library/cleanlab/flows.co). For example, to change the threshold to 0.7, add the following flow to your config:

 ```colang
 define subflow cleanlab trustworthiness
```
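
The hunk ends where the example flow definition begins. For readers following along, here is a minimal sketch of what the full threshold-0.7 override might look like. The `call cleanlab api` action, the `trustworthiness_score` field, and the `bot response untrustworthy` message are assumptions based on the linked flows.co rather than content shown in this diff, so verify the names against the library source:

```colang
# Hedged sketch of the threshold override, not the committed file contents.
# Assumes the `call cleanlab api` action returns a result with a
# `trustworthiness_score` field, per nemoguardrails/library/cleanlab/flows.co.
define subflow cleanlab trustworthiness
  $result = execute call cleanlab api

  # Flag the response whenever the score drops below the custom 0.7 cutoff
  if $result.trustworthiness_score < 0.7
    bot response untrustworthy
    stop

define bot response untrustworthy
  # Hypothetical canned reply; substitute your own wording
  "This response may be unreliable, so don't place much confidence in it."
```

Defining this subflow in your own config shadows the library default, so the 0.6 cutoff shipped with the library is never consulted.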
