Developed by | Alejandro Esquivel |
---|---|
Date of development | Jul 09, 2024 |
Validator type | Format |
Blog | |
License | Apache 2 |
Input/Output | Output |
This validator checks that generated text is parseable HTML. It relies on html5lib
to ensure that markup returned by an LLM can be parsed without errors before it is rendered or passed downstream.
- Dependencies:
  - guardrails-ai>=0.4.0
  - html5lib>=1.1,<2.0
- Foundation model access keys:
  - None
$ guardrails hub install hub://guardrails/valid_html
In this example, we apply the validator to a string output generated by an LLM.
# Import Guard and Validator
from guardrails.hub import ValidHtml
from guardrails import Guard
# Setup Guard
guard = Guard().use(
    ValidHtml()
)
guard.validate("<p>hello</p>") # Validator passes
guard.validate("<phello</p>") # Validator fails
Use the validator to protect HTML-bearing fields of structured outputs.
# Import Guard and Validator
from pydantic import BaseModel, Field
from guardrails.hub import ValidHtml
from guardrails import Guard
# Initialize Validator
html_validator = ValidHtml(strict=True)
# Create Pydantic BaseModel
class WebSection(BaseModel):
    title: str
    body: str = Field(validators=[html_validator])
# Create a Guard to check for valid Pydantic output
guard = Guard.from_pydantic(output_class=WebSection)
# Run LLM output generating JSON through guard
guard.parse("""
{
    "title": "About",
    "body": "<section><h1>About Us</h1></section>"
}
""")
__init__(self, *, strict=True, on_fail=None)

Initializes a new instance of the `ValidHtml` validator.

Parameters
- `strict` (bool): When `True`, the validator raises on the first HTML parsing error. When `False`, `html5lib` attempts to recover from malformed markup.
- `on_fail` (str, Callable): The policy to enact when a validator fails. If `str`, must be one of `reask`, `fix`, `filter`, `refrain`, `noop`, `exception` or `fix_reask`. Otherwise, provide a callable invoked when the validator fails.
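
To make the effect of `strict` more concrete, the sketch below calls `html5lib` directly, outside of Guardrails. It is only an illustration under assumptions: `parse_errors` is a hypothetical helper, not part of this validator, and the validator's internals may differ. It relies on `html5lib.HTMLParser(strict=...)`, the parser's `errors` list, and `html5lib.html5parser.ParseError` as they behave in html5lib 1.x.

```python
import html5lib
from html5lib.html5parser import ParseError


def parse_errors(markup: str, strict: bool) -> list[str]:
    """Return the html5lib error codes recorded while parsing `markup`.

    Hypothetical helper for illustration; not part of the ValidHtml API.
    """
    parser = html5lib.HTMLParser(strict=strict)
    try:
        parser.parse(markup)
    except ParseError:
        # In strict mode html5lib records the first problem in
        # parser.errors and then raises, so parsing stops there.
        pass
    # Each entry is a (position, error_code, details) tuple.
    return [code for _position, code, _details in parser.errors]


malformed = "<!DOCTYPE html><phello</p>"

# Strict parsing stops at the first error.
print(parse_errors(malformed, strict=True))

# Lenient parsing recovers and keeps going, collecting every error it meets.
print(parse_errors(malformed, strict=False))
```

With `strict=False` the parser still builds a tree from the malformed input, which is why a lenient check can accept markup that a strict check rejects.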
validate(self, value, metadata=None) -> ValidationResult

Validates the given `value` using `html5lib`. If the parser cannot consume the markup, even after attempting to prepend a doctype, the validator returns a `FailResult` with the parser's error message.

Parameters
- `value` (Any): The input value to validate.
- `metadata` (dict | None): Optional metadata forwarded by Guardrails. It is unused by this validator but retained for API compatibility.
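
The parse-then-retry behaviour described for `validate` can be approximated as follows. This is a rough sketch, not the validator's actual source: `first_html_error` is a hypothetical function, and it returns a plain string instead of a `FailResult` so that it runs with only `html5lib` installed. It assumes that a strict html5lib 1.x parser rejects input without a doctype, which is why a fragment such as `<p>hello</p>` needs the second attempt.

```python
from typing import Optional

import html5lib
from html5lib.html5parser import ParseError

DOCTYPE = "<!DOCTYPE html>"


def first_html_error(markup: str) -> Optional[str]:
    """Return None if `markup` parses strictly, else the parser's message.

    Hypothetical stand-in for the validator's check: None corresponds to
    a pass, a string corresponds to the message carried by a failure.
    """
    try:
        html5lib.HTMLParser(strict=True).parse(markup)
        return None
    except ParseError:
        # The input may simply be a fragment without a doctype,
        # so try once more with a doctype prepended.
        pass

    try:
        html5lib.HTMLParser(strict=True).parse(DOCTYPE + markup)
        return None
    except ParseError as exc:
        return str(exc)


print(first_html_error("<p>hello</p>"))  # fragment accepted on the retry
print(first_html_error("<phello</p>"))   # malformed tag, message returned
```

A fresh parser is created for each attempt so that no state from the failed first parse carries over.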