Fix LLMJudge input handling to preserve BinaryContent as separate message part instead of stringifying #2173

Open
wants to merge 11 commits into
base: main

Conversation

adtyavrdhn
Contributor

@adtyavrdhn adtyavrdhn commented Jul 10, 2025

Fix LLMJudge input handling to preserve BinaryContent as separate message part instead of stringifying

Fixes #2089

@DouweM DouweM self-assigned this Jul 10, 2025
Contributor

@DouweM DouweM left a comment

@adtyavrdhn Thanks Aditya! A few suggestions.

# For non-string inputs (e.g., BinaryContent), build a list
prompt_parts: list[str | UserContent] = []
prompt_parts.append('<Input>\n')
prompt_parts.append(inputs)
Contributor

`inputs` may itself be a list of `UserContent` items, in which case we should `extend` rather than `append`.
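A minimal sketch of the distinction, assuming a hypothetical `add_inputs` helper and a `FakeBinaryContent` stand-in (neither is part of the PR or of pydantic_ai; the real `BinaryContent` class is not imported here so the snippet runs on its own):

```python
from __future__ import annotations

from collections.abc import Sequence
from dataclasses import dataclass
from typing import Any


@dataclass
class FakeBinaryContent:
    """Hypothetical stand-in for BinaryContent, used only for illustration."""

    data: bytes
    media_type: str


def add_inputs(prompt_parts: list[Any], inputs: Any) -> None:
    """Add `inputs` to `prompt_parts` as one part, or as several if it is a sequence."""
    # str and bytes are sequences too, but each should remain a single part.
    if isinstance(inputs, Sequence) and not isinstance(inputs, (str, bytes)):
        prompt_parts.extend(inputs)
    else:
        prompt_parts.append(inputs)


image = FakeBinaryContent(data=b'\x89PNG', media_type='image/png')

# A single content item becomes one message part.
single: list[Any] = ['<Input>\n']
add_inputs(single, image)

# A list of content items is flattened into the parts list instead of being nested.
multiple: list[Any] = ['<Input>\n']
add_inputs(multiple, ['Describe this image:', image])
```

With plain `append`, the second case would nest the whole list as one element of `prompt_parts` rather than adding each item as its own part.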

Contributor Author

Should we then also change the signature of `inputs` to `UserContent | Sequence[UserContent]`? Is there a specific reason we are keeping it as `Any`?

<Rubric>
{rubric}
</Rubric>
""")
Contributor

Similar to below, can we build a list of strings, conditionally add the item for `ExpectedOutput`, and join them all at the end? That'll reduce the duplication a bit.

We can likely also merge the two `isinstance(inputs, str)` branches: join only if `isinstance(inputs, str)`, and return the entire list otherwise.
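One way this suggestion could look, as a rough sketch rather than the code that ended up in the PR. The function name `build_judge_prompt`, its parameters, and the `<Output>` tag are assumptions made for illustration; only `<Input>`, `<ExpectedOutput>`, and `<Rubric>` appear in this thread.

```python
from __future__ import annotations

from collections.abc import Sequence
from typing import Any


def build_judge_prompt(
    inputs: Any,
    output: str,
    rubric: str,
    expected_output: str | None = None,
) -> str | list[Any]:
    """Build the judge prompt once; join to a string only for plain-string inputs."""
    parts: list[Any] = ['<Input>']
    # Keep non-string inputs as separate parts instead of stringifying them.
    if isinstance(inputs, Sequence) and not isinstance(inputs, (str, bytes)):
        parts.extend(inputs)
    else:
        parts.append(inputs)
    parts.append('</Input>')
    parts.append(f'<Output>\n{output}\n</Output>')
    # The expected output is the only conditional section.
    if expected_output is not None:
        parts.append(f'<ExpectedOutput>\n{expected_output}\n</ExpectedOutput>')
    parts.append(f'<Rubric>\n{rubric}\n</Rubric>')

    # Single decision point: a string prompt for string inputs, a list otherwise.
    if isinstance(inputs, str):
        return '\n'.join(parts)
    return parts


text_prompt = build_judge_prompt('2 + 2', '4', 'Is the answer correct?')
binary_prompt = build_judge_prompt(b'\x89PNG', '4', 'Is the answer correct?', expected_output='4')
```

Here `text_prompt` is a single string, while `binary_prompt` is a list in which the raw bytes (standing in for a `BinaryContent` value) remain their own element.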

Contributor Author

I've refactored it to reduce duplication.

Successfully merging this pull request may close these issues.

BinaryContent is naively parsed when include_input is used in LLMJudge
2 participants