
Conversation

@MrJs133 (Contributor) commented on May 9, 2025

Support for Automated LLM Testing

Core Idea: The user provides a prompt; each candidate model generates a response to that prompt, and a reviewer model evaluates each response and assigns a score.
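Conceptually this is just a loop over the candidate models, roughly like the sketch below (purely illustrative; the function and method names such as run_auto_review and .generate are assumptions, not the PR's actual API):

```python
# Purely illustrative sketch of the candidate/reviewer loop; all names here
# (run_auto_review, .generate, .name) are hypothetical.
def run_auto_review(prompt, reference_answer, candidate_clients, reviewer_client):
    results = []
    for candidate in candidate_clients:
        # 1. The candidate model answers the user's prompt.
        answer = candidate.generate(prompt)
        # 2. The reviewer model scores the answer against the reference answer.
        review_request = (
            f"Question: {prompt}\n"
            f"Reference answer: {reference_answer}\n"
            f"Candidate answer: {answer}\n"
            "Score the candidate answer and briefly justify the score."
        )
        results.append({
            "candidate": candidate.name,
            "answer": answer,
            "review": reviewer_client.generate(review_request),
        })
    return results
```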

As shown in the figure:
[image: workflow diagram]

  1. Configure the reviewer model (currently only OpenAI is supported).
  2. Configure the candidate models, either by uploading a YAML file or by entering text:
    • A template YAML file is provided; it cannot be used as-is, since fields such as the API key must be filled in first.
    • The textbox shows the expected input format: each line configures one LLM client, with the fields separated by commas (see the example below).
  3. Provide the prompt.
  4. Provide the reference answer.
  5. The evaluation results are displayed in the Output box.
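For reference, the textbox input would look roughly like the following. The first line mirrors the default template visible in the review comments further down; the second line's values are placeholders, not working credentials:

```
openai, model_name, api_key, api_base, max_tokens
openai, gpt-4o-mini, sk-xxxx, https://api.openai.com/v1, 4096
```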

fix test_api_connection

When testing with ernie, the output timed out.
Analysis showed the cause was the vague prompt ("test"), which led the model to generate a response that exceeded our predefined limit.
I changed the test prompt from "test" to "Hello".
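A rough sketch of what such a connection check looks like, assuming it sends a single short prompt with a small response cap (the function name, client interface, and max_tokens value are assumptions, not the repo's actual code):

```python
# Illustrative only: the shape of a connection check that sends one short prompt.
def test_api_connection(client) -> bool:
    try:
        # A vague prompt like "test" can make some models (e.g. ernie) produce
        # a long reply that exceeds the response limit and times out; a simple
        # greeting such as "Hello" keeps the answer short.
        client.generate("Hello", max_tokens=10)
        return True
    except Exception:
        return False
```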

@dosubot bot added the size:L label (This PR changes 100-499 lines, ignoring generated files) on May 9, 2025
@github-actions bot added the llm label on May 9, 2025
@dosubot bot added the enhancement label (New feature or request) on May 9, 2025
@imbajin requested a review from Copilot on May 9, 2025 07:59

@Copilot Copilot AI left a comment


Pull Request Overview

This PR introduces support for automated LLM review along with a fix to the test API connection issue. The changes add new functionality for generating review results based on candidate model answers and update the test configuration to avoid timeout issues.

  • Added new functions in other_tool_utils to generate and parse LLM review results.
  • Introduced UI components that support both file and text LLM configurations.
  • Changed the test API call parameter from "test" to "hello" in the rag_demo configs.

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

Summary per file:
  • hugegraph_llm/src/hugegraph_llm/utils/other_tool_utils.py: New utility functions for generating review responses and parsing configurations, supporting automated LLM review.
  • hugegraph_llm/src/hugegraph_llm/resources/demo/llm_review.yaml: Demo YAML file for LLM configuration showcasing the expected format.
  • hugegraph_llm/src/hugegraph_llm/demo/rag_demo/other_block.py: Added UI elements for new LLM testing functions and configuration input handling.
  • hugegraph_llm/src/hugegraph_llm/demo/rag_demo/configs_block.py: Updated test API call parameters to resolve the timeout issue.
  • hugegraph_llm/src/hugegraph_llm/config/prompt_config.py: Introduced a new review_prompt template for professional evaluation of LLM outputs.
Comments suppressed due to low confidence (1)

hugegraph_llm/src/hugegraph_llm/utils/other_tool_utils.py:50

  • In the 'judge' function, the exception block assigns the error message to 'reviews' but does not return it, resulting in an implicit return of None. Add a return statement (e.g., 'return reviews') after setting the error.
            reviews = {"error": f"Review error: {str(e)}"}
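A minimal sketch of the suggested fix; only the assignment in the except block comes from the review excerpt, while the function signature and the parse_review helper are assumptions:

```python
# Illustrative shape of the fix suggested above; only the except body's
# assignment is taken from the review excerpt, the rest is assumed.
def judge(reviewer_client, review_prompt: str) -> dict:
    try:
        raw = reviewer_client.generate(review_prompt)
        reviews = parse_review(raw)  # hypothetical parsing helper
        return reviews
    except Exception as e:
        reviews = {"error": f"Review error: {str(e)}"}
        return reviews  # the missing return: without it the caller received None
```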

@MrJs133 changed the title from "feat(llm):support auto llm review and fix test_api_connection" to "feat(llm): support auto llm review and fix test_api_connection" on May 20, 2025
@imbajin requested a review from Copilot on May 20, 2025 06:52

@Copilot Copilot AI left a comment


Pull Request Overview

This PR introduces support for automated LLM testing and fixes the test_api_connection prompt used in the RAG demo. Key changes include:

  • Adding new functions to automatically run LLM tests and evaluate responses.
  • Providing new YAML and text configuration options for LLM settings.
  • Fixing an incorrect prompt in the test_api_connection call.

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

Summary per file:
  • hugegraph_llm/src/hugegraph_llm/utils/other_tool_utils.py: Introduces functions for LLM testing and reviewing responses.
  • hugegraph_llm/src/hugegraph_llm/resources/demo/llm_review.yaml: Provides sample LLM configuration settings.
  • hugegraph_llm/src/hugegraph_llm/demo/rag_demo/other_block.py: Adds UI components for automated LLM testing.
  • hugegraph_llm/src/hugegraph_llm/demo/rag_demo/configs_block.py: Updates the prompt for test_api_connection to prevent timeouts.
  • hugegraph_llm/src/hugegraph_llm/config/prompt_config.py: Adds a detailed review_prompt for response evaluation.
Comments suppressed due to low confidence (2)

hugegraph_llm/src/hugegraph_llm/demo/rag_demo/other_block.py:64

  • [nitpick] Consider using a triple-quoted string for the multiline default configuration text in the textbox. This would improve readability and ease future modifications.
inp1 = gr.Textbox(
                    value="openai, model_name, api_key, api_base, max_tokens\n" \

hugegraph_llm/src/hugegraph_llm/utils/other_tool_utils.py:168

  • [nitpick] The error message 'Please only choose one between file and text.' could be rephrased for clarity (e.g., 'Provide either a file or text for LLM configuration, not both.').
if llm_configs_file and llm_configs:
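As a sketch, the mutual-exclusion check with the clearer wording might read as follows; the wrapping function name and the use of ValueError are assumptions about the surrounding code, not taken from the diff:

```python
def resolve_llm_configs(llm_configs_file, llm_configs):
    # Hypothetical wrapper: only the condition below mirrors the excerpt above.
    if llm_configs_file and llm_configs:
        raise ValueError("Provide either a file or text for LLM configuration, not both.")
    return llm_configs_file or llm_configs
```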
