@git-jxj git-jxj commented Sep 12, 2025

When running a concurrent benchmark, each worker process currently re-validates the backend instance. This leads to multiple, unnecessary "Test connection" requests being sent to the target endpoint, adding startup latency and log verbosity.

Summary

This PR optimizes the startup of concurrent benchmarks by preventing redundant backend validation in worker processes.

The Problem:

When a concurrent benchmark is initiated (e.g., with --rate-type=concurrent), the main process creates and validates the backend, which includes making a "Test connection" request. However, when the worker processes are spawned, each worker receives a copy of the backend object and calls the validate() method again.

This results in N extra validation calls and "Test connection" network requests, where N is the number of worker processes. This behavior has several minor drawbacks:

  • It adds unnecessary startup latency to the benchmark.
  • It creates verbose, repetitive logs.
  • It places a brief, unnecessary burst of load on the target server right before the test begins.
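The flag-based fix below works because each spawned worker receives a pickled copy of the parent's backend object, instance state included. A minimal, self-contained sketch of that copy semantics (the `Backend` class here is an illustrative stand-in, not guidellm's real API):

```python
import pickle


class Backend:
    """Illustrative stand-in for guidellm's Backend; only the
    copy semantics matter here, not the real API surface."""

    def __init__(self) -> None:
        self._validated = False


parent = Backend()
parent._validated = True                           # parent validates before spawning
worker_copy = pickle.loads(pickle.dumps(parent))   # how a spawned worker receives it
print(worker_copy._validated)                      # True: the flag travels with the copy
```

Because the parent validates (and flips the flag) before workers are created, every worker's copy already carries `_validated = True`.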

Details

This PR introduces a _validated flag to the Backend base class, which is initialized to False.

The validate() method is modified to first check this flag. If True, it returns immediately. If False, it proceeds with the validation logic and sets the flag to True upon successful completion.
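The guard described above can be sketched as follows. This is a simplified, synchronous sketch: guidellm's actual `Backend.validate()` sits in a larger API, and `_test_connection` plus the `test_requests_sent` counter are hypothetical stand-ins added here to make the behavior observable:

```python
class Backend:
    """Sketch of the _validated guard; names other than validate()
    are illustrative, not guidellm's actual API."""

    def __init__(self) -> None:
        self._validated = False       # initialized to False, per the PR
        self.test_requests_sent = 0   # demo counter for "Test connection" calls

    def _test_connection(self) -> None:
        # Stand-in for the real network round-trip to the target endpoint.
        self.test_requests_sent += 1

    def validate(self) -> None:
        if self._validated:
            return                    # already validated; skip the request
        self._test_connection()
        self._validated = True        # flipped only after success


backend = Backend()
backend.validate()   # parent process: performs the check
backend.validate()   # worker's repeat call: returns immediately
print(backend.test_requests_sent)   # 1
```

Note that because the flag is set only on successful completion, a failed validation leaves `_validated` as `False`, so a retry will re-run the full check.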

Test Plan

Concurrency greater than 1

guidellm benchmark \
  --target "http://10.64.24.34:8000" \
  --processor "Qwen/Qwen3-0.6B" \
  --rate-type=concurrent \
  --rate=5 \
  --max-requests 5 \
  --data='{"prompt_tokens":16, "output_tokens":16}'

Old (one validation log line per worker):

25-09-12 15:49:08|INFO |guidellm.backend.backend:validate:127 - OpenAIHTTPBackend validating backend openai_http
25-09-12 15:49:08|INFO |guidellm.backend.backend:validate:127 - OpenAIHTTPBackend validating backend openai_http
25-09-12 15:49:08|INFO |guidellm.backend.backend:validate:127 - OpenAIHTTPBackend validating backend openai_http
25-09-12 15:49:08|INFO |guidellm.backend.backend:validate:127 - OpenAIHTTPBackend validating backend openai_http
25-09-12 15:49:08|INFO |guidellm.backend.backend:validate:127 - OpenAIHTTPBackend validating backend openai_http

New (a single validation):

25-09-12 15:50:49|INFO |guidellm.backend.backend:create:71 - Creating backend of type openai_http
25-09-12 15:50:49|INFO |guidellm.backend.backend:validate:130 - OpenAIHTTPBackend validating backend openai_http

Related Issues

#322

  • Resolves #

  • "I certify that all code in this PR is my own, except as noted below."

Use of AI

  • Includes AI-assisted code completion
  • Includes code generated by an AI application
  • Includes AI-generated tests (NOTE: AI written tests should have a docstring that includes ## WRITTEN BY AI ##)

Signed-off-by: xinjun.jiang <[email protected]>
@sjmonson
Collaborator

Closing this due to upcoming changes. We have a massive rework coming in the next release which changes this behavior to a simple HTTP health check (see feature/refactor/draft-main and #286).

@sjmonson sjmonson closed this Sep 15, 2025