Benchmark toolkit support #66

kerthcet · 2024-08-06T03:19:02Z

What would you like to be added:

It would be super great to support benchmarking the LLM throughputs or latencies with different backends.

Why is this needed:

Provide proofs for users.

Completion requirements:

This enhancement requires the following artifacts:

Design doc
API change
Docs update

The artifacts should be linked in subsequent comments.

kerthcet · 2024-08-06T03:19:13Z

/kind feature

kerthcet · 2024-08-08T11:01:45Z

An example would looks like:

{
  metadata: {
    name: llama3-405b-2024-07-01,
    namespace: llm,
  },
  spec: {
    endpoint: llm-1.svc.local,
    port: 8000, 
    performance: {
      traffic-shape: {
        req-rate: 10 qps,
        model-type: instruction-tuned-llm/diffusion,
        dataset: share-gpt,
        input-length: 1024,
        max-output-length: 1024,
        total-prompts: 1000,
        traffic-spike: {
          burst: 10m,
          req-rate: 20 qps,
        }
      }
    }
  },
  status: {
    status: success,
    results: gcs-bucket-1/llama3-405b-2024-07-01,
  }
}

Inspired by https://docs.google.com/document/d/1k4Q4X14hW4vftElIuYGDu5KDe2LtV1XammoG-Xi3bbQ/edit

kerthcet · 2024-09-10T00:02:07Z

Also see:

kerthcet · 2025-04-23T02:29:13Z

/help

kerthcet · 2025-04-23T02:35:09Z

We have gateway right now, I think we can push this forward.

kerthcet pinned this issue Apr 23, 2025

InftyAI-Agent added the help wanted Extra attention is needed label Apr 23, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Benchmark toolkit support #66

Benchmark toolkit support #66

kerthcet commented Aug 6, 2024 •

edited

Loading

kerthcet commented Aug 6, 2024

Uh oh!

kerthcet commented Aug 8, 2024

Uh oh!

kerthcet commented Sep 10, 2024 •

edited

Loading

Uh oh!

kerthcet commented Apr 23, 2025

Uh oh!

kerthcet commented Apr 23, 2025

Uh oh!

Uh oh!

Benchmark toolkit support #66

Benchmark toolkit support #66

Comments

kerthcet commented Aug 6, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

kerthcet commented Aug 6, 2024

Uh oh!

kerthcet commented Aug 8, 2024

Uh oh!

kerthcet commented Sep 10, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

kerthcet commented Apr 23, 2025

Uh oh!

kerthcet commented Apr 23, 2025

Uh oh!

kerthcet commented Aug 6, 2024 •

edited

Loading

kerthcet commented Sep 10, 2024 •

edited

Loading