kliuae-amd commented Nov 25, 2025

Purpose

Sync changes from upstream.

Upstream vLLM commit: 3085478
aiter branch: dev/perf at d0a40f55ca1d552f20f2dd55741e7309c936a9d1

Test Plan

Evaluate the following models of interest.

Test Result

LLM

DeepSeek-R1 Block Scale FP8, TP8

|Tasks|Version|     Filter     |n-shot|  Metric   |   |Value|   |Stderr|
|-----|------:|----------------|-----:|-----------|---|----:|---|-----:|
|gsm8k|      3|flexible-extract|     5|exact_match|_  | 0.95|_  | 0.006|
|     |       |strict-match    |     5|exact_match|_  | 0.95|_  | 0.006|

DeepSeek-R1 Block Scale FP8, TP8 + EP

|Tasks|Version|     Filter     |n-shot|  Metric   |   |Value |   |Stderr|
|-----|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k|      3|flexible-extract|     5|exact_match|_  |0.9538|_  |0.0058|
|     |       |strict-match    |     5|exact_match|_  |0.9538|_  |0.0058|

DeepSeek-R1 PTPC FP8, TP8

|Tasks|Version|     Filter     |n-shot|  Metric   |   |Value |   |Stderr|
|-----|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k|      3|flexible-extract|     5|exact_match|_  |0.9553|_  |0.0057|
|     |       |strict-match    |     5|exact_match|_  |0.9538|_  |0.0058|

DeepSeek-R1 PTPC FP8, TP8 + EP

|Tasks|Version|     Filter     |n-shot|  Metric   |   |Value |   |Stderr|
|-----|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k|      3|flexible-extract|     5|exact_match|_  |0.9515|_  |0.0059|
|     |       |strict-match    |     5|exact_match|_  |0.9500|_  |0.0060|

EmbeddedLLM/Qwen3-Coder-480B-A35B-Instruct-FP8-Dynamic PTPC FP8, TP8

|Tasks|Version|     Filter     |n-shot|  Metric   |   |Value |   |Stderr|
|-----|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k|      3|flexible-extract|     5|exact_match|_  |0.8597|_  |0.0096|
|     |       |strict-match    |     5|exact_match|_  |0.8241|_  |0.0105|

EmbeddedLLM/Qwen3-Coder-480B-A35B-Instruct-FP8-Dynamic PTPC FP8, TP8 + EP

|Tasks|Version|     Filter     |n-shot|  Metric   |   |Value |   |Stderr|
|-----|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k|      3|flexible-extract|     5|exact_match|_  |0.8673|_  |0.0093|
|     |       |strict-match    |     5|exact_match|_  |0.8347|_  |0.0102|

Qwen/Qwen3-Next-80B-A3B, TP2

|Tasks|Version|     Filter     |n-shot|  Metric   |   |Value|   |Stderr|
|-----|------:|----------------|-----:|-----------|---|----:|---|-----:|
|gsm8k|      3|flexible-extract|     5|exact_match|_  |0.856|_  |0.0097|
|     |       |strict-match    |     5|exact_match|_  |0.818|_  |0.0106|

Qwen/Qwen3-Omni-30B-A3B-Instruct, TP2

|Tasks|Version|     Filter     |n-shot|  Metric   |   |Value |   |Stderr|
|-----|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k|      3|flexible-extract|     5|exact_match|_  |0.8613|_  |0.0095|
|     |       |strict-match    |     5|exact_match|_  |0.8506|_  |0.0098|
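
One way to read the TP8 versus TP8 + EP pairs above is to ask whether the two scores differ by more than their combined standard error. The following is a minimal sketch using the gsm8k flexible-extract values copied from the tables. The helper name `z_score`, the 2-sigma cutoff, and the treatment of the two runs as independent are illustrative assumptions, not part of this PR's test plan.

```python
import math

# (value, stderr) for gsm8k flexible-extract, copied from the tables above.
RESULTS = {
    "DeepSeek-R1 Block Scale FP8": {
        "TP8": (0.95, 0.006),
        "TP8 + EP": (0.9538, 0.0058),
    },
    "DeepSeek-R1 PTPC FP8": {
        "TP8": (0.9553, 0.0057),
        "TP8 + EP": (0.9515, 0.0059),
    },
    "Qwen3-Coder-480B-A35B-Instruct-FP8-Dynamic PTPC FP8": {
        "TP8": (0.8597, 0.0096),
        "TP8 + EP": (0.8673, 0.0093),
    },
}

# Assumed cutoff: differences under 2 combined standard errors count as noise.
THRESHOLD = 2.0


def z_score(a, b):
    """Difference b - a between two (value, stderr) pairs, in combined-stderr units.

    Assumes the two runs are independent, which is only an approximation
    here because both configurations are scored on the same test set.
    """
    (value_a, stderr_a), (value_b, stderr_b) = a, b
    return (value_b - value_a) / math.sqrt(stderr_a**2 + stderr_b**2)


for model, runs in RESULTS.items():
    z = z_score(runs["TP8"], runs["TP8 + EP"])
    verdict = "within noise" if abs(z) < THRESHOLD else "worth a closer look"
    print(f"{model}: z = {z:+.2f} ({verdict})")
```

Under these assumptions, all three pairs stay below the cutoff, so enabling EP does not show a measurable accuracy change on this task.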

VLM

Qwen/Qwen3-VL-235B-A22B-Instruct TP4

Metrics:
{
    "explicit_prompt_relaxed_correctness": 0.872,
    "anywhere_in_answer_relaxed_correctness": 0.8728
}

RedHatAI/Qwen3-VL-235B-A22B-Instruct-FP8-dynamic TP4

Metrics:
{
    "explicit_prompt_relaxed_correctness": 0.8736,
    "anywhere_in_answer_relaxed_correctness": 0.8756
}

RedHatAI/Qwen2.5-VL-72B-Instruct-FP8-dynamic TP2

Metrics:
{
    "explicit_prompt_relaxed_correctness": 0.8736,
    "anywhere_in_answer_relaxed_correctness": 0.8848
}

Qwen/Qwen3-Omni-30B-A3B-Instruct TP2

Metrics:
{
    "explicit_prompt_relaxed_correctness": 0.8724,
    "anywhere_in_answer_relaxed_correctness": 0.8736
}
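
To compare two of the VLM runs above, the metric dictionaries can be subtracted key by key. The sketch below does this for the two Qwen3-VL-235B-A22B-Instruct checkpoints, with values copied from the "Metrics" blocks. The helper `metric_deltas` is a hypothetical name introduced for illustration, and no standard errors are reported for these metrics, so the deltas are descriptive only.

```python
# Scores copied from the "Metrics" blocks above (both runs at TP4).
base = {
    "explicit_prompt_relaxed_correctness": 0.872,
    "anywhere_in_answer_relaxed_correctness": 0.8728,
}
fp8_dynamic = {
    "explicit_prompt_relaxed_correctness": 0.8736,
    "anywhere_in_answer_relaxed_correctness": 0.8756,
}


def metric_deltas(reference, candidate):
    """Return candidate - reference for every metric present in both runs."""
    return {
        name: candidate[name] - reference[name]
        for name in reference
        if name in candidate
    }


deltas = metric_deltas(base, fp8_dynamic)
for name, delta in deltas.items():
    # Four decimals matches the precision of the reported scores.
    print(f"{name}: {delta:+.4f}")
```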

Omni Audio Eval

Qwen/Qwen3-Omni-30B-A3B-Instruct TP2

|        Tasks        |Version|Filter|n-shot|   Metric   |   | Value |   |Stderr|
|---------------------|------:|------|-----:|------------|---|------:|---|------|
|voicebench_openbookqa|      0|none  |     0|accuracy    |_  |91.8681|_  |   N/A|
|voicebench_openbookqa|      0|none  |     0|failure rate|_  | 0.0000|_  |   N/A|

Nepherpitou and others added 30 commits November 19, 2025 05:21
hmellor and others added 12 commits November 25, 2025 05:24

wuhuikx commented Nov 25, 2025

@zhuyuhua-v will trigger an acceptance of this PR. Please don't merge after the acceptance pass.

@zhuyuhua-v

@kliuae-amd could you please help resolve these conflicts? I'll trigger an acceptance tonight.

@kliuae-amd
Author

Resolved merge conflicts and updated with later upstream commits. The upstream is now at:

Upstream vLLM commit: db29061
aiter dev/perf branch commit: 5f4c65eda98b46a96c6fbb3e94a96f0d537a3aab

@zhuyuhua-v

Cherry-picked #824 to address a DeepSeek-R1 FP4 functionality issue on MI355.
CI: https://github.com/ROCm/rocActions/actions/runs/19754855331
