[Sync] dev/perf sync with upstream 20251124 #822
base: dev/perf
Signed-off-by: Gleb Kurchanov <[email protected]>
Signed-off-by: UranusSeven <[email protected]>
…ect#28962) Signed-off-by: Lukas Geiger <[email protected]>
…oject#27329) Signed-off-by: Roman Solomatin <[email protected]> Signed-off-by: wang.yuqi <[email protected]> Signed-off-by: wang.yuqi <[email protected]> Co-authored-by: Cyrus Leung <[email protected]> Co-authored-by: Isotr0py <[email protected]> Co-authored-by: wang.yuqi <[email protected]> Co-authored-by: wang.yuqi <[email protected]>
Signed-off-by: Michael Goin <[email protected]> Signed-off-by: mgoin <[email protected]>
Signed-off-by: Didier Durand <[email protected]>
Signed-off-by: Tsai, Louie <[email protected]>
Signed-off-by: gnovack <[email protected]> Co-authored-by: Jee Jee Li <[email protected]>
Signed-off-by: windsonsea <[email protected]>
…ng unroll_loop (vllm-project#28847) Signed-off-by: ihb2032 <[email protected]> Co-authored-by: lyd1992 <[email protected]>
Signed-off-by: Kan Zhu <[email protected]>
…#29005) Signed-off-by: Harry Mellor <[email protected]>
…project#28961) Signed-off-by: tovam <[email protected]> Co-authored-by: Wentao Ye <[email protected]> Co-authored-by: Cyrus Leung <[email protected]>
…ent. (vllm-project#28449) Signed-off-by: bruceszchen <[email protected]> Co-authored-by: Harry Mellor <[email protected]>
Signed-off-by: Didier Durand <[email protected]>
…8952) Signed-off-by: Harry Mellor <[email protected]>
… in CI (vllm-project#28625) Signed-off-by: Yanan Cao <[email protected]>
…vllm-project#26468) Signed-off-by: vnadathur <[email protected]> Signed-off-by: WorldExplored <[email protected]> Signed-off-by: Srreyansh Sethi <[email protected]> Signed-off-by: Srreyansh Sethi <[email protected]> Co-authored-by: WorldExplored <[email protected]> Co-authored-by: Srreyansh Sethi <[email protected]> Co-authored-by: vnadathur <[email protected]> Co-authored-by: Luka Govedič <[email protected]>
Signed-off-by: Lucas Wilkinson <[email protected]>
…luggable for other device (vllm-project#26487) Signed-off-by: shen-shanshan <[email protected]>
…rmers v5 (vllm-project#28542) Signed-off-by: Harry Mellor <[email protected]>
Signed-off-by: zRzRzRzRzRzRzR <[email protected]>
…m-project#28966) Signed-off-by: Luka Govedič <[email protected]> Co-authored-by: copilot-swe-agent[bot] <[email protected]> Co-authored-by: ProExpertProg <[email protected]> Co-authored-by: Luka Govedič <[email protected]>
…ect#28917) Signed-off-by: Jialin Ouyang <[email protected]>
Signed-off-by: Robert Shaw <[email protected]>
…lm-project#28942) Signed-off-by: zhyajie <[email protected]> Co-authored-by: zhyajie <[email protected]>
…ject#28701) Signed-off-by: Aleksandr Malyshev <[email protected]> Co-authored-by: Aleksandr Malyshev <[email protected]>
Signed-off-by: Izzy Putterman <[email protected]> Co-authored-by: Cyrus Leung <[email protected]>
…28718) Signed-off-by: QiuChunshuo <[email protected]> Signed-off-by: FENP <[email protected]> Signed-off-by: LookAround <[email protected]> Signed-off-by: Jingchun Gao <[email protected]> Signed-off-by: zhenwenqi2024 <[email protected]> Co-authored-by: FENP <[email protected]> Co-authored-by: LookAround <[email protected]> Co-authored-by: Jingchun Gao <[email protected]> Co-authored-by: zhenwenqi2024 <[email protected]> Co-authored-by: Jingchun Gao <[email protected]>
Signed-off-by: Ryan Rock <[email protected]>
Signed-off-by: Harry Mellor <[email protected]>
) Signed-off-by: Lin, Fanli <[email protected]>
…version (vllm-project#29200) Signed-off-by: Inoki <[email protected]>
Signed-off-by: kliuae <[email protected]>
…ject#29372) Signed-off-by: Nick Hill <[email protected]>
Signed-off-by: vllmellm <[email protected]>
vllm-project#29273) Signed-off-by: Fadi Arafeh <[email protected]>
Signed-off-by: Ryan Rock <[email protected]>
…ct#28623) Signed-off-by: wxsIcey <[email protected]> Signed-off-by: Mengqing Cao <[email protected]> Signed-off-by: Icey <[email protected]> Co-authored-by: Mengqing Cao <[email protected]>
vllm-project#29311) Signed-off-by: zhuhaoran <[email protected]>
Signed-off-by: Rémi Delacourt <[email protected]> Signed-off-by: Rémi Delacourt <[email protected]> Signed-off-by: remi <[email protected]>
) Signed-off-by: Micah Williamson <[email protected]>
@zhuyuhua-v will trigger an acceptance of this PR. Please don't merge after the acceptance pass.
Signed-off-by: elvischenv <[email protected]>
…llm-project#28911) Signed-off-by: wang.yuqi <[email protected]>
@kliuae-amd could you please help resolve these conflicts? I'll trigger an acceptance tonight.
Signed-off-by: Nick Hill <[email protected]>
Signed-off-by: kliuae <[email protected]>
Signed-off-by: kliuae <[email protected]>
Resolved merge conflicts and updated with later upstream commits. The upstream is now at:
Upstream vLLM commit: db29061
Signed-off-by: kliuae <[email protected]>
Signed-off-by: ZhiweiYan-96 <[email protected]>
Cherry-picked #824 to address the MI355 DeepSeek-R1 FP4 functionality issue.
Purpose
Sync changes from upstream.
Upstream vLLM commit: 3085478
aiter branch: dev/perf at d0a40f55ca1d552f20f2dd55741e7309c936a9d1
Test Plan
Evaluate the following models of interest.
Test Result
LLM
DeepSeek-R1 Block Scale FP8, TP8
DeepSeek-R1 Block Scale FP8, TP8 + EP
DeepSeek-R1 PTPC FP8, TP8
DeepSeek-R1 PTPC FP8, TP8 + EP
EmbeddedLLM/Qwen3-Coder-480B-A35B-Instruct-FP8-Dynamic PTPC FP8, TP8
EmbeddedLLM/Qwen3-Coder-480B-A35B-Instruct-FP8-Dynamic PTPC FP8, TP8 + EP
Qwen/Qwen3-Next-80B-A3B, TP2
Qwen/Qwen3-Omni-30B-A3B-Instruct, TP2
VLM
Qwen/Qwen3-VL-235B-A22B-Instruct TP4
RedHatAI/Qwen3-VL-235B-A22B-Instruct-FP8-dynamic TP4
RedHatAI/Qwen2.5-VL-72B-Instruct-FP8-dynamic TP2
Qwen/Qwen3-Omni-30B-A3B-Instruct TP2
Omni Audio Eval
Qwen/Qwen3-Omni-30B-A3B-Instruct TP2
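The PR does not record the commands used for these evaluations. As a rough sketch only, the model and parallelism matrix above could be written down as data and turned into `vllm serve` invocations. The `EvalTarget` class and `serve_command` helper are hypothetical names introduced for illustration, and mapping "TP8 + EP" to `--enable-expert-parallel` is an assumption rather than something the PR states. Only entries with an explicit model ID in the list are included.

```python
import shlex
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalTarget:
    """One row of the test matrix (hypothetical helper, not part of the PR)."""
    model: str
    tensor_parallel: int
    expert_parallel: bool = False


def serve_command(target: EvalTarget) -> list[str]:
    """Build a `vllm serve` argument list for one target.

    Assumption: "TP<n>" maps to --tensor-parallel-size <n> and
    "+ EP" maps to --enable-expert-parallel.
    """
    cmd = [
        "vllm", "serve", target.model,
        "--tensor-parallel-size", str(target.tensor_parallel),
    ]
    if target.expert_parallel:
        cmd.append("--enable-expert-parallel")
    return cmd


# Subset of the models listed in the Test Result section.
TARGETS = [
    EvalTarget("EmbeddedLLM/Qwen3-Coder-480B-A35B-Instruct-FP8-Dynamic", 8),
    EvalTarget("EmbeddedLLM/Qwen3-Coder-480B-A35B-Instruct-FP8-Dynamic", 8, True),
    EvalTarget("Qwen/Qwen3-VL-235B-A22B-Instruct", 4),
    EvalTarget("Qwen/Qwen3-Omni-30B-A3B-Instruct", 2),
]

if __name__ == "__main__":
    # Print the commands instead of running them; launching is left to the user.
    for target in TARGETS:
        print(shlex.join(serve_command(target)))
```

Keeping the matrix as data makes it easy to see which configurations were covered and to re-run the same set after the next upstream sync.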
Essential Elements of an Effective PR Description Checklist
supported_models.md and examples for a new model.