chore: [DEMONSTRATION ONLY] 1st Mass integration of release/0.19 #3850

tongyuantongyu · 2025-04-25T01:58:20Z

PR title

Please write the PR title using the following template:

[JIRA ticket link/nvbug link/github issue link][fix/feat/doc/infra/...] <summary of this PR>

For example, a PR that adds a new cache manager feature tracked by Jira ticket TRTLLM-1000 would be titled:

[TRTLLM-1000][feat] Support a new feature about cache manager
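A title in this shape can be checked mechanically. The following is a minimal sketch, not part of the repository tooling: the names `TITLE_RE`, `PrTitle`, and `parse_pr_title` are hypothetical, and because the template's type list ends in "...", the sketch accepts any non-empty bracketed type rather than a fixed set.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical pattern for "[<ticket or issue>][<type>] <summary>".
# The template leaves the type list open ("fix/feat/doc/infra/..."),
# so any non-empty bracketed token is accepted here.
TITLE_RE = re.compile(
    r"^\[(?P<ref>[^\[\]]+)\]\[(?P<kind>[^\[\]]+)\]\s+(?P<summary>\S.*)$"
)


class PrTitle(NamedTuple):
    ref: str      # e.g. a Jira ticket, nvbug link, or GitHub issue link
    kind: str     # e.g. fix, feat, doc, infra
    summary: str  # free-text summary of the PR


def parse_pr_title(title: str) -> Optional[PrTitle]:
    """Return the parsed parts, or None if the title does not match the template."""
    match = TITLE_RE.match(title.strip())
    if match is None:
        return None
    return PrTitle(match["ref"], match["kind"], match["summary"])
```

Under this sketch the example title above parses into `TRTLLM-1000`, `feat`, and the summary text, while a title such as the one on this PR (which starts with `chore:`) returns `None`.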

Description

Please briefly explain the issue and the solution.

Test Coverage

GitHub Bot Help

/bot [-h] ['run', 'kill', 'skip', 'reuse-pipeline'] ...

Provide a user-friendly way for developers to interact with a Jenkins server.

Run /bot [-h|--help] to print this help message.

See details below for each supported subcommand.

run [--disable-fail-fast --skip-test --stage-list "A10-1, xxx" --gpu-type "A30, H100_PCIe" --add-multi-gpu-test --only-multi-gpu-test --disable-multi-gpu-test --post-merge --extra-stage "H100_PCIe-[Post-Merge]-1, xxx"]

Launch build/test pipelines. All previously running jobs will be killed.

--disable-fail-fast (OPTIONAL) : Disable fail fast on build/tests/infra failures.

--skip-test (OPTIONAL) : Skip all test stages, but still run build stages, package stages and sanity check stages. Note: Does NOT update GitHub check status.

--stage-list "A10-1, xxx" (OPTIONAL) : Only run the specified test stages. Examples: "A10-1, xxx". Note: Does NOT update GitHub check status.

--gpu-type "A30, H100_PCIe" (OPTIONAL) : Only run the test stages on the specified GPU types. Examples: "A30, H100_PCIe". Note: Does NOT update GitHub check status.

--only-multi-gpu-test (OPTIONAL) : Only run the multi-GPU tests. Note: Does NOT update GitHub check status.

--disable-multi-gpu-test (OPTIONAL) : Disable the multi-GPU tests. Note: Does NOT update GitHub check status.

--add-multi-gpu-test (OPTIONAL) : Force run the multi-GPU tests. Will also run L0 pre-merge pipeline.

--post-merge (OPTIONAL) : Run the L0 post-merge pipeline instead of the ordinary L0 pre-merge pipeline.

--extra-stage "H100_PCIe-[Post-Merge]-1, xxx" (OPTIONAL) : Run the ordinary L0 pre-merge pipeline and specified test stages. Examples: --extra-stage "H100_PCIe-[Post-Merge]-1, xxx".
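The options above can be assembled into a comment string programmatically. This is a minimal sketch that only uses the flags listed in the help text; the helper names (`build_run_comment`, `updates_check_status`) are hypothetical, and the quoting of list values simply mirrors the examples shown above. The bot's real argument parsing is not visible here and may differ.

```python
from typing import Iterable, Mapping, Optional, Sequence

# Flags copied from the help text above; nothing else is assumed to exist.
BOOLEAN_FLAGS = (
    "--disable-fail-fast",
    "--skip-test",
    "--add-multi-gpu-test",
    "--only-multi-gpu-test",
    "--disable-multi-gpu-test",
    "--post-merge",
)
LIST_FLAGS = ("--stage-list", "--gpu-type", "--extra-stage")

# Flags the help text marks with "Does NOT update GitHub check status".
NO_STATUS_UPDATE = frozenset({
    "--skip-test",
    "--stage-list",
    "--gpu-type",
    "--only-multi-gpu-test",
    "--disable-multi-gpu-test",
})


def build_run_comment(
    flags: Iterable[str] = (),
    lists: Optional[Mapping[str, Sequence[str]]] = None,
) -> str:
    """Build a '/bot run ...' PR comment from documented flags."""
    parts = ["/bot", "run"]
    for flag in flags:
        if flag not in BOOLEAN_FLAGS:
            raise ValueError(f"unknown boolean flag: {flag}")
        parts.append(flag)
    for flag, values in (lists or {}).items():
        if flag not in LIST_FLAGS:
            raise ValueError(f"unknown list flag: {flag}")
        joined = ", ".join(values)  # same "A, B" form as the help examples
        parts.append(f'{flag} "{joined}"')
    return " ".join(parts)


def updates_check_status(
    flags: Iterable[str] = (),
    lists: Optional[Mapping[str, Sequence[str]]] = None,
) -> bool:
    """False if any chosen flag is documented as not updating check status."""
    used = set(flags) | set(lists or {})
    return not (used & NO_STATUS_UPDATE)
```

For example, `build_run_comment(["--disable-fail-fast"], {"--gpu-type": ["A30", "H100_PCIe"]})` yields `/bot run --disable-fail-fast --gpu-type "A30, H100_PCIe"`, and `updates_check_status` reports `False` for that combination because `--gpu-type` is one of the flags noted as not updating the GitHub check status.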

kill

kill

Kill all running builds associated with pull request.

skip

skip --comment COMMENT

Skip testing for latest commit on pull request. --comment "Reason for skipping build/test" is required. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.
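Since the comment is mandatory, a small guard can refuse to build a skip command without a reason. This is a sketch under assumptions: the function name `build_skip_comment` is hypothetical, and rejecting embedded double quotes is a conservative choice made here because the bot's quoting rules are not documented above.

```python
def build_skip_comment(reason: str) -> str:
    """Build a '/bot skip' PR comment; the help text says --comment is required."""
    reason = reason.strip()
    if not reason:
        raise ValueError("a non-empty reason is required for /bot skip")
    if '"' in reason:
        # Assumption: escaping rules are unknown, so avoid embedded quotes.
        raise ValueError("reason must not contain double quotes")
    return f'/bot skip --comment "{reason}"'
```

Forcing a written reason keeps a record of why validation was bypassed, which matters given the warning above about breaking top of tree.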

reuse-pipeline

reuse-pipeline

Reuse a previous pipeline to validate current commit. This action will also kill all currently running builds associated with the pull request. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.

Signed-off-by: ZhanruiSunCh <[email protected]>

* fix test name
  Signed-off-by: Ivy Zhang <[email protected]>
* add quickstart test for nemotron-ultra
  Signed-off-by: Ivy Zhang <[email protected]>
* add rcca multi-node test case for deepseek-v3
  Signed-off-by: Ivy Zhang <[email protected]>
* add rcca info
  Signed-off-by: Ivy Zhang <[email protected]>
---------
Signed-off-by: Ivy Zhang <[email protected]>
Signed-off-by: Ivy Zhang <[email protected]>

Signed-off-by: Enwei Zhu <[email protected]>

* nvbugs/5187237 nvbugs/5112075: fix deterministic mode error
* remove waive
  Signed-off-by: Xiwen Yu <[email protected]>
* Revert "remove waive"
  This reverts commit 0bf5486d19906d692bfb7a6262333c296b0087ac.
  Signed-off-by: Xiwen Yu <[email protected]>
* revert ar fusion
  Signed-off-by: Xiwen Yu <[email protected]>
---------
Signed-off-by: Xiwen Yu <[email protected]>

Signed-off-by: taoli <[email protected]> Co-authored-by: taoli <[email protected]>

Signed-off-by: Ruodi <[email protected]> Co-authored-by: Larry <[email protected]>

Signed-off-by: Enwei Zhu <[email protected]>

Signed-off-by: Yanchao Lu <[email protected]>

Signed-off-by: Xiwen Yu <[email protected]>

* Fix: nvbugs/5222698 variable not defined
  Signed-off-by: Zongfei Jing <[email protected]>
* Tidy code
  Signed-off-by: Zongfei Jing <[email protected]>
---------
Signed-off-by: Zongfei Jing <[email protected]>

…-cppmanager case (NVIDIA#3685) Signed-off-by: nv-guomingz <[email protected]>

Signed-off-by: nv-guomingz <[email protected]>

* Update DeepSeek perf docs
  Signed-off-by: Kaiyu Xie <[email protected]>
* update
  Signed-off-by: Kaiyu Xie <[email protected]>
* Apply suggestions from code review
  Co-authored-by: Copilot <[email protected]>
  Signed-off-by: Kaiyu Xie <[email protected]>
---------
Signed-off-by: Kaiyu Xie <[email protected]>
Co-authored-by: Copilot <[email protected]>

Signed-off-by: junq <[email protected]>

Signed-off-by: Jin Li <[email protected]>

* security fix cherry-pick changes from main
  Signed-off-by: Yibin Li <[email protected]>
* fix hmac in remote mpi session (NVIDIA#3649)
  Signed-off-by: Yan Chunwei <[email protected]>
---------
Signed-off-by: Yibin Li <[email protected]>
Signed-off-by: Yan Chunwei <[email protected]>
Co-authored-by: Yan Chunwei <[email protected]>

Signed-off-by: Tracin <[email protected]>

* fix FP8 kv accuracy
  Signed-off-by: Dylan Chen <[email protected]>
* update doc
  Signed-off-by: Dylan Chen <[email protected]>
---------
Signed-off-by: Dylan Chen <[email protected]>

Signed-off-by: Tracin <[email protected]>

Signed-off-by: Superjomn <[email protected]>

Signed-off-by: peaceh <[email protected]>

…3749) Signed-off-by: nv-guomingz <[email protected]> Co-authored-by: nv-guomingz <[email protected]>

Signed-off-by: Balaram Buddharaju <[email protected]> Co-authored-by: brb-nv <[email protected]>

Signed-off-by: Yiqing Yan <[email protected]>

…and write config.json to output log (NVIDIA#3656) Signed-off-by: Ruodi <[email protected]> Signed-off-by: Larry <[email protected]> Co-authored-by: Larry <[email protected]>

Signed-off-by: Ivy Zhang <[email protected]>

…#3758) Include Qwen2VLDecoderLayer in the smooth_qwen2_model function. Signed-off-by: Yukun He <[email protected]>

Signed-off-by: Anurag Mukkara <[email protected]> Co-authored-by: Sharan Chetlur <[email protected]>

Signed-off-by: Ivy Zhang <[email protected]>

Signed-off-by: Chuang Zhu <[email protected]>

* add skip condition to tests
  Signed-off-by: xinhe-nv <[email protected]>
* fix error
  Signed-off-by: xinhe-nv <[email protected]>
---------
Signed-off-by: xinhe-nv <[email protected]>

* skip_pre_ada for fp8 cases
  Signed-off-by: Ivy Zhang <[email protected]>
* update
  Signed-off-by: Ivy Zhang <[email protected]>
* update after rebase
  Signed-off-by: Ivy Zhang <[email protected]>
---------
Signed-off-by: Ivy Zhang <[email protected]>

Signed-off-by: Fanrong Li <[email protected]>

Signed-off-by: Barry Kang <[email protected]> Co-authored-by: Larry <[email protected]>

Signed-off-by: Yiqing Yan <[email protected]>

…nd fix moe fallback issue. (NVIDIA#3793)
* Reduce memory usage in fused moe op associated with AutoTuning.
* Replace pre-defined bucket size strategy with a generating function based on the tune_max_num_tokens.
* Add free_memory logic of workspace in min_latency_mode fused moe path.
  Signed-off-by: Yukun He <[email protected]>
* Fix fused_moe fallback issue. (NVIDIA#3652)
  min_latency_mode is only set to False during warmup phase. Thus when it becomes true during inference, all tactics fall back to the default one and thus cause perf regression.
  Signed-off-by: Yukun He <[email protected]>
---------
Signed-off-by: Yukun He <[email protected]>

…ng (NVIDIA#3797) Signed-off-by: wili-65535 <[email protected]>

Signed-off-by: Yuan Tong <[email protected]>

ZhanruiSunCh and others added 30 commits April 16, 2025 12:15

chore: bump version to 0.19.0 (NVIDIA#3598)

3471d6c

Signed-off-by: ZhanruiSunCh <[email protected]>

squash (NVIDIA#3642)

e36092b

Signed-off-by: Enwei Zhu <[email protected]>

update fp8 doc (NVIDIA#3647)

458203d

Signed-off-by: taoli <[email protected]> Co-authored-by: taoli <[email protected]>

tests: change qa perf test to trtllm-bench (NVIDIA#3619)

b1a65c0

Signed-off-by: Ruodi <[email protected]> Co-authored-by: Larry <[email protected]>

fix: FP8 quantized lm_head (NvBug 5214229) (NVIDIA#3567)

c8cea30

Signed-off-by: Enwei Zhu <[email protected]>

infra: Add PR approval protection for the release branch (NVIDIA#3634)

56c9dd4

Signed-off-by: Yanchao Lu <[email protected]>

fix: nvbugs/5231298: pytorch allreduce issue (NVIDIA#3673)

5bf8fdc

Signed-off-by: Xiwen Yu <[email protected]>

Fix: nvbugs/5222698 variable not defined (NVIDIA#3630)

1c6e85b

* Fix: nvbugs/5222698 variable not defined
  Signed-off-by: Zongfei Jing <[email protected]>
* Tidy code
  Signed-off-by: Zongfei Jing <[email protected]>
---------
Signed-off-by: Zongfei Jing <[email protected]>

test:sync waives.txt from main branch by disabling test_perf/gpt_350m…

07688cd

…-cppmanager case (NVIDIA#3685) Signed-off-by: nv-guomingz <[email protected]>

test:restore fp8 kv cache testing for L0 (NVIDIA#3671)

c70b24c

Signed-off-by: nv-guomingz <[email protected]>

tests: waive test_llm_multi_node (NVIDIA#3664)

fb8ddfa

Signed-off-by: junq <[email protected]>

fix: update test_user_buffers_mm_add_prologue atol (NVIDIA#3711)

a04b585

Signed-off-by: Jin Li <[email protected]>

Un-waive DS-V3-Lite tests. (NVIDIA#3621)

8f17f3f

Signed-off-by: Tracin <[email protected]>

fix: FP8 kv accuracy (NVIDIA#3675)

8a8a55a

* fix FP8 kv accuracy
  Signed-off-by: Dylan Chen <[email protected]>
* update doc
  Signed-off-by: Dylan Chen <[email protected]>
---------
Signed-off-by: Dylan Chen <[email protected]>

Fix script options for engines. (NVIDIA#3622)

81d1f4f

Signed-off-by: Tracin <[email protected]>

unwaive multi-node test (NVIDIA#3721)

e69d7bb

Signed-off-by: Superjomn <[email protected]>

chore : Split more tests out of gpt tests (NVIDIA#3524) (NVIDIA#3674)

e19309c

Signed-off-by: peaceh <[email protected]>

doc:add torch examples link into torch backend documentation (NVIDIA#…

ba15155

…3749) Signed-off-by: nv-guomingz <[email protected]> Co-authored-by: nv-guomingz <[email protected]>

test: Get Eagle tests working (NVIDIA#3593) (NVIDIA#3722)

611ef8e

Signed-off-by: Balaram Buddharaju <[email protected]> Co-authored-by: brb-nv <[email protected]>

Waive L0 test (NVIDIA#3756)

52e6702

Signed-off-by: Yiqing Yan <[email protected]>

waive failed case in perf test, change default max_batch_size to 512 …

793d010

…and write config.json to output log (NVIDIA#3656) Signed-off-by: Ruodi <[email protected]> Signed-off-by: Larry <[email protected]> Co-authored-by: Larry <[email protected]>

Update ds v3 parameters in stress test. (NVIDIA#3676)

792b71f

waive gemma on L20 (NVIDIA#3766)

b11cb2f

Signed-off-by: Ivy Zhang <[email protected]>

https://nvbugs/5141291: Fix convert.py script for Qwen model. (NVIDIA…

a824946

…#3758) Include Qwen2VLDecoderLayer in the smooth_qwen2_model function. Signed-off-by: Yukun He <[email protected]>

fix: PP4 fixes and cleanup (NVIDIA#3688)

851e2f5

Signed-off-by: Anurag Mukkara <[email protected]> Co-authored-by: Sharan Chetlur <[email protected]>

remove benchmark test list (NVIDIA#3643)

3e56e40

Signed-off-by: Ivy Zhang <[email protected]>

chuangz0 and others added 9 commits April 23, 2025 14:47

skip disagg deepseek test if sm!=90 (NVIDIA#3720)

f08e599

Signed-off-by: Chuang Zhu <[email protected]>

test: skip failed cases on B200 (NVIDIA#3710)

b0ac7c9

* add skip condition to tests
  Signed-off-by: xinhe-nv <[email protected]>
* fix error
  Signed-off-by: xinhe-nv <[email protected]>
---------
Signed-off-by: xinhe-nv <[email protected]>

add know issue to deepseek doc. (NVIDIA#3800)

2f02263

Signed-off-by: Fanrong Li <[email protected]>

Fix ModelOpt Mixtral AWQ OOM (NVIDIA#3714) (NVIDIA#3761)

41b0371

Signed-off-by: Barry Kang <[email protected]> Co-authored-by: Larry <[email protected]>

Waive L0 tests (NVIDIA#3826)

e0691e6

Signed-off-by: Yiqing Yan <[email protected]>

[doc] Better document for Draft-Target-Model (DTM) speculative decodi…

33c4d49

…ng (NVIDIA#3797) Signed-off-by: wili-65535 <[email protected]>

chore: [DEMONSTRATION ONLY] 1st Mass integration of release/0.19

19bd6f8

Signed-off-by: Yuan Tong <[email protected]>

tongyuantongyu closed this May 6, 2025
