Pull requests: NVIDIA/TensorRT-LLM
fix: Set num_microbatches=pp_size with overlap scheduler
#3878
opened Apr 26, 2025 by
amukkara
fix: [https://nvbugspro.nvidia.com/bug/5242406][fix] Fix fp8 kvcache support
#3877
opened Apr 26, 2025 by
hlu1
feat: Re-enable Llama4 fusion and add AllReduce CUDA Graph Fix
#3876
opened Apr 25, 2025 by
zihaok
Draft: perf: [TRTLLM-4717][perf] Set CUDA graph max batch size and padding in throughput benchmark.
#3875
opened Apr 25, 2025 by
FrankD412
feat: refactoring dataset generation and adding tests
others
#3866
opened Apr 25, 2025 by
hypdeb
fix: Fix FMHA-based MLA in the generation phase and add MLA unit test
#3863
opened Apr 25, 2025 by
jinyangyuan-nvidia
fix: [https://nvbugspro.nvidia.com/bug/5243482] If FlashMLA is used, the existence of FMHA based MLA kernels should not be checked.
#3862
opened Apr 25, 2025 by
bobboli
infra: [TRTLLM-4475][TRTLLM-4565] Add pipeline hierarchy and basic info in the Jenkins job page
#3859
opened Apr 25, 2025 by
ZhanruiSunCh
Draft
fix: Detect pmix and raise error when mpirun is not used.
#3858
opened Apr 25, 2025 by
yuxianq
feat: add health_generate route to openai serving
Community Engagement
Community want to contribute
#3856
opened Apr 25, 2025 by
dsingal0
fix: add warmup flag into py_executor to prevent enable profiler during wa…
#3852
opened Apr 25, 2025 by
byshiue
chore: [DEMONSTRATION ONLY] 1st Mass integration of release/0.19
#3850
opened Apr 25, 2025 by
tongyuantongyu
Draft
feat: Mistral-Large-2 support in the Pytorch workflow
LLM API/Workflow
new model
#3845
opened Apr 24, 2025 by
hypdeb