Pull requests: pytorch/torchtitan
[EP] add initial support for NVSHMEM-based all-to-all
CLA Signed
This label is managed by the Meta Open Source bot.
#1569
opened Aug 14, 2025 by
tianyu-l
added better guidance for if deprecated tokenizer path fails
CLA Signed
#1568
opened Aug 14, 2025 by
wesleytruong
[Do Not Land] Debug for SDPA + CP nan issue in DeepSeekV3
CLA Signed
Multinode SkyPilot example
CLA Signed
#1564
opened Aug 13, 2025 by
alex000kim
fix: remove redundant legacy usage of mp in checkpoint
CLA Signed
#1562
opened Aug 13, 2025 by
yzs981130
[WIP] Experimental implementation of gpt-oss (grouped GEMM MoE + FlexAttention sink/sliding)
#1559
opened Aug 13, 2025 by
KhoomeiK
[PoC] Enable flexible different layout for same mesh via a util function
CLA Signed
#1550
opened Aug 11, 2025 by
fduwjj
[WIP] [mxfp8] torchao mxfp8 moe integration
CLA Signed
#1549
opened Aug 11, 2025 by
danielvegamyhre
Draft
added example for bidirectional checkpoint testing
CLA Signed
#1540
opened Aug 6, 2025 by
wesleytruong
add support for simplefsdp+ep
CLA Signed
#1529
opened Aug 5, 2025 by
ruisizhang123
Adding logic for cleaning up FT checkpoints
CLA Signed
#1528
opened Aug 5, 2025 by
bentherien
[WIP][Dion Official Optimizer, Muon] Integrate official Dion, and high speed Muon, optimizer impl with TorchTitan and Optimizer component class
CLA Signed
Fix semi-sync training with 1GPU per FT replica
CLA Signed
#1505
opened Jul 31, 2025 by
bentherien
perf testing
CLA Signed
#1488
opened Jul 29, 2025 by
ankitageorge
Draft
[Evaluation] Adding evaluation feature to TorchTitan
CLA Signed
#1470
opened Jul 28, 2025 by
raymin0223
[autoparallel] Enable bucketing passes for autoparallel, reorder and sink_waits.
CLA Signed
#1463
opened Jul 25, 2025 by
IvanKobzarev
Autoparallel support for DP-only, DP+TP, or TP-only
CLA Signed
#1459
opened Jul 25, 2025 by
IvanKobzarev
[WIP] Integrate autoparallel into torchtitan
CLA Signed
#1458
opened Jul 25, 2025 by
IvanKobzarev
add lr logging
CLA Signed
#1453
opened Jul 24, 2025 by
samsja
[torchtitan] TorchFunctionMode + SAC issue
CLA Signed
#1434
opened Jul 21, 2025 by
XilunWu
[torchtitan] CP + SDPA issue reproduce
CLA Signed
#1432
opened Jul 21, 2025 by
XilunWu
[Refactor] Modular Integration Test Framework with DeepSeek-v3 Support
CLA Signed
#1431
opened Jul 21, 2025 by
wwwjn