Pull requests: NVIDIA/Megatron-LM

Pull requests list

Allow EVs to be set while calling examples [stale: No activity in 60 days on issue or PR]
#121 opened Jul 19, 2021 by roclark
Fix a crash with multiple datasets [stale]
#201 opened Mar 10, 2022 by jlamypoirier
Allow random state variables to be synced across TP. [stale]
#206 opened Apr 17, 2022 by thomasw21
[Fix] Only warm up jit fusions without specific arguments [stale]
#229 opened Jul 5, 2022 by kisseternity
[FIX] tools/merge_mp_partitions.py before LM evaluation [stale]
#235 opened Jul 13, 2022 by donghyeonk
Update recomputation argument in examples [stale]
#237 opened Jul 24, 2022 by ktaebum
Trigger memory log based on skipped_iter [stale]
#241 opened Aug 14, 2022 by ktaebum
consider when not using distributed file system [stale]
#246 opened Aug 22, 2022 by ktaebum
Fix argument-help typo [stale]
#257 opened Nov 2, 2022 by miguelusque
Add UL2 data sampling and pretraining [stale]
#268 opened Dec 13, 2022 by janEbert
Add support for HF tokenizer [stale]
#272 opened Dec 21, 2022 by thomasw21
add support for customized pipeline stages [stale]
#274 opened Jan 11, 2023 by cyanguwa
Add support for MegaBlocks MoEs [stale]
#288 opened Feb 22, 2023 by tgale96
Fix two small problems [stale]
#291 opened Feb 28, 2023 by janEbert
Fix torch six import error for Torch 2.0 [stale]
#294 opened Mar 14, 2023 by ajindal1
Fix text_generation_cli [stale]
#295 opened Mar 17, 2023 by andy-yang-1
fix typo [stale]
#304 opened Apr 2, 2023 by Jiaxin-Wen
Print tokens_per_epoch [stale]
#309 opened Apr 13, 2023 by xu-song
Add log index map files [stale]
#310 opened Apr 13, 2023 by xu-song
polish 1f1b interleaved schedule logic [stale]
#325 opened Apr 25, 2023 by liuzhenhai93
Fix duplicate init_process_group [stale]
#336 opened May 9, 2023 by drcege
Modify LayerNorm to support other dtype inputs [stale]
#344 opened May 12, 2023 by jacob-crux
A typing error on variable name in distrib_optimizer.py (param->model… [stale]
#350 opened May 21, 2023 by xshaun
Update run_text_generation_server.py [stale]
#353 opened May 29, 2023 by mzamini92
Update ensemble_classifier.py [stale]
#354 opened May 29, 2023 by mzamini92