Pull requests: NVIDIA/Megatron-LM
Pull requests list
Allow EVs to be set while calling examples
stale
No activity in 60 days on issue or PR
#121
opened Jul 19, 2021 by
roclark
Fix a crash with multiple datasets
stale
No activity in 60 days on issue or PR
#201
opened Mar 10, 2022 by
jlamypoirier
Allow random state variables to be synced across TP.
stale
No activity in 60 days on issue or PR
#206
opened Apr 17, 2022 by
thomasw21
[Fix] Only warm up jit fusions without specific arguments
stale
No activity in 60 days on issue or PR
#229
opened Jul 5, 2022 by
kisseternity
[FIX] tools/merge_mp_partitions.py before LM evaluation
stale
No activity in 60 days on issue or PR
#235
opened Jul 13, 2022 by
donghyeonk
Update recomputation argument in examples
stale
No activity in 60 days on issue or PR
#237
opened Jul 24, 2022 by
ktaebum
Trigger memory log based on skipped_iter
stale
No activity in 60 days on issue or PR
#241
opened Aug 14, 2022 by
ktaebum
consider when not using distributed file system
stale
No activity in 60 days on issue or PR
#246
opened Aug 22, 2022 by
ktaebum
Fix argument-help typo
stale
No activity in 60 days on issue or PR
#257
opened Nov 2, 2022 by
miguelusque
Add UL2 data sampling and pretraining
stale
No activity in 60 days on issue or PR
#268
opened Dec 13, 2022 by
janEbert
Add support for HF tokenizer
stale
No activity in 60 days on issue or PR
#272
opened Dec 21, 2022 by
thomasw21
add support for customized pipeline stages
stale
No activity in 60 days on issue or PR
#274
opened Jan 11, 2023 by
cyanguwa
Add support for MegaBlocks MoEs
stale
No activity in 60 days on issue or PR
#288
opened Feb 22, 2023 by
tgale96
Fix two small problems
stale
No activity in 60 days on issue or PR
#291
opened Feb 28, 2023 by
janEbert
Fix torch six import error for Torch 2.0
stale
No activity in 60 days on issue or PR
#294
opened Mar 14, 2023 by
ajindal1
Fix text_generation_cli
stale
No activity in 60 days on issue or PR
#295
opened Mar 17, 2023 by
andy-yang-1
Print tokens_per_epoch
stale
No activity in 60 days on issue or PR
#309
opened Apr 13, 2023 by
xu-song
Add log index map files
stale
No activity in 60 days on issue or PR
#310
opened Apr 13, 2023 by
xu-song
polish 1f1b interleaved schedule logic
stale
No activity in 60 days on issue or PR
#325
opened Apr 25, 2023 by
liuzhenhai93
Fix duplicate init_process_group
stale
No activity in 60 days on issue or PR
#336
opened May 9, 2023 by
drcege
Modify LayerNorm to support other dtype inputs
stale
No activity in 60 days on issue or PR
#344
opened May 12, 2023 by
jacob-crux
A typing error on variable name in distrib_optimizer.py (param->model…
stale
No activity in 60 days on issue or PR
#350
opened May 21, 2023 by
xshaun
Update run_text_generation_server.py
stale
No activity in 60 days on issue or PR
#353
opened May 29, 2023 by
mzamini92
Update ensemble_classifier.py
stale
No activity in 60 days on issue or PR
#354
opened May 29, 2023 by
mzamini92