Issues: huggingface/accelerate
Issues list
When saving checkpoint in multi-node training, using Zero3 optimization is model.safetensors file full model ?
#3381
opened Feb 6, 2025 by
KeshavSingh29
2 of 4 tasks
estimate-memory gives ~64GB for meta-llama/Llama-3.1-70B-Instruct instead of anticipated ~140GB
#3379
opened Feb 5, 2025 by
dvrogozh
Initialize model with empty weight causes OOM with offloading to disk
#3374
opened Feb 1, 2025 by
Aiden-Frost
2 of 4 tasks
loading the prodigy optimizer does not move custom parameters to the accelerator
#3372
opened Jan 29, 2025 by
bghira
4 tasks
Training hangs indefinitely on first forward pass when using TPU v3-8 in Kaggle
#3370
opened Jan 27, 2025 by
WpythonW
2 of 4 tasks
Gradient accumulation with deepSpeed has issue if not set during configuration
#3369
opened Jan 27, 2025 by
khalil-Hennara
2 of 4 tasks
DataLoaderShard wrongly yields None instead of StopIteration when its dataloader returns StopIteration immediately
#3367
opened Jan 25, 2025 by
Aleko2286
2 of 4 tasks
"@verify_operation" lead to pretrain of multi-nodes hang
#3364
opened Jan 24, 2025 by
sankexin
2 of 4 tasks
tests/test_cli.py::ModelEstimatorTester::test_no_split_modules fails after 84a67891 in Transformers
#3362
opened Jan 23, 2025 by
dvrogozh
Google Colab TPU notebook_launcher doesn't work
#3358
opened Jan 21, 2025 by
matinmoezzi
2 of 4 tasks
Maybe a conflict between accelerate and transformers CLIPVisionModel
#3339
opened Jan 13, 2025 by
striveAgain
2 of 4 tasks
[Feature Request] include a DeepSpeed multi-node config slurm example
Labels: contributions-welcome, deepspeed
#3338
opened Jan 13, 2025 by
sayakpaul
How to save self-defined model with deepspeed zero 3?
#3320
opened Jan 2, 2025 by
amoyplane
2 of 4 tasks