Any ideas for running on 8× V100-32G GPUs? #89
Comments
Use LoRA.
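The LoRA suggestion can be sketched in plain PyTorch: freeze the pretrained weight and train only a small low-rank update, so optimizer state and gradients shrink dramatically. This is a minimal illustration, not the PEFT library's implementation; the class name and hyperparameters are hypothetical.

```python
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Minimal LoRA sketch: frozen base linear plus a trainable low-rank update."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)  # freeze the pretrained weight
        # A (r x in) is small-random, B (out x r) is zero, so at init the
        # layer reproduces the pretrained behaviour exactly.
        self.lora_a = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # frozen base output + scaled low-rank correction
        return self.base(x) + (x @ self.lora_a.T @ self.lora_b.T) * self.scale
```

In practice one would wrap the attention projection layers this way (or use the `peft` library) rather than hand-rolling it for a whole model.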
Freezing the vision encoder saves a lot of memory.
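Freezing a submodule just means disabling gradients on all of its parameters, which removes their gradient and optimizer-state memory. A small sketch; the `model.visual` attribute for the vision tower is an assumption about the model's layout, not a documented API:

```python
import torch


def freeze_module(module: torch.nn.Module) -> int:
    """Disable gradients for every parameter in `module`; returns how many were frozen."""
    count = 0
    for p in module.parameters():
        p.requires_grad_(False)
        count += 1
    return count


# Hypothetical usage, assuming the vision encoder lives under `model.visual`:
# freeze_module(model.visual)
```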
Could you show me how you solved it?
Use PyTorch's torch.utils.checkpoint. Most parts of the full model do not support gradient checkpointing even if gradient_checkpointing=True is set in the config file, so you need to wrap the expensive computations in modeling_qwen2_vl.py with torch.utils.checkpoint yourself. After that, the training memory at the first step is about 15 GB, which is a huge reduction.
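Manually applying torch.utils.checkpoint looks roughly like this: wrap the expensive sub-forward in `checkpoint(...)` so its activations are recomputed during the backward pass instead of being stored. A minimal sketch with a toy block (the block itself is hypothetical, not code from modeling_qwen2_vl.py):

```python
import torch
from torch.utils.checkpoint import checkpoint


class Block(torch.nn.Module):
    """Toy transformer-style MLP block with activation checkpointing."""

    def __init__(self, dim: int = 32):
        super().__init__()
        self.ff = torch.nn.Sequential(
            torch.nn.Linear(dim, 4 * dim),
            torch.nn.GELU(),
            torch.nn.Linear(4 * dim, dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Recompute self.ff's activations in the backward pass instead of
        # storing them; use_reentrant=False is the recommended variant.
        return x + checkpoint(self.ff, x, use_reentrant=False)
```

The same pattern applies inside the model file: replace `y = expensive_fn(x)` with `y = checkpoint(expensive_fn, x, use_reentrant=False)` for each large sub-computation. Note that checkpointing only helps during training (the inputs must require gradients), and it trades extra compute for the saved memory.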
Thank you very much.
I tried:
OOM still occurs.
Any ideas on how to solve this? (No A100 is available right now.)
Thank you very much!