How to use tensor_parallel_size for vllm reference in GRPO? #2814
GRPO uses vLLM to load the reference model for data sampling, and the limitation is that tensor parallelism is not supported. What if the reference model is larger than a single GPU can hold, for example a 72B model on 40GB H800s? Is there a setting to pass tensor_parallel_size to the vLLM parameters?
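For context, this is roughly what is being asked for. A minimal sketch using standalone vLLM's standard LLM API, which can already shard a model across GPUs via tensor_parallel_size; the checkpoint name is an illustrative placeholder, not something from this issue. The question is whether GRPOTrainer can forward such a value to its internal vLLM instance.

```python
# Sketch of the requested behavior: standalone vLLM shards a model that
# is too large for one GPU across several via tensor_parallel_size.
from vllm import LLM, SamplingParams

llm = LLM(
    model="Qwen/Qwen2-72B-Instruct",  # illustrative checkpoint, not from the issue
    tensor_parallel_size=4,           # shard the weights across 4 GPUs
    gpu_memory_utilization=0.9,
)
outputs = llm.generate(["Hello, world"], SamplingParams(max_tokens=16))
print(outputs[0].outputs[0].text)
```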
For multi-node it is currently not possible, but we are working on it. Meanwhile, if your training only uses <= 8 cards, you can try to make vLLM work on a single node while reserving two cards, and set the LLM in GRPOTrainer to use tp=2. It should work. A sketch of this split is shown below.
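A rough sketch of the suggested single-node split, assuming the GRPOConfig fields use_vllm, vllm_device, and vllm_gpu_memory_utilization. Note that GRPOConfig does not expose a tensor-parallel option here, which is the point of this issue: the tp=2 part would have to be patched into the trainer's internal LLM(...) call. The model, reward function, and dataset are placeholders.

```python
# Sketch: reserve one card for vLLM generation and train on the rest.
# use_vllm / vllm_device / vllm_gpu_memory_utilization are GRPOConfig
# fields; tensor_parallel_size is NOT exposed here -- using tp=2 would
# require patching the trainer's internal vLLM LLM(...) instantiation.
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

def reward_len(completions, **kwargs):
    # toy reward: prefer shorter completions (placeholder only)
    return [-float(len(c)) for c in completions]

training_args = GRPOConfig(
    output_dir="grpo-out",
    use_vllm=True,
    vllm_device="cuda:7",             # last card reserved for generation
    vllm_gpu_memory_utilization=0.9,
)
trainer = GRPOTrainer(
    model="Qwen/Qwen2.5-7B-Instruct",  # placeholder model
    reward_funcs=reward_len,
    args=training_args,
    train_dataset=load_dataset("trl-lib/tldr", split="train"),
)
trainer.train()
```

The reserved vLLM device must not overlap the training devices, so on an 8-card node the training job itself would be launched on the remaining seven GPUs.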
+1 I need multi-node training too. And Qwen-72B cannot launch on one card.
+1 I need multi-node training too. help~
I currently need to train a 32B model using GRPO. The setup includes 8 H800 GPUs.
Can't wait to see multi-node training available!
What? I can't believe GRPO doesn't support multi-node training yet.