Issues: huggingface/trl

[GRPO Trainer] Uneven GPU Utilization When Enabling vLLM with Multi-GPU Training ⚡accelerate Related to accelerate 🚀 deepspeed Related to deepspeed 🏋 GRPO Related to GRPO
#2825 opened Feb 11, 2025 by aeroplanepaper
5 tasks done
Tool usage support in tokenizers for Agentic RL ✨ enhancement New feature or request 🏋 SFT Related to SFT
#2821 opened Feb 10, 2025 by August-murr
Support stop strings list for GRPO ✨ enhancement New feature or request 🏋 GRPO Related to GRPO
#2820 opened Feb 10, 2025 by haoxiongliu
How to use tensor_parallel_size for vllm reference in GRPO? ⚡accelerate Related to accelerate 🏋 GRPO Related to GRPO
#2814 opened Feb 10, 2025 by bannima
[Question] Proper data format for GRPO Agent Training 🏋 GRPO Related to GRPO ❓ question Seeking clarification or more information
#2809 opened Feb 9, 2025 by August-murr
PPO trainer does not evaluate even if I have set eval_strategy and eval_steps 🐛 bug Something isn't working 🏋 PPO Related to PPO
#2808 opened Feb 9, 2025 by ruiqi-zhong
5 tasks done
GRPO unbalanced-memory 🐛 bug Something isn't working 🏋 GRPO Related to GRPO
#2805 opened Feb 8, 2025 by mdy666
5 tasks done
There may be an error in grpotrainer 🐛 bug Something isn't working 🏋 GRPO Related to GRPO
#2803 opened Feb 8, 2025 by macheng6
GRPO implementation issue: VLLM usage may hurt performance. 🐛 bug Something isn't working 🏋 GRPO Related to GRPO
#2802 opened Feb 8, 2025 by yynil
5 tasks done
CUDA out of memory issue for SFT Trainer 🐛 bug Something isn't working ⚡ PEFT Related to PEFT 🏋 SFT Related to SFT
#2819 opened Feb 7, 2025 by ibitec7
Error when using use_vllm=True with GRPOTrainer on V100 GPUs 🐛 bug Something isn't working 🏋 GRPO Related to GRPO
#2798 opened Feb 7, 2025 by kawamou
5 tasks done
GRPO failed when training with fsdp ⚡accelerate Related to accelerate 🐛 bug Something isn't working 🏋 GRPO Related to GRPO
#2796 opened Feb 7, 2025 by cuong-dyania
5 tasks done
IndexError: pop from an empty deque while using PPO and downgrading accelerate to 0.34.2 ⚡accelerate Related to accelerate 🐛 bug Something isn't working ⚡ PEFT Related to PEFT 🏋 PPO Related to PPO
#2795 opened Feb 7, 2025 by JohnConnor123
5 tasks done
Return completions_length in kwargs for GRPO trainers ✨ enhancement New feature or request 🏋 GRPO Related to GRPO
#2794 opened Feb 7, 2025 by casper-hansen
GRPOTrainer in the current main branch doesn't work (v0.14.0 works) 🐛 bug Something isn't working 🏋 GRPO Related to GRPO ⏳ needs more info Additional information or clarification is required to proceed
#2791 opened Feb 7, 2025 by zhengqigao
5 tasks done
vLLM doesn't estimate the model size properly ⚡accelerate Related to accelerate 🐛 bug Something isn't working ⚡ PEFT Related to PEFT
#2788 opened Feb 6, 2025 by Superskyyy
5 tasks done
Load/Savings Checkpoint Fails using DeepSpeed - GRPO 🐛 bug Something isn't working 🚀 deepspeed Related to deepspeed
#2787 opened Feb 6, 2025 by zaddy6
5 tasks done
Add vLLM support to LogCompletionsCallback ⚡accelerate Related to accelerate 🏋 DPO Related to DPO ✨ enhancement New feature or request 🏋 GRPO Related to GRPO
#2786 opened Feb 6, 2025 by tchang1997
NashMD trainer sampling policy wrong ⚡accelerate Related to accelerate 🐛 bug Something isn't working ⚡ PEFT Related to PEFT
#2781 opened Feb 6, 2025 by zhourunlong
5 tasks done
lora don't work! OOM 🐛 bug Something isn't working ⚡ PEFT Related to PEFT
#2780 opened Feb 6, 2025 by zhangguoxin1
5 tasks done
ORPOTrainer crashes due to pickling failure if dataloader_num_workers > 0 🐛 bug Something isn't working 🏋 ORPO Related to ORPO
#2779 opened Feb 6, 2025 by kiratp
Allow vllm sub-batching to avoid CUDA out of memory ✨ enhancement New feature or request 🏋 GRPO Related to GRPO
#2775 opened Feb 5, 2025 by cfpark00