Issues: huggingface/trl
[Tracking issue] Integrate native liger-kernel losses
#2495
opened Dec 17, 2024 by
qgallouedec
Open
5
[Tracking issue] Wrong loss scaling when accumulating gradient
#2617
opened Jan 23, 2025 by
qgallouedec
Open
Issues list
[GRPO Trainer] Uneven GPU Utilization When Enabling vLLM with Multi-GPU Training
⚡accelerate
Related to accelerate
🚀 deepspeed
Related to deepspeed
🏋 GRPO
Related to GRPO
#2825
opened Feb 11, 2025 by
aeroplanepaper
5 tasks done
Tool usage support in tokenizers for Agentic RL
✨ enhancement
New feature or request
🏋 SFT
Related to SFT
#2821
opened Feb 10, 2025 by
August-murr
Support stop strings list for GRPO
✨ enhancement
New feature or request
🏋 GRPO
Related to GRPO
#2820
opened Feb 10, 2025 by
haoxiongliu
How to use tensor_parallel_size for vllm reference in GRPO?
⚡accelerate
Related to accelerate
🏋 GRPO
Related to GRPO
#2814
opened Feb 10, 2025 by
bannima
What is the minimum GPU requirement in gigabytes for TRL intensive training?
#2813
opened Feb 10, 2025 by
lonngxiang
[Question] Proper data format for GRPO Agent Training
🏋 GRPO
Related to GRPO
❓ question
Seeking clarification or more information
#2809
opened Feb 9, 2025 by
August-murr
PPO trainer does not evaluate even if I have set eval_strategy and eval_steps
🐛 bug
Something isn't working
🏋 PPO
Related to PPO
#2808
opened Feb 9, 2025 by
ruiqi-zhong
5 tasks done
Please make GRPO support 'data_collator' and hence multimodal LLM
✨ enhancement
New feature or request
🏋 GRPO
Related to GRPO
#2807
opened Feb 8, 2025 by
thusinh1969
GRPO unbalanced-memory
🐛 bug
Something isn't working
🏋 GRPO
Related to GRPO
#2805
opened Feb 8, 2025 by
mdy666
5 tasks done
There may be an error in grpotrainer
🐛 bug
Something isn't working
🏋 GRPO
Related to GRPO
#2803
opened Feb 8, 2025 by
macheng6
GRPO implementation issue: VLLM usage may hurt performance.
🐛 bug
Something isn't working
🏋 GRPO
Related to GRPO
#2802
opened Feb 8, 2025 by
yynil
5 tasks done
Error when using use_vllm=True with GRPOTrainer on V100 GPUs
🐛 bug
Something isn't working
🏋 GRPO
Related to GRPO
#2798
opened Feb 7, 2025 by
kawamou
5 tasks done
GRPO failed when training with fsdp
⚡accelerate
Related to accelerate
🐛 bug
Something isn't working
🏋 GRPO
Related to GRPO
#2796
opened Feb 7, 2025 by
cuong-dyania
5 tasks done
IndexError: pop from an empty deque while using PPO and downgrading accelerate to 0.34.2
⚡accelerate
#2795
opened Feb 7, 2025 by
JohnConnor123
5 tasks done
Return completions_length in kwargs for GRPO trainers
✨ enhancement
New feature or request
🏋 GRPO
Related to GRPO
#2794
opened Feb 7, 2025 by
casper-hansen
GRPOTrainer in the current main branch doesn't work (v0.14.0 works)
🐛 bug
Something isn't working
🏋 GRPO
Related to GRPO
⏳ needs more info
Additional information or clarification is required to proceed
#2791
opened Feb 7, 2025 by
zhengqigao
5 tasks done
vLLM doesn't estimate the model size properly
⚡accelerate
Related to accelerate
🐛 bug
Something isn't working
⚡ PEFT
Related to PEFT
#2788
opened Feb 6, 2025 by
Superskyyy
5 tasks done
Load/Savings Checkpoint Fails using DeepSpeed - GRPO
🐛 bug
Something isn't working
🚀 deepspeed
Related to deepspeed
#2787
opened Feb 6, 2025 by
zaddy6
5 tasks done
Add vLLM support to LogCompletionsCallback
⚡accelerate
Related to accelerate
🏋 DPO
Related to DPO
✨ enhancement
New feature or request
🏋 GRPO
Related to GRPO
#2786
opened Feb 6, 2025 by
tchang1997
NashMD trainer sampling policy wrong
⚡accelerate
Related to accelerate
🐛 bug
Something isn't working
⚡ PEFT
Related to PEFT
#2781
opened Feb 6, 2025 by
zhourunlong
5 tasks done
lora don't work! OOM
🐛 bug
Something isn't working
⚡ PEFT
Related to PEFT
#2780
opened Feb 6, 2025 by
zhangguoxin1
5 tasks done
ORPOTrainer crashes due to pickling failure if dataloader_num_workers > 0
🐛 bug
Something isn't working
🏋 ORPO
Related to ORPO
#2779
opened Feb 6, 2025 by
kiratp
Allow vllm sub-batching to avoid CUDA out of memory
✨ enhancement
New feature or request
🏋 GRPO
Related to GRPO
#2775
opened Feb 5, 2025 by
cfpark00