Pull requests: huggingface/trl

Allow bootstrap GRPO
#2829 opened Feb 11, 2025 by qgallouedec Draft
5 tasks
GRPO + PEFT + vLLM
#2818 opened Feb 10, 2025 by winglian
5 tasks
[draft]add grpo support for third-party devices
#2815 opened Feb 10, 2025 by ji-huazhong
5 tasks
[WIP] [Liger] Liger KTO support
#2812 opened Feb 10, 2025 by vaibhavjindal Draft
5 tasks
✨ Add vLLM guided decoding support to GRPO Trainer
#2811 opened Feb 10, 2025 by kldzj
1 of 5 tasks
GRPO Environments for custom multi-step rollouts (vLLM-only)
#2810 opened Feb 9, 2025 by willccbb
5 tasks done
GRPO - Do not load reference model when beta == 0
#2806 opened Feb 8, 2025 by ingambe
[draft] Use vLLM in LogCompletionsCallback
#2797 opened Feb 7, 2025 by tchang1997 Draft
2 of 4 tasks
Remote GRPO ref model
#2763 opened Feb 4, 2025 by edbeeching Draft
Fix device placement for GRPO attention mask in compute_loss (label: 😴 stale)
#2747 opened Feb 3, 2025 by tgaddair
feat: Add cliprange to GRPO loss
#2739 opened Feb 2, 2025 by joey00072 Draft
1 of 5 tasks
Dynamically load LoRA weights when using vLLM
#2730 opened Feb 1, 2025 by tgaddair
⚡ Fix GRPO PEFT
#2725 opened Jan 31, 2025 by qgallouedec Draft
5 tasks
WIP: RLOOV2
#2724 opened Jan 31, 2025 by mnoukhov Draft
3 tasks
🔧 Optimize GRPO VRAM Usage
#2669 opened Jan 27, 2025 by andyl98
2 of 5 tasks
share parameters between model and ref model
#2668 opened Jan 27, 2025 by GeeeekExplorer
2 of 5 tasks
🐍 Support Python 3.13
#2593 opened Jan 20, 2025 by qgallouedec Draft
5 tasks
[WIP] [Liger] liger JSD support
#2573 opened Jan 16, 2025 by Mecoli1219 Draft
5 tasks
Reduce memory consumption when training with PPO
#2571 opened Jan 15, 2025 by summerspringwei
5 tasks
[Liger] liger DPO support
#2568 opened Jan 14, 2025 by kashif
Add _compute_score method to PPOTrainer
#2560 opened Jan 11, 2025 by oliveiraeliel Draft
2 of 5 tasks