generated from fastai/nbdev_template
Pull requests: huggingface/trl
Pull requests list
🔁 Add retry mechanism for GRPO training with configurable reward thre…
#2823
opened Feb 10, 2025 by
mandeep511
3 of 5 tasks
[draft]add grpo support for third-party devices
#2815
opened Feb 10, 2025 by
ji-huazhong
5 tasks
✨ Add vLLM guided decoding support to GRPO Trainer
#2811
opened Feb 10, 2025 by
kldzj
1 of 5 tasks
GRPO Environments for custom multi-step rollouts (vLLM-only)
#2810
opened Feb 9, 2025 by
willccbb
5 tasks done
[draft] Use vLLM in LogCompletionsCallback
#2797
opened Feb 7, 2025 by
tchang1997
Draft
2 of 4 tasks
Fix device placement for GRPO attention mask in compute_loss
😴 stale
No update from the author, will be closed soon
#2747
opened Feb 3, 2025 by
tgaddair
share parameters between model and ref model
#2668
opened Jan 27, 2025 by
GeeeekExplorer
2 of 5 tasks
[Not meant to be merged] Support branch for Trainer refactor
#2594
opened Jan 20, 2025 by
qgallouedec
Draft
5 tasks
Reduce memory consumption when training with PPO
#2571
opened Jan 15, 2025 by
summerspringwei
5 tasks
Add _compute_score method to PPOTrainer
#2560
opened Jan 11, 2025 by
oliveiraeliel
Draft
2 of 5 tasks