Pull requests: huggingface/open-r1
[GRPO] generate with prompt containing the first <think> tag
#283
opened Feb 11, 2025 by
kashif
Fix: Avoid empty keyword argument in VLLMModelConfig from Makefile
#246
opened Feb 8, 2025 by
mattdepaolis
Replace the base model deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B to Qwen/Qwen2.5-1.5B-Instruct in GRPO
#198
opened Feb 5, 2025 by
DVampire
Update: Fix eval crash by disabling vLLM when using DeepSpeed
#147
opened Feb 1, 2025 by
ATaylorAerospace
Update: pinned lighteval reference to allow PyTorch 2.5+
#142
opened Jan 31, 2025 by
ATaylorAerospace
Replace static plan of action image with dynamic mermaid file
#111
opened Jan 29, 2025 by
INF800
[NOT MEANT TO MERG!] GRPO reward func for coding dataset
#105
opened Jan 29, 2025 by
August-murr
Solution for Potential Inflation of Reward Metrics for Unparseable Go…
#87
opened Jan 28, 2025 by
agulati18
chore: update trl to grpo_vllm branch, move lighteval to uv
#30
opened Jan 25, 2025 by
gerred