[Feat] Add support for Dr.GRPO algorithm. Provide a better format reward function for countdown task. #1

Open
Bonjir wants to merge 3 commits into JerryWu-code:main from Bonjir:main

Commits

Commits on Jul 15, 2025