GitHub - s-smits/grpo-optuna: Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna

Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna


Files (5 commits):
README.md
main.py
requirements.txt


⚠️ Work in progress and error-prone ⚠️ Contributions are more than welcome.
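
The repository description mentions GRPO with weighted reward functions. The following is a minimal, standard-library-only sketch of those two ideas, not the code in main.py: several reward functions are combined into one scalar via a weighted sum, and each completion's reward is then normalised against the other completions sampled for the same prompt (one common formulation of the group-relative advantage). The reward functions, the weights, and all names here are hypothetical, and the use of the population standard deviation is an assumption.

```python
from statistics import mean, pstdev
from typing import Callable, List, Sequence

RewardFn = Callable[[str], float]


def length_reward(completion: str) -> float:
    """Hypothetical reward: 1.0 if the completion has at most 20 characters."""
    return 1.0 if len(completion) <= 20 else 0.0


def digit_reward(completion: str) -> float:
    """Hypothetical reward: 1.0 if the completion contains at least one digit."""
    return 1.0 if any(ch.isdigit() for ch in completion) else 0.0


def weighted_reward(
    completion: str,
    reward_fns: Sequence[RewardFn],
    weights: Sequence[float],
) -> float:
    """Combine several reward functions into one scalar via a weighted sum."""
    if len(reward_fns) != len(weights):
        raise ValueError("reward_fns and weights must have the same length")
    return sum(w * fn(completion) for fn, w in zip(reward_fns, weights))


def group_relative_advantages(
    rewards: Sequence[float], eps: float = 1e-8
) -> List[float]:
    """Normalise each reward against the mean and std of its own group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; eps avoids division by zero
    return [(r - mu) / (sigma + eps) for r in rewards]


# One group: several sampled completions for the same (hypothetical) prompt.
completions = ["42", "the answer is forty-two, probably", "seven apples"]
reward_fns = [length_reward, digit_reward]
weights = [0.25, 0.75]  # hypothetical weights; these are what a tuner would search over

rewards = [weighted_reward(c, reward_fns, weights) for c in completions]
advantages = group_relative_advantages(rewards)

for c, r, a in zip(completions, rewards, advantages):
    print(f"{c!r}: reward={r:.2f}, advantage={a:+.3f}")
```

In a hyperparameter search, the weights would presumably be treated as tunable values, for example sampled per trial with Optuna's `trial.suggest_float`, alongside training settings such as the learning rate. How this repository wires that up is not shown on this page.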


Languages

Python 100.0%