
Initial GRPO exps on the Numina dataset #262

Closed
wants to merge 1 commit

Conversation

edbeeching
Collaborator

Adding initial GRPO experiments on the Numina dataset for the deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B model.

@edbeeching edbeeching requested a review from lewtun February 10, 2025 09:22
Member

@lewtun lewtun left a comment


LGTM, with a comment about squashing the configs into one and potentially renaming them to make it easier for others to understand which is which (just ideas; happy to go with whatever you prefer).

Member


Could we collapse these files into a single config and share a README with the Slurm commands that override them? That way we'd have one config file linked to one dataset.

Also, WDYT about naming it something like config_numina_math.yaml instead of config_v00.00, which is a bit cryptic for the community?
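As a sketch of what that suggestion could look like (purely illustrative — the dataset id and hyperparameter names below are assumptions, not taken from this PR), a single dataset-specific config might read:

```yaml
# config_numina_math.yaml — hypothetical consolidated GRPO config.
# Model name follows the PR description; all other values are placeholders.
model_name_or_path: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
dataset_name: <numina-math-dataset-id>   # fill in the actual Hub dataset id
learning_rate: 1.0e-6                    # placeholder hyperparameter
num_generations: 8                       # placeholder GRPO sampling count
```

Variant runs could then be launched from README-documented Slurm commands that override individual fields on the command line, keeping one config file per dataset.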


I think that's a great idea! I’d like to work on this and help with merging the files and updating the naming conventions.

@edbeeching
Collaborator Author

Closing, as we are working on other experiments that are more successful, and I would prefer not to confuse members of the community.

@edbeeching edbeeching closed this Feb 14, 2025