
[WIP][Dion Official Optimizer, Muon] Integrate official Dion, and high speed Muon, optimizer impl with TorchTitan and Optimizer component class #1521


Draft
wants to merge 18 commits into
base: main
from

Conversation

lessw2020
Contributor

@lessw2020 lessw2020 commented Aug 4, 2025

The Dion authors have released their official Dion implementation.
We can therefore move from my previous unofficial implementation to their official one.

This PR:
  • integrates their three main optimizer files with the Titan Optimizer class and Titan configs to make Dion available.
  • places the Dion optimizer files under experiments/dion_optimizer.
  • adds Dion directly to build_optimizer for now; proper subclassing will be looked at later.
  • adds a parameterization file that classifies the LM head, embeddings, and 2D matrices so that each is routed appropriately with the appropriate scaling factors.
  • adds logging for the located LM head and embeddings so the user can check them:
Screenshot 2025-08-05 at 10 12 40 PM
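
To illustrate the classification step described in the list above, here is a minimal sketch of name-and-shape-based routing. It is not the code from this PR: the function names `classify_param` and `group_params`, the name fragments it matches, and the example parameter names and shapes are all hypothetical, chosen only to show the idea of separating the LM head, embeddings, and 2D matrices from everything else.

```python
from collections import defaultdict

# Hypothetical name fragments (assumptions); real model parameter names may differ.
LM_HEAD_KEYS = ("lm_head", "output.weight")
EMBEDDING_KEYS = ("tok_embeddings", "embed")


def classify_param(name, shape):
    """Return a routing group for one parameter, given its name and shape."""
    if any(key in name for key in LM_HEAD_KEYS):
        return "lm_head"
    if any(key in name for key in EMBEDDING_KEYS):
        return "embedding"
    if len(shape) == 2:
        return "matrix"
    return "other"  # biases, norm weights, and other non-2D tensors


def group_params(named_shapes):
    """Group parameter names by the class assigned in classify_param."""
    groups = defaultdict(list)
    for name, shape in named_shapes:
        groups[classify_param(name, shape)].append(name)
    return dict(groups)


# Illustrative names and shapes only; not taken from the PR.
example = [
    ("tok_embeddings.weight", (1000, 64)),
    ("layers.0.attention.wq.weight", (64, 64)),
    ("layers.0.attention_norm.weight", (64,)),
    ("output.weight", (1000, 64)),
]
groups = group_params(example)
for group, names in groups.items():
    # Printing the located groups mirrors the idea of logging them for the user.
    print(f"{group}: {names}")
```

Each group could then be handed to the optimizer as its own parameter group with its own settings. The scaling factors themselves are left out here because the PR description does not state them.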

Testing:
Llama 3 8B trains nicely. More debugging is needed to verify that the LM head, embeddings, etc. are all being found properly.

Screenshot 2025-08-03 at 9 07 39 PM

@meta-cla meta-cla bot added the CLA Signed label (this label is managed by the Meta Open Source bot) Aug 4, 2025
@lessw2020 lessw2020 marked this pull request as draft August 4, 2025 04:19
@lessw2020 lessw2020 changed the title from "[WIP][Dion Official Optimizer] Integrate official Dion optimizer impl with TorchTitan and Optimizer component class" to "[WIP][Dion Official Optimizer] Integrate official Dion, and high speed Muon, optimizer impl with TorchTitan and Optimizer component class" Aug 13, 2025
@lessw2020 lessw2020 changed the title from "[WIP][Dion Official Optimizer] Integrate official Dion, and high speed Muon, optimizer impl with TorchTitan and Optimizer component class" to "[WIP][Dion Official Optimizer, Muon] Integrate official Dion, and high speed Muon, optimizer impl with TorchTitan and Optimizer component class" Aug 13, 2025