feat: add MoE LoRA rank scaling and torch_mm to MoE LoRA #1300

Open

hemildesai wants to merge 4 commits into main from
Conversation
hemildesai (Contributor, Author) commented:

/ok to test 987d4e9

hemildesai (Contributor, Author) commented:

/ok to test 1fb50e5

hemildesai (Contributor, Author) commented:

/ok to test 51a8283
wandb - https://wandb.ai/Nemo-automodel/automodel-moe-lora
Implements per-expert LoRA rank scaling based on LoRA Without Regret: when `moe_rank_scaling=True`, MoE expert modules receive a LoRA rank of `dim // n_activated_experts`, while standard Linear modules keep the full rank. This allows training a separate LoRA per expert with an appropriately reduced rank, matching the paper's recommendation. The default is `False`, preserving existing behavior. Also renames `lora_moe.py` to `lora_experts.py` for clarity, updates the Qwen3 MoE configs for the torch_mm experts backend, and adds a new TE packed-sequence LoRA config. A sketch of the rank-scaling rule follows the change list below.

Changes:

- `nemo_automodel/components/_peft/lora.py`
  - Add `moe_rank_scaling: bool = False` to `PeftConfig` and to its `from_dict` deserialization
  - In `apply_lora_to_linear_modules`, compute the scaled rank for MoE modules when enabled, with validation (`ValueError` if the scaled rank is < 1) and a warning on non-exact division
- Rename `nemo_automodel/components/_peft/lora_moe.py` to `lora_experts.py`
- `tests/unit_tests/_peft/test_lora_moe.py`
- `examples/llm_finetune/qwen/`
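A minimal sketch of the rank-scaling rule described above, under stated assumptions: `scaled_moe_rank` is a hypothetical helper name, not necessarily the code inside `apply_lora_to_linear_modules`. It floors the configured `dim` by `n_activated_experts`, raises a `ValueError` when the resulting rank drops below 1, and warns when the division is not exact.

```python
import logging

logger = logging.getLogger(__name__)


def scaled_moe_rank(dim: int, n_activated_experts: int) -> int:
    """Per-expert LoRA rank for MoE modules (hypothetical helper).

    `dim` is the LoRA rank configured for standard Linear modules; MoE expert
    modules instead get `dim // n_activated_experts`, following the
    "LoRA Without Regret" recommendation.
    """
    rank = dim // n_activated_experts
    if rank < 1:
        # Validation described in the PR: reject ranks that scale below 1.
        raise ValueError(
            f"moe_rank_scaling produced rank {rank} from dim={dim} and "
            f"n_activated_experts={n_activated_experts}; increase dim."
        )
    if dim % n_activated_experts != 0:
        # Warning described in the PR: non-exact division falls back to floor.
        logger.warning(
            "dim=%d is not evenly divisible by n_activated_experts=%d; "
            "using floor division, per-expert rank=%d",
            dim, n_activated_experts, rank,
        )
    return rank


# Example: dim=32 with 8 activated experts gives a per-expert rank of 4.
assert scaled_moe_rank(32, 8) == 4
```

When `moe_rank_scaling=False` (the default), MoE modules would simply keep the configured `dim`, so existing configs are unaffected.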