
feat: add MoE LoRA rank scaling and torch_mm to MoE LoRA #1300

Open
hemildesai wants to merge 4 commits into main from hemil/lora-torch-mm

Conversation

@hemildesai (Contributor) commented Feb 17, 2026

wandb - https://wandb.ai/Nemo-automodel/automodel-moe-lora

Implement per-expert LoRA rank scaling based on LoRA Without Regret: when moe_rank_scaling=True, MoE expert modules receive a LoRA rank of dim // n_activated_experts while standard Linear modules keep the full rank. This allows training a separate LoRA per expert with appropriately reduced rank, matching the paper's recommendation. Default is False, preserving existing behavior.
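As a rough illustration of the rule, here is a minimal sketch of the scaled-rank computation together with the validation and floor-division warning described below. The function name and exact messages are placeholders, not the actual helper in lora.py.

```python
import warnings


def scaled_moe_lora_rank(dim: int, n_activated_experts: int) -> int:
    """Per-expert LoRA rank per the 'LoRA Without Regret' recipe: dim // n_activated_experts.

    Illustrative sketch only; the real logic lives in
    nemo_automodel/components/_peft/lora.py and may differ in detail.
    """
    rank = dim // n_activated_experts
    if rank < 1:
        raise ValueError(
            f"dim={dim} is too small for n_activated_experts={n_activated_experts}; "
            "the scaled LoRA rank would be < 1"
        )
    if dim % n_activated_experts != 0:
        warnings.warn(
            f"dim={dim} is not divisible by n_activated_experts={n_activated_experts}; "
            f"using floor division -> rank={rank}"
        )
    return rank


# Values matching the test cases described below:
# scaled_moe_lora_rank(16, 2) -> 8
# scaled_moe_lora_rank(16, 3) -> 5 (with a warning)
# scaled_moe_lora_rank(1, 2)  -> ValueError
```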

Also renames lora_moe.py to lora_experts.py for clarity, updates Qwen3 MoE configs for torch_mm experts backend, and adds a new TE packed-sequence LoRA config.

  • nemo_automodel/components/_peft/lora.py

    • Add moe_rank_scaling: bool = False to PeftConfig
    • Wire through from_dict deserialization
    • In apply_lora_to_linear_modules, compute the scaled rank for MoE modules when enabled, with validation (ValueError if rank < 1) and a warning on non-exact division (see the rank-selection sketch after this list)
  • nemo_automodel/components/_peft/lora_moe.py → lora_experts.py

    • Rename for clarity; update all imports
  • tests/unit_tests/_peft/test_lora_moe.py

    • Add 5 new tests for moe_rank_scaling:
      • Basic scaling (dim=16, n_activated=2 gives MoE rank=8, Linear rank=16)
      • Default off (both keep full rank)
      • Floor-division with warning (dim=16, n_activated=3 gives rank=5)
      • dim too small raises ValueError
      • Output equivalence with zero-init B matrices
  • examples/llm_finetune/qwen/

    • Update qwen3_moe_30b_lora.yaml (switch to the torch_mm experts backend, disable activation checkpointing)
    • Update qwen3_moe_30b_te_packed_sequence.yaml (safetensors format)
    • Add qwen3_moe_30b_te_packed_sequence_lora.yaml
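For orientation, a hedged sketch of the per-module rank selection the lora.py bullets describe. PeftConfig here is a stripped-down stand-in with only the relevant fields, and select_lora_rank / is_moe_expert_module are illustrative names, not the real nemo_automodel API.

```python
from dataclasses import dataclass


@dataclass
class PeftConfig:
    dim: int = 16                   # full LoRA rank used for standard Linear modules
    moe_rank_scaling: bool = False  # flag added by this PR; False preserves existing behavior


def select_lora_rank(cfg: PeftConfig, is_moe_expert_module: bool, n_activated_experts: int) -> int:
    """Return the LoRA rank to use for one module.

    MoE expert modules get cfg.dim // n_activated_experts when scaling is enabled
    (validation and the floor-division warning are shown in the earlier sketch);
    all other modules keep the full rank.
    """
    if cfg.moe_rank_scaling and is_moe_expert_module:
        return cfg.dim // n_activated_experts
    return cfg.dim


# With PeftConfig(dim=16, moe_rank_scaling=True) and 2 activated experts:
#   expert modules -> rank 8, standard Linear modules -> rank 16,
# matching the "basic scaling" test case above.
```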

copy-pr-bot (bot) commented Feb 17, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@hemildesai (Contributor, Author)

/ok to test 987d4e9

@hemildesai (Contributor, Author)

/ok to test 1fb50e5

Signed-off-by: Hemil Desai <hemild@nvidia.com>
@hemildesai (Contributor, Author)

/ok to test 51a8283

