v3.2.2
New Features
- Megatron-SWIFT Released: Supports parallelism techniques such as TP (tensor parallelism), PP (pipeline parallelism), SP (sequence parallelism), and CP (context parallelism) for pre-training and fine-tuning 100+ models, including the Qwen series, Llama series, and DeepSeek-R1 distilled series. Streaming datasets and sequence packing are also supported, enabling ultra-large datasets while improving training efficiency. For more details, refer to the Megatron-SWIFT Training Documentation.
- Multi-turn GRPO Training: Adapts GRPO to multi-turn agent tool-calling scenarios such as Deep Search. Example code can be found here.
- GRPO Mini-batch Training: Supports splitting each batch into mini-batches to reduce GPU memory consumption during training; a sketch of the idea follows this list. Refer to the GRPO Training Documentation.
- Embedding Training for Multimodal Models: Supports embedding training for multimodal models such as iic/gme-Qwen2-VL-2B-Instruct; a generic contrastive-loss sketch follows this list. For more information, refer to the Embedding Model Training Documentation.
- Multi-label Classification and Regression for Large and Multimodal Models: Supports end-to-end training and deployment for both task types; see the head-and-loss sketch after this list. Example scripts can be found here.
- Model Evaluation with EvalScope During Training: Supports model evaluation using EvalScope during training to monitor training performance in real time. Example scripts can be found in the Evaluation Documentation.
- External Plugin for ViT-Full + LLM-LoRA Training: Provides an external plugin that trains the LLM with LoRA while training the ViT with full parameters, each at its own learning rate. This avoids the precision error that merging LoRA into the ViT would introduce; a parameter-group sketch follows this list. Example code can be found here.
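
The GRPO mini-batch feature amounts to gradient accumulation over a large rollout batch. As a framework-agnostic illustration of why this lowers peak memory (all names here are hypothetical, not ms-swift's implementation), a minimal PyTorch sketch:

```python
import torch
from torch import nn

def minibatch_step(model: nn.Module, optimizer: torch.optim.Optimizer,
                   loss_fn, inputs: torch.Tensor, targets: torch.Tensor,
                   mini_batch_size: int) -> float:
    """Illustrative only: one optimizer step over a large batch,
    processed in mini-batches. Gradients are accumulated chunk by chunk,
    so peak activation memory scales with mini_batch_size rather than
    the full batch size."""
    optimizer.zero_grad()
    n = inputs.shape[0]
    total_loss = 0.0
    for start in range(0, n, mini_batch_size):
        xb = inputs[start:start + mini_batch_size]
        yb = targets[start:start + mini_batch_size]
        # Weight each chunk by its share of the batch so the accumulated
        # gradient equals that of a single full-batch backward pass.
        loss = loss_fn(model(xb), yb) * (xb.shape[0] / n)
        loss.backward()
        total_loss += loss.item()
    optimizer.step()
    return total_loss
```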
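For embedding training, the objective is typically an InfoNCE-style contrastive loss over one positive and a set of hard negatives per query. A minimal sketch of that loss, assuming pre-computed (B, D) query/positive and (B, K, D) negative embeddings (illustrative, not the exact loss ms-swift uses):

```python
import torch
import torch.nn.functional as F

def info_nce_loss(query: torch.Tensor, positive: torch.Tensor,
                  negatives: torch.Tensor, temperature: float = 0.05) -> torch.Tensor:
    """Illustrative InfoNCE with one positive and K hard negatives per query.

    query:     (B, D) query embeddings
    positive:  (B, D) embeddings of the matching documents
    negatives: (B, K, D) embeddings of hard negatives
    """
    query = F.normalize(query, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negatives = F.normalize(negatives, dim=-1)
    pos_sim = (query * positive).sum(-1, keepdim=True)           # (B, 1)
    neg_sim = torch.einsum("bd,bkd->bk", query, negatives)       # (B, K)
    logits = torch.cat([pos_sim, neg_sim], dim=1) / temperature  # (B, 1+K)
    # The positive always sits at index 0 of the logits.
    labels = torch.zeros(query.shape[0], dtype=torch.long, device=query.device)
    return F.cross_entropy(logits, labels)
```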
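Multi-label classification differs from the usual single-label setup mainly in the head and loss: one independent logit per label, trained with binary cross-entropy rather than softmax cross-entropy (regression uses the same head shape with an MSE loss instead). A generic sketch, not the exact head ms-swift builds:

```python
import torch
from torch import nn

class MultiLabelHead(nn.Module):
    """Illustrative multi-label classification head: one logit per label,
    trained with BCEWithLogitsLoss on multi-hot targets."""

    def __init__(self, hidden_size: int, num_labels: int):
        super().__init__()
        self.proj = nn.Linear(hidden_size, num_labels)
        self.loss_fn = nn.BCEWithLogitsLoss()

    def forward(self, pooled: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # pooled: (B, hidden_size) pooled sequence representation
        # labels: (B, num_labels) multi-hot targets, e.g. [[1, 0, 1], ...]
        logits = self.proj(pooled)
        return self.loss_fn(logits, labels.float())
```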
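The ViT-full + LLM-LoRA plugin boils down to optimizer parameter groups: full-parameter ViT weights in one group, LoRA weights in another, each with its own learning rate. A minimal sketch, assuming the vision tower's parameters are prefixed with "visual." and LoRA parameters contain "lora_" (naming varies by model; this is not ms-swift's plugin code):

```python
import torch
from torch import nn

def build_optimizer(model: nn.Module, vit_lr: float = 1e-5,
                    lora_lr: float = 1e-4) -> torch.optim.AdamW:
    """Illustrative only: full-parameter ViT training plus LoRA on the
    LLM, with a separate learning rate per group."""
    vit_params, lora_params = [], []
    for name, param in model.named_parameters():
        if name.startswith("visual."):      # assumed vision-tower prefix
            vit_params.append(param)        # train the whole vision tower
        elif "lora_" in name:
            lora_params.append(param)       # LoRA adapters on the LLM
        else:
            param.requires_grad_(False)     # freeze the remaining base weights
    return torch.optim.AdamW([
        {"params": vit_params, "lr": vit_lr},
        {"params": lora_params, "lr": lora_lr},
    ])
```

Keeping the ViT in full parameters also means no merge-lora step is needed for the vision tower, which is exactly the precision concern the plugin avoids.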
New Models
- iic/gme-Qwen2-VL-2B-Instruct series
- Qwen/Qwen2.5-VL-32B-Instruct
- LLM-Research/gemma-3-4b-it series
- deepseek-ai/DeepSeek-V3-0324
- mistralai/Mistral-Small-3.1-24B-Instruct-2503 series
What's Changed
- update code doc by @hjh0119 in #3498
- fix readme by @Jintao-Huang in #3499
- feat: swanlab config add ms-swift by @Zeyi-Lin in #3500
- Support GME models by @tastelikefeet in #3513
- fix docs by @tastelikefeet in #3514
- Fix docs links by @tastelikefeet in #3516
- fix vllm memory leak by @hjh0119 in #3515
- [Docs] Easy .[all] install from git by @xihuai18 in #3518
- Fix bugs by @tastelikefeet in #3520
- support megatron by @Jintao-Huang in #2885
- fix megatron by @Jintao-Huang in #3527
- support gemma3 by @hjh0119 in #3492
- fix megatron pipeline parallel by @Jintao-Huang in #3529
- fix megatron tie_weight by @Jintao-Huang in #3530
- support megatron llama by @Jintao-Huang in #3532
- Support megatron llama3.1 3.2 by @Jintao-Huang in #3537
- Update LlavaHfTemplate to match the changed image-token handling logic for LLaVA and LLaVA-Next models in transformers versions above 4.47 by @zsxm1998 in #3521
- refactor llava-hf by @Jintao-Huang in #3538
- fix docs by @Jintao-Huang in #3539
- refactor get_megatron_model_meta by @Jintao-Huang in #3542
- Gather infonce loss and support hard negative samples by @tastelikefeet in #3548
- fix docs by @tastelikefeet in #3553
- fix unsloth by @tastelikefeet in #3554
- fix grpo mllm split modules by @hjh0119 in #3552
- grpo embedding layer lora by @hjh0119 in #3531
- update arguments by @Jintao-Huang in #3556
- update doc by @hjh0119 in #3557
- Support all models' embedding and mask fake negative by @tastelikefeet in #3563
- skip grpo first wake up by @hjh0119 in #3562
- move grpovllmengine import by @hjh0119 in #3568
- fix bugs & support dataset_name by @Jintao-Huang in #3565
- fix wrap by @tastelikefeet in #3572
- Feature: add train-eval loop by @Yunnglin in #3569
- compat vllm>=0.8 by @Jintao-Huang in #3574
- [grpo] Fix Incorrect Placement of Data in eval_queue During async_generate by @hjh0119 in #3573
- Fix lmdeploy 0.7.3 by @tastelikefeet in #3584
- support vit full llm lora by @Jintao-Huang in #3575
- support Mistral3.1-2503 by @hjh0119 in #3588
- Support megatron packing by @Jintao-Huang in #3595
- [megatron] support streaming by @Jintao-Huang in #3609
- fix rft by @lxline in #3602
- [template] refactor replace media tokens by @Jintao-Huang in #3614
- fix top_logprobs by @Jintao-Huang in #3616
- Fix bugs by @Jintao-Huang in #3619
- Support multi turn grpo by @tastelikefeet in #3615
- fix grpo npu context by @hjh0119 in #3597
- support regression multi-label by @Jintao-Huang in #3621
- Support peft 0.15 by @tastelikefeet in #3623
- update grpo warning by @hjh0119 in #3598
- fix grpo rm zero3 by @hjh0119 in #3626
- GRPO mini batch by @hjh0119 in #3205
- fix grpo warning with pt backend by @hjh0119 in #3629
- compat transformers 4.50 by @Jintao-Huang in #3625
- support train_sampler_random by @Jintao-Huang in #3631
- fix grpo multi turn by @tastelikefeet in #3632
- update docs by @Jintao-Huang in #3633
- Support deepseek v3 0324 by @Jintao-Huang in #3637
- fix grpo cosine reward by @hjh0119 in #3638
- fix grpo lora split module by @hjh0119 in #3635
- fix reward model by @Jintao-Huang in #3641
- support qwen2_5_vl_32b by @Jintao-Huang in #3642
- fix grpo warning by @hjh0119 in #3630
- grpo reset prefix cache by @hjh0119 in #3640
- fix prm by @Jintao-Huang in #3647
- fix grpo pt ddp by @Jintao-Huang in #3648
- [grpo] separate the epsilon by @hjh0119 in #3599
- Fix template torch_dtype by @Jintao-Huang in #3651
- fix grpo epsilon by @hjh0119 in #3652
- update docs by @Jintao-Huang in #3653
- set grpo multi turn max tokens by @hjh0119 in #3655
- fix label_names by @Jintao-Huang in #3657
- fix grpo vllm tp by @Jintao-Huang in #3658
- compat vllm0.8.1 by @Jintao-Huang in #3656
- Fix evaluation of embedding by @tastelikefeet in #3661
- update readme by @Jintao-Huang in #3663
- fix npu context by @Jintao-Huang in #3664
New Contributors
Full Changelog: v3.2.1...v3.2.2