Overview
ms-swift (SWIFT — Scalable lightWeight Infrastructure for Fine-Tuning) is an end-to-end framework developed by the ModelScope community for training, aligning, evaluating, quantizing and deploying large language models and multimodal large models. It aims to provide day-0 support for a wide range of foundation models and a full pipeline that serves both research experimentation and production engineering.
Key capabilities
- Wide model coverage: Day-0, out-of-the-box support for 600+ text-only large models and 300+ multimodal large models (e.g., the Qwen series, InternLM, GLM4.5, Llama4, LLaVA, Phi4, Ovis).
- Full-pipeline: Supports pre-training, supervised fine-tuning (SFT), human-preference alignment (DPO, GRPO family, PPO, etc.), evaluation (EvalScope backend), quantization (AWQ, GPTQ, BNB, FP8) and deployment (vLLM, LMDeploy, SGLang).
- Lightweight & quantized training: Built-in support for popular parameter-efficient fine-tuning (PEFT) methods such as LoRA, QLoRA, DoRA, LoRA+, LongLoRA, Adapter, ReFT and more, plus support for training on quantized checkpoints to reduce resource needs.
- Advanced parallelism & scalability: Integrates Megatron-style parallelism across the data, tensor, pipeline, sequence, context, expert and virtual-pipeline dimensions (DP/TP/PP/SP/CP/EP/VPP), with dedicated optimizations for MoE models. Also supports DeepSpeed ZeRO, FSDP/FSDP2 and multiple other distributed strategies.
- Reinforcement and preference learning: Implements a rich GRPO algorithm family (GRPO, DAPO, GSPO, SAPO, CISPO, CHORD, RLOO, Reinforce++) alongside DPO, KTO and other reward-based training workflows, including tooling for multi-turn and agent-style training; see the alignment sketch after this list.
- Multimodal and packing support: Mixed-modality training (text, image, audio, video) with multimodal packing for faster throughput and flexible control of vision/text modules.
- Inference & deployment acceleration: Interoperates with vLLM, LMDeploy and SGLang to accelerate inference and production deployment, with options for streaming output, merging LoRA adapters for efficient serving, and switching between inference backends.
- Evaluation & tooling: Built-in integration with EvalScope and many evaluation datasets; provides a CLI, a Python SDK and a Gradio-based Web UI so users can get started without writing code.
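As a rough illustration of the preference-alignment workflow, the sketch below shows what a LoRA-based DPO run might look like through the ms-swift CLI. The model and dataset identifiers are placeholders, the flag set is abbreviated, and exact names should be checked against the ms-swift documentation for your installed version.

```bash
# Sketch: LoRA-based DPO alignment via the ms-swift CLI.
# Flags are illustrative; '<preference-dataset>' is a placeholder for a
# dataset containing chosen/rejected response pairs.
CUDA_VISIBLE_DEVICES=0 \
swift rlhf \
    --rlhf_type dpo \
    --model Qwen/Qwen2.5-7B-Instruct \
    --train_type lora \
    --dataset '<preference-dataset>' \
    --torch_dtype bfloat16 \
    --num_train_epochs 1 \
    --learning_rate 1e-4 \
    --lora_rank 8 \
    --lora_alpha 32 \
    --output_dir output
```

Swapping `--rlhf_type` to another supported algorithm (e.g. kto or grpo) selects a different alignment objective; GRPO-style runs additionally require reward configuration (reward functions or a reward model), as described in the documentation.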
Typical use cases
- Rapid SFT / RLHF experiments on popular LLMs using LoRA/QLoRA or full-parameter training.
- Training and deploying multimodal LLMs with preference alignment and multimodal reward models.
- Large-scale cluster training using Megatron-SWIFT for MoE or huge-parameter models.
- Exporting models (e.g., merging LoRA adapters into BF16 checkpoints) and quantizing them with AWQ/GPTQ/FP8 for inference, then running them on vLLM/LMDeploy backends, as shown in the export sketch below.
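The export and quantization step in the last use case can be sketched as follows. This assumes a 4-bit AWQ export calibrated on a small dataset; the dataset identifier, flag names and output path are illustrative rather than authoritative.

```bash
# Sketch: quantize a model to 4-bit AWQ with ms-swift, then serve it with vLLM.
# Calibration dataset, flags and output path are illustrative.
CUDA_VISIBLE_DEVICES=0 \
swift export \
    --model Qwen/Qwen2.5-7B-Instruct \
    --quant_method awq \
    --quant_bits 4 \
    --dataset 'AI-ModelScope/alpaca-gpt4-data-zh#500' \
    --output_dir Qwen2.5-7B-Instruct-AWQ

# Serve the quantized checkpoint through an accelerated backend
# (exposes an OpenAI-compatible API).
swift deploy \
    --model Qwen2.5-7B-Instruct-AWQ \
    --infer_backend vllm
```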
Installation & quick start
- Install via pip: `pip install ms-swift -U`, or install an editable build from source.
- A CLI, a Python API and a Web UI are provided. Example quick-start command lines cover single-GPU LoRA fine-tuning, Megatron distributed training, RLHF workflows, inference, and export/quantization steps; a minimal sketch follows below.
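For orientation, a minimal quick start might look like the sketch below: install the package, run a single-GPU LoRA SFT job, then launch the Web UI. The dataset identifier and hyperparameters are illustrative; the full set of example commands lives in the official documentation.

```bash
# Sketch: install ms-swift, run a single-GPU LoRA SFT job, then open the Web UI.
# Dataset identifier and hyperparameters are illustrative.
pip install ms-swift -U

CUDA_VISIBLE_DEVICES=0 \
swift sft \
    --model Qwen/Qwen2.5-7B-Instruct \
    --train_type lora \
    --dataset 'AI-ModelScope/alpaca-gpt4-data-zh#500' \
    --torch_dtype bfloat16 \
    --num_train_epochs 1 \
    --per_device_train_batch_size 1 \
    --learning_rate 1e-4 \
    --lora_rank 8 \
    --lora_alpha 32 \
    --max_length 2048 \
    --output_dir output

# Zero-code alternative: launch the Gradio-based Web UI.
swift web-ui
```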
Documentation & paper
- Official documentation is hosted on ReadTheDocs (Chinese and English versions). The project also provides an arXiv/AAAI paper describing SWIFT's design and capabilities.
Target users
Researchers and engineers who need a flexible, production-oriented framework to fine-tune, align and deploy foundation and multimodal models, especially when using PEFT, parallel/distributed training, quantized training, or advanced RLHF-style algorithms.
