ms-swift (SWIFT: Scalable lightWeight Infrastructure for Fine-Tuning)

ms-swift (SWIFT) is an extensible, lightweight infrastructure from the ModelScope community for fine-tuning, evaluating, quantizing and deploying large language models (LLMs) and multimodal LLMs. It supports hundreds of text and multimodal models, many low-cost fine-tuning and quantized training techniques, Megatron-style model parallelism, reinforcement-learning alignment algorithms (DPO, PPO and the GRPO family), and multiple inference/deployment backends such as vLLM and LMDeploy. ms-swift provides a CLI, Python APIs and a Web UI for end-to-end model workflows.

Introduction

Overview

ms-swift (SWIFT — Scalable lightWeight Infrastructure for Fine-Tuning) is an end-to-end framework developed by the ModelScope community for training, aligning, evaluating, quantizing and deploying large language models and multimodal large models. It aims to provide day-0 support for a wide range of foundation models and a full pipeline that is friendly for both research experiments and production engineering.

Key capabilities
  • Wide model coverage: Day-0 / out-of-the-box support for 600+ text-only large models and 300+ multimodal large models (examples include Qwen series, InternLM, GLM4.5, Llama4, Llava, Phi4, Ovis, etc.).
  • Full-pipeline: Supports pre-training, supervised fine-tuning (SFT), human-preference alignment (DPO, GRPO family, PPO, etc.), evaluation (EvalScope backend), quantization (AWQ, GPTQ, BNB, FP8) and deployment (vLLM, LMDeploy, SGLang).
  • Lightweight & quantized training: Built-in support for popular parameter-efficient fine-tuning (PEFT) methods such as LoRA, QLoRA, DoRA, LoRA+, LongLoRA, Adapter, ReFT and more, plus support for training on quantized checkpoints to reduce resource needs.
  • Advanced parallelism & scalability: Integrates Megatron-style parallelism (tensor, pipeline, sequence, context, expert and virtual-pipeline parallelism, i.e. TP/PP/CP/EP/VPP) with dedicated optimizations for MoE models. Also supports DeepSpeed ZeRO, FSDP/FSDP2 and multiple distributed strategies.
  • Reinforcement and preference learning: Implements a rich GRPO algorithm family (GRPO, DAPO, GSPO, SAPO, CISPO, CHORD, RLOO, Reinforce++), DPO, KTO and other reward-based training workflows, including tools for multi-turn and agent-style training.
  • Multimodal and packing support: Mixed-modality training (text, image, audio, video) with multimodal packing for faster throughput and flexible control of vision/text modules.
  • Inference & deployment acceleration: Interoperates with vLLM, LMDeploy and SGLang to accelerate inference and production deployment; provides streaming output, merging of LoRA adapters for efficient inference, and switchable inference backends (see the minimal inference sketch after this list).
  • Evaluation & tooling: Built-in integration with EvalScope and many evaluation datasets; provides CLI, Python SDK and a Web UI (Gradio) for zero-threshold usage.
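
As a rough illustration of the inference side, here is a minimal Python sketch. It follows the inference example in the ms-swift documentation (PtEngine, InferRequest, RequestConfig), but exact class names, arguments and defaults should be checked against the installed version:

    # Inference with the native PyTorch backend; the documentation describes
    # analogous engines for the vLLM / LMDeploy / SGLang backends.
    from swift.llm import PtEngine, InferRequest, RequestConfig

    engine = PtEngine('Qwen/Qwen2.5-7B-Instruct', max_batch_size=2)
    request = InferRequest(messages=[{'role': 'user', 'content': 'Who are you?'}])
    config = RequestConfig(max_tokens=256, temperature=0.7)

    # infer() takes a batch of requests and returns OpenAI-style responses.
    responses = engine.infer([request], config)
    print(responses[0].choices[0].message.content)
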
Typical use cases
  • Rapid SFT / RLHF experiments on popular LLMs using LoRA/QLoRA or full-parameter training.
  • Training and deploying multimodal LLMs with preference alignment and multimodal reward models.
  • Large-scale cluster training using Megatron-SWIFT for MoE or huge-parameter models.
  • Exporting and quantizing models for inference with AWQ/GPTQ/BF16/FP8 and running them on vLLM/LMDeploy backends.
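
For the export/quantization use case, a hedged sketch of the Python export entry point is shown below; export_main / ExportArguments and the argument names here are assumptions that mirror the swift export CLI options described in the docs, so verify them before use:

    # Hypothetical export sketch: merge a trained LoRA adapter into the base
    # model and write an AWQ-quantized checkpoint. Names mirror the CLI flags
    # (--adapters, --merge_lora, --quant_method, --quant_bits) and are assumptions.
    from swift.llm import export_main, ExportArguments

    export_main(ExportArguments(
        adapters=['output/checkpoint-100'],  # path to a trained LoRA adapter
        merge_lora=True,                     # fold adapter weights into the base model
        quant_method='awq',                  # or 'gptq' / 'fp8', per the docs
        quant_bits=4,
        output_dir='output/model-awq-int4',
    ))
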
Installation & quick start
  • Install via pip (pip install ms-swift -U) or install in editable mode from source.
  • CLI, Python API and Web UI are provided. Example quick command lines cover single-GPU LoRA fine-tuning, Megatron distributed training, RLHF workflows, inference and export/quantization steps.
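
As a concrete starting point, the sketch below shows single-GPU LoRA SFT through the Python API; sft_main / TrainArguments follow the README's Python usage, while the dataset ID and hyperparameters are illustrative placeholders to adapt to your own data:

    # Minimal LoRA SFT sketch (mirrors the quick-start command line).
    from swift.llm import sft_main, TrainArguments

    result = sft_main(TrainArguments(
        model='Qwen/Qwen2.5-7B-Instruct',
        train_type='lora',                   # PEFT method; use 'full' for full-parameter training
        dataset=['AI-ModelScope/alpaca-gpt4-data-en#500'],  # dataset id + sample count
        torch_dtype='bfloat16',
        num_train_epochs=1,
        per_device_train_batch_size=1,
        learning_rate=1e-4,
        lora_rank=8,
        lora_alpha=32,
        output_dir='output',
    ))

The same field names generally map one-to-one onto the swift sft command-line flags, so the CLI and Python API can be used interchangeably.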
Documentation & paper
  • Official documentation is hosted on ReadTheDocs (Chinese and English versions). The project also provides an arXiv/AAAI paper describing SWIFT's design and capabilities.
Target users

Researchers and engineers who need a flexible, production-oriented framework to fine-tune, align and deploy foundation and multimodal models, especially when using PEFT, parallel/distributed training, quantized training, or advanced RLHF-style algorithms.

Information

  • Website: github.com
  • Authors: ModelScope community, Yuze Zhao, Jintao Huang, Jinghan Hu, Xingjun Wang, Yunlin Mao, Daoze Zhang, Zeyinzi Jiang, Zhikai Wu, Baole Ai, Ang Wang, Wenmeng Zhou, Yingda Chen
  • Published date: 2023/08/01
