Overview
ms-swift (SWIFT — Scalable lightWeight Infrastructure for Fine-Tuning) is an end-to-end framework developed by the ModelScope community for training, aligning, evaluating, quantizing and deploying large language models and multimodal large models. It aims to provide day-0 support for a wide range of foundation models and a full pipeline that serves both research experimentation and production engineering.
Key capabilities
- Wide model coverage: Day-0, out-of-the-box support for 600+ text-only large models and 300+ multimodal large models (e.g., the Qwen series, InternLM, GLM4.5, Llama4, LLaVA, Phi4, Ovis).
- Full-pipeline: Supports pre-training, supervised fine-tuning (SFT), human-preference alignment (DPO, GRPO family, PPO, etc.), evaluation (EvalScope backend), quantization (AWQ, GPTQ, BNB, FP8) and deployment (vLLM, LMDeploy, SGLang).
- Lightweight & quantized training: Built-in support for popular parameter-efficient fine-tuning (PEFT) methods such as LoRA, QLoRA, DoRA, LoRA+, LongLoRA, Adapter, ReFT and more, plus support for training on quantized checkpoints to reduce resource needs.
- Advanced parallelism & scalability: Integrates Megatron-style parallelism across the data, tensor, pipeline, sequence, context, expert and virtual-pipeline dimensions (DP/TP/PP/SP/CP/EP/VPP), with dedicated optimizations for MoE models. Also supports DeepSpeed ZeRO, FSDP/FSDP2 and multiple other distributed strategies.
- Reinforcement and preference learning: Implements a rich GRPO algorithm family (GRPO, DAPO, GSPO, SAPO, CISPO, CHORD, RLOO, Reinforce++) alongside DPO, KTO and other reward-based training workflows, including tooling for multi-turn and agent-style training; see the alignment sketch after this list.
- Multimodal and packing support: Mixed-modality training (text, image, audio, video) with multimodal packing for faster throughput and flexible control of vision/text modules.
- Inference & deployment acceleration: Interoperates with vLLM, LMDeploy and SGLang to accelerate inference and production deployment, with options for streaming output, merging LoRA adapters for efficient serving, and switching between inference backends.
- Evaluation & tooling: Built-in integration with EvalScope and many evaluation datasets; provides a CLI, a Python SDK and a Gradio-based Web UI so users can get started without writing code.
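As a rough illustration of the preference-alignment workflow, the sketch below shows what a LoRA-based DPO run might look like through the ms-swift CLI. The model and dataset identifiers are placeholders, the flag set is abbreviated, and exact names should be checked against the ms-swift documentation for your installed version.

```bash
# Sketch: LoRA-based DPO alignment via the ms-swift CLI.
# Flags are illustrative; '<preference-dataset>' is a placeholder for a
# dataset containing chosen/rejected response pairs.
CUDA_VISIBLE_DEVICES=0 \
swift rlhf \
    --rlhf_type dpo \
    --model Qwen/Qwen2.5-7B-Instruct \
    --train_type lora \
    --dataset '<preference-dataset>' \
    --torch_dtype bfloat16 \
    --num_train_epochs 1 \
    --learning_rate 1e-4 \
    --lora_rank 8 \
    --lora_alpha 32 \
    --output_dir output
```

Swapping `--rlhf_type` to another supported algorithm (e.g. kto or grpo) selects a different alignment objective; GRPO-style runs additionally require reward configuration (reward functions or a reward model), as described in the documentation.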
Typical use cases
- Rapid SFT / RLHF experiments on popular LLMs using LoRA/QLoRA or full-parameter training.
- Training and deploying multimodal LLMs with preference alignment and multimodal reward models.
- Large-scale cluster training using Megatron-SWIFT for MoE or huge-parameter models.
- Exporting models (e.g., merging LoRA adapters into BF16 checkpoints) and quantizing them with AWQ/GPTQ/FP8 for inference, then running them on vLLM/LMDeploy backends, as shown in the export sketch below.
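The export and quantization step in the last use case can be sketched as follows. This assumes a 4-bit AWQ export calibrated on a small dataset; the dataset identifier, flag names and output path are illustrative rather than authoritative.

```bash
# Sketch: quantize a model to 4-bit AWQ with ms-swift, then serve it with vLLM.
# Calibration dataset, flags and output path are illustrative.
CUDA_VISIBLE_DEVICES=0 \
swift export \
    --model Qwen/Qwen2.5-7B-Instruct \
    --quant_method awq \
    --quant_bits 4 \
    --dataset 'AI-ModelScope/alpaca-gpt4-data-zh#500' \
    --output_dir Qwen2.5-7B-Instruct-AWQ

# Serve the quantized checkpoint through an accelerated backend
# (exposes an OpenAI-compatible API).
swift deploy \
    --model Qwen2.5-7B-Instruct-AWQ \
    --infer_backend vllm
```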
Installation & quick start
- Install via pip: `pip install ms-swift -U`, or install an editable build from source.
- A CLI, a Python API and a Web UI are provided. Example quick-start command lines cover single-GPU LoRA fine-tuning, Megatron distributed training, RLHF workflows, inference, and export/quantization steps; a minimal sketch follows below.
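For orientation, a minimal quick start might look like the sketch below: install the package, run a single-GPU LoRA SFT job, then launch the Web UI. The dataset identifier and hyperparameters are illustrative; the full set of example commands lives in the official documentation.

```bash
# Sketch: install ms-swift, run a single-GPU LoRA SFT job, then open the Web UI.
# Dataset identifier and hyperparameters are illustrative.
pip install ms-swift -U

CUDA_VISIBLE_DEVICES=0 \
swift sft \
    --model Qwen/Qwen2.5-7B-Instruct \
    --train_type lora \
    --dataset 'AI-ModelScope/alpaca-gpt4-data-zh#500' \
    --torch_dtype bfloat16 \
    --num_train_epochs 1 \
    --per_device_train_batch_size 1 \
    --learning_rate 1e-4 \
    --lora_rank 8 \
    --lora_alpha 32 \
    --max_length 2048 \
    --output_dir output

# Zero-code alternative: launch the Gradio-based Web UI.
swift web-ui
```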
Documentation & paper
- Official documentation is hosted on ReadTheDocs (Chinese and English versions). The project also provides an arXiv/AAAI paper describing SWIFT's design and capabilities.
Target users
Researchers and engineers who need a flexible, production-oriented framework to fine-tune, align and deploy foundation and multimodal models, especially when using PEFT, parallel/distributed training, quantized training, or advanced RLHF-style algorithms.
