
DeepSpeed

Open-source deep-learning optimisation library from Microsoft that scales PyTorch training and inference to trillions of parameters with maximum efficiency.

Introduction

DeepSpeed is an Apache-2.0-licensed software suite created by Microsoft Research to push the limits of distributed deep learning.

  • Extreme scale & speed – ZeRO-based data parallelism, pipeline/tensor/expert parallelism and NVMe offload let you train dense or sparse Transformer models with billions to trillions of parameters on commodity or cloud hardware.
  • Four innovation pillars
    • DeepSpeed-Training (ZeRO, 3D-Parallel, MoE, ZeRO-Infinity)
    • DeepSpeed-Inference (custom kernels, heterogeneous memory, low-latency serving)
    • DeepSpeed-Compression (quantisation, pruning, ZeroQuant, XTC)
    • DeepSpeed4Science (AI-for-science system innovations)
  • Drop-in for PyTorch – wrap any torch.nn.Module with a single API call; integrate with Megatron-LM, Hugging Face, AzureML and more.
  • Developer tooling – autotuner, FLOPs profiler, communication logger, rich tutorials and active GitHub community with fortnightly releases.
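To make the "drop-in for PyTorch" point concrete, here is a minimal sketch of the typical workflow: define a JSON-style config, then hand any `torch.nn.Module` to `deepspeed.initialize`. The config values (batch size, learning rate, ZeRO stage) are illustrative assumptions, not defaults prescribed by DeepSpeed.

```python
# Minimal sketch of wrapping a PyTorch model with DeepSpeed.
# The hyperparameters below are illustrative, not recommended values.

# DeepSpeed is configured with a dict (or JSON file); this one enables
# ZeRO stage-2 partitioning of optimizer states and gradients, plus fp16.
ds_config = {
    "train_micro_batch_size_per_gpu": 8,
    "gradient_accumulation_steps": 4,
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 2},
    "optimizer": {"type": "AdamW", "params": {"lr": 3e-4}},
}

# In an actual training script (with `deepspeed` and a GPU available),
# any torch.nn.Module is then wrapped with a single call:
#
#   import torch
#   import deepspeed
#
#   model = torch.nn.Linear(1024, 1024)          # any nn.Module
#   engine, optimizer, _, _ = deepspeed.initialize(
#       model=model,
#       model_parameters=model.parameters(),
#       config=ds_config,
#   )
#   loss = engine(inputs).sum()
#   engine.backward(loss)   # DeepSpeed scales/partitions gradients
#   engine.step()
```

Such a script is normally started with the DeepSpeed launcher (for example `deepspeed --num_gpus=8 train.py`), which sets up the distributed environment for you.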

DeepSpeed has powered headline models such as Megatron-Turing NLG 530B and BLOOM, achieving up to 15× cost reduction and single-digit-millisecond inference latencies compared with prior state-of-the-art systems.
