DeepSpeed is an Apache-2.0-licensed software suite created by Microsoft Research to push the limits of distributed deep learning.
- Extreme scale & speed – ZeRO-based data parallelism, pipeline/tensor/expert parallelism, and NVMe offload let you train dense or sparse Transformer models with billions to trillions of parameters on commodity or cloud hardware.
- Four innovation pillars –
  - DeepSpeed-Training (ZeRO, 3D parallelism, MoE, ZeRO-Infinity)
  - DeepSpeed-Inference (custom kernels, heterogeneous memory, low-latency serving)
  - DeepSpeed-Compression (quantization, pruning, ZeroQuant, XTC)
  - DeepSpeed4Science (AI-for-science system innovations)
- Drop-in for PyTorch – wrap any `torch.nn.Module` with a single API call; integrates with Megatron-LM, Hugging Face, AzureML, and more.
- Developer tooling – autotuner, FLOPs profiler, communication logger, rich tutorials, and an active GitHub community with fortnightly releases.
DeepSpeed has powered headline models such as Megatron-Turing NLG 530B and BLOOM, achieving up to 15× cost reduction and single-digit-millisecond inference latencies compared with prior state-of-the-art systems.