NVIDIA’s model-parallel training library for GPT-like transformers at multi-billion-parameter scale.
An open platform for training, serving, and evaluating chat-oriented LLMs; it powers Vicuna and Chatbot Arena.
Megatron-LM pioneered efficient tensor (intra-layer) model parallelism and combines it with pipeline parallelism, enabling training of GPT-style models with hundreds of billions of parameters at high GPU utilization.
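The core idea of tensor parallelism can be illustrated with plain NumPy: a linear layer's weight matrix is split column-wise across devices, each device computes a partial matmul on its shard, and the partial outputs are gathered. This is a minimal single-process sketch, not Megatron-LM's actual API; the 2-way split and all variable names are illustrative.

```python
import numpy as np

# Illustrative sketch of column-parallel (tensor) parallelism on one
# linear layer. In a real multi-GPU setup each shard lives on its own
# device and the concatenation is an all-gather collective.

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))    # batch of input activations
W = rng.standard_normal((8, 16))   # full weight matrix

# Split W column-wise: "rank 0" holds W[:, :8], "rank 1" holds W[:, 8:].
shards = np.split(W, 2, axis=1)

# Each rank computes its local partial output independently.
partial_outputs = [x @ w for w in shards]

# Gather the column shards to reconstruct the full output.
y_parallel = np.concatenate(partial_outputs, axis=1)

# Single-device reference computation agrees with the sharded result.
y_serial = x @ W
assert np.allclose(y_parallel, y_serial)
```

Splitting by columns means no communication is needed during the forward matmul itself, only at the gather step, which is why this decomposition scales well for the large feed-forward layers in transformers.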