Overview
fairseq is an extensible, high-performance sequence modeling toolkit developed by Facebook AI Research (FAIR) and implemented in Python with PyTorch. It aims to provide reference implementations of important sequence-modeling architectures and research papers, plus practical tooling for training and evaluating models for translation, language modeling, summarization, speech recognition/generation, and other text/audio sequence tasks.
Key features
- Reference implementations of many architectures and papers (Transformer, convolutional seq2seq, LSTM-based models, LightConv/DynamicConv, non-autoregressive methods, etc.).
- Speech and self-supervised audio support (wav2vec, wav2vec 2.0, XLS-R, wav2vec-U variants, and multilingual/speech-scaling examples).
- Multi-GPU and distributed training (data and model parallel workflows).
- Fast generation on CPU and GPU with multiple decoding strategies: beam search, diverse beam search, top-k/top-p (nucleus) sampling, and lexically constrained decoding.
- Support for large-batch training via gradient accumulation and mixed-precision (FP16) for faster training and reduced memory footprint.
- Full parameter and optimizer state sharding and CPU offloading for training very large models.
- Extensible registries for models, tasks, criterions, optimizers, and learning rate schedulers, plus Hydra-based configuration for flexible experimentation (see the sketch after this list).
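The registry is the main extension point: new models, tasks, criterions, optimizers, and schedulers are added with decorators and then become selectable from the command line or a Hydra config. Below is a minimal sketch loosely following the "Simple LSTM" tutorial in the fairseq docs; the names toy_seq2seq and ToySeq2SeqModel are hypothetical placeholders, and the actual encoder/decoder construction is omitted.

    # Minimal sketch of fairseq's model registry (names are hypothetical).
    from fairseq.models import (
        FairseqEncoderDecoderModel,
        register_model,
        register_model_architecture,
    )

    @register_model("toy_seq2seq")
    class ToySeq2SeqModel(FairseqEncoderDecoderModel):
        @staticmethod
        def add_args(parser):
            # expose model-specific hyperparameters to the CLI / config system
            parser.add_argument("--toy-embed-dim", type=int, default=256)

        @classmethod
        def build_model(cls, args, task):
            # build encoder/decoder from args and the task's dictionaries
            # (omitted here; the point is the registration hooks)
            raise NotImplementedError

    @register_model_architecture("toy_seq2seq", "toy_seq2seq_base")
    def toy_seq2seq_base(args):
        # a named architecture is just a set of default hyperparameters
        args.toy_embed_dim = getattr(args, "toy_embed_dim", 256)

Analogous decorators (register_task, register_criterion, and so on) exist for the other registries.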
Supported tasks and example models
fairseq provides examples and pretrained checkpoints for: machine translation (including WMT recipes), language modeling (Transformer and convolutional LMs), multilingual pretraining and translation (XLM-R, mBART), speech self-supervised pretraining and fine-tuning (wav2vec/wav2vec 2.0, XLS-R), speech-to-speech translation, video-language models (VideoCLIP/VLM), and many research reproductions (RoBERTa, BART, Levenshtein Transformer, non-autoregressive methods, etc.). The repository organizes these paper-driven examples in the examples/ folder to help reproduce or extend published work.
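As a quick illustration of the pretrained translation checkpoints and the decoding strategies listed under Key features, the snippet below loads a WMT'19 English-German model through torch.hub and decodes with beam search and with top-k sampling. It follows the usage shown in the fairseq README; it assumes the sacremoses and fastBPE Python packages are installed for the tokenizer/BPE arguments, and the exact keyword arguments accepted depend on the checkpoint and fairseq version.

    # Sketch: beam search vs. top-k sampling with a pretrained WMT'19 en-de
    # model loaded via torch.hub (the checkpoint is downloaded on first use).
    import torch

    en2de = torch.hub.load(
        "pytorch/fairseq",
        "transformer.wmt19.en-de.single_model",
        tokenizer="moses",
        bpe="fastbpe",
    )
    en2de.eval()

    # Beam search with beam size 5 (the default decoding strategy).
    print(en2de.translate("Machine learning is great!", beam=5))

    # Top-k sampling; the extra keyword arguments are forwarded to the generator.
    print(en2de.translate("Machine learning is great!",
                          beam=1, sampling=True, sampling_topk=10))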
Installation & requirements
- Requires PyTorch (recommended >= 1.10.0) and Python >= 3.8.
- An NVIDIA GPU and NCCL are recommended for training; optional extras include NVIDIA apex for faster mixed-precision training, PyArrow for handling large datasets, and Docker images (when using Docker, increase the shared memory size, e.g. with --ipc=host or --shm-size).
- Typical developer install:
  git clone https://github.com/pytorch/fairseq
  cd fairseq
  pip install --editable ./
  Released versions can instead be installed with pip install fairseq.
Ecosystem & integration
- Pretrained models are available via a convenient torch.hub interface and example scripts (see the snippet after this list).
- Integrates with other projects (for example, xFormers for optimized attention implementations) and is widely used in research reproductions.
- Documentation is hosted on ReadTheDocs and the project maintains community channels (GitHub, Twitter, Google Group) for discussions and support.
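A short example of the torch.hub workflow mentioned above, following the RoBERTa instructions in the repository (the checkpoint is downloaded and cached on first use; the method names are those of fairseq's RoBERTa hub interface):

    # Load a pretrained RoBERTa model and use it for feature extraction
    # and masked-token prediction via fairseq's hub interface.
    import torch

    roberta = torch.hub.load("pytorch/fairseq", "roberta.base")
    roberta.eval()  # disable dropout for deterministic behaviour

    tokens = roberta.encode("fairseq is a sequence modeling toolkit.")
    features = roberta.extract_features(tokens)  # last-layer representations
    print(features.shape)                        # (1, num_tokens, hidden_dim)

    # Predict the masked token with the pretrained LM head.
    print(roberta.fill_mask("fairseq is written in <mask>.", topk=3))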
License & citation
- fairseq is MIT-licensed (the license applies to code and released pretrained models).
- The recommended citation: Ott et al., "fairseq: A Fast, Extensible Toolkit for Sequence Modeling", NAACL-HLT 2019 (Demonstrations).
Notes
- The GitHub project was created on 2017-08-29 and has accumulated many community contributions; over time the repository has grown to include speech, multilingual, and multimodal components in addition to core NLP models.
- fairseq is designed both for research reproduction (reference implementations) and for practical model training at scale, making it a common toolkit in NLP and speech research pipelines.
