Best learning resources for AI
Open-source framework for building, shipping and running containerized AI services with a single command.
Netflix’s human-centric framework for building and operating real-life data-science and ML workflows with idiomatic Python and production-grade scaling.
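The Netflix framework described here is Metaflow; a minimal sketch of what a flow looks like in it: steps are plain Python methods chained with `self.next`, and anything assigned to `self` is persisted as a versioned artifact.

```python
# Minimal Metaflow flow: each @step is plain Python; attributes assigned to
# self are stored and versioned automatically as artifacts.
from metaflow import FlowSpec, step

class TrainFlow(FlowSpec):
    @step
    def start(self):
        self.data = [1, 2, 3]        # versioned artifact
        self.next(self.train)

    @step
    def train(self):
        self.model = sum(self.data)  # stand-in for real training
        self.next(self.end)

    @step
    def end(self):
        print("model:", self.model)

if __name__ == "__main__":
    TrainFlow()  # run locally with: python train_flow.py run
```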
A Kubernetes-native workflow engine (originally at Lyft, now LF AI & Data) that provides strongly-typed, versioned data/ML pipelines at scale.
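The engine described here is Flyte; a hedged sketch of its Python SDK (flytekit), where the "strongly-typed" part shows up as ordinary type annotations that Flyte enforces between tasks:

```python
# Flytekit sketch: tasks are typed via Python annotations, and workflows
# compose them; Flyte checks the types at the task boundaries.
from typing import List
from flytekit import task, workflow

@task
def featurize(raw: List[int]) -> List[float]:
    return [x / 10.0 for x in raw]

@task
def train(features: List[float]) -> float:
    return sum(features)  # stand-in for a real training step

@workflow
def training_wf(raw: List[int]) -> float:
    return train(features=featurize(raw=raw))
```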
This paper shows that language model performance improves predictably as model size, dataset size, and compute scale up, following smooth power-law relationships. Larger models are more sample-efficient, so compute-efficient training uses very large models on moderate amounts of data, stopped well before convergence. The work provided foundational insights that influenced the development of massive models like GPT-3 and beyond, shaping how the AI community understands trade-offs between size, data, and compute in building ever-stronger models.
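In equation form, the paper fits test loss to power laws such as L(N) = (N_c / N)^α_N; a small sketch using the approximate constants reported in Kaplan et al. (2020):

```python
# Power-law fits from Kaplan et al. (2020), "Scaling Laws for Neural Language
# Models"; constants are the paper's approximate fitted values (loss in
# nats/token):
#   L(N) = (N_c / N) ** alpha_N   N = non-embedding parameters
#   L(D) = (D_c / D) ** alpha_D   D = dataset size in tokens
def loss_vs_params(n: float, alpha_n: float = 0.076, n_c: float = 8.8e13) -> float:
    return (n_c / n) ** alpha_n

def loss_vs_tokens(d: float, alpha_d: float = 0.095, d_c: float = 5.4e13) -> float:
    return (d_c / d) ** alpha_d

print(loss_vs_params(1.75e11))  # e.g. a GPT-3-scale parameter count
```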
This paper introduces GPT-3, a 175-billion-parameter autoregressive language model that achieves impressive zero-shot, one-shot, and few-shot performance across diverse NLP tasks without task-specific fine-tuning. Its scale allows it to generalize from natural language prompts, rivaling or surpassing prior state-of-the-art models that require fine-tuning. The paper’s impact is profound: it demonstrated the power of scaling laws, reshaped research on few-shot learning, and sparked widespread adoption of large-scale language models, influencing advancements in AI applications, ethical debates, and commercial deployments globally.
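Few-shot prompting in the paper's sense just means packing worked demonstrations into the input text, with no gradient updates; a schematic sketch (the translation task mirrors an example from the paper, the code scaffolding is illustrative):

```python
# Schematic few-shot prompt in the style the GPT-3 paper evaluates: a few
# input -> output demonstrations followed by the query; the model's completion
# is taken as its prediction.
examples = [
    ("cheese", "fromage"),
    ("house", "maison"),
]
query = "cat"

prompt = "Translate English to French:\n"
prompt += "".join(f"{en} => {fr}\n" for en, fr in examples)
prompt += f"{query} =>"
print(prompt)  # fed as-is to the model
```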
A PyTorch-based system for large-scale model parallel training, memory optimization, and heterogeneous acceleration.
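Since the entry does not name the system, here is the underlying idea in plain PyTorch rather than a guess at its API: model parallelism splits one network's layers across devices, moving activations between them.

```python
# Plain-PyTorch illustration of model parallelism (not the framework's own
# API): the two stages live on different GPUs and activations hop between
# devices during the forward pass.
import torch
import torch.nn as nn

class TwoStageNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.stage1 = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU()).to("cuda:0")
        self.stage2 = nn.Linear(4096, 10).to("cuda:1")

    def forward(self, x):
        h = self.stage1(x.to("cuda:0"))
        return self.stage2(h.to("cuda:1"))  # move activations to the next GPU
```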
An extensible open-source MLOps framework that lets teams design portable, reproducible pipelines decoupled from the underlying infrastructure stack.
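Assuming this entry refers to ZenML (whose tagline it matches), a hedged sketch of the decoupling it describes: steps and pipelines are plain decorated functions, while the orchestrator and artifact store are swapped via "stacks" without touching this code.

```python
# Hedged ZenML sketch: the pipeline below runs locally by default, and the
# same code runs on other orchestrators by switching the active stack.
from zenml import pipeline, step

@step
def load_data() -> list[float]:
    return [0.1, 0.2, 0.3]

@step
def train(data: list[float]) -> float:
    return sum(data)  # stand-in for a real trainer

@pipeline
def training_pipeline():
    train(load_data())

if __name__ == "__main__":
    training_pipeline()
```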
The book provides a comprehensive yet accessible introduction to probabilistic modeling and inference, covering topics like graphical models, Bayesian methods, and approximate inference. It balances theory with practical examples, making complex probabilistic concepts understandable for newcomers and useful for practitioners. Its impact lies in shaping how students and researchers approach uncertainty in machine learning, offering a unifying probabilistic perspective that has influenced research, teaching, and real-world applications across fields such as AI, robotics, and data science.
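As a taste of that probabilistic viewpoint, Bayes' rule (posterior ∝ prior × likelihood) can be computed on a grid in a few lines; a toy coin-bias example, illustrative rather than taken from the book:

```python
# Toy grid-based Bayesian inference: posterior over a coin's heads-probability
# after observing 7 heads and 3 tails, starting from a uniform prior.
import numpy as np

theta = np.linspace(0.01, 0.99, 99)            # candidate biases
prior = np.full_like(theta, 1.0 / theta.size)  # uniform prior
heads, tails = 7, 3
likelihood = theta**heads * (1 - theta)**tails
posterior = prior * likelihood
posterior /= posterior.sum()                   # normalize: Bayes' rule
print(theta[posterior.argmax()])               # MAP estimate, ~0.7
```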
This tutorial offers a detailed, line-by-line PyTorch implementation of the Transformer model introduced in "Attention Is All You Need." It elucidates the model's architecture—comprising encoder-decoder structures with multi-head self-attention and feed-forward layers—enhancing understanding through annotated code and explanations. This resource serves as both an educational tool and a practical guide for implementing and comprehending Transformer-based models.
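The centerpiece of that architecture is scaled dot-product attention; a compact PyTorch rendering of the formula softmax(QKᵀ/√d_k)V, in the tutorial's spirit rather than its verbatim code:

```python
# Scaled dot-product attention:
#   Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
import math
import torch

def attention(q, k, v, mask=None):
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)  # similarity logits
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = torch.softmax(scores, dim=-1)            # attention weights
    return weights @ v
```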
This book offers a comprehensive introduction to algorithmic information theory: it defines plain and prefix Kolmogorov complexity, explains the incompressibility method, relates complexity to Shannon information, and develops tests of randomness culminating in Martin-Löf randomness and Chaitin’s Ω. It surveys links to computability theory, mutual information, algorithmic statistics, Hausdorff dimension, ergodic theory, and data compression, providing numerous exercises and historical notes. By unifying complexity and randomness, it supplies rigorous tools for measuring information content, proving combinatorial lower bounds, and formalizing the notion of random infinite sequences, thus shaping modern theoretical computer science.
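Kolmogorov complexity itself is uncomputable, but any compressor's output length gives an upper bound, which makes the incompressibility idea easy to probe; an illustrative experiment, not from the book:

```python
# Illustration of incompressibility: compressed length upper-bounds Kolmogorov
# complexity; random bytes resist compression, structured bytes do not.
import os
import zlib

structured = b"ab" * 512         # low complexity: "print 'ab' 512 times"
random_bytes = os.urandom(1024)  # incompressible with overwhelming probability

print(len(zlib.compress(structured)))    # far below 1024
print(len(zlib.compress(random_bytes)))  # about 1024 or slightly more
```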
An open-source observability stack from Yunshan Networks that delivers zero-code, eBPF-based tracing, metrics, and continuous profiling for cloud-native and AI workloads.
SDXL Turbo and Stable Diffusion XL: Stability AI's text-to-image models, which generate detailed images with strong composition and realistic aesthetics from short prompts.
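Assuming the Hugging Face diffusers distribution of SDXL Turbo, single-step generation looks like this (model id and settings follow the published model card):

```python
# SDXL Turbo via Hugging Face diffusers; per the sdxl-turbo model card, Turbo
# is distilled for very few denoising steps with guidance disabled.
import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/sdxl-turbo", torch_dtype=torch.float16, variant="fp16"
).to("cuda")

image = pipe(
    prompt="cinematic photo of a red fox in fresh snow",
    num_inference_steps=1,  # Turbo targets 1-4 steps
    guidance_scale=0.0,     # classifier-free guidance off for Turbo
).images[0]
image.save("fox.png")
```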