Best learning resources for AI
Microsoft’s high-performance, cross-platform inference engine for ONNX and GenAI models.
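A minimal inference sketch using the onnxruntime Python package; the file name "model.onnx" and the image-like input shape are illustrative assumptions, not part of any specific model.

```python
# Minimal ONNX Runtime inference sketch. The model file and input shape
# below are assumptions for illustration only.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])

# Look up the model's declared input name rather than hard-coding it.
input_name = session.get_inputs()[0].name
batch = np.random.rand(1, 3, 224, 224).astype(np.float32)  # assumed image-like input

outputs = session.run(None, {input_name: batch})
print(outputs[0].shape)
```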
An Iguazio-backed open-source framework that orchestrates data/ML/LLM pipelines with serverless execution, tracking and monitoring.
This paper introduces GPT-2, showing that large-scale language models trained on diverse internet text can perform a wide range of natural language tasks in a zero-shot setting — without any task-specific training. By scaling up to 1.5 billion parameters and training on WebText, GPT-2 achieves state-of-the-art or competitive results on benchmarks like language modeling, reading comprehension, and question answering. Its impact has been profound, pioneering the trend toward general-purpose, unsupervised language models and paving the way for today’s foundation models in AI.
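To make the zero-shot idea concrete, here is a small sketch in the spirit of the paper: the task (translation) is expressed purely as text for the model to continue, with no task-specific training. The Hugging Face transformers library and the public "gpt2" checkpoint are assumptions of this sketch, not part of the paper itself.

```python
# Zero-shot prompting sketch: the task is framed as plain text to continue.
# Assumes the Hugging Face transformers package and the public "gpt2" checkpoint.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "English: cheese\nFrench:"
result = generator(prompt, max_new_tokens=5, do_sample=False)
print(result[0]["generated_text"])
```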
Open-source, node-based workflow-automation platform for designing and running complex integrations and AI-powered flows.
NVIDIA’s model-parallel training library for GPT-like transformers at multi-billion-parameter scale.
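The core idea behind this style of model parallelism can be sketched in a few lines of plain PyTorch; this is a conceptual illustration of tensor (column-wise) parallelism, not Megatron-LM's actual API, and the shapes and "devices" are made up for clarity.

```python
# Conceptual sketch of tensor model parallelism (the idea behind Megatron-style
# training), NOT Megatron-LM's API: a linear layer's weight is split column-wise
# across two devices and the partial outputs are concatenated.
import torch

torch.manual_seed(0)
hidden, out_features = 8, 4
x = torch.randn(2, hidden)

full_weight = torch.randn(out_features, hidden)

# Each "device" (simulated here on CPU) holds half of the output columns.
w_shard_0, w_shard_1 = full_weight.chunk(2, dim=0)

y0 = x @ w_shard_0.t()          # computed on device 0
y1 = x @ w_shard_1.t()          # computed on device 1
y = torch.cat([y0, y1], dim=1)  # gather the column shards

# Matches the unsharded computation.
assert torch.allclose(y, x @ full_weight.t())
```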
Open-source framework for building, shipping and running containerized AI services with a single command.
Netflix’s human-centric framework for building and operating real-life data-science and ML workflows with idiomatic Python and production-grade scaling.
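This description matches Metaflow; a minimal flow sketch, assuming the metaflow package is installed, with step names and data chosen purely for illustration.

```python
# Minimal Metaflow-style flow sketch (run with: python hello_flow.py run).
# Step names and data are illustrative assumptions.
from metaflow import FlowSpec, step


class HelloFlow(FlowSpec):

    @step
    def start(self):
        self.numbers = [1, 2, 3]
        self.next(self.total)

    @step
    def total(self):
        self.result = sum(self.numbers)
        self.next(self.end)

    @step
    def end(self):
        print("sum =", self.result)


if __name__ == "__main__":
    HelloFlow()
```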
A Kubernetes-native workflow engine (originally at Lyft, now LF AI & Data) that provides strongly-typed, versioned data/ML pipelines at scale.
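This description matches Flyte; a small sketch of its strongly-typed task/workflow model, assuming the flytekit package is installed, with names and values chosen for illustration.

```python
# Strongly-typed pipeline sketch assuming flytekit; type annotations define
# each task's interface, and outputs flow between tasks as typed values.
from flytekit import task, workflow


@task
def square(x: int) -> int:
    return x * x


@task
def add(a: int, b: int) -> int:
    return a + b


@workflow
def pipeline(x: int = 3) -> int:
    return add(a=square(x=x), b=1)


if __name__ == "__main__":
    print(pipeline(x=3))  # runs locally; deploy to a cluster for scale
```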
This paper reveals that language model performance improves predictably as you scale up model size, dataset size, and compute, following smooth power-law relationships. It shows that larger models are more sample-efficient and that compute-optimal training uses very large models on moderate amounts of data, stopped well before convergence. The work provided foundational insights that influenced the development of massive models like GPT-3 and beyond, shaping how the AI community understands trade-offs between size, data, and compute in building ever-stronger models.
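The power-law relationships have the general form below, where the loss depends on non-embedding parameter count N, dataset size D (tokens), or compute C when the other factors are not bottlenecks; the constants and exponents are fit empirically in the paper and only the functional shape is reproduced here.

```latex
% Scaling-law form: N_c, D_c, C_c and alpha_N, alpha_D, alpha_C are
% empirical constants fit in the paper.
L(N) = \left(\frac{N_c}{N}\right)^{\alpha_N}, \qquad
L(D) = \left(\frac{D_c}{D}\right)^{\alpha_D}, \qquad
L(C) = \left(\frac{C_c}{C}\right)^{\alpha_C}
```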
This paper introduces GPT-3, a 175-billion-parameter autoregressive language model that achieves impressive zero-shot, one-shot, and few-shot performance across diverse NLP tasks without task-specific fine-tuning. Its scale allows it to generalize from natural language prompts, rivaling or surpassing prior state-of-the-art models that require fine-tuning. The paper’s impact is profound: it demonstrated the power of scaling laws, reshaped research on few-shot learning, and sparked widespread adoption of large-scale language models, influencing advancements in AI applications, ethical debates, and commercial deployments globally.
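A short sketch of the few-shot prompting style the paper describes: a handful of in-context examples followed by a new query, with no gradient updates. The translation pairs mirror the paper's illustrative example; the surrounding Python is just scaffolding.

```python
# Few-shot prompt in the style described by the GPT-3 paper: in-context
# examples followed by a new query, learned from purely at inference time.
few_shot_prompt = (
    "Translate English to French.\n"
    "sea otter => loutre de mer\n"
    "peppermint => menthe poivrée\n"
    "cheese =>"
)
# The model is expected to continue the pattern (e.g. " fromage")
# using only the examples given in the prompt.
print(few_shot_prompt)
```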
A PyTorch-based system for large-scale model parallel training, memory optimization, and heterogeneous acceleration.
An extensible open-source MLOps framework that lets teams design portable, reproducible pipelines decoupled from infra stacks.