Best learning resources for AI
KServe: a CNCF-incubating model-inference platform (formerly KFServing) that provides Kubernetes CRDs for scalable predictive and generative workloads.
NVIDIA Triton Inference Server: an open-source, high-performance server for deploying and scaling AI/ML models on GPUs or CPUs, supporting multiple frameworks and cloud/edge targets.
This paper ("Relational Recurrent Neural Networks", Santoro et al., 2018) introduces a Relational Memory Core that embeds multi-head dot-product attention into recurrent memory to enable explicit relational reasoning. Evaluated on synthetic distance-sorting, program execution, partially observable reinforcement learning, and large-scale language-modeling benchmarks, it consistently outperforms LSTM and memory-augmented baselines, setting state-of-the-art results on WikiText-103, Project Gutenberg, and GigaWord. By letting memories interact rather than merely store information, the approach substantially boosts sequential relational reasoning and downstream task performance.
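The mechanism the paper builds on, dot-product attention over memory slots, can be sketched in plain Python. This is a toy single-head version under simplifying assumptions (the `attention` helper and the 2-D example vectors are illustrative, not the paper's code):

```python
import math

def softmax(xs):
    # numerically stable softmax over a list of floats
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attention(queries, keys, values):
    """Scaled dot-product attention: each query attends over all
    key/value slots, so every output mixes information from every
    memory slot instead of reading one slot in isolation."""
    d = len(keys[0])
    out = []
    for q in queries:
        weights = softmax([dot(q, k) / math.sqrt(d) for k in keys])
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

memory = [[1.0, 0.0], [0.0, 1.0]]  # two memory slots
query = [[1.0, 0.0]]               # aligns with slot 0 more strongly
print(attention(query, memory, memory))
```

Because the query aligns with the first slot, that slot receives the larger attention weight, but the output still blends both slots; the Relational Memory Core applies this interaction between all memory slots at every recurrent step.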
MLflow: an open-source platform from Databricks that manages the entire machine-learning lifecycle, with experiment tracking, model packaging, a model registry, and deployment.
The BERT (Bidirectional Encoder Representations from Transformers) paper introduced a powerful pre-trained language model that uses deep bidirectional transformers and masked language modeling to capture both left and right context. Unlike prior unidirectional models, BERT achieved state-of-the-art performance across 11 NLP tasks (like GLUE, SQuAD) by enabling fine-tuning with minimal task-specific adjustments. Its impact reshaped NLP by setting a new standard for transfer learning, greatly improving accuracy on tasks such as question answering, sentiment analysis, and natural language inference, and inspiring a wave of follow-up models like RoBERTa, ALBERT, and T5.
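BERT's masked-language-modeling objective corrupts input tokens before asking the model to predict them; a minimal sketch of the standard 80/10/10 masking recipe described in the paper (the `mask_tokens` helper and the tiny `VOCAB` are illustrative, not the BERT codebase):

```python
import random

MASK = "[MASK]"
VOCAB = ["the", "cat", "sat", "on", "mat"]  # toy vocabulary for random replacement

def mask_tokens(tokens, mask_prob=0.15, rng=None):
    """BERT-style masked-LM corruption: each position is selected with
    probability mask_prob; a selected token becomes [MASK] 80% of the
    time, a random vocabulary token 10%, and stays unchanged 10%.
    Returns the corrupted sequence plus (position, original) targets."""
    rng = rng or random.Random(0)
    corrupted, targets = list(tokens), []
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            targets.append((i, tok))
            roll = rng.random()
            if roll < 0.8:
                corrupted[i] = MASK
            elif roll < 0.9:
                corrupted[i] = rng.choice(VOCAB)
            # else: keep the original token; the model must still predict it
    return corrupted, targets
```

Keeping 10% of selected tokens unchanged forces the model to build useful representations for every position, since it cannot rely on `[MASK]` alone to signal which tokens need predicting.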
OpenVINO is an open-source toolkit from Intel that streamlines the optimization and deployment of AI inference models across a wide range of Intel® hardware.
This paper introduces GPipe, a model-parallelism library designed to train large neural networks efficiently using pipeline parallelism. It partitions models across accelerators, processes micro-batches in parallel, and supports synchronous gradient updates. GPipe enables near-linear scaling with the number of devices while maintaining model quality and training stability. It achieves state-of-the-art performance in large-scale image classification (AmoebaNet) and multilingual machine translation (6B parameter Transformer), demonstrating flexibility across tasks. Its impact lies in making massive model training more practical and accessible across diverse architectures without relying on high-speed interconnects or custom model designs.
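The scheduling idea behind GPipe, staggering micro-batches so every pipeline stage stays busy, can be sketched with a toy scheduler (a simplified forward-pass-only illustration; `pipeline_schedule` is not GPipe's API):

```python
def pipeline_schedule(num_microbatches, num_stages):
    """GPipe-style forward schedule: at clock tick t, stage s processes
    micro-batch (t - s) if it is in range. Returns, per tick, the list
    of (stage, microbatch) pairs that run concurrently."""
    ticks = []
    for t in range(num_microbatches + num_stages - 1):
        active = [(s, t - s) for s in range(num_stages)
                  if 0 <= t - s < num_microbatches]
        ticks.append(active)
    return ticks

# 4 micro-batches over 2 stages: after the first "bubble" tick,
# both stages work on different micro-batches at the same time.
for t, active in enumerate(pipeline_schedule(4, 2)):
    print(t, active)
```

The pipeline "bubble" (ticks where some stages idle) shrinks relative to total work as the number of micro-batches grows, which is why GPipe approaches near-linear scaling with enough micro-batches per batch.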
ONNX Runtime: Microsoft’s high-performance, cross-platform inference engine for ONNX and GenAI models.
MLRun: an Iguazio-backed, open-source framework that orchestrates data, ML, and LLM pipelines with serverless execution, tracking, and monitoring.
This paper introduces GPT-2, showing that large-scale language models trained on diverse internet text can perform a wide range of natural language tasks in a zero-shot setting — without any task-specific training. By scaling up to 1.5 billion parameters and training on WebText, GPT-2 achieves state-of-the-art or competitive results on benchmarks like language modeling, reading comprehension, and question answering. Its impact has been profound, pioneering the trend toward general-purpose, unsupervised language models and paving the way for today’s foundation models in AI.
n8n: an open-source, node-based workflow-automation platform for designing and running complex integrations and AI-powered flows.
Megatron-LM: NVIDIA’s model-parallel training library for GPT-like transformers at multi-billion-parameter scale.
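The core idea behind Megatron-style tensor parallelism, splitting a layer's weight matrix across devices and combining the partial results, can be sketched with plain lists (a single-process simulation; `column_parallel_linear` is an illustration of the concept, not Megatron-LM's API):

```python
def matvec(W, x):
    # W's rows are output neurons; returns W @ x
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]

def column_parallel_linear(W, x, num_shards):
    """Megatron-style sharded linear layer: the weight matrix is split
    by output rows across shards; each shard computes its slice of the
    output independently, and the slices are concatenated (an
    all-gather in a real multi-GPU setup). Assumes the output
    dimension divides evenly by num_shards."""
    shard_size = len(W) // num_shards
    shards = [W[i * shard_size:(i + 1) * shard_size]
              for i in range(num_shards)]
    partials = [matvec(shard, x) for shard in shards]  # one per "device"
    return [y for part in partials for y in part]

W = [[1, 0], [0, 1], [1, 1], [2, 0]]
x = [3, 4]
# sharding across 2 "devices" reproduces the unsharded result exactly
assert column_parallel_linear(W, x, 2) == matvec(W, x)
```

Because each shard only needs its slice of the weights and the shared input, no single device ever holds the full layer, which is what lets multi-billion-parameter transformers fit across GPUs.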