We introduce a new neural architecture to learn the conditional probability of an output sequence with elements that are discrete tokens corresponding to positions in an input sequence. Such problems cannot be trivially addressed by existing approaches such as sequence-to-sequence and Neural Turing Machines, because the number of target classes at each step of the output depends on the length of the input, which is variable. Problems such as sorting variable-sized sequences, and various combinatorial optimization problems, belong to this class. Our model solves the problem of variable-size output dictionaries using a recently proposed mechanism of neural attention. It differs from previous attention mechanisms in that, instead of using attention to blend hidden units of an encoder into a context vector at each decoder step, it uses attention as a pointer to select a member of the input sequence as the output. We call this architecture a Pointer Net (Ptr-Net). We show Ptr-Nets can be used to learn approximate solutions to three challenging geometric problems -- finding planar convex hulls, computing Delaunay triangulations, and the planar Travelling Salesman Problem -- using training examples alone. Ptr-Nets not only improve over sequence-to-sequence with input attention, but also allow us to generalize to variable-size output dictionaries. We show that the learnt models generalize beyond the maximum lengths they were trained on. We hope our results on these tasks will encourage a broader exploration of neural learning for discrete problems.
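A minimal sketch of the pointing mechanism described above, assuming an additive (Bahdanau-style) scoring function; the class name, layer names, and tensor sizes are illustrative rather than taken from the paper's implementation. The key point is that the softmax over encoder positions is itself the output distribution, instead of being used to blend encoder states into a context vector.

```python
import torch
import torch.nn as nn

class PointerAttention(nn.Module):
    """Additive attention whose softmax over encoder steps IS the output
    distribution (a pointer), rather than weights for a context vector.
    Dimensions and layer names are illustrative, not the paper's code."""

    def __init__(self, hidden_dim):
        super().__init__()
        self.W_enc = nn.Linear(hidden_dim, hidden_dim, bias=False)
        self.W_dec = nn.Linear(hidden_dim, hidden_dim, bias=False)
        self.v = nn.Linear(hidden_dim, 1, bias=False)

    def forward(self, enc_states, dec_state):
        # enc_states: (batch, src_len, hidden); dec_state: (batch, hidden)
        scores = self.v(torch.tanh(
            self.W_enc(enc_states) + self.W_dec(dec_state).unsqueeze(1)
        )).squeeze(-1)                         # (batch, src_len)
        return torch.softmax(scores, dim=-1)   # distribution over input positions

# Toy usage: point to one of 7 input elements for a batch of 2 sequences.
enc = torch.randn(2, 7, 32)
dec = torch.randn(2, 32)
probs = PointerAttention(32)(enc, dec)
print(probs.shape, probs.sum(dim=-1))          # torch.Size([2, 7]), sums ~1.0
```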
This paper proposes the Variational Lossy Autoencoder (VLAE), a VAE that uses autoregressive priors and decoders to deliberately discard local detail while retaining global structure. By limiting the receptive field of the PixelCNN decoder and employing autoregressive flows as the prior, the model forces the latent code to capture only high-level information, yielding controllable lossy representations. Experiments on MNIST, Omniglot, Caltech-101 Silhouettes, and CIFAR-10 set new likelihood records for VAEs and show reconstructions that preserve global structure while local texture is resynthesized. VLAE influenced subsequent work on representation bottlenecks, pixel-VAE hybrids, and learned compression and generative modeling.
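A hypothetical sketch of the receptive-field idea, assuming a Bernoulli pixel model and a single type-'A' masked convolution as the decoder's only autoregressive context; the paper's actual model uses a deeper conditional PixelCNN decoder and an autoregressive-flow prior, both omitted here. Because the decoder can only see a few neighbouring pixels, anything global has to flow through the latent code z.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedConv2d(nn.Conv2d):
    """Type-'A' masked convolution: each pixel sees only pixels above and to
    its left. Using shallow stacks of these keeps the decoder's receptive
    field small, so local texture comes from the autoregressive decoder and
    global structure must come from the latent code z."""
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        _, _, kh, kw = self.weight.shape
        mask = torch.ones_like(self.weight)
        mask[:, :, kh // 2, kw // 2:] = 0   # current pixel and pixels to the right
        mask[:, :, kh // 2 + 1:] = 0        # rows below
        self.register_buffer("mask", mask)

    def forward(self, x):
        return F.conv2d(x, self.weight * self.mask, self.bias,
                        self.stride, self.padding)

class TinyLossyDecoder(nn.Module):
    """Illustrative decoder: z is broadcast to every pixel and concatenated
    with the masked-conv features, then mapped to Bernoulli logits."""
    def __init__(self, z_dim=16, hidden=32):
        super().__init__()
        self.ar = MaskedConv2d(1, hidden, kernel_size=3, padding=1)
        self.out = nn.Conv2d(hidden + z_dim, 1, kernel_size=1)

    def forward(self, x, z):
        h = F.relu(self.ar(x))                               # local AR context only
        z_map = z[:, :, None, None].expand(-1, -1, *x.shape[2:])
        return self.out(torch.cat([h, z_map], dim=1))        # per-pixel logits

# Toy usage with a random binary "image" and a random latent code.
x = (torch.rand(4, 1, 28, 28) > 0.5).float()
z = torch.randn(4, 16)
logits = TinyLossyDecoder()(x, z)
recon_nll = F.binary_cross_entropy_with_logits(logits, x)
print(logits.shape, recon_nll.item())
```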
This paper introduces Message Passing Neural Networks (MPNNs), a unifying framework for graph-based deep learning, and applies it to quantum-chemistry property prediction, achieving state-of-the-art accuracy on the QM9 benchmark and approaching chemical accuracy on most targets. Its impact includes popularising graph neural networks and shaping subsequent work in cheminformatics, materials discovery, and the broader machine-learning community by demonstrating that learned message passing can replace hand-engineered molecular descriptors.
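A minimal sketch of one step of the MPNN template (message, update, readout) with illustrative module names and sizes; the paper's strongest variant uses edge networks, GRU updates over several steps, and a set2set readout, none of which are reproduced here.

```python
import torch
import torch.nn as nn

class TinyMPNN(nn.Module):
    """One message-passing layer plus a sum readout. Names (msg_fn, upd_fn,
    readout) and sizes are placeholders, not the paper's configuration."""
    def __init__(self, node_dim=16, edge_dim=4, hidden=32):
        super().__init__()
        self.msg_fn = nn.Sequential(nn.Linear(2 * node_dim + edge_dim, hidden),
                                    nn.ReLU(), nn.Linear(hidden, node_dim))
        self.upd_fn = nn.GRUCell(node_dim, node_dim)
        self.readout = nn.Linear(node_dim, 1)   # scalar molecular property

    def forward(self, h, edge_index, edge_attr):
        # h: (num_nodes, node_dim); edge_index: (2, num_edges) src/dst pairs;
        # edge_attr: (num_edges, edge_dim)
        src, dst = edge_index
        msgs = self.msg_fn(torch.cat([h[src], h[dst], edge_attr], dim=-1))
        agg = torch.zeros_like(h).index_add_(0, dst, msgs)   # sum over neighbours
        h = self.upd_fn(agg, h)                              # node-state update
        return self.readout(h.sum(dim=0, keepdim=True))      # graph-level prediction

# Toy molecule-like graph: 5 nodes, 6 directed edges.
h = torch.randn(5, 16)
edge_index = torch.tensor([[0, 1, 1, 2, 3, 4],
                           [1, 0, 2, 1, 4, 3]])
edge_attr = torch.randn(6, 4)
print(TinyMPNN()(h, edge_index, edge_attr).shape)   # torch.Size([1, 1])
```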
This paper introduces Relation Networks, a plug-and-play neural module that explicitly computes pairwise object relations. When appended to standard CNN/LSTM encoders, the module yields superhuman 95.5% accuracy on CLEVR, solves 18 of 20 bAbI tasks, and infers hidden links in dynamic physical systems, inspiring later work on relational reasoning across vision, language, and RL.
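A minimal sketch of the module's pairwise form, RN(O) = f_phi(sum over i,j of g_theta(o_i, o_j)); layer sizes are placeholders, and the question embedding that conditions g in the CLEVR model is omitted for brevity.

```python
import torch
import torch.nn as nn

class RelationNetwork(nn.Module):
    """g embeds every ordered pair of objects, the pair embeddings are summed,
    and f maps the sum to answer logits. Layer sizes are placeholders."""
    def __init__(self, obj_dim=16, hidden=64, n_answers=10):
        super().__init__()
        self.g = nn.Sequential(nn.Linear(2 * obj_dim, hidden), nn.ReLU(),
                               nn.Linear(hidden, hidden), nn.ReLU())
        self.f = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(),
                               nn.Linear(hidden, n_answers))

    def forward(self, objects):
        # objects: (batch, n_obj, obj_dim), e.g. CNN feature-map cells.
        b, n, d = objects.shape
        o_i = objects.unsqueeze(2).expand(b, n, n, d)   # first element of each pair
        o_j = objects.unsqueeze(1).expand(b, n, n, d)   # second element of each pair
        pair_sum = self.g(torch.cat([o_i, o_j], dim=-1)).sum(dim=(1, 2))
        return self.f(pair_sum)                          # answer logits

# Toy usage: 8 objects per example, batch of 2.
print(RelationNetwork()(torch.randn(2, 8, 16)).shape)    # torch.Size([2, 10])
```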
The paper “Attention Is All You Need” (2017) introduced the Transformer, a neural architecture that relies solely on self-attention and removes recurrence and convolutions. It revolutionized machine translation by dramatically improving training speed and translation quality (e.g., 28.4 BLEU on the WMT 2014 English-to-German task), setting new state-of-the-art benchmarks. Its modular, parallelizable design opened the door to large-scale pretraining and fine-tuning, ultimately laying the foundation for modern large language models such as BERT and GPT. The paper reshaped the landscape of NLP and deep learning, making attention-based models the dominant paradigm across many tasks.
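The core operation is scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. The sketch below shows just that step; the multi-head projections, positional encodings, residual connections, and feed-forward sublayers of the full Transformer are left out.

```python
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V, the operation the
    Transformer uses in place of recurrence and convolution."""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)     # (..., len_q, len_k)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v

# Toy self-attention: queries, keys, and values all come from the same sequence.
x = torch.randn(2, 5, 64)                 # (batch, seq_len, d_model)
out = scaled_dot_product_attention(x, x, x)
print(out.shape)                          # torch.Size([2, 5, 64])
```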
This paper introduces a Relational Memory Core that embeds multi-head dot-product attention into recurrent memory to enable explicit relational reasoning. Evaluated on synthetic distance sorting, program execution, partially observable reinforcement learning, and large-scale language-modeling benchmarks, it consistently outperforms LSTM and memory-augmented baselines, achieving state-of-the-art results on WikiText-103, Project Gutenberg, and GigaWord. By letting memories interact rather than merely store information, the approach substantially improves sequential relational reasoning and downstream task performance.
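A rough sketch of a single memory update, assuming PyTorch's built-in multi-head attention as the interaction mechanism; the paper's gating, layer normalization, and full recurrent loop are omitted, and the slot count and dimensions are illustrative. Each memory slot attends over all slots plus the new input, so memories interact rather than being stored independently.

```python
import torch
import torch.nn as nn

class RelationalMemoryStep(nn.Module):
    """One update of a memory matrix: each slot attends over all slots plus the
    new input via multi-head dot-product attention. A simplified, illustrative
    version; the paper additionally uses layer norm and gating."""
    def __init__(self, mem_dim=32, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(mem_dim, n_heads, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(mem_dim, mem_dim), nn.ReLU(),
                                 nn.Linear(mem_dim, mem_dim))

    def forward(self, memory, x):
        # memory: (batch, mem_slots, mem_dim); x: (batch, 1, mem_dim) new input
        mem_plus_input = torch.cat([memory, x], dim=1)
        attended, _ = self.attn(memory, mem_plus_input, mem_plus_input)
        memory = memory + attended            # residual-style attention update
        return memory + self.mlp(memory)      # row-wise MLP with residual

# Toy usage: one write of a new token embedding into a 4-slot memory.
mem = torch.randn(2, 4, 32)
x = torch.randn(2, 1, 32)
print(RelationalMemoryStep()(mem, x).shape)   # torch.Size([2, 4, 32])
```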