The paper “Attention Is All You Need” (2017) introduced the Transformer — a novel neural architecture relying solely on self-attention, removing recurrence and convolutions entirely. It revolutionized machine translation by dramatically improving both training speed and translation quality (e.g., 28.4 BLEU on the WMT 2014 English-to-German translation task), setting new state-of-the-art benchmarks. Its modular, parallelizable design opened the door to large-scale pretraining and fine-tuning, ultimately laying the foundation for modern large language models like BERT and GPT. This paper reshaped the landscape of NLP and deep learning, making attention-based models the dominant paradigm across many tasks.
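To make the core operation concrete, here is a minimal sketch of scaled dot-product attention, the building block the paper composes into multi-head self-attention. It is written in plain NumPy under toy assumptions: the sequence length, model dimension, and projection matrices (Wq, Wk, Wv) are illustrative, not taken from the paper.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V — the attention formula from the paper."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                        # similarity of each query to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)         # row-wise softmax
    return weights @ V                                     # weighted sum of value vectors

# Toy example: 4 tokens, model dimension 8 (hypothetical sizes)
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
# In self-attention, queries, keys, and values are all projections of the same input
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = scaled_dot_product_attention(x @ Wq, x @ Wk, x @ Wv)
print(out.shape)  # (4, 8): one contextualized vector per token
```

Because every token attends to every other token in a single matrix product, the whole sequence is processed in parallel, which is what enables the training-speed gains over recurrent models.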
The BERT (Bidirectional Encoder Representations from Transformers) paper introduced a powerful pre-trained language model that uses deep bidirectional Transformers and masked language modeling to capture both left and right context. Unlike prior unidirectional models, BERT achieved state-of-the-art performance across 11 NLP tasks (including the GLUE benchmark and SQuAD) by enabling fine-tuning with minimal task-specific adjustments. Its impact reshaped NLP by setting a new standard for transfer learning, greatly improving accuracy on tasks such as question answering, sentiment analysis, and natural language inference, and inspiring a wave of follow-up models like RoBERTa, ALBERT, and T5.
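As a rough illustration of the masked-language-modeling objective, the sketch below prepares a training example following the recipe the paper describes: about 15% of positions are selected, of which 80% are replaced with [MASK], 10% with a random token, and 10% left unchanged, while the labels record the original ids only at the selected positions. The helper name `mask_tokens`, the token ids, and the -100 ignore label are illustrative assumptions, not values specified by the paper.

```python
import numpy as np

def mask_tokens(token_ids, vocab_size, mask_id, rng, mask_prob=0.15):
    """Build (inputs, labels) for masked language modeling, BERT-style."""
    token_ids = np.array(token_ids)
    labels = np.full_like(token_ids, -100)      # -100 = position not scored in the loss

    chosen = rng.random(token_ids.shape) < mask_prob
    labels[chosen] = token_ids[chosen]          # model must recover the original token here

    roll = rng.random(token_ids.shape)
    inputs = token_ids.copy()
    inputs[chosen & (roll < 0.8)] = mask_id     # 80% of chosen positions -> [MASK]
    random_pick = chosen & (roll >= 0.8) & (roll < 0.9)
    inputs[random_pick] = rng.integers(0, vocab_size, token_ids.shape)[random_pick]  # 10% -> random token
    # remaining 10% of chosen positions keep the original token
    return inputs, labels

# Toy example with hypothetical ids (vocab size 30522, [MASK] id 103)
rng = np.random.default_rng(42)
inputs, labels = mask_tokens([101, 2023, 2003, 1037, 7953, 102], 30522, 103, rng)
print(inputs, labels)
```

Because the model predicts masked tokens from both the words before and after them, it learns bidirectional context, which is the key difference from the left-to-right language models that preceded it.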