AIAny - GPT Series Papers

Introduced the two-stage recipe behind the GPT lineage: unsupervised generative pre-training on unlabeled text, then supervised fine-tuning per task. A single 12-layer Transformer decoder beat bespoke architectures on 9 of 12 NLP benchmarks.

openai transformers foundation-model paper LLM

Large Language Model Papers, 2019

GPT-2: Language Models are Unsupervised Multitask Learners

Alec Radford, Jeffrey Wu +4, OpenAI

A 1.5B-parameter model trained only to predict the next token on diverse web text does translation, summarization, and QA zero-shot, with no fine-tuning. It recast NLP tasks as conditional language modeling and sparked the staged-release misuse debate.

LLM NLP openai paper

Large Language Model Papers, 2020

GPT-3: Language Models are Few-Shot Learners

Tom B. Brown, Benjamin Mann +29, OpenAI

At 175 billion parameters, this autoregressive model becomes a strong few-shot learner: it handles translation, QA, and reasoning from a few prompt examples with no gradient updates, establishing in-context learning as an alternative to fine-tuning.

LLM NLP openai paper
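
In-context learning of the kind the GPT-3 summary describes amounts to placing a few worked examples in the prompt and letting the model continue the pattern, with no weight updates. A minimal sketch of assembling such a prompt; the helper name `build_few_shot_prompt` and the "Q:/A:" layout are illustrative assumptions, not the paper's format:

```python
def build_few_shot_prompt(task, examples, query):
    """Assemble a few-shot prompt: task description, K demonstrations, then the open query.

    The "Q:/A:" layout is an assumption made for illustration only.
    """
    lines = [task, ""]
    for source, target in examples:
        lines.append(f"Q: {source}")
        lines.append(f"A: {target}")
        lines.append("")
    # The final answer slot is left empty for the model to complete.
    lines.append(f"Q: {query}")
    lines.append("A:")
    return "\n".join(lines)


prompt = build_few_shot_prompt(
    "Translate English to French.",
    [("cheese", "fromage"), ("sea otter", "loutre de mer")],
    "peppermint",
)
print(prompt)
```

The resulting string would be sent to the model as-is; changing the number of demonstrations moves between the zero-shot, one-shot, and few-shot settings.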

Large Language Model Papers, 2021

Codex: Evaluating Large Language Models Trained on Code

Mark Chen, Jerry Tworek +2, OpenAI

Showed that fine-tuning a GPT model on public GitHub code yields a capable program synthesizer, and introduced HumanEval, the docstring-to-function benchmark that remains a standard reference for code-generation evaluation. A production variant powers GitHub Copilot.

openai code codex copilot evaluation
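
HumanEval results are usually reported as pass@k: the probability that at least one of k sampled completions passes the unit tests. The Codex paper estimates it from n samples per problem, of which c pass, as 1 - C(n-c, k) / C(n, k). A minimal sketch of that estimator; the function name is an assumption, and it uses `math.comb` directly rather than a numerically stabilized product:

```python
from math import comb


def pass_at_k(n, c, k):
    """Unbiased pass@k estimate for one problem: 1 - C(n - c, k) / C(n, k).

    n: completions sampled, c: completions that passed, k: sample budget.
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # With fewer than k failures, every size-k subset contains a passing sample.
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


# 4 samples, 1 passing, budget of 2: 1 - C(3, 2) / C(4, 2) = 1 - 3/6
print(pass_at_k(4, 1, 2))
```

A benchmark-level score would average this value over all problems.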

Large Language Model Papers, 2022

InstructGPT: Training Language Models to Follow Instructions with Human Feedback

Long Ouyang, Jeff Wu +4, OpenAI

Made reinforcement learning from human feedback (RLHF) the standard alignment recipe: collect demonstrations and preference rankings, train a reward model, then optimize with PPO. A 1.3B aligned model was preferred over the 175B GPT-3 by human raters.

openai RL paper LLM NLP
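
The reward-model step in the RLHF recipe can be illustrated with the pairwise preference loss, -log σ(r_chosen - r_rejected), which pushes the reward of the preferred response above that of the rejected one. A minimal sketch on plain floats; the scalar rewards here are hypothetical stand-ins for the outputs of a learned reward model, and the function names are assumptions:

```python
import math


def pairwise_preference_loss(reward_chosen, reward_rejected):
    """Return -log(sigmoid(reward_chosen - reward_rejected)), computed stably."""
    diff = reward_chosen - reward_rejected
    # -log(sigmoid(d)) == log(1 + exp(-d)); branch so exp() never overflows.
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


def mean_preference_loss(pairs):
    """Average the loss over (reward_chosen, reward_rejected) pairs."""
    return sum(pairwise_preference_loss(rc, rr) for rc, rr in pairs) / len(pairs)


# Hypothetical reward scores for three human comparisons.
comparisons = [(1.2, 0.3), (0.1, 0.4), (2.0, -1.0)]
print(mean_preference_loss(comparisons))
```

The loss is small when the reward model already agrees with the human ranking and grows roughly linearly when it disagrees strongly; the trained reward model then supplies the signal that PPO optimizes.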

Large Language Model Papers, 2023

GPT-4 Technical Report

Josh Achiam, Steven Adler +277, OpenAI

A multimodal model that accepts image and text inputs and returns text. It scores at human level on many professional and academic exams, including a simulated bar exam where it lands in the top 10% of test takers. Aspects of its performance were forecast from models trained with at most 1/1,000th the compute, showing predictable scaling.

LLM NLP openai paper multimodal
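
Forecasting a large model from small ones, as the GPT-4 summary describes, typically means fitting a scaling curve to cheap runs and extrapolating. A simplified sketch, assuming a pure power law loss = a * compute^b fitted in log-log space; the data is synthetic, the names are hypothetical, and this form omits the irreducible-loss term that a fuller scaling law would include:

```python
import math


def fit_power_law(compute, loss):
    """Least-squares fit of loss = a * compute**b in log-log space; returns (a, b)."""
    xs = [math.log(c) for c in compute]
    ys = [math.log(v) for v in loss]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                       # slope = scaling exponent
    a = math.exp(mean_y - b * mean_x)   # intercept = prefactor
    return a, b


# Synthetic small runs generated from loss = 5 * C**-0.05 (made-up numbers).
small_compute = [1e3, 1e4, 1e5, 1e6]
small_loss = [5.0 * c ** -0.05 for c in small_compute]

a, b = fit_power_law(small_compute, small_loss)
# Extrapolate to 1000x the compute of the largest small run.
predicted = a * (1e9) ** b
print(a, b, predicted)
```

Because the synthetic data follows the assumed law exactly, the fit recovers its parameters; real training runs are noisy, so the extrapolation would carry uncertainty.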
