Model Release Timeline

Chronological history of frontier LLM releases, from the 2017 Transformer paper through the Claude 4.x family in April 2026. Knowing the trajectory helps calibrate which results mark real progress and which are simply the expected next step.


2017–2019: Foundations

Date | Model | Lab | Significance
Jun 2017 | Transformer | Google | Architecture that replaced RNNs
Oct 2018 | BERT | Google | Bidirectional transformers; NLP benchmarks shattered
Feb 2019 | GPT-2 | OpenAI | 1.5B params; "too dangerous to release" (briefly)

2020–2021: Scale

Date | Model | Lab | Significance
May 2020 | GPT-3 | OpenAI | 175B params; in-context learning; API release
Jun 2021 | GitHub Copilot | GitHub/OpenAI | First mass-market AI coding tool
Aug 2021 | Codex | OpenAI | Code-focused GPT; powers Copilot

2022: Alignment and ChatGPT

Date | Model | Lab | Significance
Apr 2022 | PaLM 540B | Google | Scaling laws confirmed at 540B params
Nov 2022 | ChatGPT | OpenAI | Consumer breakout; launched Nov 30, 1M users in 5 days
Dec 2022 | Constitutional AI paper | Anthropic | CAI framework published

2023: Open Source and Multimodal

Date | Model | Lab | Significance
Feb 2023 | LLaMA 1 | Meta | Research release; weights leaked within about a week; open-source ecosystem born
Mar 2023 | GPT-4 | OpenAI | Image + text input; among the first models evaluated on SWE-bench
Mar 2023 | Claude 1 | Anthropic | First public Claude; Constitutional AI
Jul 2023 | Llama 2 | Meta | Officially open; commercial license
Sep 2023 | Mistral 7B | Mistral | Best 7B model at release; dense model with grouped-query and sliding-window attention
Dec 2023 | Mixtral 8x7B | Mistral | Sparse MoE; outperforms GPT-3.5; open weights
Dec 2023 | Gemini 1.0 | Google | Native multimodal; Ultra/Pro/Nano tiers

2024: Reasoning and Agents

Date | Model | Lab | Significance
Mar 2024 | Claude 3 (Haiku/Sonnet/Opus) | Anthropic | Three-tier lineup; Opus #1 on Arena
Apr 2024 | Llama 3 8B/70B | Meta | Best open model at each tier
May 2024 | GPT-4o | OpenAI | Unified audio/vision/text; real-time voice interaction
Jun 2024 | Claude 3.5 Sonnet | Anthropic | Reclaimed #1; Claude Artifacts; Oct 2024 upgrade scored 49% on SWE-bench Verified
Jul 2024 | Llama 3.1 405B | Meta | First open model competitive at frontier
Sep 2024 | o1 (preview) | OpenAI | Reasoning model with internal chain-of-thought
Sep 2024 | Qwen 2.5 | Alibaba | Leading open-weight model family from a Chinese lab
Oct 2024 | Claude 3.5 Haiku | Anthropic | Sonnet-level quality at Haiku price

2025: Agents Take Over

Date | Model | Lab | Significance
Jan 2025 | DeepSeek R1 (built on V3, Dec 2024) | DeepSeek | o1-level reasoning, open weights, API about 96% cheaper than o1
Feb 2025 | Claude 3.7 Sonnet | Anthropic | Extended thinking; 70.3% SWE-bench Verified (with custom scaffold); Claude Code research preview
Feb 2025 | GPT-4.5 | OpenAI | Improved instruction following
Mar 2025 | Gemini 2.5 Pro | Google | Strong reasoning; 1M context becomes standard
May 2025 | Claude Opus 4 / Sonnet 4 | Anthropic | Agent capabilities; Claude Code generally available
Oct 2025 | Claude Haiku 4.5 | Anthropic | 73.3% SWE-bench at lowest price tier

2026 (through April): Claude 4.x Family

Date | Model | Lab | Significance
Feb 2026 | Claude Opus 4.6 | Anthropic | 80.8% SWE-bench; 91.3% GPQA
Feb 2026 | Claude Sonnet 4.6 | Anthropic | 79.6% SWE-bench; $3/$15 per M tokens
Apr 2026 | Claude Opus 4.7 | Anthropic | Latest Opus; released April 16, 2026
Apr 2026 | Gemini 3 (unconfirmed) | Google | [unverified]

[Source: Perplexity research, 2026-04-29 — post-August 2025 dates are from research]


Key Themes by Year

2017–2020: Architecture innovation (Transformer → BERT → GPT-3)
2021–2022: Alignment research + consumer products (ChatGPT)
2023: Open source democratisation (Llama, Mistral); multimodal goes mainstream
2024: Reasoning models (o1); agents start shipping
2025: Agents are the product; open models reach frontier; Claude Code ships
2026: Claude Opus 4.x takes SWE-bench lead; Anthropic overtakes OpenAI on revenue


Key Facts

  • GPT-3 (May 2020): 175B parameters; first widely available in-context learning via API
  • ChatGPT hit 1M users in 5 days (December 2022)
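  • LLaMA 1 was released in Feb 2023 and its weights leaked within about a week; the open-source ecosystem formed around it
  • Claude 3.7 Sonnet (Feb 2025): extended thinking; 70.3% SWE-bench Verified (with custom scaffold)
  • DeepSeek R1 (Jan 2025): o1-level reasoning; API about 96% cheaper than o1; MIT license; GRPO-based RL training (the R1-Zero variant used RL with no supervised fine-tuning)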
  • Claude Opus 4.6 (Feb 2026): 80.8% SWE-bench Verified; 91.3% GPQA Diamond
  • 2025 theme: Claude Code ships as an agentic tool; 2026 theme: Anthropic overtakes OpenAI on revenue

Open Questions

  • What does GPT-5 look like and when does it ship?
  • Does the open-source tier continue to narrow the gap with frontier models at each generation?
  • When does the Transformer architecture get meaningfully supplanted in production?