WED, 03 JUN 2026 · 18:35:42 UTC

Claude Sonnet 4

by Anthropic·USA·Released

The workhorse Claude tier — extended thinking at a fraction of Opus pricing.

textvisioncodechatreasoningagentstoolslong-contextcomputer-use
Vendor site
· 0 reviews

About this model

Sonnet 4 is the workhorse tier of the Claude 4 family — released alongside Opus 4 in May 2025 and priced at one-fifth the cost ($3/M input, $15/M output). On Anthropic's evaluation Sonnet 4 actually matches or slightly exceeds Opus 4 on SWE-bench Verified at 72.7%, which makes it the surprising default choice for many coding workloads.

Sonnet 4 shares the same extended-thinking mechanism and MCP tool-call format as Opus 4. The main quality gap shows up on harder GPQA-style scientific reasoning where Opus 4's longer thinking budget pays off. For typical chat, coding, and tool-use workloads Sonnet 4 is essentially indistinguishable from Opus at a fraction of the cost.

Strengths

  • Sonnet 4 matches Opus 4 on SWE-bench Verified at 1/5 the price
  • Same MCP tool-call ergonomics as Opus — swap models without code changes
  • Extended thinking available as opt-in
  • Default tier in Cursor, Claude Code, and most production AI products

Limitations

  • Trails Opus 4 on hardest scientific / mathematical reasoning
  • Same 200K context — smaller than Gemini 2.5 Pro's 1M
  • Closed weights

When to use it

  • Default tier for production coding agents
  • High-volume customer-facing chatbots needing tool use
  • RAG pipelines at scale
  • Two-model architectures with auto-escalation to Opus 4 on confidence drop

Architecture & training

Anthropic has confirmed Sonnet 4 shares the Opus 4 pretraining corpus and post-training pipeline (Constitutional AI + RLHF) with a smaller activated-parameter count. The fact that Sonnet 4 matches Opus on coding benchmarks while being substantially cheaper has prompted broader industry questions about whether 'flagship' tiers are still worth their price premium for many workloads.

Benchmarks

BenchmarkScoreBar
GPQA75.4
MMLU88.3
SWE-bench Verified72.7

Reviews · 0

Sign in to leave a rating.

Stories about Claude Sonnet 4

More →
NEWAnthropic News

Anthropic Launches Project Glasswing with Major Tech and Finance Partners to Defensively Deploy AI Cybersecurity Model

Anthropic announced Project Glasswing, a coalition including AWS, Apple, Google, Microsoft, and others, to use its unreleased Claude Mythos Preview model for defensive cybersecurity. The initiative aims to address the dual-use risk of advanced AI vulnerability-discovery capabilities by finding and fixing flaws in critical software before malicious actors can exploit them.

Compare against

All models →