Claude Sonnet 4

by Anthropic·USA·Released May 22, 2025

The workhorse Claude tier — extended thinking at a fraction of Opus pricing.

textvisioncodechatreasoningagentstoolslong-contextcomputer-use

Vendor site

— · 0 reviews

About this model

Sonnet 4 is the workhorse tier of the Claude 4 family — released alongside Opus 4 in May 2025 and priced at one-fifth the cost ($3/M input, $15/M output). On Anthropic's evaluation Sonnet 4 actually matches or slightly exceeds Opus 4 on SWE-bench Verified at 72.7%, which makes it the surprising default choice for many coding workloads.

Sonnet 4 shares the same extended-thinking mechanism and MCP tool-call format as Opus 4. The main quality gap shows up on harder GPQA-style scientific reasoning where Opus 4's longer thinking budget pays off. For typical chat, coding, and tool-use workloads Sonnet 4 is essentially indistinguishable from Opus at a fraction of the cost.

Strengths

•Sonnet 4 matches Opus 4 on SWE-bench Verified at 1/5 the price
•Same MCP tool-call ergonomics as Opus — swap models without code changes
•Extended thinking available as opt-in
•Default tier in Cursor, Claude Code, and most production AI products

Limitations

•Trails Opus 4 on hardest scientific / mathematical reasoning
•Same 200K context — smaller than Gemini 2.5 Pro's 1M
•Closed weights

When to use it

→Default tier for production coding agents
→High-volume customer-facing chatbots needing tool use
→RAG pipelines at scale
→Two-model architectures with auto-escalation to Opus 4 on confidence drop

Architecture & training

Anthropic has confirmed Sonnet 4 shares the Opus 4 pretraining corpus and post-training pipeline (Constitutional AI + RLHF) with a smaller activated-parameter count. The fact that Sonnet 4 matches Opus on coding benchmarks while being substantially cheaper has prompted broader industry questions about whether 'flagship' tiers are still worth their price premium for many workloads.

Benchmarks

Benchmark	Score	Bar
GPQA	75.4
MMLU	88.3
SWE-bench Verified	72.7

Reviews · 0

Stories about Claude Sonnet 4

Import AIJul 19, 2026

Anthropic reports 8x surge in code output, sees early signs of recursive self-improvement

Anthropic says code merged into its codebase jumped 8x in 2026 compared to 2021–2024, which it views as preliminary evidence that AI is already beginning to accelerate the development of better AI.

InterconnectsJul 18, 2026

Z.ai's GLM-5.2 stuns AI community, rivaling Claude and OpenAI on top benchmarks

The open-weight GLM-5.2 model is the first to crack into top-tier coding agent performance alongside Anthropic and OpenAI's best — and it landed right after the U.S. effectively banned Claude Fable 5.

The DecoderJul 18, 2026

Anthropic cuts Claude Fable 5 limits, pushes Pro users to API pricing

Anthropic is slashing Fable 5 usage in its subscription plans by up to 50 percent while nudging Pro and Team Standard users toward pay-per-use API pricing. The move comes as competitive pressure mounts from cheaper models like OpenAI's GPT-5.6 Sol.

Anthropic NewsJul 6, 2026

Anthropic Details Claude Fable 5 Cyber Safeguards and Jailbreak Framework

Anthropic detailed the cybersecurity safety classifiers for its globally redeployed Claude Fable 5 model and released an early draft of an AI jailbreak severity framework developed with Glasswing partners. The company also launched a HackerOne program for researchers to submit potential cyber jailbreaks.

Compare against

All models →

Claude Opus 4

USA

Anthropic's frontier model with extended thinking, leading SWE-bench Verified.

Anthropic200K ctx$15.00 / $75.00

Claude 3.5 Haiku

USA

Anthropic's fast tier — sub-second responses for high-throughput workloads.

Anthropic200K ctx$0.80 / $4.00

GLM-4.5

China

Zhipu's flagship — agentic-first MoE with strong coding + tool-use benchmarks.

Zhipu AI128K ctx$0.60 / $2.20Open

Qwen3-Coder