Claude 3.5 Haiku
by Anthropic·USA·Released
Anthropic's fast tier — sub-second responses for high-throughput workloads.
About this model
Claude 3.5 Haiku is Anthropic's fast tier — released October 2024 with sub-second median latency and aggressive pricing ($0.80/M input, $4/M output). Despite the small-tier branding, Haiku 3.5 actually outperforms the original Claude 3 Opus on several benchmarks including HumanEval (88.1%), which surprised the industry on release.
Anthropic has not yet released a Haiku 4 variant, so 3.5 remains the recommendation for latency-sensitive workloads. The model retains full tool-use support — there's no feature gap vs the larger tiers, just a quality gap on the hardest tasks.
Strengths
- •Sub-second median latency for short prompts
- •Cheapest Claude model — $0.80/M input
- •Full tool-use support — no feature gap vs higher tiers
- •Beats Claude 3 Opus on HumanEval coding benchmark
Limitations
- •Quality gap vs Sonnet 4 on reasoning-heavy tasks
- •No extended thinking mode
- •No Haiku 4 yet — sits a generation behind the Claude 4 family
When to use it
- →High-throughput classification and tagging
- →Real-time chat with sub-second latency budgets
- →Routing layer in multi-model architectures
- →Embedding-stage extraction for RAG pipelines
- →Batch jobs where cost per token dominates
Architecture & training
Released as part of the Claude 3.5 generation in October 2024, sharing the pretraining corpus and constitutional-AI post-training with Sonnet 3.5. Anthropic has confirmed Haiku 3.5 is a smaller distilled variant rather than a separately-trained model — which explains how it inherits the safety and tool-use behaviour of the larger Claude 3.5 tiers.
Benchmarks
| Benchmark | Score | Bar |
|---|---|---|
| MMLU | 75.0 | |
| HumanEval | 88.1 |