o3-mini
by OpenAI·USA·Released
Cheap, fast reasoning — o1-level math/code at a fraction of the cost.
About this model
o3-mini (January 2025) is the small reasoning model that mostly obsoleted o1 for everyday use — better math, better coding, dramatically cheaper. At $1.10/M input and $4.40/M output, o3-mini costs roughly 1/13 of o1's per-token rates while scoring higher on most benchmarks (AIME 87.3%, vs o1's 83.3%).
The 'mini' branding is misleading; this is the smaller variant of o3 (which OpenAI hasn't released to the public). It includes function calling and streaming, unlike o1, which makes it suitable for agent workloads. The downside vs o1 is that it doesn't have vision input.
o3-mini ships with a 'reasoning effort' parameter (low / medium / high) that controls how much test-time compute to spend. Low is fast and cheap; high produces longer chains of thought and higher accuracy on hard problems.
Strengths
- •Beats o1 on most benchmarks at ~1/13 the cost
- •Supports function calling and streaming (unlike o1)
- •Adjustable reasoning effort: low / medium / high
- •Excellent on competitive coding (Codeforces 2073 rating)
Limitations
- •No vision input (o1 has it; GPT-4o has it)
- •Still slow vs GPT-4o for non-reasoning tasks
- •Hidden reasoning tokens charged at output rate (less of an issue than o1)
When to use it
- →Math-heavy production workloads (educational tools, scientific computing)
- →Code review and bug detection
- →Tool-using agents where reasoning quality matters
- →Cost-conscious alternative to o1
Architecture & training
OpenAI has positioned o3 as 'the next step in our reasoning model series' but has not disclosed architecture details. The o3 family was previewed in December 2024 with o3 (full) scoring breakthrough results on the ARC-AGI benchmark; o3-mini followed in January 2025 with a tuned cost/quality target. The shift from o1 to o3 came alongside the rapid emergence of DeepSeek R1 and the broader industry recognition that reasoning-by-default is the new frontier.
Benchmarks
| Benchmark | Score | Bar |
|---|---|---|
| AIME | 87.3 | |
| GPQA | 77.0 | |
| Codeforces | 2073.0 |