o3-mini

by OpenAI·USA·Released Jan 31, 2025

Cheap, fast reasoning — o1-level math/code at a fraction of the cost.

textreasoningmathcodetools

Vendor site

— · 0 reviews

About this model

o3-mini (January 2025) is the small reasoning model that mostly obsoleted o1 for everyday use — better math, better coding, dramatically cheaper. At $1.10/M input and $4.40/M output, o3-mini costs roughly 1/13 of o1's per-token rates while scoring higher on most benchmarks (AIME 87.3%, vs o1's 83.3%).

The 'mini' branding is misleading; this is the smaller variant of o3 (which OpenAI hasn't released to the public). It includes function calling and streaming, unlike o1, which makes it suitable for agent workloads. The downside vs o1 is that it doesn't have vision input.

o3-mini ships with a 'reasoning effort' parameter (low / medium / high) that controls how much test-time compute to spend. Low is fast and cheap; high produces longer chains of thought and higher accuracy on hard problems.

Strengths

•Beats o1 on most benchmarks at ~1/13 the cost
•Supports function calling and streaming (unlike o1)
•Adjustable reasoning effort: low / medium / high
•Excellent on competitive coding (Codeforces 2073 rating)

Limitations

•No vision input (o1 has it; GPT-4o has it)
•Still slow vs GPT-4o for non-reasoning tasks
•Hidden reasoning tokens charged at output rate (less of an issue than o1)

When to use it

→Math-heavy production workloads (educational tools, scientific computing)
→Code review and bug detection
→Tool-using agents where reasoning quality matters
→Cost-conscious alternative to o1

Architecture & training

OpenAI has positioned o3 as 'the next step in our reasoning model series' but has not disclosed architecture details. The o3 family was previewed in December 2024 with o3 (full) scoring breakthrough results on the ARC-AGI benchmark; o3-mini followed in January 2025 with a tuned cost/quality target. The shift from o1 to o3 came alongside the rapid emergence of DeepSeek R1 and the broader industry recognition that reasoning-by-default is the new frontier.

Benchmarks

Benchmark	Score	Bar
AIME	87.3
GPQA	77.0
Codeforces	2073.0

o3-mini

About this model

Strengths

Limitations

When to use it

Architecture & training

Benchmarks

Reviews · 0

Stories about o3-mini

OpenAI Outlines Strategies for AI Investments and Business Models in New Blog Series

OpenAI Sitemap Shows Widespread Business, Partner, and Academy Page Updates

OpenAI Blog Links Reveal Wave of Enterprise AI Case Studies and Internal Tools

OpenAI's Latest Blog Posts Show AI Tackling Diseases and Black Holes

Compare against

o1

GLM-4.5

Qwen3-Coder

Kimi K2

About this model

✓ Strengths

× Limitations

When to use it

Architecture & training

Benchmarks

Reviews · 0

Stories about o3-mini

OpenAI Outlines Strategies for AI Investments and Business Models in New Blog Series

OpenAI Sitemap Shows Widespread Business, Partner, and Academy Page Updates

OpenAI Blog Links Reveal Wave of Enterprise AI Case Studies and Internal Tools

OpenAI's Latest Blog Posts Show AI Tackling Diseases and Black Holes

Compare against

o1

GLM-4.5

Qwen3-Coder

Kimi K2

Strengths

Limitations