WED, 03 JUN 2026 · 18:35:42 UTC

Gemini 2.5 Pro

by Google DeepMind · USA · Released

Google's deep-thinking flagship with a 1M-token context window.

text · vision · audio · video · code · chat · reasoning · agents · tools · long-context

About this model

Gemini 2.5 Pro (March 2025) is a 'thinking' model: like OpenAI's o-series, it spends additional compute on internal reasoning before responding. It followed the experimental Gemini 2.0 Flash Thinking as Google's first flagship release built around this approach. Unlike o1 / o3-mini, it can show its reasoning to the user, which Google argues helps debug agent workflows.

The model ships with a 1M-token context window (2M in some configurations) and topped several reasoning benchmarks at release, including 84% on GPQA Diamond and 86.7% on MATH. It also offers the strongest video understanding of any frontier model, courtesy of the multimodal-from-scratch Gemini architecture.

Pricing is tiered by context length: $1.25/M input for ≤200K tokens, $2.50/M for >200K. Output is $10/M (or $15/M past 200K). Google offers a generous free tier via AI Studio for prototyping.
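The tiered rates above can be turned into a rough per-request estimate. The sketch below is illustrative only: `estimate_cost` is a hypothetical helper, and it assumes that the prompt length alone selects the tier and that the selected tier's rates then apply to every input and output token of the request. Check the vendor's current price list before relying on it.

```python
# Rates in USD per million tokens, copied from the pricing paragraph above.
TIER_THRESHOLD = 200_000  # prompts longer than this fall into the higher tier
INPUT_RATE = {"short": 1.25, "long": 2.50}
OUTPUT_RATE = {"short": 10.00, "long": 15.00}


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a single request.

    Assumption (not stated in the listing): the prompt length alone picks
    the tier, and that tier's rates apply to all tokens in the request.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    tier = "long" if input_tokens > TIER_THRESHOLD else "short"
    total = input_tokens * INPUT_RATE[tier] + output_tokens * OUTPUT_RATE[tier]
    return total / 1_000_000


if __name__ == "__main__":
    # A 100K-token prompt with a 2K-token answer stays in the lower tier.
    print(f"${estimate_cost(100_000, 2_000):.3f}")
    # A 500K-token prompt with a 10K-token answer is billed at the higher rates.
    print(f"${estimate_cost(500_000, 10_000):.2f}")
```

Under that assumption, a prompt that grows by one token past 200K roughly doubles its input cost, so it can be worth trimming or chunking inputs that sit near the boundary.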

Strengths

  • 1M-token context (2M in some configs) at competitive pricing
  • Visible reasoning traces — easier to debug than OpenAI's o-series
  • Top-of-leaderboard at launch on GPQA Diamond (84%)
  • Strongest video understanding of any frontier model
  • Generous AI Studio free tier

Limitations

  • Tool-call format is Google-specific, not MCP
  • Coding scores trail Claude 4 family on SWE-bench Verified
  • Pricing structure complicates capacity planning (tier change at 200K tokens)

When to use it

  • Whole-corpus document analysis (1M+ token inputs)
  • Video analysis and content moderation at scale
  • Multi-step reasoning where chain-of-thought visibility matters
  • Workspace-native assistants (Docs, Gmail, Sheets)

Architecture & training

DeepMind has confirmed Gemini 2.5 uses a sparse Mixture-of-Experts architecture trained natively on interleaved text/image/audio/video tokens. The thinking capability was added in post-training via a process Google calls 'Gemini Thinking' — a variant of large-scale RL on chain-of-thought generation. Training infrastructure is Google's TPU v5p superpods.
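For readers unfamiliar with sparse Mixture-of-Experts, the toy sketch below shows generic top-k routing: a router scores every expert, only the k best are run, and their outputs are mixed by renormalised router weights. It is not Gemini's implementation. The function names, the choice of k = 2, and the scalar "experts" are all assumptions made for illustration, and in a real model the router logits would come from a learned layer rather than being passed in.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(router_logits, k=2):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are renormalised so the selected experts' weights sum to 1.
    """
    if not 1 <= k <= len(router_logits):
        raise ValueError("k must be between 1 and the number of experts")
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x, experts, router_logits, k=2):
    """Mix the outputs of the selected experts; the others are never called."""
    return sum(w * experts[i](x) for i, w in top_k_route(router_logits, k))


if __name__ == "__main__":
    # Four hypothetical experts; the router favours experts 1 and 3 equally.
    experts = [lambda x: x, lambda x: 2 * x, lambda x: 3 * x, lambda x: 4 * x]
    print(moe_forward(1.0, experts, [0.0, 2.0, 0.0, 2.0]))
```

The point of the sparsity is that compute per token scales with k rather than with the total number of experts, which is how such models keep inference cost below that of a dense model with the same parameter count.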

Benchmarks

Benchmark            Score (%)
GPQA Diamond         84.0
MATH                 86.7
MMLU                 85.8
SWE-bench Verified   63.8
