DeepSeek R1

Open weights

by DeepSeek · China · Released Jan 20, 2025

Open-weights reasoning model — o1-comparable quality with full chain-of-thought visible.

text · code · reasoning · math

About this model

DeepSeek R1 (January 2025) was DeepSeek's answer to OpenAI's o1 — and the first open-weights reasoning model to reach o1-comparable quality. R1 is built on the same 671B MoE backbone as V3 but post-trained with large-scale RL on chain-of-thought generation.

Unlike OpenAI's o-series, which hides the raw reasoning trace, R1 exposes its full chain of thought to users. These traces have become a popular source of training data for distilling reasoning capability into smaller open-weights models: DeepSeek released several R1-distilled variants (Qwen-based and Llama-based) alongside the main model.
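A minimal sketch of how a visible trace might be separated from the final answer when building a distillation dataset. It assumes the completion wraps its reasoning in `<think>...</think>` tags; the function names and the record layout are hypothetical, not part of any DeepSeek tooling.

```python
import re

# Assumption: the reasoning is wrapped in <think>...</think> and the
# final answer follows the closing tag.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_trace(completion: str) -> tuple[str, str]:
    """Return (reasoning, answer) extracted from a raw completion."""
    match = THINK_RE.search(completion)
    if match is None:
        # No visible trace: treat the whole completion as the answer.
        return "", completion.strip()
    reasoning = match.group(1).strip()
    answer = completion[match.end():].strip()
    return reasoning, answer


def to_sft_record(prompt: str, completion: str) -> dict[str, str]:
    """Build one hypothetical training record for a smaller student model."""
    reasoning, answer = split_trace(completion)
    return {"prompt": prompt, "reasoning": reasoning, "answer": answer}
```

Keeping the reasoning and the answer in separate fields lets a downstream pipeline filter on answer correctness before deciding which traces to train on.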

R1 is released under the MIT license, among the most permissive licenses used for any frontier-class model. The combination of R1's release timing, open weights, and low API pricing triggered a sharp market reaction and an ongoing industry rethink of competitive moats.

Strengths

• o1-comparable reasoning quality with open weights
• MIT license, among the most permissive of any frontier-class model
• Visible reasoning traces that are usable as training data for smaller models
• Cheap via the DeepSeek API ($2.19 per million output tokens)
• R1-distilled variants extend reasoning to Qwen and Llama base models

Limitations

• 64K context window, smaller than top frontier models
• Reasoning traces can be very long, making total token cost unpredictable
• Same Chinese-origin procurement friction as V3
• Less polished UI and developer ergonomics than OpenAI's o-series
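To make the cost unpredictability concrete, here is a rough sketch of the output-side arithmetic. It uses the $2.19 per million output tokens figure quoted above and assumes that reasoning tokens are billed as output tokens; input-token charges are ignored, and the function name is hypothetical.

```python
# Output price quoted in this listing, in USD per million output tokens.
OUTPUT_PRICE_PER_M = 2.19


def output_cost_usd(reasoning_tokens: int, answer_tokens: int,
                    price_per_m: float = OUTPUT_PRICE_PER_M) -> float:
    """Estimate output-side cost, assuming the trace is billed as output."""
    total_tokens = reasoning_tokens + answer_tokens
    return total_tokens * price_per_m / 1_000_000


# Same 500-token answer, very different trace lengths.
short_trace = output_cost_usd(reasoning_tokens=2_000, answer_tokens=500)
long_trace = output_cost_usd(reasoning_tokens=30_000, answer_tokens=500)
print(f"short trace: ${short_trace:.6f}, long trace: ${long_trace:.6f}")
```

Under these assumptions the two requests return answers of equal length, yet the long-trace request costs roughly twelve times as much, which is why budgeting per request is hard without capping the output length.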

When to use it

→ Math and competition-style reasoning at lower cost than o1
→ Open-weights reasoning research and distillation experiments
→ Self-hosted reasoning agents under permissive license
→ Educational tools needing visible chain-of-thought

Architecture & training

DeepSeek's R1 technical report is one of the most-cited papers of 2025, largely for its detailed account of post-training methodology. The pipeline includes a 'cold-start' SFT phase on long reasoning traces, followed by large-scale RL using a rule-based reward signal that emphasises correctness and reasoning-trace coherence. The paper also documents the distillation procedure used for the smaller R1-Distill variants.
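As an illustration of what a rule-based reward can look like, the sketch below scores a completion with two deterministic checks: whether the final answer matches a reference, and whether the output follows a `<think>...</think>` then answer layout. This is not DeepSeek's implementation; the tag format, the exact-match comparison, the function names, and the 0.5 weight are all assumptions chosen for the example.

```python
import re

# Assumed layout: a non-empty <think> block followed by a non-empty answer.
FORMAT_RE = re.compile(r"^<think>.+?</think>\s*\S.*$", re.DOTALL)


def format_reward(completion: str) -> float:
    """1.0 if the completion follows the assumed layout, else 0.0."""
    return 1.0 if FORMAT_RE.match(completion.strip()) else 0.0


def accuracy_reward(completion: str, reference: str) -> float:
    """1.0 if the text after the trace exactly matches the reference."""
    answer = completion.split("</think>")[-1].strip()
    return 1.0 if answer == reference.strip() else 0.0


def total_reward(completion: str, reference: str,
                 format_weight: float = 0.5) -> float:
    """Combine both checks; the weighting is a hypothetical choice."""
    return (accuracy_reward(completion, reference)
            + format_weight * format_reward(completion))
```

Because both checks are plain rules, the score cannot be gamed in the way a learned reward model sometimes can, but it only works for tasks whose answers can be verified mechanically, such as math problems with a single final value.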

Benchmarks

Benchmark	Score
AIME 2024 (pass@1, %)	79.8
GPQA Diamond (pass@1, %)	71.5
MATH-500 (pass@1, %)	97.3
Codeforces (rating)	2029
