WED, 03 JUN 2026 · 18:36:01 UTC

Qwen 2.5-72B Instruct

Open weights

by Alibaba Cloud · China · Released September 2024

Open-weights 72B from the Qwen 2.5 family — Apache 2.0, runs on a single 8×A100 node.

text · code · chat · reasoning · tools · long-context

About this model

Qwen 2.5-72B Instruct (September 2024) is the largest open-weights model in the Qwen 2.5 family. It is released under Apache 2.0, runs on a single 8×A100 or 8×H100 node, and has become a common open-weights choice for teams that find Llama 3.3 70B's Community License too restrictive.

On benchmarks it is broadly comparable to Llama 3.3 70B, with stronger Chinese-language performance and competitive English scores. The Apache 2.0 license is permissive: it imposes no MAU caps and no commercial-use restrictions.
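The single-node claim can be sanity-checked with back-of-the-envelope arithmetic. The sketch below is only an estimate under stated assumptions: a nominal 72 billion parameters, 80 GB per GPU (A100 cards also ship in a 40 GB variant), and weights only, with no allowance for KV cache, activations, or framework overhead. The function and constant names are illustrative, not from any library.

```python
def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Memory needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


N_PARAMS = 72e9        # assumption: nominal, rounded parameter count
GPU_MEMORY_GB = 80     # assumption: 80 GB A100/H100 variant
NUM_GPUS = 8           # one 8-GPU node, as described above

node_memory_gb = GPU_MEMORY_GB * NUM_GPUS  # total accelerator memory on the node

# Compare a few common weight precisions against the node's capacity.
for name, nbytes in [("bf16", 2), ("int8", 1), ("int4", 0.5)]:
    need = weight_memory_gb(N_PARAMS, nbytes)
    headroom = node_memory_gb - need
    print(f"{name}: {need:.0f} GB for weights, {headroom:.0f} GB headroom")
```

Under these assumptions the BF16 weights take roughly 144 GB of a 640 GB node, so most of the memory remains available for KV cache and batching. Real headroom depends on context length, batch size, and the serving stack.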

Strengths

  • Apache 2.0 — most permissive license among frontier-adjacent models
  • No MAU restrictions (unlike Llama Community License)
  • Strong Chinese + English bilingual capability
  • Mature fine-tuning ecosystem (LoRA / QLoRA support is excellent)
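To give a rough sense of why LoRA fine-tuning is practical at this scale, the sketch below counts the trainable parameters a LoRA adapter would add. Everything specific here is an assumption for illustration: the layer dimensions (hidden size 8192, 80 layers, a 1024-wide value projection from grouped-query attention) should be checked against the published model config, and the rank of 16 and the choice to adapt only the query and value projections are arbitrary example settings.

```python
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """LoRA adds two low-rank factors per adapted matrix:
    A with shape (rank, d_in) and B with shape (d_out, rank)."""
    return rank * (d_in + d_out)


# Assumed dimensions; verify against the model's published config.
HIDDEN = 8192       # assumption: hidden size
KV_DIM = 1024       # assumption: value projection width under grouped-query attention
NUM_LAYERS = 80     # assumption: number of transformer layers
RANK = 16           # example LoRA rank, chosen arbitrarily
BASE_PARAMS = 72e9  # nominal base model size

# Adapt the query projection (HIDDEN -> HIDDEN) and value projection (HIDDEN -> KV_DIM).
per_layer = lora_params(HIDDEN, HIDDEN, RANK) + lora_params(HIDDEN, KV_DIM, RANK)
total = per_layer * NUM_LAYERS

print(f"trainable LoRA parameters: {total:,}")
print(f"fraction of base model: {total / BASE_PARAMS:.4%}")
```

With these example settings the adapter trains tens of millions of parameters rather than tens of billions, which is the reason adapter-based fine-tuning fits on far less hardware than full fine-tuning.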

Limitations

  • The closed Qwen 2.5-Max outperforms it on the hardest tasks
  • As a dense 72B model, its parameter count is smaller than that of the largest frontier models, which limits the hardest reasoning tasks
  • Less benchmark coverage on English-only tasks than Llama family

When to use it

  • On-prem deployments under strict licensing requirements
  • Fine-tuning for company-specific corpora
  • Bilingual applications (Chinese-English)
  • Cost-sensitive serving where Apache 2.0 matters

Architecture & training

72B-parameter dense transformer. Pretrained on Alibaba's Taobao-affiliated GPU clusters. The Qwen 2.5 technical report describes the training mix as ~36% Chinese, ~36% English, and ~14% code, with the remaining ~14% made up of other languages and structured data. Post-training uses RLHF and Direct Preference Optimisation (DPO).

Benchmarks

Benchmark    Score
MATH         83.1
MMLU         86.1
HumanEval    86.6
