WED, 03 JUN 2026 · 18:34:02 UTC

Qwen3

Open weights

by Alibaba Cloud·China·Released

Alibaba's latest open-weights generation — dense + MoE variants with hybrid reasoning mode.

textcodechatreasoningagentstoolslong-context
Vendor site Paper
· 0 reviews

About this model

Qwen3 (April 2025) is Alibaba Cloud's latest open-weights generation, shipping eight variants across two architectures: dense (0.6B → 32B) and Mixture-of-Experts (30B-A3B and 235B-A22B). The headline feature is a unified 'hybrid thinking' mode — a single model can flip between fast non-thinking responses and deeper chain-of-thought reasoning controlled by a flag in the prompt, similar to Claude Opus 4's extended thinking but exposed differently.

Qwen3-235B-A22B competes with closed frontier models on most benchmarks (MMLU-Pro 75%, AIME 2025 81.5%) while shipping under Apache 2.0. The dense 32B variant is particularly popular for fine-tuning given its size class. Alibaba has positioned Qwen3 as the lab's serious bet on owning the global open-weights conversation alongside DeepSeek and Meta.

Strengths

  • Apache 2.0 — most permissive license in its quality tier
  • Hybrid thinking mode toggleable per request
  • Eight variants covering 0.6B → 235B param scale
  • Strong multilingual: 119 languages supported
  • Tight fine-tuning ecosystem (LoRA / QLoRA / vLLM)

Limitations

  • Hybrid thinking adds prompt complexity vs single-mode models
  • MoE serving still requires specialist infra (vLLM, SGLang)
  • English chat quality marginally trails Llama 3.3 70B on subjective tests

When to use it

  • Open-weights deployments needing top-tier reasoning
  • Cost-sensitive serving where Apache 2.0 matters
  • Multilingual applications (119 languages)
  • Fine-tuning for vertical specialisation

Architecture & training

Qwen3 is trained on Alibaba's PAI infrastructure. The technical report (arXiv 2505.09388) details a three-stage process: standard pretraining, mid-training on long-context + reasoning data, and post-training using both SFT and RLHF. The MoE variants use 128 experts with top-8 routing. Hybrid thinking mode is achieved via a fine-tuned chat template that responds to a `/think` or `/no_think` flag.

Benchmarks

BenchmarkScoreBar
MMLU-Pro75.2
AIME 202581.5
LiveCodeBench70.7

Reviews · 0

Sign in to leave a rating.

Compare against

All models →