The model layer
The open-weights frontier is now genuinely competitive with closed models. Meta's Llama 3.3 70B matches its own 405B sibling at a fifth of the serving cost. DeepSeek's V3 (671B MoE, 37B active per token) was trained for a reported $5.6M and ships with quality competitive with Claude 3.5 Sonnet — released under a permissive licence that allows commercial use without caveats. Its sibling R1 is the first open reasoning model at o1 quality, released under MIT.
Moonshot's Kimi K2 (1T MoE, 32B active) is the strongest open-weights coding agent at the time of writing — 65.8% on SWE-bench Verified, comparable to Claude Sonnet 4. Alibaba's Qwen 2.5 72B is Apache 2.0 and ships alongside specialist Coder, Math, and Audio variants that share the same backbone. Mistral's European stack — Large 2 for chat, Codestral 25 for autocomplete — extends the picture with EU-resident inference. And the small-model regime is dominated by Microsoft Research's Phi-4 at 14B, MIT-licensed and small enough to run on a single consumer GPU.
The Chinese surge — and why the geopolitics is the wrong frame
The story of 2025 was the Chinese labs proving that frontier capability and permissive licensing aren't mutually exclusive. DeepSeek, Qwen, Kimi, and the smaller open-weight efforts from 01.AI and Yi collectively shipped four world-class model families with quality matching or beating their American closed-weight counterparts — all open under MIT or Apache variants.
Procurement teams sometimes hesitate on Chinese-origin models on jurisdictional grounds. The pragmatic view: the model weights are static files you serve on your own infrastructure, in your own region, with your own observability. The training-data provenance question is real but applies to every closed model equally — at least with open weights you can audit them. Where geopolitical concern is genuinely warranted is the hosted API layer (chat.deepseek.com, chat.qwen.ai) where queries leave your perimeter. That's a different decision than the model itself.
The tooling layer
Around the model layer sits an increasingly mature open-source tooling stack. Aider, Cline, and Continue are the open coding agents — three different takes on bringing Claude-Code-style workflows to your editor, all running against whichever model you point them at. Civitai is the hub for open image-generation models. Hugging Face remains the gravitational centre of model distribution.
The infrastructure layer is similarly open: vLLM and Ollama for serving, LangChain and LlamaIndex for orchestration, Weaviate, Chroma, Qdrant, and Milvus for vector search, LangFuse and Helicone for observability. None of these have a closed-source equivalent that meaningfully outperforms them; the closed alternatives compete on support and integration rather than capability.
When to pick open over closed
Open weights win cleanly when any of these apply: you need to fine-tune on proprietary data, you have strict data-residency or audit requirements, you serve enough volume that per-token cost dominates (typically above 50M tokens/day), or you simply want infrastructure independence from the trio of frontier labs.
Closed models still win for the hardest agentic workloads (Claude Opus 4 leads SWE-bench Verified), the latest reasoning research (o1, o3-mini), and the most integrated multimodal experiences (GPT-4o voice, Gemini 2.5 Pro video). For most production workloads in 2026 the answer is hybrid: open for the bulk, closed for the hard edge cases.
Below: every open-source tool we currently track, grouped by category. Tools with the ◯ Open source tag are downloadable / forkable; Chinese-origin tools are surfaced exactly the same way as Western ones — we don't maintain separate listings.