Llama 3.1 405B
Open weights · by Meta AI · USA · Released
Meta's largest open-weights model: 405B dense parameters under a permissive license.
About this model
Llama 3.1 405B (July 2024) was the model that proved frontier-class quality could come from open weights. At launch it traded blows with Claude 3.5 Sonnet and GPT-4o on most benchmarks while being fully downloadable. Meta's training run reportedly used 15.6T tokens on up to 16,000 H100 GPUs; the ~54-day figure refers, per Meta's technical report, to a snapshot period of pre-training rather than the whole run.
Llama 3.1 405B is still relevant for research and for fine-tuning workloads that benefit from the larger parameter count, but for most serving workloads the smaller Llama 3.3 70B is more practical: Meta positions it as delivering comparable quality at a much lower serving cost.
Strengths
- Frontier-class open-weights model at launch
- Detailed Meta technical paper describing the training methodology
- 405B dense parameters, useful for research on scaling
- Llama Community License: commercial use OK for most companies
Limitations
- Largely superseded by Llama 3.3 70B for production serving
- Requires substantial infrastructure (16+ H100s to serve the unquantized BF16 weights)
- Same Community License constraints as the rest of the family
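The infrastructure figure above can be sanity-checked with simple arithmetic. The sketch below is a rough estimate, not a deployment guide: it assumes 80 GB of memory per H100 and counts model weights only, ignoring KV cache, activations, and framework overhead. The function names are illustrative.

```python
import math

# Assumption: 80 GB of HBM per H100 (the common SXM variant).
H100_MEMORY_GB = 80


def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Memory needed to hold the weights alone, in decimal gigabytes."""
    return n_params * bytes_per_param / 1e9


def min_gpus_for_weights(n_params: float, bytes_per_param: float,
                         gpu_memory_gb: float = H100_MEMORY_GB) -> int:
    """Lower bound on GPU count: weights only, no KV cache or activations."""
    return math.ceil(weight_memory_gb(n_params, bytes_per_param) / gpu_memory_gb)


MODELS = {"Llama 3.1 405B": 405e9, "Llama 3.3 70B": 70e9}
PRECISIONS = {"BF16": 2, "FP8": 1}  # bytes per parameter

for model_name, n_params in MODELS.items():
    for precision, n_bytes in PRECISIONS.items():
        gb = weight_memory_gb(n_params, n_bytes)
        gpus = min_gpus_for_weights(n_params, n_bytes)
        print(f"{model_name} @ {precision}: {gb:.0f} GB weights, >= {gpus} H100s")
```

Under these assumptions the 405B weights in BF16 come to about 810 GB, which needs at least 11 GPUs and so does not fit on a single 8-GPU node. Rounding up to two full nodes, with headroom for the KV cache, is consistent with the 16+ figure. The 70B model at the same precision needs roughly 140 GB, which is where the serving-cost gap comes from.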
When to use it
- Research on large-scale dense transformers
- Fine-tuning where 405B capacity matters
- Reference implementation for evaluating smaller distilled models
Architecture & training
Meta's published Llama 3 technical report is among the most detailed accounts of frontier-model training methodology released by a major lab, covering data curation, the post-training pipeline, infrastructure, and evaluation. The 405B model size came out of Chinchilla-style compute-optimal scaling analysis, with explicit attention to data quality.
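To put the scaling numbers in context, here is a back-of-the-envelope sketch using the parameter and token counts quoted on this page. It relies on two assumptions that are not from this page: the common approximation that dense-transformer training costs about 6 FLOPs per parameter per token, and the rule of thumb of roughly 20 tokens per parameter often attributed to the Chinchilla paper.

```python
N_PARAMS = 405e9    # dense parameters (from this page)
N_TOKENS = 15.6e12  # reported training tokens (from this page)

# Assumption: rule-of-thumb ratio often attributed to the Chinchilla paper.
CHINCHILLA_TOKENS_PER_PARAM = 20.0


def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate training compute via the C ~= 6 * N * D heuristic."""
    return 6.0 * n_params * n_tokens


def tokens_per_param(n_params: float, n_tokens: float) -> float:
    """Training tokens seen per model parameter."""
    return n_tokens / n_params


flops = training_flops(N_PARAMS, N_TOKENS)
ratio = tokens_per_param(N_PARAMS, N_TOKENS)

print(f"Estimated training compute: {flops:.2e} FLOPs")
print(f"Tokens per parameter: {ratio:.1f} "
      f"({ratio / CHINCHILLA_TOKENS_PER_PARAM:.1f}x the 20:1 rule of thumb)")
```

The estimate lands near 3.8e25 FLOPs and about 38.5 tokens per parameter, roughly twice the 20:1 heuristic. This suggests "Chinchilla-style" is best read as describing the method (fitting scaling laws to choose model size for a compute budget), not the specific 20:1 ratio.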
Benchmarks
| Benchmark | Score (%) |
|---|---|
| MMLU | 88.6 |
| GSM8K | 96.8 |
| HumanEval | 89.0 |