AI21 (Research)
Research · Israel · HQ Tel Aviv · Est. 2017 · Part of AI21 Labs
AI21's research org: pioneers of Mamba-Transformer hybrid architectures.
Our take
Respected NLP research group with production-grade Mamba-Transformer hybrids, but group-level valuation pressure looms amid intense foundation-model competition.
At a glance
- Best known for
- Pioneering Mamba-Transformer hybrid LLM architectures (Jamba)
- Biggest strength
- Production-grade structured state-space research ahead of major labs
- Biggest risk
- Commercial platform (AI21 Labs) failing to monetize research lead before competitors catch up
- Stage
- Private (group valuation ~$1.4B as of the reported 2023 Series B)
- Primary revenue
- No direct revenue; funded by parent AI21 Labs (APIs, enterprise licenses, Wordtune subscriptions)
What they do
AI21 Research is the Tel Aviv-based research division of AI21 Labs, the Israeli AI company founded by Yoav Shoham, Ori Goshen, and Amnon Shashua. The group focuses on foundational large language model architecture, with particular emphasis on efficiency and long-context capabilities. Their signature contribution is the Jamba architecture—released in 2024—which combined Mamba's structured state-space model (SSM) layers with traditional Transformer attention in a hybrid block design, producing what they claimed was the first production-grade SSM-Transformer hybrid at the 256K context scale.
The research team operates at the intersection of theoretical advances in sub-quadratic sequence modeling and practical systems engineering for inference cost reduction. Their work targets the fundamental bottleneck of LLM deployment: attention's quadratic scaling with sequence length makes long documents, agentic loops, and multi-turn conversations prohibitively expensive. By demonstrating that Mamba layers could substitute for attention in portions of the stack while maintaining quality, they influenced the broader field's turn toward "attention-free" or "attention-lite" architectures in 2024-2025.
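The scaling argument above can be made concrete with a little arithmetic. The sketch below is illustrative only: the function name `cost_growth` and the chosen context lengths are assumptions, constant factors are ignored, and it does not model any specific AI21 system. It compares how per-layer sequence-mixing work grows when the context grows, for a quadratic (attention-like) layer and a linear (SSM-like) layer.

```python
def cost_growth(base_len: int, new_len: int, exponent: int) -> float:
    """Factor by which sequence-mixing work grows when context grows.

    exponent=2 approximates full self-attention (every token pair interacts);
    exponent=1 approximates a linear-time recurrent/SSM-style scan.
    Constant factors and memory effects are deliberately ignored.
    """
    return (new_len / base_len) ** exponent


base, long_ctx = 4_096, 262_144  # 4K tokens vs. 256K tokens (illustrative)

attention_growth = cost_growth(base, long_ctx, exponent=2)
linear_growth = cost_growth(base, long_ctx, exponent=1)

print(f"context grows {long_ctx // base}x")
print(f"quadratic layer work grows {attention_growth:,.0f}x")
print(f"linear layer work grows {linear_growth:,.0f}x")
```

Under these simplifying assumptions, a 64-fold increase in context multiplies the quadratic layer's work by 64 squared, while the linear layer's work grows only 64-fold, which is the gap hybrid designs try to exploit.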
Unlike pure academic labs, AI21 Research ships models intended for real deployment through AI21 Labs' commercial API and Studio platform, though the research org itself does not directly monetize. Their outputs feed into the Jurassic-series model line and, potentially, its successors. They compete conceptually with DeepMind's Mixture-of-Depths work, Meta's recent SSM explorations, and Mistral's efficiency-focused releases, though none of these competitors has an identical hybrid architecture in market.
Origin story
AI21 Labs was founded in 2017 in Tel Aviv by Yoav Shoham (Stanford professor, serial entrepreneur in AI), Ori Goshen (formerly VP Engineering at CrowdAnalytica), and Amnon Shashua (co-founder of Mobileye, professor at Hebrew University). The trio combined academic credibility in AI with Israeli systems-engineering culture and Shashua's proven track record in building billion-dollar technology companies.
The research group emerged organically as the technical engine behind AI21's model development, with the Jurassic-1 series (2021) establishing early credibility as a GPT-3-class model with competitive Hebrew and multilingual capabilities. The pivotal shift came in 2023-2024 with the team's bet on Mamba, the structured state-space architecture proposed by Albert Gu and Tri Dao. While most labs treated Mamba as promising but unproven for LLM scale, AI21 Research engineered the hybrid approach—interleaving Mamba and attention blocks—that became Jamba, released in March 2024 with 256K context window support.
The group's trajectory since has been less publicly visible. AI21 Labs raised significant funding (a reported $155M in 2023 at a $1.4B valuation, though the share allocated to the research arm is unclear), but the broader company's commercial products (the Wordtune consumer writing assistant and the AI21 Studio developer platform) have not achieved breakout scale against ChatGPT, Claude, or enterprise-focused competitors. The research org's continued independence and headcount (50-100 employees) suggest parent-company commitment to architectural differentiation, though commercial pressure may constrain long-horizon projects.
Key products
Jamba
2024 · Hybrid Mamba-Transformer LLM architecture combining structured state-space and attention layers, supporting 256K context windows; released as open weights and through AI21 Studio API for developers and researchers evaluating efficient long-context models.
Jurassic-2 / Jamba derivatives
Subsequent model iterations building on Jamba architecture for AI21 Labs' commercial API and enterprise deployments; specific branding and timing less publicly documented than original Jamba release.
Research publications & open weights
Academic papers and model releases advancing SSM-Transformer hybrid techniques, including training recipes and evaluation benchmarks that influenced the field's turn toward sub-quadratic architectures.
Leadership
- Yoav Shoham
Co-founder, AI21 Labs; Stanford Professor Emeritus
Serial AI entrepreneur (previously TradingDynamics, Katango); provides research org with academic credibility and Silicon Valley network
- Ori Goshen
Co-CEO, AI21 Labs
Engineering-focused co-founder; oversees product-commercialization bridge from research outputs
- Amnon Shashua
Co-founder, AI21 Labs; also Mobileye CEO/Intel senior VP
Provides capital-raising leverage and Israeli tech ecosystem stature; attention split with Mobileye/Intel autonomous driving responsibilities
Funding history
- 2020 · Series A · $34.5M · Pitango, TPY Capital, others
- 2023 · Series B · $155M · Google, Nvidia, Pitango, Walden Catalyst (reported; exact allocation to research arm unclear)
Strengths & risks
Strengths
- First production-grade Mamba-Transformer hybrid (Jamba) with proven 256K context capability
- Tel Aviv location accesses elite Israeli ML talent at lower cost than US Bay Area
- Co-founders combine academic prestige (Stanford, Hebrew U), entrepreneurial track record, and deep capital markets access
- Hybrid architecture directly addresses inference-cost pressure enterprises face with pure attention models
- Research outputs feed real commercial API rather than staying purely academic
Risks
- Parent AI21 Labs is not widely perceived as a tier-1 API platform alongside OpenAI, Anthropic, and Google, or arguably even Cohere and Mistral
- Major labs (Meta, Google, Mistral) are rapidly integrating SSM/efficiency techniques into their own stacks, eroding architectural differentiation
- No public information on dedicated research funding runway; dependent on parent company commercial performance
- Amnon Shashua's divided attention with Mobileye/Intel responsibilities may constrain strategic involvement
- If Mamba-based approaches hit scaling walls or new efficiency techniques emerge, first-mover advantage dissipates quickly
Recent moves
Jamba open-weights release with 256K context
Mar 2024 · Released Jamba as open weights and API-accessible model, claiming first production hybrid SSM-Transformer; 256K context window positioned against Claude and Gemini long-context offerings.
Integration into AI21 Labs commercial stack
2024 · Jamba architecture reportedly informing subsequent Jurassic-series iterations and enterprise API offerings, though specific product branding less visible than original release.
Competitive position
AI21 Research occupies a defensible but narrowing niche: they were genuinely first to ship a credible Mamba-Transformer hybrid at scale, beating Meta's later Llama SSM experiments and Google's Mixture-of-Depths/Ring Attention work to market by months. This earned them significant researcher attention and some enterprise pilot interest for cost-sensitive long-context applications. However, 'first' in architecture does not guarantee platform adoption.
Against OpenAI and Anthropic, they lose on general model capability, ecosystem lock-in, and enterprise sales machinery. Against Mistral and Cohere, the competition is closer on efficiency-focused EU/Israeli alternative positioning, but Mistral's commercial momentum and Cohere's enterprise sales focus have arguably outpaced AI21 Labs' platform traction. Against pure open-source efforts, AI21 Research's models are more polished but less radically open than some in the community expect.
Where they win: organizations with specific long-document or multi-turn cost constraints willing to trade absolute capability for inference efficiency; Israeli/European enterprises seeking non-US model providers; and ML engineers wanting to experiment with SSM architectures without building from scratch. Where they lose: general 'default' API choice, consumer awareness, and increasingly, raw architectural novelty as the field converges on hybrid efficient designs.
What to watch
- 01 Whether Jamba 2 or successor releases close capability gap with GPT-4o/Claude 3.5-class models while maintaining efficiency edge
- 02 Parent AI21 Labs' reported revenue and API growth metrics; research funding depends on commercial viability
- 03 Meta, Google, or Mistral releases explicitly matching or exceeding Jamba's hybrid SSM-Transformer approach
- 04 Any strategic pivot toward vertical-specific models (legal, medical, financial) where long-context efficiency wins
- 05 Yoav Shoham and Ori Goshen's continued active involvement versus potential succession or acquisition dynamics
Frequently asked questions
Is AI21 Research the same as AI21 Labs?
AI21 Research is the dedicated research division within AI21 Labs, the broader commercial company. They build the core model architectures; other teams handle products like Wordtune and the developer API platform.
What makes Jamba different from standard Transformer models?
Jamba interleaves Mamba structured state-space model layers with traditional attention blocks, reducing the quadratic computational cost of long sequences while maintaining quality. It was the first production LLM to demonstrate this hybrid approach at 256K context scale.
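As a rough illustration of what "interleaving" means, the sketch below builds a layer schedule in which one attention layer appears in every group of N layers and the rest are Mamba layers. The function names, the 16-layer depth, and the one-in-eight ratio are assumptions chosen for illustration; this profile does not state Jamba's actual layer counts or ratio, and the sketch does not implement either layer type.

```python
from typing import List


def hybrid_layer_schedule(num_layers: int, attention_every: int) -> List[str]:
    """Label each layer as 'mamba' or 'attention'.

    Places one attention layer at the end of every group of
    `attention_every` layers; all other layers are Mamba (SSM) layers.
    Both parameters are hypothetical knobs, not a published config.
    """
    if num_layers < 1 or attention_every < 1:
        raise ValueError("num_layers and attention_every must be positive")
    return [
        "attention" if (i + 1) % attention_every == 0 else "mamba"
        for i in range(num_layers)
    ]


def attention_fraction(schedule: List[str]) -> float:
    """Share of layers that still pay attention's quadratic cost."""
    return schedule.count("attention") / len(schedule)


schedule = hybrid_layer_schedule(num_layers=16, attention_every=8)
print(schedule)
print(f"attention share: {attention_fraction(schedule):.3f}")
```

The design point this captures is that only a minority of layers retain full attention, so most of the stack scales linearly with sequence length while a few attention layers remain to preserve quality.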
Can I use Jamba models commercially?
Jamba was released with open weights for research and commercial use subject to its license, and is accessible through AI21 Studio's API. Check current terms as licensing for derivative models may have evolved.
How does AI21 Research compete with OpenAI or DeepMind research?
They don't match those labs' scale or compute budgets, but focus on architectural efficiency innovations—particularly sub-quadratic sequence modeling—that can disproportionately impact deployment costs for real applications.
Is the Mamba-Transformer hybrid approach proven to scale?
Jamba demonstrated viability at meaningful scale, but the field is evolving rapidly. Pure Mamba approaches have shown limitations on certain reasoning tasks; hybrid designs like Jamba's represent a pragmatic compromise rather than a solved problem.
Why Tel Aviv rather than Silicon Valley or London?
The founders are Israeli, and Tel Aviv offers exceptional ML talent—particularly in systems engineering and NLP—at competitive costs compared to US markets. The location also appeals to European customers with data-sovereignty preferences.
What's the relationship with Mobileye and Intel?
Co-founder Amnon Shashua leads Mobileye (Intel subsidiary) independently; this provides AI21 Labs with credibility and capital-raising advantages but means his operational attention is divided across organizations.
Is AI21 Labs profitable, and does it affect research funding?
Public financial details are limited. The group valuation of ~$1.4B and reported $155M Series B suggest investor confidence, but AI21 Research's continued investment depends on parent company commercial traction that remains less visible than major competitors.
The bottom line
AI21 Research sits at a genuine technical frontier as one of the few teams to get structured state-space models (Mamba) into production-grade LLMs before the major labs fully caught up. Their Jamba architecture demonstrated that sub-quadratic SSM layers, interleaved with a reduced number of attention layers, could work at scale, not just in theory. However, the group operates within AI21 Labs, whose broader commercialization (Wordtune, AI21 Studio) has struggled to match the hype cycle of OpenAI or Anthropic. The $1.4B group valuation reflects investor patience, not proven enterprise dominance.
The critical variable is whether AI21 Labs can convert this research edge into sustained platform revenue, or whether Jamba becomes a footnote as Google, Meta, and Mistral integrate similar efficiency gains into their own stacks. If Mamba-derived architectures become standard but AI21 doesn't capture the tooling/API layer, the research prestige won't translate to economic returns. Watch for whether Jamba 2 or subsequent releases maintain architectural leadership, or if the group pivots toward narrower enterprise verticals where their Tel Aviv-based efficiency focus wins on cost-of-inference.