Modal Labs

Infrastructure

USA · HQ New York · Est. 2021

Serverless GPU compute for AI workloads.

Our score: 7.5

Our take

Modal is a developer-favorite serverless GPU platform that turns Python functions into elastic AI compute without infrastructure overhead.

At a glance

Best known for: Serverless GPU compute for Python-based AI workloads
Biggest strength: Developer experience and Python-native function-as-a-service abstraction
Biggest risk: Hyperscalers and GPU clouds copying the serverless DX or undercutting on price
Stage: Series B
Primary revenue: Usage-based fees for serverless GPU compute and related cloud infrastructure

What they do

Modal Labs operates a serverless compute platform purpose-built for AI and machine learning workloads. The company abstracts away infrastructure management—provisioning, container orchestration, autoscaling, and scheduling—allowing engineers to deploy Python functions to cloud GPUs using native decorators and syntax. Customers write standard Python code, and Modal handles the rest, spinning up GPU-accelerated containers on demand for tasks like fine-tuning large language models, running batch inference pipelines, and executing ad-hoc evaluation jobs. The platform sits in the cloud infrastructure layer, bridging raw GPU providers and the application layer.
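The decorator pattern described above can be sketched in plain Python. This is a toy illustration only and not Modal's SDK: `ToyApp`, `RegisteredFunction`, `function`, `remote`, and the `gpu` parameter are hypothetical names chosen for this example, and the call runs in-process rather than in a remote GPU container.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional


@dataclass
class RegisteredFunction:
    """A plain function plus the resources it requested (hypothetical)."""
    fn: Callable[..., Any]
    gpu: Optional[str] = None

    def remote(self, *args: Any, **kwargs: Any) -> Any:
        # A real platform would ship the code to a container with the
        # requested GPU. This sketch simply calls the function locally.
        return self.fn(*args, **kwargs)


@dataclass
class ToyApp:
    """Hypothetical stand-in for a serverless app registry."""
    name: str
    functions: Dict[str, RegisteredFunction] = field(default_factory=dict)

    def function(self, gpu: Optional[str] = None):
        # Decorator factory: records the function and its resource request.
        def decorator(fn: Callable[..., Any]) -> RegisteredFunction:
            registered = RegisteredFunction(fn=fn, gpu=gpu)
            self.functions[fn.__name__] = registered
            return registered
        return decorator


app = ToyApp("demo")


@app.function(gpu="A100")
def square(x: int) -> int:
    return x * x


result = square.remote(7)
```

The point of the pattern is that the resource request lives next to the function body, so the developer never writes a separate deployment manifest.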

The company primarily sells to AI research teams, startups, and engineering organizations that need elastic GPU access without maintaining Kubernetes clusters or managing instance lifecycles. Modal’s core offering is complemented by Modal Sandboxes, which provide isolated execution environments for untrusted or exploratory code. By focusing on developer experience—fast cold starts, simple dependency management, and seamless Python integration—Modal has become particularly popular among frontier AI labs and fast-moving startups that prioritize iteration speed over building internal ML platforms.

Origin story

Modal was founded in 2021 in New York by Erik Bernhardsson and Akshat Bubna, engineers with deep backgrounds in large-scale distributed systems. Bernhardsson previously served as CTO at Better.com and spent years at Spotify building data and recommendation infrastructure, experience that shaped Modal’s focus on developer ergonomics at scale. The company emerged during the early generative AI boom, betting that the surge in training, fine-tuning, and inference workloads would demand a higher-level abstraction than raw VMs or Kubernetes. Rather than competing on price-per-GPU-hour alone, Modal positioned itself as the 'easy button' for AI compute, optimizing for Python developer velocity and rapid iteration.

After launching its core platform and gaining traction among top AI research labs, the team raised an $80 million Series B at a roughly $650 million valuation. Modal has since scaled to a lean team of 30–60 employees while serving a growing roster of AI-native enterprises and startups, maintaining a capital-efficient profile even as it expands its product surface area beyond core serverless functions.

Key products

Modal

Core serverless platform that deploys Python functions to cloud GPUs and CPUs with automatic scaling, dependency management, and scheduling. Used by AI teams for fine-tuning, batch inference, and distributed evaluation jobs.

Modal Sandboxes

Isolated, ephemeral execution environments that run untrusted or exploratory Python code on cloud GPUs. Popular for dynamic AI evaluations, third-party code execution, and security-sensitive workloads.
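The idea of an ephemeral execution environment can be sketched locally with the Python standard library. This is an assumption-laden illustration, not Modal's Sandbox API: `run_ephemeral` is a hypothetical helper, it uses no GPU, and a child process in a temporary directory is not a security boundary the way a hardened container is.

```python
import subprocess
import sys
import tempfile


def run_ephemeral(code: str, timeout_s: float = 5.0) -> str:
    """Run a code string in a child interpreter inside a throwaway directory.

    NOT real isolation: this only illustrates the ephemeral lifecycle
    (create, run with a time limit, discard).
    """
    with tempfile.TemporaryDirectory() as workdir:
        completed = subprocess.run(
            # -I starts Python in isolated mode (ignores environment
            # variables and the user site directory).
            [sys.executable, "-I", "-c", code],
            cwd=workdir,
            capture_output=True,
            text=True,
            timeout=timeout_s,  # raises subprocess.TimeoutExpired if exceeded
        )
    # The working directory and anything written to it are gone by now.
    return completed.stdout


output = run_ephemeral("print(2 + 2)")
```

A production sandbox would add what this sketch lacks: kernel-level isolation, network policy, and resource limits.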

Leadership

Erik Bernhardsson
CEO & Co-founder
Former CTO at Better.com and engineering leader at Spotify; known for building large-scale data and ML infrastructure.
Akshat Bubna
Co-founder
Co-founded Modal after engineering work on distributed systems; leads core platform architecture and technical strategy.

Funding history

Year | Round | Amount | Lead investors
2024 | Series B | $80M | Undisclosed

Strengths & risks

Strengths

+ Best-in-class Python developer experience with minimal infrastructure boilerplate
+ Strong brand affinity among frontier AI labs and high-growth startups
+ Efficient GPU cold-start and autoscaling technology purpose-built for AI workloads
+ Lean, capital-efficient team generating significant revenue per employee
+ Native support for fine-tuning, batch inference, and distributed Python jobs

Risks

⚠ Hyperscalers adding GPU serverless options that replicate Modal's core convenience
⚠ Margin compression if GPU supply normalizes and bare-metal clouds cut prices
⚠ Customer concentration in a volatile AI startup and research funding environment
⚠ Execution risk in scaling from a beloved developer tool to an enterprise-grade platform

Recent moves

Closed $80M Series B at $650M valuation
2024
The round fuels expansion of Modal's engineering and go-to-market teams amid surging demand for serverless AI compute.
Launched Modal Sandboxes for isolated GPU execution
2024
Sandboxes enable secure, ephemeral environments for untrusted code, opening evals and AI agent use cases.

Competitive position

Modal occupies a distinct niche in the crowded AI infrastructure stack. Against hyperscalers like AWS, Google Cloud, and Azure, it wins on developer experience and AI-specific optimizations—its Python-native decorator model and fast GPU cold starts remain ahead of generic serverless offerings. Compared to raw GPU clouds such as CoreWeave, Lambda Labs, and Vast.ai, Modal trades price-per-GPU-hour for operational simplicity, appealing to teams that lack DevOps resources. It also differentiates from inference-specific platforms like Replicate, Together AI, and Baseten by supporting general-purpose Python workloads rather than just model APIs.

However, Modal faces pressure from both directions. Hyperscalers are rapidly adding GPU support to their serverless containers, which could erode Modal's convenience advantage at the enterprise level. Meanwhile, commodity GPU clouds remain cheaper for long-running, predictable workloads. Modal's ability to maintain loyalty will depend on continuing to deliver superior DX, expanding into enterprise-grade features like VPC networking and compliance certifications, and avoiding margin compression as GPU supply normalizes.
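The trade-off between per-use serverless billing and cheaper always-on GPUs comes down to utilization. The sketch below works through that arithmetic; the function names and every rate are hypothetical placeholders, not published prices from Modal or any GPU cloud.

```python
def break_even_utilization(reserved_rate: float, serverless_rate: float) -> float:
    """Fraction of time a GPU must be busy before reserving beats per-use billing.

    Both rates are in dollars per GPU-hour (hypothetical inputs).
    """
    if reserved_rate <= 0 or serverless_rate <= 0:
        raise ValueError("rates must be positive")
    return reserved_rate / serverless_rate


def monthly_costs(busy_hours: float, hours_in_month: float,
                  reserved_rate: float, serverless_rate: float) -> dict:
    """Compare paying only for busy hours against paying for every hour."""
    return {
        "serverless": busy_hours * serverless_rate,
        "reserved": hours_in_month * reserved_rate,
    }


# Made-up rates: $2.00/h reserved versus $4.00/h serverless.
threshold = break_even_utilization(reserved_rate=2.0, serverless_rate=4.0)
costs = monthly_costs(busy_hours=100, hours_in_month=720,
                      reserved_rate=2.0, serverless_rate=4.0)
```

Under these assumed rates, a team whose GPU is busy less than half the time pays less on the serverless side, which is why bursty workloads favor platforms like Modal and steady ones favor commodity GPU clouds.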

What to watch

01 SOC 2 and enterprise compliance certifications to unlock regulated buyers
02 Gross margin trends as GPU supply eases and pricing pressure increases
03 Adoption of Modal Sandboxes for AI agent and third-party code execution
04 Churn rates if AWS, GCP, or Azure improve native serverless GPU offerings

Frequently asked questions

How is Modal different from AWS Lambda?

Modal is purpose-built for GPU-intensive AI workloads, supporting longer timeouts, custom containers, and Python-native dependency management that general-purpose serverless platforms do not optimize for.

What kinds of AI teams use Modal?

Frontier research labs, startups, and enterprise AI teams use Modal for fine-tuning, batch inference, and evaluation jobs where iteration speed matters more than managing Kubernetes clusters.

Do I need to containerize my application to use Modal?

No. Modal automatically containerizes Python functions based on your dependencies, though you can supply custom Docker images if your workflow requires them.

What GPU types does Modal support?

Modal offers a range of Nvidia GPUs including A100 and H100 instances, with specific configurations varying by region, availability, and customer demand.

Is Modal only for research, or can it handle production workloads?

While popular in research, Modal supports production use cases with autoscaling, scheduled jobs, and persistent volumes, though enterprise buyers should verify current compliance certifications.

How does Modal pricing work?

Modal charges based on the GPU or CPU type and the exact duration your function runs, billed by the second, with no upfront infrastructure costs.
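Per-second billing can be illustrated with a short calculation. This is a sketch under stated assumptions: `run_cost` is a hypothetical helper and the hourly rate is a made-up figure, not a Modal price.

```python
def run_cost(seconds: float, hourly_rate: float) -> float:
    """Cost of one function run billed by the second.

    hourly_rate is dollars per hour for the chosen GPU or CPU type
    (a hypothetical input, not a real price list).
    """
    if seconds < 0 or hourly_rate < 0:
        raise ValueError("seconds and hourly_rate must be non-negative")
    per_second = hourly_rate / 3600  # convert the hourly rate to a per-second rate
    return seconds * per_second


# A 90-second job on a resource assumed to cost $3.60 per hour.
cost = run_cost(seconds=90, hourly_rate=3.60)
rounded_cost = round(cost, 4)
```

With these assumed numbers the job costs about nine cents, whereas a provider that bills in whole-hour increments would charge the full hourly rate for the same run.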

What are Modal Sandboxes?

Modal Sandboxes are isolated, ephemeral environments for running untrusted or exploratory code on GPUs, commonly used for third-party code execution and dynamic AI evaluations.

The bottom line

Modal has established itself as the default 'easy button' for AI engineers who need GPU compute without DevOps overhead, winning intense loyalty among research labs and startups for its Python-native experience. The critical challenge ahead is graduating from a beloved developer tool to a mission-critical enterprise platform with the compliance, networking, and security features that large buyers demand. If Modal can broaden its feature set without sacrificing developer experience, it has a clear shot at becoming the dominant serverless layer for AI workloads; if hyperscalers close the UX gap or GPU margin pressure intensifies before Modal locks in enterprise accounts, its growth could stall.
