No Priors

No Priors Ep. 96 | With Modal CEO and Founder Erik Bernhardsson

Today on No Priors, Elad chats with Erik Bernhardsson, founder and CEO of Modal Labs, a serverless infrastructure platform that simplifies ML workflows by streamlining deployment, scaling, and development for AI engineers. Erik talks about his early work on Spotify's ML algorithms, what Modal offers today, and his vision for building an end-to-end solution for AI engineers. They dive into GPU trends, cloud vs. on-premise setups, and when to train custom models vs. use off-the-shelf solutions. Erik also shares his thoughts on the evolving role of AI in fields like coding, physics, and music.

Sign up for new podcasts every week. Email feedback to show@no-priors.com

Follow us on Twitter: @NoPriorsPod | @Saranormous | @EladGil | @Bernhardsson

Show Notes:
0:00 Introduction
0:22 Erik's early interest in ML infra
1:22 Founding Modal Labs
4:17 State of GPU use today and what's to come
7:14 Modal's end-to-end vision
9:00 Differentiating amongst competition
10:20 Cloud vs. on-premise
12:35 Popular AI models
13:20 Gaps in AI infrastructure
14:55 Insights on vector databases
16:48 Training models vs. off-the-shelf models
17:47 AI's impact on coding and physics
22:14 AI's impact on music

Elad Gil (host) · Erik Bernhardsson (guest) · Sarah Guo (host)
Jan 8, 2025 · 23m · Watch on YouTube ↗

At a glance

WHAT IT’S REALLY ABOUT

Modal’s Erik Bernhardsson Rethinks Cloud Infrastructure For Modern AI Workloads

  1. Erik Bernhardsson, founder and CEO of Modal, explains how his experience building ML infrastructure at Spotify and Better.com led him to create a serverless, cloud-native compute platform optimized for AI and data applications.
  2. Modal offers a multi-tenant pool of CPUs and GPUs with a Python-first serverless interface, aiming to make cloud development feel as fast and simple as local development while abstracting away Docker, Kubernetes, and capacity planning.
  3. The discussion covers inefficiencies in current GPU usage, the shift from training to inference-heavy workloads, and the importance of flexible, usage-based GPU access—especially for bursty workloads like generative AI and experimental training.
  4. They also explore broader AI infrastructure topics: vector databases and AI-native storage, when to train your own models, physics and biology simulations, and the long-term impact of AI on software engineering demand.

IDEAS WORTH REMEMBERING

5 ideas

Flexible, usage-based GPU access is critical for modern AI workloads.

Traditional long-term GPU reservations misalign with volatile inference and experimental training needs; Modal instead charges only for actual container runtime and pools GPUs across customers to handle bursty demand.
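The billing contrast can be made concrete with a toy calculation. All rates below are hypothetical illustration values, not Modal's actual pricing:

```python
# Toy comparison of usage-based vs. always-on reserved GPU cost.
# Both hourly rates are hypothetical, chosen only to illustrate the trade-off.
ON_DEMAND_RATE = 4.00  # $/GPU-hour, billed only while a container runs (assumed)
RESERVED_RATE = 2.00   # $/GPU-hour, but billed around the clock (assumed)

def monthly_cost(active_hours_per_day: float) -> tuple[float, float]:
    """Return (usage_based, reserved) monthly cost for one GPU over 30 days."""
    usage_based = active_hours_per_day * 30 * ON_DEMAND_RATE
    reserved = 24 * 30 * RESERVED_RATE
    return usage_based, reserved

# A bursty workload active 2 hours/day: usage-based wins ($240 vs. $1,440
# for the month) even though its hourly rate is twice as high.
usage, reserved = monthly_cost(2)
```

The crossover is utilization: a workload busy nearly around the clock would favor the reservation, which is why long-term contracts misalign with bursty inference and experimental training.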

Abstracting away Kubernetes and Docker can dramatically boost ML developer productivity.

By turning plain Python functions into serverless cloud functions and managing containerization, scheduling, and file systems internally, Modal aims to make cloud development feel as responsive as local coding.
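The "plain Python function becomes a cloud function" pattern can be sketched with a toy decorator. This is not Modal's real API, just a minimal stand-in: the decorator attaches a `.remote()` entry point that, on a real platform, would serialize the function and run it in a managed cloud container.

```python
import functools

def cloud_function(fn):
    """Toy serverless decorator: wraps a plain function so it can be
    called locally or via .remote(). A real platform would ship the
    function to a container fleet instead of running it in-process."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        return fn(*args, **kwargs)  # plain local call, semantics unchanged
    # On a real platform this attribute would dispatch over the network;
    # here it runs locally so the sketch stays self-contained.
    wrapper.remote = lambda *args, **kwargs: fn(*args, **kwargs)
    return wrapper

@cloud_function
def tokenize(text: str) -> list[str]:
    return text.lower().split()

words = tokenize.remote("Serverless Python feels local")
```

The appeal of the pattern is that the call site stays ordinary Python; containerization, scheduling, and file systems are someone else's problem behind the decorator.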

A multi-tenant, cloud-native design enables better capacity utilization and scale.

Running all customers on a shared compute pool lets Modal offer near-instant access to large numbers of GPUs, offloading capacity planning and idle-resource risk from individual teams.

Inference is the current ‘killer app,’ but end-to-end ML support is the real goal.

While most Modal usage today is inference—especially for generative audio, video, image, and music—many customers already use it for preprocessing, and the company plans to support more training workflows over time.

Owning your model is often key to defensibility in AI-heavy products.

For companies where model quality is core to the product, relying solely on generic models weakens the moat; training specialized models (especially in audio, video, and imaging) can become a major differentiator.

WORDS WORTH SAVING

5 quotes

I always wanted to build basically a better infrastructure for data, AI, and machine learning.

Erik Bernhardsson

Working with the cloud is arguably kind of annoying… my idea was: what if cloud development feels almost as good as local development?

Erik Bernhardsson

For GPUs, the main way to get access has been to sign long-term contracts, and fundamentally that’s just not how startups should do it.

Erik Bernhardsson

Our goal has always been to build a platform and cover the end-to-end use case… the entire machine learning life cycle end-to-end.

Erik Bernhardsson

Every decade, engineers get 10 times more productive, and it turns out that just unlocks more latent demand for software engineers.

Erik Bernhardsson

TOPICS

Origin and vision of Modal as a serverless AI/ML infrastructure platform
Flexible, usage-based GPU capacity and multi-tenant cloud compute
End-to-end ML lifecycle support: preprocessing, training, and inference
Differentiation from hyperscalers and other AI infrastructure providers
Future of vector databases and AI-native data storage models
When companies should train their own models versus using off-the-shelf
AI's impact on software engineering, scientific computing, and simulation

High-quality AI-generated summary created from a speaker-labeled transcript.
