No Priors Ep. 96 | With Modal CEO and Founder Erik Bernhardsson
At a glance
WHAT IT’S REALLY ABOUT
Modal’s Erik Bernhardsson Rethinks Cloud Infrastructure For Modern AI Workloads
- Erik Bernhardsson, founder and CEO of Modal, explains how his experience building ML infrastructure at Spotify and Better.com led him to create a serverless, cloud-native compute platform optimized for AI and data applications.
- Modal offers a multi-tenant pool of CPUs and GPUs with a Python-first serverless interface, aiming to make cloud development feel as fast and simple as local development while abstracting away Docker, Kubernetes, and capacity planning.
- The discussion covers inefficiencies in current GPU usage, the shift from training to inference-heavy workloads, and the importance of flexible, usage-based GPU access—especially for bursty workloads like generative AI and experimental training.
- They also explore broader AI infrastructure topics: vector databases and AI-native storage, when to train your own models, physics and biology simulations, and the long-term impact of AI on software engineering demand.
IDEAS WORTH REMEMBERING
Flexible, usage-based GPU access is critical for modern AI workloads.
Traditional long-term GPU reservations misalign with volatile inference and experimental training needs; Modal instead charges only for actual container runtime and pools GPUs across customers to handle bursty demand.
Abstracting away Kubernetes and Docker can dramatically boost ML developer productivity.
By turning plain Python functions into serverless cloud functions and managing containerization, scheduling, and file systems internally, Modal aims to make cloud development feel as responsive as local coding.
A multi-tenant, cloud-native design enables better capacity utilization and scale.
Running all customers on a shared compute pool lets Modal offer near-instant access to large numbers of GPUs, offloading capacity planning and idle-resource risk from individual teams.
Inference is the current ‘killer app,’ but end-to-end ML support is the real goal.
While most Modal usage today is inference—especially for generative audio, video, image, and music—many customers already use it for preprocessing, and the company plans to support more training workflows over time.
Owning your model is often key to defensibility in AI-heavy products.
For companies where model quality is core to the product, relying solely on generic models weakens the moat; training specialized models (especially in audio, video, and imaging) can become a major differentiator.
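The "plain Python functions become serverless cloud functions" idea above can be illustrated with a toy decorator. This is a minimal sketch of the pattern, not Modal's actual API: the names `serverless`, `FakeCloud`, and `embed` are invented here for illustration, and the "cloud" is simulated in-process to show the two pieces Modal pairs together, a decorator that adds a remote-call interface to an ordinary function, and billing metered only on actual execution time rather than reserved capacity.

```python
import time


class FakeCloud:
    """Stand-in for a container scheduler (hypothetical): runs the
    function and bills only for the seconds it actually executed,
    mirroring usage-based rather than reservation-based pricing."""

    def __init__(self):
        self.billed_seconds = 0.0

    def run(self, fn, *args, **kwargs):
        start = time.monotonic()
        try:
            return fn(*args, **kwargs)
        finally:
            # Meter only the real runtime of this invocation.
            self.billed_seconds += time.monotonic() - start


cloud = FakeCloud()


def serverless(fn):
    """Toy decorator: gives a plain Python function a .remote()
    method, mimicking the decorate-then-call-remotely workflow."""
    fn.remote = lambda *a, **kw: cloud.run(fn, *a, **kw)
    return fn


@serverless
def embed(texts):
    # In a real deployment this body might load a model onto a GPU;
    # here it returns string lengths as stand-in "embeddings".
    return [len(t) for t in texts]


print(embed.remote(["hello", "modal"]))  # -> [5, 5]
```

The point of the pattern is that the function body stays ordinary Python; only the decorator changes where and how it runs, which is what lets the platform handle containerization and scheduling behind the scenes.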
WORDS WORTH SAVING
I always wanted to build basically a better infrastructure for data, AI, and machine learning.
— Erik Bernhardsson
Working with the cloud is arguably kind of annoying… my idea was: what if cloud development feels almost as good as local development?
— Erik Bernhardsson
For GPUs, the main way to get access has been to sign long-term contracts, and fundamentally that’s just not how startups should do it.
— Erik Bernhardsson
Our goal has always been to build a platform and cover the end-to-end use case… the entire machine learning life cycle end-to-end.
— Erik Bernhardsson
Every decade, engineers get 10 times more productive, and it turns out that just unlocks more latent demand for software engineers.
— Erik Bernhardsson
AI-generated summary created from a speaker-labeled transcript.