No Priors

No Priors Ep. 70 | With Cartesia Co-Founders Karan Goel & Albert Gu

This week on No Priors, Sarah Guo and Elad Gil sit down with Karan Goel and Albert Gu from Cartesia. Karan and Albert first met as Stanford AI Lab PhDs, where their lab invented State Space Models (SSMs), a fundamental new primitive for training large-scale foundation models. In 2023, they founded Cartesia to build real-time intelligence for every device. One year later, Cartesia released Sonic, which generates high-quality, lifelike speech with a model latency of 135 ms, the fastest for a model of this class.

Sign up for new podcasts every week. Email feedback to show@no-priors.com

Follow us on Twitter: @NoPriorsPod | @Saranormous | @EladGil | @krandiash | @_albertgu

Show Notes:
0:00 Introduction
0:28 Use Cases for Cartesia and Sonic
1:32 Karan Goel & Albert Gu’s professional backgrounds
5:06 State Space Models (SSMs) versus Transformer Based Architectures
11:51 Domain Applications for Hybrid Approaches
13:10 Text to Speech and Voice
17:29 Data, Size of Models and Efficiency
20:34 Recent Launch of Text to Speech Product
25:01 Multi-modality & Building Blocks
25:54 What’s Next at Cartesia?
28:28 Latency in Text to Speech
29:30 Choosing Research Problems Based on Aesthetic
31:23 Product Demo
32:48 Cartesia Team & Hiring

Sarah Guo (host) | Albert Gu (guest) | Karan Goel (guest) | Elad Gil (host)
Jun 27, 2024 | 34m

Episode Details

EPISODE INFO

Released
June 27, 2024
Duration
34m
Channel
No Priors

SPEAKERS

  • Sarah Guo (host)
  • Albert Gu (guest)
  • Karan Goel (guest)
  • Elad Gil (host)

EPISODE SUMMARY

Cartesia Bets on State Space Models for Real-Time Voice AI

In this episode of No Priors, Cartesia co-founders Karan Goel and Albert Gu discuss their work on state space models (SSMs) such as S4 and Mamba as efficient, elegant alternatives and complements to Transformers. They explain why SSMs are particularly well suited to perceptual and multimodal data such as audio, and how this underpins their flagship product Sonic, a low-latency text-to-speech engine. The conversation covers technical trade-offs between SSMs and Transformers, hybrid architectures, and the potential to run powerful multimodal models on consumer devices instead of only in data centers. They also outline Cartesia’s roadmap toward multimodal conversational agents, on-device inference, and building a broader “rebellion” against Transformer-only thinking.

RELATED EPISODES

Amex Global Business Travel: The World’s First AI Take Private with Long Lake CEO Alexander Taubman

Baseten CEO Tuhin Srivastava on Custom Models, and Building the Inference Cloud

No Priors Ep. 27 | With Sarah Guo & Elad Gil

No Priors Ep. 105 | With Director of the Center of AI Safety Dan Hendrycks

No Priors Ep. 6 | With Daphne Koller from Insitro

No Priors Ep. 5 | With Huggingface’s Clem Delangue
