Lenny's Podcast

Chip Huyen: Why RAG wins come from data prep, not vector DBs

Preparing data and talking to users beat agonizing over which vector database to choose; Huyen says post-training, not new models, drives real AI product wins.

Chip Huyen (guest), Lenny Rachitsky (host)
Oct 23, 2025 · 1h 22m

Episode Details

EPISODE INFO

Released: October 23, 2025
Duration: 1h 22m
Channel: Lenny's Podcast

EPISODE DESCRIPTION

Chip Huyen is a core developer on NVIDIA’s NeMo platform, a former AI researcher at Netflix, and taught machine learning at Stanford. She’s a two-time founder and the author of two widely read books on AI, including AI Engineering, which has been the most-read book on the O’Reilly platform since its launch. Unlike many AI commentators, Chip has built multiple successful AI products and platforms and works directly with enterprises on their AI strategies, giving her unique visibility into what’s actually happening inside companies building AI products.

*We discuss:*

  1. What people think makes AI apps better vs. what actually makes AI apps better
  2. What pre-training vs. post-training is, and why fine-tuning should be your last resort
  3. How RLHF (reinforcement learning from human feedback) actually works
  4. Why data quality matters more than which vector database you choose
  5. Why high performers are seeing the most gains from AI coding tools
  6. Why most AI problems are actually UX issues

*Brought to you by:*

• Dscout: The UX platform to capture insights at every stage, from ideation to production: https://www.dscout.com/
• Justworks: The all-in-one HR solution for managing your small business with confidence: https://www.justworks.com
• Persona: A global leader in digital identity verification: https://withpersona.com/lenny

*Transcript:* https://www.lennysnewsletter.com/p/al-engineering-101-with-chip-huyen

*My biggest takeaways (for paid newsletter subscribers):* https://www.lennysnewsletter.com/i/176081814/my-biggest-takeaways-from-this-conversation

*Where to find Chip Huyen:*

*Where to find Lenny:*

*In this episode, we cover:*

(00:00) Introduction to Chip Huyen
(04:28) Chip’s viral LinkedIn post
(07:05) Understanding AI training: pre-training vs. post-training
(08:50) Language modeling explained
(13:55) The importance of post-training
(15:20) Reinforcement learning and human feedback
(22:23) The importance of evals in AI development
(31:55) Retrieval augmented generation (RAG) explained
(38:50) Challenges in AI tool adoption
(43:19) Challenges in measuring productivity
(45:20) The three-bucket test
(49:10) The future of engineering roles
(55:31) ML Engineers vs. AI engineers
(57:12) Looking forward: the impact of AI
(01:05:48) Model capabilities vs. perceived performance
(01:08:23) Lightning round and final thoughts

*Referenced:*

• Inside the expert network training every frontier AI model | Garrett Lord (Handshake CEO): https://www.lennysnewsletter.com/p/inside-handshake-garrett-lord

...References continued at: https://www.lennysnewsletter.com/p/al-engineering-101-with-chip-huyen

*Recommended books:*

_Production and marketing by https://penname.co/._

_For inquiries about sponsoring the podcast, email podcast@lennyrachitsky.com._

Lenny may be an investor in the companies discussed.

SPEAKERS

  • Chip Huyen (guest)
  • Lenny Rachitsky (host)

EPISODE SUMMARY

Chip Huyen explains real-world AI engineering, beyond hype and headlines.

In this episode of Lenny's Podcast, Chip Huyen joins Lenny Rachitsky to demystify AI engineering, focusing on how real products get built and improved versus what people *think* matters. She walks through pre-training, post-training, fine-tuning, RAG, RLHF, evals, and test-time compute, tying each concept back to concrete product decisions. A recurring theme is that teams over-index on new models, tools, and news, and under-invest in talking to users, preparing better data, and designing robust end-to-end systems. She also shares what she’s seeing inside enterprises: where GenAI is actually delivering value, how org structures and engineering roles are shifting, and why we’re in an “idea crisis” despite unprecedented AI capabilities.

RELATED EPISODES

How to build a company that withstands any era | Eric Ries, Lean Startup author

Head of Claude Code: What happens after coding is solved | Boris Cherny

Building product at Stripe: craft, metrics, and customer obsession | Jeff Weinstein (Product lead)

Building a world-class data org | Jessica Lachs (VP of Analytics and Data Science at DoorDash)

What most people miss about marketing | Rory Sutherland (Vice Chairman of Ogilvy UK, author)

5 essential questions to craft a winning strategy | Roger Martin (author, advisor, speaker)
