Lenny's Podcast

Chip Huyen: Why RAG wins come from data prep, not vector DBs

Preparing data and talking to users beat agonizing over which vector database to choose; Huyen says post-training, not new models, drives real AI product wins.

Chip Huyen (guest), Lenny Rachitsky (host)
Oct 23, 2025 · 1h 22m

Episode Details

EPISODE INFO

Released: October 23, 2025
Duration: 1h 22m
Channel: Lenny's Podcast

EPISODE DESCRIPTION

Chip Huyen is a core developer on Nvidia’s NeMo platform and a former AI researcher at Netflix, and she taught machine learning at Stanford. She’s a two-time founder and the author of two widely read books on AI, including AI Engineering, which has been the most-read book on the O’Reilly platform since its launch. Unlike many AI commentators, Chip has built multiple successful AI products and platforms and works directly with enterprises on their AI strategies, giving her unique visibility into what’s actually happening inside companies building AI products.

*We discuss:*

  1. What people think makes AI apps better vs. what actually makes AI apps better
  2. What pre-training vs. post-training is, and why fine-tuning should be your last resort
  3. How RLHF (reinforcement learning from human feedback) actually works
  4. Why data quality matters more than which vector database you choose
  5. Why high performers are seeing the most gains from AI coding tools
  6. Why most AI problems are actually UX issues

*Brought to you by:*

• Dscout: The UX platform to capture insights at every stage, from ideation to production: https://www.dscout.com/
• Justworks: The all-in-one HR solution for managing your small business with confidence: https://www.justworks.com
• Persona: A global leader in digital identity verification: https://withpersona.com/lenny

*Transcript:* https://www.lennysnewsletter.com/p/al-engineering-101-with-chip-huyen

*My biggest takeaways (for paid newsletter subscribers):* https://www.lennysnewsletter.com/i/176081814/my-biggest-takeaways-from-this-conversation

*Where to find Chip Huyen:*

*Where to find Lenny:*

*In this episode, we cover:*

(00:00) Introduction to Chip Huyen
(04:28) Chip’s viral LinkedIn post
(07:05) Understanding AI training: pre-training vs. post-training
(08:50) Language modeling explained
(13:55) The importance of post-training
(15:20) Reinforcement learning and human feedback
(22:23) The importance of evals in AI development
(31:55) Retrieval augmented generation (RAG) explained
(38:50) Challenges in AI tool adoption
(43:19) Challenges in measuring productivity
(45:20) The three-bucket test
(49:10) The future of engineering roles
(55:31) ML Engineers vs. AI engineers
(57:12) Looking forward: the impact of AI
(01:05:48) Model capabilities vs. perceived performance
(01:08:23) Lightning round and final thoughts

*Referenced:*

• Inside the expert network training every frontier AI model | Garrett Lord (Handshake CEO): https://www.lennysnewsletter.com/p/inside-handshake-garrett-lord

...References continued at: https://www.lennysnewsletter.com/p/al-engineering-101-with-chip-huyen

*Recommended books:*

_Production and marketing by https://penname.co/._ _For inquiries about sponsoring the podcast, email podcast@lennyrachitsky.com._ Lenny may be an investor in the companies discussed.

SPEAKERS

  • Chip Huyen

    guest
  • Lenny Rachitsky

    host

EPISODE SUMMARY

In this episode of Lenny's Podcast, Chip Huyen joins Lenny Rachitsky to demystify AI engineering, focusing on how real products get built and improved versus what people *think* matters. She walks through pre-training, post-training, fine-tuning, RAG, RLHF, evals, and test-time compute, always tying concepts back to concrete product decisions. A recurring theme is that teams over-index on new models, tools, and news, and under-invest in talking to users, preparing better data, and designing robust end-to-end systems. She also shares what she’s seeing inside enterprises: where GenAI is actually delivering value, how org structures and engineering roles are shifting, and why we’re in an “idea crisis” despite unprecedented AI capabilities.

RELATED EPISODES

What happens after coding is solved? | Fiona Fung (Claude Code & Cowork)

The hidden pattern behind successful products | Mark Pincus (FarmVille, Words with Friends, & more)

Tony Fadell: How to build real taste (and why AI makes it matter more)

The most rational take on AI you’ll hear this year

AI predictions: Job markets, Codex beats Claude, and the death of org charts | Dan Shipper

Why the next AI boom is physical AI | Caitlin Kalinowski (ex-OpenAI, Meta, Apple)
