Y Combinator

Paul Buchheit: Why Evals, Not Code, Are the Real AI Moat

Predicting the next token dissolved the paperclip maximizer fear; now eval sets, not codebases, are the moat, as Jerry shows with 50% growth post-GPT-4.

Garry Tan (host), Paul Buchheit (guest), Harj Taggar (host), Jared Friedman (host), Diana Hu (host)
Jan 23, 2025 · 39m

At a glance

WHAT IT’S REALLY ABOUT

AI Agents Explode Startup Growth, Redefining Work, Wealth, And Software

The episode covers how AI, especially AI agents for businesses, is radically accelerating startup growth, with many YC companies hitting revenue milestones in months that once took years. The hosts describe unprecedented enterprise demand for AI, where buyers already want solutions and technical founders win by simply delivering products that actually work. They argue that the true moats are now eval datasets, prompting expertise, rapid iteration, and willingness to constantly rebuild on the latest models. The conversation broadens into how AI can increase human agency, reshape labor and wealth (machine money vs human money), and why we may be on a “good timeline” for AI development rather than a dystopian one.

IDEAS WORTH REMEMBERING

5 ideas

AI startups are growing faster with fewer people, changing growth benchmarks.

YC now sees entire batches averaging 10% weekly growth and companies going from zero to $12M ARR in a year, often with very small teams; hitting $1M ARR in 6–12 months is becoming a baseline expectation for strong AI startups.
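To put the growth figures above in perspective, here is a minimal arithmetic sketch of weekly compounding. The function names and the 12x scenario are illustrative assumptions, not figures from the episode; compounding from exactly zero is undefined, so the second calculation assumes a hypothetical company that starts the year at some nonzero revenue and ends at 12 times that amount.

```python
def compound(weekly_rate: float, weeks: int) -> float:
    """Growth multiple after `weeks` of compounding at `weekly_rate`."""
    return (1.0 + weekly_rate) ** weeks


def required_weekly_rate(multiple: float, weeks: int) -> float:
    """Constant weekly rate needed to reach `multiple` within `weeks`."""
    return multiple ** (1.0 / weeks) - 1.0


# 10% weekly growth sustained for a full year (52 weeks).
year_multiple = compound(0.10, 52)

# Hypothetical: weekly rate needed to grow revenue 12x in one year.
rate_for_12x = required_weekly_rate(12.0, 52)

print(f"10% weekly for 52 weeks: about {year_multiple:.0f}x")
print(f"Weekly rate for 12x in a year: about {rate_for_12x:.1%}")
```

Sustained for a year, 10% weekly growth compounds to roughly a 142x multiple, while a 12x year needs only about 4.9% per week. That gap is why batch-wide 10% weekly averages are notable even over short periods.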

Demand for AI agents in enterprises is so strong that sales friction has flipped.

Instead of convincing customers they need AI, founders find enterprises under internal pressure to adopt it; the bottleneck is no longer demand but building agents that truly perform at or above human level for tasks like support and sales.

Eval sets and prompting are emerging as key competitive moats, not just code.

Founders report that meticulously labeled evaluation datasets and refined prompting strategies are often more valuable than their codebases, enabling reliable, predictable AI behavior that competitors can’t easily copy even if they use the same base models.
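As a rough illustration of what an eval set is in practice, the sketch below scores a model against a handful of labeled cases. Everything here is hypothetical: `EvalCase`, `run_evals`, `stub_model`, and the sample cases are invented for illustration, and the exact-match scoring is an assumed simplification. Real eval sets are usually far larger and use task-specific scoring.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EvalCase:
    prompt: str    # input sent to the model
    expected: str  # human-labeled correct answer


def run_evals(model: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Return the fraction of cases where the model output matches the label."""
    if not cases:
        raise ValueError("eval set is empty")
    passed = sum(
        1
        for case in cases
        if model(case.prompt).strip().lower() == case.expected.strip().lower()
    )
    return passed / len(cases)


def stub_model(prompt: str) -> str:
    """Hypothetical stand-in for a real model call: a keyword classifier."""
    return "refund" if "money back" in prompt.lower() else "other"


# Hypothetical labeled cases for a support-ticket routing task.
CASES = [
    EvalCase("I want my money back", "refund"),
    EvalCase("Where is my order?", "shipping"),
    EvalCase("Can I get my money back for this?", "refund"),
    EvalCase("How do I reset my password?", "other"),
]

score = run_evals(stub_model, CASES)
print(f"pass rate: {score:.0%}")
```

Because the labeled cases are independent of any particular model, the same set can be re-run after every prompt change or model upgrade, which is one plausible reading of why founders treat it as more durable than the code around it.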

Relentless iteration and a willingness to rebuild on new AI capabilities are crucial.

Successful teams frequently throw away old architectures, switch tools (e.g., from dedicated vector databases to pgvector), and rewrite their stacks every few months, gaining an edge over larger organizations that can’t update as quickly.

AI tools are redefining productivity expectations for engineers and designers.

Tools like Cursor and Claude are becoming must-haves; some founders won’t hire engineers who don’t use AI codegen, and designers are skipping Figma in favor of text-to-code workflows, raising the baseline of what one person can produce.

WORDS WORTH SAVING

5 quotes

This is the first time no one's saying no. Everyone is saying yes, and like, more.

Harj Taggar

It sounds like what's driving the growth is that the demand is already there. And so you just have to show up with a product that works.

Paul Buchheit

The most valuable thing that his company has built is not the code base. It's the eval set.

Diana Hu (paraphrasing a founder at the retreat)

We actually found the right objective function, which is simply to predict the next token… and the great thing about that is we've been able to create this intelligence that doesn't have this drive to survive.

Paul Buchheit

It's never been a better time to be a founder, that's for sure.

Harj Taggar

Unprecedented growth rates and ambition among AI startups in recent YC batches
Enterprise demand for AI agents and why sales are easier but building is harder
Eval datasets, prompting, and rapid iteration as core AI-era startup moats
Impact of AI tools on developer productivity, hiring, and the future of SaaS
AI’s role in wealth creation, job displacement fears, and “machine vs human money”
Regulation, agency, and the importance of open competition among AI labs
Historical perspective from YC and OpenAI on how we reached the current AI moment

High-quality AI-generated summary created from a speaker-labeled transcript.
