Lenny's Podcast

Jason Droege: Why AI still needs someone to dig up the road

Covers Meta's $14B stake in Scale AI, expert networks of doctors, engineers, and PhDs, why production AI takes 6 to 12 months, and which Uber Eats lessons still apply.

Lenny Rachitsky (host), Jason Droege (guest)

Oct 8, 2025 · 1h 24m

WHAT IT’S REALLY ABOUT

Scale AI’s New CEO Explains Future Of AI, Data, And Entrepreneurship

Jason Droege, new CEO of Scale AI and creator of Uber Eats, discusses how AI progress is increasingly driven by expert human data, rigorous evaluations, and real-world enterprise deployments rather than just model architecture and compute.
He breaks down the Meta–Scale $14B deal, clarifies that Scale remains an independent company, and details how expert networks (doctors, engineers, PhDs) now label and evaluate complex tasks to move models from ‘knowing’ to ‘doing’.
Droege shares on-the-ground realities of deploying AI in enterprises—why production systems take 6–12 months, why most quick POCs fail, and why humans will remain in the loop for a long time.
Drawing on his experience scaling Uber Eats from zero to a $20B run-rate business, he also offers concrete lessons on customer obsession, business model selection, gross margins, risk-taking, and building durable teams.

IDEAS WORTH REMEMBERING

5 ideas

AI progress now depends heavily on expert human input, not just big datasets.

Early gen-AI systems used cheap, generalist labelers; today’s frontier models require highly skilled experts (engineers, doctors, PhDs) who spend hours on tasks like building full websites or explaining nuanced medical topics, and who define what ‘good’ looks like in each domain.

Evaluations (evals) are becoming the backbone of serious AI deployments.

Enterprises and governments increasingly need robust eval suites that encode ‘what good looks like’ for their specific processes (e.g., medical chart review, insurance claims), because off-the-shelf benchmarks and generic RAG/fine-tuning only get you part of the way.

Real enterprise AI impact typically takes 6–12 months, not weeks.

POCs often stall at 60–70% performance; getting to production-grade reliability requires months of iteration, domain data collection, policy and legal approvals, change management, and careful human–AI handoffs, which news headlines and demo videos tend to obscure.

Humans will remain central to AI training and oversight for the foreseeable future.

Droege argues that as long as new human skills, judgments, and organizational contexts matter, models will need fresh human-generated data and judgment; ‘no-human-in-the-loop’ would imply a world where no new human knowledge is worth encoding, which he sees as very far off.

Choosing the right market and business model dramatically improves odds of success.

When launching Uber Eats, Droege systematically compared ideas (convenience vans, grocery, point-to-point delivery) and prioritized a marketplace with strong unit economics, clear incrementality for restaurants, and the potential for network effects and large scale.

WORDS WORTH SAVING

5 quotes

The general trend right now is going from models knowing things to models doing things.

— Jason Droege

With any of these major tech revolutions, headlines tell one story, and then on the ground… someone’s got to dig up the road or run the undersea cable.

— Jason Droege

If you have a human process that is 10 or 20% accurate, AI is awesome. If it’s 98% accurate and you expect AI to get you the remaining 2%, we’re not totally there yet.

— Jason Droege

At the point at which you don’t need external human data in models, we’ve gotten to a level of advancement that is almost unfathomable.

— Jason Droege

Not losing is a precursor to winning. Survival is just part of the game.

— Jason Droege

Details and implications of Scale AI’s $14B Meta deal and current independence

Evolution of data labeling: from low-cost generalists to expert and in-house enterprise labeling

Reinforcement learning, agentic systems, and the shift from models ‘knowing’ to models ‘doing’

Enterprise AI adoption: why POCs fail, what real deployments look like, and timelines

Future demand for human experts and the long-term role of humans in AI training

Frameworks for choosing new businesses (e.g., Uber Eats) and evaluating gross margins

Hiring philosophy, team composition, and leadership lessons from Scour, Uber, and Scale

High quality AI-generated summary created from speaker-labeled transcript.
