Y Combinator
Alexandr Wang: Building Scale AI, Transforming Work With Agents & Competing With China
Garry Tan and Alexandr Wang on Scale AI, Agentic Workflows, and U.S.–China AI Rivalry
In this episode of Y Combinator, Garry Tan and Alexandr Wang discuss Scale AI, agentic workflows, and the U.S.–China AI rivalry. Wang recounts Scale AI’s evolution from a YC-era “API for human labor” into a core infrastructure and applications provider for frontier AI labs, enterprises, and the U.S. Department of Defense. He explains how focusing early on self-driving car data, then shifting to foundation model data and agentic applications, positioned Scale as the “NVIDIA of data.” Wang outlines a future of work where humans increasingly manage swarms of AI agents rather than being replaced by them, and describes how reinforcement learning and hard evaluations like Humanity’s Last Exam are driving model capabilities. He also warns about China’s rapid progress in AI, especially in data, manufacturing, and espionage, and argues that U.S. strategic advantage will hinge on compute, energy, and maintaining frontier models.
At a glance
WHAT IT’S REALLY ABOUT
Alexandr Wang on Scale AI, Agentic Workflows, and U.S.–China AI Rivalry
- Alexandr Wang recounts Scale AI’s evolution from a YC-era “API for human labor” into a core infrastructure and applications provider for frontier AI labs, enterprises, and the U.S. Department of Defense. He explains how focusing early on self‑driving car data, then shifting to foundation model data and agentic applications, positioned Scale as the “NVIDIA of data.” Wang outlines a future of work where humans increasingly manage swarms of AI agents rather than being replaced by them, and describes how reinforcement learning and hard evaluations like Humanity’s Last Exam are driving model capabilities. He also warns about China’s rapid progress in AI—especially in data, manufacturing, and espionage—and argues that U.S. strategic advantage will hinge on compute, energy, and maintaining frontier models.
IDEAS WORTH REMEMBERING
5 ideas
Narrow early focus can bootstrap you into much larger markets.
Scale’s decision to specialize in self-driving car data, despite investor skepticism about market size, let it build a strong business quickly—even though that niche alone couldn’t support a gigantic company. That early success created the foundation and credibility to expand into language models, enterprise AI, and defense.
Data and evals are becoming the core strategic assets in AI.
Wang frames Scale as the “NVIDIA of data,” arguing that as models scale, data, environments, and hard evaluations become the true differentiators. He predicts that each firm’s core IP will be its own specialized, fine‑tuned models and the proprietary data and evals behind them—assets they must guard as tightly as codebases and databases.
The future of work is humans managing agents, not being replaced by them.
He describes an arc from AI as assistant, to single-agent pair programming, to swarms of agents handling complex workflows. In his view, the end state is a “manager of agents” economy where humans set vision, debug failures, coordinate agents, and satisfy human-driven demand, rather than being fully automated away.
Reinforcement learning and agentic workflows unlock new capability curves.
Wang notes that recent gains are less about pretraining scale and more about reasoning and RL-based techniques that turn messy human workflows into trainable environments. By converting repetitive, information-heavy processes into RL data (e.g., hiring briefs, research syntheses), organizations can systematically automate and improve them.
Hard, unsolved tasks are critical to steering the AI frontier.
Through Humanity’s Last Exam, Scale and research partners collect novel, extremely difficult scientific problems from top researchers, producing a benchmark that current models perform poorly on but rapidly improve against. Wang emphasizes that such hard evals both measure and shape progress, becoming the yardsticks labs optimize for.
WORDS WORTH SAVING
5 quotes
The need for data will basically grow to consume all available information and knowledge that humans have.
— Alexandr Wang
My belief is that the terminal state of the economy is just large‑scale humans manage agents, in a nutshell.
— Alexandr Wang
Startups have to switch from ‘What’s the narrowest market I can win?’ to ‘Where are the infinite markets, and how do I build toward them?’
— Alexandr Wang
The AI industry really continues to suffer from a lack of very hard evals and very hard tests that show the frontier of model capabilities.
— Alexandr Wang
You can tell people who are just phoning it in versus people who hang onto their work as so incredibly monumental and important that they do great work.
— Alexandr Wang
QUESTIONS ANSWERED IN THIS EPISODE
5 questions
If every company’s core IP becomes its specialized model and data, how should startups decide what to keep proprietary versus what to share or open-source?
What concrete policies or investments should the U.S. prioritize to maintain a durable lead over China in AI, especially around energy, compute, and supply chains?
How can organizations practically identify which of their current workflows are best suited to be converted into agentic, RL-trainable processes?
At what point do swarms of AI agents become so capable that even the “manager of agents” role starts to erode, and how would we recognize that threshold?
What are the ethical and strategic risks of moving toward agent-driven warfare, where critical military decisions can be compressed from days to minutes?