Why The Next AI Breakthroughs Will Be In Reasoning, Not Scaling

Harj Taggar on AI’s Next Leap: Reasoning Engines Transform Science, Chips, And Startups

Hosts: Harj Taggar, Garry Tan, Jared Friedman, Diana Hu

Nov 14, 2024 · 35m

Sam Altman’s AGI/ASI timeline and techno-optimist vision for abundance

OpenAI’s o1 model and the shift from pure scaling to advanced reasoning

Real-world applications of o1 in chip/PCB design and CAD/airfoil engineering

Reinforcement learning, chain-of-thought training, and o1’s architectural inspirations

Startup strategy: moats via evals, proprietary data, and deep integrations

AI transforming rote work like large-scale customer support

New startup opportunities in hard tech and the physical/atom world

In this episode of Y Combinator, hosts Harj Taggar, Garry Tan, Jared Friedman, and Diana Hu discuss Sam Altman’s recent AGI/ASI essay and argue that the next major AI breakthroughs will come from reasoning-focused models like OpenAI’s o1, not just larger model scaling. They highlight how o1’s chain-of-thought and reinforcement-learning-inspired architecture unlocks step-function improvements in hard domains such as chip design, CAD/airfoils, and complex customer support. Several YC-backed companies are showcased as concrete examples of o1 enabling system-level engineering, advanced physics simulations, and high-accuracy automation that older models like GPT-4o couldn’t reliably handle. The conversation closes with implications for startup moats, the centrality of evals and proprietary data, and how reasoning models may usher in a “Star Trek” style age of physical-world abundance if steered well.

Key Takeaways

Reasoning-focused models mark a step-change beyond simple scaling.

OpenAI’s o1 introduces chain-of-thought and reinforcement-learning-style training that lets models ‘think through’ problems, enabling capabilities (like complex chip system design) that GPT-4o could not handle with the same prompts.

AI can now meaningfully automate expert-level hardware and engineering tasks.

Companies like Diode Computer and Camfer use o1 to perform high-level PCB system design, component selection, and multi-equation airfoil simulations—work that previously required specialized electrical or mechanical engineers.

Eval sets and proprietary workflows are becoming core startup moats.

The hosts argue that writing tens of thousands of high-quality eval cases, especially using non-public, domain-specific data, is a durable advantage when everyone accesses similar base models.

Advanced reasoning dramatically increases automation viability in messy domains.

Gigaml’s customer support product cut its error rate on hard cases from about 70% to roughly 5% using o1 and rigorous evals (the episode quote cites a jump from 0% to 85% accuracy), making AI agents credible for complex, non-rules-based support.

Strong technical teams matter more, not less, in the o1 era.

While some fear AI will commoditize engineering, the panel believes the highest value will accrue to teams that can push models the final 10%—through clever prompts, evals, UI, integrations, and domain-specific reasoning.

The biggest upside is in the physical world: atoms, not just bits.

Given o1’s strength in math and physics, the hosts see major new opportunities in mechanical, electrical, chemical, and bioengineering—areas like fusion, fluid mechanics, and advanced materials that can drive real-world abundance.

We’re early in a rapidly compounding capability curve.

The current o1-preview is already transformative, with full o1, o2, and o3 expected soon, and Altman targeting up to four orders of magnitude more compute—implying today’s impressive abilities are the worst these systems will ever be.
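
The eval-set takeaway above can be made concrete with a minimal sketch. It assumes an eval case is just a prompt paired with an expected answer and that accuracy is the share of exact matches; the names EvalCase, accuracy, and toy_model, and the sample support questions, are hypothetical and do not come from the episode or from any company mentioned in it.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class EvalCase:
    """One hypothetical eval case: an input and the answer we expect."""
    prompt: str
    expected: str


def accuracy(cases: Iterable[EvalCase], model: Callable[[str], str]) -> float:
    """Fraction of cases where the model's answer matches the expected one."""
    cases = list(cases)
    if not cases:
        raise ValueError("eval set is empty")
    correct = sum(
        model(case.prompt).strip().lower() == case.expected.strip().lower()
        for case in cases
    )
    return correct / len(cases)


# Stand-in for a real model call (assumption); a lookup table keeps this runnable.
def toy_model(prompt: str) -> str:
    canned = {
        "Refund for order placed 10 days ago?": "approve",
        "Refund for order placed 45 days ago?": "deny",
    }
    return canned.get(prompt, "escalate")


# Made-up cases for illustration only.
CASES = [
    EvalCase("Refund for order placed 10 days ago?", "approve"),
    EvalCase("Refund for order placed 45 days ago?", "deny"),
    EvalCase("Customer was double charged?", "approve"),
    EvalCase("Change shipping address after dispatch?", "escalate"),
]

score = accuracy(CASES, toy_model)
print(f"accuracy={score:.2f} error={1 - score:.2f}")
```

Swapping toy_model for a call to a different base model and rerunning the same cases is what lets a team compare models on its own domain; a real eval set would likely need fuzzier scoring than exact string match.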

Notable Quotes

“It’s the worst that these models are ever going to be right now, right this moment.”
— Garry

“What was missing from its ability to actually do science and accelerate technological progress is it needs to be able to think through things.”
— Jared

“They went from 0% accuracy to 85% accuracy.”
— Diana (about Gigaml’s o1-powered customer support)

“All of the value is probably going to be captured by the strongest technical teams who can build on top of whatever the base level of tech is and get the final 10%.”
— Harj

“It can’t just be helping people click a little bit faster. It’s gotta be things that actually create real-world abundance for everyone.”
— Garry

Questions Answered in This Episode

How does o1’s chain-of-thought and reinforcement learning approach fundamentally differ from traditional next-token prediction models?

What kinds of engineering and scientific problems are likely to be unlocked first as o1 and its successors improve?

How can startups practically build and maintain large, high-quality eval sets that become defensible moats?

What governance or safety mechanisms are needed if AI systems start designing chips and physical systems beyond typical human oversight?

In a world where base models handle more reasoning, where should human experts focus their effort to stay uniquely valuable?
