Dwarkesh Podcast

Leopold Aschenbrenner — 2027 AGI, China/US super-intelligence race, & the return of history

Chatted with my friend Leopold Aschenbrenner about the trillion dollar cluster, unhobblings + scaling = 2027 AGI, CCP espionage at AI labs, leaving OpenAI and starting an AGI investment firm, dangers of outsourcing clusters to the Middle East, & The Project.

Read the new essay series from Leopold this episode is based on here: https://situational-awareness.ai/

EPISODE LINKS

* Transcript: https://www.dwarkeshpatel.com/p/leopold-aschenbrenner
* Apple Podcasts: https://podcasts.apple.com/us/podcast/leopold-aschenbrenner-china-us-super-intelligence-race/id1516093381?i=1000657821539
* Spotify: https://open.spotify.com/episode/5NQFPblNw8ewxKolIDpiYN?si=6NaTHAugT2SxZrspW3lziw
* Follow me on Twitter: https://twitter.com/dwarkesh_sp
* Follow Leopold on Twitter: https://x.com/leopoldasch

TIMESTAMPS

00:00:00 The trillion-dollar cluster and unhobbling
00:21:20 AI 2028: The return of history
00:41:15 Espionage & American AI superiority
01:09:09 Geopolitical implications of AI
01:32:12 State-led vs. private-led AI
02:13:12 Becoming Valedictorian of Columbia at 19
02:31:24 What happened at OpenAI
02:46:00 Intelligence explosion
03:26:47 Alignment
03:42:15 On Germany, and understanding foreign perspectives
03:57:53 Dwarkesh's immigration story and path to the podcast
04:03:16 Random questions
04:08:47 Launching an AGI hedge fund
04:20:03 Lessons from WWII
04:29:57 Coda: Frederick the Great

Leopold Aschenbrenner (guest), Dwarkesh Patel (host)
Jun 4, 2024 · 4h 32m

CHAPTERS

  1. 0:00 – 7:46

    Why trillion-dollar AI clusters are plausible (compute, power, CapEx trajectories)

    Leopold lays out a straight-line extrapolation from GPT-4-scale training to gigawatt and eventually “trillion-dollar” clusters, emphasizing electricity demand, data center buildouts, and industrial scaling. He argues that the key constraint becomes power generation and infrastructure, not just algorithms or code.

    • Training compute scaling trend (~0.5 orders of magnitude, or OOM, per year) as the backbone of the forecast
    • Translating model scale into cluster cost, GPU counts, and megawatts/gigawatts
    • Why inference demand may dwarf training as AI products proliferate
    • Energy production as the binding constraint and a coming ‘industrial’ phase for AI
  2. 7:46 – 11:37

    From cluster size to capabilities: the ‘drop-in remote worker’ vision of AGI

    Dwarkesh pushes for a concrete mapping from 1GW/10GW clusters to model capability. Leopold predicts near-term models exceeding typical college graduates, and then ‘AGI’ as an agent that can run long-horizon tasks like a coworker—less chatbot, more autonomous worker integrated into tools and workflows.

    • 2025–26: models smarter than most college grads (per-token intelligence)
    • AGI around the 10GW era (late 2020s) as an ‘agent’ rather than a chatbot
    • ‘Unhobbling’ as the practical unlock: tool use, computer control, long-horizon tasks
    • Economic diffusion may be delayed until models become easy to integrate into workflows
  3. 11:37 – 15:22

    Unlocking the test-time compute overhang (System 2, planning, and RL)

    Leopold argues that today’s models are bottlenecked less by raw intelligence and more by their inability to reliably iterate, correct errors, and plan over long contexts. He frames the next leap as enabling “System 2” cognition—planning tokens, self-critique, and error correction—so models can effectively spend millions of tokens per problem.

    • Test-time compute as a huge latent capability multiplier (millions of tokens)
    • Why current models ‘get stuck’ despite partial chain-of-thought abilities
    • Two routes to agents: scaling reliability vs ‘System 2’ unhobbling
    • Analogy: driving on autopilot vs concentrating in a construction zone
  4. 15:22 – 21:55

    Why ‘unhobbling’ might be feasible: pretraining as representation magic + RL bootstrapping

    Dwarkesh challenges whether System 2 is hard to acquire; Leopold responds that pretraining creates rich world representations and that RL/synthetic data can distill deliberate cognition into weights. He compares human self-learning (reading, struggling, feedback loops) to RL’s potential to generate the most informative training signals.

    • Pretraining’s real gift: representations and generalization, not ‘just next-token’
    • RLHF as prior unhobbling; next step is agentic cognition and self-improvement loops
    • Synthetic data / self-play as the path past the ‘data wall’
    • Human learning analogy: read → think → practice → fail → click → distill
  5. 21:55 – 25:05

    The ‘midgame’ societal reaction: AI’s next COVID-like moment and the return of history

    Leopold predicts repeated ‘2023-like’ shocks as each model generation shifts expectations and accelerates CapEx and revenue. He frames the coming period as a return to historical stakes—where nations mobilize industrially and politically when they believe decisive power is on the line.

    • Quiet periods hide compounding progress; big jumps reset the Overton window
    • Revenue scaling as the engine that finances ever-larger clusters
    • Analogy to early-2020 COVID perception vs abrupt societal mobilization
    • ‘Return of history’: societies can re-enter wartime-like mobilization modes
  6. 25:05 – 58:08

    From AGI to superintelligence: automating AI R&D and compressing centuries of progress

    Leopold outlines an ‘intelligence explosion’ narrative: once AGI can do AI research, it accelerates algorithmic progress dramatically, then spills into broader R&D and robotics. He emphasizes that even small temporal leads could translate into overwhelming military and technological advantages.

    • Automating AI research as the highest-leverage early application
    • Potential for ‘decades of ML progress in a year’ once research is scaled
    • Robotics and industrial capacity as downstream consequences of cognitive superintelligence
    • Military implications: compressing a century of tech progress into <10 years
  7. 58:08 – 1:11:45

    Espionage and the fragility of leads: why secrets and weights matter

    Dwarkesh asks whether China can simply catch up via parallel invention; Leopold argues that stolen weights and algorithmic secrets could erase US leads, and that current lab security is far below state-adversary standards. He distinguishes between today’s low-stakes model theft and the catastrophic implications once weights correspond to AGI/ASI.

    • Threat model 1: stealing weights (copying the ‘end product’)
    • Threat model 2: stealing algorithmic breakthroughs needed to surpass the data wall
    • Why even 6–12 months of lead can be decisive under fast takeoff dynamics
    • Current frontier labs’ security posture described as ‘startup-level,’ not state-resistant
  8. 1:11:45 – 1:18:35

    State-level capabilities and escalation: Stuxnet, SCIFs, and ‘nuclear deterrence for data centers’

    Leopold argues that true defense against state espionage requires measures resembling military programs: air-gapped environments, vetted hardware/software supply chains, clearances, and government involvement. He also highlights a destabilizing window where superintelligence exists before full industrialization, making data centers tempting first-strike targets.

    • Examples of state espionage: zero-days, coercion, deep infiltration tactics
    • Security escalation path: from economic espionage defenses to state-adversary defenses
    • Data centers as vulnerable strategic assets during early superintelligence
    • Potential need for extreme deterrence logic to prevent kinetic/sabotage attacks
  9. 1:18:35 – 1:30:50

    Can the US and China cooperate? Arms control instability under intelligence explosion incentives

    Dwarkesh pushes for a cooperative frame; Leopold argues arms control is hardest when breakout is decisive and fast, unlike stable nuclear MAD equilibria. He proposes a sequencing logic: first build an undeniable democratic-coalition lead, then offer a bargain (“Atoms for Peace”-style benefit sharing) from a position of strength.

    • Why arms control is easier when marginal advantage doesn’t change outcomes (nukes)
    • Why AGI/ASI breakout could be highly destabilizing if a small lead becomes decisive
    • Proposal: narrow democratic coalition for frontier development + broader benefit-sharing tier
    • Offer China a deal only after a clear, enforceable lead exists
  10. 1:30:50 – 4:32:06

    State-led vs private-led AI: what ‘government project’ could mean and why privatization may fail

    The conversation turns to governance structure: Dwarkesh worries that nationalization concentrates power in the monopoly-on-violence institution, while Leopold worries private labs cannot secure against states and could yield chaotic multipolar races (companies + China/Russia). Leopold sketches checks-and-balances ideas (Congress, courts, constitutionally constrained AIs, allied coalitions) while conceding the path likely becomes government-involved regardless.

    • Why open-source/decentralized AGI is unlikely under trillion-dollar cluster economics
    • Private-lab world risks: thin leads, internal chaos, proliferation, and inadequate security
    • Government-project vision: public–private partnerships, allied pooling, legal constraints
    • Debate over reversibility: nationalization as ‘one-way’ vs crisis-driven late mobilization
