Dwarkesh Podcast

Dylan Patel — The single biggest bottleneck to scaling AI compute

Dylan Patel, founder of SemiAnalysis, provides a deep dive into the 3 big bottlenecks to scaling AI compute: logic, memory, and power. He also walks through the economics of labs, hyperscalers, foundries, and fab equipment manufacturers. Learned a ton about every single level of the stack. Enjoy!

EPISODE LINKS

* Transcript: https://www.dwarkesh.com/p/dylan-patel
* Apple Podcasts: https://podcasts.apple.com/us/podcast/dylan-patel-deep-dive-on-the-3-big-bottlenecks-to/id1516093381?i=1000755126873
* Spotify: https://open.spotify.com/episode/5qiibwoBWY5rXyflK7WJzH?si=SX4ajSKXT-KeNtaHsiTNzw

SPONSORS

- Mercury has already saved me a bunch of time this tax season. Last year, I used Mercury to request W-9s from all the contractors I worked with. Then, when it came time to issue 1099s this year, I literally just clicked a button and Mercury sent them out. Learn more at https://mercury.com
- Labelbox noticed that even when voice models appear to take interruptions in stride, their performance degrades. To figure out why, they built a new evaluation pipeline called EchoChain. EchoChain diagnoses voice models’ specific failure modes, letting you understand what your model needs to truly handle interruptions. Check it out at https://labelbox.com/dwarkesh
- Jane Street is basically a research lab with a trading desk attached – and their infrastructure backs this up. They’ve got tens of thousands of GPUs, hundreds of thousands of CPU cores, and exabytes of storage. This is what it takes to find subtle signals hidden deep within noisy market data. If this sounds interesting, you can explore open positions at https://janestreet.com/dwarkesh

To sponsor a future episode, visit https://dwarkesh.com/advertise.

TIMESTAMPS

00:00:00 – Why an H100 is worth more today than 3 years ago
00:24:52 – Nvidia secured TSMC allocation early; Google is getting squeezed
00:34:34 – ASML will be the #1 constraint for AI compute scaling by 2030
00:55:47 – Can't we just use TSMC's older fabs?
01:05:37 – When will China outscale the West in semis?
01:16:01 – The enormous incoming memory crunch
01:42:34 – Scaling power in the US will not be a problem
01:54:44 – Space GPUs aren't happening this decade
02:14:07 – Why aren't more hedge funds making the AGI trade?
02:18:30 – Will TSMC kick Apple out from N2?
02:24:16 – Robots and Taiwan risk

Dwarkesh Patel (host), Dylan Patel (guest)
Mar 13, 2026 · 2h 30m

At a glance

WHAT IT’S REALLY ABOUT

AI compute scaling faces semiconductor tools, memory, and allocation bottlenecks

  1. Hyperscaler AI CapEx comes online over multiple years because large portions are pre-spent on long-lead items like turbines, power agreements, and data center buildouts well ahead of GPU deployment.
  2. In a supply-constrained world, older GPUs can become more valuable over time because model improvements raise the economic output per GPU faster than hardware obsolescence lowers it.
  3. By the late 2020s, the dominant bottleneck to scaling AI compute shifts upstream to semiconductor manufacturing—especially ASML’s EUV tool output—because fabs and tools have multi-year lead times that can’t be “sped up” like data centers.
  4. A major near-term constraint is memory (HBM/DRAM): AI demand pulls wafer capacity away from consumer devices, driving higher prices and potentially shrinking low- and mid-range smartphone/PC markets.
  5. Power, land, and permitting can be worked around via “behind-the-meter” generation and alternative sources, making them secondary constraints compared with chip, memory, and tool supply chains; space data centers are therefore unlikely this decade.

IDEAS WORTH REMEMBERING

5 ideas

AI CapEx is front-loaded into long-lead infrastructure, not just GPUs.

Patel argues much of Big Tech’s headline CapEx is deposits and commitments for future power (turbines, PPAs) and data center construction years out, enabling rapid scaling later even if this year’s deployed GW is smaller.

Compute scarcity flips GPU “depreciation” intuition.

If compute is supply-constrained, a GPU's price reflects the value it can extract today rather than how it compares with newer chips; with better models running cheaper per token, an H100 can be worth more now than it was years ago despite newer generations.

Early long-term compute contracts create enduring margin advantages.

Labs that locked multi-year deals earlier avoid today’s higher spot/shorter-term rates; late buyers may pay materially higher $/GPU-hour or accept revenue-share markups through hyperscaler platforms.

By ~2028–2030, ASML EUV tools become the hard scaling governor.

He estimates ~2 million EUV “passes” per 1 GW of leading-edge AI compute and ~3.5 EUV tools per GW; with EUV tool production only rising from ~70/year toward ~100+/year, tool availability constrains total chip output.
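The arithmetic behind this ceiling can be sketched in a few lines. This is a rough illustration only: it takes the ~3.5 tools per GW figure and the ~70 and ~100 tools/year production rates from the estimate above, and it assumes (purely for illustration) that every newly built EUV tool serves leading-edge AI compute, which makes the result an upper bound. The function name `gw_supported` is hypothetical.

```python
# Figures are Patel's estimates as quoted in this summary, not independent data.
EUV_TOOLS_PER_GW = 3.5  # ~3.5 EUV tools per 1 GW of leading-edge AI compute


def gw_supported(tools: float, tools_per_gw: float = EUV_TOOLS_PER_GW) -> float:
    """Gigawatts of AI compute that `tools` EUV tools could supply.

    Assumption (illustrative): all of these tools are dedicated to AI chips.
    Real fabs also serve non-AI customers, so treat this as an upper bound.
    """
    return tools / tools_per_gw


if __name__ == "__main__":
    for tools_per_year in (70, 100):
        gw = gw_supported(tools_per_year)
        print(f"{tools_per_year} tools/year -> at most ~{gw:.1f} GW/year of new AI compute")
```

Under these assumptions, one year of tool output at ~70/year supports at most about 20 GW of new leading-edge AI compute, rising to roughly 29 GW at ~100/year, which is why tool output rather than data center construction ends up setting the pace.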

“Just use older fabs” is possible but inefficient and not equivalent.

Older nodes lose more than FLOPs per dollar suggests because multi-chip scaling penalties (latency and bandwidth across dies and racks) dominate; architectural advances (networking, memory hierarchy, packaging) don’t port cleanly to 7nm-era designs.

WORDS WORTH SAVING

5 quotes

an H100 is worth more today than it was three years ago.

Dylan Patel

by '28, '29, the bottleneck falls to the lowest rung on the supply chain, which is ASML, right? ASML makes the world's most complicated machine, i.e. an EUV tool.

Dylan Patel

it's funny to think about the numbers, right? Because we're talking about, oh, what's the gigawatt cost? It costs like fifty billion dollars roughly, right? Whereas what does three and a half EUV tools cost? That's like one point two, right?

Dylan Patel

It is, it is much further out once you have energy constraints actually being a big bottleneck, once you have space, land permitting be a much bigger bottleneck as it subsumes more and more of the economy, um, and, and chips are no longer the bottleneck.

Dylan Patel

it costs like close to fifty gigawatts. Now, obviously, we're not putting on fifty gigawatts this year

Dwarkesh Patel

Why H100 value rose despite newer chips
Long-term vs spot GPU contracts and margin dynamics
Nvidia/TSMC N3 allocation and hyperscaler squeeze
ASML EUV throughput as 2030 ceiling; EUV passes per gigawatt math
Why older-node fallback (7nm/DUV) is not a clean escape hatch
HBM/DRAM bandwidth vs capacity tradeoffs; impending memory crunch
Power scaling via behind-the-meter generation vs chips as bottleneck
China’s DUV/EUV trajectory and when scale could rival the West
Space data centers: networking, reliability, deployment-time penalties
TSMC/Apple on N2 and shifting priority toward AI customers
Taiwan concentration risk and limits of “airlifting engineers”

High quality AI-generated summary created from speaker-labeled transcript.
