Dwarkesh Podcast

Jensen Huang on the Dwarkesh Podcast: Why CoWoS Is Nvidia's Moat

CoWoS and HBM commitments placed years in advance lock up supply before rivals can react; no challenger has posted InferenceMAX results matching Nvidia's tokens per watt.

Dwarkesh Patel (host) · Jensen Huang (guest)
Apr 15, 2026 · 1h 43m · Watch on YouTube ↗

CHAPTERS

  1. Why Nvidia won’t be “commoditized software”: electrons-to-tokens as the core value

    Dwarkesh frames Nvidia as “just software” that sends designs to a manufacturing chain, raising the fear that AI could commoditize Nvidia the way it might commoditize other software firms. Jensen argues the real product is the hard-to-commoditize transformation from “electrons to tokens,” requiring deep co-design and engineering across the stack.

  2. Supply chain as a moat: massive purchase commitments and ecosystem coordination

    Dwarkesh points to Nvidia’s huge upstream purchase commitments as a potential moat: others may have accelerators but can’t secure memory, packaging, and leading-edge logic. Jensen explains why Nvidia can align and motivate upstream investment—because its downstream demand is large, credible, and organized via its ecosystem.

  3. Can upstream capacity keep doubling? Bottlenecks, CoWoS, HBM, and “prefetching” constraints

    Dwarkesh challenges whether AI compute can keep scaling when Nvidia already dominates leading nodes like TSMC N3. Jensen argues shortages are normal, bottlenecks get swarmed and relieved within a few years, and Nvidia is now forecasting and investing earlier—especially in packaging and interconnect—while also driving efficiency gains.

  4. How far Nvidia pushes the chain: from TSMC to ASML, and why energy is the real long bottleneck

    Dwarkesh asks how you scale EUV and leading-edge logic fast enough and whether Nvidia directly pressures deeper-tier suppliers. Jensen says replication is feasible with the right demand signal and that most chip-capacity bottlenecks resolve in ~2–3 years, while energy policy and power buildout are longer-cycle constraints.

  5. TPUs and ASICs vs. Nvidia: accelerated computing’s breadth and programmability advantage

    Dwarkesh highlights that frontier models have been trained on TPUs and asks what that implies. Jensen argues Nvidia is not just a matrix-multiply engine: its accelerated computing platform spans many domains and remains broadly programmable, which matters as architectures and algorithms keep changing rapidly.

  6. Is CUDA still the moat when hyperscalers write custom kernels? Install base, trust, and Nvidia’s optimization help

    Dwarkesh presses that top customers can replace parts of CUDA with Triton and custom stacks, potentially eroding Nvidia’s margin advantage. Jensen replies that CUDA is an ecosystem and reliability foundation, and that Nvidia’s own engineers routinely extract large performance gains for labs because they know the “F1 car” architecture best.

  7. Why some big labs still use TPUs/Trainium: economics, investments, and the Anthropic exception

Dwarkesh asks why TPU/Trainium adoption happens at all if Nvidia has the best TCO. Jensen argues the apparent trend is concentrated—especially via Anthropic's unique deals—and that building "better than Nvidia" is harder than assumed, while ASIC margins aren't dramatically lower once vendors take their cut.

  8. Why Nvidia doesn’t become a hyperscaler: “as much as needed, as little as possible” and ecosystem investing

    Dwarkesh argues Nvidia has the cash to fund datacenters and rent compute directly. Jensen explains Nvidia focuses on the hard platform work only it can do, while letting cloud markets be served by many players; Nvidia may catalyze neo-clouds and foundation labs with investments, but doesn’t want to be a financier or compete with partners.

  9. GPU allocation and pricing: purchase orders, FIFO, and long-term trust with partners

    Dwarkesh suggests Nvidia may allocate scarce GPUs strategically rather than by price. Jensen rejects “highest bidder” pricing and emphasizes forecasting, purchase orders, FIFO allocation, and serving customers who are operationally ready, tying this approach to long-term trust with customers and suppliers like TSMC.

  10. Should we sell AI chips to China? Security risk vs. ecosystem control and avoiding market concession

    Dwarkesh raises cyber-offense risks (e.g., vulnerability discovery models) and argues marginal compute helps adversaries. Jensen counters that China already has substantial compute, energy, and talent; the bigger strategic risk is forcing a bifurcated ecosystem where open-source models and developers optimize for non-US stacks, weakening US long-term leadership.

  11. Process nodes, architecture, and why “7nm vs 2nm” isn’t the whole story

    The discussion extends China policy into a broader point: node advantage is not proportional to real-world AI performance. Jensen argues Blackwell’s gains over Hopper vastly exceed what lithography alone would suggest, emphasizing architecture, networking, and software-stack innovations as decisive—and warning against simplistic FLOP/node comparisons.

  12. Why Nvidia doesn’t pursue many radically different architectures—and when it might add new accelerators

Dwarkesh asks why Nvidia doesn’t run multiple divergent chip projects (wafer-scale, new programming models, etc.). Jensen says Nvidia evaluates alternatives in simulation and doesn’t see better options today, but notes that market segmentation in inference (premium low-latency tokens) is creating niches where Nvidia may incorporate specialized accelerators like Groq’s into the CUDA ecosystem.

  13. If deep learning hadn’t happened: Nvidia’s long-run mission of accelerated computing beyond AI

    Dwarkesh closes by asking what Nvidia would be without deep learning. Jensen argues the company’s foundational bet was always that general-purpose CPU scaling would stall and that domain-specific acceleration would drive progress across science, engineering, graphics, and data processing—AI being a powerful, but not exclusive, beneficiary.
