Jensen Huang on Dwarkesh Patel: Why CoWoS Is Nvidia's Moat
CoWoS and HBM commitments placed years early lock up supply before rivals can react; no challenger has posted InferenceMAX results matching Nvidia's tokens per watt.
At a glance
WHAT IT’S REALLY ABOUT
Jensen Huang on Nvidia’s moats, TPUs, clouds, and China policy
- Huang argues Nvidia’s core value is “electrons to tokens” conversion, where deep co-design across chips, systems, networking, and software makes commoditization unlikely.
- He frames Nvidia’s supply-chain moat as a function of credible downstream demand that convinces upstream partners (foundries, packaging, memory, photonics) to invest and clear bottlenecks within ~2–3 years.
- On TPUs/ASICs, he claims GPUs win because AI progress depends on programmability and frequent algorithmic shifts (not just matrix multiply), plus Nvidia’s install base and engineering support for customers’ stacks.
- He explains why Nvidia avoids becoming a hyperscaler: clouds already exist, while Nvidia focuses on the “hard part” (platform + ecosystem) and selectively invests to ensure partners and AI labs can scale.
- In a long debate on China export controls, Huang contends China has ample energy, talent, and chip capacity to progress regardless, and that conceding the China market risks splitting ecosystems and weakening the American tech stack long-term.
IDEAS WORTH REMEMBERING
Nvidia’s moat is full-stack co-design, not just chip specs.
Huang repeatedly emphasizes that big gains (e.g., Hopper→Blackwell) come from architecture, networking (NVLink/Spectrum-X), numerics, kernels, and software libraries working together—not merely Moore’s Law transistor scaling.
Supply-chain leverage comes from credible demand and ecosystem coordination.
He claims upstream partners invest because Nvidia can reliably absorb supply and sell through massive downstream demand; events like GTC help align upstream and downstream players around the size and timing of what’s coming.
Most bottlenecks are solvable quickly if the demand signal is clear; energy is slower.
Huang argues packaging, fabs, and even EUV capacity can ramp in a few years once planning is aligned, while grid/energy policy and physical buildout of power infrastructure are the longer-duration constraints.
Programmability matters because AI algorithms change faster than hardware cycles.
He argues a TPU-like systolic array can be great for today’s kernels, but rapid leaps require trying new attention mechanisms, hybrid architectures, disaggregation, and new kernels—work that benefits from a general programmable platform (CUDA).
CUDA’s defensibility is install base plus Nvidia’s hands-on performance engineering.
Even if hyperscalers can write their own kernels, Huang says Nvidia’s engineers (and AI-assisted optimization) can often unlock 1.5–3× speedups, and developers still want portability across the hundreds of millions of GPUs deployed in clouds, at the edge, and in robots.
WORDS WORTH SAVING
The input is electrons, the output is tokens. That is, in the middle, Nvidia.
— Jensen Huang
We should do as much as needed, as little as possible.
— Jensen Huang
None of the bottlenecks last longer than a couple, two, three years. None of them.
— Jensen Huang
Our GPUs… are kind of like F1 racers.
— Jensen Huang
Comparing AI to anything that you just mentioned is lunacy.
— Jensen Huang