Jensen Huang on the Dwarkesh Podcast: Why CoWoS Is Nvidia's Moat
CoWoS and HBM purchase commitments placed years in advance lock up supply before rivals can react; no challenger has posted InferenceMAX results matching Nvidia's tokens per watt.
FREQUENTLY ASKED QUESTIONS
Direct answers grounded in the episode transcript. Tap any timestamp to verify against the source.
What does Jensen Huang mean by Nvidia turning electrons into tokens?
Jensen Huang uses 'electrons to tokens' as Nvidia's basic job description. His point is that Nvidia sits in the middle of the transformation from electrical power and hardware into useful AI tokens. He says that transformation is hard to commoditize because making one token more valuable than another requires artistry, engineering, science, invention, and manufacturing knowledge that is still far from fully understood. The company tries to do 'as much as necessary, as little as possible': it handles the parts only Nvidia must do, then partners with upstream supply chain companies, downstream computer companies, application developers, and model makers across what he calls a five-layer AI cake.
▸ 1:33 in transcript
How do Nvidia's purchase commitments create a supply-chain moat?
Nvidia's supply-chain moat comes from credible demand, not just prepaying for scarce parts. Huang says the explicit purchase commitments are only one part of the story. The bigger mechanism is that Nvidia can persuade upstream CEOs to invest because they believe Nvidia can buy their supply and sell it through downstream demand. He describes spending time informing, inspiring, and aligning suppliers and ecosystem partners about what is coming, how large it will be, and when it will arrive. GTC matters in that story because it brings the AI universe together: upstream suppliers see downstream customers, downstream customers see upstream capacity, and both see AI startups and model makers firsthand. His argument is that scale and fast business turns let Nvidia build a supply chain for a future that may reach trillions of dollars in size.
▸ 5:01 in transcript
What is Jensen Huang's argument about TPUs and changing AI models?
Huang's TPU argument is that AI needs programmability beyond matrix multiplication. He does not deny that matrix multiplies matter for AI, but he says they are only one part of the workload. His examples are new attention mechanisms, different ways to disaggregate computation, hybrid SSM architectures, and models that fuse diffusion with autoregressive methods. Those experiments require a generally programmable architecture, because AI advances by changing the algorithm and how it is computed, not just by waiting for Moore's Law. He says Moore's Law improves around twenty-five percent per year, while AI sometimes needs 10X or 100X leaps. In his telling, CUDA and Nvidia's co-design across processors, systems, fabric, libraries, and algorithms make those leaps easier.
▸ 20:59 in transcript
Why doesn't Nvidia become a hyperscaler and rent GPUs directly?
Huang says Nvidia stays out of cloud renting because clouds already exist. His broader rule is 'do as much as needed, as little as possible.' Nvidia should build the computing platform, NVLink, CUDA, CUDA-X libraries, domain-specific libraries, and ecosystem pieces that Huang believes would not exist without Nvidia taking the risk. Cloud rental is different. He says the world already has lots of clouds, so somebody will show up if Nvidia does not become one. Instead, Nvidia invests or helps where the ecosystem needs it, including CoreWeave, Nscale, Nebius, foundation model companies, and later OpenAI. The goal is not to make Nvidia a financier or to do every possible business. It is to keep the business model simple while supporting an ecosystem that lets AI connect with industries, countries, and the American tech stack.
▸ 44:15 in transcript
How does Nvidia allocate GPUs during shortages?
Huang says GPU allocation starts with forecasts, purchase orders, and customer readiness. In the shortage discussion, he rejects the idea that Nvidia simply hands scarce GPUs to favored neo-clouds or the highest bidder. First, Nvidia works with customers on forecasts because GPUs and data centers take a long time to build. Then the customer still has to place a purchase order. Without a PO, he says, 'all the talking in the world won't make a difference.' After that, the default rule is first in, first out. Nvidia may adjust the order if a buyer's data center or supporting components are not ready, because serving a ready customer maximizes factory throughput. On pricing, Huang says Nvidia sets a price and does not raise it just because demand spikes, because he wants Nvidia to be dependable.
▸ 51:51 in transcript
Answers are AI-generated from the transcript and may contain errors. Tap a question to verify against the source.