Dwarkesh Podcast

Jensen Huang on the Dwarkesh Podcast: Why CoWoS Is Nvidia's Moat

CoWoS and HBM commitments placed years early lock up supply before rivals can react; no challenger has posted InferenceMax results matching Nvidia's tokens per watt.

Dwarkesh Patel (host) · Jensen Huang (guest)
Apr 15, 2026 · 1h 43m · Watch on YouTube ↗

EVERY SPOKEN WORD

  1. 0:00–16:25

    Is Nvidia’s biggest moat its grip on scarce supply chains?

    1. DP

      We've seen the valuations of a bunch of software companies crash because people are expecting AI to commoditize software. And there's a potentially naive way of thinking about things, which is: look, Nvidia sends a GDSII file to TSMC. TSMC builds the logic dies, it builds the switches, then it packages them with the HBM that SK Hynix and Micron and Samsung make. Then it sends it to an ODM in Taiwan where they assemble the racks. And so Nvidia is fundamentally making software that other people are manufacturing, and if software gets commoditized, does Nvidia get commoditized?

    2. JH

      Well, in the end, something has to transform electrons into tokens. That transformation, and making those tokens more valuable over time, I think is hard to completely commoditize. The transformation from electrons to tokens is such an incredible journey. It's like making one molecule more valuable than another, making one token more valuable than another. The amount of artistry, engineering, science, and invention that goes into making that token valuable, we're watching it happen in real time. And so the transformation, the manufacturing, all of the science that goes into it is far from deeply understood-

    3. DP

      Mm.

    4. JH

      ... and the journey is far from over. So I doubt that it will happen. We're going to make it more efficient, of course. In fact, the way you framed the question is my mental model of our company. The input is electrons, the output is tokens. In the middle is Nvidia, and our job is to do as much as necessary, and as little as possible, to enable that transformation to be done at incredible capability. What I mean by as little as possible: whatever I don't need to do, I partner with somebody on and make part of my ecosystem. If you look at Nvidia today, we probably have the largest ecosystem of partners, both upstream and downstream in the supply chain: all the computer companies, all the application developers, all the model makers. AI is a five-layer cake, if you will, and we have ecosystems across the entire five layers. So we try to do as little as possible, but the part that we have to do, as it turns out, is insanely hard.

    5. DP

      Mm.

    6. JH

      And I don't think that gets commoditized. In fact, I also don't think the enterprise software companies, the tools makers... Most software companies today are tool makers. Some are not; some are workflow-codification systems. But a lot of companies are tool makers. Excel is a tool, PowerPoint is a tool, Cadence makes tools, Synopsys makes tools. I actually see the opposite of what people see. I think the number of agents is going to grow exponentially. The number of tool users is going to grow exponentially, and it's very likely that the number of instances of all these tools is going to skyrocket. It's very likely the number of instances of Synopsys Design Compiler is going to skyrocket, along with the number of agents using the floor planners and all of our layout tools and design rule checkers. Today we're limited by the number of engineers. Tomorrow, those engineers are going to be supported by a bunch of agents, and we're going to explore the design space like you've never seen it explored before, using the tools we use today. So I think tool use is going to cause the software companies to skyrocket. The reason it hasn't happened yet is that agents aren't good enough at using their tools yet. Either these companies are going to build the agents themselves, or agents are going to get good enough to use those tools, and I think it's going to be a combination of both.

    7. DP

      Mm. I think in your latest filings, you had almost $100 billion in purchase commitments with suppliers: foundries, memory, packaging. And SemiAnalysis has reported that you will have $250 billion in these kinds of purchase commitments. So one interpretation is that Nvidia's moat is really that you've locked up many years of these scarce components. Somebody else might have an accelerator, but can they actually get the memory to build it? Can they actually get the logic to build it? And this is really Nvidia's big moat for the next few years.

    8. JH

      Well, it's one of the things that we can do that is hard for someone else to do. We've made enormous commitments upstream. Some of it is explicit, the commitments you mentioned. Some of it is implicit. For example, a lot of the upstream investments are made by our supply chain because I said to the CEOs, "Let me tell you how big this industry is going to be, let me explain to you why, let me reason through it with you, and let me show you what I see." As a result of that process of informing, inspiring, and aligning with CEOs across all the industries upstream, they're willing to make the investments. Now, why are they willing to make the investments for me and not someone else? Because they know that I have the capacity to buy their supply and sell it through my downstream. The fact that Nvidia's downstream supply chain and downstream demand is so large is why they're willing to make the investment upstream. And so if you look at GTC, people marvel at the scale of it and the people who go. It's a 360-degree view, the entire universe of AI in one place, and they're all in one place because they need to see each other. I bring them together so the downstream can see the upstream, the upstream can see the downstream, and all of them can see all the advances in AI. Very importantly, they can all meet the AI natives and the AI startups being built and all the amazing things that are happening, so they can see firsthand all the things that I tell them. So I spend a lot of my time informing, directly or indirectly, our supply chain and our partners and our ecosystem about the opportunity in front of us. Some people always say, "You know, Jensen, most keynotes are one announcement after another, after another, after another." There's always a part of our keynotes that's a little torturous, in the sense that it almost comes across like education. And in fact, that's exactly what's on my mind. I need to make sure that the entire supply chain, upstream and downstream, and the ecosystem understand what is coming at us, why it's coming, when it's coming, how big it's going to be, and can reason about it systematically, just as I reason about it. So the moat, as you describe it: we're able to build for a future. If our next several years is a trillion dollars in scale, we have the supply chain to do it. And then there's our reach and the velocity of our business: just as there's cash flow, there's supply chain flow; there are turns. Nobody's going to build a supply chain for an architecture if that architecture's business turns are low. Our ability to sustain the scale is only because our downstream demand is so great, and they see it, they all hear about it, they see it all coming. That's what allows us to do the things we do at the scale we do them.

    9. DP

      Mm. I do want to understand more concretely whether the upstream can keep up. For many years now, you've been 2x-ing revenue year over year. You've been more than tripling the amount of FLOPs you provide to the world year over year.

    10. JH

      And 2x-ing at this scale is really incredible.

    11. DP

      Exactly.

    12. JH

      Yeah.

    13. DP

      So then you look at logic, say. You're-

    14. JH

      Mm-hmm

    15. DP

      ... the biggest customer on TSMC's N3 node, and you're one of the biggest on N2. AI as a whole is going to be sixty percent of N3 this year, and eighty-six percent next year, according to some analyses. How do you 2x if you're already the majority? And how do you do that year over year? Are we in a regime now where the growth rate in AI compute has to slow because of the upstream? Do you see a way around it? Ultimately, how do we build 2x more fab capacity year over year?
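      [A quick back-of-the-envelope on the figures in that question: if AI already consumes a fraction $f = 0.6$ of a node's wafer output $S$, then doubling AI's output would require $2 \times 0.6\,S = 1.2\,S > S$ wafers. Share gains alone cap growth at $1/0.6 \approx 1.67\times$; anything beyond that has to come from growing $S$ itself, i.e., new fabs.]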

    16. JH

      Yeah, at some level, the instantaneous demand is greater than the supply, upstream and downstream, in the world. And at any instant, we could be limited by the number of plumbers-

    17. DP

      Mm-hmm

    18. JH

      ... which actually happens.

    19. DP

      The plumbers are invited to next year's GTC.

    20. JH

      [laughs] Yeah. By the way, great idea. But that's a good condition. You want a market, an industry, where the instantaneous demand is greater than the total supply of the industry. The opposite is obviously less good. If we're too far apart, if one particular component is too far away, obviously the industry swarms it. So, for example, I notice people aren't talking very much about CoWoS anymore.

    21. DP

      Yeah.

    22. JH

      And the reason for that is because for two years we swarmed the living daylights out of it, and we doubled and doubled and doubled, several doublings, and now I think we're in fairly good shape. TSMC now knows that CoWoS supply has to keep up with the rest of the logic demand and the memory demand, so they're scaling CoWoS, and future packaging technologies, at the same level as they scale logic, which is terrific. For a long time, CoWoS was rather a specialty, and HBM memory was rather a specialty, but they're not specialties anymore. People now realize they're mainstream computing technology. And of course, we're now much more able to influence a larger scope of our supply chain. In the beginning of the AI revolution, all the things that I say now, I was saying five years ago, and some people believed it and invested in it. For example, Sanjay and the Micron team. I still remember the meeting really well, where I was clear about exactly what was going to happen and why, the predictions that describe today, and they really doubled down on it. We partnered with them, and across LPDDR, across HBM memories, they really invested, and it has obviously been tremendous for the company. Some people came a little bit later, but now they're all here. Each one of these bottlenecks gets a great deal of attention. And now we're prefetching the bottlenecks years in advance. For example, the investments we've made with Lumentum and Coherent and the whole silicon photonics ecosystem over the last several years: we really reshaped that ecosystem and supply chain. We built up an entire supply chain around TSMC, partnered with them on COUPE, invented a whole bunch of technology, and licensed those patents to the supply chain to keep it nice and open. So we're preparing the supply chain through invention of new technologies, new workflows, new testing equipment, double-sided probing, investing in companies, helping them scale up their capacity. You can see that we're trying to shape the ecosystem and the supply chain so that it's ready to support the scale.

    23. DP

      It seems like some bottlenecks are easier than others, and so scaling up CoWoS versus scaling up-

    24. JH

      I went to the hardest one, by the way.

    25. DP

      Which is?

    26. JH

      Plumbers.

    27. DP

      [laughs]

    28. JH

      Yeah.

    29. DP

      That's true.

    30. JH

      Yeah, yeah. I actually went to the hardest one.

  2. 16:25–41:06

    Will TPUs break Nvidia’s hold on AI compute?

    1. DP

      Okay, I want to ask about your competitors.

    2. JH

      Yeah.

    3. DP

      So if you look at TPUs: arguably two of the top three models in the world, Claude and Gemini, were trained on TPUs. What does that mean for Nvidia going forward?

    4. JH

      Well, we build a very different thing. What Nvidia built is accelerated computing, not a tensor processing unit. Accelerated computing is used for all kinds of things: molecular dynamics and quantum chromodynamics, data processing, data frames, structured data, unstructured data. It's used for fluid dynamics, particle physics. And in addition, we use it for AI. So accelerated computing is much more diverse, and although AI is the conversation today, and it's obviously very important and impactful, computing is much broader than that. What Nvidia has done is reinvent the way computing is done, from general-purpose computing to accelerated computing. Our market reach is far greater than any TPU, any ASIC, can possibly have. If you look at our position, we're the only company that accelerates applications of all kinds. We have a gigantic ecosystem, and so all kinds of frameworks and algorithms run on Nvidia. And because our computers are designed to be operated by other people, anyone who's an operator can buy our systems. With most of these home-built systems, you have to be your own operator, because they were never designed to be flexible enough for other people to operate. As a result of the fact that anybody can operate our systems, we're in every cloud, including Google and Amazon and Azure and OCI, right? So whether you want to operate it to rent, or operate it for yourself: if you want to operate it to rent, you'd better have a large ecosystem of customers in many industries to be the off-takers. If you want to operate it for yourself, we obviously have the ability to help you do that, as we do, for example, for Elon with xAI. And because we can enable operators in any company in any industry, you can use it to build a supercomputer for scientific research and drug discovery at Lilly. We can help them operate their own supercomputer and use it for the entire diversity of drug discovery and biological sciences that we accelerate.

    5. DP

      Mm.

    6. JH

      And so there are just a whole bunch of applications that we can address that you can't with TPUs, because Nvidia built CUDA to be a fantastic tensor processing unit as well, but it handles every life cycle of data processing and computing and AI and so on. Our market opportunity is just a lot larger. Our reach is a lot greater, and because we basically support every application in the world now, you could build Nvidia systems anywhere and know that there will be customers for them.

    7. DP

      Mm.

    8. JH

      And so it's a very different thing.

    9. DP

      This is going to be sort of a long question. You have spectacular revenue, and this revenue is mostly... you're not making sixty billion a quarter from pharma and quantum. You're making it because AI is an unprecedented technology that is growing unprecedentedly fast. So then the question is, what is best for AI specifically? I'm not in the details, but I talk to my AI researcher friends, and they say, "Look, when I use a TPU, it's this big systolic array that's perfect for doing matrix multiplies, whereas a GPU is very flexible. It's great when you have lots of branching, when you have irregular memory access." But what is AI? Just these very predictable matrix multiplies again and again and again, and you don't have to give up any die area for warp schedulers, for switches between threads and memory banks. So the TPU is really optimized for the bulk of this growth in revenue and use cases for AI compute that is coming online right now. I wonder how you react to that.

    10. JH

      Matrix multiplies are an important part of AI, but they're not the only part of AI. If you want to come up with a new attention mechanism, or if you want to disaggregate in a different way, or come up with a whole new type of architecture altogether, for example a hybrid SSM, or create a model that somehow fuses diffusion and autoregressive generation, you want an architecture that's generally programmable. And we run everything you can imagine. That's the advantage. It allows for invention of new algorithms a lot more easily.
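      [To make the programmability point concrete, here is a minimal, hypothetical sketch in plain NumPy, with all names and shapes invented for illustration rather than taken from any real framework, of top-k mixture-of-experts routing. The matrix multiplies are regular, but which tokens reach which expert is decided at runtime; that data-dependent branching and irregular gathering is what a fixed-function matmul pipeline handles poorly and a programmable machine expresses directly.]

```python
import numpy as np

# Toy mixture-of-experts routing: per-token top-k expert selection.
# Names and shapes are illustrative, not any real framework's API.
rng = np.random.default_rng(0)
n_tokens, d_model, n_experts, k = 8, 16, 4, 2

tokens = rng.standard_normal((n_tokens, d_model))
router = rng.standard_normal((d_model, n_experts))
experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]

logits = tokens @ router                   # dense matmul: the "regular" part
topk = np.argsort(logits, axis=1)[:, -k:]  # data-dependent branching per token
gates = np.take_along_axis(logits, topk, axis=1)
gates = np.exp(gates) / np.exp(gates).sum(axis=1, keepdims=True)

out = np.zeros_like(tokens)
for e in range(n_experts):
    # Irregular gather: each expert sees a different, runtime-chosen token subset.
    rows, slots = np.nonzero(topk == e)
    if rows.size:
        out[rows] += gates[rows, slots, None] * (tokens[rows] @ experts[e])
```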

    11. DP

      Mm.

    12. JH

      And so... because it's a programmable system. And the ability to invent new algorithms is really what makes AI advance so quickly. TPUs, like anything else, are bound by Moore's Law, and Moore's Law is now improving about twenty-five percent per year. So the only way to really get 10x leaps, 100x leaps, is to fundamentally change the algorithm and how it's computed every single year.
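      [Taking the twenty-five percent figure at face value, the compounding works out to $1.25^{1} = 1.25$, $1.25^{4} \approx 2.4$, $1.25^{10} \approx 9.3$: process scaling alone delivers roughly one order of magnitude per decade. A 10x or 100x gain inside a single product generation therefore has to come from changing the algorithm, the numerics, and the system together, which is the argument being made here.]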

    13. DP

      Mm.

    14. JH

      And that's Nvidia's fundamental advantage. The only reason we were able to make Blackwell 50 times Hopper... I said 35 times, and when I first announced that Blackwell was going to be 35 times more energy efficient than Hopper, nobody believed it. And then Dylan wrote an article saying that, in fact, I sandbagged: it's actually 50 times. You can't reasonably do that with just Moore's Law. The way we solve that problem is new models, MoEs, parallelized and disaggregated and distributed across a computing system. Without the ability to really get down and come up with new kernels with CUDA, it's really hard to do. So it's the combination of the programmability of our architecture, the fact that Nvidia is an extreme co-design company, where we can even offload some of the computation into the fabric itself, into NVLink, for example, or into the network with Spectrum-X, and that we can effect change across the processors, the system, the fabric, the libraries, and the algorithms, all simultaneously. Without CUDA to do that, I wouldn't even know where to start.

    15. DP

      My sponsor Crusoe was among the first clouds to offer Nvidia's Blackwell and Blackwell Ultra platforms, and they just announced their Nvidia Vera Rubin deployment, scheduled for later this year. But access to state-of-the-art hardware is only part of the story. For example, most inference engines already do KV caching for a single user's forward passes. Crusoe does it across users and GPUs. So if a thousand agents are running on the same system prompt, Crusoe only has to compute the KV cache once for it to become available to every single GPU in the cluster. This is especially important as systems get more agentic and require much longer prefixes in order to use tools and access files. In a recent benchmark, Crusoe was able to deliver up to ten times faster time-to-first-token and up to five times better throughput than vLLM. This is just one among many reasons to run your inference workload with Crusoe. And if you need GPUs for training, you don't need to switch clouds; Crusoe's got you covered there too. Go to crusoe.ai/thorcache to learn more.

      So this gets at an interesting question about Nvidia's clientele, where sixty percent of your revenue is coming from these big five hyperscalers. In a different era, with different customers, say professors running experiments, they're helped a bunch by... they need CUDA. They can't use another accelerator. They need to just run PyTorch with CUDA and have everything optimized. But these hyperscalers have the resources to write their own kernels. In fact, they have to, to get that extra last five percent they need for their specific architecture. Anthropic and Google are mostly running their own accelerators, running TPUs and Trainium. And even OpenAI, using GPUs, has Triton; they said, "We need our own kernels." So instead of going down to CUDA C++ and using cuBLAS and NCCL and everything, they've got their own stack, which compiles to other accelerators as well. And so if most of your customers can, and do, make replacements for CUDA, to what extent is CUDA really the thing that is going to make frontier AI happen on Nvidia?
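      [A minimal sketch of the cross-user prefix caching described in the Crusoe segment above, assuming a toy stand-in for tokenization and attention, with a hash-keyed cache and illustrative names. Production engines cache per token block and share the cache across GPUs, but the compute-once, reuse-everywhere idea is the same.]

```python
import hashlib
import numpy as np

rng = np.random.default_rng(0)
D = 32  # toy head dimension
Wk = rng.standard_normal((D, D))
Wv = rng.standard_normal((D, D))

# Cache keyed by a hash of the shared prefix (real engines hash token blocks).
prefix_cache: dict[str, tuple[np.ndarray, np.ndarray]] = {}

def embed(text: str) -> np.ndarray:
    # Stand-in for tokenizer + embedding: one random vector per character.
    return rng.standard_normal((len(text), D))

def kv_for_prefix(prefix: str) -> tuple[np.ndarray, np.ndarray]:
    """Run the expensive prefill once per distinct prefix; reuse it afterwards."""
    key = hashlib.sha256(prefix.encode()).hexdigest()
    if key not in prefix_cache:
        h = embed(prefix)
        prefix_cache[key] = (h @ Wk, h @ Wv)  # K and V for the shared prefix
    return prefix_cache[key]

system_prompt = "You are one of 1,000 agents sharing this system prompt."
for _ in range(1000):              # a thousand agents, one prefill
    K, V = kv_for_prefix(system_prompt)
assert len(prefix_cache) == 1      # the prefix KV was computed exactly once
```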

    16. JH

      CUDA is a rich ecosystem, and so if you want to build on any computer, building on CUDA first is incredibly smart. Because the ecosystem is so rich, we support every framework. If you want to create custom kernels... for example, we contribute enormously to Triton, and so the back end of Triton contains huge amounts of Nvidia technology. We're delighted to help every framework become as great as it can be, and there are lots and lots of frameworks. There's Triton, there's vLLM, there's SGLang, and there's more, right? And now there's a whole bunch of new reinforcement learning frameworks coming out: verl, NeMo RL, a whole bunch more. With post-training and reinforcement learning, that entire area is just exploding. And so if you want to build on an architecture, building on CUDA makes the most sense, because you know that the ecosystem is great. You know that if something happens, it's more likely in your code and not in the mountain of code underneath. Don't forget the amount of code you're dealing with when you're building these systems. When something doesn't work, was it you or was it the computer? You would like it always to be you, and to be able to trust the computer. Obviously, we still have lots and lots of bugs ourselves, but our system is so well wrung out that you can at least build on top of the foundation. So that's number one: the richness of the ecosystem, the programmability of it, the capability of it. The second thing is, if you're a developer building anything at all, the single most important thing you want, more than anything, is install base. You want the software you write to run on a whole bunch of other computers. You're not building software just for yourself. You're building software for your fleet, or for everybody else's fleet, because you're a framework builder. And Nvidia's CUDA install base is ultimately its great treasure. We are now at, I don't know how many, several hundred million GPUs. Every cloud has it. It goes back to the A10, A100, H100, H200, the L series, the P series. There's a whole bunch of them, in all kinds of sizes and shapes. And if you're a robotics company, you want that CUDA stack to actually run in the robot itself. We're literally everywhere. So the install base means that once you develop the software, once you develop the model, it's going to be useful everywhere. That's just incredibly valuable. And lastly, the fact that we're in every single cloud makes us genuinely unique, because if you're an AI company, an AI developer, you're not exactly sure which CSP you're going to partner with and where you'd like to run, and we run everywhere, including on-prem for you if you like. So I think the richness of the ecosystem, the expansiveness of the install base, and the versatility of where we are: that combination makes CUDA invaluable.

    17. DP

      That makes a lot of sense. I guess the thing I'm curious about is whether those advantages matter a lot to your main customers. There are many people who they might matter for, but the kind of company that can actually build its own software stack makes up most of your revenue. Especially if we're headed to a world where AI gets very good at tasks with tight verification loops that RL works well on. The question of how to write a kernel that does attention or an MLP most efficiently across a scale-up domain is a very verifiable feedback loop. So can all the hyperscalers write these custom kernels for themselves? Nvidia still has great price performance, so they might still prefer to use Nvidia. But then does it just become a question of who offers the best specs, the best FLOPS and memory and memory bandwidth for a given dollar? Historically, Nvidia has had, and still has, the best margins in all of AI across hardware and software, seventy percent plus, because of this CUDA moat. And the question is, can you sustain those margins if most of your customers can actually afford to build around the CUDA moat instead?

    18. JH

      The number of engineers we have assigned to these AI labs is insane.

    19. DP

      Mm.

    20. JH

      Working with them, optimizing their stack. And the reason for that is that nobody knows our architecture better than we do, and these architectures are not as general-purpose as a CPU. A CPU is kind of like a Cadillac, you know? It's a nice cruiser. It never goes too fast. Everybody drives it pretty well. It's got cruise control, and everything is easy. But in a lot of ways, Nvidia's GPUs, our accelerators, are like F1 racers. I could imagine everybody's able to drive one at a hundred miles an hour, but it takes quite a bit of expertise to push it to the limit. And we use a ton of AI to create the kernels that we have. I'm pretty sure we're still going to be needed for quite some time. So our expertise often helps our AI lab partners get another 2x out of their stack, easily. It's not unusual that by the time we're done optimizing their stack, or optimizing a particular kernel, their model has sped up by 3x, 2x, fifty percent. That's a huge number, especially across the install base of the fleet that they have, all the Hoppers and Blackwells. When you increase it by a factor of two, that doubles their revenues. It directly translates to revenue. Nvidia's computing stack is the best performance per TCO in the world, bar none. Nobody can demonstrate to me that any single platform in the world today has a better performance-to-TCO ratio, not one company. And in fact, the benchmarks are out there. Dylan's InferenceMax is sitting there for everybody to use, and not one... TPU won't come, Trainium won't come. I encourage them to use InferenceMax and demonstrate their incredible inference cost. It's really, really hard. Nobody wants to show up. And MLPerf: I would welcome Trainium to demonstrate the forty percent advantage they claim all the time. I would love to hear them demonstrate the cost advantage of TPUs. It makes no sense in my mind. It makes absolutely zero sense. On first principles, it makes no sense. So I think the reason we're so successful is simply that our TCO is so great. There's a second point. You say sixty percent of our revenue is the top five, but most of that business is external. For example, most of Nvidia in AWS is for external customers, not internal use. The same at Azure, and all of our customers at OCI are external, not internal use. The reason they favor us is that our reach is so great, we can bring them all of the great customers in the world. They're all built on Nvidia. And the reason all these companies are built on Nvidia is that our reach and our versatility are so great. So I think the flywheel really is the install base, the programmability of our architecture, the richness of our ecosystem, and the fact that there are so many AI companies in the world. There are tens of thousands of them now.

    21. DP

      Mm.

    22. JH

      And if you were one of those AI startups, what architecture would you choose? You would choose the architecture that's most abundant: we're the most abundant in the world. The one with the largest install base: we have the largest install base. And the one with a rich ecosystem. So that's the flywheel. It's the combination of, one, our perf per dollar is so great that they get the lowest-cost tokens. Second, our perf per watt is the highest in the world. If one of our partners builds a one-gigawatt data center, that data center had better deliver the maximum revenue, the maximum number of tokens, which directly translates to revenue; you want it to generate as many tokens as possible and maximize the revenue of that data center. We are the highest tokens-per-watt architecture in the world. And lastly, if your goal is to rent the infrastructure out, we have the most customers in the world.

    23. DP

      Mm.

    24. JH

      And so that's the reason why the flywheel works.

    25. DP

      Interesting. I guess the question comes down to what the actual market structure is here. Because even if there are other companies, there could have been a world where tens of thousands of AI companies have roughly equal shares of compute. But if, even through these five hyperscalers, the people really using the compute on Amazon are Anthropic, OpenAI, and these big foundation labs, who can themselves afford, and have the ability, to make different accelerators work-

    26. JH

      No, I think your assumption, your premise, is wrong.

    27. DP

      Maybe. Um-

    28. JH

      Yeah.

    29. DP

      But let me ask you-

    30. JH

      Yeah

  3. 41:06–57:36

    Why doesn’t Nvidia become a hyperscaler?

    1. JH

      to do it.

    2. DP

      This is actually quite interesting. For many years, Nvidia has been the company in AI making money, making lots of money. And now you're investing it. It's been reported that you've committed up to thirty billion to OpenAI and ten billion to Anthropic. But now their valuations have increased, and I'm sure they'll continue to increase. Over all these years, you were giving them the compute, you saw where AI was headed, and they were worth maybe one-tenth of what they are now a couple of years ago, or even a year ago in some cases. And you had all this cash. So there's a world where Nvidia either becomes a foundation lab itself, makes a huge investment to make that possible, or makes the deals you've made now, at current valuations, much earlier on. You had the cash to do it. So I'm actually curious why you didn't do it earlier.

    3. JH

      We did it as soon as we could have. We did it as soon as we could have. And if I could have, I would have done it even earlier. At the time that Anthropic needed us to do it, we just weren't in a position to do it. It wasn't in our sensibility to do so.

    4. DP

      How so? Like a cash thing or just-

    5. JH

      Yeah, the level of investment. We had never invested outside the company at the time, and not at that scale. And we didn't realize we needed to. I always thought they could just go raise from VCs, for God's sake, like all companies do. But what they were trying to do couldn't have been done through VCs. What OpenAI wanted to do couldn't have been done through VCs, and I recognize that now. I didn't know it then, you know? But that's their genius. That's why they're smart.

    6. DP

      [chuckles]

    7. JH

      You know? And so they realized then that they had to do something like that, and I'm delighted that they did. And even though we caused Anthropic to have to go to somebody else, I'm still happy that it happened. Anthropic's existence is great for the world.

    8. DP

      Mm.

    9. JH

      I'm, I'm delighted for it.

    10. DP

      I guess you're still making a ton of money, and you're making way more money quarter after quarter.

    11. JH

      It's still okay to have regrets. [both laughing]

    12. DP

      So the question still arises: okay, now that we're here and you have all this money you keep making, what should Nvidia be doing with it? One answer says, look, there's this whole middleman ecosystem that has popped up for converting CapEx into OpEx for these labs so they can rent compute, because the chips are really expensive. The chips make a lot of money over their lifetime, because the AI models keep getting better and the value they generate through tokens is increasing, but they're expensive to set up. Nvidia has the money to do the CapEx. And in fact you are: it's been reported you're backstopping CoreWeave up to $6.3 billion and have invested $2 billion. So why doesn't Nvidia become a cloud itself? Why doesn't it become a hyperscaler itself-

    13. JH

      Right

    14. DP

      ... and rent this compute out? You have all this cash to do it.

    15. JH

      This is a philosophy of the company, and I think it is wise: we should do as much as needed, and as little as possible. What that means is that the work we do building our computing platform, if we don't do it, I genuinely believe it doesn't get done. If we didn't take the risks that we take, if we didn't build NVLink the way we built it, if we didn't build the whole stack, if we didn't create the ecosystem the way we did, if we didn't dedicate ourselves to twenty years of CUDA while losing money most of that time, nobody else would have done it. If we didn't create all the CUDA-X libraries so that they're all domain-specific... this was a decade and a half ago; we pushed into domain-specific libraries because we realized that if we didn't create them, whether for ray tracing or image generation or even the early works of AI, for structured data processing or vector data processing, nobody would. And I am completely certain of that. We created a library for computational lithography called cuLitho. If we didn't create it, nobody would have. So accelerated computing wouldn't have advanced the way it has if we didn't do what we did. We should do that. We should dedicate our company, all of our might, wholeheartedly, to doing that. However, the world has lots of clouds. If I didn't do it, somebody would show up. And so following that recipe, the philosophy of doing as much as needed but as little as possible, that philosophy exists in our company today, and everything I do, I do through that lens. In the case of clouds: if we didn't support CoreWeave, these neoclouds, these AI clouds, wouldn't exist. If we didn't help CoreWeave exist, they would not exist. If we didn't support Nscale, they wouldn't be where they are today. If we didn't support Nebius, they wouldn't be where they are today. Now they're doing fantastically. Is that a business model that works for us? No. We should do as much as needed, as little as possible. So we invest in our ecosystem because I want the ecosystem to thrive, and I want the architecture, and AI, to be able to connect with as many industries as possible, as many countries as possible, and make it possible for the planet to be built on AI, and built on the American tech stack.

    16. DP

      Mm.

    17. JH

      And so that vision, I think, is exactly what we're pursuing. Now, one of the things you mentioned: there are so many great, amazing foundation model companies, and we try to invest in all of them. This is another thing that we do. We don't pick winners. We need to support everyone, and it's part of our joy to do so. It's imperative to our business, but we also go out of our way not to pick winners. So when I invest in one of them, I invest in all of them.

    18. DP

      Why do you go out of your way not to pick winners?

    19. JH

      Because it's not our job to, number one. Number two, when Nvidia first started, there were 60 graphics companies, 60 3D graphics companies. We are the only one that survived. If you had taken those 60 graphics companies and asked yourself which one was going to make it-

    20. DP

      Mm

    21. JH

      ... Nvidia would have been at the top of the list not to make it. This is long before your time, but Nvidia's graphics architecture was precisely wrong. Not a little bit wrong: we created an architecture that was precisely wrong, and it was an impossible thing for developers to support. It was never going to make it. We reasoned about it from good first principles, but we ended up with the wrong solution. Everybody would have counted us out, and here we are. So I have enough humility to recognize: don't pick winners.

    22. DP

      Mm.

    23. JH

      Yeah.

    24. DP

      Um-

    25. JH

      Either let them all take care of themselves or take care of all of them.

    26. DP

      One thing I didn't understand is: you said, "Look, we're not prioritizing these neoclouds just because they are neoclouds and we want to prop them up." But you also listed a bunch of neoclouds and said they wouldn't exist if it wasn't for Nvidia.

    27. JH

      Yeah.

    28. DP

      And so how are those two things-

    29. JH

      Oh

    30. DP

      ...compatible?

  4. 57:36–1:35:06

    Should we be selling AI chips to China?

    1. JH

      important.

    2. DP

      Okay, I want to ask about China.

    3. JH

      Yep.

    4. DP

      And I always like to take... I actually don't know what I think about whether it's good to sell chips to China or not, but I like to play devil's advocate against my guests. So when-

    5. JH

      Mm-hmm.

    6. DP

      Dario was on, who supports export control, I asked him-

    7. JH

      Yeah.

    8. DP

      "Well, why can't America and China both have country of geniuses in the data center?" But since, um, you're on the opposite side, I'll ask you in the opposite way. Um, and look, one way to think about it is, uh, Anthropic actually announced a couple days ago Mythos preview, um, this model Mythos they're not even releasing publicly because they say it has such cyber offensive capabilities that we don't think the world is ready until we get, we make sure these zero days are patched up. But they say it found thousands of high severity vulnerabilities across every major operating system, every browser. It found one in OpenBSD, which is this operating system that has been specifically designed to not have zero days, and it found one, uh, for 27 years it's existed. Um, and so if Chinese companies and Chinese labs and the Chinese government had access to the AI chips to train a model like Claude Mythos with these cyber offensive capabilities and run millions of instances of it with more compute, the question is, oh, is that a threat to American companies, to American national security?

    9. JH

      First of all, Mythos was trained on fairly mundane capacity, and a fairly mundane amount of it, by an extraordinary company. The amount of capacity and the type of compute it was trained on is abundantly available in China. So you first have to realize that chips exist in China. They manufacture 60% of the world's mainstream chips, maybe more. It's a very large industry for them. They have some of the world's greatest computer scientists. As you know, most of the AI researchers in all of these AI labs are Chinese. They have 50% of the world's AI researchers. So the question is, if you're concerned about them, considering all the assets they already have, an abundance of energy, plenty of chips, most of the AI researchers, what is the best way to create a safe world? Victimizing them, turning them into an enemy, likely isn't the best answer. They are an adversary. We want the United States to win. But I think having a dialogue, having a research dialogue, is probably the safest thing to do. This is an area that is glaringly missing because of our current attitude toward China as an adversary. It is essential that our AI researchers and their AI researchers are actually talking. It is essential that we try to agree on what not to use AI for. With respect to finding bugs in software: of course, that's what AI is supposed to do. Is it going to find bugs in a lot of software? Of course. There are lots and lots of bugs. There are lots of bugs in AI software. That's what AI is supposed to do, and I'm delighted that AI has reached a level where it can help us be so much more productive. One thing that's underemphasized is the richness of the ecosystem around cybersecurity: AI cybersecurity, AI security, AI privacy, AI safety. That whole ecosystem of AI startups is trying to create a future for us where one incredible AI agent is surrounded by thousands of AI agents keeping it safe and keeping it secure. That future surely is going to happen. The idea that you'd have an AI agent running around with nobody watching over it is kind of insane. So we know very well that this ecosystem needs to thrive. It turns out this ecosystem needs open source. It needs open models. It needs open stacks, so that all of these AI researchers and all these great computer scientists can build AI systems that are just as formidable and can keep AI safe. So one of the things we need to make sure we do is keep the open-source ecosystem vibrant. That can't be ignored. And a lot of it is coming out of China. We ought not to suffocate that. With respect to China: of course, we want the United States to have as much computing as possible. We're limited by energy, but we have a lot of people working on that, and we ought not to make energy a bottleneck for our country. But what we also want is to make sure that all the AI developers in the world are developing on the American tech stack, and making the contributions and advancements of AI, especially open source, available to the American ecosystem.
      And it would be extremely foolish to create two ecosystems: an open-source ecosystem that runs only on a foreign tech stack, and a closed ecosystem that runs on the American tech stack. I think that would be a horrible outcome for the United States.

    10. DP

      Hmm. Since there are a lot of threads there, let me triage the response. I think the concern, going back to the FLOP difference and the hacking, is: yes, they have compute, but there are estimates that because they're at seven nanometer, because they don't have EUV due to chipmaking export controls, the amount of FLOPs they're able to actually produce is something like one-tenth of what the U.S. has. With that, could they eventually train a model like Mythos? Yes. But because we have more FLOPs, American labs get to these levels of capability first, and because Anthropic got there first, they can say, "Okay, we're going to hold onto it for a month, give these American companies access to it, let them patch up all their vulnerabilities, and then we release it." Furthermore, even if they trained a model like this, the ability to deploy it at scale matters: a cyber attacker is much more dangerous with a million instances than with a thousand, so that inference compute really matters a lot. And in fact, the fact that they have so many AI researchers who are so good is exactly what makes it so scary, because the thing that makes those researchers more productive is compute. If you talk to any AI lab in America, they say the thing bottlenecking them is compute. And there are quotes from the DeepSeek founder or Qwen leadership saying, "The thing we're bottlenecked on is compute." So then the question is: isn't it better that American companies, because they have more compute, get to Spot- or Mythos-level capabilities first and prepare our society for it, before China can get there with less compute?

    11. JH

      We should always be first, and we should always have more. But for the outcome you described to be true, you have to take it to the extremes. They would have to have no compute. And if they have some compute, the question is how much is needed. The amount of compute they have in China is enormous. I mean, you're talking about a country. It's the second-largest computing market in the world. If they want to aggregate their compute, they have plenty of compute to aggregate.

    12. DP

      But is that true? I mean, people do these estimates, and they're like, "Well, SMIC is actually behind on the process node," so they-

    13. JH

      I'm about to tell you.

    14. DP

      Okay.

    15. JH

      The amount of energy they have is incredible. Isn't that right? AI is a parallel computing problem, isn't it? Why can't they just put four, ten times as many chips together? Because energy's free. They have so much energy. They have data centers sitting completely empty, fully powered. They have ghost cities. They have ghost data centers. They have so much infrastructure capacity. If they wanted to, they could just gang up more chips, even if they're seven nanometer, and their capacity for building chips is one of the largest in the world. The semiconductor industry knows that they monopolize mainstream chips. They have overcapacity. They have too much capacity. So the idea that China won't be able to have AI chips is complete nonsense. Now, of course, if you ask me, would the United States be further ahead if the rest of the world had no compute at all? Sure. But that's just not an outcome. That's not a scenario that's true. They have plenty of compute already. Whatever threshold of compute is needed for the concern you're worried about, they've already reached it and gone beyond. So I think you misunderstand: AI is a five-layer cake, and the lowest layer is energy. When you have an abundance of energy, it makes up for chips. When you have an abundance of chips, it makes up for energy. For example, the United States is scarce on energy, which is the reason Nvidia has to keep advancing our architecture and doing this extreme co-design, so that with the chips we ship, because the amount of energy is so limited, our throughput per watt is off the charts. But if your watts are completely abundant, if they're free, what do you care about performance per watt? You can use old chips. Seven-nanometer chips are essentially Hopper, and I've got to tell you, today's models are largely trained on Hopper, the Hopper generation. So seven-nanometer chips are plenty good. The abundance of energy is their advantage.
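      [The energy-versus-efficiency trade being described reduces to a single identity: $\text{tokens/s} = P\,[\text{W}] \times \eta\,[\text{tokens/s/W}]$. So a fleet of older chips with, say, one-third the tokens-per-watt ($\eta$) can match a power-constrained fleet's throughput by spending three times the watts ($P$). Abundant energy substitutes for process leadership; scarce energy, the U.S. case on this telling, forces $\eta$ up instead.]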

    16. DP

      But then there's a question of, okay, well, can they actually manufacture enough chips given their-

    17. JH

      But they do. What's the evidence? Huawei just had the largest single year in the history of their company.

    18. DP

      How many chips did they ship?

    19. JH

      A ton. Millions. Millions is way more than Anthropic has.

    20. DP

      So there's a question of how much logic SMIC can ship, then there's a question of how much memory-

    21. JH

      I'm telling you what it is. They have plenty of logic, and they have plenty of HBM2 memory.

    22. DP

      Right. But as you know, the bottleneck in training and doing inference on these models is often memory bandwidth. HBM2... I don't know the numbers offhand, but versus the newest thing you have, you can be almost an order of magnitude apart in memory bandwidth, which is huge.

    23. JH

      Huawei's a networking company. Huawei's a networking company.

    24. DP

      But that doesn't change the fact that you need EUV for the most advanced HBM.

    25. JH

      Not true. Not at all true. You can gang them together, just like we gang them together with NVLink 72. They've already demonstrated silicon photonics connecting all of this compute together into one giant supercomputer. Your premise is just wrong. The fact of the matter is, their AI development is going just fine. And the best AI researchers in the world, because they are limited in compute, also come up with extremely smart algorithms. Remember what I just said: Moore's Law is advancing about twenty-five percent per year. However, through great computer science, we can still improve algorithm performance by 10x. What I'm saying is that great computer science is where the lever is. There is no question MoE is a great invention. There's no question all the incredible attention mechanisms reduce the amount of compute. We have to acknowledge that most of the advances in AI came out of algorithmic advances, not just the raw hardware. Now, if most advances came from algorithms and computer science and programming, tell me that their army of AI researchers is not their fundamental advantage. And we see it: DeepSeek is not an inconsequential advance. And the day that DeepSeek comes out on Huawei first, that is a horrible outcome for our nation.

    26. DP

      Why? I mean, currently you can have a model like DeepSeek-

    27. JH

      Because DeepSeek-

    28. DP

      ...that can run on any accelerator if it's open source. Why would that stop being the case in the future?

    29. JH

      Well, suppose it doesn't. Suppose it optimizes for Huawei, for their architecture. It would put ours at a disadvantage. You described a situation that I perceived to be good news: a company developed software, developed an AI model, and it runs best on the American tech stack. I saw that as good news. You set it up as a premise that it was bad news. I'll give you the bad news: AI models around the world get developed, and they run best on non-American hardware. That is bad news for us.

    30. DP

      I guess I just don't see the evidence of these huge disparities that would prevent you from switching accelerators. American labs are running their models across all the clouds, across all the different accelerators.

  5. 1:35:06–1:43:10

    Why doesn’t Nvidia make multiple different chip architectures?

    1. DP

      We can move on from China, but that actually raises an interesting question about the bottlenecks at TSMC and memory we were discussing earlier. If we're in this world where you're already the majority of N3, and at some point you'll be the majority of N2, do you see that you could go back to N7, the spare capacity at an older process node, and say, "Hey, the demand for AI is so great, and our capacity to expand the leading edge is not meeting it, so we're going to make a Hopper or an Ampere, but with everything we know about numerics today and all the other improvements you described"? Do you see that world happening before 2030?

    2. JH

      It's not necessary to. And the reason is that with every generation, the architecture is more than just the transistor scaling. You're doing so much engineering in packaging and stacking and the numerics and the system architecture. When you run out of capacity, going back to another node is a level of R&D that no one could afford. We can afford to lean forward; I don't think we can afford to go back. Now, let's do the thought experiment: if on that day the world says, "Listen, we're just never going to have more capacity ever again," would I go back and use seven? In a heartbeat-

    3. DP

      Mm.

    4. JH

      ... of course I would.

    5. DP

      Mm. Um, one question somebody I was talking to had is why Nvidia doesn't run multiple different chip projects at the same time with totally different architectures, so you could do like a Cerebras style-

    6. JH

      Mm-hmm

    7. DP

      ... wafer scale, you could do a Dojo style huge package. You could do one without CUDA. You know, um, you have the resources and the engineering talent to do all of these in parallel. So why put all the eggs in one basket given who knows where AI might go and architectures might go?

    8. JH

      Oh, we could. It, it's just that, that, uh, we don't have a better idea.

    9. DP

      Mm.

    10. JH

      Yeah, yeah. We, we could do all of those things. Um, I-- It's just not better. And we simulate it all. They're provably worse in our simulator, and so we wouldn't do it. Yeah. We're, we're doing-- we're working on exactly the projects that we wanna work on. And, and, um, uh, if the workload were to change dramatically, um, and I don't mean, I don't mean the algorithms, I actually mean the workload, the, um... and that, that depends on the s- shape of the market. Um, uh, we may decide to add other accelerators. Like for example, recently we added, uh, Groq. Um, and we're gonna fold Groq into our CUDA ecosystem. And, and, um, uh, we do-- we're, we're doing that now because the value of tokens, um, has gone up so high that, that you could have different pricing of tokens. Back in the old days, in the, you know, just a couple years ago, tokens were either free or barely, you know, barely expensive, right? And so, but now you can have different customers, and those customers want different answers. And so because the customers make so much money, like for example, our software engineers, if I can give them much more, um, responsive tokens so that they're even more productive than they are today, I would pay for it. But that market has only recently emerged.

    11. DP

      Mm.

    12. JH

      And so I think that we now have, we now have the ability to take the same model and, based on the response time, have different segments, and that's the reason why we decided to expand the Pareto frontier and, and create a segment of inference that has a faster response time, even though it's lower, lower throughput. At the mo-- Until now, higher throughput was always better. Um, we, we think that there, there could be a world where there could be very high ASP tokens and, and, um, uh, even though the res- even though the throughput is lower in the factory, the ASPs make up for it.
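
      A back-of-the-envelope sketch of the segmentation described here: the same inference factory can serve a high-throughput batch segment or a faster-response, lower-throughput premium segment, and a higher ASP can more than offset the throughput loss. Every number below is an illustrative assumption, not a figure from the episode.

      ```python
      # Back-of-the-envelope revenue for one slice of an inference factory.
      # Both segments and all prices/throughputs are illustrative assumptions.

      segments = {
          # name: (tokens served per second, price in $ per million tokens)
          "batch / high throughput": (100_000, 0.50),
          "premium / fast response": (20_000, 5.00),  # 5X less throughput, 10X the ASP
      }

      for name, (tokens_per_sec, usd_per_mtok) in segments.items():
          revenue_per_hour = tokens_per_sec * 3600 / 1e6 * usd_per_mtok
          print(f"{name}: ${revenue_per_hour:,.0f}/hour")

      # batch / high throughput: $180/hour
      # premium / fast response: $360/hour  (at these assumed numbers the
      # higher ASP more than makes up for the lower factory throughput)
      ```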

    13. DP

      Yeah.

    14. JH

      That's the reason why we did it. But otherwise, from an architecture perspective, um, I, I think Nvidia's architectures, I would, I would rather put-- If I, if I had more money, I'd put more behind the architecture.

    15. DP

      Mm. I, I, I think this i- idea of extremely premium tokens and just the disaggregation of the inference market is very interesting. It-

    16. JH

      The segmentation of it.

    17. DP

      Yeah.

    18. JH

      Yep.

    19. DP

      Yeah. All right, final question. Um, suppose the deep learning revolution didn't happen. Um, what would Nvidia be doing? Obviously games, but given the-

    20. JH

      Accelerated computing.

    21. DP

      Hmm.

    22. JH

      Accelerated computing. The, the same thing we've been doing all along. Uh, the, the premise of our company is that Moore's Law, Moore's Law is going to-- More general purpose computing is good for a lot of things, but for a lot of computation, it's not ideal. And so we coupled an architecture called a GPU, with CUDA, to a CPU so that we can accelerate the workload of the CPU. And so different, different kernels of code or algorithms could be offloaded onto our GPU, and as a result, you speed up an, an application by, you know, 100X, 200X. And where can you use that? Um, well, obviously engineering and science and physics and, you know, so on so-- data processing, um, uh, computer graphics, image generation. I mean, all kinds of things. Even if AI didn't exist today, Nvidia would be very, very large. Yeah. And so, so I think the, the reason for that is, is fairly f-fundamental, which is, which is the ability for general purpose computing to continue to scale has largely run its course. And the only-- the, the, not the only way, but the, the way to do that is through domain-specific acceleration. And o-one of the, the domains that we started with was computer graphics. But, uh, many-- there are many, many other domains. I mean, there's, you know, p- you know, all, all kinds of, uh, scientific-- particle physics and fluids and, you know, and, and so structured data processing, all kinds of different types of, of algorithms that benefit from CUDA. And so our, our mission was, uh, really to bring accelerated computing to the world and advance the type of applications that general purpose computing can't do, and scale to the level of, of, uh, capability that helps break through certain f-fields of science. And, and so some of the early applications were, uh, molecular dynamics, uh, seismic processing for energy discovery, um, uh, image processing, of course. Uh, and so all of those kind of fields where, where general purpose computing is just simply too inefficient to do so. And so yeah, i-if, if there's no AI, I would be very sad. Um, but because of, because of, of the advances that we made in computing, we democratized deep learning. We made it possible for any researcher, any scientist anywhere, any student, to be able to access a PC or, you know, uh, a, a GeForce add-in card and, and, uh, do amazing science. And, um, uh, that, that fundamental promise, uh, hasn't changed, not even a little bit. And so i-if you see GTC-- If you watch GTC, there's the whole beginning part of it, none of it's AI. That whole part of it with, with, uh, computational lithography or, or, uh, our quantum chemistry work or, you know, uh, all of that stuff, data processing work, uh, all of that stuff is, is, uh, unrelated to AI. And, and, and it's still very important. I mean, there's, you know, I, I know that, that AI is, is very interesting and, and, uh, quite exciting. Um, but, but, um, uh, there's a lot of people doing a lot of very important work that's not, not AI related, and tensors are not the only way to compute.
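
      A minimal sketch of the offload model described here, using NumPy for the CPU path and CuPy (which mirrors NumPy's array API) for the GPU path. CuPy and a CUDA-capable GPU are assumptions of the sketch; the actual speedup is workload- and hardware-dependent, and the 100X-200X figure is Huang's, not a promise of this toy kernel.

      ```python
      # Sketch of CPU -> GPU offload for a data-parallel kernel.
      # Assumes CuPy is installed and a CUDA-capable GPU is present.
      import time

      import numpy as np
      import cupy as cp

      def kernel(xp, x):
          # A toy elementwise "domain" kernel; xp is either numpy or cupy,
          # since CuPy mirrors NumPy's array API.
          return xp.sqrt(xp.abs(xp.sin(x) * xp.cos(x))) + 0.5 * x

      x_cpu = np.random.rand(50_000_000).astype(np.float32)
      x_gpu = cp.asarray(x_cpu)              # copy the inputs into GPU memory

      t0 = time.perf_counter()
      y_cpu = kernel(np, x_cpu)              # general-purpose CPU path
      t1 = time.perf_counter()

      y_gpu = kernel(cp, x_gpu)              # same kernel, offloaded to the GPU
      cp.cuda.Stream.null.synchronize()      # GPU work is async; wait for it
      t2 = time.perf_counter()

      # Note: the first GPU call includes one-time kernel compilation;
      # re-run kernel(cp, x_gpu) for a steady-state timing.
      assert np.allclose(y_cpu, cp.asnumpy(y_gpu), atol=1e-5)
      print(f"CPU: {t1 - t0:.3f}s   GPU: {t2 - t1:.3f}s")
      ```

      The design point is the one in the answer: the host CPU keeps running the general-purpose program, and only the data-parallel kernel is handed to the accelerator.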

    23. DP

      Mm.

    24. JH

      And, um, I-- and we wanna help everybody.

    25. DP

      Jensen, thank you so much.

    26. JH

      You're welcome. I enjoyed it.

    27. DP

      Me too. Sweet.
