
Stanford CS153 Frontier Systems | Jensen Huang from NVIDIA on the Compute Behind Intelligence

For more information about Stanford's online Artificial Intelligence programs, visit: https://stanford.io/ai

Follow along with the course schedule and syllabus, visit: https://cs153.stanford.edu/

In a CS153 Frontier Systems lecture, the class hosts Jensen Huang, CEO of NVIDIA, who argues computing is being reinvented for the first time in 64 years as software shifts from prerecorded execution to real-time generation, with NVIDIA's extreme co-design across chips, compilers, networks, and systems delivering a million-fold speedup over the past decade versus Moore's Law's 100x. He walks through the architectural logic of Hopper (pre-training), Grace Blackwell NVLink72 (inference and decode), Vera Rubin (agents), and the upcoming Feynman generation built for swarms of agents and sub-agents, while pushing back on MFU as a misleading metric in favor of tokens-per-watt and real evals. Huang also defends open models like Nemotron, BioNemo, and Alpamayo as essential for safety, transparency, and democratizing AI across underserved languages and scientific domains, and forecasts compute energy demand growing roughly a thousandfold, making this the strongest market moment in history to invest in sustainable energy and grid upgrades.

Guest Speaker: Jensen Huang founded NVIDIA in 1993 and has served since its inception as president, chief executive officer, and a member of the board of directors. Since its founding, NVIDIA has pioneered accelerated computing. The company’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, and ignited the era of modern AI. NVIDIA is now driving the platform shift of accelerated computing and generative AI, transforming the world's largest industries and profoundly impacting society. Huang has been elected to the National Academy of Engineering and in 2026 was appointed to the President’s Council of Advisors on Science and Technology. He is a recipient of the Semiconductor Industry Association’s highest honor, the Robert N. Noyce Award; the IEEE Founder’s Medal; the Dr. Morris Chang Exemplary Leadership Award; and honorary doctorate degrees from Taiwan’s National Chiao Tung University, National Taiwan University, Oregon State University, Huazhong University of Science and Technology, and Linköping University. He has been named the world’s best CEO by Fortune, the Economist, and Brand Finance, as well as one of TIME magazine’s 100 most influential people. Prior to founding NVIDIA, Huang worked at LSI Logic and Advanced Micro Devices. He holds a BSEE degree from Oregon State University and an MSEE degree from Stanford University.

Follow the playlist: https://youtube.com/playlist?list=PLoROMvodv4rN447WKQ5oz_YdYbS74M5IA&si=DOJ5amlyRdyMJBhG

Jensen Huang (guest)

May 13, 2026 · 1h 8m

EVERY SPOKEN WORD

65 min read · 12,789 words

0:09 – 3:15
Computing is being reinvented: from pre-recorded software to generative, contextual systems
1. SPSpeaker
  I would like to welcome back Preacher Huang. [audience applauding] We have been now in a, locked in a global race way faster than NASCAR racing, and it's partly your fault. Jensen's been the preacher that's given us all the power we need, all the energy, and some more to have what I think has been the craziest 12 months of my life, certainly for many of you, and we're just getting started. Um, the energy with which you approach every single thing you do, including the class last year, and then all, every time I've had the chance to hang out with you, you've given so much time to the students, to the founders. Thank you. Should we jump right in?
2. JHJensen Huang
  Yeah, let's go.
3. SPSpeaker
  All right. We're gonna go rapid fire. What is co-design, and why is it so important?
4. JHJensen Huang
  Uh, I'll, I'll answer that in a second.
5. SPSpeaker
  Yes, please.
6. JHJensen Huang
  Um, but, but, uh, this is a great time to be in computer science, and obviously the reason is because computing is being reinvented for the first time as dramatically as, as it is for the first time really in about 60-plus years. Uh, the computer that we know of, that you all use in our computing model, our mental model of a computer, the architecture of a computer, how you write the program, run the program, how you think about even taking computers to market, what it's used for, for 64 years it has been largely the same since the s- IBM System 360. In fact, my first architecture book for learning about computer architecture was the System 360's manual. And so, so a lot has changed, um, as we went from PCs to internet and mobile and cloud and all those things, but the fact of the matter is the computing model, the fundamental part of computer science has largely remained the same until now. You know, for the first time, uh, the way you write the software, how you process the neural network versus the software, and what the applications can do has now dram-dramatically changed. Everything is fundamentally different. At the highest level, you know, one simple way to think about it is, is, um, uh, computing as we knew it before was largely pre-recorded. It's content that we pr- we pr-pre-recorded, images, videos, you know, software that we re- largely pre-recorded. But now everything is generated, and the nice thing about generating everything in real time is that it could be contextually consistent, con-con-contextually relevant to what, what it is that you're dealing with. And of course, um, it can respond, uh, to your intention, not just explicitly, uh, to the things that you instruct. And, and so, so the computer, the computer is, is, um, uh, fundamentally different in that way. Now the question is what
3:15 – 5:46
Stack-wide disruption: new development methods, new systems, and new applications
1. JHJensen Huang
  does that mean, uh, at every single layer of the stack, from, from, uh, how the computer, how the software is now developed, the methodology of it, how you organize your company to be able to develop software of today completely changed. And so the methodology, the tools we use, uh, the approach that we think about software coding, uh, completely changed. Uh, how we run the software, neural network versus compiled binaries, um, very, very different. And so what does that mean to the computer system, the network, the storage? Um, what does that mean to the software stack and the cloud services that sit on top of that? And of course, you know, everything about the applications. What did it open up? And, uh, somebody just, uh, somebody just came and said, uh, this piece of software we just opened up called Alpamayo, and, uh, I've been working on self-driving cars now for about 13 years. And, and, um, I, and the, uh, and the, and the days of robotaxis are gonna be literally everywhere. You know, everything that moves will be robotic. And, and that's an example of an application that, that, uh, we wouldn't consider doing until deep learning and artificial intelligence came along. That was such a big unlock, um, that, you know, I said, "Hey, aha, uh, this, all of these problems that we wanted to solve in the past that we need a computer vision for, uh, really, really, uh, are now fundamentally unlocked." And so, so it's how you think about every single stage of that. What is, you know, what is a software engineer? How do you organize the company? Uh, what is a computer for the age of AI? How do you architect that? All the way to what you can use it for, and therefore, uh, therefore, um, uh, where you would deploy it. Um, all of that has fundamentally changed. And, and, uh, for me, the journey really started about 15 years ago. And, uh, uh, I had the benefit of, of seeing some early works in, in the area. 
And, uh, as all Stanford students do, uh, you break the problem down, you reason about it from first principles, and you come to the conclusion literally everything has changed. And so here you are, you know, computer scientist students, uh, this is really the first generation of AI becoming, uh, useful. And, um, where we a couple years ago was in the generative part of AI, uh, and, and as you guys know, uh, generative AI not only made it, made it cool for us to do image generation
5:46 – 8:34
From GPT to agentic systems: continuous computing and what comes next
1. JHJensen Huang
  and text summarization and translation and whatnot, but generative al- generative al- AI also enabled us to think. And so when I saw generative AI, uh, you know, when o- other people saw was that it was able to generate images, and I, I, and I surely would appreciated that as well. Uh, but the fact that you can generate thoughts i-in the form of images, but you can generate thoughts, uh, you can also reason with it. And the ability for AI to think after GPT, uh, was, was very, very obvious. Now the question is, is, uh, how would you train, how would you fine-tune an AI, uh, to be able to reason step by step by step, and how would you teach it how to do so at, at fairly large scale in a kinda semi-supervised way? And so those are kind of the engineering problems you had to solve. Uh, but the moment you see GPT, you say, "Aha, uh, thinking is just around the corner." And thinking is generating tokens that you consume internally, and, uh, generating tokens that you consume externally, uh, would be called tool use. And so the idea that, that after GPT happened two years ago, that we would be at this moment was fairly easy to predict. Now, of course, un-un, you know, un-unbelievable amount of technology was invented and a lot of m-amazing people did amazing work, but you could almost see that moment here. And so here we are, you now have agentic systems, and so now the question is, is what's next? And what happens in a world where a computer is not, uh, not responsive to what you ask it to do? It's not, it's not b- on demand. You know, today's computing is really on-demand computing. The word on demand was actually gener- created i-in our generation to talk about how you think about using computers. Uh, time-sharing computers that you would use on demand became cloud computers. And cloud computing, of course, is on demand, but, uh, in your new world of agentic system, uh, these con- the computers are now continuously running. 
And so what happens in a world where the computers are continuously running? Uh, what happens to cloud services? What happen to your personal computer? What happens to, you know, all of these different sy-systems? Now there's a great ch- great opportunity again to rethink all of that. And so, so what I, w- you know, as kinda my, my, my introduction to everything about, um, computer science has changed and, and everything about every field of science has changed because of the things that we've changed and, and so it's a g- a good time to go to school.
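One way to picture the distinction drawn above, between tokens consumed internally (thinking) and tokens consumed externally (tool use), is a toy agent loop. This is only a sketch: `toy_model`, the `THINK:`/`TOOL:`/`RESULT:`/`ANSWER:` prefixes, and the `multiply` tool are hypothetical names invented for illustration, and the scripted function stands in for a real language model.

```python
from typing import Callable, Dict, List, Tuple


def multiply(a: str, b: str) -> str:
    """Hypothetical external tool: multiply two integers given as text."""
    return str(int(a) * int(b))


TOOLS: Dict[str, Callable[..., str]] = {"multiply": multiply}


def toy_model(context: List[str]) -> str:
    """Scripted stand-in for a language model (an assumption, not a real API)."""
    if not any(line.startswith("THINK:") for line in context):
        return "THINK: I need the product of 6 and 7."
    if not any(line.startswith("RESULT:") for line in context):
        return "TOOL: multiply 6 7"
    result = next(line for line in context if line.startswith("RESULT:"))
    return "ANSWER: " + result.split(": ", 1)[1]


def run_agent(question: str, max_steps: int = 8) -> Tuple[str, List[str]]:
    """Loop until the model answers, feeding its own output back as context."""
    context = ["USER: " + question]
    for _ in range(max_steps):
        out = toy_model(context)
        if out.startswith("THINK:"):
            # Tokens consumed internally: they only extend the context.
            context.append(out)
        elif out.startswith("TOOL:"):
            # Tokens consumed externally: they trigger a tool call.
            name, *args = out[len("TOOL:"):].split()
            context.append(out)
            context.append("RESULT: " + TOOLS[name](*args))
        elif out.startswith("ANSWER:"):
            return out[len("ANSWER:"):].strip(), context
    raise RuntimeError("no answer within the step budget")


answer, trace = run_agent("What is 6 times 7?")
print(answer)
for line in trace:
    print(line)
```

A continuously running system, in this picture, would be the same loop without a single user question bounding it.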
2. SPSpeaker
  Okay.
3. JHJensen Huang
  That's it. What was your question?
4. SPSpeaker
  You know what? Uh, I'm just gonna turn it over to the kids.
5. JHJensen Huang
  Oh, oh, co-design, co-design, co-design.
6. SPSpeaker
  No, l-let, let, let's just go into the st- the students have questions. They've all been asking questions in Discord. They're all voting on each other's questions.
8:34 – 11:06
Co-design explained: why optimizing hardware + compilers + frameworks together wins
1. JHJensen Huang
  Co-co-design's really interesting, but it's not, it... Co-design's super interesting. And, and basically, co-design says, it said back in the old days, we abstracted computing so that, so that, um, uh, the people who designed microprocessors designed microprocessors, people who, uh, worked on compilers worked on compilers, and people who worked on languages worked on languages, and so on and so forth. You guys know that. And we actually had different fields. Um, but the problem, and in fact, this happened at Stanford, uh, what's the, what's the beauty of RISC? What was the beauty of the work that John Hennessy did? Um, it, it, the beauty of it is that you ought to think about compilers and microprocessor architectures harmoniously co-designed, because otherwise you could end up creating a microprocessor that's super, super tight and, you know, everything is, is maximally optimized, but unfortunately, it's hard to compile. It's, it's difficult, it, not compilable. And so they created a, a simpler instruction set that exposed simplicity to compilers so that compilers could do a better job generating code. And it turns out a simpler machine co-designed with a compiler creates better performance than two systems that were optimized individually. That's, you know, k- that's very Stanford, okay? And this is a, this is part of your heritage a-as all of your, and, and, uh, John Hennessy's, you know, trail of amazing work that's left behind. And so you take, you take that and you think about, well, what happens in, in the post world of general purpose computing? Why is it that every problem in, in computer science would be solvable by a general purpose instrument? At some level, you know, you could say, well, if you had a general purpose instrument, you prefer that. 
However, there are some extreme problems, whether it's computer graphics back in the old days, or molecular dynamics, or quantum chemistry, or, you know, f- you know, fluid dynamics and large multi-scale, meso-scale multi-physics problems, or deep learning. These problems are so computationally intense, why would you use a general purpose computer to go do that? And so there, y- the big insight is what if you understood the algorithms, understood the computer systems, understood the, you know, if you will, the compilers, the frameworks, and understood the architecture of chips, and you were optimizing all of it at the same time. And so the, the facts, here are the facts. This is what happens when you do it, what I just described. NVIDIA is probably the
11:06 – 13:50
Co-design at NVIDIA: beyond Moore’s Law (the “million‑X” claim)
1. JHJensen Huang
  first computer systems company that's extreme co-design, meaning we, we literally co-design across all of that, and including CPUs, GPUs, networking, and switches, and every- and storage. And so the question is, what do you get? Well, Moore's Law back in the old days, you guys all know about that. Moore's Law was about, uh, 2X every 18 months, so call it 10X every five years. Okay, so 10X every five years is 100X every 10 years. And that's, that was in the good old days of Moore's Law, and for all the computer, computer scientists in the room, you know, you know that Moore's Law was underpinned by a concept called Dennard scaling, and Dennard scaling ran out of steam, um, several years ago, probably about a decade ago, in fact. And we, we kept squeezing it, we kept squeezing it, but, uh, over the course of last 10 years, if you just allowed microprocessors to continue to scale and you just don't touch the software and just benefited from the speed up of semiconductors and m-microprocessor design, you, at best case, you would've gotten 100X, but probably because Dennard scaling slowed down and Moore's Law largely ended, you know, you probably got something along the lines of 10X over the course of 10 years. Well, in the case of NVIDIA and co-design, we got 1 million X over 10 years. 1 million X. And so somewhere between 100,000 X and 1 million X, okay? So there, when you're talking about numbers that big, it really doesn't matter. And so 1 million X over 10 years, it got-- We, we were able to get scaling and computation scale so large so fast that AI researchers say, "Why don't we just take all of the internet? Why even worry about what data to go curate and what cr- what data to create? Let's just take all of the world's data and just give it to the computer." And that's really the big breakthrough. 
When you're able to, to, to do something so insanely fast, you know, for example, if you were able to travel at the speed of light, uh, where we choose to live is, doesn't matter. Uh, if you were able to go from New York to California in 10 minutes, uh, you know, our freedom, everything about society would change, right? And so if you're able to do computing a million times faster, everything about computing s- computing changed, and that's really the big breakthrough. Because of co-design, because of the way NVIDIA approached it, we accelerated computing by so, so far that it created all this infinite abundance opportunity for everybody to, to think about the future. And so anyways, here we are.
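The arithmetic in this answer can be checked with a short sketch. It assumes clean exponential growth and takes the 18-month doubling period and the 100X, 10X, and 1 million X ten-year figures directly from the talk; the function names are illustrative only.

```python
def moores_law_factor(years: float, doubling_months: float = 18.0) -> float:
    """Speedup from doubling every `doubling_months` months for `years` years."""
    return 2.0 ** (years * 12.0 / doubling_months)


def implied_annual_factor(total_speedup: float, years: float) -> float:
    """Constant yearly multiplier that compounds to `total_speedup` in `years`."""
    return total_speedup ** (1.0 / years)


five_year = moores_law_factor(5)    # close to the "call it 10X" in the talk
ten_year = moores_law_factor(10)    # close to the "100X every 10 years"
print(f"5 years: {five_year:.1f}x, 10 years: {ten_year:.1f}x")

# Ten-year totals quoted in the talk, converted to a per-year rate.
for label, total in [
    ("Moore's Law ideal", 100.0),
    ("post-Dennard estimate", 10.0),
    ("co-design claim", 1e6),
]:
    print(f"{label}: {implied_annual_factor(total, 10):.2f}x per year")
```

Under these assumptions, 1 million X in a decade corresponds to compounding at roughly 4X per year, against roughly 1.6X per year for the idealized Moore's Law rate.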
2. SPSpeaker
  Cool. I have a bunch of follow-up questions, but I'm not gonna ask them.
3. JHJensen Huang
  One, that one word led to that.
4. SPSpeaker
  GPT-10, ladies and gentlemen.
5. JHJensen Huang
  That's what it's like to work at NVIDIA. You give me one word, and you get ranted at for about half an hour. Because I got too much to, too much to share with you.
13:50 – 17:08
How education should change: learn with AI while keeping first principles
1. SPSpeaker
  The question is w- how should education evolve in response to the industry as it's changing?
2. JHJensen Huang
  Yeah, and, and that's a really excellent question, and, and I think the answer clearly is, uh, AI should be part of your curriculum, not just in learning about AI, but using AI for the curriculum. The, the problem with, with textbooks, as you know, it takes an enormous amount of effort to do. And when I was taking classes at Stanford, Professor Hennessy was still writing his textbook. It was all handwritten out, and, and each week, it seemed like he was writing a chapter. I don't even know how he writes a chapter a week, but every week, he was writing about a chapter. And, and then over time, all of those notes turned into a textbook, into the first edition, and that must have taken several years. And so I, I think, I think, um, it, it's not, it's not possible for universities, for, you know, pre-recorded textbooks to keep up with information and knowledge that's being generated in re-real time by AI. And so I think the future probably has to be some union of the two, and, and I, I don't know about you guys, but I, I can't learn anymore without AI. And so not only do I have the AI read the papers, um, but I also ha- Once it reads the papers, I might ask it to go read, you know, a whole bunch of the other papers that are associated with it, and then now it becomes a super researcher, and then I can, I can-- First, I ask it to summarize, um, I ask it some basic questions, and then after that, you interact with that paper as if it's a researcher that's dedicated to you. And so most people don't realize that. You know, I think a lot of people still think that you, you summarize a document. But in the process of summarizing the document, that AI learned a lot. And so, and I, I, um, uh, I think that in the future, I, I do hope that, that curriculums are, are tightly integrated. Um, I, I will say, in defense of the textbooks, though, I will say that the first principles don't change. 
You know, in the final analysis, uh, uh, Mead & Conway is still a solid, a, a, a fundamental methodology as, as before. It, it, it is true that the scaling process that led to, um, constant, constant current density, um, constant, uh, power density, all, all of that, all of those design optimizations associated with modern semiconductor design, you know, the, the-- We've, we've kind of exhausted all of that. None of that is ISO anything anymore, and so, um, but it's still good to know where we came from, you know? And so I, I would still encourage the, to appreciate the first principles and, and you know, while, while I was going to, to Stanford, I was als-already working at AMD and, um, and I was designing microprocessors at the time, and it was still, it was still really good to, to see simultaneously, um, how, how we design things in practice versus, uh, the first principle methods associated with learning about, eventually, uh, how to design these things. And, and, um, I, I, I really enjoyed having, you know, feet on both sides of it, and I, I ended up learning a lot more. And so what that means is when you're using AI, which is real world, it's contextually relevant now, um, it's, it's contemporary, and meanwhile, you have first principle knowledge that you're learning at the same time. You're kind of getting the same thing that I experienced. There-
17:08 – 23:49
Open source vs closed models: why NVIDIA builds open foundation models
1. SPSpeaker
  The question is op- what are your thoughts on open source? How do we, how does open source stay at the frontier?
2. JHJensen Huang
  Yeah, there's really the question of closed source versus closed proprietary software versus open source. There's a question of my intentions with open source, and so I'll start with my intentions of open source. Um, first of all, uh, NVIDIA uses more Anthropic and OpenAI tokens than just about anybody. Uh, and, and the reason for that is obviously we do a lot of coding, we do a lot of design, and 100% of our engineers are now agentic-agentically supported. And so, so I want them to be working with agents using the latest generation tools and re- and remon- uh, modernize how NVIDIA does work altogether, okay? So number one, if you can use, uh, OpenAI and Anthropic, I would highly recommend you use it, and the reason for that is because it's useful. It works really well. It's getting better all the time, and it's, you know, as a, as you know, large language models is the technology inside, but Claude is a product cl- and Claude Code is a whole harness around it, and that harness is getting better all the time, and the model's getting better all the time. It's not un-- It's not likely that anybody open source go to GitHub, download something that's gonna work nearly as well, okay? So, so I, I highly recommend and, and we do, um, use off-the-shelf frontier AI models. The question is why is it there that we're advancing and working so hard on open, on, on open models? The reason for that is because language models are very important because they represent the, the codification of our intelligence and, and, uh, we wanna automate ourselves especially is a very important part. But you, you, you need to know that, know that AI is about learning the representation, the meaning, the structure of information. And so the question is, where is information? Well, we're living in information right now as we speak. The reason why there's structure is the reason why every day you show up, it's kinda largely the same, otherwise it'd be like practically white noise. 
And so the fact that biological systems and physical systems have structure, and from that structure, I must be able to learn higher level representation. And if I can learn the representation, then I could manipulate it. Then I can... Does that make sense? And so just because I can learn the representation of, of language, I can then generate it, I can manipulate it, you know, I could put it to use. And so I wanna do the same thing for chemicals and, and proteins and genes and, uh, physics and physical systems, robotics, for example. And so notice the way you represent all of those things are fundamentally different because the structure is different and the dimensionality is different. How you train it is fundamentally different, right? Because you don't have a whole bunch of internet corpus of human language on it. So you, you gotta come up with new, new strategies for all of that stuff. And so we decided that we would dedicate ourselves in some fundamental pillars of-- And because we have the company, the company has the talent and the scale, we have the ability to put the first piece of artifact out in the world, data, model, how to train it, in several different domains. And so some of the domains I care very much about, uh, one of them is called, of course, Nemotron's language, and I'll come back to that in a second why it is that we're doing it. And then second is BioNemo, that's for biology. And, and, uh, um, we have, uh, Alpamayo, somebody mentioned it earlier, for autonomous vehicles. Uh, basically artificial intelligence, uh, navigation. And then, and then, um, uh, we have, uh, Groot, which is, uh, a humanoid articulation robotics, gener- artificial general robotics. Uh, and, and, and then we have, uh, climate science, you know, basically meso-scale multiphysics. Okay? And so all of these different area, these different domains, uh, we decided that, that we should go and pioneer it. 
And the reason for that is because otherwise, the scientists in these different domains, they simply won't have the scale and the technology necessary to go build that foundation model. And so we decided that we would do that. Okay? So that-- And, and as a result of doing that, we activated healthcare, we activated life sciences, we act- We're working with every single self-driving car company in the world, doesn't matter which one it is. You know, there's NVIDIA in there somewhere. And so we're, we, we enabled that entire ecosystem to really flourish and, and we're working with robotics right now and, you know, so on and so forth. Okay? Without us making that first effort and building a foundation model, it's hard to activate the whole industry downstream. And so it's about, really about expanding AI and, and, uh, democratizing this capability. The, the reason why we do language models is because two reasons. One, there are too many, too many societies where the scale of their language is not big enough for somebody else to decide to make it a high priority. They'll understand Sweden, Swedish, but making Swedish a top priority is not, not, not likely because the country is, is big, but not so big. Uh, Chinese, of course, well taken care of. Uh, Indian, certain dialects very well taken care of, but as you know, you have like 230 others. And so there are too many others. Now, unless you deeply care, it's never gonna be great. And human intelligence, no matter the size of your population, uh, you-- somebody should care. And so we created a, a large language model that's near frontier, Nemotron is close to frontier, and we l- we make everything available so that if somebody wants to then fine-tune it into whatever language of their choice, they got no trouble doing that. Okay? And so-- And then the second reason is very important is because we want to also take these language models and fuse it with the domain-specific models because of human priors. 
So for example, Alpamayo is a language model fused with a, a world model. And so on the one hand, it's really designed to detect cars and roads and things like that. But on the other hand, we also believe that if the AI model, if Alpamayo, the self-driving car model, can reason, reason like a human, and it could reason with human priors, then, uh, the number, the amount of experiences it needs to have before it could be an extremely good and safe driving car is dramatically reduced. The amount of training data is reduced, and we've proven that. Alpamayo is probably one of the most effective self-driving car systems in the world, and it's really only experienced, you know, a few million miles, not billions of miles. And so that kinda te- the system actually works. Okay? So anyways, I just gave you a long-winded answer for... I broke it all down. You can't just ask a simple question.
3. SPSpeaker
  Well, what we talked about-
23:49 – 26:10
Why openness matters for safety and security: transparency and “swarms” of defense
1. JHJensen Huang
  But open models is really important. And then, and one, one more thing, okay? If there's nothin'-- That wasn't enough. One more thing. Uh, if you want, if you care to have AI be safe and secure, it has to be open. And the reason for that is you can't defend against a black box, and you can't secure a black box. And you can't put a black box of some cap- incredible capability into your system with it completely, completely opaque. Now, of course, there's a lot of different ways you could solve the opaqueness. For example, you could say, "Before it do- does anything, you have to reason about it to me step by step. Before you do anything at all, you have to come up with a plan, you have to reason about it step by step," but you could always lie. And so, so the ability for the, the, the nice thing about transparent systems is that then, you know, we, everybody gets to interrogate it. Uh, if you have a transparent system, then researchers get to use it. If you have a transparent system, uh, open system, then the way you defend against super agentic systems in the future for cybersecurity is obviously not to go into a battle of who gets the better one. You know, you come up with some model, uh, model 7.0, and the only way I combat against that, I'm completely vulnerable until I come back with a 8.0, and then you gotta come back with a 9.0, and we just go back, back and forth driving each other nuts. And obviously that's not, that, that's obviously not the smartest way to do it. The smartest way to do it is you're gonna, you're gonna create these incredible cybersecurity systems and, or you're gonna... These cybersecurity threats, and what we're gonna do is we're gonna have millions, billions, swarms of cheap AIs, and we're gonna systematically surround it. And so it's kinda, you know, if you will, a giant dome. So for example, Nemotron Nano is being used for cybersecurity. 
And so all these cybersecurity firms take Nemo- Nemotron Nano because it's so fast and so, so cost-effective, you can train it to detect cybers- cyberattacks and then just deploy trillions of them.
2. SPSpeaker
  Yeah. Um, on, on the topic of open scaling, you know, we, we hung out in January and we-
3. JHJensen Huang
  I feel like, you know that one scene in Thor? Do you remember he was just hanging and he kept rotating in that direction?
4. SPSpeaker
  It's zero gravity. Here at AI Coachella, we got no gravity. [laughs]
5. JHJensen Huang
  You know, in Thor: Ragnarok. Do you guys remember that?
6. SPSpeaker
  We can, we can move a little bit back so you can do this.
7. JHJensen Huang
  No, that's all right.
8. SPSpeaker
  Okay. All right.
9. JHJensen Huang
  You guys don't watch movies?
26:10 – 30:01
Coalition scaling and utilization: why MFU can be a misleading metric
1. SPSpeaker
  Well, we got a whiteboard too if you wanna get up and walk. But, um, so in Jan- in January we met and we talked about this topic, open scaling. We talked about bottlenecks. We talked about, um, data as one bottleneck, compute as another bottleneck. Um, you know, there's at least one experiment that, uh, we announced at GTC together, which was the coalition scaling idea. The second i- is on how to improve utilization on compute, which is increasingly scarce. Uh, it came out last week that there was a memo at xAI that said their, uh, Memphis cluster pool is running at 11% MFU utilization, which I think, like, corresponds to something like 11 billion or something of unutilized MFU flops. Like, how can the open space... Well, like, maybe you could talk a little bit about why coalition scaling is an, an experiment worth trying, and we have Brian coming actually in the final office hours to talk about progress, and then how do we get utilization to be better for open, the open ecosystem when you don't have full, like, sort of fully integrated companies that can optimize up and down the stack?
2. JHJensen Huang
  Yeah. Um, th- do you guy- do you guys know, know what M- MF- MFU is? And so F- FU, do you guys know? You guys don't use that anymore? So MFU, uh, i- is, uh, just simply wrong, okay? It's, uh, it's the, the amount of, of, uh, the percentage of, of, uh, flops basically, uh, that you consume while doing your work. All right?
3. SPSpeaker
  Model flops utilization.
4. JHJensen Huang
  Yeah. And so, unfortunately, with every metric, depending on what you measure, you could be measuring the wrong thing. And so let me tell you why. If you ask me do I want to be at high MFU personally or low MFU, I would like to be at low MFU all the time. And the reason for that is because I want to be so smart I'm overprovisioned for the work, okay?
5. SPSpeaker
  Hmm.
6. JHJensen Huang
  Because I'm overprovisioned. I got so many flops and sitting idle.
7. SPSpeaker
  Hmm.
8. JHJensen Huang
  And the reason for that is because the way that computing works in these large-scale data centers is you have flops, you have memory bandwidth, you have memory capacity, you have network capacity. At any given point in time, something is bottlenecked. And so what you want to do is you want to overprovision on everything-
9. SPSpeaker
  Hmm
10. JHJensen Huang
  ... so that you could avoid Amdahl's law. Otherwise, you're fighting Amdahl's law all the time.
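The point that some resource is always the bottleneck can be made concrete with a roofline-style estimate. The sketch below is only an illustration under stated assumptions, not a description of any NVIDIA tool: the `Accelerator` fields, the round numbers, and the function names are all hypothetical, and compute and memory traffic are assumed to overlap perfectly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Accelerator:
    """Hypothetical device limits; the values used below are made-up round numbers."""
    peak_flops: float     # floating-point operations per second
    mem_bandwidth: float  # bytes per second


def step_time(acc: Accelerator, flops: float, bytes_moved: float) -> float:
    """Time for one step if compute and memory traffic overlap perfectly.

    The slower of the two resources sets the pace (a roofline-style bound).
    """
    return max(flops / acc.peak_flops, bytes_moved / acc.mem_bandwidth)


def bottleneck(acc: Accelerator, flops: float, bytes_moved: float) -> str:
    """Name the resource that limits this step."""
    compute_time = flops / acc.peak_flops
    memory_time = bytes_moved / acc.mem_bandwidth
    return "compute" if compute_time >= memory_time else "memory"


def mfu(acc: Accelerator, flops: float, bytes_moved: float) -> float:
    """Fraction of peak FLOPs actually used during the step."""
    return flops / (step_time(acc, flops, bytes_moved) * acc.peak_flops)


gpu = Accelerator(peak_flops=1e15, mem_bandwidth=1e12)
workloads = {
    "training-like": dict(flops=1e15, bytes_moved=1e11),  # lots of math per byte
    "decode-like": dict(flops=1e13, bytes_moved=1e12),    # little math per byte
}
for name, work in workloads.items():
    print(name, bottleneck(gpu, **work), f"MFU={mfu(gpu, **work):.2%}")
```

In this toy model the memory-bound workload reports a very low MFU even though the device is working as fast as its memory system allows, which is why a low MFU by itself does not say the hardware is being wasted.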
11. SPSpeaker
  But then if you're provisioning for peak, uh, but not your base loads, then you're gonna have a bunch of those flops sitting while, while overprovisioned when you don't need them because it's spiky.
12. JHJensen Huang
  But at the right time it goes to 100% MFU, but only for a short period of time.
13. SPSpeaker
  Ah.
14. JHJensen Huang
  And in that short period of time, if you don't get all those overprovisioned flops...
15. SPSpeaker
  Right.
16. JHJensen Huang
  Then that short period of time becomes a long period of time.
17. SPSpeaker
  And so what are you seeing for teams that are trying to optimize-
18. JHJensen Huang
  And flops are cheap. No, flops are cheap.
19. SPSpeaker
  H100s are going up in price. [chuckles]
20. JHJensen Huang
  Well, not because of its flops, but because of H100. Hopper, you know, it's, it's bandwidth, it's architecture, it's everything else, not just its flops.
21. SPSpeaker
  What, what, uh, is... So should we think about compute as not a scarce resource?
22. JHJensen Huang
  No, no, that's not, that's not the question. It's like this. Uh, uh, when you, when you ask about a car, uh, back in the old days when we were unsophisticated, we used to say, "How many horsepower is your car?"
23. SPSpeaker
  Right.
24. JHJensen Huang
  But these days who does that?
25. SPSpeaker
  So what's the right measure you think we should be thinking about in terms-
26. JHJensen Huang
  Performance.
27. SPSpeaker
  Uh, and what, what... When you tell the teams, "Guys, this is the perf we've gotta hit next year," what are you finding is the eval you're, you're reaching for more and more?
28. JHJensen Huang
  You have to come up with a real eval, a really serious eval, because otherwise you'd be, like, improving your flops, you know? You figure out something that you guys can improve, and you're improving that number, it doesn't make you smarter. You're improving that number, it doesn't make you more successful.
29. SPSpeaker
  Yeah.
30. JHJensen Huang
  And so there's nothing wrong with having a lot of flops, but it's not the complete... Necessary, not sufficient. That's all.
30:01 – 33:14
Measuring progress: tokens-per-watt, NVLink bandwidth, and the need for serious evals
1. SPSpeaker
  In one sense you could think about the output of tokens as intelligence, so it sh- should be some unit of intelligence per watt or-
2. JHJensen Huang
  Yeah, yeah
3. SPSpeaker
  ... uh-
4. JHJensen Huang
  Notice, notice that tokens per watt is more than flops.
5. SPSpeaker
  Right.
6. JHJensen Huang
  And in fact, we know that now because for decoding these large language models, the single most important thing for generating tokens per watt-
7. SPSpeaker
  Right
8. JHJensen Huang
  ... is actually the aggregate bandwidth across the NVLink72. And the MFU is incredibly low because the prefill's not that much. It's mostly decode.
9. SPSpeaker
  But you can decouple decoding and prefill. Yeah.
10. JHJensen Huang
  It's disaggregated.
11. SPSpeaker
  But it's not-
12. JHJensen Huang
  And so notice I just delivered incredibly high tokens per watt-
13. SPSpeaker
  Right
14. JHJensen Huang
  ... with extremely low MFU.
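A back-of-the-envelope model shows how high token throughput per watt and low MFU can coexist during decode. This is a minimal sketch under loud assumptions: every parameter value is invented, the function name `decode_step` is hypothetical, each decode step is assumed to read all weights once and spend about two FLOPs per parameter per sequence, and KV-cache traffic, interconnect effects, and power variation are ignored.

```python
def decode_step(params: float, bytes_per_param: float, batch: int,
                peak_flops: float, mem_bandwidth: float, watts: float) -> dict:
    """Estimate one decode step that emits one token per sequence in the batch.

    Assumptions (illustrative only): weights are read once per step,
    about 2 FLOPs per parameter per sequence, constant power draw.
    """
    flops = 2.0 * params * batch
    bytes_moved = params * bytes_per_param
    step_s = max(flops / peak_flops, bytes_moved / mem_bandwidth)
    tokens_per_s = batch / step_s
    return {
        "tokens_per_s": tokens_per_s,
        "tokens_per_s_per_watt": tokens_per_s / watts,
        "mfu": flops / (step_s * peak_flops),
    }


# Made-up round numbers for a single hypothetical device.
base = decode_step(params=1e11, bytes_per_param=1.0, batch=8,
                   peak_flops=1e15, mem_bandwidth=1e12, watts=1000.0)
# Same device with twice the memory bandwidth and unchanged peak FLOPs.
wider = decode_step(params=1e11, bytes_per_param=1.0, batch=8,
                    peak_flops=1e15, mem_bandwidth=2e12, watts=1000.0)

for label, result in (("base", base), ("2x bandwidth", wider)):
    print(label, {key: round(value, 4) for key, value in result.items()})
```

Under these assumptions, doubling bandwidth doubles tokens per second per watt while MFU stays in the low single digits, so the metric that tracks useful output moves and the FLOPs metric barely registers it.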
15. SPSpeaker
  MFU. But not all tokens are born equal, right? And so how do we account for that? When you're designing the systems of the future, what is the right way to measure, without a standard measure of intelligence, when you have coding tokens being more valuable per watt than, I don't know, some other kind of token? Does that question make sense?
16. JHJensen Huang
  Makes perfect sense. You always have to come back to not just optimizing for SAT scores. You're optimizing for something bigger. And so that's basically it. It's the same idea. You have to decide what evaluation. As you know, eval, how you evaluate success, matters a lot in how people perform. And so what NVIDIA does extremely well inside the company is the systems that we create for evaluating architectures, and FLOPS is too-
17. SPSpeaker
  Too-
18. JHJensen Huang
  ... contrived. Because if it was that easy, then-
19. SPSpeaker
  And so do we have-
20. JHJensen Huang
  ... I wouldn't be here
21. SPSpeaker
  ... you have a hard job, which is to try to design an index of different intelligences, right? Like, I think when our teams are researching on the NVIDIA architecture, we've got one lab doing coding, another one pushing the frontier of superconductivity, and so on, and they all have completely different evals they're measuring for, but they're all using NVIDIA chips.
22. JHJensen Huang
  Yeah.
23. SPSpeaker
  So how do you solve that problem when your customers all have their own evals-
24. JHJensen Huang
  Yeah
25. SPSpeaker
  ... but the architecture of the underlying platform-
26. JHJensen Huang
  That's why it's so hard. And here, it is true. It's that hard. The problem is this. If you build something that's too overfit-
27. SPSpeaker
  Mm-hmm
28. JHJensen Huang
  ... for something, you could be incredibly good at it. And so you're overfit for this one problem, you're insanely amazing at it, but then the problem is that market, you know, that problem space may not be big enough to fund a sufficiently large R&D.
29. SPSpeaker
  Right.
30. JHJensen Huang
  And so you want to be good at many domains, multi-domain, on the one hand. On the other hand, if you're good at everything, then you're good at nothing. You became general purpose. And so that, riding that balance, by the way, is artistry. You know? It's, that's what I do for a living. What should we not do? What should we double down on? What should we 10X on? The, you know, that's, that requires some amount of vision, strategy, you know, some amount of trial and error, some just personal enjoyment and entertainment, uh, you know, iteration, all of that.
33:14 – 38:12
Architecture roadmap logic: Hopper → Grace Blackwell → Vera Rubin → (future) Feynman
1. JHJensen Huang
  Well, I can tell you the journey that we came on. And so if you look at Hopper, Hopper was designed for a problem space that was rather new. It was called pre-training. And so pre-training came along and we came to the conclusion that, although the generation before it was fairly significant already, we should build even larger ones, tremendously larger ones, larger than any of the largest scientific supercomputers in the world.
2. SPSpeaker
  Mm.
3. JHJensen Huang
  Okay? So that's a very big deal, that-
4. SPSpeaker
  Mm
5. JHJensen Huang
  ... that the largest supercomputer in the world was about $350 million, and we thought, "You know what? Pre-training is gonna be such a large domain and such an important problem, we should design systems that could be multi-billion dollars." At the time that we're thinking about doing this, this just sounds insane.
6. SPSpeaker
  Mm-hmm.
7. JHJensen Huang
  You know? You would have precisely zero customers, and the reason for that is because the most expensive thing that has ever been sold was $350 million, and you're building something that's multiple billions of dollars. So you're building for a marketplace of precisely zero. But we went and did it anyways on first, on first reasoning. And so Hopper was designed for pre-training, and that was a great call. The second thing that we did was we said, "Okay, well, after training, and we're gonna keep making training better," but the goal of AI isn't training. The goal of AI is inference.
8. SPSpeaker
  Mm.
9. JHJensen Huang
  And what kind of a system would inference really care about? And so we created a system called NVLink 72, and the reason we did that was because, in processing the neural network, there's the prefill, which is really context processing and attention processing and things like that, and then the decode, which is generating all these tokens. The generation of the tokens requires really high memory bandwidth.
10. SPSpeaker
  Mm-hmm.
11. JHJensen Huang
  And the amount of memory bandwidth you need is way more than one chip can possibly provide. And so we said, "Why don't we gang up like 72 of these things?" And so we had to invent all kinds of new systems for switching and interconnects and, uh, create all kinds of new SerDes and, and we created essentially the world's first rack scale computer. It's called Grace Blackwell NVLink 72. The speed up over the previous generation, 50 times. In two years, we improved something by 50 times. Moore's Law would've improved it by 2X. Okay? So the architecture and the insight, uh, was fantastic, and decode and inference and large language models and token generation, it, uh, you know, all of that kinda landed at exactly the time that Grace Blackwell came out, and boom, took off. So Grace Blackwell, uh, another incredible generation. Now, the question is what happened to Vera Rubin, and what's the big idea? Well, the big idea is that, that the goal isn't just to think. The goal is to do work, and so Vera Rubin is designed for agents. And so the question is what is the compute pattern? What is the processing pattern of agents?
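To compare the two figures quoted here on the same footing, it helps to convert each into a per-year growth factor. The snippet below is a small worked calculation; the helper name `annualized_factor` is hypothetical, and the only inputs are the 50x and 2x numbers stated in the talk, both over two years.

```python
def annualized_factor(total_factor: float, years: float) -> float:
    """Constant yearly multiplier that compounds to total_factor over `years`."""
    return total_factor ** (1.0 / years)


# Figures as quoted in the talk: 50x in two years versus 2x from Moore's Law.
quoted = {"Grace Blackwell vs. previous generation": 50.0,
          "Moore's Law over the same period": 2.0}

for label, factor in quoted.items():
    print(f"{label}: {factor:g}x in 2 years = "
          f"{annualized_factor(factor, 2):.2f}x per year")
```

Expressed per year, the quoted speedup is roughly a sevenfold annual improvement against roughly 1.4x, which is the gap the co-design argument rests on.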
12. SPSpeaker
  Mm-hmm.
13. JHJensen Huang
  And agents, of course, you have to load a fair amount of memory, long memory. It's got working memory. So long-term memory, we put into storage, and that storage needs to be able to directly communicate with the GPU. You can't be copying the data off of the network storage; you want the storage to be connected right into the processor itself. And so we have storage that's connected to the fabric. We're gonna use a lot of tools, and so CPUs are gonna be really important, but the CPUs of the current generation were really designed for cloud computing. And so you have these CPUs with hundreds of cores, like, you know, 200 cores. Well, the CPUs of agents: the AI is this multi-billion dollar system, and it sends off an instruction to use a tool, and that tool's gonna run on the CPU. Meanwhile, this GPU supercomputer, this multi-billion dollar system, is waiting for this one CPU. And so that CPU really wants to have extremely low latency. So we designed Vera, which is, for the current generation, for multiple-core single-threaded code, by far the most performant. And so we created a CPU just for that. Notice the way you solve this problem intuitively is you kinda think about what is the computing pattern, how is it different than the past. You have to have some mental model about it, and you create a system that you can go build to run that. And so now agents are here. We're gonna run that on Vera Rubin. And hopefully when Feynman gets here, it's gonna be like all software. We call them agents today, but, you know, it could be modules in the past or, you know, sub-modules. 
And so in the future, you're gonna clearly have systems of agents and agents with sub-agents and sub-agents with sub-agents, and so you're gonna have this swarm of agents, and what kind of computer does that manifest? And so that's likely what Feynman's about.
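The argument for a low-latency CPU can be put in rough numbers: if an expensive accelerator system stalls while a tool call runs on one CPU, the stall time has a price. The sketch below is a deliberately crude illustration. The function name, the straight-line amortization, and the premise that the entire system idles during every call are all assumptions made for the example, not figures from the talk.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def idle_cost_usd(system_cost_usd: float, amortization_years: float,
                  tool_latency_s: float, calls: int) -> float:
    """Amortized hardware cost spent waiting on tool calls.

    Simplification: the whole system is treated as stalled for the full
    latency of every call, and cost is spread evenly over its lifetime.
    """
    cost_per_second = system_cost_usd / (amortization_years * SECONDS_PER_YEAR)
    return cost_per_second * tool_latency_s * calls


# Hypothetical inputs: a 2-billion-dollar system amortized over 4 years,
# one million tool calls, compared at two CPU-side latencies.
for latency_s in (0.5, 0.1):
    cost = idle_cost_usd(2e9, 4, latency_s, 1_000_000)
    print(f"latency {latency_s}s -> about ${cost:,.0f} of stalled hardware time")
```

Because the cost scales linearly with latency in this model, every reduction in single-threaded tool latency is paid back in proportion to the price of the system that was waiting.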
38:12 – 41:57
Energy as the next bottleneck: efficiency, grid upgrades, and sustainable generation
1. SPSpeaker
  I have one more follow-up question on that, which is, you know, one of the things you've always done well is kinda spot bottlenecks one generation ahead and then try to sort of pre-solve for them in the supply chain. A year ago, photonics ended up becoming a huge solution. As we look to energy as a bottleneck, you know, copper wires, literally copper wires, are one of the transmission bottlenecks. How does that get solved in your view?
2. JHJensen Huang
  Energy is just everywhere. Well, first, the first thing that we could do that is in our control, you know, as with everything in life, whatever the problem is, whatever the external concerns are, you should do something that's in your control, and in our control is energy efficiency. So if you look at tokens per watt, we improved it by 50x, and then we'll have to keep on improving it by significant factors, and it compounds. So that's the first thing we can do. We can control that through co-design architectures and things like that. And the second thing that we could do is inspire people, and that's through a lot of education, inspire the ecosystem to get ready for this. And I've been, over the last half decade, helping people understand the amount of compute that's likely to be coming. And I just told you guys something about how I reason through how much energy is gonna be necessary. The amount of energy that we need for computing is likely, you know, probably a thousand times more than we currently have, and that's an enormous amount of energy. However, the way to think about that is in the future, computers are gonna be two things. It's always gonna be generated, because it's intelligent, it's contextually aware, so it's gonna be generated, and then number two, it's gonna be continuous. And so you've got generative computing in a continuous way, compared to pre-recorded, retrieval-based computing that is only initiated, you know, per use. The question is how do you think about the amount of energy necessary for that? So I think if you say we need 1,000 times, I wouldn't be surprised if we're off by a couple of orders of magnitude. And so we need a lot more compute, we need a lot more energy. 
And so you gotta go and explain this to people. You gotta explain it to people in a way that's kind of common sense, that they can observe, and there are, you know, indicators along the way that in fact this is happening. And notice just as I was breaking it down for you guys, you know, I'm reasoning about it for you so that it's common sense to you. And so the amount of energy is high. And then lastly, the source of energy. Now, there's all kinds of sources of energy, but unfortunately, because of great concerns about the cost of sustainable energy, we under-invested in sustainable energy. But this is the best time ever in the history of humanity to go and invest in sustainable energy, and the reason for that is because the market forces are so strong. Back in the old days, you needed government subsidies to go build solar farms and government subsidies to go build nuclear plants, and now the market will pay you to do it. And so market forces are so powerful right now, this is our best chance to upgrade our grid, our, you know, archaic grid, add sustainable energy of all kinds and, you know, this is a great time.
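The relationship between efficiency gains and total energy demand can be written as one line of arithmetic. The sketch below assumes, purely for illustration, that energy scales as tokens generated divided by tokens per watt, and that efficiency improves by the same factor each hardware generation; the function name and the demand figure are hypothetical, while the 50x efficiency gain and the roughly 1,000x energy estimate are the numbers mentioned in the talk.

```python
def energy_multiplier(token_demand_growth: float,
                      efficiency_gain_per_gen: float,
                      generations: int) -> float:
    """Growth in energy use if energy = tokens / (tokens per watt).

    Efficiency gains compound across generations, so they divide demand
    growth by efficiency_gain_per_gen ** generations.
    """
    return token_demand_growth / efficiency_gain_per_gen ** generations


# Illustrative only: what token growth would be consistent with a 1,000x
# energy increase after one 50x efficiency generation?
implied_demand_growth = 1_000 * 50
print(energy_multiplier(implied_demand_growth, 50, generations=1))
# The same demand after two compounding 50x generations needs far less energy.
print(energy_multiplier(implied_demand_growth, 50, generations=2))
```

The design point is that efficiency is the only term a chip designer controls directly, and because it compounds, each additional generation changes the energy requirement by a large factor for the same workload.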
3. SPSpeaker
  In terms of education, what I've learned as well: we designed the class for the students here. Turns out a lot more people, especially a lot of investors and capital allocators, are watching this next segment.
4. JHJensen Huang
  Oh, is that right? Oh, shucks.
5. SPSpeaker
  Let, let me put it up. Uh, yeah. Um, and so if there's edu-
6. JHJensen Huang
  I'm just kidding
7. SPSpeaker
  ... if there's education you'd like to do to that audience, feel free to drop in a, you know, uh, repeating yourself after a while with, with capital allocators can get-
8. JHJensen Huang
  No
9. SPSpeaker
  ... repetitive.
10. JHJensen Huang
  I don't mind.
41:57 – 47:23
Career advice: seek resilience through struggle, not only passion
1. SPSpeaker
  So if you'd like to transmit, feel free, feel free to. Um, what is the next question we should take? The question is how best to spend their mental faculties over the next few years.
2. JHJensen Huang
  Yeah, so first of all, on the pain and suffering comment, there's some advice that says you should choose what you love and what you're passionate about. That's what your career should be. And I think that's terrific, you know, if you happen to know what you're passionate about, if you happen to know what you love. But I think there are a lot of people who don't know what they're passionate about and they don't know what they love, and the reason for that is because nobody knows everything. How could you know what you don't know? So in a lot of ways, the idea that you would only choose careers that give you passion, that make you happy, is a bar that I think is too high, number one. And the reason for that is because whatever you decide to do for a living, whether it's you found something that you're passionate about, or this is your job. And in my case, you know, I used to be cleaning toilets and bussing tables. It was my job. And I will do the best I can in my job. Whatever you give me as a job, I will do the best I can possibly do, and I do that today. Now, there's a misunderstanding that somehow CEOs, we love our job. And, you know, many CEOs, "Oh, I'm passionate about my job. I love my job." They're lying. There's not one CEO who can say that, you know, from the moment I wake up to the moment I go to bed is just zippity-doo-dah. The fact of the matter is, I really love doing 10% of my work, and 90% of my work is hard, and I do it to the best of my ability anyhow, and I suffer through it. I literally suffer through it. I prefer to do something else, that other 10%. 
But that other 10%, there's only so much quantity of that, and, and every company has abundance of problems, and there comes in different types. And you're going through life, you're gonna have abundance of problems. They're gonna come in different types. And you just have to learn how to condition yourself to want to get to a better state, no matter how hard. To get better, no matter how hard. And that's suffering. You know? You're, you don't like doing it, but you're doing it with all your might anyways. What do you call that? That's suffering. And so, so I think that when you, when you suffer and you have the benefit of struggle, and you, you're being presented with many opportunities like that, it teaches you resilience. And when the, when the time comes and, and the world or your family or your company or your colleagues, they need you to be tough. They need you to be resilient. They just need you to be able to fight through it. You can't have tho- You don't have that character about you, you don't have that muscle unless you've gone through it a whole bunch of times. And so, you know, I'm, I'm advising that, that, that you not, you not seek for just joy, that you also seek for some, some pain, some suffering because y- you're gonna need it someday. And, and then lastly, it's also, it's just your job, you know?
3. SPSpeaker
  As Preacher Huang once said, "Don't wake up with a loser mindset." [laughs] The question is-
4. JHJensen Huang
  Mm-hmm
5. SPSpeaker
  ... what's your favorite order at Denny's?
6. JHJensen Huang
  Yeah, Corvallis really should have a Denny's. Um-
7. SPSpeaker
  [laughs]
8. JHJensen Huang
  Well, you know, after all these years, frankly, it's about time, right? And so there was that, there was that, that one Chinese restaurant, um, and, uh, and Woodstock's, of course, right? Corvallis Woodstock's Pizza. It, it's still pretty good, isn't it, Woodstock's?
9. SPSpeaker
  It's solid. I like American Dream better.
10. JHJensen Huang
  American Dream's better? Okay. All right. I'll, I'll, I'll be back there soon enough. And so, so, um, uh, Denny's, I would say, surprisingly, the fried chicken is really good. So, you know, it's, uh, slightly on the, on the sweet side. Uh, Super Burger is excellent if done right. And, um, uh, and then another one, if they're willing to make it for you, uh, make it like a Super Burger, okay? But as a grilled ham and cheese with tomato and mustard. And if they're willing to make it for you, that they're willing to make it for me, and so [laughs]
11. SPSpeaker
  [laughs]
12. JHJensen Huang
  But that's because I'm, not, not because, because I, I'm an alum. They know that, "Hey, you used to bus tables here."
13. SPSpeaker
  That's right.
14. JHJensen Huang
  "Yeah, yeah. We'll make it special for you." Uh, but, but tho- those are all good. You know, the, the Grand Slam, you know, I enjoy it. Uh, l- like a Pigs in a Blanket, so that's pretty good. Um, there's a whole bunch of stuff. Goodness, I could go all day. I, uh, at, at Denny's, I had my first fudge, hot fudge sandwi- uh, sundae. I had my first, uh, apple pie with cheese on top. I, I, that's like, for a, for a Chinese kid it's like, "What is that about? That doesn't make any sense." And but now you think about it, it makes perfect sense, you know, apple and cheese. But anyways, I, I had a whole bunch... It was, uh, I had my first milkshake when I was at Denny's. Um, I had a whole bunch of firsts, yeah. Denny's, Denny's was, uh, eye-opening for me.
15. SPSpeaker
  Man, before we lose you [laughs] to the, to memory lane, next question, please. [laughs]
16. JHJensen Huang
  Those are some of the most important questions. [laughs]
17. SPSpeaker
  Agreed, yes.
18. JHJensen Huang
  [laughs]
47:23 – 52:52
Policy and geopolitics: GPUs aren’t ‘atomic bombs,’ and restricting markets harms industry
1. SPSpeaker
  The question's about your thoughts on adversarial countries getting access to, uh, NVIDIA chips.
2. JHJensen Huang
  Uh, first of all, so you, you know what we, we make for a living. We make GPUs. And, and, um, uh, GPUs are used for, uh, video games. Uh, they're used for, uh, delivering soy sauce. They're used for medical imaging. Uh, if you, uh, had a CT scan done, done yesterday, I'm fine, uh, but that behind it was NVIDIA. Uh, NVIDIA's in every single medical imaging system in the world. Uh, and, uh, and so the question is what is it that you build? Um, what I'm, what I'm fundamentally against, and it makes no sense, it makes no sense to this moment, is to compare NVIDIA GPUs to atomic bombs. There are a billion people with NVIDIA GPUs. I advocate NVIDIA GPUs to all of you. Uh, I advocate NVIDIA GPUs to my family, to my kids, uh, to people I love, but I don't advocate, uh, atomic bombs to anybody.
3. SPSpeaker
  [laughs]
4. JHJensen Huang
  So that analogy is stupid. And so if you start from there, you can't finish a thought. If you start from believing that, you can't finish the rest of the thoughts. The second idea that I consider completely ridiculous: why should American companies go compete in foreign countries? You're gonna lose it anyways. You're gonna lose it anyways, so why go? Well, if you guys all apply that same philosophy, why wake up in the morning? And so I don't subscribe to, "We are gonna lose anyways." I don't subscribe to that. If you want me to lose, you're gonna have to deal it to me, but, you know, I'm gonna have to put up a fight, and I've put up a lot of fights over the years. I'm doing okay. And, as you know, the battle, the competition, serves markets. It enhances your company. I'm not a little bit afraid of having to go and compete in the marketplace. But the idea that I'm gonna lose anyway, so why go compete, makes no sense to me. And then lastly, the idea that somehow we should deprive certain countries of general purpose computing, and we can all acknowledge now NVIDIA's a general purpose computing company, and I just gave you a whole bunch of general purpose use cases, as a general purpose computing company, to be deprived of that so that one or two companies could benefit from depriving other people of it, that makes no sense either. Why should one industry suffer so that another one or two companies benefit? The American technology industry is one of our national treasures. You are gonna be part of it. And if I do my job, when you are done graduating, you're gonna graduate into the mightiest industry in the history of humanity. 
But if we give it up for some reason, or we through policy decide that we can't go and sell and concede two-thirds of the market to the wo- two-thirds of the world to other companies, by the time that you graduate, you would've gone into a shell of an industry. That shell of an industry we've seen before. A long time ago, the same arguments win- went against America in telecommunications. Today, America has no telecommunications fundamental technology anymore. It was all li- it was all completely policied out of our country. And so somebody has to put up a fight for that. Some of these reasoning systems, to, to, to say that AI is, AI's gonna come and it's gonna be a singularity moment, that singularity me- that moment, the moment it comes, it's gonna be the most powerful thing in the world, it's come, come as a flash. We have no idea whether it's gonna g- come on Wednesday or Thursday at 7:00, but when it comes, it's gonna be game over. Some percentage chance that it'll be the end of society as we know it. Come on, we all watched Dune. We don't have to repeat it. And, and so I think that living, living, living their fantasies out, their science fiction fantasies out, uh, in, in, in, in public, uh, demonstration when everybody is relying on their words and believing their words is irresponsible. It is not true. It is not true that we have no idea how these systems work. It is not true. It is not true that the technology is gonna some- somehow, uh, in some nanosecond become infinitely powerful, and therefore it's gonna take over the world. It is not true. It is not true there's no way to defend against it. It is not true. These things are all being made up, and it's made up in a way that unfortunately even harms all of you. You're in computer science. You're hoping that when you graduate, people care about computers. We wanna create a future that is optimistic about the technology that you are learning to master. S- We wanna create that future. 
We wanna make sure that America, we wanna make sure that everybody benefits from AI. Everybody should have AI. Nobody should have nuclear bombs. Can you guys agree with that?
5. SPSpeaker
  Yeah.
6. JHJensen Huang
  And so, okay. [audience applauding] And so, so young man, young man, thank you for triggering me. I'm just kidding. [laughs]
7. SPSpeaker
  Okay, so-
8. JHJensen Huang
  I'm just kidding.
9. SPSpeaker
  We're-
10. JHJensen Huang
  I'm just kidding
11. SPSpeaker
  ... we call this-
12. JHJensen Huang
  I just wanted to get, get it out.
52:52 – 57:20
Why universities can’t get enough compute: budgeting and aggregation, not chip supply
1. SPSpeaker
  So we're rational optimists here at AI Coachella, so believe in optimism. I'm gonna push back a little bit on a different angle. I completely agree, reasoning by analogy is a problem. Once you start with bombs, you should do first principles. What we are observing is that we are compute-constrained in America. Independent teams, startups, universities, they can't get compute. So from a preference order perspective, shouldn't America get first priority to a scarce resource before we start shipping it off?
2. JHJensen Huang
  Absolutely.
3. SPSpeaker
  But that's not happening.
4. JHJensen Huang
  Absolutely not.
5. SPSpeaker
  [laughs] There's the gotcha.
6. JHJensen Huang
  Yeah. Absolutely and absolutely not.
7. SPSpeaker
  Why not?
8. JHJensen Huang
  And the question is why not?
9. SPSpeaker
  Mm-hmm.
10. JHJensen Huang
  There's plenty of chips. You guys, if the president of Stanford places an order, I promise you I'll deliver it. I have... No, absolutely.
11. SPSpeaker
  You guys heard it here. All right. A head of-
12. JHJensen Huang
  Un- un- this is not funny. This is not funny. Um-
13. SPSpeaker
  We are dying out there.
14. JHJensen Huang
  No, no. No, no. This is not funny. That's right. This is a serious matter. Um, it is not, it is not true, it is not true that people are giving me orders, placing orders, and we're not delivering chips. It is just not true. You gotta, you gotta place orders. The fact of the matter is, the fundamental problem is actually something very different.
15. SPSpeaker
  Hmm.
16. JHJensen Huang
  Stanford needs compute. Science needs compute. The fundamental problem is the system is no longer built to be able to deliver massive-scale compute, and the reason for that is because, just think, all of the research departments here at Stanford, they're all in different departments. You all raise your own funding. You all get your own grants. Nobody's gonna go share their grants, but none of the grants are big enough to have a large enough compute that you use some of the time, but when you use it, you need it to be incredible. The world moved away from those centralized computing environments towards everybody just using laptops. This is today's computing environment. And fundamentally, all the universities, Stanford's not alone, you don't have a budget for a billion-dollar compute. It doesn't exist.
17. SPSpeaker
  But whose fault is that?
18. JHJensen Huang
  Stanford's. And the reason, w- the reason why you have to say that is because I'm empowering... When somebody is at fault, you empower them to solve it. Do you agree? When you s- "Oh, yeah, it's not your fault. Son, it's not your fault. Your failure, it's not your fault. It's not your fault."
19. Speaker
  He's talking to me right here. [laughs]
20. Jensen Huang
  You know? Uh, he's, uh, you know, "Hey, son, uh, you're an idiot. It's not your fault." No, it's absolutely your fault. And, and so by saying that it's absolutely your fault, you're also empowering yourself to solve it. Isn't that right? You're empowering yourself to solve it. And so the question that you just talked to somebody who kind of feels, you know, uh, I can do something about my future. Um, you're talking to somebody who, who's, who believes in that, okay? And so if I were Stanford, you just have to s- you have to find a way to, to, to change the way you do budgeting, the way you deal with computing. You have to find a way to aggregate and build yourself a linear accelerator, just like Stanford has done in the past. We need to build campus-wide supercomputers that everybody shares. Now, you could also go and just contract somebody else to do it. I mean, that's, that's all possible. But you do need to have, you know, a billion dollars. You need to have some reasonable fund, uh, to go buil- buil- build something like this because that's how much it costs. But that's, that's just what it takes.
21. Speaker
  I mean, last I checked, we've got a, what? $40 billion endowment here. How would you put that to use if you were, if you were stepping into the-
22. Jensen Huang
  I would cut a billion dollars of it right away and give it to somebody as a cloud service and have every single, uh, student and every researcher here, uh, have access to, to, uh, to, uh, uh, AI supercomputers. I would do that right away. Now, of course, of course, you've got to go plan things. You don't-- If you want to buy a billion dollars worth of tomatoes, you don't show up to the grocery store and say, "Hi." And then, and then, and then they don't have a billion dollars of groc- groc- tomatoes, and you go, "Aha, you're withholding tomatoes from me." [laughs] That's just ridiculous. And so, so, you know, so you gotta do some planning. And so what you ought to do is you gotta say, "Next year, we need to have a billion dollars worth of computing for Stanford and, and, uh, and f- we'll go build it."
23. Speaker
  All right, you know what? We'll move on, but thank you for that.
24. Jensen Huang
  Yeah. Yeah, yeah, exactly. [audience applauding]
25. Speaker
  We'll come back to that one next time. [laughs]
26. Jensen Huang
  [laughs]
57:20 – 1:08:20
Leadership reflections: CEO joys and vulnerabilities, early mistakes, and forecasting in fog
1. Speaker
  Yeah. What, what is the best and worst part of your job?
2. Jensen Huang
  The-- When you're, when you're CEO of a company, you, you have the benefit, you have the benefit of, of a lot of, uh, really fun things. Like for example, uh, you're, you're really the person who has to conceive of the intersection between vision and strategy and execution, okay? And so, so you have to live in that, in that world, and it gives you-- A- and when you're a company with capability, and I'm surrounded by amazing computer scientists and many of them from, from Stanford, and when you're surrounded by, by people like that, when you have a vision, it's very realizable. And because you're with amazing people, your vision is more ambitious, okay? And so, so I think, I think that's the fun part. The not fun part, so, a- and so that fun part I get to do almost all the time. I'm always constantly, um, updating my, my, my view of the future and my vision of the future and, and w- our role in it and, and how we ought to reinvent ourselves so that we could, you know, contribute more to that future or, or go invent that future in the first place. And, and so as a CEO, you have, you get to live in that world, and that's fun. You're-- It's very imaginative. It's very strategic. It's, you know, highly complicated. There's no right answer. Uh, and in a lot of ways, it's, it's creativity at, at, at its most, okay? On the other hand, what comes with that power is the responsibilities for a bunch of people who joined you in that spaceship, that joined you in that, in that vessel, and they want to be, they want to help you create this future, and they're part of your team, and you feel a deep responsibility for their well-being. And so when the company's not doing well, or the company in the older days, you know, when we were in the beginning trying to find our way, uh, we probably nearly went out of business, you know, four or five times. I mean, literally almost went out of business, and we were on fumes or, or we're really flat on our back. 
And so during those times, it's embarrassing, it's humiliating, it's hard. Um, you don't know what the answer is. Oftentimes, you're in the dark. Uh, you're afraid. Uh, you know, all of those, the feelings that, that we have as humans just multiplied by, you know, a thousand, a million. And, and, uh, uh, y- you know, when you're a public CEO, uh, your face is always out there, and when you do well, uh, people are happy. When you don't do well, they're fast to tell you. And, and, um, and so you're, you know, and so it's a vulnerable, you know, for me, it's, it's a highly vulnerable profession. And, and so, uh, y- you're not naked, but you feel it, you know?
3. Speaker
  The question is, what's the biggest mistake you made in the early days of NVIDIA, and what'd you learn from it?
4. Jensen Huang
  Um, l- let me, let me give you an example of, of what somebody might say, and, and I will say I, I won't, I, I'll say that that's not. And so, so anybody who knows our history, uh, would know that the first generation of our products, uh, the architecture, the technology we used was completely wrong. It's not like a little bit wrong. It's like completely wrong. The fact that, that, that smart engineers and professionals and we were actually funded and we created this thing and it's like, check it out, doesn't work at all, you know? And so, uh, that, that using curved surfaces instead of triangles, no Z-buffer instead of Z-buffer, forward texture mapping instead of inverse texture mapping, we did everything wrong. We did everything wrong. No floating point inside. We did everything wrong. And so we made a lot of tremendously bad choices. Um, uh, and I, I'll say that, that, uh, those are technical bad choices, but it led to strategic genius moves. Um, how do you take a company that, um, had that reputation and wasted a bunch of money and a bunch of time, two and a half years, doing it the wrong way and surrounded by competition, and now here we are the only one remaining, okay? And so, so that, that transformation taught me a lot about the importance of technology is important, but strategy is so important. And so how you see the world, uh, how you approach competition, how you approach the market, uh, how do you conserve resources and apply resources, tho- those decisions, um, I learned more in my early 30s through that deep failure, uh, and the company almost vaporizing. Uh, I learned so much about strategy and strategic thinking and, and, uh, maneuvering and things like that, and it's lasted a whole, whole long time. 
The mistake that I made that I, I would say, um, w- was a genuinely straight-up mistake is when the PC, uh, uh, or, or when mobile devices took off, uh, we were approached by very important companies that, that are in, important in the mobile space, uh, to work on some mobile devices. And, and, um, uh... And the choices that I, that I made, um, uh, I think the answer f- when they approached us, the answer should've been, "Nah, not interested." But we decided to shift a bunch of our resources to go build mobile devices. And, um, I, and I thought that we could add a lot of value, but it turned-- You know, I, I think if I would've thought through it a couple more clicks, uh, the amount of value you could really deliver in, for, for the things that we know how to do and what we're good at is probably marginal at best. And so, uh, I shifted the company to go into mobile devices. Uh, it grew into a billion-dollar business, and, and that kind of positive reinforcement. And then shortly after, uh, during the 3G to 4G transition, uh, we were just 100% locked out. And, and, um, uh, Qualcomm was the leader in that 3G to 4G modem, mo- And that's the most important part of the phone.
5. Speaker
  Mm-hmm.
6. Jensen Huang
  Not the SoC, not computer graphics, not even the application processor. The modem is obviously the most important part. And so during that transition, uh, they were able to block us out. I could've probably called it, you know, to, to-- If you, if that circumstance were to happen again, I would've said, "Yeah, it's, it, it would be a really interesting opportunity for a couple years, but we're gonna get shut out after that, so what's the point? Like, let's go conserve our resources somewhere else." But the re- the recovery-- So we got shut out. We built it up to about a billion dollars and then went back to zero. But the recovery was I took all of that expertise, that extreme low power and energy efficiency expertise, and I shifted all to, um, an application that didn't exist at the time called robotics. And so all of the, the... Somebody mentioned Thor. Uh, Thor is the great, great, great, great grandson of the chip that we were using, um, uh, in mobile devices. And that, that entire g- genealogy and all the teams and all the expertise that we, we built up, uh, was really helpful to getting here. And so it doesn't-- That's rationalization. Um, going into that market in the first place was a waste of time, and so that, that I think is a strategic mistake.
7. Speaker
  Um, on strategy, is there-- You know, sometimes strategy's about forecasting, so precisely enough. Is there, uh, from a systems perspective, what do you think you've updated your priors on s- or what, what is the forecasting mechanism you've developed to give yourself some confidence that, like, this fog of war here, don't quite know where things are gonna go, but generally speaking, we're h- like shooting in the right direction? Is there any, is there sort of a systems, piece of a systems design advice you'd give folks on when the shape of the future is not entirely clear?
8. Jensen Huang
  Yeah, and, and in fact, in fact, you, you used all the s- the, the right words already. Um, uh, the first thing I do is, is I-- What am I observing? What am, what am I observing? And, um, based on what I observe, uh, let's reason about it back to first principles, break it all back down, and ask ourself, uh, "So what's gonna happen next?" And, uh, first, so what? Is this a big deal? Hey, deep learning, computer vision, AlexNet, you know, big deal. Is that a big deal or not a big deal? And so the big deal part of it is, my goodness, uh, in just one, you know, here, here's two engineers, right, Alex and, and Ilya and, uh, and, and Hinton, of course, and they came up with a neural network model, and boom, it crushed the, the, uh, computer vision capabilities of all the computer scientists, you know, decades before them in one shot. And so is that a big deal? Is that a big deal? Um, uh, the, the, the step up in, in quality, uh, uh, and performance was a big deal. Now, the next question is, so what's gonna happen next? How far can you take it? And then if you could do it in this way, what else can you, what, what else can you solve? Um, and if, if this was able to solve some really amazing problems, uh, what does that mean to computers and computing? And so you just keep asking yourself these questions, right? And so you're just iterating it like that all the way to first principles, and then from that, you create a mental model about the future of computing. And, uh, where is it gonna be? What can it do? For example, self-driving cars and robotics. Um, this, uh, uh, how large would models become? And, uh, if, if so, what would computers look like?
9. Speaker
  Mm.
10. Jensen Huang
  Uh, what would processing neural networks, how is that different than processing, you know, floating point numbers and integers and first principle mathematics? You know, we express everything in FP64 or FP32, but obviously neural networks don't have to do that. And so, so you, you reason through it kinda like this, and then you build up a mental model of a future, but, you know, of the future, and then where your, your company, where you are going to be within it, and then you just work backwards from there. And, and then, and then now the question, of course, is you could be wrong, and oftentimes you're, you know, if you reason about things properly, you're not completely wrong, but you're not completely right. And so I tend to, I tend to be very comfortable saying, "Okay, these are the things that, that will likely happen, and these are things that will absolutely happen, and these things may happen. And based on that, I think we ought to go in that general direction, and we'll feel our way through." And now the, now the, the, the skill of, of building companies then, of being successful along the way is you're going into this direction, and it's gonna take energy, it's gonna take time, it's gonna take money, and, and everything that time, energy, and money, that takes away from something else, right? So the cost, the, the, the opportunity cost of pursuing a strategy is the real cost. And so you just gotta ask yourself, how can you be smart enough such that the opportunity cost is reduced and your optionality is increased? And so you're trying to think through all of that stuff all the time. You know, and it's no simple answer, but, but, um, uh, in a lot of ways, uh, you're, you're trying to get the journey to pay for itself.
11. Speaker
  Given, uh, everybody's gonna mob you for more signatures, that's where we're gonna end. Thank you.
12. Jensen Huang
  Thank you very much. [audience applauding]

Episode duration: 1:08:23

Transcript of episode tsQB0n0YV3k

Computing is being reinvented: from pre-recorded software to generative, contextual systems

Stack-wide disruption: new development methods, new systems, and new applications

From GPT to agentic systems: continuous computing and what comes next

Co-design explained: why optimizing hardware + compilers + frameworks together wins

Co-design at NVIDIA: beyond Moore’s Law (the “million‑X” claim)

How education should change: learn with AI while keeping first principles

Open source vs closed models: why NVIDIA builds open foundation models

Why openness matters for safety and security: transparency and “swarms” of defense

Coalition scaling and utilization: why MFU can be a misleading metric

Measuring progress: tokens-per-watt, NVLink bandwidth, and the need for serious evals

Architecture roadmap logic: Hopper → Grace Blackwell → Vera Rubin → (future) Feynman

Energy as the next bottleneck: efficiency, grid upgrades, and sustainable generation

Career advice: seek resilience through struggle, not only passion

Policy and geopolitics: GPUs aren’t ‘atomic bombs,’ and restricting markets harms industry

Why universities can’t get enough compute: budgeting and aggregation, not chip supply

Leadership reflections: CEO joys and vulnerabilities, early mistakes, and forecasting in fog
