Lex Fridman Podcast

Jensen Huang on Lex Fridman: Why CUDA almost sank NVIDIA

NVIDIA absorbed fifty percent cost increases on GeForce to seed the CUDA install base; today's agentic scaling runs on foundations that nearly broke the company.

Lex Fridman (host) · Jensen Huang (guest)
Mar 22, 2026 · 2h 25m · Watch on YouTube ↗

FREQUENTLY ASKED QUESTIONS

Direct answers grounded in the episode transcript; the timestamp under each answer points to the moment in the source that supports it.

  1. What does extreme co-design mean at NVIDIA?

    Extreme co-design means NVIDIA optimizes the whole AI computing stack together, not just the GPU. Huang says the problem no longer fits inside one computer or one GPU, so the algorithm, pipeline, data, and model have to be sharded across many machines. Once the work is distributed, Amdahl's Law makes every subsystem matter: CPU, GPU, networking, switching, software, power, and cooling can all become bottlenecks (a short numeric sketch after this FAQ shows why). At rack scale, simply making one component faster only helps if the rest of the workload can keep up. That is why he describes NVIDIA's organization as part of the design problem too: his large staff includes experts in memory, CPUs, optics, GPUs, algorithms, and architecture, and they attack problems together instead of isolating decisions in one-on-one meetings.

    3:51 in transcript
  2. What are Jensen Huang's four AI scaling laws?

    Jensen Huang frames AI progress as four connected scaling laws: pre-training, post-training, test-time scaling, and agentic scaling. Pre-training originally depended on more high-quality data and bigger models, but he says synthetic data means training becomes compute-limited rather than data-limited. Post-training uses AI to augment ground truth and generate far more training data. Test-time scaling is inference as reasoning, planning, search, and problem decomposition, which he argues is compute-intensive because thinking is hard. Agentic scaling multiplies AI by spinning off sub-agents, like growing NVIDIA by hiring more employees. The outputs and experiences from agents feed back into new pre-training and post-training data, so Huang describes a loop where intelligence keeps scaling mainly with compute.

    23:16 in transcript
  3. How does Jensen Huang think AI data centers can use idle grid power?

    Huang argues that AI data centers should be designed to take advantage of power that already exists on the grid but sits unused most of the time. The grid is sized for rare worst-case days in winter, summer, or extreme weather, while ordinary demand may be far below peak. Instead of demanding perfect availability from utilities, he wants customer contracts, data-center design, and utility offerings to allow graceful degradation (sketched in code after this FAQ). When the grid needs power for hospitals, airports, or other infrastructure, the data center could reduce power use, move workloads elsewhere, run slower, or accept slightly longer response latency. Critical workloads could shift to locations with full capacity, and data should never be lost. His core point is that AI computing can be engineered around variable guarantees once customers, data centers, and utilities all agree on the service model.

    47:30 in transcript
  4. What is NVIDIA's biggest moat?

    Huang says NVIDIA's biggest moat is CUDA's install base, reinforced by execution speed, ecosystem reach, and developer trust. His argument is that a computing platform wins when developers believe their software will reach many users and keep improving. CUDA is in every cloud, every computer company, many industries, and many countries, so an open source package that targets CUDA first can reach a huge audience. He also says developers trust NVIDIA to keep maintaining and optimizing CUDA for as long as the company exists. That trust matters because millions of developers have ported a mountain of software onto the platform. Huang adds a second advantage: NVIDIA vertically integrates complex systems while also fitting horizontally into other companies' products, from Google Cloud, Amazon, and Azure to cars, robots, satellites, and edge infrastructure.

    1:15:20 in transcript
  5. How does Jensen Huang define coding in the AI agent era?

    Huang defines coding as specifying what a computer should build. In the AI agent era, he says the skill is no longer limited to writing lines of code; it includes describing a specification and, when needed, giving the architecture of the software. That shift could expand coding from about 30 million people to about one billion, because carpenters, accountants, plumbers, farmers, pharmacists, and other professionals can use AI to elevate their work. He also treats specification as an art. Sometimes the best instruction is highly prescriptive because the desired outcome is exact. Other times it is deliberately under-specified, leaving room for an AI or a team to explore, improve the idea, and push creativity. For Huang, the future of coding is learning where to sit on that spectrum.

    2:01:45 in transcript

Answers are AI-generated from the transcript and may contain errors; check the timestamps above against the source.
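
The first answer cites Amdahl's Law as the reason a faster GPU alone cannot carry a distributed workload. Below is a minimal numeric sketch of that law; the 90/10 split between accelerated and non-accelerated work is an illustrative assumption, not a figure from the episode.

```python
# Minimal sketch (not from the episode) of Amdahl's Law: if only part of
# the work benefits from a faster component, the untouched part caps the
# overall speedup.
def amdahl_speedup(accelerated_fraction: float, component_speedup: float) -> float:
    """Overall speedup when only `accelerated_fraction` of the runtime
    benefits from a component that is `component_speedup` times faster."""
    serial_fraction = 1.0 - accelerated_fraction
    return 1.0 / (serial_fraction + accelerated_fraction / component_speedup)

# Assume 90% of a distributed training step is GPU work and 10% is CPU,
# networking, and I/O (illustrative split, not a measured figure).
for gpu_speedup in (2, 10, 100):
    overall = amdahl_speedup(0.90, gpu_speedup)
    print(f"GPU {gpu_speedup:>3}x faster -> {overall:.2f}x overall")

# Prints roughly 1.82x, 5.26x, 9.17x: no GPU, however fast, pushes the
# system past 1 / 0.10 = 10x unless the rest of the stack improves too,
# which is the case for co-designing CPUs, networking, software, power,
# and cooling together.
```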
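
The third answer describes graceful degradation: a data center that gives power back to the grid on rare peak days instead of demanding firm capacity. The sketch below is one hypothetical way to express that policy; the job names, power figures, and the `plan_under_grid_cap` scheduler are illustrative assumptions, not anything NVIDIA or a utility actually runs.

```python
# Hypothetical sketch of a power-aware scheduler: critical work always
# runs, batch work is deferred or migrated when the grid asks for power
# back. All names and numbers below are assumptions for illustration.
from dataclasses import dataclass

@dataclass
class Job:
    name: str
    critical: bool      # e.g. live serving traffic vs. batch training
    power_mw: float     # assumed power draw while running

def plan_under_grid_cap(jobs: list[Job], grid_cap_mw: float):
    """Keep critical jobs running, then fit batch jobs under the remaining
    power budget; anything left over is deferred or migrated elsewhere."""
    run, defer = [], []
    budget = grid_cap_mw
    # Critical work is never shed; only batch work degrades gracefully.
    for job in sorted(jobs, key=lambda j: not j.critical):
        if job.critical or job.power_mw <= budget:
            run.append(job.name)
            budget -= job.power_mw
        else:
            defer.append(job.name)
    return run, defer

jobs = [Job("inference-serving", True, 20.0),
        Job("pretraining-run", False, 60.0),
        Job("eval-sweep", False, 15.0)]
# Normal day: plenty of headroom. Heat-wave day: utility reclaims capacity.
print(plan_under_grid_cap(jobs, grid_cap_mw=100.0))  # everything runs
print(plan_under_grid_cap(jobs, grid_cap_mw=50.0))   # pretraining deferred/migrated
```

The design choice mirrors Huang's framing: critical workloads are never shed and data is never lost, while batch work is the slack that absorbs grid variability.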
