The Twenty Minute VC
Jonathan Ross, Founder & CEO @ Groq: NVIDIA vs Groq - The Future of Training vs Inference | E1260
At a glance
WHAT IT’S REALLY ABOUT
Groq’s Jonathan Ross Redefines AI Inference, Chips, and Global Power Dynamics
- Jonathan Ross, founder and CEO of Groq, explains why AI inference, not training, will be the dominant long‑term bottleneck and economic driver, and how Groq’s LPU architecture is built specifically to win that market. He argues that synthetic data, better algorithms, and massive compute will push scaling laws further than most people expect, while energy and power infrastructure become true hard constraints. Groq’s model avoids high‑margin GPU training, instead targeting low‑margin, ultra‑high‑volume inference with a capital‑light, revenue‑sharing deployment model, exemplified by its multibillion‑dollar Saudi deal. Ross also examines NVIDIA’s position, China’s AI trajectory, Europe’s strategic choices, and the societal risks of abundance and over‑delegation to AI, framing Groq’s mission as preserving human agency in the age of AI.
IDEAS WORTH REMEMBERING
5 ideas
Inference will dwarf training as the dominant AI compute market.
Ross notes that at Google, inference consumed 10–20x the compute of training, and argues that as AI becomes embedded in every workflow, the need to serve tokens (inference) will massively outstrip one‑off model training.
Synthetic data and smarter training pipelines can extend scaling laws.
By iteratively training models to generate higher-quality synthetic data, pruning errors, and retraining on that curated data, the effective scaling curve becomes much steeper than traditional token counts suggest, blurring the idea of a near-term ‘scaling limit.’
Groq’s LPU design trades chip count for radically better inference efficiency.
Instead of relying on scarce, power-hungry HBM, Groq keeps model parameters on-chip across many LPUs, avoiding expensive external memory traffic and delivering ~3x better energy efficiency per token and more than 5x lower cost per token than GPUs for inference.
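The efficiency claims above can be turned into a rough cost comparison. This is a hedged arithmetic sketch: the baseline figures (joules per token, electricity price, amortized hardware cost per token) are invented placeholders, not published Groq or NVIDIA numbers; only the ~3x energy and ~5x overall cost multipliers come from the summary itself.

```python
# Sketch: how per-token efficiency multipliers translate into serving cost.
# Baseline numbers below are hypothetical; only the ~3x energy and ~5x cost
# multipliers are taken from the claims in the summary.

def cost_per_million_tokens(energy_j_per_token: float,
                            price_per_kwh: float,
                            hardware_cost_per_token: float) -> float:
    """Total serving cost (energy + amortized hardware) per 1M tokens."""
    joules = energy_j_per_token * 1_000_000
    energy_cost = joules / 3_600_000 * price_per_kwh  # J -> kWh -> dollars
    return energy_cost + hardware_cost_per_token * 1_000_000

# Hypothetical GPU baseline: 1.5 J/token, $0.000002 hardware cost per token.
gpu = cost_per_million_tokens(1.5, 0.10, 2e-6)
# Applying the claimed multipliers: ~3x less energy, ~5x cheaper hardware
# amortization puts the LPU figure close to one fifth of the GPU cost.
lpu = cost_per_million_tokens(1.5 / 3, 0.10, 2e-6 / 5)
print(f"GPU: ${gpu:.4f} / 1M tokens, LPU: ${lpu:.4f} / 1M tokens")
```

Under these made-up inputs the hardware term dominates, so the overall ratio lands near the claimed ~5x; with different electricity prices the split between the energy and hardware terms would shift.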
NVIDIA will likely remain dominant in training while inference unbundles.
Ross sees NVIDIA’s GPU stack as a solved problem for training and expects them to sell every GPU they can manufacture into that high-margin market, while Groq and others absorb the low‑margin, high‑volume inference workloads.
Groq’s capex-light, revenue-share model removes capital as the main constraint.
Partners like Aramco fund the hardware build‑out; Groq repays with a targeted IRR and then flips more upside to itself, so growth is governed by chip supply and deployment capacity rather than Groq’s own balance sheet.
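The repayment mechanism described above can be sketched in a few lines. All numbers here (capex, revenue stream, target IRR) are invented for illustration, and the waterfall logic is a simplified assumption; the summary describes only the general structure of the deal, not its terms.

```python
# Hedged sketch of a capex-light revenue-share structure: the funding
# partner is repaid first, with their balance accruing at a target IRR,
# and once the balance is cleared the remaining cash flips to Groq.
# All figures are hypothetical.

def split_revenue(capex: float, target_irr: float,
                  yearly_revenue: list[float]) -> list[tuple[float, float]]:
    """Return (partner_share, groq_share) for each year of revenue."""
    owed = capex  # outstanding balance owed to the funding partner
    splits = []
    for rev in yearly_revenue:
        owed *= 1 + target_irr          # balance accrues at the target IRR
        partner = min(rev, owed)        # partner is paid first, up to balance
        owed -= partner
        splits.append((partner, rev - partner))
    return splits

# Hypothetical deal: $100M funded at a 15% target IRR, growing revenue.
for year, (p, g) in enumerate(split_revenue(100.0, 0.15, [40, 60, 80, 100]), 1):
    print(f"year {year}: partner ${p:.1f}M, Groq ${g:.1f}M")
```

The point of the sketch is the constraint it makes visible: once the partner's target return is met, growth depends on chip supply and deployment capacity rather than Groq's balance sheet.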
WORDS WORTH SAVING
5 quotes
Your job is not to follow the wave, your job is to get positioned for the wave.
— Jonathan Ross
We did not raise 1.5 billion, that's revenue. That's actually about 30% of the revenue of OpenAI.
— Jonathan Ross
You could almost say we're one of the best things that's ever happened to NVIDIA... we'll take the low margin, high volume inference business off their hands.
— Jonathan Ross
When you are growing faster than exponential, there is no amount of profit that you can make that matters. What matters is getting a toehold in the market and becoming relevant.
— Jonathan Ross
To preserve human agency in the age of AI, we need to be one of the most important compute providers in the world.
— Jonathan Ross
High quality AI-generated summary created from speaker-labeled transcript.