The Twenty Minute VC
Cerebras CEO on the Future of Data Centres, Token Costs & Memory | Should US Companies Sell to China
At a glance
WHAT IT’S REALLY ABOUT
Cerebras CEO on AI compute bottlenecks, costs, geopolitics, and adoption
- Feldman argues today’s AI buildout is not a classic bubble because data-center and chip supply are trailing real, current demand, creating persistent backlogs across the industry.
- He predicts multi-year memory (HBM) shortages because capacity additions are lumpy, capital-intensive, and slow, while AI usefulness has reached a tipping point that sustains demand growth.
- He expects long-run token costs to fall via architectural and efficiency improvements across the whole chip industry, while emphasizing speed as a decisive competitive advantage for agentic workflows and search-like experiences.
- He claims the biggest near-term blocker to enterprise AI adoption is organizational risk control—security teams and lawyers—before data cleanliness becomes the next constraint.
- On geopolitics and industrial policy, he opposes selling leading-edge chips to China, stresses rebuilding US fab/packaging capability, and criticizes Europe’s regulate-first posture as a drag on innovation and adoption.
IDEAS WORTH REMEMBERING
5 ideas
AI infrastructure is constrained by supply, not speculative oversupply.
Feldman contrasts AI with past bubbles (rail/fiber) where capacity got built ahead of demand; he says today’s reality is backlogs everywhere because data centers and supporting infrastructure can’t be built fast enough.
Memory shortages are structurally hard to fix quickly.
HBM is supplied by only a few vendors and expanding fab capacity is a multi-year, tens-of-billions step function; if demand stays high, he expects shortages for “the next several years.”
A 2025 usefulness inflection is what turns training hype into inference-driven demand.
He claims models crossed a threshold where everyday users (across demographics) get real value, so inference usage explodes and keeps pulling forward demand for compute and power.
Speed is a moat because many AI products have near-zero tolerance for latency.
He argues there’s effectively “zero market for slow” AI experiences (search, coding agents, workflows), so large speedups (not just 10–20%) can change competitive outcomes by enabling many more tasks per day.
Token costs should decline over time even if near-term components spike.
Despite HBM-driven GPU cost inflation, Feldman expects the industry’s historical trend—more performance per dollar and per watt through better designs—to continue, pushing compute unit costs down over a 3–5 year horizon.
WORDS WORTH SAVING
5 quotes
The infrastructure build-out is behind demand. We can't build data centers fast enough to keep up with demand. We have a twenty-five billion dollar backlog.
— Andrew Feldman
Somewhere in twenty twenty-five, the models got smart enough to be really useful. Before that, Harry, these were sort of, sort of a novelty.
— Andrew Feldman
For hard problems, there is no upper bound to how much faster you wanna be, nor the value of speed.
— Andrew Feldman
Really. H-It's zero. How big is the market for dial-up, for slow internet? ... That's how impossible it is to engage with an important technology slowly. Why do we believe the inference will be any different? There'll be zero market for slowing them.
— Andrew Feldman
No. No, the biggest are, are lawyers.
— Andrew Feldman
High quality AI-generated summary created from speaker-labeled transcript.