Cerebras CEO on the Future of Data Centres, Token Costs & Memory | Should US Companies Sell to China

Andrew Feldman is the co-founder and CEO of Cerebras Systems. This month, Cerebras went public achieving a market cap of $70BN, the largest semiconductor IPO in history. Cerebras has a massive commercial backlog with a monumental, multi-year $20 billion compute agreement from OpenAI.

In Today’s Episode We Discuss:
00:00 Intro
02:18 Is There an AI Infrastructure Bubble?
07:35 Memory Shortages Will Last Years
09:35 2025: The Year AI Became Actually Useful
11:33 Will Frontier Models Commoditize Like Cloud Did?
16:34 Can Google Win by Owning the Full Stack From TPUs to Tokens?
32:53 Data Centers & Local Communities
33:10 AI Layoffs
38:15 The Real Blocker to Enterprise AI Adoption
44:04 Should the US Be Selling Chips to China?
47:00 Why Europe Can't Build Great Tech Companies
53:48 Timing the Cerebras IPO: Luck or Strategy?
57:10 Is the Trump Administration Better for Business?
58:53 Quick-Fire Round

Subscribe on Spotify: https://open.spotify.com/show/3j2KMcZTtgTNBKwtZBMHvl?si=85bc9196860e4466
Subscribe on Apple Podcasts: https://podcasts.apple.com/us/podcast/the-twenty-minute-vc-20vc-venture-capital-startup/id958230465
Follow Harry Stebbings on X: https://twitter.com/HarryStebbings
Follow Andrew Feldman on X: https://twitter.com/andrewdfeldman
Follow 20VC on Instagram: https://www.instagram.com/20vchq
Follow 20VC on TikTok: https://www.tiktok.com/@20vc_tok
Visit our Website: https://www.20vc.com
Subscribe to our Newsletter: https://www.thetwentyminutevc.com/contact

#20vc #harrystebbings #andrewfeldman #cerebras #ceo #founder #ai #nvidia #chips #china #ailayoffs #ipo

Andrew Feldman (guest), Harry Stebbings (host)

May 26, 2026 · 1h 7m

CHAPTERS

0:00 – 4:30
AI infrastructure isn’t a bubble: demand is outrunning supply
Andrew argues today’s AI buildout is the opposite of past infrastructure bubbles like fiber or rail. Instead of building ahead of demand, the industry is scrambling behind demand with major backlogs across chips and data centers.
- AI infrastructure is constrained by real, present demand (not speculative build)
- Cerebras cites a $25B backlog; Nvidia/AMD also have backlogs
- Data centers, not just chips, are the gating factor
- Why “trying to catch up” doesn’t match typical bubble dynamics
4:30 – 7:14
Why delays can be healthy: metering demand, and OpenAI’s forecasting edge
They discuss the idea that construction, permitting, and supply constraints can “meter” usage and smooth the market. Andrew highlights OpenAI’s willingness to act early on exponential compute forecasts as a competitive superpower.
- Permitting/build delays can act like freeway metering to reduce volatility
- OpenAI anticipated exponential compute needs and contracted early
- Belief in forward demand (1–3 years out) is a strategic advantage
- Not all compute can be bought instantly at the same quality/terms
7:14 – 9:28
Memory (HBM) shortages: why they’re acute and why they last for years
Andrew explains how explosive AI demand stresses the whole supply chain, with HBM memory as a key bottleneck after leading-edge fab capacity. Because adding supply requires massive, multi-year fab investments, shortages can persist for several years if demand remains high.
- HBM supply is concentrated (Samsung, Micron, SK Hynix)
- HBM pricing power is extreme (memory makers seeing very high margins)
- Capacity additions are lumpy: $40B fabs and ~5-year timelines
- Expectation: shortages persist for ‘the next several years’ if demand stays high
9:28 – 11:08
2025 as the inflection: when AI became truly useful and inference demand exploded
Andrew claims models crossed a usefulness threshold in early 2025, shifting AI from novelty to daily utility. That usefulness drives inference demand across demographics and problem types, sustaining rapid growth if capability keeps improving.
- Training creates models; inference is where usage (and demand) explodes
- Usefulness threshold triggered broad-based adoption (not just Silicon Valley)
- Demand grows as AI tackles harder and more frequent tasks
- Compute demand tracks model usefulness: if models improve, usage keeps compounding
11:08 – 13:45
Will frontier models commoditize like cloud? Segmentation, ‘leather seats,’ and neo-cloud dependence
They explore whether model providers become utilities. Andrew argues hyperscalers deliver differentiated value (security, integrated services), while other segments will prioritize lowest-cost compute, making room for neo-clouds—though Nvidia’s strategy may be creating unhealthy dependence.
- Nvidia’s strategy: enable competitors to hyperscalers via neo-clouds
- Hyperscalers’ moat: legitimacy, security, integrated tooling (Bedrock/SageMaker/S3)
- A segment of buyers only wants ‘cheap compute’; for them, security layers become a cost
- Market segmentation mirrors traditional industries (premium vs stripped-down offerings)
13:45 – 16:14
Token economics and COGS: why compute gets cheaper, and where Cerebras is structurally advantaged
Harry presses on future costs; Andrew returns to the historical trend of falling cost per unit compute. He explains Cerebras’ supply-chain advantages versus GPU stacks (no HBM, no CoWoS reliance, different node dynamics) while emphasizing industry-wide efficiency gains over time.
- COGS outlook shaped by better designs: more tokens per second, better perf/watt
- Cerebras advantages: SRAM approach avoids HBM bottleneck and pricing
- Avoiding CoWoS constraints and oversubscribed leading-edge nodes can help supply
- Long-run industry law: major reductions in cost per unit compute
16:14 – 19:39
Can Google win by owning the full stack (TPUs-to-tokens)? The volume vs integration tradeoff
Andrew evaluates the ‘full stack’ thesis: owning everything can lower costs, but only selling TPUs internally limits volume and learning curves. They discuss why Google may expand outside its own data centers and how vertically integrated players differ from neo-cloud economics.
- Vertical integration can reduce cost from land/power to tokens
- Downside: the single-customer problem (only selling to yourself) caps volume benefits
- Google stepping outside its own DCs suggests the constraint is real
- Neo-clouds buy high-margin GPUs and then must add their own margin; integrated stacks avoid that
19:39 – 22:34
Speed as the core moat: why “slow inference” has no market
Using Cerebras’ Kimi K2 benchmark as a springboard, Andrew argues that speed dominates value in AI workflows. He contends there’s effectively zero market for slow inference, especially for coding, agents, and search-like experiences where latency compounds into competitive advantage.
- Cerebras posts major throughput gains vs GPU clouds for certain models
- For hard problems, there’s no practical upper bound on speed’s value
- Latency compounds across workflows (coding/agents/search) into decisive advantage
- Analogy: no one wants dial-up; even high pay wouldn’t justify slower internet
22:34 – 27:25
Scaling to massive deals: concentration risk, operational muscle, and multi-gigawatt ambition
Andrew describes why landing one huge customer is a prerequisite to winning many. They discuss the operational realities of fulfilling enormous contracts, and how industry thinking has shifted to treat gigawatt-scale infrastructure as normal rather than absurd.
- Path to many large customers starts with winning one and building the muscle
- Customer concentration concerns recur even as deal sizes grow
- Industry mentality shift: 20MW → 100MW → 1GW → multi-GW is becoming ‘normal’
- Compute scale implies escalating needs for power, sites, and rapid deployment
27:25 – 32:42
Data centers vs local communities: delays are normal, but neighbor relations weren’t
Andrew reframes data-center delays as standard large-construction reality, then pivots to community relations. He argues the industry failed by being opaque and cost-shifting, and proposes a ‘pay our own way’ model—closed-loop water, full grid upgrades, and tangible community benefits.
- Construction delays are inherent: supply chain, contractors, transformers, generators
- Industry mistake: insufficient transparency and poor ‘neighbor’ behavior
- Principle: don’t shift infrastructure costs to communities; pay for upgrades fully
- Practical goodwill: jobs, local investment, facilities, and responsible water use
32:42 – 42:07
AI layoffs and the enterprise adoption blocker: it’s lawyers and security, not data cleanliness
They address fears about AI-driven layoffs, with Andrew arguing many cuts were ‘AI-washed’ and tied to prior overhiring and automation. For enterprise adoption, he claims the biggest near-term constraint is legal/security risk-aversion, with data organization becoming the next constraint once governance is settled.
- Many layoffs reflect COVID overhiring and long-harvested productivity gains
- AI is beginning to have real enterprise impact, but isn’t the sole driver of cuts
- Main blocker: legal/security organizations optimized to say ‘no’ under uncertainty
- After governance, data structure/cleanliness becomes the bottleneck; disciplined orgs gain advantage
42:07 – 44:00
Open source models, China, and cost pressures: adoption vs risk and ‘tidal wave’ inevitability
Open source creates legal complexity, and the best open models increasingly come from Chinese labs. Andrew notes enterprises may take the cost savings despite legal/security discomfort, and that demand pressure often overwhelms cautious governance processes.
- Open source licensing introduces deep legal complexity
- Chinese open models (e.g., Kimi, DeepSeek, Qwen, GLM) raise added scrutiny
- Cost advantages push companies toward open-source despite risk concerns
- In practice, demand can ‘wash over’ governance resistance
44:00 – 47:18
Should the US sell chips to China? Military use, industrial competition, and chokepoint strategy
Andrew argues against selling leading-edge chips to China, emphasizing near-certainty of military use and competitive industrial leverage. He acknowledges counterarguments about keeping China in the ecosystem, but favors stronger restrictions and cites manufacturing chokepoints (TSMC/ASML) as enforceable control points.
- Consensus view: leading-edge tech sold to China will be used militarily
- Also likely: it strengthens China’s industrial competition against the US
- Counterargument exists (ecosystem dependence), but Andrew rejects it
- Chokepoints: advanced manufacturing depends on TSMC and lithography supply chains
47:18 – 50:08
Onshoring advanced fabs: why the US lost the ecosystem—and the policy fix Andrew wants
They discuss US weaknesses in long-term infrastructure policy and the strategic risk of offshore manufacturing. Andrew calls for rebuilding not just fabs but the surrounding packaging and talent ecosystem, and proposes an aggressive policy carve-out to let TSMC/Samsung build without local ordinance friction.
- US struggles with durable long-range policy and modern grid infrastructure
- Losing fabs also meant losing packaging expertise and supporting ecosystems
- Strategic priority: cutting-edge fabs and packaging onshore
- Proposed policy: 20-year exemption from local ordinances for TSMC/Samsung fab construction under proven safety rules
50:08 – 53:27
Why Europe struggles to build tech giants: regulation, risk aversion, and slower adoption
Andrew critiques Europe’s pattern of fear-then-regulate-and-tax as anti-entrepreneurial, with slower invention and slower adoption. They add nuance: Europe has strong application-layer successes, but infrastructure, chips, and frontier models remain concentrated elsewhere.
- Cultural/regulatory stance can discourage entrepreneurship and risk-taking
- Slower adoption compounds slower invention in infrastructure and models
- Notable European bright spots exist (London/Cambridge/Stockholm; app layer)
- Silicon Valley advantage: low stigma for failure and a stronger risk/reward loop
53:27 – 1:07:44
Cerebras IPO timing, public-company opportunities, and quick-fire lessons on leadership
Andrew describes the IPO as a product of persistence amid regulatory hurdles (CFIUS), with some advantage in being a rare public AI ‘pure play.’ In the closing quick-fire, he reflects on the ‘IPO tax’ of vendors, wealth changes, leadership burdens, marriage strain, and the importance of empathetic boards during long technical slogs.
- IPO outcome driven by repeated attempts, grit, and shifting regulatory climate
- Being first/only public AI pure play created investor demand and narrative clarity
- Public-company status expands options: investing, acquiring, and deeper partnerships
- Quick-fire: vendor price inflation around IPOs, making employees wealthy, sustaining relationships, and board empathy during 18-month failure cycles
