a16z
AI Eats the World: Benedict Evans on the Next Platform Shift
CHAPTERS
Why ChatGPT’s massive reach doesn’t equal daily utility (and how “AI” keeps getting redefined)
Evans opens with a puzzle: ChatGPT has enormous weekly usage, yet many people still can’t find a reason to use it day-to-day. He argues that “AI” is a moving label—once something becomes normal, we stop calling it AI—setting up a theme about hype, adoption, and shifting definitions.
- ChatGPT’s huge weekly active user count vs. relatively low daily habitual use
- The gap between awareness/accounts and recurring practical usage
- “AI” as a term that tends to mean “new stuff,” not a stable category
- AGI framed as “new scary stuff,” with ambiguous goalposts
AI as a platform shift: repeating patterns from PCs, the web, and smartphones
Evans explains his thesis for “AI Eats the World” by comparing generative AI to prior platform shifts. He emphasizes recurring dynamics: big tech industry reshuffles, bubbles, and uneven impact across industries (transformative for some, incremental for others).
- Platform shifts create winners/losers and new massive companies
- Different industries experience different levels of disruption (newspapers vs. cement)
- A framework for separating changes inside tech vs. outside tech
- “As big as the internet or smartphones” is already an enormous claim
AGI: always here, or always five years away—and nobody can model the limits
The conversation turns to the uncertainty around AGI and the lack of a clear way to forecast capability growth. Evans notes the contradiction between claims of imminent “PhD-level” agents and the practical reality of shipping developer platforms, and he stresses that unlike past shifts, we don’t understand the fundamental limits of this technology.
- AGI debate: “already here” vs. perpetually “five years away”
- Conflicting expert assertions (Altman vs. Hassabis) reflect uncertainty, not clarity
- Key difference from earlier eras: unknown physical/theoretical limits of capability
- No good model for how far or fast improvement can go, so forecasting becomes vibes-based
Will incumbents win again? Why platform-shift analogies help—but don’t predict outcomes
Evans cautions against overly deterministic “disruptive vs sustaining” framings. Using mobile as an example, he shows how a shift can both create new companies and dramatically reshape incumbents, while the biggest outcomes may be hard to foresee early on.
- Mobile changed behavior (apps, global pocket computers), not just market caps
- Platform-shift taxonomies have holes; they’re explanatory, not predictive
- Early eras show you can know a shift is big without knowing the winning shape
- Historical uncertainty: it wasn’t obvious that the internet would mean the web, or how the web would work, and smartphone winners weren’t obvious either
Bubbles and CapEx: compute spending feels like 1990s bandwidth forecasting
Evans argues that transformative tech tends to produce bubbles and that AI is no exception. He likens today’s compute buildout to the late-90s attempt to forecast bandwidth demand: many plausible parameters, huge error bars, and a risk of synchronized overinvestment.
- Bubbles are a common feature of world-changing tech cycles
- Compute demand forecasting resembles bandwidth/router forecasting in the dot-com era
- Hyperscalers treat under-investing as a bigger risk than over-investing, until that calculus flips
- Skepticism about “we can resell capacity” if everyone overbuilds simultaneously
Where AI deploys easily today: code, marketing, and narrow enterprise point solutions
Evans describes a split in adoption: some domains see immediate value, while others struggle to find everyday uses. He highlights current strongholds—software development and marketing—and notes the role of consultancies and systems integrators in embedding AI into specific corporate workflows.
- Clear near-term ROI in coding and content/marketing production
- “Boring, specific” enterprise tasks are prime targets for AI point solutions
- Large integrators (Accenture/McKinsey/Infosys, etc.) operationalize deployment
- Early wins are often about volume and speed, not perfect accuracy
The adoption gap: hundreds of millions try it, but many don’t stick—why?
Despite wide reach, only a minority uses AI daily, and Evans challenges power users to explain why others don’t. He offers hypotheses: error rates, task mismatch, habits, and the absence of productized workflows that make benefits obvious without prompting expertise.
- High weekly usage but modest daily usage; small share paying
- Not a purely “young people” adoption curve
- Barriers: hallucinations/error tolerance, unclear mapping to daily tasks
- Many users need AI wrapped into tools, UX, and workflows to become habitual
Validation and “infinite interns”: when mistakes erase the time savings
Evans explores the economics of verification. In creative and exploratory tasks, AI can generate many options and humans select among them; in precise data entry or research, errors force full checking, which eliminates the productivity advantage. He illustrates this with his critique of “deep research” outputs.
- Some tasks benefit from generate-many-then-select (e.g., marketing assets)
- For exactness (data extraction), verification can cost as much as doing it yourself
- Example: AI research outputs with wrong numbers and questionable sources
- Framing AI as “infinite interns” who need heavy guidance and supervision
New behaviors vs. old tasks: why ‘it’s bad at X’ can be the wrong critique
Evans argues that new platforms often look weak against legacy benchmarks but unlock brand-new activities. He compares dismissing genAI because of its mistakes to dismissing early PCs for not running banks, or the early web for not doing pro video editing: both critiques miss the creation of new categories.
- Disruption often succeeds by doing different things, not old things better
- Early tech critiques often focus on the wrong benchmark tasks
- Key question: what becomes newly possible that wasn’t worth doing before?
- Entrepreneurs translate raw capability into products that surface new workflows
Why UI still matters: prompts don’t replace product design and institutional knowledge
The discussion turns to how much of the stack models can absorb. Evans argues that GUIs encode institutional decisions about what users should do next; a blank prompt forces users to invent the workflow from scratch, which is why many “solutions” will remain packaged products, not raw model calls.
- “How far up the stack does the new thing go?” remains unresolved
- Products sell solutions, not technologies (law firms don’t buy ‘sentiment analysis’)
- GUIs constrain choices and embed expertise; prompts shift that cognitive burden onto users
- Chatbots ask users to specify everything: what to do, how to do it, and why
Searching for the ‘iPhone moment’ of AI: precursors, local maxima, and reinvention
Evans suggests it’s early enough that defining products will likely emerge, but they may not look like today’s chatbots. He notes that transformative products often arrive after many “good enough” iterations, and even landmark products (like the iPhone) took time to become fully functional and correctly packaged.
- Both can be true: users learn over time, and better product forms will emerge
- History: many precursors before Facebook/Google/Instagram/Tinder; categories crystallize late
- “Local maxima” can feel like progress until a step-change redesign appears
- If no new products emerge, the implication would be that foundation models capture everything
Competitive landscape: commoditized benchmarks, fragile distribution, and OpenAI’s defensibility problem
Evans observes that model benchmark parity doesn’t match consumer usage, implying distribution and brand matter more than marginal quality. He argues that OpenAI’s lead may be fragile without strong lock-in, network effects, or cost control—pushing it to expand both product surface area and infrastructure positioning.
- Benchmarks are converging while consumer adoption diverges; distribution drives usage
- Claude strong on benchmarks but limited consumer share; ChatGPT dominates mindshare
- OpenAI lacks infrastructure control and must pay for capacity, leaving its cost base vulnerable
- “Memory” is stickiness, not a true network effect; defensibility remains uncertain
Strategic questions for tech giants: Google, Meta, Amazon, Apple—and who gets disintermediated
Evans maps distinct strategic stakes for major incumbents. For Google, AI may be an extension of search; for Meta, a deeper shift in content and recommendation; for Amazon, a chance to improve discovery and intent; for Apple, the hardest question is whether AI changes computing itself or remains a service accessed on premium devices.
- Google: frontier models become another feature layer; search may remain ‘search’
- Meta: AI touches core feed/recommendation dynamics, so the imperative to own models is higher
- Amazon: opportunity is better discovery/recommendation beyond SKU retrieval
- Apple: the Siri vision was compelling but not reliably buildable; the question is platform vs. service dynamics
- Analogy: Microsoft ‘lost’ the 2000s platform war yet rode PC demand; Apple could similarly persist if apps remain
What changed since early 2023: from model questions to product and industry unbundling
Evans reflects on how the question set has shifted: the earlier focus was on scaling, NVIDIA, open source, and model count; now it’s increasingly about product strategy, market structure, and which industries get unbundled. He argues many current questions will look wrong in hindsight, just as “killer app for 3G” missed the real answer.
- Earlier questions persist (NVIDIA, scaling, open source), but product strategy questions have grown in importance
- A recurring arc: step 1, a feature; step 2, new workflows; step 3, industry redefinition
- Industries may discover their real moat was ‘boring friction’ that AI can remove
- Historical lesson: the right question often reveals itself only after new behavior emerges
What would make AI ‘bigger than the internet’: capability discontinuity, not just better tools
Evans closes by resisting unfalsifiable AGI debates while stating clearly that current systems aren’t human equivalents outside narrow constraints. For AI to be “bigger than the internet,” we’d need a fundamental, widely felt shift in perceived capability—something that changes what software and work essentially are, rather than incremental productivity gains.
- People underestimate how big the internet and smartphones already were
- Today’s models are not reliable person-replacements outside constrained guardrails
- A ‘bigger than internet’ world requires a clear capability regime change, not marginal benchmark gains
- AGI definitions are slippery (‘AI is whatever doesn’t work yet’), so the evidence will come from experience over time