Mike Krieger, Instagram CoFounder & Anthropic CPO: Where Will Value Be Created in an AI World? | E1265

Mike Krieger is the Co-Founder of Instagram and now CPO @ Anthropic.
----------------------------------------------
In Today’s Episode We Discuss:
(00:00) Intro
(00:50) Where Will Value Be Created and Sustained in a World of AI?
(01:40) Are Foundation Models Commoditised Today?
(04:31) Should Founders Build for the Models of Today or Build for Models of the Future
(06:55) Why Will Models Become More Different Than More Similar
(12:59) Will Human or Synthetic Data Be More Prominent in the Future
(18:02) Model Quality vs. Product UX
(20:12) The Competitive Landscape of AI
(31:49) Do We Underestimate China's AI Capabilities
(33:31) What Did Anthropic Learn from Deepseek
(34:44) Is Deepseek a Sustaining and Credible Threat?
(38:09) Transitioning from Model Provider to Application Provider
(43:44) What is the Role of a Software Developer in the Future
(48:31) Balancing API and Consumer Products
(52:25) Is Europe Stronger or Weaker in a World of AI
(52:59) Quick-Fire Round
-----------------------------------------------
Subscribe on Spotify: https://open.spotify.com/show/3j2KMcZTtgTNBKwtZBMHvl?si=85bc9196860e4466
Subscribe on Apple Podcasts: https://podcasts.apple.com/us/podcast/the-twenty-minute-vc-20vc-venture-capital-startup/id958230465
Follow Harry Stebbings on X: https://twitter.com/HarryStebbings
Follow Mike Krieger on X: https://twitter.com/mikeyk
Follow 20VC on Instagram: https://www.instagram.com/20vchq
Follow 20VC on TikTok: https://www.tiktok.com/@20vc_tok
Visit our Website: https://www.20vc.com
Subscribe to our Newsletter: https://www.thetwentyminutevc.com/contact
-----------------------------------------------
#20vc #harrystebbings #mikekrieger #anthropic #cpo #ai #openai #deepseek

Mike Krieger (guest), Harry Stebbings (host)

Mar 3, 2025 · 1h 2m

CHAPTERS

0:31 – 2:42
Where AI startups create durable value: GTM, domain expertise, and proprietary data
Harry opens by asking where venture-scale value will accrue in an AI-driven decade. Mike argues durability comes less from generic model wrappers and more from differentiated go-to-market, deep industry knowledge, and unique data access—especially in complex regulated verticals.
- Durable moats: differentiated GTM, domain expertise, and special/unique data
- Vertical complexity (e.g., healthcare, legal, finance) rewards non-obvious legwork
- Use foundation models as leverage; fine-tune/specialize when needed
- Long-term advantage comes from learning loops once deployed in a vertical
2:42 – 4:31
Incumbents vs new entrants in vertical AI: the trust and expectation trap
They explore whether AI’s next wave favors vertical SaaS incumbents or new startups. Mike frames it as a product-design and expectation-management problem: startups can “dream louder,” while incumbents risk breaking trust if AI features underdeliver.
- Startups can push the frontier with early adopters; incumbents face higher expectation risk
- Incumbents must evolve without alienating existing customers/behaviors
- Startups lack relationships/data but can win with a compelling future narrative
- Key challenge: don’t overpromise capabilities that models can’t reliably deliver yet
4:31 – 6:55
Build for today’s models or tomorrow’s breakthroughs? Don’t wait—iterate into the frontier
Harry asks how founders should plan when model capability shifts can make or break products. Mike’s view: exploring early is valuable even if current systems are frustrating, because the winners are usually those who’ve already built context and workflow understanding when the “right” model arrives.
- Many products become viable only after a step-change in model accuracy/capability
- Early “lovingly assembled” systems build domain learning and workflow context
- Model leaps reward teams already iterating (example: Cursor’s multiple attempts)
- Guidance: don’t wait for perfection; aggressively test each new generation
6:55 – 10:20
Is the foundation model layer commoditizing? Three defensible advantages for labs
They move to whether there’s lasting value at the model layer. Mike outlines three durable advantages for frontier labs: talent density aligned to mission, differentiated model characteristics/focus areas, and enterprise-grade partnership (not just token vending).
- Defensibility #1: talent attraction/retention and breakthrough capacity
- Defensibility #2: models will differentiate (style, strengths) rather than converge
- Defensibility #3: deep customer relationships and ‘AI partnership’ beyond APIs
- Failure mode: incremental benchmark chasing + treating API as pure commodity
10:20 – 13:00
What actually blocks progress: real-world environments, evals, and agentic workflows
Asked about the biggest bottleneck (compute, data, algorithms), Mike emphasizes training/evaluating models in environments that resemble real work. Today’s evals measure narrow tasks; the hard part is multi-step, social, organizational, and iterative collaboration.
- Main blocker: environments/evals that match real-world, multi-turn work
- Software engineering is more than writing code: requirements, planning, iteration
- Need agentic evals and broader “office professional” task evaluations
- Goal: models that become reliable collaborators, not narrow point-solvers
13:00 – 15:35
Human vs synthetic data—and the missing piece: measuring ‘vibes’ and character
They discuss whether future gains come from synthetic data compounding or continued reliance on human data. Mike argues it must be a mix, and adds a less-discussed frontier: training and evaluating qualitative ‘feel’—tone, personality, and user experience—where regression testing is weak.
- Progress needs both human ‘seed’ data and synthetic environments for exploration
- Games illustrate controllable synthetic environments; real-world tasks are harder
- Character/tone (‘vibes’) is hard to evaluate and easy to regress between versions
- Better data + evals needed for soft skills, not just benchmark performance
15:35 – 18:02
Leaky abstractions in AI UX: model selection, memory, and prompting should disappear
Harry predicts model choice will become irrelevant; Mike agrees current UX exposes too much internal machinery. He flags three ‘leaks’ that should be abstracted away: choosing models, fragmented chat memory/context, and the skill gap between expert and novice prompting.
- Model pickers are confusing; most users can’t rationally choose variants
- Chat/threading lacks shared persistent memory like real coworkers have
- Prompting should become transparent; systems should ask clarifying questions
- Design goal: collapse the prompter/non-prompter gap across generations
18:02 – 20:13
Model quality vs product UX: you’re designing a scaffold around non-determinism
Mike argues model quality and product design can’t be separated anymore. Building AI products means shaping behavior through prompts, reasoning settings, tool use, and robust evaluation/regression testing—because changes can come from models, prompts, or UI decisions.
- AI products are non-deterministic; UX includes prompts, evals, and system behavior
- Product decisions: follow-up questions vs none; longer reasoning vs faster outputs
- Need strong evaluation frameworks to prevent silent regressions over time
- Hard debugging: failures may stem from model updates, prompt changes, or features
20:13 – 28:28
Shipping and marketing in a hyper-competitive release cycle: staying nimble without breaking trust
They examine the pressure of constant launches across labs and the resulting product-marketing chaos. Mike contrasts API expectations (stability, opt-in betas) with consumer/enterprise surfaces, and describes how launch timing now feels reactive amid weekly competitive drops.
- APIs prioritize predictability; experiments often gated behind opt-in/beta headers
- Consumer experiences need faster iteration and less friction than opt-ins
- Launch timing is chaotic (‘Crossy Road’); teams constantly read competitive signals
- Internal mindset: avoid ‘we’re so back/it’s so over’ emotional whiplash
28:28 – 31:49
Open source and distillation: usefulness vs sustainability, security, and incentives
Harry probes whether distillation is ‘wrong’ and what open source implies about value distribution. Mike distinguishes internal distillation as a practical technique from cross-entity copying, raising national security and long-term commercialization incentives as key concerns.
- Distillation is valuable internally to make models cheaper/faster to serve
- Cross-nation or uncontrolled distillation raises security and policy concerns
- Sustainable frontier progress requires viable commercialization models
- Open source can thrive without distillation; ToS and provenance still matter
31:49 – 38:01
China, DeepSeek, and the breakthrough playbook: narrative, product speed, and UX novelty
They discuss underestimating China’s AI capability and what Anthropic learned from DeepSeek. Mike highlights that the surprise wasn’t frontier talent; it was the speed of productization, the geopolitical narrative, and the novelty of features like visible chain-of-thought that captured attention.
- China’s frontier capability shouldn’t be surprising; avoid Western-centric assumptions
- DeepSeek’s breakthrough: compelling cost/efficiency narrative matched the moment
- Product lesson: ship ideas faster; novelty can be valuable even if imperfect
- Chain-of-thought display may shift as distillation risks and UI patterns evolve
38:01 – 43:44
From model provider to application provider: what to build, and why Claude Code exists
Harry asks when a model company should build applications. Mike sets criteria: prioritize broadly generalizable products, avoid overly bespoke vertical apps, and focus on areas where first-party products accelerate learning. Claude Code illustrates the last point: it began as an internal tool, and dogfooding it fed back into model improvements.
- App-bet criteria: generalizability across users/surfaces; careful resource allocation
- Claude Code started as internal acceleration, then shipped externally
- Anthropic focuses on agentic loops rather than building a full IDE
- First-party products create tighter feedback loops that improve next model versions
43:44 – 48:32
The future software developer: delegation, review, and automated verification loops
Mike predicts developers shift from primarily writing code to delegating work to agents and reviewing outputs. The bottleneck becomes scalable verification—security, correctness, UI testing—supported by AI-assisted analysis and multi-agent checks.
- Skills shift: multidisciplinary product thinking + delegating effectively to agents
- Code review changes when much code is AI-generated; idioms/patterns matter
- Need better model learning from codebases + review feedback
- Future workflow: agent proposes approaches, tests in-browser, scans for vulns, escalates decisions
48:32 – 1:02:41
API vs consumer balance—and speeding up: org design, abstractions beyond tokens, and rebuilds
They close with how Anthropic balances API and consumer products, and how to increase iteration speed. Mike emphasizes first-party learning velocity, building higher-level API abstractions (planning, tool use, memory), and removing organizational calcification to ship faster; quick-fire then covers competitive comparisons, privacy/agent trust, Europe’s role, and AI for longevity.
- First-party products teach faster; APIs provide distribution and ecosystem leverage
- API roadmap: abstractions beyond tokens (planning, tool use, memory, knowledge graphs)
- Speed gains: break org boundaries, form ‘right people’ squads, reduce bureaucracy
- Quick-fire themes: OpenAI ships V1s faster; Anthropic aims for cohesive personality; key risk is privacy/discernment with agent-to-agent systems
