The Twenty Minute VC

The $100,000 token budget EVERY engineer will need | Sierra Co-Founder

Clay Bavor is the Co-Founder of Sierra, one of the world's fastest-growing enterprise AI companies. Sierra is valued at approximately $15.8 billion, has raised more than $1.5BN from leading investors including Sequoia, Benchmark, Greenoaks, GV and Tiger Global, and today serves more than 40% of the Fortune 50. The company recently surpassed $150 ARR, making it one of the fastest-growing enterprise software businesses in history.

Timestamps:
0:00 Intro
1:37 Why Clay finally said yes to building Sierra with Bret
5:53 Why Sierra chose not to train their own foundation models
7:15 The case for unbounded demand for frontier intelligence
10:41 Why token costs are rising, not falling
14:35 Open models: the US vs China gap
18:36 Inside Pinecone, Sierra's internal AI agent
26:00 Staying close to customers as an enterprise AI company
33:22 Forward deployed teams: kickoff to live in 6 weeks
43:02 Sierra's core values: craftsmanship, intensity, family
55:41 Advice for young people entering the AI job market
1:02:35 Quickfire: Sundar, books, and parenting lessons

Subscribe on Spotify: https://open.spotify.com/show/3j2KMcZTtgTNBKwtZBMHvl?si=85bc9196860e4466
Subscribe on Apple Podcasts: https://podcasts.apple.com/us/podcast/the-twenty-minute-vc-20vc-venture-capital-startup/id958230465
Follow Harry Stebbings on X: https://twitter.com/HarryStebbings
Follow Clay Bavor on X: https://twitter.com/claybavor
Follow 20VC on Instagram: https://www.instagram.com/20vchq
Follow 20VC on TikTok: https://www.tiktok.com/@20vc_tok
Visit our Website: https://www.20vc.com
Subscribe to our Newsletter: https://www.thetwentyminutevc.com/contact

#20vc #harrystebbings #founder #ai

Guest: Clay Bavor · Host: Harry Stebbings
Jul 4, 2026 · 1h 11m

CHAPTERS

  1. 0:00 – 1:37

    Clay Bavor and Sierra’s breakout: scale, fundraising, and enterprise traction

    Harry introduces Clay Bavor, Sierra’s rapid growth, fundraising scale, and penetration into the Fortune 50. Clay frames the episode’s big themes: frontier intelligence demand, open-model distillation, and how AI is reshaping engineering and companies.

    • Sierra’s growth metrics: funding raised, valuation, and Fortune 50 footprint
    • Clay’s early thesis: unbounded demand for frontier intelligence
    • Open-model ecosystem dynamics and distillation as a competitive lever
    • How AI is changing engineering workflows and hiring signals
  2. 1:37 – 3:59

    Why Clay finally left Google to build Sierra with Bret Taylor

    Clay recounts a 20-year relationship with Bret Taylor and why earlier attempts to work together fell through. The catalyst was timing: in late 2022 and early 2023, language models reshuffled the startup landscape, opening a rare window to build something new.

    • Clay and Bret’s history from Google APM to long-term friendship
    • Why Clay stayed at Google: culture, learning, and extraordinary colleagues
    • “Planets aligned” in late 2022 with the rise of LLMs
    • Starting a company: competence, character, and timing as prerequisites
  3. 3:59 – 5:48

    What Clay brought from Google: going deep in the stack (but not pretraining)

    Clay explains the Google lesson he carried over: invest as far down the stack as needed to build the product you want. Sierra embraced agents early and built frameworks and architectures from scratch, while deliberately avoiding the capital sink of full foundation-model training.

    • Google-style principle: build deeper infrastructure when necessary
    • Early conviction (April 2023) that agentic systems would matter
    • Hiring research leadership tied to foundational agent research (ReAct paper)
    • Clear boundary: build frameworks/architecture, not massive pretraining runs
  4. 5:48 – 7:15

    Why Sierra chose not to train its own foundation models

    Clay describes why pretraining didn’t pencil out for most startups: huge ongoing capex for a rapidly depreciating asset. Sierra’s approach is to ‘slipstream’ hyperscalers and labs—using open-weights foundations and adding proprietary fine-tunes where it creates control and differentiation.

    • Pretraining as a startup status symbol in 2022–2023—and why that was misleading
    • Capex and ongoing cost of training vs. value capture for application companies
    • Strategy: take what’s off-the-shelf, go deeper only where needed
    • Using proprietary fine-tunes on top of open-weights models
  5. 7:15 – 10:42

    Frontier intelligence has unbounded demand (even if not every task needs it)

    Clay argues frontier models remain essential because many domains will continuously absorb more intelligence—coding, science, legal, high-stakes reasoning—even if some customer-service tasks don’t require it. He predicts a hybrid world: frontier for hardest problems, cheaper open-weights for commoditized workloads.

    • Analogy: every company would upgrade engineers to higher ‘levels’ if possible
    • Not all tasks need maximal intelligence (e.g., simple returns)
    • High-complexity domains will keep pulling frontier capability
    • “Assembly line” effect: yesterday’s frontier becomes cheap open-weights targets
  6. 10:42 – 14:29

    Why token costs are rising: agents, reasoning, and compute scarcity

    The discussion turns to token economics and why costs aren’t simply falling with time. Clay points to reasoning models’ test-time compute, agentic workflows that generate more tokens, and the fundamental constraints of GPUs and power—creating a floor on token pricing when demand is effectively unbounded.

    • Reasoning models increase inference tokens via ‘thinking’ and test-time compute
    • Hardware improvements reduce unit costs, but demand grows faster
    • Supply/demand: limited GPUs (H100/Blackwell) and energy set price floors
    • Local/on-device inference helps some consumer use cases but doesn’t solve frontier needs
  7. 14:29 – 18:32

    Open models and geopolitics: US vs. China and distillation dynamics

    Clay explains why Chinese open-weights models appear unusually strong: willingness to do large-scale distillation of frontier models. He also notes the incentive mismatch for frontier labs to release open-weights competitors that undercut their own hosted margins.

    • Large-scale distillation as a key driver of Chinese open-model competitiveness
    • Open-weights often trace back to frontier training runs done elsewhere
    • Frontier labs’ business incentives limit how far they’ll ‘compete with themselves’
    • Distillation as the ‘next best approach’ if you can’t build frontier models
  8. 18:32 – 22:08

    Inside Sierra’s internal agent ‘Pinecone’: MCP gateway, skills, and Sierra Brain

    Clay details how Sierra runs itself with internal agents, starting with an MCP gateway that unifies access to company systems under permissions. Pinecone becomes a purpose-built harness for engineering and operations, while ‘Sierra Brain’ acts as a strategy thought partner grounded in memos, reviews, and company context.

    • MCP gateway aggregates Slack/docs/reviews and connects to Claude/Codex/Pinecone
    • Permissioned ‘superpowers’: query and reason over published company knowledge
    • Pinecone as a company-wide tool with shared skill libraries (including hiring review)
    • Sierra Brain: long-form grounding + board letters/operating reviews for strategy reasoning
  9. 22:08 – 25:51

    The coming $100K token budget per engineer: budgeting, ROI, and the new CFO mindset

    They debate whether to cap token spend and how to think about it economically. Clay predicts per-employee token budgets will become standard, with tokens treated like a core operating expense tied to headcount—potentially reaching ~20% of developer cost because productivity gains are so material.

    • Token usage as a near-term proxy for ‘leaning in’ to AI
    • Top engineers can run-rate >$100K/year in tokens
    • Future model: salary + token budget; headcount planning includes token OpEx
    • Clay’s view: ~20% token spend vs salary is plausible given 2x+ productivity gains
  10. 25:51 – 29:53

    Staying product-close while selling to the largest enterprises

    Harry challenges whether ‘enterprise’ implies distance from product; Clay rejects the tradeoff. He argues Sierra stays close by founders building agents themselves and by obsessing over end-user experience at massive interaction volumes—latency, voice fluency, and quality.

    • Enterprise doesn’t have to mean being distant from product or end user
    • Founders remain hands-on: building agents and contributing to shipped code
    • Sierra effectively becomes a scaled B2C interaction layer via enterprise clients
    • Key experience metrics: quality, latency, voice fluency, and reliability
  11. 29:53 – 34:15

    Forward-deployed teams and ‘kickoff to live in 6 weeks’: how Sierra deploys in the real world

    Clay explains Sierra’s Palantir-inspired forward-deployed model, discovered through early design partners with engineers embedded inside customer orgs. This tight partnership accelerates time-to-value—sometimes deploying in ~6–8 weeks—while the platform remains extensible enough for customers who want to build independently.

    • Origin: design partners and deep embedding to learn real deployment constraints
    • Why it matters: customers are deploying customer-facing AI agents for the first time
    • Forward-deployed motion speeds implementation (examples: Next, Cigna)
    • Not strictly required, but a major catalyst for fast, high-quality outcomes
  12. 34:15 – 38:42

    Expanding beyond support: the full customer lifecycle (sales, marketing, and personalization)

    Clay describes Sierra’s trajectory from service/support into broader front-office workflows. Examples like Rocket and Next show agents powering discovery, outreach, loan shaping, and personalized recommendations—positioning Sierra as a lifecycle platform rather than a narrow support tool.

    • Customer support as the initial wedge into broader lifecycle interactions
    • Rocket example: search/discovery, refi outreach, loan intake and shaping
    • Retail example: personalized recommendations and basket-building
    • Strategy: build applications that inform and strengthen the underlying platform
  13. 38:42 – 43:16

    How Sierra runs: six-week board cadence, memo-driven governance, and choosing valuation

    Clay outlines operating cadence in ‘AI time’: board touchpoints every six weeks and memo-based preparation to force clarity and invite real challenge. He also explains milestone-based fundraising and why Sierra has sometimes taken lower valuations than available to optimize for long-term outcomes.

    • Six-week cadence to update priors quickly as model capabilities shift
    • No decks: 6–10 page memos for pre-reading and sharper discussion
    • Board letters include what’s going well and what the team ‘sucks at’ (e.g., hiring too slowly)
    • Fundraising approach: milestone-to-milestone capital needs; sometimes taking lower price
  14. 43:16 – 55:50

    Core values and culture mechanics: craftsmanship, intensity, and family (plus trust)

    Clay breaks down Sierra’s values as direct extensions of the founders: craftsmanship as excellence in thousands of details, intensity as pace and competitiveness in an inevitable market, and family as sustainable commitment to what matters beyond work. He connects values to concrete behaviors like founders’ involvement and high-stakes customer moments.

    • Craftsmanship: excellence compounds into great companies; customers trust Sierra with their customers
    • Intensity: pace to win in a crowded market; hiring for ‘smart, nice, intense’
    • Family: high intensity without performative grind; protecting important life commitments
    • How to sustain intensity at scale: founder example, selective ‘founder mode,’ ambitious goals
  15. 55:50 – 1:02:38

    Advice for young people in AI: become ‘AI-native’ and expect interviews to change

    Clay argues AI is an unprecedented advantage for early-career talent: mastering tools can outweigh lack of experience. He describes Sierra’s AI-native engineering interviews with real token budgets and build exercises, and predicts every role will soon include strong AI components; he also flags cybersecurity as increasingly critical.

    • Young people’s edge: time to master AI tools before entering the workforce
    • AI fluency makes 22–23-year-olds disproportionately impactful
    • New interview style: build with coding agents; company pays token budget
    • Cybersecurity importance rises as offensive capabilities scale; defensive use may also improve
  16. 1:02:38 – 1:11:53

    Quickfire lessons: Sundar’s ‘zoom range,’ reading recommendations, and parenting rituals

    In rapid-fire, Clay shares what he learned from Sundar Pichai—extreme range from strategy to pixels—and what outsiders misunderstand about Google’s mission-driven problem solving. He recommends a favorite entrepreneurship-adjacent book and closes with parenting principles rooted in rituals, habits, and investing in kids’ interests.

    • Sundar’s strength: switching from macro strategy to micro product detail
    • Google underestimated: mission + smart people + truth-seeking culture can ‘solve anything’
    • Book recommendation: David McCullough’s ‘The Wright Brothers’ as a portrait of invention
    • Parenting: carve rituals (family dinners, maker mornings) and adopt kids’ interests as your own
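
The token-budget arithmetic from chapter 9 can be made concrete with a small sketch. It is illustrative only: the function names and the $500,000 developer cost are hypothetical and not from the episode, and the only inputs taken from the summary are the roughly 20% share and the $100K-per-year figure. The break-even function simply asks what productivity gain would make the token spend pay for itself, under the simplifying assumption that a gain is worth the same fraction of developer cost.

```python
def token_budget(developer_cost: float, token_share: float = 0.20) -> float:
    """Annual token budget as a share of developer cost (share is an assumption)."""
    return developer_cost * token_share


def breakeven_gain(developer_cost: float, token_spend: float) -> float:
    """Smallest fractional productivity gain at which token spend pays for itself.

    Assumes a gain of g is worth g * developer_cost per year.
    """
    return token_spend / developer_cost


if __name__ == "__main__":
    cost = 500_000  # hypothetical fully loaded annual cost of one developer
    budget = token_budget(cost)
    gain = breakeven_gain(cost, budget)
    print(f"token budget: ${budget:,.0f} per year")
    print(f"break-even productivity gain: {gain:.0%}")
```

Under these assumptions, a 20% budget only needs a 20% productivity gain to break even, which is why a 2x or larger gain, as described in the chapter, would make the spend easy to justify.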
