The Twenty Minute VC

The $100,000 token budget EVERY engineer will need | Sierra Co-Founder

Clay Bavor is the Co-Founder of Sierra, one of the world's fastest-growing enterprise AI companies. Sierra is valued at approximately $15.8 billion, has raised more than $1.5BN from leading investors including Sequoia, Benchmark, Greenoaks, GV and Tiger Global, and today serves more than 40% of the Fortune 50. The company recently surpassed $150 ARR, making it one of the fastest-growing enterprise software businesses in history.

Timestamps:
0:00 Intro
1:37 Why Clay finally said yes to building Sierra with Bret
5:53 Why Sierra chose not to train their own foundation models
7:15 The case for unbounded demand for frontier intelligence
10:41 Why token costs are rising, not falling
14:35 Open models: the US vs China gap
18:36 Inside Pinecone, Sierra's internal AI agent
26:00 Staying close to customers as an enterprise AI company
33:22 Forward deployed teams: kickoff to live in 6 weeks
43:02 Sierra's core values: craftsmanship, intensity, family
55:41 Advice for young people entering the AI job market
1:02:35 Quickfire: Sundar, books, and parenting lessons

Subscribe on Spotify: https://open.spotify.com/show/3j2KMcZTtgTNBKwtZBMHvl?si=85bc9196860e4466
Subscribe on Apple Podcasts: https://podcasts.apple.com/us/podcast/the-twenty-minute-vc-20vc-venture-capital-startup/id958230465
Follow Harry Stebbings on X: https://twitter.com/HarryStebbings
Follow Clay Bavor on X: https://twitter.com/claybavor
Follow 20VC on Instagram: https://www.instagram.com/20vchq
Follow 20VC on TikTok: https://www.tiktok.com/@20vc_tok
Visit our Website: https://www.20vc.com
Subscribe to our Newsletter: https://www.thetwentyminutevc.com/contact

#20vc #harrystebbings #founder #ai

Clay Bavor (guest), Harry Stebbings (host)

Jul 4, 2026 · 1h 11m

WHAT IT’S REALLY ABOUT

Sierra co-founder on token budgets, enterprise agents, and AI demand

Bavor explains why he left an 18-year Google career to co-found Sierra when LLMs reshuffled the competitive landscape for startups.
Sierra chose not to pre-train foundation models, instead “slipstreaming” hyperscaler/lab investment and focusing on agent frameworks, fine-tunes, and deep product engineering.
He argues demand for frontier-level intelligence is effectively unbounded, and reasoning/agentic workflows plus compute scarcity will keep token costs from collapsing.
Sierra runs itself with an internal agent (“Pinecone”) connected via an MCP gateway to company systems, plus a “Sierra Brain” strategy partner grounded in internal documents.
Sierra’s go-to-market relies on forward-deployed teams to implement quickly in complex enterprises, reinforced by operating rituals like memo-based boards every six weeks and AI-native hiring interviews.

IDEAS WORTH REMEMBERING

5 ideas

Training frontier models is a poor default for startups.

Bavor calls foundation models a “perishable bag of floating-point numbers” with recurring capital expense that only a few companies can sustain; Sierra instead uses open-weights plus proprietary fine-tunes and invests lower in the stack where it creates differentiation (agent architecture and frameworks).

Frontier models won’t be displaced; they’ll be selectively used alongside cheaper models.

He predicts an “assembly line” where yesterday’s frontier becomes tomorrow’s cheap open-weights workhorse, but high-stakes domains (coding, science, legal) will still justify frontier intelligence, leading to task-based routing and mixing models by capability/cost.
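The task-based routing idea above can be sketched in a few lines. This is only an illustration, not Sierra's implementation: the model names, capability scores, and prices are all hypothetical placeholders, and the rule used here (pick the cheapest model that clears a capability bar) is one simple assumption about how such routing might work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    capability: int                # hypothetical score, 1 (weak) to 10 (frontier)
    usd_per_million_tokens: float  # hypothetical price


# Hypothetical catalog: names, scores, and prices are illustrative only.
CATALOG = [
    Model("open-weights-small", 4, 0.5),
    Model("open-weights-large", 7, 3.0),
    Model("frontier", 10, 30.0),
]


def route(required_capability: int, catalog=CATALOG) -> Model:
    """Return the cheapest model whose capability meets the task's bar."""
    eligible = [m for m in catalog if m.capability >= required_capability]
    if not eligible:
        raise ValueError("no model meets the required capability")
    return min(eligible, key=lambda m: m.usd_per_million_tokens)


# A routine task goes to the cheap workhorse; a high-stakes task goes to frontier.
print(route(3).name)
print(route(9).name)
```

Under these assumptions, routine work lands on the cheap open-weights model and only tasks with a high capability bar pay frontier prices, which is the "mixing models by capability/cost" pattern described above.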

Token costs can rise even as hardware improves.

Reasoning models and agents consume more inference (“thinking out loud”), while GPU/power constraints create a price floor; unbounded demand plus limited Blackwells/H100s keeps compute scarce, preventing token prices from simply trending down.

Per-employee token budgets will become standard operating practice.

He’s seeing top engineers run-rate >$100K/year in token spend using Claude Code/Codex; he expects CFOs to allocate “salary + token budget” and believes steady-state token spend could approach ~20% of developer compensation rather than low single digits.
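To make the "salary + token budget" arithmetic concrete, here is a minimal sketch. The $500,000 compensation figure is a hypothetical chosen only so the numbers are easy to read; the 3.8% and 20% shares are the two figures Bavor contrasts in the episode. Shares are expressed in basis points so the arithmetic stays in exact integers.

```python
def token_budget_usd(compensation_usd: int, share_bps: int) -> int:
    """Annual token budget as a share of compensation.

    share_bps is in basis points (1 bps = 0.01%), so 2000 bps = 20%.
    """
    return compensation_usd * share_bps // 10_000


# Hypothetical total compensation, for illustration only.
COMPENSATION_USD = 500_000

low = token_budget_usd(COMPENSATION_USD, 380)    # 3.8% share
high = token_budget_usd(COMPENSATION_USD, 2000)  # 20% share

print(low, high)  # 19000 100000
```

At this hypothetical compensation level, a 20% share works out to the $100,000 annual run rate mentioned above, while a 3.8% share would be $19,000.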

Internal AI agents become force multipliers when connected to real systems with permissions.

Sierra’s MCP gateway aggregates Slack/docs/reviews/etc. into a single interface for multiple agents, enabling Pinecone to assist across engineering and operations (e.g., scanning interview packets) while keeping access control aligned to each employee.
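The pattern of a single gateway that enforces per-employee access can be sketched as follows. This is plain Python for illustration and does not use or represent the MCP protocol or any MCP SDK; the class, method names, tool names, and employees are all hypothetical, and nothing here is claimed to match Sierra's actual gateway.

```python
from typing import Callable, Dict, Set


class Gateway:
    """Hypothetical single entry point that agents call on behalf of an employee."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[[str], str]] = {}
        self._grants: Dict[str, Set[str]] = {}

    def register(self, tool: str, handler: Callable[[str], str]) -> None:
        """Expose a backing system (docs, chat, reviews, ...) under one name."""
        self._tools[tool] = handler

    def grant(self, employee: str, tool: str) -> None:
        """Record that this employee may use this tool."""
        self._grants.setdefault(employee, set()).add(tool)

    def call(self, employee: str, tool: str, query: str) -> str:
        """Dispatch a request only if the employee has access to the tool."""
        if tool not in self._tools:
            raise KeyError(f"unknown tool: {tool}")
        if tool not in self._grants.get(employee, set()):
            raise PermissionError(f"{employee} may not use {tool}")
        return self._tools[tool](query)


gateway = Gateway()
gateway.register("docs", lambda q: f"docs results for {q!r}")
gateway.grant("alice", "docs")

print(gateway.call("alice", "docs", "onboarding"))
```

The design choice illustrated is that the permission check lives in the gateway rather than in each agent, so any agent acting for an employee sees only what that employee is allowed to see.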

WORDS WORTH SAVING

5 quotes

I think we have not yet appreciated the unbounded demand for, call it frontier levels of intelligence.

— Clay Bavor

The capital expense, uh, uh, the ongoing capital expense to create what is effectively a highly perishable bag of floating-point numbers, it just doesn't work, just doesn't work for any but a small number of companies.

— Clay Bavor

I have heard and I have observed that top engineers who are really leaning in to Claude Code, Codex, and so on are spending more than $100,000 on a run rate basis on tokens per year.

— Clay Bavor

I would not bet on 3.8%. I would bet on much closer to 20%.

— Clay Bavor

Writing is just thinking on paper, and I, I think it's very hard to hide from writing.

— Clay Bavor

Leaving Google and timing Sierra’s founding
Build vs buy in foundation models (pre-training vs fine-tuning)
Frontier intelligence demand and model mixing strategies
Token economics: reasoning models, supply constraints, and budgeting
US vs China open-weights gap and distillation
Internal agents (MCP gateway, Pinecone, Sierra Brain)
Enterprise deployment: forward-deployed teams, speed to production, vertical expertise
Operating cadence: 6-week board memos and truth-seeking culture
Company values: craftsmanship, intensity, family (plus trust, customer obsession)
AI-era hiring and interviewing for engineers

High quality AI-generated summary created from speaker-labeled transcript.
