ElevenLabs CEO: Why Voice is the Next AI Interface

ElevenLabs CEO and co-founder Mati Staniszewski joins Jennifer Li to explain how the team ships research-grade AI at lightning speed, from text-to-speech and fully licensed AI music to real-time voice agents, and why voice is the next interface for human-computer interaction. He shares the small, autonomous team model, global hiring approach, and how the Voice Marketplace has paid creators over $10M while evolving into an enterprise platform.

Timestamps:
00:00 Intro
02:20 Lucky Number Eleven
02:50 Early Research and Product Work with Piotr
03:35 Shipping quickly with small, high ownership independent teams
04:40 Balancing research and product launches
06:50 A Remote-first approach: Meeting talent where they are
10:01 US vs Europe work cultures
10:40 Removing titles and flat leadership layers
13:35 The creative industry’s adoption of AI
15:10 The Voice Marketplace: Empowering creators to earn
16:43 Challenges in licensing and 18-month negotiation process
18:05 Hiring in complex domains
19:10 Finding risk-tolerant talent
20:45 Transitioning from creator-first to enterprise adoption
21:48 Lessons from hiring the first salespeople
23:34 Scaling orchestration, long sales cycles and cultural adjustments
26:22 Customer choice in adopting early features
27:55 Phases of company growth: product, sales, scaling
30:06 Turning down licensing to a competitor

Resources:
Follow Mati on X: https://x.com/matistanis
Follow Jennifer on X: https://x.com/JenniferHli
Find a16z on X: https://x.com/a16z
Find a16z on LinkedIn: https://www.linkedin.com/company/a16z
Listen to the a16z Podcast on Spotify: https://open.spotify.com/show/5bC65RDvs3oxnLyqqvkUYX
Listen to the a16z Podcast on Apple Podcasts: https://podcasts.apple.com/us/podcast/a16z-podcast/id842818711
Follow our host: https://x.com/eriktorenberg

Please note that the content here is for informational purposes only; should NOT be taken as legal, business, tax, or investment advice or be used to evaluate any investment or security; and is not directed at any investors or potential investors in any a16z fund. a16z and its affiliates may maintain investments in the companies discussed. For more details please see a16z.com/disclosures.

Guest: Mati Staniszewski · Host: Jennifer Li

Nov 4, 2025 · 31m

CHAPTERS

ElevenLabs’ expanding audio stack: from voices to agents and licensed music
Mati opens with how ElevenLabs has broadened beyond text-to-speech into voice-agent orchestration and a fully licensed music model. The framing sets up the company’s thesis that audio—especially voice—is becoming a primary AI interface across many use cases.
Origin story and the “Eleven” motif: early constraints and early momentum
Mati reflects on the early days and the company’s fondness for the number eleven, using it as a light way to illustrate growth. He contrasts the tiny early infrastructure footprint with today’s scale to highlight execution speed and rapid organizational evolution.
Research foundation: building expressive, context-aware voice models
He credits cofounder Piotr and the research team for the core breakthroughs—capturing context, emotion, intonation, and speaker characteristics. This R&D base later enabled expansion to STT, music, and other audio domains.
Shipping fast with small, high-ownership teams (and the trade-offs)
ElevenLabs organizes into many small, independent teams that can ship end-to-end, maximizing ownership and throughput. Mati acknowledges the cost—duplication and uneven pace—but argues the speed and accountability benefits dominate.
Balancing research vs. product: when to ‘ship the hack’ vs. wait for the breakthrough
Mati describes a practical rule for deciding whether to rely on research progress or product workarounds, illustrated by a “speech speed” feature debate. The takeaway is to avoid being blocked by uncertain research timelines while still aiming for elegant model-level solutions.
Remote-first to hub-first hybrid: meeting global talent where they are
ElevenLabs began remote-first to access specialized talent globally, then added hubs once headcount and onboarding complexity increased. The model is flexible: early-career hires benefit from hubs, while experienced remote workers can stay distributed.
Cultural contrasts and unconventional hiring signals
Mati contrasts US and European work cultures, noting Europe’s quieter “work talk” norms but strong pockets of highly driven talent. He also emphasizes non-traditional hiring—valuing demonstrated craft (e.g., open-source work) over standard credentials.
Flat structure and no titles: speeding impact while managing attention and coordination
ElevenLabs removed titles and kept leadership layers thin to reinforce merit, mobility, and team-level ownership. Mati also describes operational realities: “leads” must coordinate across teams, and transparency can backfire if it distracts people from priorities.
Creative industries warming to AI: relationship-building over disruption narratives
The conversation turns to how creative professionals shifted from early resistance to increasing adoption. Mati argues the best results come from deep engagement with creative workflows and showing concrete examples to overcome knee-jerk “AI is bad” reactions.
Voice Marketplace: scaling diversity of voices while paying creators
Mati explains the marketplace strategy: enabling creators to upload and share voices, earn revenue, and help ElevenLabs cover the long tail of accents, languages, and styles. He shares growth metrics and an anecdote showing demand can emerge unexpectedly across markets.
Licensing music with labels: the 18-month negotiation and ‘forcing functions’
Mati outlines how ElevenLabs approached licensed music generation by partnering with major label groups and aligning on protections and rights. He highlights the need for deadlines/forcing functions, repeated resets, and extensive education to reach agreement.
Hiring for complex, high-stakes domains: risk-tolerant counsel and expert bridging
In unfamiliar areas like legal and licensing, ElevenLabs blended experienced operators with specialist consultants who spoke the industry’s language. Mati describes mis-hires—especially overly risk-averse profiles—and the value of counsel who can propose pragmatic lines, not just enumerate risks.
From creator-first PLG to enterprise: building orchestration, reliability, and GTM muscle
ElevenLabs began with creator adoption but saw early enterprise inbound, forcing a shift toward sales-led execution and production-grade platforms. Mati details how enterprise needs drove orchestration (STT + LLM + TTS), integrations (telephony), and the hard work of evaluation, monitoring, and compliance.
Scaling execution: alpha vs. stable releases, pre- vs. post-PMF teams, and incentive alignment
Mati describes mechanisms to preserve speed while serving enterprise: clear alpha labeling with customer choice, and internal separation of pre-PMF (ship fast) vs post-PMF (stability) teams. He closes with a CEO lesson on scaling: incentives shape behavior, so comp plans must match strategy—illustrated by refusing to license models to a competitor even when commissions would encourage it.
