CHAPTERS
ElevenLabs’ expanding audio stack: from voices to agents and licensed music
Mati opens with how ElevenLabs has broadened beyond text-to-speech into voice-agent orchestration and a fully licensed music model. The framing sets up the company’s thesis that audio—especially voice—is becoming a primary AI interface across many use cases.
Origin story and the “Eleven” motif: early constraints and early momentum
Mati reflects on the early days and the company’s fondness for the number eleven, using it as a light way to illustrate growth. He contrasts the tiny early infrastructure footprint with today’s scale to highlight execution speed and rapid organizational evolution.
Research foundation: building expressive, context-aware voice models
He credits cofounder Piotr and the research team for the core breakthroughs: capturing context, emotion, intonation, and speaker characteristics. This R&D base later enabled expansion into speech-to-text (STT), music, and other audio domains.
Shipping fast with small, high-ownership teams (and the trade-offs)
ElevenLabs organizes into many small, independent teams that can ship end-to-end, maximizing ownership and throughput. Mati acknowledges the cost—duplication and uneven pace—but argues the speed and accountability benefits dominate.
Balancing research vs. product: when to “ship the hack” vs. wait for the breakthrough
Mati describes a practical rule for deciding whether to rely on research progress or product workarounds, illustrated by a “speech speed” feature debate. The takeaway is to avoid being blocked by uncertain research timelines while still aiming for elegant model-level solutions.
Remote-first to hub-first hybrid: meeting global talent where they are
ElevenLabs began remote-first to access specialized talent globally, then added hubs once headcount and onboarding complexity increased. The model is flexible: early-career hires benefit from hubs, while experienced remote workers can stay distributed.
Cultural contrasts and unconventional hiring signals
Mati contrasts US and European work cultures, noting Europe’s quieter “work talk” norms but strong pockets of highly driven talent. He also emphasizes non-traditional hiring—valuing demonstrated craft (e.g., open-source work) over standard credentials.
Flat structure and no titles: speeding impact while managing attention and coordination
ElevenLabs removed titles and kept leadership layers thin to reinforce merit, mobility, and team-level ownership. Mati also describes operational realities: “leads” must coordinate across teams, and transparency can backfire if it distracts people from priorities.
Creative industries warming to AI: relationship-building over disruption narratives
The conversation turns to how creative professionals shifted from early resistance to increasing adoption. Mati argues the best results come from deep engagement with creative workflows and showing concrete examples to overcome knee-jerk “AI is bad” reactions.
Voice Marketplace: scaling diversity of voices while paying creators
Mati explains the marketplace strategy: enabling creators to upload and share voices, earn revenue, and help ElevenLabs cover the long tail of accents, languages, and styles. He shares growth metrics and an anecdote showing demand can emerge unexpectedly across markets.
Licensing music with labels: the 18-month negotiation and “forcing functions”
Mati outlines how ElevenLabs approached licensed music generation by partnering with major label groups and aligning on protections and rights. He highlights that reaching agreement took deadlines and other forcing functions, repeated resets, and extensive education.
Hiring for complex, high-stakes domains: risk-tolerant counsel and expert bridging
In unfamiliar areas like legal and licensing, ElevenLabs blended experienced operators with specialist consultants who spoke the industry’s language. Mati describes mis-hires—especially overly risk-averse profiles—and the value of counsel who can propose pragmatic lines, not just enumerate risks.
From creator-first PLG to enterprise: building orchestration, reliability, and GTM muscle
ElevenLabs began with creator adoption but saw early enterprise inbound, forcing a shift toward sales-led execution and production-grade platforms. Mati details how enterprise needs drove orchestration (speech-to-text + LLM + text-to-speech), integrations (telephony), and the hard work of evaluation, monitoring, and compliance.
Scaling execution: alpha vs. stable releases, pre- vs. post-PMF teams, and incentive alignment
Mati describes mechanisms to preserve speed while serving enterprise: clear alpha labeling with customer choice, and internal separation of teams working before product-market fit (pre-PMF, ship fast) from those working after it (post-PMF, stability). He closes with a CEO lesson on scaling: incentives shape behavior, so compensation plans must match strategy. His example is refusing to license models to a competitor even when commissions would encourage it.