David AI: Powering the Voice Era of AI

Tomer Cohen and Ben Wiley launched David AI just days before the Y Combinator deadline, submitting their application at midnight and hoping it counted. A year later, their company is now one of the market leaders for voice training data in AI, having just closed a $25 million Series A. They met while working at Scale AI, where they bonded over the belief that the next big leap for AI would be moving beyond screens, into real-world interactions powered by voice. That idea became David AI, a company that collects, produces, and refines massive volumes of audio data for training voice models. So far, they've built a library of 100,000 hours of audio in over 15 languages, complete with rich metadata like accents and dialects. YC Partner Diana Hu recently sat down with the David AI founders to talk about how they got here, their founding story, and the kind of company they are building.

Learn more about David AI at https://www.withdavid.ai.
Apply to Y Combinator: https://ycombinator.com/apply

Chapters
00:00 - Introduction
00:12 - What is David AI?
00:31 - Challenges in Audio Data
01:11 - Origin Story of David AI
01:46 - Building the First Product
04:12 - Early Success and Growth
05:24 - Business Model and Approach
07:40 - Future Plans and Hiring

Diana Hu (host), Tomer Cohen (guest), Ben Wiley (guest)

May 28, 2025, 9m

EPISODE INFO

Released: May 28, 2025
Duration: 9m
Channel: YC Root Access

SPEAKERS

Diana Hu
host
Host of YC Root Access who interviews founders about their companies and products.
Tomer Cohen
guest
Co-founder at David AI, describing the company’s audio/voice data research work and go-to-market traction.
Ben Wiley
guest
Co-founder at David AI, discussing technical challenges and systems for collecting high-quality conversational audio data.

EPISODE SUMMARY

In this episode of YC Root Access, Diana Hu talks with Tomer Cohen and Ben Wiley about how David AI builds high-quality conversational speech datasets for next-generation voice models. David AI focuses narrowly on speech, especially multilingual, multi-accent conversational audio, because high-performing voice models depend on specialized data that is scarce online.
