YC Root Access: This Startup Built the Infrastructure Powering Voice AI
- 0:00 – 2:08
Intro
- JFJared Friedman
[upbeat music] It's super cool to be here today with Dylan from AssemblyAI. Dylan, thank you so much for joining us.
- DFDylan Fox
Yeah, I'm excited to be here.
- JFJared Friedman
So AssemblyAI is one of the first AI companies YC ever funded, in our first, what we called at the time, AI Cohort [laughs] back in the summer of 2017, and today it's actually one of the most successful companies YC has ever funded. For folks who aren't familiar, how about you just tell everybody what AssemblyAI is and, uh, where you're at now?
- DFDylan Fox
So we help other companies build voice AI features and applications. Everything from AI note takers, to AI capabilities in contact centers, to real-time voice agents. Healthcare companies are building ambient medical scribes on top of our voice AI infrastructure platform. So we have about a million developers that have signed up to the platform. Um, we have about 10,000 customers. Last year, o- around 250 million voice hours were sent through our platform, and we're now doing-
- JFJared Friedman
250 million voice hours. Okay. [laughs]
- DFDylan Fox
Yeah. We're now doing, um, almost two million hours per day. Uh, so the run rate-
- JFJared Friedman
Wow
- DFDylan Fox
... on that would be, you know, like 700 million [laughs] voice hours for this year.
- JFJared Friedman
Voice hours. Wow.
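[Editor's note] The run-rate figure above is simple annualization, and it can be checked with a few lines of arithmetic. The sketch below uses only the rounded numbers spoken in the interview ("almost two million hours per day", "like 700 million" for the year, "around 250 million" last year). The function names are illustrative and the inputs are not official metrics.

```python
# Back-of-the-envelope check of the run rate quoted in the interview.
# Inputs are the rounded figures spoken aloud, not official metrics.

DAYS_PER_YEAR = 365


def annual_run_rate(hours_per_day: float) -> float:
    """Annualize a daily volume by assuming every day looks like today."""
    return hours_per_day * DAYS_PER_YEAR


def implied_daily_volume(hours_per_year: float) -> float:
    """Invert the run rate: the average daily volume a yearly total implies."""
    return hours_per_year / DAYS_PER_YEAR


if __name__ == "__main__":
    # A flat two million hours per day, annualized.
    print(f"{annual_run_rate(2_000_000):,.0f} hours/year")
    # Daily volume implied by the quoted 700 million hours per year.
    print(f"{implied_daily_volume(700_000_000):,.0f} hours/day")
    # Average daily volume implied by last year's 250 million hours.
    print(f"{implied_daily_volume(250_000_000):,.0f} hours/day last year")
```

A flat two million hours per day annualizes to 730 million hours, so "almost two million" per day and "like 700 million" for the year are consistent: 700 million per year works out to roughly 1.9 million hours per day.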
- DFDylan Fox
And it's continuing to grow week over week. Um, so we really are focused on helping other companies, uh, have the voice AI infrastructure and primitives they need to just innovate around voice and to build innovative products and features, uh, either like standalone or in their, in their products.
- JFJared Friedman
Do you have an example of maybe one or two customers that people would know and how, uh-
- DFDylan Fox
Yeah
- JFJared Friedman
... how, how they're built on Assembly?
- DFDylan Fox
Yeah. For sure. So note taking is a really popular use case that companies are, are, uh, shipping with our platform. So if you've used Granola or Fireflies note takers, those use AssemblyAI's, uh, platform.
- JFJared Friedman
Interesting. So all my Granola notes that I, you know, had, that was actually all going through Assembly.
- DFDylan Fox
Yeah. Uh-
- JFJared Friedman
Okay
- DFDylan Fox
... that's right. Um, also, um, in, in hiring, uh, hiring
- 2:08 – 5:23
What AssemblyAI actually does
- DFDylan Fox
segment, so, um, if you've used MetaView or, um, Ashby's note taker-
- JFJared Friedman
Okay. Yeah. Ashby. Yeah. Excellent
- DFDylan Fox
... yeah. Um, yeah. So note takers are a really big segment. We're also, um, pretty widely deployed in contact centers, so there's a big contact center company, they're called Collaboro. Um, they provide contact center, uh, software to big, uh, brands like Delta Airlines.
- JFJared Friedman
Okay. So if I called 1-800-Delta or, or, or whatever to like change my Delta flight, there's a good chance that like under the hood that audio was actually being processed by Assembly through this like chain of customer relationships.
- DFDylan Fox
Most likely, yeah.
- JFJared Friedman
Okay.
- DFDylan Fox
So right now, um, product teams are shipping, uh, products and cap- like capabilities around voice with our platform, but then we're also seeing enterprises. So right now we have a, a lot of, um, Fortune 500 enterprises that are building AI capabilities themselves within their contact center operations or within their trust-
- JFJared Friedman
Hmm
- DFDylan Fox
... and safety operations around voice to automate workflows for their team. So it's a mix of product teams, but then also enterprise companies that are, are, uh, looking to build with voice AI. So Zoom, for example, um, has used our, our infrastructure for-
- JFJared Friedman
Oh, wow
- DFDylan Fox
... a, a number of different capabilities. Yeah.
- JFJared Friedman
You guys were super early to AI. 2017, this was like way, way before ChatGPT. I remember your batch. There was what? A handful of companies working on AI?
- DFDylan Fox
Yeah. Yeah, there were like 10 other companies in the Y, in the AI cohort.
- JFJared Friedman
Tell us the origin story. How did you end up starting this company? How did you end up getting into AI so early?
- DFDylan Fox
Yeah. So I, um, I started a c- I mean, the, the really, you know, ear- early story is I started a company in college, um, and, and it was not an AI company, but through that, uh, that experience, I taught myself how to code. So, um, you know, you wouldn't need to do this now with AI, but back then I bought a bunch of programming books on PHP and Python and Django, and I just read them all and taught myself how to code and started-
- JFJared Friedman
Hmm
- DFDylan Fox
... building, um, like SaaS software products and, and, and launching them as, as little startups. Um-
- JFJared Friedman
Like what?
- DFDylan Fox
So one-
- JFJared Friedman
What did you launch?
- DFDylan Fox
[laughs] One was like a fundraising tool for college organizations.
- JFJared Friedman
Okay.
- DFDylan Fox
Um, one was a tool where, uh, small businesses could like put up a QR code and then customers could text them, uh, f- uh, like a, a number that they could text feedback to, so it was like anonymous feedback capture through text message, a bunch of random stuff. Um, and I, I learned from that, that I really loved programming and that I really loved the, the creative, um, like journey of building something and, and like thinking of this, you have this like fuzzy picture of a product or something in your mind, and then you could just, you know, through programming, build it and see it come to life and get feedback. It was this very like addictive kind of, um, feedback loop that I, I really enjoyed. So after college, I just, I kept programming and kept, you know, learning about, um, uh, everything from like, uh, uh, security to machine learning, and I really got interested in machine learning. So this is like support vector machine,
- 5:23 – 7:11
Dylan learns to code and discovers ML
- DFDylan Fox
uh, time. This is not like deep neural net time.
- JFJared Friedman
Yeah.
- DFDylan Fox
Um, eventually I joined a machine learning team out in San Francisco. So I, I moved to San Francisco, joined a team at Cisco, um, the company Cisco, that's based here. It's actually like a few blocks from where we're filming right now.
- JFJared Friedman
Oh, cool.
- DFDylan Fox
That's where my job was in Dogpatch. Yeah. And that was where I got, I, I started getting, learning about neural networks and getting into neural networks.
- JFJared Friedman
Hmm.
- DFDylan Fox
So to give you a sense of time, this is like 2015. Um, so, uh, like very early into, to, you know, the neural network boom.
- JFJared Friedman
It's just like right when, when we ... Wait. W- w- when did we crack ImageNet?
- DFDylan Fox
I think it was like 2013 or '14.
- JFJared Friedman
'14. Okay. Yeah.
- DFDylan Fox
It was around that time.
- JFJared Friedman
Okay.
- DFDylan Fox
Um, but I do remember going to the very first TensorFlow meetup down at Google's headquarters in like 2014 or '15. So it was, you know, that, that was the, the, um, the state of things at that time. A- so a couple things happened around that time. Um, one is I saw that AI and, and neural networks and deep learning were going to, um, take off over the next decade. That it was such early days and data, compute, and algorithms were just gonna continue to get stronger and stronger, and they were going to bring, you know, really disruptive new capabilities to everything from self-driving cars to natural language processing and, you know, ev- everything that we've seen over the last 10 years. Um, and so I just really felt like, wow, that's an area I wanna... I, I was drawn to and wanted to keep going, uh, deeper in. And then I, [clears throat] I bought an Amazon Echo when that came out in 2015. It was crazy because my experience with voice recognition, uh, prior to the Amazon Echo was that it
- 7:11 – 9:32
The Amazon Echo moment
- DFDylan Fox
sucked, right?
- JFJared Friedman
It sucked. Yeah, yeah.
- DFDylan Fox
Siri, Siri didn't work.
- JFJared Friedman
Yeah.
- DFDylan Fox
Everything sucked.
- JFJared Friedman
Yeah.
- DFDylan Fox
You didn't use it. But the Amazon Echo, you, you could use it across the room, so far field it worked, like, you know, great distance from the microphone. Your TV could be on, you could be, you know, shouting to it over a bunch of background noise, and it would still work. And what was so incredible was that it worked so well that I found I was building new habits around the product because it was reliable.
- JFJared Friedman
Mm.
- DFDylan Fox
So setting timers, asking for the weather, playing music. It was just like, I was blown away by how well those things worked over voice and that it was reliable. Um, and so I started looking into voice recognition technology to just build stuff on my own and to innovate and experiment on my own. Um, and there was nothing in the market at the time. It was either, like, bad or, um... So the, the company Nuance at the time, which was like a big-
- JFJared Friedman
They were the big incumbent.
- DFDylan Fox
Yeah, yeah, exactly. Um, I got in touch with them, and I, you know, was trying to find their developer SDKs, and it, it... You had to pay, I think it was like, you know, thousands of dollars up front.
- JFJared Friedman
I, I, I remember. I, I purchased it, yeah. [laughs]
- DFDylan Fox
And then they would mail you a CD-ROM-
- JFJared Friedman
Ooh
- DFDylan Fox
... with a developer SDK on it, and I didn't even have a CD-ROM drive-
- JFJared Friedman
[laughs]
- DFDylan Fox
... on my laptop. So, you know, for me, I was used to the Twilio or Stripe-
- JFJared Friedman
Mm
- DFDylan Fox
... style developer experience and, and [laughs] this was, like, the opposite of that. Um, so all these, like, ideas kinda came together for me and I got really excited about the idea of what if you could, um, you know, use like these new deep learning algorithms, a- all this innovation that was happening around AI and deep learning at the time and to, to build way better, uh, voice AI capabilities and technology, and then make it super easy for any developer to build with those things. 'Cause for me, I remember getting access to Twilio, you know, back in the day-
- JFJared Friedman
Yeah
- DFDylan Fox
... so like 2012 or 2013, getting access to Twilio, and I had s- I, I was just overwhelmed with creativity-
- JFJared Friedman
Yeah
- DFDylan Fox
... because I was like, wow, now I ha- like, I have access to this cool technology-
- JFJared Friedman
Yeah
- DFDylan Fox
... me as a, as a college student developer, and, like, I can e- experiment with all these really cool
- 9:32 – 13:02
Why Dylan built voice AI infrastructure
- DFDylan Fox
ideas. And, and that has been the whole spirit about Assembly. You know, we're not an application company. We're really focused on being a developer infrastructure tool that just makes it really easy for a developer, whether you're, you know, a, a college student or a high school student or a development team at a Fortune 500, to have access to this amazingly powerful technology, um, [clears throat] effortlessly and with a really good experience around it. So that, that was the, [clears throat] kinda, um, the, the mix of, of ideas that came together and then, you know, is still what we're, what we're focused on, you know, delivering, uh, many years later.
- JFJared Friedman
I don't know if you've read Paul Graham's essays, but, um-
- DFDylan Fox
Some of them
- JFJared Friedman
... a lot of his, his early essays talk about how, um, most of the best companies of the past, like Microsoft and Apple, didn't start as companies. They started as just an engineer building technology that they found intellectually interesting, without any clear idea of how it would turn into-
- DFDylan Fox
Yeah
- JFJared Friedman
... a business and scratching their own itch, right? Like, building the thing that th- that they saw missing in, in the world-
- DFDylan Fox
Yeah
- JFJared Friedman
... that they personally wanted to use. And, like, you know, at the time that you started this, like, AI was not the hot thing. It was not, it was not the thing raising huge venture rounds. It-
- DFDylan Fox
You couldn't use AI.
- JFJared Friedman
Yeah. [laughs]
- DFDylan Fox
If you said AI in your... It, it was deep learning was the, the-
- JFJared Friedman
Okay
- DFDylan Fox
... the thing to say. If you used AI-
- JFJared Friedman
AI
- DFDylan Fox
... it was, it was, um, it was like a scam. [laughs]
- JFJared Friedman
Yes. [laughs] Exactly.
- DFDylan Fox
That was the state of things. Yeah.
- JFJared Friedman
Totally. Yeah. Really 'cause there had been like decades of, like, failed promises-
- DFDylan Fox
Yeah, yeah, yeah
- JFJared Friedman
... from AI startups-
- DFDylan Fox
Yeah
- JFJared Friedman
... so VCs were allergic to anything with AI in it. Yeah. [laughs]
- DFDylan Fox
Yeah, exactly. Exactly. My experience as a founder has, has, has been one that I, I think you have to be obsessed with the problem that you're solving.
- JFJared Friedman
Yeah.
- DFDylan Fox
Because what keeps me going is, like, I really want this product to exist that, that we have not, you know, finished creating yet, you know? And, and I, I'm, I'm obsessed with, like, that problem and wanting that to, to be realized. Whereas I think if you're just focused on like a business opportunity, you know, um, then, and you're not obsessed with the product or the, the subject matter, um, you can find yourself losing interest. But I think if you're... You know, you should pick... M- my lesson has been, like, and what I share with other founders, like, you should pick a problem that you're, you're obsessed about. Because when I think about, like, what would I be doing if I wasn't building Assembly, you know, I, I don't know. Like [laughs] I'm having a lot of fun-
- JFJared Friedman
[laughs]
- DFDylan Fox
... building Assembly. Um, and I use our product all the time. Like, I'm constantly dogfooding it and playing with it and, and showing people demos, and it's fun. So I think that, you know, picking something, working on something that, like, y- you, you want to exist because you wanna be a user of it, um, ha- has been really fun.
- JFJared Friedman
To, to me, probably the craziest part of the Assembly story is that you started working on this in 2017 when very few people were interested in AI.
- DFDylan Fox
Yeah.
- JFJared Friedman
And it basically didn't start taking off until 2021, 2022. You, you raised your Series A in 2022, five years after starting the company.
- 13:02 – 16:50
Building AI before anyone cared
- DFDylan Fox
Yeah. Yeah, so I got into YC summer to- summer 2017. Um, solo founder. I had pretty much just started working on the company, and I got in la- I applied late, got in a week before the batch started, and I was just so stressed out. [laughs]
- JFJared Friedman
[laughs]
- DFDylan Fox
Because, uh, you had all these other companies that, you know, had, like, real products, and they were further along, and, um, they could iterate so much faster. But, like, it... I picked a very hard product to work on back then, and I just remember being, you know, s- [laughs] so stressed out and so overwhelmed ba- back in 2017. Um-
- JFJared Friedman
Yeah, everybody else was just, like, building database-backed websites.
- DFDylan Fox
Right.
- JFJared Friedman
They could, like, push things every day.
- DFDylan Fox
It's like, oh, you get feedback-
- JFJared Friedman
Yeah. [laughs]
- DFDylan Fox
... you can iterate on it that night, right? Whereas, like, our early users would give us feedback, and it's like, "All right. We'll come back to you in a month [laughs] when we have a, a better model." And YC is three months, so we had three-
- JFJared Friedman
[laughs]
- DFDylan Fox
... you know, iteration cycles.
- JFJared Friedman
[laughs]
- DFDylan Fox
It's not, it's not a lot of iteration. Um, and no one was building with voice because, like, what I have now realized is that, um, this whole ecosystem of technology needed to be created so that you can build voice AI applications. What do I mean by that? I mean, yeah, you need really good voice AI models, but you also need LLMs. You also need vector databases. You also need WebRTC. You need mobile 5G. Like, you need this whole ecosystem-
- JFJared Friedman
Yeah
- DFDylan Fox
... um, to, to build what we're seeing developers build. You know, w- I remember, like, 2021, we started to get transcription models to work, like, decently well. Um, but the interesting applications were not just, like, transcribing something. It was sentiment analysis of a phone call. It was summarizing a meeting. But if you wanted to do those types of tasks back then, you had to train sentiment analysis models or train summarization models, right? So whereas now, you can use Assembly and, uh, you know, an LLM, and, like, what you can do is... It's, it's insane.
- JFJared Friedman
Yeah.
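[Editor's note] The pattern described here, transcribing audio once and then handing the text to a general-purpose language model for summarization or sentiment, can be sketched as a small composition of two pluggable functions. This is a minimal illustration, not AssemblyAI's API or any vendor's SDK. `Transcriber`, `TextModel`, `analyze_call`, and the stand-in functions are all hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical type aliases: any speech-to-text call and any LLM call
# could be plugged in here. Neither represents a real vendor API.
Transcriber = Callable[[bytes], str]
TextModel = Callable[[str], str]


@dataclass
class CallInsights:
    transcript: str
    summary: str
    sentiment: str


def analyze_call(audio: bytes, transcribe: Transcriber, llm: TextModel) -> CallInsights:
    """Transcribe once, then reuse the transcript for several prompted tasks."""
    transcript = transcribe(audio)
    summary = llm(f"Summarize this call in one sentence:\n{transcript}")
    sentiment = llm(
        "Label the customer's sentiment as positive, neutral, or negative:\n"
        f"{transcript}"
    )
    return CallInsights(transcript, summary, sentiment)


# Stand-ins so the sketch runs without network access or API keys.
def fake_transcribe(audio: bytes) -> str:
    """Pretend the audio bytes are already UTF-8 text."""
    return audio.decode("utf-8")


def fake_llm(prompt: str) -> str:
    """Return a canned answer based on which task the prompt asks for."""
    if prompt.startswith("Summarize"):
        return "Customer asked to change a flight."
    return "neutral"


if __name__ == "__main__":
    result = analyze_call(b"Hi, I need to change my flight.", fake_transcribe, fake_llm)
    print(result)
```

The design point is the one made in the conversation: no task-specific sentiment or summarization model is trained. Each new task is a new prompt over the same transcript, so adding a capability means adding one more `llm(...)` call.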
- DFDylan Fox
And it's super easy to build-
- JFJared Friedman
Yeah
- DFDylan Fox
... and innovate around voice data. So it took a while for that ecosystem of technology to come together, and I think it goes back to probably what I was saying earlier, where if in 2017 I, if I had put together, like, a business plan-
- JFJared Friedman
[laughs]
- DFDylan Fox
... I'd probably been like, "I shouldn't work on this," because [laughs] you know-
- JFJared Friedman
Like, like-
- DFDylan Fox
... the market is so small.
- JFJared Friedman
Yeah, yeah. The TAM is, like, $10 million.
- DFDylan Fox
Yeah, exactly.
- JFJared Friedman
[laughs]
- DFDylan Fox
The market's so small, and all the technology sucks, and, like, why am I working on this?
- JFJared Friedman
[laughs]
- DFDylan Fox
But I, you know, I, I thought it was a fun problem to work on. Um, and I was really fortunate that back then... So Daniel Gross was our YC partner. Um, he had worked at Apple. He saw the state of voice recognition technology, and he was a really early believer. And so he invested personally in the company, um, and really supported the company early on post-YC. And we just had these, like, early believers in the company that knew it was very early. And I think for me, I always had conviction that it was kind of like self-driving cars, where, okay, once you... It wasn't a question of, like, product-market fit. Like, will, will people want voice AI technology? It was a question of, um, when will it be... When will it have product-market fit? When will it get good enough, like self-driving cars?
- JFJared Friedman
Yeah.
- 16:50 – 24:13
The 2021 inflection point
- JFJared Friedman
Who was that? W- The first real customer was-
- DFDylan Fox
Like, real legit customer-
- JFJared Friedman
Real legit customer
- DFDylan Fox
... in 2021. Still a customer of ours. Um, a contact center company.
- JFJared Friedman
Okay.
- DFDylan Fox
And raised our Series A in January of 2022. Um, and then to date, we've raised about-
- JFJared Friedman
Which was still a little bit before LLMs. So, like-
- DFDylan Fox
It was, yeah
- JFJared Friedman
... what, what happened in 2021, 2022-
- DFDylan Fox
Yeah
- JFJared Friedman
... that caused the, like, fir- the, the beginning of the inflection point?
- DFDylan Fox
Yeah. So, um-
- JFJared Friedman
Okay
- DFDylan Fox
... COVID was a big part of it.
- JFJared Friedman
Mm.
- DFDylan Fox
So during COVID, all of a sudden you had way more voice data created and captured over the internet than ever before with remote work. Podcasts started to get popular around that time. Our models started to get better. We started to use, you know, modern transformer architectures, train on more data. So, like, transcription was getting better, it was getting cheaper. Then you had other NLP models like BERT that made it easier to do summarization-
- JFJared Friedman
Mm
- DFDylan Fox
... or sentiment analysis-
- JFJared Friedman
Mm
- DFDylan Fox
... on top of transcription. So you had, like, the ecosystem of technology starting to come together, where it was easier to build a voice AI application. And so there was acceleration in the TAM around that time. Um, and Accel led our Series A, and they saw that too. Um, and then between then and now, obviously it's just continued to accelerate, you know, pretty dramatically. We raised about $160 million in the course of, like, uh-... I don't know, three years or something, um, last. So or, or yeah, like three years. So it was, uh, it, it took a while to get to that point, but then once it did, it really started to accelerate.
- JFJared Friedman
All the use cases in those early years, were they all non-real-time use cases?
- DFDylan Fox
They're all not real-time.
- JFJared Friedman
On-
- DFDylan Fox
Yeah
- JFJared Friedman
... all so, like, analyze all the calls that went through this call center o- for the past week-
- DFDylan Fox
Yeah
- JFJared Friedman
... and, like, find the ones with bad, bad, like, NPS scores-
- DFDylan Fox
Yeah. It was-
- JFJared Friedman
... or things like that
- DFDylan Fox
... pretty much all, you know, use cases on top of pre-recorded audio. And then now, um, there's still a ton of growth in those use cases. Um, like usage to our non-real-time APIs is still growing, you know, 200% year over year. But real-time use cases are exploding because the real-time capabilities are now way better, uh, way lower cost. You have, you know, ecosystem of real-time technology to build real-time note-takers, real-time voice agents. That market is really taking off, that use case. Um, but, but back then it was all non-real time.
- 24:13 – 28:26
Real-time voice agents are here
- DFDylan Fox
Yeah.
- JFJared Friedman
Okay.
- DFDylan Fox
And it's, we're still in this phase where you can't tell, but then like some, some, some people will midway through be able to tell, and it just gets weird. It's like, wait, I'm- I, I just realized I'm talking to an AI.
- JFJared Friedman
Yeah.
- DFDylan Fox
Um, so I think real-time voice agents are gonna continue to be widely deployed, and, um, and the ROI there is like crazy, and the capabilities around real time are getting really amazing. We're also seeing a lot of demand around, [sighs] um, robotics and consumer hardware. So a lot of like pretty popular robotics companies, I don't know if I can name them or not, [laughs] but a lot of really popular robotics companies are, um, putting our models on their robots, so you know, humanoid robots that are walking around.
- JFJared Friedman
Yeah.
- DFDylan Fox
'Cause-
- JFJared Friedman
Oh, that's cool.
- DFDylan Fox
Yeah. I think robotics, but also consumer hardware, you're gonna start interacting with that more and more through your voice. So even, you know, the coffee machine downstairs, right? Like you have to kind of-
- JFJared Friedman
Right
- DFDylan Fox
... flip through a touch screen. But if you could walk up to that coffee machine and just ask for the-
- JFJared Friedman
Yeah
- DFDylan Fox
... exact type of coffee that you want, you know, there's no reason why that technology shouldn't also exist.
- JFJared Friedman
Totally.
- DFDylan Fox
It took a while for consumer hardware to all have a touch screen on it. Um, but now I think, you know, voice will also be a, a another modality into these, um, uh, hardware devices that we, that we buy. So we're seeing a lot of demand there. But then also a- these ambient devices. So, uh, healthcare. We have a ton of healthcare companies-
- JFJared Friedman
Hmm
- DFDylan Fox
... building ambient, uh, scribes for doctors.
- JFJared Friedman
And this is a physical device?
- DFDylan Fox
This is, uh, software-
- JFJared Friedman
Okay
- DFDylan Fox
... that they're just running on a laptop-
- JFJared Friedman
Got it
- DFDylan Fox
... or running on a phone.
- JFJared Friedman
Sure.
- DFDylan Fox
And the doctor/patient visit is captured.
- JFJared Friedman
Yeah.
- DFDylan Fox
'Cause if you listen to some of this audio, it's actually like cra- [laughs] crazy hard. Uh, it's someone's laptop is recording a conversation that's like 10, 15 feet away. Um, tons of background noise, low speech. But the models can do a pretty good job now-
- JFJared Friedman
Wow
- DFDylan Fox
... at capturing all that at- with like, you know, accuracy rates in the 90 percent range. And so, um, now you can actually automate chart notes, and you can automate insurance submissions post-doctor visit. So we're seeing a ton of healthcare companies build ambient scribes, but also sales companies build ambient scribes. So-
- JFJared Friedman
Mm-hmm
- 28:26 – 45:33
Inside AssemblyAI’s new voice models
- DFDylan Fox
a common issue is like which voice do I listen to and take commands from?
- JFJared Friedman
Ah.
- DFDylan Fox
And which voice is the voice that's just like chatting with the human?
- JFJared Friedman
Interesting.
- DFDylan Fox
Right? So-
- JFJared Friedman
Kind of the like noisy restaurant problem of-
- DFDylan Fox
Yeah.
- JFJared Friedman
Yeah.
- DFDylan Fox
Yeah. And, and knowing like who, who is saying what, what's the role of the person. So all this we think about as, you know, more intelligent, you know, voice AI capabilities that we're focused on building that will operate in real time. Um, I, we think that's like a, a, a big capability that we're seeing a lot of demand for. And so that's... We just released a model last week, um, called Universal 3 Pro. It's our latest model, and it's our most intelligent model 'cause you can give it instructions on what you want it to do around the audio that it's listening to, and it can follow those instructions pretty well, and this is just the first release. Um, and I, I can show you some of that.
- JFJared Friedman
In, in, in, in instructions like a, like an LLM prompt.
- DFDylan Fox
Correct. Yeah. So it's, it's... We, we, we think about it somewhere in, in the middle between a multimodal LLM and a traditional speech-to-text model, where what we really did was focus on creating a reliable, you know, transcription model, um, that can, can stay on the rails on transcription and transcription-like tasks, but that c- you can still give instructions to. Whereas if you were to use a multimodal LLM today and give it instructions, it goes off the rails too much. It's not reliable enough in real world use cases. So, um, we focused a lot on the post-training of this model so that it's like a reliable transcription model, but you can give it instructions and really guide it on what you want it to do.
- JFJared Friedman
Okay, so let me see if I, if I, if I understand this right. Like historically, the trans- transcription models, like the ones that you guys built, d- only do transcription. They don't... They, they lack the general in- like the, the general intelligence of a, of a, of an LLM. And so like the magical combination came when you paired a transcription model with an LLM. What this new model, it's kind of like the fusion of the two of them. It is a transcription model, but you've actually injected the like generalized intelligence of an LLM into it.
- DFDylan Fox
Yeah. Mor- more or less, exactly. Yeah.
- JFJared Friedman
Okay.
- DFDylan Fox
Yeah. 'Cause we- If you think about a multimodal general purpose LLM, it can do transcription tasks, but that's maybe 10% of its total training data, right?
- JFJared Friedman
Mm.
- DFDylan Fox
So it also can do math tasks, and for, um, our customers, they need a reliable voice AI model that can understand speech and speakers. And so we really focus the model on those types of tasks.
- JFJared Friedman
Yeah.
- DFDylan Fox
And, um, it doesn't get confused and think it's like a- an assistant. [laughs]
- JFJared Friedman
[laughs]
- DFDylan Fox
It, it will, it will kinda operate as this like... as this kind of i- i- in this like narrow space.
- JFJared Friedman
And I assume it's also much smaller and faster and cheaper-
- DFDylan Fox
Yeah, correct
- JFJared Friedman
... to run than a full LLM, where you're, like, taking this giant brain and having it only do this one little thing, right?
- DFDylan Fox
Yeah.
- JFJared Friedman
Yeah.
- DFDylan Fox
You can run it in real time. Um, you can actually deploy it on your own server, so we support self hosting these models-
- JFJared Friedman
Mm-hmm
- DFDylan Fox
... uh, to get latency down. And where we're really excited is for customers to use this in, um, their real-time voice agents, because you can just capture a lot more, uh, a lot more information from the speaker.
- JFJared Friedman
Can we see a demo?
- 45:33 – 52:00
Lessons from hypergrowth
- DFDylan Fox
and hire more people into that. But I think a lot of companies y- you know, will follow the traditional playbook of, like, "Okay, this company, which is similar to mine, did X and Y and Z, therefore, like, I should do X and Y and Z, and I should go build out a, you know, 50-person SDR team. And I should go, you know, I need this and that," as if it's, like, a franchise business.
- JFJared Friedman
Mm-hmm.
- DFDylan Fox
Um, but I think every company's different. And so knowing what to invest in, what teams to build, um, especially in an emerging market like the one we're in, is- is- is difficult. And, um, now we're more conservative over, like, hey, where do we really wanna invest versus where is this just an, you know, something- s- an area we wanna explore? And if it's exploration, let's do that with the existing team. Um, and then also just, like, really just strongly filtering for, um, non-negotiables and- and culture fits. I think-
- JFJared Friedman
Like what?
- DFDylan Fox
So every role, I think, has different non-negotiables. Um, but then you start talking to people and you're like, "Oh, I like this person," and, you know, we're- we're vibing and so, "Yeah, let's hire them." But then you realize, oh, I really, you know... This thing was really important in the role, and I should've been firmer in that- on that non-negotiable. So, like, now we're very clear on, all right, for these, for this role, these are the non-negotiables. For this role, these are the non-negotiables. For me as a founder, one thing that's really important when hiring is, um, why do you wanna work at a voice AI company? Like, why here? Um, not just an AI company, not just a VC-backed company, not just a company that's growing fast, but, like, what about our product? What about our space, our market, our customers are y- are you passionate about? Um, because I think that, um, that- that's a really important ingredient, right? Like, there's- there's challenging problems-
- JFJared Friedman
Mm-hmm
- DFDylan Fox
... here that, you know, in any company there's a lot of challenging problems. But I think that, you know, what I care a lot about is- is- is finding people that are really excited about our product and what our customers are building, um, and- and will be a lot more, uh, you know, obsessed and passionate as a result. And so that's something that I think I've, I also, um, am a lot more, like, strict on when we're hiring people now. So there's all these, like, things that I've- I've- I've learned, and I think that- that was probably, um, one of the harder lessons to learn. But now we're- we're a- about 80 people at the company, um, super dense team.
- JFJared Friedman
Only 80 people?
- DFDylan Fox
Yeah.
- JFJared Friedman
That's actually a lot smaller than I would've expected given the- the 700 million hours of audio [laughs] that you're processing every year.
- DFDylan Fox
Yeah, yeah. Super, super small, dense team. I mean, we're, like, you know, using AI everywhere across the company-
- JFJared Friedman
Mm-hmm
- DFDylan Fox
... um, to just be, like, to move as fast as possible. And I- I think, you know, one thing we talk about at the company is we're gonna optimize for speed and innovation. Like, that's gonna be an explicit decision. So we don't do, like, OKR cascades, and we don't do planning exercises. Like, we have m- clear metrics that we wanna hit, and those are very transparent across the whole company. And then we're all just running towards those, and we can move quickly and innovate and be dynamic and responsive because we're this small, transparent team. Um, and I- I think that my goal is, like... having as little operational overhead as possible at all times, like teetering like just on the, the, the line of like complete chaos. Like, [laughs] like I wanna be like right there.
- JFJared Friedman
[laughs]
- DFDylan Fox
Um, 'cause we've, you know, we've like oscillated, right? Like, it feels good to have like, here's every team's OKRs, and here's every team's six-month roadmap, and here's every team's-
- JFJared Friedman
Yeah
- DFDylan Fox
... like strategy doc. And-
- JFJared Friedman
Like Microsoft style.
- DFDylan Fox
Yeah. It's like-
- JFJared Friedman
Yeah
- DFDylan Fox
... here everything is like planned and organized, and it's like that, that feels good.
- JFJared Friedman
Yeah.
- DFDylan Fox
But like that's not a replacement for like you actually gotta go build the thing and sh- and do what's in that document.
- JFJared Friedman
Yeah.
- DFDylan Fox
And then what if two months in you get new information that you should adapt to?
- JFJared Friedman
Right.
- DFDylan Fox
If you're committed to some plan, you're like, "Oh, we'll do that next quarter."
- JFJared Friedman
Yeah.
- DFDylan Fox
It's like, well, we should be doing that right now if that's-
- JFJared Friedman
Yeah
- 52:00 – 53:00
The future of voice AI
- DFDylan Fox
getting. Like-
- JFJared Friedman
Hmm
- DFDylan Fox
... all that is brought in-
- JFJared Friedman
Hmm
- DFDylan Fox
... to the company. Th- there's like all the like truth out there, right? Of like your product and feedback in the market and stuff, right? Um, and I think it's easy to like put a subjective point of view on that, right? And usually it's like the people in charge that get to decide what, what the subjective point of view is.
- JFJared Friedman
Yeah. That's fair.
- DFDylan Fox
But if, if you have all that, all that truth just like publicly available, it's like e- everyone is looking at this information objectively, and there is no subjective layer. Um, and I think that's been so powerful for us as a company, and I think that that's how like most companies should and will... The companies that are g- successful I think will operate in the same way.
- JFJared Friedman
Yeah, I mean, it's, it's just essential. Like the, the, the, the companies that operate without AI augmentation of human intelligence will lose to the ones that have augmented their intelligence.
- DFDylan Fox
Totally. Yeah. [upbeat music]
Episode duration: 53:00
Transcript of episode Elm2p_TRPwk