YC Root Access

This Startup Built the Infrastructure Powering Voice AI

In this episode of Founder Firesides, YC Managing Partner Jared Friedman talks to Dylan Fox, the Founder of AssemblyAI (S17), which has raised $160M to date. AssemblyAI is the voice AI infrastructure platform powering 10,000 companies, including Granola, Zoom and Delta Airlines. https://www.assemblyai.com/

Apply to Y Combinator: https://www.ycombinator.com/apply
Work at a startup: https://www.ycombinator.com/jobs

Chapters:
02:08 - What AssemblyAI actually does
05:23 - Dylan learns to code and discovers ML
07:11 - The Amazon Echo moment
09:32 - Why Dylan built voice AI infrastructure
13:02 - Building AI before anyone cared
16:50 - The 2021 inflection point
24:13 - Real-time voice agents are here
28:26 - Inside AssemblyAI’s new voice models
45:33 - Lessons from hypergrowth
52:00 - The future of voice AI

Jared Friedman (host), Dylan Fox (guest)
Mar 5, 2026 · 53m

EVERY SPOKEN WORD

  1. 0:00-2:08

    Intro

    1. JF

      [upbeat music] It's super cool to be here today with Dylan from AssemblyAI. Dylan, thank you so much for joining us.

    2. DF

      Yeah, I'm excited to be here.

    3. JF

      So AssemblyAI is one of the first AI companies YC ever funded in it, the, in our first show, we called at the time, AI Cohort [laughs] back in the summer of 2017, and today it's actually one of the most successful companies YC has ever funded. For folks who aren't familiar, how about you just tell everybody what AssemblyAI is and, uh, where you're at now?

    4. DF

      So we help other companies build voice AI features and applications. Everything from AI note takers, to AI capabilities in contact centers, to real-time voice agents. Healthcare companies are building ambient medical scribes on top of our voice AI infrastructure platform. So we have about a million developers that have signed up to the platform. Um, we have about 10,000 customers. Last year, o- around 250 million voice hours were sent through our platform, and we're now doing-

    5. JF

      250 million voice hours. Okay. [laughs]

    6. DF

      Yeah. We're now doing, um, almost two million hours per day. Uh, so the run rate-

    7. JF

      Wow

    8. DF

      ... on that would be, you know, like 700 million [laughs] voice hours for this year.
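As a quick check on the run rate quoted above, the arithmetic can be sketched in a few lines of Python. The inputs are the rounded figures from the conversation (almost two million hours per day, around 250 million hours last year, roughly 700 million hours this year); the function and variable names are illustrative only.

```python
# Back-of-the-envelope check of the voice-hour run rate mentioned in the interview.
# All inputs are rounded figures quoted in conversation, not audited numbers.

DAYS_PER_YEAR = 365


def annual_run_rate(hours_per_day: float, days: int = DAYS_PER_YEAR) -> float:
    """Project a daily volume to a full-year run rate."""
    return hours_per_day * days


# "Almost two million hours per day" treated as an upper bound of 2,000,000.
upper_bound = annual_run_rate(2_000_000)

# Daily volume implied by the quoted "like 700 million" annual figure.
implied_daily = 700_000_000 / DAYS_PER_YEAR

# Growth multiple versus the roughly 250 million hours quoted for last year.
growth_multiple = 700_000_000 / 250_000_000

print(f"{upper_bound / 1e6:.0f} million voice hours per year at 2M/day")
print(f"{implied_daily / 1e6:.2f} million hours per day implied by 700M/year")
print(f"{growth_multiple:.1f}x last year's volume")
```

Under these assumptions, two million hours per day is an upper bound of 730 million hours per year, so "almost two million" per day and "like 700 million" per year are consistent with each other.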

    9. JF

      Voice hours. Wow.

    10. DF

      And it's continuing to grow week over week. Um, so we really are focused on helping other companies, uh, have the voice AI infrastructure and primitives they need to just innovate around voice and to build innovative products and features, uh, either like standalone or in their, in their products.

    11. JF

      Do you have an example of maybe one or two customers that people would know and how, uh-

    12. DF

      Yeah

    13. JF

      ... how, how they're built on Assembly?

    14. DF

      Yeah. For sure. So note taking is a really popular use case that companies are, are, uh, shipping with our platform. So if you've used Granola or Fireflies note takers, those use AssemblyAI's, uh, platform.

    15. JF

      Interesting. So, so all, all my Granola notes that I, you know, I had said to where w- it was actually all going through Assembly.

    16. DF

      Yeah. Uh-

    17. JF

      Okay

    18. DF

      ... that's right. Um, also, um, in, in hiring, uh, hiring

  2. 2:08-5:23

    What AssemblyAI actually does

    1. DF

      segment, so, um, if you've used MetaView or, um, Ashby's note taker-

    2. JF

      Okay. Yeah. Ashby. Yeah. Excellent

    3. DF

      ... yeah. Um, yeah. So note takers are a really big segment. We're also, um, pretty widely deployed in contact centers, so there's a big contact center company, they're called Collaboro. Um, they provide contact center, uh, software to big, uh, brands like Delta Airlines.

    4. JF

      Okay. So if I called 1-800-Delta or, or, or whatever to like change my Delta flight, there's a good chance that like under the hood that audio was actually being processed by Assembly through this like chain of customer relationships.

    5. DF

      Most likely, yeah.

    6. JF

      Okay.

    7. DF

      So right now, um, product teams are shipping, uh, products and cap- like capabilities around voice with our platform, but then we're also seeing enterprises. So right now we have a, a lot of, um, Fortune 500 enterprises that are building AI capabilities themselves within their contact center operations or within their trust-

    8. JF

      Hmm

    9. DF

      ... and safety operations around voice to automate workflows for their team. So it's a mix of product teams, but then also enterprise companies that are, are, uh, looking to build with voice AI. So Zoom, for example, um, has used our, our infrastructure for-

    10. JF

      Oh, wow

    11. DF

      ... a, a number of different capabilities. Yeah.

    12. JF

      You guys were super early to AI. 2017, this was like way, way before ChatGPT. I remember your batch. There was what? A handful of companies working on AI?

    13. DF

      Yeah. Yeah, there were like 10 other companies in the Y, in the AI cohort.

    14. JF

      Tell us the origin story. How did you end up starting this company? How did you end up getting into AI so early?

    15. DF

      Yeah. So I, um, I started a c- I mean, the, the really, you know, ear- early story is I started a company in college, um, and, and it was not an AI company, but through that, uh, that experience, I taught myself how to code. So, um, you know, you wouldn't need to do this now with AI, but back then I bought a bunch of programming books on PHP and Python and Django, and I just read them all and taught myself how to code and started-

    16. JF

      Hmm

    17. DF

      ... building, um, like SaaS software products and, and, and launching them as, as little startups. Um-

    18. JF

      Like what?

    19. DF

      So one-

    20. JF

      What did you launch?

    21. DF

      [laughs] One was like a fundraising tool for college organizations.

    22. JF

      Okay.

    23. DF

      Um, one was a tool where, uh, small businesses could like put up a QR code and then customers could text them, uh, f- uh, like a, a number that they could text feedback to, so it was like anonymous feedback capture through text message, a bunch of random stuff. Um, and I, I learned from that, that I really loved programming and that I really loved the, the creative, um, like journey of building something and, and like thinking of this, you have this like fuzzy picture of a product or something in your mind, and then you could just, you know, through programming, build it and see it come to life and get feedback. It was this very like addictive kind of, um, feedback loop that I, I really enjoyed. So after college, I just, I kept programming and kept, you know, learning about, um, uh, everything from like, uh, uh, security to machine learning, and I really got interested in machine learning. So this is like support vector machine,

  3. 5:23-7:11

    Dylan learns to code and discovers ML

    1. DF

      uh, time. This is not like deep neural net time.

    2. JF

      Yeah.

    3. DF

      Um, eventually I joined a machine learning team out in San Francisco. So I, I moved to San Francisco, joined a team at Cisco, um, the company Cisco, that's based here. It's actually like a few blocks from where we're filming right now.

    4. JF

      Oh, cool.

    5. DF

      That's where my job was in Dogpatch. Yeah. And that was where I got, I, I started getting, learning about neural networks and getting into neural networks.

    6. JF

      Hmm.

    7. DF

      So to give you a sense of time, this is like 2015. Um, so, uh, like very early into, to, you know, the neural network boom.

    8. JF

      It's just like right when, when we ... Wait. W- w- when did we crack ImageNet?

    9. DF

      I think it was like 2013 or '14.

    10. JF

      '14. Okay. Yeah.

    11. DF

      It was around that time.

    12. JF

      Okay.

    13. DF

      Um, but I do remember going to the very first TensorFlow meetup down at Google's headquarters in like 2014 or '15. So it was, you know, that, that was the, the, um, the state of things at that time. A- so a couple things happened around that time. Um, one is I saw that the AI and, and neural networks and deep learning were going to, um, take off over the next decade. That it was such early days and data, compute, algorithms were just gonna continue to get stronger and stronger, and they were going to bring, you know, really disruptive new capabilities to everything from self-driving cars to natural language processing and, you know, ev- everything that we've seen over the last 10 years. Um, and so I just really felt like, wow, that's an area I wanna... I, I was drawn to and wanted to keep going, uh, deeper in. And then I, [clears throat] I bought an Amazon Echo when that came out in 2015. It was crazy because my experience with voice recognition, uh, prior to the Amazon Echo was that it

  4. 7:11-9:32

    The Amazon Echo moment

    1. DF

      sucked, right?

    2. JF

      It sucked. Yeah, yeah.

    3. DF

      Siri, Siri didn't work.

    4. JF

      Yeah.

    5. DF

      Everything sucked.

    6. JF

      Yeah.

    7. DF

      You didn't use it. But the Amazon Echo, you, you could use it across the room, so far field it worked, like, you know, great distance from the microphone. Your TV could be on, you could be, you know, shouting to it over a bunch of background noise, and it would still work. And what was so incredible was that it worked so well that I found I was building new habits around the product because it was reliable.

    8. JF

      Mm.

    9. DF

      So setting timers, asking for the weather, playing music. It was just like, I was blown away by how well those things worked over voice and that it was reliable. Um, and so I started looking into voice recognition technology to just build stuff on my own and to innovate and experiment on my own. Um, and there was nothing in the market at the time. It was either, like, bad or, um... So the, the company Nuance at the time, which was like a big-

    10. JF

      They were the big incumbent.

    11. DF

      Yeah, yeah, exactly. Um, I got in touch with them, and I, you know, was trying to find their developer SDKs, and it, it... You had to pay, I think it was like, you know, thousands of dollars up front.

    12. JF

      I, I, I remember. I, I purchased it, yeah. [laughs]

    13. DF

      And then they would mail you a CD-ROM-

    14. JF

      Ooh

    15. DF

      ... with a developer SDK on it, and I didn't even have a CD-ROM drive-

    16. JF

      [laughs]

    17. DF

      ... on my laptop. So, you know, for me, I was used to the Twilio or Stripe-

    18. JF

      Mm

    19. DF

      ... style developer experience and, and [laughs] this was, like, the opposite of that. Um, so all these, like, ideas kinda came together for me and I got really excited about the idea of what if you could, um, you know, use like these new deep learning algorithms, a- all this innovation that was happening around AI and deep learning at the time and to, to build way better, uh, voice AI capabilities and technology, and then make it super easy for any developer to build with those things. 'Cause for me, I remember getting access to Twilio, you know, back in the day-

    20. JF

      Yeah

    21. DF

      ... so like 2012 or 2013, getting access to Twilio, and I had s- I, I was just overwhelmed with creativity-

    22. JF

      Yeah

    23. DF

      ... because I was like, wow, now I ha- like, I have access to this cool technology-

    24. JF

      Yeah

    25. DF

      ... me as a, as a college student developer, and, like, I can e- experiment with all these really cool

  5. 9:32-13:02

    Why Dylan built voice AI infrastructure

    1. DF

      ideas. And, and that has been the whole spirit about Assembly. You know, we're not an application company. We're really focused on being a developer infrastructure tool that just makes it really easy for a developer, whether you're, you know, a, a college student or a high school student or a development team at a Fortune 500, to have access to this amazingly powerful technology, um, [clears throat] effortlessly and with a really good experience around it. So that, that was the, [clears throat] kinda, um, the, the mix of, of ideas that came together and then, you know, is still what we're, what we're focused on, you know, delivering, uh, many years later.

    2. JF

      I don't know if you've read Paul Graham's essays, but, um-

    3. DF

      Some of them

    4. JF

      ... a lot of his, his early essays talk about how, um, most of the best companies of the past, like Microsoft and Apple, didn't start as companies. They started as just an engineer building technology that they found intellectually interesting, without any clear idea of how it would turn into-

    5. DF

      Yeah

    6. JF

      ... a business and scratching their own itch, right? Like, building the thing that th- that they saw missing in, in the world-

    7. DF

      Yeah

    8. JF

      ... that they personally wanted to use. And, like, you know, at the time that you started this, like, AI was not the hot thing. It was not, it was not the thing raising huge venture rounds. It-

    9. DF

      You couldn't use AI.

    10. JF

      Yeah. [laughs]

    11. DF

      If you said AI in your... It, it was deep learning was the, the-

    12. JF

      Okay

    13. DF

      ... the thing to say. If you used AI-

    14. JF

      AI

    15. DF

      ... it was, it was, um, it was like a scam. [laughs]

    16. JF

      Yes. [laughs] Exactly.

    17. DF

      That was the state of things. Yeah.

    18. JF

      Totally. Yeah. Really 'cause there had been like decades of, like, failed promises-

    19. DF

      Yeah, yeah, yeah

    20. JF

      ... from AI startups-

    21. DF

      Yeah

    22. JF

      ... so VCs were allergic to anything with AI in it. Yeah. [laughs]

    23. DF

      Yeah, exactly. Exactly. My experience as a founder has, has, has been one that I, I think you have to be obsessed with the problem that you're solving.

    24. JF

      Yeah.

    25. DF

      Because what keeps me going is, like, I really want this product to exist that, that we have not, you know, finished creating yet, you know? And, and I, I'm, I'm obsessed with, like, that problem and wanting that to, to be realized. Whereas I think if you're just focused on like a business opportunity, you know, um, then, and you're not obsessed with the product or the, the subject matter, um, you can find yourself losing interest. But I think if you're... You know, you should pick... M- my lesson has been, like, and what I share with other founders, like, you should pick a problem that you're, you're obsessed about. Because when I think about, like, what would I be doing if I wasn't building Assembly, you know, I, I don't know. Like [laughs] I'm having a lot of fun-

    26. JF

      [laughs]

    27. DF

      ... building Assembly. Um, and I use our product all the time. Like, I'm constantly dogfooding it and playing with it and, and showing people demos, and it's fun. So I think that, you know, picking something, working on something that, like, y- you, you want to exist because you wanna be a user of it, um, ha- has been really fun.

    28. JF

      To, to me, probably the craziest part of the Assembly story is that you started working on this in 2017 when very few people were interested in AI.

    29. DF

      Yeah.

    30. JF

      And it basically didn't start taking off until 2021, 2022. You, you raised your Series A in 2022, five years after starting the company.

  6. 13:02-16:50

    Building AI before anyone cared

    1. DF

      Yeah. Yeah, so I got into YC summer to- summer 2017. Um, solo founder. I had pretty much just started working on the company, and I got in la- I applied late, got in a week before the batch started, and I was just so stressed out. [laughs]

    2. JF

      [laughs]

    3. DF

      Because, uh, you had all these other companies that, you know, had, like, real products, and they were further along, and, um, they could iterate so much faster. But, like, it... I picked a very hard product to work on back then, and I just remember being, you know, s- [laughs] so stressed out and so overwhelmed ba- back in 2017. Um-

    4. JF

      Yeah, everybody else was just, like, building database-backed websites.

    5. DF

      Right.

    6. JF

      They could, like, push things every day.

    7. DF

      It's like, oh, you get feedback-

    8. JF

      Yeah. [laughs]

    9. DF

      ... you can iterate on it that night, right? Whereas, like, our early users would give us feedback, and it's like, "All right. We'll come back to you in a month [laughs] when we have a, a better model." And YC is three months, so we had three-

    10. JF

      [laughs]

    11. DF

      ... you know, iteration cycles.

    12. JF

      [laughs]

    13. DF

      It's not, it's not a lot of iteration. Um, and no one was building with voice because, like, what I have now realized is that, um, this whole ecosystem of technology needed to be created l- so that you can build voice AI applications. What do I mean by that? I mean, yeah, you need really good voice AI models, but you also need LLMs. You also need vector databases. You also need WebRTC. You need mobile 5G. Like, you need this whole ecosystem-

    14. JF

      Yeah

    15. DF

      ... um, to, to build what we're seeing developers build. You know, w- I remember, like, 2021, we started to get transcription models to work, like, decently well. Um, but the interesting applications were not just, like, transcribing something. It was sentiment analysis of a phone call. It was summarizing a meeting. But if you wanted to do those types of tasks back then, you had to train sentiment analysis models or train summarization models, right? So whereas now, you can use Assembly and, uh, you know, an LLM, and, like, what you can do is... It's, it's insane.
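The pattern described above, a transcription step followed by a general-purpose language model instead of separately trained sentiment and summarization models, can be sketched roughly as follows. This is a minimal illustration, not AssemblyAI's API: `Transcriber`, `TextModel`, `analyze_call`, and the two stub functions are hypothetical names, and the stubs stand in for whatever speech-to-text and LLM services an application would actually call.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical interfaces: any speech-to-text function and any text-in/text-out model.
Transcriber = Callable[[bytes], str]
TextModel = Callable[[str], str]


@dataclass
class CallInsights:
    transcript: str
    summary: str
    sentiment: str


def analyze_call(audio: bytes, transcribe: Transcriber, llm: TextModel) -> CallInsights:
    """Transcribe once, then reuse the transcript for several prompted tasks."""
    transcript = transcribe(audio)
    summary = llm("Summarize this call in one sentence:\n\n" + transcript)
    sentiment = llm(
        "Classify the customer's sentiment as positive, neutral, or negative. "
        "Reply with one word.\n\n" + transcript
    )
    return CallInsights(transcript, summary.strip(), sentiment.strip().lower())


# Stub implementations so the sketch runs without any external service.
def fake_transcribe(audio: bytes) -> str:
    return "Agent: How can I help? Customer: My flight was cancelled and I am upset."


def fake_llm(prompt: str) -> str:
    if prompt.startswith("Classify"):
        return "Negative"
    return "The customer called about a cancelled flight."


insights = analyze_call(b"", fake_transcribe, fake_llm)
print(insights.sentiment)
print(insights.summary)
```

The design point is that each new task is another prompt over the same transcript, rather than another model to train.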

    16. JF

      Yeah.

    17. DF

      And it's super easy to build-

    18. JF

      Yeah

    19. DF

      ... and innovate around voice data. So it took a while for that ecosystem of technology to come together, and I think it goes back to probably what I was saying earlier, where if in 2017 I, if I had put together, like, a business plan-

    20. JF

      [laughs]

    21. DF

      ... I'd probably been like, "I shouldn't work on this," because [laughs] you know-

    22. JF

      Like, like-

    23. DF

      ... the market is so small.

    24. JF

      Yeah, yeah. The TAM is, like, $10 million.

    25. DF

      Yeah, exactly.

    26. JF

      [laughs]

    27. DF

      The market's so small, and all the technology sucks, and, like, why am I working on this?

    28. JF

      [laughs]

    29. DF

      But I, you know, I, I thought it was a fun problem to work on. Um, and I was really fortunate that back then... So Daniel Gross was our YC partner. Um, he had worked at Apple. He saw the state of voice recognition technology, and he was a really early believer. And so he invested personally in the company, um, and really supported the company early on post-YC. And we just had these, like, early believers in the company that knew it was very early. And I think for me, I always had conviction that it was kind of like self-driving cars, where, okay, once you... It wasn't a question of, like, product-market fit. Like, will, will people want voice AI technology? It was a question of, um, when will it be... When will it have product-market fit? When will it get good enough, like self-driving cars?

    30. JF

      Yeah.

  7. 16:50-24:13

    The 2021 inflection point

    1. JF

      Who was that? W- The first real customer was-

    2. DF

      Like, real legit customer-

    3. JF

      Real legit customer

    4. DF

      ... in 2021. Still a customer of ours. Um, a contact center company.

    5. JF

      Okay.

    6. DF

      And raised our Series A in January of 2022. Um, and then to date, we've raised about-

    7. JF

      Which was still a little bit before LLM. So, like-

    8. DF

      It was, yeah

    9. JF

      ... what, what happened in 2021, 2022-

    10. DF

      Yeah

    11. JF

      ... that caused the, like, fir- the, the beginning of the inflection point?

    12. DF

      Yeah. So, um-

    13. JF

      Okay

    14. DF

      ... COVID was a big part of it.

    15. JF

      Mm.

    16. DF

      So during COVID, all of a sudden you had way more voice data created and captured over the internet than ever before with remote work. Podcasts started to get popular around that time. Our models started to get better. We started to use, you know, modern transformer architectures, train on more data. So, like, transcription was getting better, it was getting cheaper. Then you had other NLP models like BERT that made it easier to do summarization-

    17. JF

      Mm

    18. DF

      ... or sentiment analysis-

    19. JF

      Mm

    20. DF

      ... on top of transcription. So you had, like, the ecosystem of technology starting to come together, where it was easier to build a voice AI application. And so there was acceleration in the TAM around that time. Um, and Accel led our Series A, and they saw that too. Um, and then between then and now, obviously it's just continued to accelerate, you know, pretty dramatically. We raised about $160 million in the course of, like, uh-... I don't know, three years or something, um, last. So or, or yeah, like three years. So it was, uh, it, it took a while to get to that point, but then once it did, it really started to accelerate.

    21. JF

      All the use cases in those early years, were they all non-real-time use cases?

    22. DF

      They're all not real-time.

    23. JF

      On-

    24. DF

      Yeah

    25. JF

      ... all so, like, analyze all the calls that went through this call center o- for the past week-

    26. DF

      Yeah

    27. JF

      ... and, like, find the ones with bad, bad, like, NPS scores-

    28. DF

      Yeah. It was-

    29. JF

      ... or things like that

    30. DF

      ... pretty much all, you know, use cases on top of pre-recorded audio. And then now, um, there's still a ton of growth in those use cases. Um, like usage to our non-real-time APIs is still growing, you know, 200% year over year. But real-time use cases are exploding because the real-time capabilities are now way better, uh, way lower cost. You have, you know, ecosystem of real-time technology to build real-time note-takers, real-time voice agents. That market is really taking off, that use case. Um, but, but back then it was all non-real time.

  8. 24:13-28:26

    Real-time voice agents are here

    1. DF

      Yeah.

    2. JF

      Okay.

    3. DF

      And it's, we're still in this phase where you can't tell, but then like some, some, some people will midway through be able to tell, and it just gets weird. It's like, wait, I'm- I, I just realized I'm talking to an AI.

    4. JF

      Yeah.

    5. DF

      Um, so I think real-time voice agents are gonna continue to be widely deployed, and, um, and the ROI there is like crazy, and the capabilities around real time are getting really amazing. We're also seeing a lot of demand around, [sighs] um, robotics and consumer hardware. So a lot of like pretty popular robotics companies, I don't know if I can name them or not, [laughs] but a lot of really popular robotics companies are, um, putting our models on their robots, so you know, humanoid robots that are walking around.

    6. JF

      Yeah.

    7. DF

      'Cause-

    8. JF

      Oh, that's cool.

    9. DF

      Yeah. I think robotics, but also consumer hardware, you're gonna start interacting with that more and more through your voice. So even, you know, the coffee machine downstairs, right? Like you have to kind of-

    10. JF

      Right

    11. DF

      ... flip through a touch screen. But if you could walk up to that coffee machine and just ask for the-

    12. JF

      Yeah

    13. DF

      ... exact type of coffee that you want, you know, there's no reason why that technology shouldn't also exist.

    14. JF

      Totally.

    15. DF

      It took a while for consumer hardware to all have a touch screen on it. Um, but now I think, you know, voice will also be a, a another modality into these, um, uh, hardware devices that we, that we buy. So we're seeing a lot of demand there. But then also a- these ambient devices. So, uh, healthcare. We have a ton of healthcare companies-

    16. JF

      Hmm

    17. DF

      ... building ambient, uh, scribes for doctors.

    18. JF

      And this is a physical device?

    19. DF

      This is, uh, software-

    20. JF

      Okay

    21. DF

      ... that they're just running on a laptop-

    22. JF

      Got it

    23. DF

      ... or running on a phone.

    24. JF

      Sure.

    25. DF

      And the doctor/patient visit is captured.

    26. JF

      Yeah.

    27. DF

      'Cause if you listen to some of this audio, it's actually like cra- [laughs] crazy hard. Uh, it's someone's laptop is recording a conversation that's like 10, 15 feet away. Um, tons of background noise, low speech. But the models can do a pretty good job now-

    28. JF

      Wow

    29. DF

      ... at capturing all that at- with like, you know, accuracy rates in the 90 percentile. And so, um, now you can actually automate chart notes, and you can automate insurance submissions post-doctor visit. So we're seeing a ton of healthcare companies build ambient scribes, but also sales companies build ambient scribes. So-

    30. JF

      Mm-hmm

  9. 28:26-45:33

    Inside AssemblyAI’s new voice models

    1. DF

      a common issue is like which voice do I listen to and take commands from?

    2. JF

      Ah.

    3. DF

      And which voice is the voice that's just like chatting with the human?

    4. JF

      Interesting.

    5. DF

      Right? So-

    6. JF

      Kind of the like noisy restaurant problem of-

    7. DF

      Yeah.

    8. JF

      Yeah.
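The question raised above, which voice a device should take commands from and which voice is just chatting, can be illustrated as a filter over speaker-labeled utterances. This is only a toy sketch under stated assumptions: it presumes an upstream model has already attached a speaker label to each utterance, and the `Utterance` type, the `commands_for_device` function, the `operator` label, and the wake word are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Utterance:
    speaker: str  # hypothetical speaker label from an upstream model, e.g. "A"
    text: str


def commands_for_device(
    utterances: list[Utterance], operator: str, wake_word: str = "robot"
) -> list[str]:
    """Keep only utterances from the designated operator that address the device."""
    commands = []
    for utterance in utterances:
        if utterance.speaker != operator:
            continue  # someone else talking, even if they say the wake word
        words = utterance.text.lower().split()
        if words and words[0].strip(",.!?") == wake_word:
            commands.append(utterance.text)
    return commands


conversation = [
    Utterance("A", "Robot, bring me the red box."),
    Utterance("B", "Did you see the game last night?"),
    Utterance("A", "Not yet, I was working late."),
    Utterance("B", "Robot, stop."),
]

commands = commands_for_device(conversation, operator="A")
print(commands)
```

In this example only speaker A's first utterance is treated as a command; speaker B's "Robot, stop." is ignored because B is not the designated operator. A real system would need a far more robust notion of role and intent than a wake word and a fixed label.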

    9. DF

      Yeah. And, and knowing like who, who is saying what, what's the role of the person. So all this we think about as, you know, more intelligent, you know, voice AI capabilities that we're focused on building that will operate in real time. Um, I, we think that's like a, a, a big capability that we're seeing a lot of demand for. And so that's... We just released a model last week, um, called Universal 3 Pro. It's our latest model, and it's our most intelligent model 'cause you can give it instructions on what you want it to do around the audio that it's listening to, and it can follow those instructions pretty well, and this is just the first release. Um, and I, I can show you some of that.

    10. JF

      In, in, in, in instructions like a, like an LLM prompt.

    11. DF

      Correct. Yeah. So it's, it's... We, we, we think about it somewhere in, in the middle between a multimodal LLM and a traditional speech-to-text model, where what we really did was focus on creating a reliable, you know, transcription model, um, that can, can stay on the rails on transcription and transcription-like tasks, but that c- you can still give instructions to. Whereas if you were to use a multimodal LLM today and give it instructions, it goes off the rails too much. It's not reliable enough in real world use cases. So, um, we focused a lot on the post-training of this model so that it's like a reliable transcription model, but you can give it instructions and really guide it on what you want it to do.

    12. JF

      Okay, so let me see if I, if I, if I understand this right. Like historically, the trans- transcription models, like the ones that you guys built, d- only do transcription. They don't... They, they lack the general in- like the, the general intelligence of a, of a, of an LLM. And so like the magical combination came when you paired a transcription model with an LLM. What this new model, it's kind of like the fusion of the two of them. It is a transcription model, but you've actually injected the like generalized intelligence of an LLM into it.

    13. DF

      Yeah. Mor- more or less, exactly. Yeah.

    14. JF

      Okay.

    15. DF

      Yeah. 'Cause we- If you think about a multimodal general purpose LLM, it can do transcription tasks, but that's maybe 10% of its total training data, right?

    16. JF

      Mm.

    17. DF

      So it also can do math tasks, and for, um, our customers, they need a reliable voice AI model that can understand speech and speakers. And so we really focus the model on those types of tasks.

    18. JF

      Yeah.

    19. DF

      And, um, it doesn't get confused and think it's like a- an assistant. [laughs]

    20. JF

      [laughs]

    21. DF

      It, it will, it will kinda operate as this like... as this kind of i- i- in this like narrow space.

    22. JF

      And I assume it's also much smaller and faster and cheaper-

    23. DF

      Yeah, correct

    24. JF

      ... to run than a full LLM, where you're, like, taking this giant brain and having it only do this one little thing, right?

    25. DF

      Yeah.

    26. JF

      Yeah.

    27. DF

      You can run it in real time. Um, you can actually deploy it on your own server, so we support self hosting these models-

    28. JF

      Mm-hmm

    29. DF

      ... uh, to get latency down. And where we're really excited is for customers to use this in, um, their real-time voice agents, because you can just capture a lot more, uh, a lot more information from the speaker.

    30. JF

      Can we see a demo?

  10. 45:33-52:00

    Lessons from hypergrowth

    1. DF

      and hire more people into that. But I think a lot of companies y- you know, will follow the traditional playbook of, like, "Okay, this company, which is similar to mine, did X and Y and Z, therefore, like, I should do X and Y and Z, and I should go build out a, you know, 50-person SDR team. And I should go, you know, I need this and that," as if it's, like, a franchise business.

    2. JF

      Mm-hmm.

    3. DF

      Um, but I think every company's different. And so knowing what to invest in, what teams to build, um, especially in an emerging market like the one we're in, is- is- is difficult. And, um, now we're more conservative over, like, hey, where do we really wanna invest versus where is this just an, you know, something- s- an area we wanna explore? And if it's exploration, let's do that with the existing team. Um, and then also just, like, really just strongly filtering for, um, non-negotiables and- and culture fits. I think-

    4. JF

      Like what?

    5. DF

      So every role, I think, has different non-negotiables. Um, but then you start talking to people and you're like, "Oh, I like this person," and, you know, we're- we're vibing and so, "Yeah, let's hire them." But then you realize, oh, I really, you know... This thing was really important in the role, and I should've been firmer in that- on that non-negotiable. So, like, now we're very clear on, all right, for these, for this role, these are the non-negotiables. For this role, these are the non-negotiables. For me as a founder, one thing that's really important when hiring is, um, why do you wanna work at a voice AI company? Like, why here? Um, not just an AI company, not just a VC-backed company, not just a company that's growing fast, but, like, what about our product? What about our space, our market, our customers are y- are you passionate about? Um, because I think that, um, that- that's a really important ingredient, right? Like, there's- there's challenging problems-

    6. JF

      Mm-hmm

    7. DF

      ... here that, you know, in any company there's a lot of challenging problems. But I think that, you know, what I care a lot about is- is- is finding people that are really excited about our product and what our customers are building, um, and- and will be a lot more, uh, you know, obsessed and passionate as a result. And so that's something that I think I've, I also, um, am a lot more, like, strict on when we're hiring people now. So there's all these, like, things that I've- I've- I've learned, and I think that- that was probably, um, one of the harder lessons to learn. But now we're- we're a- about 80 people at the company, um, super dense team.

    8. JF

      Only 80 people?

    9. DF

      Yeah.

    10. JF

      That's actually a lot smaller than I would've expected given the- the 700 billion hours of audio [laughs] that you're processing every year.

    11. DF

      Yeah, yeah. Super, super small, dense team. I mean, we're, like, you know, using AI everywhere across the company-

    12. JF

      Mm-hmm

    13. DF

      ... um, to just be, like, to move as fast as possible. And I- I think, you know, one thing we talk about at the company is we're gonna optimize for speed and innovation. Like, that's gonna be an explicit decision. So we don't do, like, OKR cascades, and we don't do planning exercises. Like, we have m- clear metrics that we wanna hit, and those are very transparent across the whole company. And then we're all just running towards those, and we can move quickly and innovate and be dynamic and responsive because we're this small, transparent team. Um, and I- I think that my goal is, like... Having as little operational overhead as possible at all times, like teetering like just on the, the, the line of like complete chaos. Like, [laughs] like I wanna be like right there.

    14. JF

      [laughs]

    15. DF

      Um, 'cause we've, you know, we've like oscillated, right? Like, it feels good to have like, here's every team's OKRs, and here's every team's six-month roadmap, and here's every team's-

    16. JF

      Yeah

    17. DF

      ... like strategy doc. And-

    18. JF

      Like Microsoft style.

    19. DF

      Yeah. It's like-

    20. JF

      Yeah

    21. DF

      ... here everything is like planned and organized, and it's like that, that feels good.

    22. JF

      Yeah.

    23. DF

      But like that's not a replacement for like you actually gotta go build the thing and sh- and do what's in that document.

    24. JF

      Yeah.

    25. DF

      And then what if two months in you get new information that you should adapt to?

    26. JF

      Right.

    27. DF

      If you're committed to some plan, you're like, "Oh, we'll do that next quarter."

    28. JF

      Yeah.

    29. DF

      It's like, well, we should be doing that right now if that's-

    30. JF

      Yeah

  11. 52:0053:00

    The future of voice AI

    1. DF

      getting. Like-

    2. JF

      Hmm

    3. DF

      ... all that is brought in-

    4. JF

      Hmm

    5. DF

      ... to the company. Th- there's like all the like truth out there, right? Of like your product and feedback in the market and stuff, right? Um, and I think it's easy to like put a subjective point of view on that, right? And usually it's like the people in charge that get to decide what, what the subjective point of view is.

    6. JF

      Yeah. That's fair.

    7. DF

      But if, if you have all that, all that truth just like publicly available, it's like e- everyone is looking at this information objectively, and there is no subjective layer. Um, and I think that's been so powerful for us as a company, and I think that that's how like most companies should and will... The companies that are g- successful I think will operate in the same way.

    8. JF

      Yeah, I mean, it's, it's just essential. Like the, the, the, the companies that operate without AI augmentation of human intelligence will lose to the ones that have augmented their intelligence.

    9. DF

      Totally. Yeah. [upbeat music]

Episode duration: 53:00

Transcript of episode Elm2p_TRPwk