$6.6B AI CEO: How to Make Your First $10,000 with AI
EVERY SPOKEN WORD
45 min read · 9,253 words
- 0:00 – 1:23
In this video
- MSMati Staniszewski
we paid about $5 million to the entire community.
- MMMarina Mogilko
Meet Mati, CEO and co-founder of ElevenLabs, a company that has grown into a $6.6 billion leader in the Voice AI space, shaping how we talk, work, and even earn money. They've created an entire voice marketplace. Now, anyone can clone their voice and earn passive income. Can you name some opportunities that you see that can make people a decent amount of money so they can make a living, like $10k a month, something that's immediate?
- MSMati Staniszewski
Business, and you just want to make good money, I would try to take those voice agents and go to, let's say, local doctor's office and-
- MMMarina Mogilko
ElevenLabs built the world's most realistic voice tech. The question is: Can they control what happens next?
- MSMati Staniszewski
Most of those companies just don't know this is possible. You don't have to be the coder, you just need to- [beep]
- MMMarina Mogilko
If my voice is authorized to use my credit card to buy anything, and then somebody just uses the resemblance of it-
- MSMati Staniszewski
I think it's, it's, it's going to happen, but, uh-
- MMMarina Mogilko
Hey, guys, welcome to Silicon Valley Girl. We have one of the guests today whose product I've been using for a while now, so I'm gonna ask a lot of technical questions [chuckles] as well. But please welcome Mati from ElevenLabs.
- MSMati Staniszewski
Well-
- MMMarina Mogilko
Thank you so much.
- MSMati Staniszewski
Thank you so much, Marina. Great to see you again, and thanks for, thanks for having me.
- MMMarina Mogilko
Yeah, thank you. I feel like you're one of the pioneers of this AI industry, because when I ask people, like, what apps they're using, when I'm talking about apps that I'm using, I always mention ElevenLabs, 'cause it's been a lifesaver.
- 1:23 – 2:28
Role of voice in AI
- MMMarina Mogilko
I wanted to start with a question, um, about the role of voice in AI. So what it feels to me is that 2023, you know, we started adopting ChatGPT, it was all text, and then these voice capabilities became more and more powerful. It understands what I'm saying now, it understands my accent. If I mispronounce something, it still gets me. Do you feel like we're moving into the era where voice is our main tool to interact with AI?
- MSMati Staniszewski
I mean, 100%. I do think that voice will be the... one of the key interfaces to the technology around us, and, um, and that shift is happening. Like you said, it's like a few years back, you wouldn't even dream of this being possible, and now I think it's, it's becoming a reality, where it, it allows you to transfer so much information, more than the text. You can, you can get the emotionality, the inflection pattern, the imperfections reflected in the voice, which of course makes it easier for the, um, if it's an input, for the, uh, for the technology to understand a lot more about the, the setup that you... or what you are trying to achieve. And then if you hear it back as well, I think it's a lot better and more pleasurable, um-
- MMMarina Mogilko
Right
- MSMati Staniszewski
... experience
- 2:28 – 5:22
Can AI Voice agent generate and convert leads
- MSMati Staniszewski
as well.
- MMMarina Mogilko
How do you see voice transforming businesses? Do you have any cases where people are using voice to generate leads or convert leads?
- MSMati Staniszewski
There's definitely a few different areas, whether it's on the more classic, uh, customer support use cases, where you, instead of having an old IVR system or n- no system, you can now deploy a voice agent that will take the calls instead, and, and will both delight the customers on the other side because it understands you, it's quick, it's good, um, but then also just performs better. And then outside of customer support, we are seeing that across the entire life cycle of, of, of the user journey. In some places, uh, it adds an experience that was impossible before, and a simple case is, uh, inside of the product or even outside of the product, um... and you might have seen back in the day, there were those widgets for chat.
- MMMarina Mogilko
Yeah.
- MSMati Staniszewski
Now, you could have a voice agent that helps you navigate through the product experience, so it becomes your, like, a partner, programmer, product person that helps you navigate through that, that life cycle. And you also mentioned, so of course, some of the big pieces is in inbounding and outbounding. We actually use it ourselves in ElevenLabs too, where, um, where, of course, we, we do have a standard flow. We have people that will answer the, the, the, the, the reply and take a, uh, a phone call, too. But if you want to go quicker, you can speak straight directly with our agents to understand our product offering, understand our pricing, understand what you, what you can do with the product, which helps you accelerate through the pipeline, depending, uh, and sometimes self-disqualify if you are not the right, uh, um, fit for our product offering, and sometimes helps you accelerate, "Okay, this is exactly the set of use cases I can do. This is how I can deploy," and then routes it to other people.
- MMMarina Mogilko
So it doesn't actually convert?
- MSMati Staniszewski
It, uh, in some cases, it does. In some cases, it's, um... as a quick step back, we have a few different tiers. We have, like, a business tier and an enterprise tier, so it does convert immediately sometimes to the business tier program.
- MMMarina Mogilko
'Cause it's a preset, like-
- MSMati Staniszewski
Because it's preset-
- MMMarina Mogilko
Yeah. Mm-hmm
- MSMati Staniszewski
... it's self-serve. Um, on the enterprise side, we still run KYC checks, so it doesn't do that immediately.
- MMMarina Mogilko
Mm-hmm.
- MSMati Staniszewski
Uh, but, uh, but on the business one, it, it, it does. And, and then we've seen some of those voice, um, agents also, um, from, from a lot of the technology and platform we built, help in completely different, non-commercial aspects, too.
- MMMarina Mogilko
Quick follow-up question for, uh, uh, like, about the, the sales process. Have you measured the conversion, uh, percentage into sales with the AI voice salesperson? [chuckles]
- MSMati Staniszewski
We did, but given it was, uh, and I don't remember the number off the top of my head, but given there was alternative before, would have been just waiting, so it was just a net new a- amount of leads, and we received so much inbound of, of using a lot of the products, which we are l- lucky to, to have, that it helped us just convert so many more leads that we would have otherwise taken weeks, months, or, or maybe never gotten into, which is great.
- MMMarina Mogilko
How can I set this up for my company?
- 5:22 – 7:14
Get the best domain for your business
- MMMarina Mogilko
Let's take a quick break here. You know, as we're talking about how AI is transforming sales and support, there is one thing that hasn't changed for any business. No matter what tools you use, you still need a home for your product, a website where your customers can actually find you. And here's the challenge that I've experienced myself many, many times: Try registering a good .com domain today. [ding] ... Almost everything is taken. You end up with these long, awkward names that don't really match your brand. That's why I was so excited to discover .online domains. It's actually the world's second-largest new domain extension, trusted by more than 3.5 million businesses worldwide. And the word online itself is incredibly powerful. It's searched over 500 million times every month, which means it helps you rank higher and become more discoverable in search. What I really like is how it works for literally any type of business: freelancers, creators, service providers, big or small companies. I've seen everyone from global stars like Maluma, Colombian megastar with over 100 million audience across different social medias, with maluma.online, to the classic game Minesweeper, we all spent hours playing, now lives on minesweeper.online. So if you're trying to build your business on .com domain, for example, voiceagent.com, you know how often the good names are gone. With .online, it's much easier to secure the domain that actually fits your business. Whether it's an AI startup, a side project, or your personal brand, now is the perfect moment to claim your domain. And the good news, for a limited time, you can get it for just 99 cents for the first year with my exclusive link and coupon code. Just go to www.get.online or use the code from the description and secure your name today. Let's get back to the interview with Mati.
- MSMati Staniszewski
The easiest one would
- 7:14 – 12:00
How to set up voice agents in your business
- MSMati Staniszewski
be to register on our platform, uh, so that- that part of offering, and we have two key offerings, is our agentic platform offering. You jump into the platform, and we help you abstract two elements. The first one is all the research or experience complexity, so we help you connect the speech, the LLM elements, the text-to-speech elements, so, so the agent speaks in a smooth and a, a, a quick way. So it's a very low-latency, reliable part on that side. And then there's a second part, where you will need to spend a little bit more time on bringing your business logic in place. So an example could be, what's the knowledge base of how your business operates, or what are the questions you want to be asked? What are the materials you want to surface? So you would bring that into the platform. Then we have a set of workflows that you can set up. Effectively, imagine like if this happens, this happens, or if this happens, I want this function to trigger. Um, this could be if someone is calling me and I want to schedule an appointment, we have a predefined workflow for you to be able to do it, so it can look into your calendar, uh, find an appointment-
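The setup described in this answer (speech in, an LLM in the middle, speech out, plus "if this happens, I want this function to trigger" workflows on top of a knowledge base) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not ElevenLabs' API: `transcribe`, `synthesize`, `Workflow`, `Agent`, and `make_booking_action` are hypothetical names, the speech stages are stubs that pass text through, and a plain keyword lookup stands in for the LLM.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


def transcribe(audio: str) -> str:
    """Stub speech-to-text: the 'audio' is assumed to already be text."""
    return audio.lower().strip()


def synthesize(text: str) -> str:
    """Stub text-to-speech: wraps the reply so the stage is visible."""
    return f"<speech>{text}</speech>"


@dataclass
class Workflow:
    trigger: Callable[[str], bool]  # "if this happens..."
    action: Callable[[str], str]    # "...I want this function to trigger"


@dataclass
class Agent:
    knowledge_base: Dict[str, str]  # keyword -> answer (stand-in for an LLM)
    workflows: List[Workflow] = field(default_factory=list)

    def respond(self, audio: str) -> str:
        text = transcribe(audio)
        # Workflows are checked first, in the order they were registered.
        for workflow in self.workflows:
            if workflow.trigger(text):
                return synthesize(workflow.action(text))
        # Otherwise fall back to the business knowledge base.
        for keyword, answer in self.knowledge_base.items():
            if keyword in text:
                return synthesize(answer)
        return synthesize("Let me connect you with a person.")


def make_booking_action(free_slots: List[str]) -> Callable[[str], str]:
    """Hypothetical calendar workflow: hands out the next free slot."""
    def book(_text: str) -> str:
        if not free_slots:
            return "No free slots right now."
        return f"You are booked for {free_slots.pop(0)}."
    return book


agent = Agent(
    knowledge_base={"price": "Pricing is listed on the course page."},
    workflows=[
        Workflow(lambda t: "appointment" in t,
                 make_booking_action(["Tuesday 10:00"])),
    ],
)
print(agent.respond("I'd like an appointment"))
```

The split mirrors the two parts named in the answer: the stubbed speech stages are what the platform is said to abstract away, while the knowledge base and workflows are the business logic the customer brings.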
- MMMarina Mogilko
Okay, I'm selling a course, basically. Like, what Language Trip does, we're sell- we sell courses, so basically-
- MSMati Staniszewski
Okay, so even simpler
- MMMarina Mogilko
... want to be able to sell courses to people. Can I do it in different languages using my voice?
- MSMati Staniszewski
You can.
- MMMarina Mogilko
Wow!
- MSMati Staniszewski
So you could, you could... And so, so it's selling the courses, and the people would call in, buy the course, and off they go, and maybe they onboard with the agent later on to help.
- MMMarina Mogilko
How do they, how do they buy over the phone? Do you send them a link, ask for their email, or they just... Yeah?
- MSMati Staniszewski
Depends. Uh, but the simplest would be what you suggest, which is we do have an omni-channel solution, where you effectively get the link as part of that, and you can leave additional details, or you have a follow-up on the email of, like, a checkout subscription for the course. So both of those would be possible. Or you could, depending on how that website is set up, you could effectively embed the agent on your website, so it helps you redirect to the subscription page. It guides you through it, and they check out themselves live with the agent that helps them-
- MMMarina Mogilko
Wow!
- MSMati Staniszewski
... fill, fill in the form. But like you said, uh, one of the great things on the function side is that you can, you can, you can switch languages, you can hand over, like, let's say-
- MMMarina Mogilko
That's fascinating for my business.
- MSMati Staniszewski
So it's... I mean, you've been pioneering a lot of that, uh, language learning work, and I think this would be amazing because both it would switch the language and it would switch it with your own voice. If that was your own voice, it continues speaking in that same manner. And then, of course, the last piece is all the integration, so we support integrations with-
- MMMarina Mogilko
I didn't know you had it. Congratulations.
- MSMati Staniszewski
Thank you. It's one of the big... So yeah, it's, it's, and maybe that's a good cue for me as well, because when we started the company, we of course, started from pioneering a lot of the research on the speech side, so text-to-speech voices, and then we expanded to speech-to-text, orchestration models, now music. But as we think about the research, it's always how we can push the audio frontier forward.
- MMMarina Mogilko
I love how you found this new opportunity, and now it's a bigger chunk of your business, as far as I understand. How much would it cost for a business like mine, small business, to have AI answer the calls and sell?
- MSMati Staniszewski
I think the... And, and of course, it depends on the volume, but I think what hopefully will happen is that both you will see more people coming through, and if we set it up in the right way, maybe this will mean even opening up the channel, which over time hopefully means even more calls. But I think to start, it would be in order of hundreds of dollars per month.
- MMMarina Mogilko
Mm-hmm. It's also IP calling, right? Uh, is integrated in that.
- MSMati Staniszewski
Yes. So we integrate with Twilio or, or, or telephony systems-
- MMMarina Mogilko
Okay, so whatever we're using
- MSMati Staniszewski
... so you can bring, yeah, you can bring-
- MMMarina Mogilko
Nice
- MSMati Staniszewski
... any phone number that you already have, and it works. I, I don't know who-- currently, do you already accept any of the calls coming through the telephone too, or it's all, all on the website?
- MMMarina Mogilko
We mostly try to navigate them to WhatsApp, because a lot of people who are calling, they don't speak English, so they don't feel comfortable. But if we advertise that it's, you know, Marina's voice, AI, nobody's judging your accent. Because I feel like when people even talk to me, if they're a non-native speaker, they first- the first thing they do, they're like: "I'm sorry, my English is [chuckles] not as good as..." You're like: "It doesn't matter." But I feel like even, like, using English to make a phone call is such a huge barrier for non-native speakers, and I feel like if you understand that you're talking to AI, it just makes it so much easier.
- MSMati Staniszewski
That's true. It doesn't judge.
- MMMarina Mogilko
Yeah.
- MSMati Staniszewski
You can make bold mistakes, which is maybe a, you know, like, uh, uh, there's a completely other aspect, what, uh, you of course, have been helping people learn languages for a long time, but maybe there's even an aspect where they could practice speaking their language with you, uh, which would be like a, you know, kind of a slightly different, of course, deployment, but completely possible, where you can give them tips, improve, uh, and, and, and effectively create a, a Marina's Duolingo that people-
- MMMarina Mogilko
Yeah, yeah, absolutely
- MSMati Staniszewski
... have dynamic experience with-
- MMMarina Mogilko
Well-
- 12:00 – 15:18
How to make money by selling your voice
- MMMarina Mogilko
ElevenLabs to work as a sales agent. Let's talk about, like, I have this number here, where you paid $2 million in royalties to people who kind of share their vo- voices with ElevenLabs. Can you talk about that? How can people start making money by [chuckles] -
- MSMati Staniszewski
Yeah
- MMMarina Mogilko
... sharing their voice with ElevenLabs?
- MSMati Staniszewski
So, uh, it's, it's, and one of the efforts we launched in the early days, where we, we effectively created a voice marketplace, voice ecosystem, where every person can create their own voice. You go through an authentication flow, you need to record roughly 30 minutes or more of you speaking. Then you have a perfect replica of your own voice that speaks in, in the language you recorded, plus all the languages we support. So you have usually 30 or so, um, different variations now. With the new model we are releasing, it'll be 70. So, um, so you have the voice that, that's now available for your own use, and then if you decide, you can share it to our marketplace. And if you share it to our marketplace, for a specific period of time, with specific conditions of what you are sharing it for, then other people can use it across the ElevenLabs ecosystem, and when your voice is being used, you get paid back as a result. This way, we have now almost 10,000 voices that people shared and created. What is incredible is it spans so many different languages, accents, um, um, different styles. So like, now, if you are logging in to the, to the, to the platform, you just have this incredible plethora of voices, and we pay, uh, p- pay the voice owners back. So it was, I think, $2 million at the beginning of the year that we paid back, and now, um, I think, uh, last time I checked, it was a few months ago, we paid back $5 million-
- MMMarina Mogilko
Wow
- MSMati Staniszewski
... to the entire community. And it's growing.
- MMMarina Mogilko
How much does average- an average voice creator make?
- MSMati Staniszewski
De- it depends. Uh, of course, you know, like, so it's like the, the, like, probably in total approaching close to $10 million, and we have close to 10- 10,000 voices. Um, so that would be like, you know, if you, if you, if you take the average. Uh, but I think it, it's, um, especially given a lot of the voices are kind of new, and it takes a little bit of time before they get attention. You also, to actually make it successful, ideally, you try to engage some of the community around so that they can see the voice-
- MMMarina Mogilko
Mm
- MSMati Staniszewski
... whether it's the Discord, the Reddit, some of the other forums. It definitely helps break through that initial. And if not, over time, we also try to surface new voices and, and, and get them out in the audiences. So it really depends. I think there'll be a lot of people in, like, a few hundred dollar per month category, uh, and that's probably what you could expect if, if you, if you do a little bit of that effort, and, and what you could, what you could earn. However, the, you know, it's, it's, um, I think it's true that it's... If your, if, if your voice sounds very similar to other voices, it's very much harder.
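The "if you take the average" arithmetic in this answer can be worked through in a few lines. The sketch below uses only the rough figures quoted in the conversation (about $5 million paid out as of a few months earlier, a total "approaching close to $10 million", and close to 10,000 voices); `average_payout` is a hypothetical helper, and the result is a cumulative mean per voice, not a monthly or typical income.

```python
def average_payout(total_paid_usd: float, voice_count: int) -> float:
    """Mean cumulative payout per shared voice."""
    if voice_count <= 0:
        raise ValueError("voice_count must be positive")
    return total_paid_usd / voice_count


# Figures as quoted in the interview; both are approximations.
print(average_payout(10_000_000, 10_000))  # 1000.0
print(average_payout(5_000_000, 10_000))   # 500.0
```

A mean like this says little about what any one creator earns: as the answer notes, many voices are new, and a distinctive voice can earn far more than one that sounds like many others.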
- MMMarina Mogilko
Yeah, yeah. It's interesting how many voices, like, in general, do you-
- MSMati Staniszewski
And how many can you distinguish?
- MMMarina Mogilko
Yeah.
- MSMati Staniszewski
But if you, if you have a unique voice-
- MMMarina Mogilko
Yeah, if you're a creator, right?
- MSMati Staniszewski
... if you have a new accent, exactly, then, then it, it can, it can be, it can be incredible. Our first voice, uh, one of our first voices that got shared, and it was a Spanish voice that had a very deep, um, way of, of speaking, the, the prosody, and, uh, that voice became one of the most popular, not in Spanish, but in English-speaking countries, and became, like, our top-ten voice, um, where, where, where it was just such a unique and different-
- MMMarina Mogilko
Interesting
- MSMati Staniszewski
... experience.
- MMMarina Mogilko
Let's talk about the nuances of cloning your
- 15:18 – 17:37
How to clone your voice
- MMMarina Mogilko
voice, 'cause, for example, so what happens sometimes in my team, we clone my voice using all the different mics that I have, but sometimes we insert it, and it's still slightly different from the video. 'Cause the way we use it is that, you know, we recorded something here. I recorded some brand deal or whatever, and then I start traveling, and they're like: "Could you re-record this phrase?" So we just take a piece from the video, uh, redo it with the phrase that the brand asked for, but then we insert in the video, and it's slightly different. Like, the... It sounds in a different way. Are there any ways to fix it? Yes, of course, we re- like, ask it to, uh, remake it again, but it's still, like, not exactly what we recorded.
- MSMati Staniszewski
No, it's, it's, it's, um, it's, of course, a tricky problem, where when you create a voice, you most likely take the voice throughout the entire video, and then you create that voice, and then it, it, it is effectively the average of how you spoke around that video. But in a given scene, you will have maybe changed the intonation pattern a little bit, or the emotional pattern is slightly off that average. Um, the ideal way would be, effectively, for us to do more of the conditioning on, of, like, what you do pre and post in the video, so we take that more as an input, and we try to morph it in in a slightly better way. Uh, and then there's a second thing. Sometimes, even though I know you'll try to clean up the voice and, and, and then add the background sounds, background effects, they might, by, by, by just the process, be mixed in, and then it still doesn't smooth entirely. So from our side, what we hope to do over time is that, as you insert those videos, we can condition it on the three seconds before and after, and it will sound better. So that's something we are working-
- MMMarina Mogilko
Okay, so you have that feature. So upload the video-
- MSMati Staniszewski
So we are working on that. Not yet.
- MMMarina Mogilko
Okay. Not yet.
- MSMati Staniszewski
It's not applied, but it's, it's going to be the big piece.
- MMMarina Mogilko
Oh, we need that feature. [chuckles]
- MSMati Staniszewski
We definitely need to bring it there. I think in the, in the short term, what you mentioned is, is what we see as the most common pattern, which is redoing and, and, and regenerating. But the other thing you could try is, uh, try to, instead of, um, taking longer audio sample across the video, just take few, even few seconds, uh, which I know sounds like maybe it will be wrong- worse result, but if you just take a few seconds from that fragment and create that lower-quality version, it actually can, could, could sound pretty good.
- MMMarina Mogilko
Mm. Okay, thank you.
- MSMati Staniszewski
Well, happy-
- MMMarina Mogilko
So where, where do you see all of this going with people recreating their voices?
- 17:37 – 21:37
The future of Voice AI
- MMMarina Mogilko
Will everybody have a clone in two or three years? Like, because that we couldn't... We could have thought about, you know, ElevenLabs when I heard about it, like, two or three years ago, right? I couldn't think about a salesperson using my voice. Now, we have it. What do you think is gonna happen in two years? What is this new use case that this all is going to unlock?
- MSMati Staniszewski
Interesting question. Of course, we are seeing, like, kind of entirely new ways of, of, of, of interacting with voices. So I do think, yes... you will have your digital AI voice, and I think even a step further, you will have your own digital voice agent that does things for you, that you want to make sure it's authenticated, people know you operate. So, you know, like y- we spoke about the example of how people can call in, you can configure a voice agent, but I think the other side will be also true. I will have my voice agent that maybe-
- MMMarina Mogilko
Calling the bank? [chuckles]
- MSMati Staniszewski
Or calling-
- MMMarina Mogilko
'Cause they use voice authentication, right? It's gonna change.
- MSMati Staniszewski
I think that's not the best mechanism for, for the future.
- MMMarina Mogilko
Anymore, yeah. [chuckles]
- MSMati Staniszewski
Not, not anymore. Um, but, uh, but like, say, you want to book a restaurant or follow up about appointment in a, in a, in a, in a, in a healthcare and, um, and you want to make sure that they know your most recent details or that it's confirmed. I think you will want an authenticated version of voice agent. I'm saying the authenticated, because like you say, most of the verification, if they don't, will, will fail, and you want to know that it's a permissioned voice. Um, so you will need to start embedding watermarks and, and, and, and metadata around that. Um, but I think the, the-- to, to kind of go back to your question of like where it all evolves, I think there will be like an interesting pattern where... And I think it will happen on both sides, as a user, but also as a business, you will be able to serve so many different voices to your customers, or you, as a customer, can decide what voice speaks to you. So to speak for specific examples, we are working with a company in, in, in Korea, Korea and Japan, um, it's a multinational company there, which has a very different, um, age groups calling in, um, set of older, uh, uh, patients and then much younger, uh, uh, um, set of, set of people. And they want to serve, depending on the data, the number that is calling in, serve different voice to that group, both in terms of how it speaks, um, or how it sounds, but also the style in which it speaks. Um, of course, it's a, it's a, you know, it's a generalization, but roughly, they wanted that if you-- if an older person is calling in, the voice speaks much slower, much calmer, less emotionality. If it's a younger person, much quicker, a lot of higher amplitude of emotions. And I think this same pattern will start happening across everything, where if you are calling in a specific region, you might have an accent of that region. If you are calling a restaurant that's maybe representing a specific cuisine, you get a voice of that cuisine speaking with you. 
Um, and, and maybe there are like variations of all those different types, um, which, which, which, which can work. And then separately, as a person calling in to any of those services, you could pre-select that too. So if you are calling a bank and you enjoy speaking always with the voice of, um, this specific style, then you can select it, and that voice will be the voice of your preference. We've seen this, uh, happen in, in an, uh, in a company in... also in, in, uh, in Asia, where they created a, um, effectively, a, a, a travel agent or like a, a Google Maps, uh, uh, competitive product, where you can select a voice that narrates your direction, and one of the voices they selected became, like, viral, and everybody wants to use it now-
- MMMarina Mogilko
Nice. [chuckles]
- MSMati Staniszewski
... and, uh, in the, in the travel directions because it just made for such a better experience.
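The voice-selection pattern described in this answer (a business serving a slower, calmer voice to older callers and a quicker, more emotional one to younger callers, with the caller able to pre-select a preferred voice instead) can be sketched as a small lookup. Everything here is a hypothetical illustration: `VoiceProfile`, `PROFILES`, `pick_voice`, the group labels, and the numeric scales are assumptions, not settings of any real product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VoiceProfile:
    name: str
    speaking_rate: float   # hypothetical scale: 1.0 = neutral pace
    expressiveness: float  # hypothetical scale: 0.0 = flat, 1.0 = very emotional


# Business-side defaults per caller group (a generalization, as in the talk).
PROFILES = {
    "older": VoiceProfile("calm", speaking_rate=0.85, expressiveness=0.3),
    "younger": VoiceProfile("lively", speaking_rate=1.15, expressiveness=0.8),
}
DEFAULT_PROFILE = VoiceProfile("neutral", speaking_rate=1.0, expressiveness=0.5)


def pick_voice(caller_group: Optional[str],
               preferred: Optional[VoiceProfile] = None) -> VoiceProfile:
    """The caller's own pre-selected voice wins; otherwise use the group default."""
    if preferred is not None:
        return preferred
    if caller_group is None:
        return DEFAULT_PROFILE
    return PROFILES.get(caller_group, DEFAULT_PROFILE)


print(pick_voice("older").name)    # calm
print(pick_voice("younger").name)  # lively
```

Letting the caller's preference override the business default reflects both halves of the answer: personalization chosen by the business, and selection chosen by the customer.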
- MMMarina Mogilko
That's so cool.
- MSMati Staniszewski
So if I extrapolate in the future, I think there will be a lot more both personalization, but also selection that you can choose into. I think 100% true, you will have your own authenticated voice that you can use for your voice agent, for your content.
- MMMarina Mogilko
That has all the information about you, right?
- MSMati Staniszewski
That has all the information that you can-
- MMMarina Mogilko
Interesting
- MSMati Staniszewski
... that you can use.
- MMMarina Mogilko
That's very interesting. I like that part, like having my voice call and be authorized to use my data. How do you talk about impersonation with voice? Like, if there's- if
- 21:37 – 25:10
Deepfakes & the 3-layer safeguard model
- MMMarina Mogilko
my voice is authorized to use my credit card to buy anything, and then somebody just uses the resemblance of it, uh, will there be any metadata that could be detected by other systems? And how would it... what would it look like?
- MSMati Staniszewski
Yeah, it's, um... So I think first of all, I think it's, it's, it's going to happen. Like I, I think the assumption we should be going with is that where, um, where, you know, you will have good actors, good technology, trying to avoid it, but then there will be also more permissive and, and, uh, technology and, and, and, and bad actors trying to abuse it with any technology shift. And already now, there is a lot of open-source technology, other commercial technology, which doesn't have the same safeguards, that could clone your voice and create a mimicking, and that sounds like you. Uh, so I think any system that we think about devising in the future kind of needs to, uh, assume that you can create a clone of a voice and, and, and, and, and make it a perfect replica. Now, of course, if you're, like, like, as I think about ElevenLabs, we can and we do add safeguards. As you create a voice, we cannot do that or if you do, we detect it and moderate and can flag it internally if we are not sure. Um, so whether it's, it's being able to trace everything back to, to the account or moderate what text was used, whether it was trying to do a scam. Um, but to core of your question, like, as we think about the future, the ideal system, and it would require cooperation from number of parties, would have three different layers. And then the first layer is, instead of trying to check for AI, you actually check for human. That's easy for me to say. Of course, there's like, how do you check for humanness? But a, a sim- simpler step or a general step could be that you, on the devices that you use, so on my telephone or on my, uh, laptop, I am encoding that this is my phone, my, my laptop. When I'm calling from it, it's being decoded on the other side. They know that this is the device I use, so most likely, this is me. That's the first layer. Second layer is actually what we spoke about earlier, where, uh, and that's, that's possible, you watermark authenticated AI. 
So if I'm using, uh, a specific tooling, the tool- the tools that can add this watermark are known, and I watermark that within the content. It's not, um, super straightforward, especially in audio, because you- if you add a watermark in content, it can affect the quality of the content itself, but it's roughly, roughly good. And, um, and that's the second layer, so you check for authenticated AI. And then the third layer is, by default, it's AI, and you assume it's AI. So if it didn't pass the first or second layer, and you see content that hasn't been authenticated or approved for being a human, it's AI by default, and you don't trust it. And then you can add more mechanisms on top of that third layer, where you try to explicitly check or add additional signal of like, ah, this is real. But that would be a mindset shift, where today, if you look for content, you're like, "Oh, maybe this is AI." It should be the opposite, where it's like, "Oh, no, this is definitely AI. Is it maybe human, or is it maybe AI that was created with creator's permission?" And then you have those cases in between that will be interesting as, as you, of course, create the content. You mentioned that sometimes if you need to re-record, you might create an AI voice with, of course, with your, with your, with your, with your permission. But then, um, do you do that across the clip? And maybe you do that, like, 1% or 5% of the content is AI voice. Maybe in the future it will be 30 or 50%. And at what stage would you say this is, like, your AI delivery or, or human delivery?
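The three layers described in this answer can be read as a short decision procedure: first check for a known human device, then for a watermark from a known tool, and otherwise assume untrusted AI. The sketch below is one possible reading, not a description of any deployed system; `Call`, `Verdict`, `classify`, and the idea of passing in sets of trusted devices and watermark issuers are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Set


class Verdict(Enum):
    HUMAN_DEVICE = "layer 1: known device, most likely the human"
    AUTHENTICATED_AI = "layer 2: watermarked, permissioned AI"
    UNTRUSTED_AI = "layer 3: assumed AI by default, not trusted"


@dataclass(frozen=True)
class Call:
    device_token: Optional[str] = None      # hypothetical device-encoded identity
    watermark_issuer: Optional[str] = None  # hypothetical tool named in a watermark


def classify(call: Call,
             trusted_devices: Set[str],
             trusted_issuers: Set[str]) -> Verdict:
    # Layer 1: check for the human, via a device known to belong to them.
    if call.device_token is not None and call.device_token in trusted_devices:
        return Verdict.HUMAN_DEVICE
    # Layer 2: check for authenticated AI, via a watermark from a known tool.
    if call.watermark_issuer is not None and call.watermark_issuer in trusted_issuers:
        return Verdict.AUTHENTICATED_AI
    # Layer 3: anything that passed neither check is treated as AI.
    return Verdict.UNTRUSTED_AI


devices = {"matis-phone"}
issuers = {"known-voice-tool"}
print(classify(Call(device_token="matis-phone"), devices, issuers).name)
print(classify(Call(watermark_issuer="known-voice-tool"), devices, issuers).name)
print(classify(Call(), devices, issuers).name)
```

The ordering carries the "mindset shift" from the answer: trust has to be earned by passing layer one or two, and the fallthrough case is distrust rather than the benefit of the doubt.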
- MMMarina Mogilko
You're, you're a founder in AI. How do you sleep at night [chuckles]
- 25:10 – 27:35
Main fears of an AI founder
- MMMarina Mogilko
when everything is moving so fast? Uh, what are your main fears, or what keeps you up at night?
- MSMati Staniszewski
I, you know, like, I think there are two parts to it. I think the first part that I need to, to mention is that it's, it's, it's, it's, uh, such an incredible opportunity with the shift. Like, it's, uh, the biggest shift or maybe bigger shift than the internet, and we are at ElevenLabs, so I'm happy and lucky to be part of that shift and be leading on the voice frontier. So I, I, I think that... And I think that the team and all of us are feeling that, that we have a unique opportunity that never happens in your life, that you can create a technology, define how it will be used, and hopefully create value across, across, whether it's voice agents and how a voice interface will look in the future, whether it's making content global, whether it's making content available in audio. Um, but of course, with all of that, as you think about being at the frontier, it, like, also makes us carry some of the responsibility for how we define that. So, um, so a lot of our parts will, will stem from that. I think the first one is, we still think there's innovations on the research level that you can bring into the space, at least one or two big ones in audio. And we've been able to do it so far in text-to-speech, speech-to-text, recently in music, but we still want to continue leading and continue being better than some of the biggest labs in the world, whether it's, uh, some of the new, new AI companies or older incumbents. We, we think we have that opportunity, and, uh, and that is motivating, but of course, definitely causes less sleep at night. Uh, um, uh, the team is, is, is super hardworking too, which, which makes for shorter nights. Um, then from the risks perspective, we spoke about some of those. We, uh, uh, we do feel like it's our responsibility to make sure that we avoid some of those risks, so we are trying to invest a lot of time in developing safeguards around that. 
Then, of course, the third one, with a lot of the technology, how the economy, uh, or how the jobs in that economy will change, and we would like to do it in a way which brings a lot of the people in that economy together with the change, rather than it's change that will just affect it and disrupt it. But how can so- some the, of the people that want to be part of it, be part of that disruption, too? That's the, the voice ecosystem that we built is part of that- that- that reason. Um, uh, but yeah, of course, I think we need to, we need to keep hiring amazing people, keep, keep pushing ahead. As well, so much is happening, I still think it's very early. I may be biased and self-serving here, but, but it's, it's still very early.
- MMMarina Mogilko
You mentioned jobs that are being replaced
- 27:35 – 31:23
Jobs at risk and how to adapt
- MMMarina Mogilko
with voice technologies. What do you think are the jobs that are at most risk? I guess, like customer support, and what should these people be doing now to not get replaced in a couple of years?
- MSMati Staniszewski
I think the, the trope, and I, uh, and I think it's very true, is that all the people that will be replaced will be replaced by people that use AI. So I, I think this is the key message, that, like, you should effectively go into trying a lot of those tools, um, uh, and products, so you stay at the, at the frontier. And then the people that are in any of those jobs that use AI, I think can actually benefit a lot, a lot, too. And, um, e- even in, uh, customer support, of course, a lot of that will, will, will, will shift. But, for example, what we are seeing is that the simple manual tasks of, an appointment, picking, or doing and processing a simpler refund, all of that is, uh, is, like, very manual, very, uh, uh, recipe-based in most cases. But then, as you go to the more complex parts, you need a human expert to help close that gap. Um, and that part of the process is actually even more in need, whether it would be debugging a harder problem that, that you have in the product, whether it's understanding your... What happens after the appointment, there is a specific thing you receive, then you want to decide whether you need, uh, the, the X or Y, uh, help, which of course, needs to go through some of the regulation, too. But for all of those, you, I- the kind of the, the pattern is that the expertise is even more valued. And of course, over time, I think that AI will start shifting and taking more of that. So there will be, like, some percentage that goes across. Um, but, uh, but that'll be my, my, my main piece of, like, if you understand how the AI works, you can become more of the expert and better knowledgeable yourself, um, and, and, and help. And that's also true in a creative space, too. I think in the, uh, so, so much-- you can do so- you can iterate so much more frequently. You can produce to the wider audience, um-
- MMMarina Mogilko
You have to move faster and faster. That's what I'm feeling with this AI.
- MSMati Staniszewski
You can definitely do faster iterations.
- MMMarina Mogilko
You, you have to run [chuckles] to stay where you are. I don't know if you get this feeling, but for me, it's like the world is speeding up every single day.
- MSMati Staniszewski
I do think it's speeding up, but I, at the same time, I think it's not zero-sum, where it's not, uh, y- by, by speeding up in this category, it doesn't take away from another category. I think the entire economy is just growing as well with, with a lot of that adoption, so there will be more creative opportunity, um, than it ever was before. And yes, to be part of that creative opportunity, you probably need to move faster with a lot of the innovation than you might have needed to before. But you, you... I, I think, still, like, a wide set of, of, of people can, can and will benefit. But of course, you know, it's going to a lot of the, the repetitive, manual, non-talented intelligence, uh, non, like, basic intelligence-based work will be, will be replaced with, well, AI workflows. Um, and the best, the best way to, to avoid this is, is, is by learning a lot of the AI tooling, so you yourself are better. And, and maybe just to finish off and maybe to summarize the customer support piece, thinking about it slightly differently and outside of customer support, is that frequently, if you have a domain expertise, whichever domain that is, then you- that's, that's where you can, um, deliver even more value. So combin- combining your domain expertise with AI is, um, is, is, is much higher, uh, um, uh, value and, and, and, and output. And if you don't have domain expertise, then you probably want to gain that domain expertise, uh, mm, which, which, which, which would be-
- MMMarina Mogilko
Yeah, I've seen-
- MSMati Staniszewski
... right
- MMMarina Mogilko
- a lot of graphs for, like, future of jobs reports, and, uh, there's this section like: your expertise plus AI, and it goes like this in terms of demand. What would be the tools that you would recommend everyone to start using now?
- 31:23 – 33:50
Top 3 AI tools
- MMMarina Mogilko
Name top three AI tools.
- MSMati Staniszewski
Top three AI tools. Okay. Outside of ElevenLabs, which you do need to try and use. [laughing]
- MMMarina Mogilko
Of course. [laughing]
- MSMati Staniszewski
Uh, I would say I really like Black Forest Labs for, for their, for their image, uh, image work. I mean, Midjourney has been cranking out for, for so many years, but Black Forest Labs, I really like as a kind of the, uh, new iteration. I think they have a good realism, and I think they will go through a set of additional iterations that, that are, that are great. From the classic ones, um, I mean, Anthropic's Claude, I think, is incredible, uh, where, where, where I think it helps you, uh, like, be another level engineer, or even if you're not engineer, try to be a little bit more of the engineer. And then last one, I would really- I really like Lovable, um, but similarly, I mean, v0-
- MMMarina Mogilko
Like coding
- MSMati Staniszewski
... are great.
- MMMarina Mogilko
Yeah.
- MSMati Staniszewski
Uh, uh, but, uh, partic- uh, given, given we are in Europe, I, I feel, uh, Lovable deserves the, the, the, the, the-
- MMMarina Mogilko
They're from Sweden, right?
- MSMati Staniszewski
They are from Sweden, yeah. Uh, but all of them, I mean, it's, it's just so incredible to see, like, our go-to-market teams try whether it's Lovable, v0 or Replit. Um, I think now Figma also, uh, launched there, so we- I haven't tried it yet, but, uh, that's, uh... it's, it's, it's fun to see how, like, people that haven't been traditionally on the engineering front are closer, and they understand the product pain points, they understand the use case a lot better. So there's both this path of like prototyping, showing the clients, which is amazing, but then also, by extension, they are effectively getting closer to what is behind the scenes on the product side, too.
- MMMarina Mogilko
Yeah, and when, when you mentioned Lovable, do you build something for yourself or for ElevenLabs?
- MSMati Staniszewski
Um, both. So on the go-to-market side, we frequently will do a demonstration to, to a customer of like, let's say we were doing the use case that you mentioned. We could build a prototype on a mock-up website of how the checkout would look like, how the agent would interact with you. That, that type of, um, um, type of use case a- all the time, whether it's on the private or, or conferences or with the client calls. Uh, but also on a personal side, I recently tried with my two nieces to, um, to-- they are five and seven years old, so I have the best job of being fun uncle or trying to be, and they, um, uh, uh, they were- we were speaking about, uh, how they could potentially create a story generator for themselves, where you would type in the character names, and the story would be created.
- MMMarina Mogilko
You're an entrepreneur. You started this company, spotted this opportunity.
- 33:50 – 36:49
The $10k/mo voice-agent opportunity
- MMMarina Mogilko
Do you see any other areas, aside from voice, where people should be doubling down? Because, um, one of the founders I had on this podcast told me that, uh, actually co-founder of Hugging Face, he told me that, "In the, in the next five years, you have to be an entrepreneur [chuckles] or you're done." So a lot of people are learning how to become an entrepreneur. Can you name some opportunities that you see that can make people decent amount of money, so they can make a living, like 10K a month? Something that's immediate, something that you see a gap in the market.
- MSMati Staniszewski
It will be voice specific, but I think it's so, so early that I think it's, it's a huge one, is, um, there's definitely a lot of the infrastructure being built for the voice agents. We, we, we, we build it, but other companies are, are, are, too. Um, and I think there is a big gap between voice agents and then actually deploying them in a lot of those businesses. And you don't have to have the engineering expertise to deploy those voice agents. The platform now frequently will support a relatively self-serve manner of taking it, but you can easily take that voice agent and deploy that in specific domains. And most of the businesses in the world still don't know, know, know about it. If it's, you know, not, um, venture-scale, uh, business, and you just want to make good money, I would try to take those voice agents and go to, um, l- let's say, local doctor's office and help them appointment schedule for, for, for the dentist, so they can take appointments more easily. And they can then focus more on their work instead of nurse doing that in between or missing appointments. That's actually one of the most common, uh, I don't know the, the percentage, but so frequently, those, those appointments don't get booked because there's no one on the phone and can take them. Um, you can go to local mechanics and help them-
- MMMarina Mogilko
Yeah
- MSMati Staniszewski
... take appointments. And I think there's- all of these require slight variation of the domain piece that you need to know, um, all of those businesses are in thousands to tens of thousands of dollars per month if you get it to the few. Um, the infrastructure is there. You just need to bring it to, to, to those domains.
- MMMarina Mogilko
I love it. Yeah, it's like B2B automated businesses with AI.
- MSMati Staniszewski
Yeah, and small businesses all around the world.
- MMMarina Mogilko
And you don't have to be a coder.
- MSMati Staniszewski
You don't have to be the coder. You just need to spend the time, call d- them and ask or, or, or go to them. Um, and I think there's this, like, this category, which might not be, um, taken off by, by some of the biggest companies that will focus on bigger enterprise, uh, elements, like the, you know, classic, uh, uh, this is like small, medium businesses rather than, than, than the enterprise segment. And at the same time, most of those companies just don't know this is possible. So, like, next year or two is just an incredible opportunity to do it. And of course, you know, start is in English speaking, but I think the same is true for so many of the, of the countries and languages which, which might be, uh, given s- so much of that work isn't always localized. I think in our case, uh, we're doing a pretty good job there. You can, you can bring it to a local market and do exactly the same-
- MMMarina Mogilko
Mm-hmm
- MSMati Staniszewski
... the same work.
- MMMarina Mogilko
Absolutely love it. Thank you.
- MSMati Staniszewski
Thank you.
- MMMarina Mogilko
So if you were starting a company today, and you're a brand-new entrepreneur, what would be your advice for
- 36:49 – 41:53
Advice for everyone who’s starting out
- MMMarina Mogilko
anyone who's starting out?
- MSMati Staniszewski
The first advice would be that you deeply understand your, your user, um, and the problem that you're trying to f- fix. Like, I think that would be the first piece is like, do I know the problem, and do I know people have that problem?
- MMMarina Mogilko
'Cause you-
- MSMati Staniszewski
It's-
- MMMarina Mogilko
... you started ElevenLabs because you were, uh, you didn't like the transcribing, the translation of-
- MSMati Staniszewski
Yeah, this is, like, a super, um, super crazy piece, uh, that in Poland, if you watch a movie, all the characters, whether it's a male or female character, are narrated with one single voice.
- MMMarina Mogilko
With no intonation, right?
- MSMati Staniszewski
No intonation.
- MMMarina Mogilko
Like, mm-mm-mm. [laughing]
- MSMati Staniszewski
It's flat. Exactly, exactly.
- MMMarina Mogilko
I think it was the same in s- post-Soviet times in, in Russia.
- MSMati Staniszewski
Exactly, because post...
- MMMarina Mogilko
Yeah.
- MSMati Staniszewski
And it still continues today. But, you know, it was, it was kind of obvious when we started looking into the audio space and then realized that this is still a problem, something we grew up with, something that you ask any Polish person, or most of Polish people, and they will tell you how bad of an experience that is.
- MMMarina Mogilko
Mm.
- MSMati Staniszewski
As you can likely imagine, it's pretty bad. And it will, it will, it will change, and, um, and it will second obvious, okay, if you're thinking about the future, you will have all different or different, um, uh, uh, original voices represented. So if the movie is streamed, you will just hear exactly the same language. Of course, it expanded from the dubbing to, to just, uh, voiceovers and speech, because so much of the content is unavailable in audio in the first place, and, and now a lot of voice agent stuff. But it was a very clear problem, and I think as I think about starting a company, or if I were to start a company again, I would try to obsess about the problem. And then the second one is, um, do people actually have that problem? Is it actually a burning problem? And in dubbing, it was a good example where we, we thought the dubbing is the biggest problem, but before we actually solved the dubbing, we realized from a lot of conversation with users that there are so many other problems that they would like to fix first. The most common one is actually one you mentioned, where people just wanted to repair lines a- after recording, um, or just being able to deliver a voiceover without speaking. And that was, like, the most common thing after we tried to reach out to people like, "Wow," before we had it ready, it's like, "Hey, we are a- almost finished with our dubbing product. Would you like to dub your movies?" And then most likely we would get some small percentage of replies, and then int- inside of those replies, it would be: "Yes, this would be interesting, but actually, if you could help me with just, uh, post-production-
- MMMarina Mogilko
Just take my voice and do it, yeah
- MSMati Staniszewski
... Yeah, that would be much, much better.
- MMMarina Mogilko
Mm-hmm.
- MSMati Staniszewski
So then we're like, okay, there's this incredible opportunity that's smaller, uh, component of the technology we want to build, that we should, we should do instead, uh, first. And, and, and we did, and then we validated that again, and people were, "Yes, that's, that's, that's something we would love." And then, um, given we started from, um, creators on, on, on social media, uh, after we heard this, but then we realized that there are actually other people not on social media, but also on voiceovers. Uh, being the biggest group for us was book authors initially. Everybody just-
- MMMarina Mogilko
Oh, yeah
- MSMati Staniszewski
... couldn't afford.
- MMMarina Mogilko
Audiobooks.
- MSMati Staniszewski
Exactly.
- MMMarina Mogilko
'Cause that's, like, a few days in the studio.
- MSMati Staniszewski
A few days in a studio, very expensive. So many people get tired with the voice, so it's never, uh, as, as expected initially, so it takes more than that, and then that turned out to be, like, second of the first biggest, biggest ones. So, um-
- MMMarina Mogilko
But you actually built the dubbing product first, and you realized nobody wanted to pay for it. [chuckles]
- MSMati Staniszewski
Yeah. So we, we, we did the prototype. We did, uh-
- MMMarina Mogilko
As a product.
- MSMati Staniszewski
Yeah. So it was, uh, a little bit of a, like, you know, like a stitch-up of not, um, not a- it did, it did have a lot of our own research, but, uh, but it wasn't, it wasn't, um, months of work. Uh, it was like we, we created a prototype. We-- while we were building the prototype, we were reaching out to customers, like, "We, we want- we are working on this. Do you want it?" We had a good waiting list. Then we tried to show them what is, what it-- how it looks, and they were like: "Uh, this quality isn't as good. If you could actually help me with this and this instead, it would be better," which is the same technology, because people notice that if you dub, you can hear the voice of the person in the other language, it still sounds the same. And, and it turned out that the problem was even earlier. It was like, "Oh, just my voice is good."
- 41:53 – 43:58
Will we still learn languages?
- MMMarina Mogilko
advocating learning languages, will people still learn languages in three years if they can have their AI-authorized voice speaking any language, join any Zoom call? The only thing that's left is maybe a one-on-one conversation, but then maybe we have a device that translates everything. [chuckles]
- MSMati Staniszewski
The, uh, interesting one. I think y- they will, but, uh, the, not always the primary purpos- purpose will be for, for understanding others. It will be frequently for, uh, um, just developing yourself as a more of an enjoyable thing you want to do for your own sake.
- MMMarina Mogilko
Like horse riding, right? [chuckles]
- MSMati Staniszewski
Yeah.
- MMMarina Mogilko
From a necessity-
- MSMati Staniszewski
It's like a little bit-
- MMMarina Mogilko
... to a hobby.
- MSMati Staniszewski
Yes.
- MMMarina Mogilko
Right?
- MSMati Staniszewski
To, to more of a h- a hobby.
- MMMarina Mogilko
Mm-hmm.
- MSMati Staniszewski
Uh, and of course, there are, like, parts that by learning language, you learn the culture, you learn... And your, your kind of your perspective opens. I think that still will be true.
- MMMarina Mogilko
Or if you're moving to another country.
- MSMati Staniszewski
Or you're moving to the country.
- MMMarina Mogilko
I mean, like, if you want to move to the US, you would still learn some English, right?
- MSMati Staniszewski
Hopefully will not need to do it, and you will still be able to understand the culture and the level that you never could before, so-
- MMMarina Mogilko
With a device or-
- MSMati Staniszewski
... The Hitchhiker's Guide, it'll be like a Babel fish variation, like headphone, maybe device, maybe Neuralink. But even in those cases, there will be some processing time involved because you need to finish speaking for the device to pick it up and then translate it. So l- language, natively speaking, will be better. Uh, but yes, I do think most of that need will disappear for you to be able to interact and, and, and understand, which I think will be a beautiful thing. And then hopefully you can, you can learn it for other purposes.
- MMMarina Mogilko
Interesting how the whole industry is, like, might disappear or might transform completely, but it's, it's happening not to just language learning. It's happening to everything.
- MSMati Staniszewski
Hundred percent, but I think it will stay. Uh, it's, uh, you know, uh, definitely it will morph, um, but, uh, but some, some, some of that will definitely stay.
- MMMarina Mogilko
Okay. Thank you so much, Mati.
- MSMati Staniszewski
Thank you.
- MMMarina Mogilko
It was very inspiring and very practical. I love that.
- MSMati Staniszewski
And thank you so much for being an early user and all the feedback as well.
- MMMarina Mogilko
Thank you, and I'm hoping we're gonna integrate the sales part.
- MSMati Staniszewski
[chuckles]
- MMMarina Mogilko
I'm excited about that.
- MSMati Staniszewski
Amazing. Let's make it happen.
- MMMarina Mogilko
I'm gonna talk to my team right now. [chuckles]
- MSMati Staniszewski
Let's go. Thanks.
Episode duration: 43:58
Transcript of episode EUrkIpq838c