EVERY SPOKEN WORD
30 min read · 6,033 words
- 0:00 – 0:28
Introduction
- OMOlivia Moore
AI video completely taking over our social feeds in the span of a week, which is absolutely insane.
- JMJustine Moore
Veo 3 was sort of like the ChatGPT moment for AI video.
- OMOlivia Moore
The next generation of entrepreneurs are gonna be completely AI assisted. Like, a world of possibilities has been opened up for AI storytelling, especially in video form.
- JMJustine Moore
Yes, it's an exhausting time for AI creatives. It's great, but exhausting. [upbeat music]
- 0:28 – 0:45
Meet the Hosts: Justine and Olivia
- JMJustine Moore
I'm Justine.
- OMOlivia Moore
I'm Olivia.
- JMJustine Moore
And this is our very first edition of This Week in Consumer AI. So we are both partners on the investing team here at a16z, and we are also identical twins.
- OMOlivia Moore
Very confusing.
- JMJustine Moore
Extremely confusing, but should be fun for a podcast. Um, and we're excited
- 0:45 – 6:34
Veo 3: The Game-Changer in AI Video
- JMJustine Moore
to chat about some of the cool things we saw in the wild world of consumer AI this week, starting with Veo 3, Google's video model.
- OMOlivia Moore
Mm-hmm.
- JMJustine Moore
Then we're gonna talk through the ChatGPT Advanced Voice Mode updates and Apple's big AI announcements.
- OMOlivia Moore
And then we're gonna cover ElevenLabs' new voice model. We're gonna talk about some data that our team put out recently about how fast consumer AI startups are ramping revenue, and then we'll talk about Flux's new editing model, Kontext, and how Justine used it to make her own froyo brand.
- JMJustine Moore
And stay tuned for the end because we have a cool tutorial and some demo footage on how to make your own brand.
- OMOlivia Moore
Things are moving so quickly that it feels like we went from n- exciting but maybe not super realistic AI video to AI video completely taking over our social feeds in the span of a week, which is absolutely insane.
- JMJustine Moore
Yeah. I've been following AI video for a few years now. I... You probably remember, I've been an early user of all these models-
- OMOlivia Moore
Yes
- JMJustine Moore
... and I have wanted them to work and to make cool things that everyday people would like for so long. And I would say Veo 3 was sort of like the ChatGPT moment for AI video, where we were suddenly seeing all of these Veo 3 generations blowing up with millions of views, channels only featuring Veo 3 videos getting hundreds of thousands of subscribers within days.
- OMOlivia Moore
Yeah.
- JMJustine Moore
Um-
- OMOlivia Moore
What's actually different about Veo 3?
- JMJustine Moore
Yeah, so okay. I should give the overview first. So Veo 3 is Google DeepMind's latest video model effort.
- OMOlivia Moore
Mm-hmm.
- JMJustine Moore
Um, so they released Veo 2 late last year, which was, like, the first sort of breakthrough in showing that you could get really high-quality video, like a consistent scene, consistent characters, um, physics, like, things that just looked good. Um, and Veo 3 is the next iteration of that model series, and what's very different about it is it generates audio natively at the same time it generates video. So you can actually prompt it with a text prompt to say something like, "A street-style interview where a man and a woman are talking about dating apps," or you can be even more specific and say something like, "A street-style interview where a man walks up to a woman and asks her, 'What dating apps are you on?' And she replies, 'Why are you asking?' Um, and then gives him a suspicious look." Uh, and so you no longer have to go to another platform to do an audio voiceover or anything like that. You can get a full-featured talking human video with multiple characters in one place.
- OMOlivia Moore
They left behind a, a ball today. It bounced higher than I can jump.
- SPSpeaker
Oh, what manner of magic is that?
- OMOlivia Moore
It feels like a real unlock to me, as someone who's been following AI video less closely, in that people are now able to generate, in one prompt, a full vlog, a full talking head video, something that looks like a podcast-
- JMJustine Moore
Yes
- OMOlivia Moore
... in one go, and I think that's why we've seen things like the stormtrooper vlogs completely blowing up on TikTok and Instagram.
- SPSpeaker
I told you, Greg. I told you not to touch the nav system.
- SPSpeaker
I followed the route.
- SPSpeaker
You plotted it upside down, Greg.
- JMJustine Moore
Yeah, so the interesting thing about Veo 3 is it's limited to eight-second generations only, um, and it doesn't generate audio if you start from an image to video, only if you start from text, which means that it's really hard to have longer than an eight-second clip with character consistency unless in your text prompt you are referencing a character that the model already knows.
- OMOlivia Moore
Okay.
- JMJustine Moore
And so that's why we've seen all of these hacks of all the viral vlogs featuring, like, stormtroopers or a yeti.
- OMOlivia Moore
Yeti. You can't see their faces. They're covered by a mask.
- JMJustine Moore
Yes, or the yeti, the model knows what the yeti looks like.
- OMOlivia Moore
Yeah.
- JMJustine Moore
Or a capybara. Like, if it's not a human face, I think we're less sensitive to-
- 6:34 – 10:22
ChatGPT's Advanced Voice Mode Updates
- JMJustine Moore
Okay, so there was a lot of news last week-
- OMOlivia Moore
Yeah
- JMJustine Moore
... so this got kind of lost, but I heard there was a big update to ChatGPT's Advanced Voice Mode?
- OMOlivia Moore
Yes. They announced it on Saturday, which was an interesting choice. Um-
- JMJustine Moore
Weird time to drop.
- OMOlivia Moore
Yeah. I think they actually dropped the improvements last Thursday or Friday. It was first only for all paid users, and now I think it started rolling out across the broader use- user base. But essentially they made Advanced Voice Mode a lot more human.
- JMJustine Moore
Hmm.
- OMOlivia Moore
The really interesting thing was ChatGPT was the first one to do what I would call kind of real-time consumer voice-
- JMJustine Moore
Yeah
- OMOlivia Moore
... where you could have, like, a conversation, and this was, like, last September in the-
- JMJustine Moore
Yeah
- OMOlivia Moore
... ChatGPT app. But then... they didn't really improve the product or the model that much-
- JMJustine Moore
Right
- OMOlivia Moore
... at least from my perspective, and we saw Sesame and other open source providers come out with arguably better and way more human-like models. We saw Gemini, we saw Grok launch voice products that were much more realistic.
- JMJustine Moore
Right.
- OMOlivia Moore
So I think it was kind of a question mark for a lot of people what ChatGPT was doing with consumer voice.
- JMJustine Moore
And so what were the, what, what makes it better now, or what were the main upgrades?
- OMOlivia Moore
We should actually just maybe talk to it and see.
- JMJustine Moore
It's live demo time, folks. [laughs]
- OMOlivia Moore
Hey, ChatGPT, are you there?
- SPSpeaker
Hey there. Yep, I'm here and ready for the demo. What can I help you with today?
- OMOlivia Moore
Did they make some improvements to your voice over the last week?
- SPSpeaker
Yeah, they did. They've been rolling out some updates to make my voice, uh, sound more natural and expressive. I'm glad you noticed.
- OMOlivia Moore
Yeah. I can hear, like, your voice maybe inflects upwards when you're approaching a question, or sometimes you'll say um or uh or something that sounds like a mistake but actually makes you sound more human.
- SPSpeaker
Exactly.
- OMOlivia Moore
The-
- SPSpeaker
Those little touches are all intentional to make the conversation feel more natural and relatable. It, it definitely makes things a bit more fun and engaging, I think.
- OMOlivia Moore
Very cool.
- JMJustine Moore
Amazing. Yeah, it's crazy to hear... It's always wild when a voice model, like, coughs-
- OMOlivia Moore
Yes
- 10:22 – 12:18
Apple's AI Announcements and Siri's Shortcomings
- OMOlivia Moore
But the other big, the, the other big tech consumer update this week, which was the Apple developer conference-
- JMJustine Moore
Yes
- OMOlivia Moore
... and, and all of the things that they announced around AI, and-
- JMJustine Moore
Or didn't announce
- OMOlivia Moore
... and, or didn't announce.
- JMJustine Moore
Right.
- OMOlivia Moore
And the fact that I think that people have been so far somewhat disappointed-
- JMJustine Moore
Yeah
- OMOlivia Moore
... by Apple Intelligence, which is their bundled set of AI features.
- JMJustine Moore
Yep.
- OMOlivia Moore
I think we've all been waiting on, like, the AI version of Siri or some kind of true personal assistant on mobile.
- JMJustine Moore
Yeah. I asked Siri, so I had this the other day-
- OMOlivia Moore
Yes
- JMJustine Moore
... where I asked Siri, um, "Okay, tomorrow's Monday. What Monday is it of the month?" Because SF street cleaning-
- OMOlivia Moore
Okay. Yes
- JMJustine Moore
... I had to know if it was gonna be-
- OMOlivia Moore
Yes
- JMJustine Moore
... the second Monday of, of the month. And it said, "I can't... I don't know that. Can I search ChatGPT for you?"
- OMOlivia Moore
Tough.
- JMJustine Moore
And I was like, "Siri-
- OMOlivia Moore
Yes
- JMJustine Moore
... how can you not answer this basic question?"
- OMOlivia Moore
Well, okay, it does seem like from a lot of Apple's updates that they put out, they're kind of outsourcing a lot of the-
- JMJustine Moore
Yeah
- OMOlivia Moore
... true AI features to ChatGPT just running on your phone.
- JMJustine Moore
Yeah.
- OMOlivia Moore
Um, and I think a similar story, it seemed like when they rolled out those AI-powered notification summaries-
- JMJustine Moore
Yes
- OMOlivia Moore
... where they would group, like, three or four sets of notifications into one-
- JMJustine Moore
Yeah
- 12:18 – 15:50
ElevenLabs' New Voice Model: Eleven V3
- JMJustine Moore
Um, okay, and before we get too far off voice, should we talk about Eleven V3?
- OMOlivia Moore
Yes.
- JMJustine Moore
Uh, so ElevenLabs, the text-to-speech company, actually broader AI voice company, uh, released their third generation model called Eleven V3.
- OMOlivia Moore
We're off under the lights here for this semifinal clash, the stadium buzzing with anticipation.
- JMJustine Moore
And what makes Eleven V3 really special is it does a bunch of stuff with voice that you used to have to do via speech to text to speech. So before, if you wanted to have a character that was, you know, crying while talking-
- OMOlivia Moore
Yeah
- JMJustine Moore
... or had some sort of emotion or even had, like, a weird inflection, you would have to record yourself saying it like that, upload it to Eleven, and then they-
- OMOlivia Moore
Yeah
- JMJustine Moore
... would translate it into the AI voice.
- OMOlivia Moore
Yep.
- JMJustine Moore
And now they essentially take all of the weird inflections, emotion, even accents, and they turn it into text prompting-
- OMOlivia Moore
Yep
- JMJustine Moore
... through these things called tags. So basically, the Eleven, and I'm sure we'll show this, the Eleven, um, interface is an editor where you can take a sentence that you want the character to say, you pick your voice, you write your sentence, and then you can tag it, like sadly or resigned or whispering or something like that.
- OMOlivia Moore
Liam, have you tried the new ElevenLabs V3?
- SPSpeaker
Just got it. The emotion is amazing. I can actually do whispers now like this.
- OMOlivia Moore
And you can do sound effects too, right?
- JMJustine Moore
That is huge. So, uh, actually, should I bring up my example of this?
- OMOlivia Moore
Go for it.
- JMJustine Moore
I don't know if it's gonna play or not. Let's see. Um, so this is a 20-second clip I made of two characters talking back and forth.
- OMOlivia Moore
And what's the prompt on it?
- JMJustine Moore
Oh, it's a text prompt. It'll say, "Hey, y'all. My name is Austin. I'm coming to you live from our family farm in Fort Worth." Then he's gonna walk-
- OMOlivia Moore
Okay
- JMJustine Moore
... through milking a cow, and someone's gonna interrupt him.
- OMOlivia Moore
Great.
- SPSpeaker
Hey, y'all. My name is Austin. I'm coming to you live from our family farm in Fort Worth. [cow moos] Today, I'm gonna walk through what it's like to-
- JMJustine Moore
Austin, are you faking an accent again?
- SPSpeaker
It's not faking. I was born here.
- JMJustine Moore
Everyone knows you don't talk like that. So my favorite thing about that is it showcases a couple of things-
- OMOlivia Moore
Yeah
- JMJustine Moore
... about the model.
- 15:50 – 23:14
Report from a16z: AI Revenue Growth
- JMJustine Moore
week-
- OMOlivia Moore
Yep
- JMJustine Moore
... about AI revenue ramp and how fast companies are growing.
- OMOlivia Moore
Yep.
- JMJustine Moore
Let's chat through the main takeaways from that.
- OMOlivia Moore
Yeah. So basically the methodology here, or maybe to even b- back up, the, the purpose here was I think we all have this idea in mind or, or maybe we have that idea because we've heard it a billion times that, like, we're in a new era of growth now.
- JMJustine Moore
Yes.
- OMOlivia Moore
Thanks to AI, companies are scaling faster than ever before.
- JMJustine Moore
Right.
- OMOlivia Moore
But my question was like, what does that really mean, and how fast is that? Is it 20% faster? Is it 50% faster than what we saw pre-AI?
- JMJustine Moore
Right.
- OMOlivia Moore
So we are blessed to get to meet tons of companies here every day. We meet dozens of companies a week. So we went back and essentially just pulled all the data from companies we've met in the gen AI era, which I would say is the last, you know, 22 to 24 months.
- JMJustine Moore
Right.
- OMOlivia Moore
And we looked at once they started monetizing, how fast are they growing? I would say pre-AI, if you're a B2B startup selling to enterprises, if you got to a million dollars in ARR in the first year, that's, like, amazing, best in class. [laughs]
- JMJustine Moore
That was, like, the rule of thumb. I remember that.
- OMOlivia Moore
Yes.
- JMJustine Moore
It's the known metric.
- OMOlivia Moore
Very exciting. If you were a consumer startup, you would not make money for three, five years, maybe longer.
- JMJustine Moore
Yes.
- OMOlivia Moore
The whole idea was to build up a user base and then probably monetize them directly, uh, via ads.
- JMJustine Moore
Or transactions for like a marketplace maybe.
- OMOlivia Moore
Yes, down the line.
- JMJustine Moore
Right.
- OMOlivia Moore
And there were counter examples to that, some subscription companies, but that was definitely not the dominant model.
- JMJustine Moore
Yep.
- OMOlivia Moore
That has fully shifted in the AI, AI era, and most companies are now making money directly from consumers via subscription. What we found was actually pretty surprising, which is that the median ARR, annualized revenue run rate, is now $4.2 million at month 12-
- JMJustine Moore
Wow
- OMOlivia Moore
... for consumer startups. The bottom quartile is 2.9 million.
- JMJustine Moore
Yeah.
- OMOlivia Moore
And the top quartile is 8.7 million.
- 23:14 – 29:17
Demo of the Week: AI in Brand Creation
- JMJustine Moore
All right. Awesome. We're moving on to our demo of the week.
- OMOlivia Moore
Love it.
- JMJustine Moore
So one fun fact about us is that we love, we genuinely love, like, at least for me, it's probably my number one hobby now-
- OMOlivia Moore
Yeah
- JMJustine Moore
... trying out all of the AI creative tools especially, but also AI, like, consumer products more broadly. Um, figuring out how to make cool things and then sharing the workflows to other people whose number one hobby is not doing this.
- OMOlivia Moore
[laughs]
- JMJustine Moore
So this week we are going to talk about brand creation and ideation using AI. I made this new frozen yogurt brand called Melt that I iterated on with ChatGPT, then I took to Ideogram, and then I took to Krea to kinda do the final touches and to make these really cool product photos and even store photos.
- OMOlivia Moore
Yeah.
- JMJustine Moore
Um, and I think that the initial idea about this was seeing Flux Kontext come out, which is the new image editing model from Black Forest Labs, which is hosted on Krea. Um, and Flux Kontext, you can kind of think of it like the GPT-4o image model, where you can upload an image, um, and then you can say, you know, "Make this Ghibli style" was the-
- OMOlivia Moore
Yeah
- JMJustine Moore
... was vi- viral example. You can also say, like, "Take the person from this photo and put them in a new environment," or, you know, "Take the logo and change it slightly."
- OMOlivia Moore
Yeah.
- JMJustine Moore
Add or remove objects.
- OMOlivia Moore
I've seen it described as kind of like Photoshop, but with natural language prompts.
- JMJustine Moore
Yes.
- OMOlivia Moore
Like, you can edit with words for the first time.
- JMJustine Moore
And that's... I think that is what makes it different than the 4o image model-
- OMOlivia Moore
Yeah
- JMJustine Moore
... which is, um, the consistency to which it retains the item or the character or whatever-
- OMOlivia Moore
Yeah
- JMJustine Moore
... is much, much better. Uh, we'll, we'll show some examples here.
- OMOlivia Moore
Yeah.
- JMJustine Moore
But basically, if you're t- taking a photo of yourself and uploading it to GPT-4o and saying, like, "Put me in a podcast studio," you will likely end up looking completely different in the new photo than you did in the initial photo.
- OMOlivia Moore
Yes. [laughs]
- JMJustine Moore
Or maybe some similar features, but quite different, whereas this model does an amazing job at maintaining consistency.
- OMOlivia Moore
Yep.
- JMJustine Moore
And so that sparked this idea for me of like, "Oh, that means that this can actually be used for, like, brands-
- OMOlivia Moore
Yeah
- JMJustine Moore
... to do product photos or, or, uh, other sorts of marketing collateral," because the logos and the products can be consistent.
- OMOlivia Moore
Awesome.
Episode duration: 29:35
Transcript of episode fySodSi4aUU