What You Missed in AI This Week (Google, Apple, ChatGPT)

Things in consumer AI are moving fast. In this episode, Justine and Olivia Moore, investing partners (and identical twins!) at a16z, break down what’s real, what’s overhyped, and what’s next across the consumer AI space. They cover:

- Veo 3: how Google's video model unlocked a new genre of content
- OpenAI’s Advanced Voice Mode: upgrades, realism, and... um, human-like hesitation
- Apple's AI announcements
- ElevenLabs' V3: expressive voice tags, real-time interruptions, and narrative tools for creators
- New data from a16z: AI consumer startups are ramping revenue faster than ever, and they show you how
- Justine walks through how she used ChatGPT, Ideogram, and Krea to launch a fully AI-assisted brand prototype (store photos and all)

It’s exhausting (in the best way) to be a creative in the age of AI.

Timecodes:
00:00 Introduction
00:28 Meet the Hosts: Justine and Olivia
00:45 Veo 3: The Game-Changer in AI Video
06:34 ChatGPT's Advanced Voice Mode Updates
10:22 Apple's AI Announcements and Siri's Shortcomings
12:18 ElevenLabs' New Voice Model: 11 V3
15:50 Report from a16z: AI Revenue Growth
23:14 Demo of the Week: AI in Brand Creation

Resources:
Read ‘What “Working” Means in the Era of AI Apps’: https://a16z.com/revenue-benchmarks-ai-apps/
Find Justine on X: https://x.com/venturetwins
Find Olivia on X: https://x.com/omooretweets

Tools Discussed:
Veo 3: https://gemini.google/overview/video-generation
OpenAI: https://openai.com/chatgpt
ElevenLabs (V3 voice model): https://elevenlabs.io/
Ideogram (logo/image generation): https://ideogram.ai/
Black Forest Labs/Flux Context (image editing via Krea): https://www.krea.ai/
Flux Context demo (Krea launch post): https://www.krea.ai/blog/flux-context
Hedra: https://www.hedra.com/

Stay Updated:
Let us know what you think: https://ratethispodcast.com/a16z
Find a16z on Twitter: https://twitter.com/a16z
Find a16z on LinkedIn: https://www.linkedin.com/company/a16z
Subscribe on your favorite podcast app: https://a16z.simplecast.com/
Follow our host: https://x.com/eriktorenberg

Please note that the content here is for informational purposes only; should NOT be taken as legal, business, tax, or investment advice or be used to evaluate any investment or security; and is not directed at any investors or potential investors in any a16z fund. a16z and its affiliates may maintain investments in the companies discussed. For more details please see a16z.com/disclosures.

Hosts: Olivia Moore, Justine Moore
Jun 12, 2025 · 29m

At a glance

WHAT IT’S REALLY ABOUT

AI video, voice, and consumer monetization accelerate creative entrepreneurship

  1. Google’s Veo 3 sparked a “ChatGPT moment” for AI video by generating native audio alongside video, enabling viral short-form talking-character content.
  2. Despite impressive realism, Veo 3 has key constraints (notably 8‑second clips and audio only from text prompts) that creators work around using masked or well-known characters for consistency.
  3. ChatGPT’s Advanced Voice Mode received upgrades that make conversations sound more human through expressive inflection, fillers, and natural timing, after competitors had arguably surpassed it.
  4. Apple’s AI announcements emphasized incremental features (e.g., Genmoji, transcription, real-time translation) while the long-awaited upgraded Siri still appears delayed and partially outsourced to ChatGPT.
  5. a16z data suggests consumer AI startups are monetizing faster than prior eras via subscriptions driven by inference costs and high perceived value, and a demo shows how modern image-editing models can rapidly produce brand assets end-to-end.

IDEAS WORTH REMEMBERING

5 ideas

Native audio is the unlock that made AI video feel mainstream.

Veo 3’s ability to generate dialogue and sound along with video from a single text prompt enables “one-shot” vlog/podcast-style clips that spread easily on social platforms.

Veo 3’s limitations shape what goes viral.

Because generations are capped at ~8 seconds and audio doesn’t work for image-to-video, creators lean on characters with covered faces (stormtroopers, yetis) or recognizable archetypes to mask continuity breaks.

Access is broadening, but cost still matters.

Veo 3 started behind a $250/month plan but is now reachable via APIs and third-party tools; however, pricing of roughly $0.75/second makes prompt discipline and iteration strategy important.
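To make the cost point concrete, here is a minimal back-of-the-envelope sketch. It assumes the approximate figures quoted in this summary (about $0.75 per second and clips of about 8 seconds); actual pricing varies by provider, and the function names are hypothetical, chosen only for illustration.

```python
# Rough cost of iterating on short AI video clips.
# Both constants are the approximate figures quoted in the summary,
# not official pricing.
PRICE_PER_SECOND_USD = 0.75  # assumed per-second price
CLIP_SECONDS = 8             # assumed per-generation length cap


def clip_cost(seconds: float = CLIP_SECONDS,
              price: float = PRICE_PER_SECOND_USD) -> float:
    """Cost in USD of a single generation of the given length."""
    return seconds * price


def iteration_cost(attempts: int, seconds: float = CLIP_SECONDS) -> float:
    """Cost in USD of `attempts` generations of the same length."""
    return attempts * clip_cost(seconds)


if __name__ == "__main__":
    print(f"one clip: ${clip_cost():.2f}")
    print(f"ten attempts: ${iteration_cost(10):.2f}")
```

Under these assumptions a single full-length clip costs about $6, so ten retries of the same prompt cost about $60, which is why tightening the prompt before generating matters.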

Voice UX is converging on “human imperfections” as a feature.

ChatGPT’s updated Advanced Voice Mode adds realistic disfluencies (ums/uhs), expressive prosody, and natural pacing—signals users now associate with trust and presence rather than “robot voice.”

Policy and perception can slow frontier productization.

The hosts speculate that OpenAI’s slower voice iteration may reflect the risk of repeating the “Her” controversy, along with prioritization tradeoffs across reasoning, images, and video.

WORDS WORTH SAVING

5 quotes

Veo 3 was sort of like the ChatGPT moment for AI video.

Justine Moore

Things are moving so quickly that it feels like we went from n- exciting but maybe not super realistic AI video to AI video completely taking over our social feeds in the span of a week, which is absolutely insane.

Olivia Moore

I asked Siri, um, "Okay, tomorrow's Monday. What Monday is it of the month?" ... And it said, "I can't... I don't know that. Can I search ChatGPT for you?"

Justine Moore

What we found was actually pretty surprising, which is that the median ARR, annualized revenue run rate, is now $4.2 million at month 12... for consumer startups.

Olivia Moore

The next generation of entrepreneurs, like-... are gonna be completely AI assisted-... Like, there'll be no reason for any person not to have their own product line, small business, open a store if they want to.

Olivia Moore

Veo 3 native audio + video generation
8-second clip limit and character-consistency hacks
ChatGPT Advanced Voice Mode realism upgrades
Apple Intelligence, Siri gaps, and ChatGPT outsourcing
ElevenLabs 11 V3 emotion, interruptions, and SFX tags
Consumer AI subscription economics and revenue ramp benchmarks
AI-assisted brand creation workflow (ChatGPT → Ideogram → Krea/Flux Context)

High quality AI-generated summary created from speaker-labeled transcript.
