I cloned myself with Gemini Omni in 15 minutes (and it's terrifyingly good)

In this experimental episode, I document my real-time attempt to create an AI avatar of myself using Google Flow and the new Gemini Omni video generation model. I walk through the entire process, from scanning my face with my phone to generating a complete one-minute hype video for the podcast, all in about 15 minutes.

*What you’ll learn:*
1. How to create an AI avatar using Google Flow in under five minutes
2. Why video AI tools unlock creative possibilities for people with zero video production skills
3. The step-by-step process of generating a full storyboard using AI as your creative producer
4. How to use character consistency features to generate multiple video scenes with the same avatar
5. The uncanny-valley moments you’ll encounter when your AI clone doesn’t quite nail emotions or physics
6. How to stitch together AI-generated scenes into a complete video using built-in editing tools

*Brought to you by:*
Merge, connective infrastructure for production AI: https://www.merge.dev/howiai
Jira Product Discovery, prioritize with insights, build with confidence: https://atlassian.com/howiai

*In this episode, we cover:*
(00:00) Getting started with Google Flow and Gemini Omni
(01:38) The avatar creation process: scanning and photo capture
(02:55) Using Flow to brainstorm a hype video storyboard
(06:59) Generating the first video scene with the avatar
(08:41) Troubleshooting: accidentally generating images instead of videos
(09:32) Generating all seven scenes for the complete video
(11:37) Reviewing the avatar videos
(13:13) Stitching the videos together in the browser-based editor
(14:32) The complete How I AI hype video
(15:32) What worked and what didn’t
(19:04) Final thoughts

*Blog & detailed workflow walkthroughs from this episode:*
How I Built an AI Avatar and Hype Video in 15 Minutes with Google Flow: https://www.chatprd.ai/how-i-ai/ai-avatar-video-in-15-minutes-with-google-omni-flow
↳ How to Create a Promotional Video with an AI Creative Director: https://www.chatprd.ai/how-i-ai/workflows/how-to-create-a-promotional-video-with-an-ai-creative-director
↳ How to Create a Personalized AI Avatar with Google Flow: https://www.chatprd.ai/how-i-ai/workflows/how-to-create-a-personalized-ai-avatar-with-google-flow

*Tools referenced:*
• Google Flow: https://labs.google/fx/tools/flow
• Gemini Omni: https://gemini.google/overview/video-generation/
• Veo 3: https://deepmind.google/technologies/veo/

*Where to find Claire Vo:*
ChatPRD: https://www.chatprd.ai/
Website: https://clairevo.com/
LinkedIn: https://www.linkedin.com/in/clairevo/
X: https://x.com/clairevo

_Production and marketing by https://penname.co/._
_For inquiries about sponsoring the podcast, email jordan@penname.co._

Claire Vo, host

Jun 3, 2026 · 20m

WHAT IT’S REALLY ABOUT

Building a convincing AI video avatar with Flow: fast, but imperfect

Claire uses Google Flow’s avatar feature and Gemini Omni to create an AI video double of herself and produce a complete podcast hype video in roughly 15 minutes.
Flow helps not only generate video clips but also brainstorm a seven-scene storyboard, positioning itself as an end-to-end creative suite rather than a single video model.
The process includes real-world friction—like mistakenly generating images instead of videos—yet still results in usable scenes and a stitched final cut using a browser-based timeline editor.
The avatar output is “terrifyingly good” in moments but inconsistent across shots, with uncanny facial expressions, shifting hair/background details, and stereotypical “futuristic AI” visual tropes.
Claire concludes the tool is already valuable for fast solo content creation, and with tighter prompting and more reference inputs it could become convincing to most viewers.

IDEAS WORTH REMEMBERING

5 ideas

Flow’s real differentiator is combining ideation, generation, and editing in one place.

Claire leans on Flow to brainstorm a storyboard, generate multiple takes per scene, and assemble clips on a timeline in the browser—reducing tool-hopping and specialized skills.

Avatar capture can import unintended “truth” from your environment.

Background posters/books from the scan appear in generations, suggesting the model anchors heavily on whatever context is visible during capture and may leak personal/environmental cues into outputs.

Small UI mistakes can derail output type, but recovery is fast.

Claire accidentally generates images instead of videos due to a mode toggle; the workflow still makes it easy to re-run prompts and continue without major rework.

Character consistency is the current bottleneck for believable avatar video.

Across scenes, her hair length changes, the room color and props shift, and her face matches only “about 50% of the time,” indicating that reference inputs plus prompting are still not enough for a stable identity across cuts.

Facial emotion and performance remain the fastest route to uncanny valley.

Neutral or serious shots look more convincing, while laughing/smiling clips appear strange and “medicated,” implying expression synthesis is less reliable than static likeness.

WORDS WORTH SAVING

5 quotes

Today, I am doing a very strange episode where I'm gonna create a video avatar of myself, and in about 15 minutes, get to a full minute-long video starring none other than your favorite podcast host, Claire Vo.

— Claire Vo

I have no idea what we're gonna get into, and hopefully it won't be terrifying.

— Claire Vo

We were told AI would replace us. That is quite spooky.

— Claire Vo

Sorry. Sorry. For you all that are listening and not watching, I just got, um, jump scared by the AI version of myself wearing glasses, um, turning around in a spinning chair.

— Claire Vo

This took zero time and effort, and it is ... I wouldn't say it's, like, 80% there, but is it 50% there? 100% yes.

— Claire Vo

• Google Flow setup and avatar creation via phone scan
• Gemini Omni video generation workflow
• AI-assisted storyboard ideation (seven scenes)
• Generation errors and troubleshooting (image vs video)
• Scene-by-scene avatar consistency (face, hair, background)
• Uncanny valley and emotion/voice timing artifacts
• In-browser editing and rapid assembly into final hype video

High quality AI-generated summary created from speaker-labeled transcript.
