How I AI
Using Veo 3 to create AI-generated music videos, like a Tiny Desk concert with Notorious B.I.G.
CHAPTERS
Why Anish uses AI for music: from DJ constraints to “creative satisfaction”
Claire Vo introduces Anish Acharya (a16z) and frames the episode as a fun, consumer-focused tour of AI workflows. Anish explains how AI removes longstanding audio constraints (like isolating vocals) and reignites remix culture in a new medium.
Tiny Desk as a format: using constraints to unlock creativity
They discuss why the Tiny Desk format works so well: tight constraints, a recognizable setting, and intimate audio. Anish uses this as the conceptual template for resurrecting “impossible” performances in a respectful, non-derivative way.
Case study: building an AI Notorious B.I.G. Tiny Desk performance (overview)
Anish shares the finished Biggie-style Tiny Desk clip and outlines the overall workflow. The key idea: assemble a believable still frame and the right audio layers, then use Hedra to animate the still and lip-sync it to the audio.
Generating the hero still image with GPT-4o Image Gen
Anish demonstrates creating a Tiny Desk-style still frame (using Kurt Cobain as the live example). They highlight why 4o’s image generation is effective: strong prompt adherence and controllable edits.
Animating a still image with Hedra: frame-to-video + custom audio lip-sync
They introduce Hedra as a practical tool that both generates video motion from a still and synchronizes uploaded audio. The chapter broadens into other applications like translating speeches and animating characters for storytelling.
Sourcing and preparing audio: pulling from YouTube and trimming in Adobe Audition
Anish downloads a reference performance from YouTube and uses Adobe Audition to trim and align usable segments. They discuss current limitations (short clip lengths) and why constraints can actually improve creativity.
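The trimming step happens in Adobe Audition, a GUI tool, so the episode shows no code for it. As a rough scripted stand-in, the sketch below cuts a segment out of a WAV file using only Python's standard `wave` module. It assumes the downloaded audio has already been converted to uncompressed WAV; the function name `trim_wav` and the file paths are hypothetical.

```python
import wave


def trim_wav(src_path, dst_path, start_s, end_s):
    """Copy the audio between start_s and end_s (in seconds) into a new WAV file.

    Returns the number of audio frames written.
    """
    if end_s <= start_s:
        raise ValueError("end_s must be greater than start_s")

    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        total = src.getnframes()
        # Clamp both ends to the length of the file.
        start_frame = min(int(start_s * rate), total)
        end_frame = min(int(end_s * rate), total)
        src.setpos(start_frame)
        frames = src.readframes(end_frame - start_frame)

    with wave.open(dst_path, "wb") as dst:
        # Reuse channel count, sample width, and sample rate from the source.
        dst.setparams(params)
        dst.writeframes(frames)

    return end_frame - start_frame
```

For example, `trim_wav("performance.wav", "clip.wav", 12.0, 20.0)` would keep an eight-second segment, which fits the short clip lengths discussed in this chapter.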
Extracting vocals with Demucs: turning any track into stems
Anish introduces Demucs to separate vocals from instrumentation via a simple command-line flow. This enables custom mashups like an a cappella Kurt vocal or isolating Biggie vocals for live-band overlays.
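The summary does not show the exact command Anish runs, so the following is only a sketch of what a vocals-only Demucs call could look like when wrapped in Python. It assumes Demucs is installed and on the PATH (for example via `pip install demucs`) and uses the `--two-stems=vocals` option from the Demucs README. The helper names and the default output folder are hypothetical.

```python
import subprocess


def build_demucs_command(track_path, out_dir="separated"):
    """Build the argument list for a two-stem (vocals / no_vocals) separation."""
    return ["demucs", "--two-stems=vocals", "-o", out_dir, track_path]


def separate_vocals(track_path, out_dir="separated"):
    """Run Demucs on one track; raises CalledProcessError if the tool fails."""
    subprocess.run(build_demucs_command(track_path, out_dir), check=True)
```

Calling `separate_vocals("song.mp3")` should leave a vocals stem and an accompaniment stem under the output folder, in a subfolder named for the model and the track. The vocals stem is the piece that would then be layered over a live-band recording.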
Assembling the Tiny Desk clip in Hedra: minimal prompts, strong results
With the still frame and audio ready, Anish uploads both into Hedra and generates the performance clip. They discuss how short, simple prompts can outperform over-engineered prompting when the model is strong.
Creating a ’90s-style Nirvana music video with Veo 3 (and refining prompts with 4o)
Anish shows a multi-clip Veo 3 workflow to produce a gritty, camcorder-like grunge music video. He uses GPT-4o to diagnose “wrong energy” generations and iteratively steer toward the desired Seattle ’90s aesthetic.
Evaluating realism: what looks incredible vs what still breaks
Claire reacts strongly to the realism—wardrobe, emotion, sequencing—while noting telltale artifacts. They call out specific failure modes (duplication, odd props) that creators learn to spot and work around.
Workflow #2: building a video-based book/record cataloger with Gemini Flash in AI Studio
Anish pivots to a practical multimodal app: video of flipping through a collection → extracted frames → recognized titles/authors. He argues video is a “native” interface for bringing the physical world online.
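The pipeline described here (video, then extracted frames, then recognized titles and authors) can be sketched in two small pieces. The code below is an illustration, not the app built in the episode: it picks the timestamps at which to sample frames, and it merges per-frame results so a book that appears in several consecutive frames is listed once. The recognition step itself (the Gemini Flash call) is deliberately left out, and all function names are hypothetical.

```python
def sample_timestamps(duration_s, every_s=1.0):
    """Return the timestamps (in seconds) at which to grab a frame."""
    if every_s <= 0:
        raise ValueError("every_s must be positive")
    stamps = []
    i = 0
    while i * every_s < duration_s:
        stamps.append(i * every_s)
        i += 1
    return stamps


def dedupe_catalog(per_frame_items):
    """Merge (title, author) pairs recognized in each frame into one list.

    Pairs that differ only in letter case or whitespace count as the same
    item; the first spelling seen is kept, in order of first appearance.
    """
    seen = {}
    for items in per_frame_items:
        for title, author in items:
            key = (" ".join(title.lower().split()),
                   " ".join(author.lower().split()))
            seen.setdefault(key, (title.strip(), author.strip()))
    return list(seen.values())
```

Sampling about one frame per second is an assumption; a slower flip through the collection could use a larger interval and send fewer frames to the model.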
Deploying personal software: from quick prototype to shareable Cloud Run app
They compare how quickly a working demo comes together (minutes) with how long it takes to make it production-ready and shareable (hours). The takeaway is the rise of “personal software”: people building one-off tools for their own lives.
Comet browser for personal finance: AI agents operating websites (RPA)
In a lightning round, Anish explains why he uses Perplexity’s Comet browser: its assistant can operate web apps and summarize insights without manual clicking. He applies it to portfolio analysis inside Robinhood.
AI for kids: interactive stories, play, and social-emotional learning
They explore consumer AI adoption through parenting. Anish describes kids using AI as interactive collaborators (not passive media), and predicts classroom impact will extend beyond homework to social dynamics and SEL.
Getting better results: embrace surprises, reset often, avoid sunk-cost prompting
Anish closes with a mindset for when models fail: follow unexpected directions sometimes, but don’t get trapped iterating on a broken approach. Restarting is cheap, and abandoning “bad branches” is a feature of AI creativity.