No Priors Ep. 24 | With Devi Parikh from Meta

Video dominates modern media consumption, but video creation is still expensive and difficult. AI-generated and edited video is a holy grail of democratized creative expression. This week on No Priors, Sarah Guo and Elad Gil sit down with Devi Parikh. She is a Research Director in Generative AI at Meta and an Associate Professor in the School of Interactive Computing at Georgia Tech. Her work focuses on multimodality and AI for images, audio and video. Recently, she worked on Make a Video 3D, also called MAV3D, which creates animations from text prompts. She is also a talented AI-generated and analog artist herself. Elad, Sarah and Devi talk about what’s exciting in computer vision, what’s blocking researchers from fully immersive Generative 4-D, and AI controllability.

00:00 - Democratizing Creative Expression With AI-Generated Video
08:31 - Challenges in Video Generation Research
15:57 - Challenges and Implications of Video Processing
20:43 - Control and Multi-Modal Inputs in Video
25:50 - Audio's Role in Visual Content
39:00 - Don't Self-Select & Devi’s tips for young researchers

EPISODE INFO

Released: July 20, 2023
Duration: 39m
Channel: No Priors

SPEAKERS

Sarah Guo
host
Devi Parikh
guest
Elad Gil
host

EPISODE SUMMARY

In this episode of No Priors, Sarah Guo and Elad Gil talk with Devi Parikh about generative video, multimodal AI, and creative control. Parikh, a research director in generative AI at Meta and an associate professor at Georgia Tech, traces her path from early “pattern recognition” work to leading-edge multimodal generative models. She explains the Make-A-Video project, which builds text-to-video generation on top of powerful image diffusion models and separates the learning of appearance from the learning of motion. Parikh outlines why video generation is progressing more slowly than image generation, citing infrastructure costs, representation challenges, architecture complexity, and immature data curricula, and she emphasizes the importance of controllability and multimodal prompts for creative tools. She also reflects on AI’s role in democratizing creative expression, underexplored research directions such as cross-modal models, and practical career advice, including not self-selecting out of opportunities.
