No Priors Ep. 60 | With Playground AI Founder Suhail Doshi
Episode Details
EPISODE INFO
- Released: April 18, 2024
- Duration: 24m
- Channel: No Priors
EPISODE DESCRIPTION
Multimodal models are making it possible to create AI art and augment creativity across artistic mediums. This week on No Priors, Sarah and Elad talk with Suhail Doshi, the founder of Playground AI, an image generator and editor. Playground AI has been open-sourcing foundation diffusion models, most recently releasing Playground V2.5. In this episode, Suhail talks with Sarah and Elad about how integrating language and vision models enhances multimodal capabilities, how the Playground team approached building a user-friendly interface to make AI-generated content more accessible, and the future of AI-powered image generation and editing.

Sign up for new podcasts every week. Email feedback to show@no-priors.com
Follow us on Twitter: @NoPriorsPod | @Saranormous | @EladGil | @Suhail

Show Notes:
- 0:00 Introduction
- 0:52 Focusing on image generation
- 3:01 Differentiating from other AI creative tools
- 5:58 Training a Stable Diffusion model
- 8:31 Long term vision for Playground AI
- 15:00 Evolution of AI architecture
- 17:21 Capabilities of multimodal models
- 22:30 Parallels between audio AI tools and image-generation
SPEAKERS
- Sarah Guo (host)
- Suhail Doshi (guest)
- Elad Gil (host)
EPISODE SUMMARY
Playground AI Bets Big on Pixels, Editing, and Open-Source Models

In this episode of No Priors, hosts Sarah Guo and Elad Gil talk with Suhail Doshi, founder of Playground AI. Doshi discusses why he chose to build a company focused on image generation and editing rather than language or music: he saw a gap in long-term, dedicated effort around pixels. He explains how Playground trains its own diffusion models from scratch, pushing existing architectures like SDXL with new sampling tricks, meticulous data curation, and strong aesthetic judgment. A major theme is moving from “text-to-art loot boxes” toward high-utility workflows centered on editing, consistency, and integrating real and synthetic imagery. Doshi also outlines a long-term vision for a large vision model that can create, edit, and understand pixels, and shares his views on future architectures, multimodality, and adjacent areas like AI-generated music.