Building The World's Best Image Diffusion Model

EPISODE INFO

Released: September 19, 2024
Duration: 55m
Channel: Y Combinator

EPISODE DESCRIPTION

Suhail Doshi, a YC alumnus who previously founded Mixpanel and Mighty, has created a state-of-the-art (SOTA) AI image diffusion model with Playground. The app allows you to talk to it like a graphic designer and helps you create imagery and text for a wide variety of use cases. In this episode of Lightcone, Suhail sits down with the hosts to talk about his experience building Playground with his team, and what it takes to make a SOTA model.

Try Playground: https://playground.com/design
Read Playground V3 Paper: https://arxiv.org/pdf/2409.10695

Chapters (Powered by https://bit.ly/chapterme-yc):
0:00 Intro
1:07 What is Playground?
1:47 What Garry was able to make using Playground
7:04 The focus on text accuracy
10:44 Building a marketplace for Playground
16:00 Prompts are like HTML for graphics
22:25 Creating new design professions
26:13 Using tailwinds of what is happening in language
30:06 Problems with aesthetics evals
32:42 The commercial applications
33:54 When the users you get are not the users you want
40:30 Reflections on going through YC twice
48:30 Running a research lab/startup hybrid vs a pure startup
53:35 What it takes to make a state-of-the-art model
55:09 Outro

SPEAKERS

Suhail Doshi (guest)
Garry Tan (host)
Harj Taggar (host)
Jared Friedman (host)

EPISODE SUMMARY

In this Y Combinator episode, "Building The World's Best Image Diffusion Model," featuring Suhail Doshi and Garry Tan, the discussion explores how Playground reinvents image generation as a true graphic design partner. The conversation centers on Playground v3, a state-of-the-art image diffusion model and design product optimized for real-world graphic design tasks rather than use as an artistic toy. Founder Suhail Doshi explains how the team rebuilt the entire stack (architecture, captioning, UX, and marketplace) to improve text accuracy, prompt understanding, and designer-like interaction. The team emphasizes a shift away from raw model access and prompt engineering toward visual templates, natural language edits, and a creator ecosystem. Alongside technical details, Doshi shares strategic lessons on choosing users, pivoting from failed directions, and marrying research rigor with product usefulness.

RELATED EPISODES