OpenAI

Inside image generation’s Renaissance moment — the OpenAI Podcast Ep. 19

People are generating over 1.5 billion images a week in ChatGPT. In this episode, product lead Adele Li and researcher Kenji Hata share some of the new use cases and trends since the launch of Images 2.0. Together with host Andrew Mayne, they trace the progress from the early DALL-E days and dive into the latest capabilities, including better text rendering, photorealism, multilingual support, world knowledge, aspect ratios, and character consistency. They also explore what comes next as image generation models evolve into more capable creative assistants.

Chapters
00:36 How Adele and Kenji came to work on Images
02:27 Images 2.0 launch reception
05:25 Productivity use cases and 360 images
09:34 Viral trends, authenticity, and imperfection
10:51 Training breakthroughs and photorealism
14:06 Evals, prompting, and creative control
22:16 Creative agents and what comes next
22:27 Images + Codex
28:08 Prompt tips

Andrew Mayne (host), Adele Li (guest), Kenji Hata (guest)
May 14, 2026 · 29m

CHAPTERS

  1. Why Images 2.0 feels like a “Renaissance” moment

    Andrew Mayne sets the stage for a conversation about Images 2.0 with product lead Adele Li and researcher Kenji Hata. They frame the release as a major paradigm shift in image generation quality and usefulness, not just an incremental version bump.

  2. Adele Li’s path to leading Image Gen product

    Adele describes her background in investing and her transition inside OpenAI from infra-focused work to product leadership. She explains how the market for image generation and ChatGPT’s role evolved, shaping what Image Gen needed to become.

  3. Kenji Hata’s route from audio experiments to Images research

    Kenji recounts starting on an unrelated audio effort before gradually contributing to Images 1.0 and then moving full-time onto the image generation team. The segment highlights how projects evolve internally as models near launch.

  4. Launch reception: massive usage and viral creativity worldwide

    Adele shares early post-launch metrics and the breadth of global trends. They discuss how improvements are immediately “visually obvious” to users and how that accelerates adoption and sharing.

  5. What the team optimized: text rendering, multilingual, and realism

    Adele explains the core mandates for Images 2.0: legible/accurate text, stronger multilingual performance, and photorealistic outputs that better preserve identity. Kenji notes the feedback loop from social media to prioritize fixes in new iterations.

  6. Productivity use cases: infographics, dense layouts, and “100 objects” tests

    Kenji highlights a shift from primarily playful image gen toward practical outputs like infographics and diagrams. He describes internal stress tests—like generating a grid of 100 random objects—to track steady improvements in binding, compositional accuracy, and reliability.

  7. Aspect ratios and 360 panoramas: an emergent feature becomes productized

    They explain how support for arbitrary aspect ratios led users to create long panoramas and “360-style” images. The team turned that emergent behavior into a built-in viewing experience on web and mobile.

  8. Viral imperfection: authenticity, nostalgia, and intentional “jank”

    Andrew and Adele discuss why users intentionally generate MS Paint, crayon, and scribble aesthetics. Adele argues that making something convincingly imperfect takes intelligence—and that the trend reflects a desire for authenticity and self-expression with AI.

  9. Training & post-training breakthroughs: speed, tokens, taste, photorealism

    Kenji describes learning-based improvements across versions, including making generation more token-efficient to maintain speed while improving quality. Adele emphasizes post-training focused on “taste,” beauty, and realism—balancing world knowledge with what users find compelling.

  10. Evals and creative control: personal tests, standard prompts, and better harnessing

    Adele shares her “me, me, me eval,” using many photos of herself/friends/family to test personalization and whether ChatGPT understands context around relationships and preferences. Kenji references common standardized photorealism prompts, while they discuss how vague requests (“make it better”) are translated by the model into actionable creative decisions.

  11. Education and internal workflows: images as a communication layer

    Kenji describes educator testing in an internal alpha channel, including accurate graduate-level biology diagrams. Adele notes images are now deeply embedded in communication—reporting that over half of internal slides use Image Gen—and points to future improvements like editability and better composition tools.

  12. What’s next: creative agents that understand your preferences and goals

    Adele outlines a roadmap toward a “creative agent” that collaborates like an interior designer, architect, or event planner—learning user tastes and iterating toward desired outcomes. They connect this to broader expansion of Image Gen throughout ChatGPT’s learning and creation experiences.

  13. Images + Codex: from design concepts to shipped apps (sprite sheets, websites, comics)

    They explore the synergy of image generation with coding agents: using images as the first step for UI concepts, then having Codex implement them. Examples include generating contact sheets for website redesigns, creating game sprites, and producing consistent multi-page comics and slide decks.

  14. Prompting tips: use Thinking mode, be open-ended, and specify style

    Adele recommends using Image Gen in Thinking/Pro experiences for stronger results, taking advantage of web search and tools, and favors open-ended prompts with aesthetic grounding. Kenji advises being explicit about style preferences (e.g., minimalist, less dense layouts) to steer composition and clarity.
