OpenAI Codex lead on the new shape of product work | Andrew Ambrosino

Andrew Ambrosino leads development of the Codex desktop app at OpenAI. Nearly 100% of OpenAI employees (not just engineers) now use Codex weekly. A lifelong builder with a background spanning engineering, design, product management, and founding companies, he is now responsible for turning the Codex desktop experience into what he calls “the best desktop app that has ever existed, full stop.”

*In our in-depth conversation, we discuss:*
1. Why AI has completely flipped the product development process
2. What “taste” really means as a professional skill, and why it is emerging as the most valuable capability in an AI-first workplace
3. Why Andrew believes the Codex app would have failed if they launched it last November (vs. in February)
4. The “zone defense” model for how product managers at OpenAI operate when everyone can build anything
5. How roles are collapsed on Andrew’s team, and why eliminating the concept of roles entirely is a big mistake
6. How Andrew uses Codex to run his own workflows
7. The vision for a home base that coordinates work across ChatGPT, Codex, and the tools people already use

*Brought to you by:*
• WorkOS: Make your app enterprise-ready, with SSO, SCIM, RBAC, and more. https://workos.com/lenny
• Mercury: Radically different banking, now with Command. https://mercury.com/

*Episode transcript:* https://www.lennysnewsletter.com/p/openai-codex-lead-on-the-new-shape

*Archive of all Lenny's Podcast transcripts:* https://www.dropbox.com/scl/fo/yxi4s2w998p1gvtpu4193/AMdNPR8AOw0lMklwtnC0TrQ?rlkey=j06x0nipoti519e0xgm23zsn9&st=ahz0fj11&dl=0

*Where to find Andrew Ambrosino:*
• X: https://x.com/ajambrosino
• LinkedIn: https://www.linkedin.com/in/ajambrosino
• Website: https://ambrosino.io

*Where to find Lenny:*
• Newsletter: https://www.lennysnewsletter.com
• X: https://twitter.com/lennysan
• LinkedIn: https://www.linkedin.com/in/lennyrachitsky/

*In this episode, we cover:*
(00:00) Introduction to Andrew Ambrosino
(02:30) How AI is changing the shape of product work
(06:32) When to use documents vs. prototypes
(10:25) What “taste” actually means
(12:06) Why AI is still bad at design
(16:18) Is the design process really dead?
(21:35) What the design process looks like on the Codex team
(23:41) Are product functions disappearing?
(27:22) Team structure
(30:12) IC vs. management
(31:37) Planning roadmaps
(35:16) Building features that don’t work yet
(38:13) The ambition problem: when you’re too AGI-pilled
(39:17) The latest frontier: loops and autonomous development
(42:05) How Andrew uses Codex to automate his entire job
(46:52) The power of computer use and browser automation
(49:10) Will we run all our SaaS apps inside Codex?
(52:05) The future vision for Codex
(57:20) The videographer who built a Premiere Pro extension with Codex
(59:30) Failure corner
(1:01:50) Lightning round
(1:07:03) BTS: How our producer uses Codex for editing

*Referenced:*
• Codex: chatgpt.com/codex
• The Primal Mark: How the Beginning Shapes the End in the Development of Creative Ideas: https://www.gsb.stanford.edu/faculty-research/publications/primal-mark-how-beginning-shapes-end-development-creative-ideas
• Linear: https://linear.app
• “Taste” is not just taste in aesthetics: https://x.com/thenanyu/status/2067327619897446721
• Linear’s secret to building beloved B2B products | Nan Yu (Head of Product): https://www.lennysnewsletter.com/p/linears-secret-to-building-beloved-b2b-products-nan-yu
• Paul Graham’s website: https://paulgraham.com
• The design process is dead. Here’s what’s replacing it. | Jenny Wen (head of design at Claude): https://www.lennysnewsletter.com/p/the-design-process-is-dead
• The case study factory: https://essays.uxdesign.cc/case-study-factory
• Why humans are AI’s biggest bottleneck (and what’s coming in 2026) | Alexander Embiricos (OpenAI Codex Product Lead): https://www.lennysnewsletter.com/p/why-humans-are-ais-biggest-bottleneck
• OpenClaw: https://openclaw.ai
• OpenClaw: The complete guide to building, training, and living with your personal AI agent: https://www.lennysnewsletter.com/p/openclaw-the-complete-guide-to-building
• From skeptic to true believer: How OpenClaw changed my life | Claire Vo: https://www.lennysnewsletter.com/p/how-openclaw-changed-my-life-claire-vo
• The Codex feature that works while you sleep: https://www.lennysnewsletter.com/p/the-codex-feature-that-works-while
• The AI paradox: More automation, more humans, more work | Dan Shipper: https://www.lennysnewsletter.com/p/the-ai-paradox-dan-shipper
• Atlas: https://chatgpt.com/atlas
• Anthropic: https://www.anthropic.com

*Recommended books:*
• The Gruffalo: https://www.amazon.com/Gruffalo-Julia-Donaldson/dp/0803730470
• The Big Orange Splot: https://www.amazon.com/Big-Orange-Splot-Manus-Pinkwater/dp/0590445103

_Production and marketing by https://penname.co/._
_For inquiries about sponsoring the podcast, email podcast@lennyrachitsky.com._

Lenny may be an investor in the companies discussed.

Andrew Ambrosino (guest), Lenny Rachitsky (host)

Jun 28, 2026 · 1h 9m

CHAPTERS

0:00 – 2:42
Why Codex is becoming OpenAI’s default “home base” app
Andrew opens with adoption and ambition: Codex is used by nearly everyone at OpenAI and is aiming to be the “best desktop app that has ever existed.” They discuss the quality bar required for an app to become the instinctive place you start work—like opening a browser tab.
- Codex usage is widespread internally, not just among engineers
- Vision: Codex as the default, hesitation-free app you open to do work
- Quality bar and reliability as the primary product challenge
- Codex expanding beyond coding into general knowledge work
2:42 – 6:23
The product-work inversion: implementation is cheap, curation is expensive
Andrew explains how AI flips classic product development: building is no longer the bottleneck. Teams can spin up many competing implementations quickly, so the hard part becomes selecting, framing, and integrating the best work.
- Anyone can build almost anything quickly with frontier models
- Explosion of parallel explorations ("90 attempts")
- Old assumption (implementation is expensive) no longer holds
- New bottleneck: curation, coherence, and deciding what matters
6:23 – 10:26
Documents vs. prototypes: choosing the right medium in an AI world
They push back on simplistic takes like “PRDs are dead.” Andrew argues that because both docs and prototypes are now cheap to create, teams must be deliberate about which medium drives clarity versus prematurely anchoring decisions.
- Prototypes can over-anchor teams (“primal mark” problem)
- Docs still matter when the goal is product clarity in vague spaces
- Prototypes matter when stress-testing interactions with real use
- Medium no longer reliably signals maturity or readiness to ship
10:26 – 12:06
What “taste” really means (and why it’s the new bottleneck)
Andrew unpacks “taste” beyond aesthetics—into systems thinking, context, strategy, and coherence. In a world of abundant implementation, taste is the ability to pick the right goal, shape, and integration path.
- Taste isn’t just visual style; it includes judgment and systems fit
- Covers framing, prioritization, and how work fits broader themes
- Includes interaction semantics (e.g., motion matching meaning)
- Core question: if we can build anything, what should we build?
12:06 – 16:50
Why AI is still bad at design (grading, novelty, and abstraction)
Andrew outlines why design lags coding: it’s harder to evaluate, less tied to labs’ research flywheels, and more culturally dependent. He also highlights deeper design challenges like abstraction and semantic consistency across a codebase.
- Design is harder to train because feedback loops are subjective
- Labs prioritize capabilities that accelerate research (coding over design)
- Design needs novelty; copying patterns isn’t “good design”
- Hardest layer: abstraction/semantics, not just pixels
16:50 – 21:11
Is the design process dead? What changes when everything can be “production-like”
They critique the traditional “case study factory” design process, which assumed building is costly and you must get it right before implementation. AI makes production-quality artifacts easy, so teams must decouple artifact polish from process stage and intent.
- Classic process overvalued ritual and implied quality via steps
- Polished artifacts no longer mean “late stage” or de-risked
- Design process isn’t gone, but its tools and signals have changed
- Need explicit clarity on what stage an artifact represents
21:11 – 24:01
How the Codex team actually works: role overlap, dogfooding, and “average of your work”
Andrew describes a pragmatic, fluid collaboration model where roles overlap and people are defined by what they do most, not strict boundaries. Dogfooding is central—even when it’s uncomfortable—because using Codex reveals what Codex must become.
- More role “collapse” on Codex than in many orgs, but roles still exist
- People are the “average” of their work over time (PM/eng/design blend)
- Dogfooding loop drives product: use it even when it’s not best yet
- Improving the product sometimes takes priority over optimizing process
24:01 – 27:22
Why eliminating roles is risky: specialties, best practices, and the “builder” trap
They explore whether functions like PM, design, and engineering will disappear. Andrew argues that removing rigid lanes is good, but deleting disciplines is harmful because specialties encode hard-won practices and depth that tools can’t replace overnight.
- Some companies overreact: “get rid of PMs, everyone builds”
- Roles as disciplines matter because they contain best practices
- Tools reduce gatekeeping, making role-switching easier
- Not everyone can (or wants to) do everything; depth still matters
27:22 – 30:14
Team structure on Codex: ‘10 to a few thousand,’ agency-first hiring, and zone defense PM
Andrew explains the unusual scale boundaries: a small core team but broad dependency on research and platform groups across OpenAI. Product leadership becomes “zone defense”—spreading coverage to steer chaos, fill gaps, and curate coherence.
- Core team is double-digit engineers, smaller design group, few PMs
- Codex as a culmination of many orgs’ work (models, infra, CUA, etc.)
- Zone defense: PMs avoid clustering; seek coverage and gap-filling
- Hiring emphasis: agency + taste; product-minded engineers
30:14 – 31:38
IC vs. management in the agent era: everyone is ‘managing’ something
They reframe the IC/manager divide: ICs increasingly manage agents and orchestrate work rather than write every character of code. Management remains essential, but the difference is granularity—what you’re coordinating and at what scale.
- AI shifts IC work from typing to supervising/orchestrating
- Management isn’t disappearing; it’s changing shape
- Key competency: filtering signal vs. noise with unlimited output
- Taste becomes critical to prevent “slop” from shipping
31:38 – 35:05
Roadmaps when the models move: hazy long-term plans and feature timing sensitivity
Andrew shares their planning philosophy: detailed near-term, intentionally vague long-term to avoid false precision. Model capability jumps can fully change whether a product succeeds, even if the product shape stays constant.
- Short horizon = high detail; long horizon = intentionally hazy
- False precision wastes time in fast-moving applied AI
- Approach: prototype ideas, wait, and retest as models improve
- Codex app could have failed months earlier due solely to model quality
35:05 – 39:18
Building features that don’t work yet—and the danger of being too ‘AGI-pilled’
They discuss why teams should build and archive “not-ready-yet” features as test artifacts for future model upgrades. Andrew contrasts overly autonomous early Codex concepts with more interactive, model-appropriate designs that succeeded.
- Code artifacts can be future testbeds even if not shippable today
- Re-releasing ‘same idea’ with better intelligence changes outcomes
- Product must match current model limits (ask questions vs. overpromise)
- Balance: fix paper cuts while also funding disruptive exploration
39:18 – 42:05
Frontier workflows: loops, autonomous development, and the ‘delete code’ problem
Andrew talks about the next frontier: supervised vs. unsupervised code generation, harnesses, and autonomous maintenance. A key blocker is that models tend to increase complexity, making fully autonomous improvement loops risky today.
- The meaningful metric shifts to supervised vs. unsupervised generation
- Explorations: overnight refactors, cleanup, and autonomous maintenance
- Key limitation: models often add complexity instead of simplifying
- Not yet at ‘agent reads Twitter/Slack and improves app’, but pushing toward it
42:05 – 46:52
How Andrew uses Codex to run his job: automated briefs, Slack triage, and release coordination
Andrew explains how his Codex usage evolved with his role—from writing the app with the app to managing discovery, alignment, and releases. He describes creating scheduled tasks that monitor Slack, generate daily briefs, and can be coached iteratively.
- Personal dogfooding loop: fix what blocks your own work
- Automations: daily brief across many Slack channels and workstreams
- “Vibe coordinated” releases: auto-collect PR/Slack updates into trackers
- Key UX challenge: powerful setup exists, but needs to be effortless for non-builders
46:52 – 52:06
Browser automation and ‘apps inside Codex’: connectors vs. in-app browser vs. computer use
They break down the different ways Codex can operate on external systems, including taking over the computer to click through tedious UIs. Andrew describes product trade-offs in making a browser a first-class surface, including security, latency, and keyboard shortcut conflicts.
- Three modes: connectors, in-app browser, and full computer-use clicking
- Computer use shines when connectors don’t exist (e.g., cloud consoles)
- Security and tech stack choices (Electron vs. Owl/Atlas browser stack) matter
- Hard UX problems: tabs, shortcuts, and muscle memory across many apps
52:06 – 59:25
The long-term vision: Codex as a desktop ‘home base’ that orchestrates specialist tools
Andrew traces Codex’s evolution from CLI to a right-sized desktop surface, then to broader knowledge-work usage—even when the UI is “hostile” to non-engineers. The goal is a flexible hub that can work inside tools (extensions/connectors) and also bring tools into Codex when useful.
- Internal PMF first with engineering/research, then surprise adoption across functions
- Attempts to move workflows elsewhere failed because people stayed in Codex
- Vision: start/end work in Codex; hand off to best-in-class tools when needed
- Example: Codex built a Premiere Pro extension to control editing primitives
59:25 – 1:09:56
Failure corner, lightning round, and post-roll: staying outcome-focused as process churns
Andrew reflects on years of startup struggle and constant micro-failures even now, emphasizing timing and fit. The lightning round reveals his current season of life (parenting), and the post-roll returns to a core lesson: don’t marry your process—commit to outcomes, adaptability, and self-awareness in the AI era.
- Career failures: founder slog, regulated industries, repeated attempts
- At OpenAI: fast feedback loops (and blunt internal critique) harden products
- Lightning round: children’s books, Magic School Bus, Linear
- Post-roll advice: don’t attach identity to tools/process; optimize for outcomes and adaptability
