
Ryan Podcast Final

Ryan Lapopolo, OpenAI Frontier team, leads the team building Codex. In this episode, he walks through why code is now a liability instead of an asset, how to run a billion tokens a day in your engineering org, the three-phase harness his team uses, and the painted-door technique for validating features before any backend code gets written. If you want to see what the frontier of AI software engineering actually looks like in 2026, this is the episode.

Full Writeup: https://www.news.aakashg.com/p/how-pms-ship-100k-lines-of-code
Transcript: [VERIFY - blog transcript URL]

---

Timestamps:
00:00 - Intro
03:18 - Why code is a liability now
11:30 - Ads
13:46 - Building an internal agent from an empty codebase
20:04 - Breadth-first search through the agent's missing capabilities
20:43 - When a PM's PRD goes straight to a shipped PR
27:54 - The painted-door technique inside the codebase
28:48 - Ads
31:51 - The harness and the three phases of agent work
40:00 - Ryan's actual Codex setup, end to end
50:45 - The billion-tokens-a-day benchmark and the 60-hour PR
60:34 - The Monday morning roadmap for a normal company
71:00 - Why one agent beats multi-agent handoffs
73:31 - Where to find Ryan
74:08 - Outro

---

Thanks to our sponsors:
1. Product Faculty - Get $550 off their #1 AI PM Certification with code AAKASH550C7 - https://maven.com/product-faculty/ai-product-management-certification?promoCode=AAKASH550C7
2. Bolt - Ship AI-powered products 10x faster - https://bolt.new/solutions/product-manager?utm_source=Promoted&utm_medium=email&utm_campaign=aakash-product-growth
3. Customer.io - Send smarter messages using your product data - http://customer.io/productgrowth
4. Ariso - Ship AI agents and features faster, with fewer regressions - https://ariso.ai/aakash
5. Pendo - The #1 software experience management platform - http://www.pendo.io/aakash

---

Key Takeaways:
1. Code is a liability, not an asset - Every engineering org was built around the assumption that code is expensive to produce, validate, and deploy. Codex inverts this. Code is now the cheapest part of the stack and the constraint moves to how clearly you describe the problem.
2. The new constraint is product decisions per week - With code generation effectively free and parallel, the bottleneck is no longer keystrokes. It is the quality of the brief, the clarity of the architectural boundaries, and the speed of verification.
3. A billion tokens a day is the new floor - Ryan's claim is that if you are not running this volume you are negligent. The math comes out to roughly $2K to $3K per engineer per month, which is trivial against the headcount cost of human-only execution.
4. A single PR can burn 350 million tokens - One refactor that would have taken Ryan three weeks ran on Codex for 60 hours straight across three days. He gave it two prompts total after the initial spec. The output matched what he would have produced himself.
5. The harness is the actual product - Codex CLI is the surface. The harness is everything that gets the agent the right context at the right phase: pre-work, messy middle, and close. Each phase needs different context, different tools, and different verification.
6. agents.md is forcibly injected context - This file lives in the repository root and is always loaded into the agent's context. Use it for the operating model and the non-negotiable rules. Everything else gets pulled in dynamically because context is a hard, scarce resource.
7. The painted-door technique works inside the codebase - Ryan's team enforces package boundaries so a designer can paint a fake UI on top of stubbed APIs. Real usage signal, no backend cost. This only works because the architecture refuses to permit a ball of mud.
8. The PM's PRD can become a shipped PR in one week - In Ryan's setup, the PM wrote a markdown PRD, the team reviewed it in a Monday meeting, and a working feature shipped to customers by the following week with zero PM-to-engineer back-and-forth.
9. The Monday morning roadmap starts with legibility - The first move is making the repository legible to the agent. Write the implicit team decisions down in a documentation tree. @-mention Codex to keep that tree updated whenever a Slack thread surfaces a new guardrail.
10. One agent beats multi-agent handoffs - The lossy friction of agent-to-agent handoffs costs more than it saves. The right answer is one agent with full addressability over design, backend, and frontend, powered by a model good enough to hold the whole task in context.

---

Where to find Ryan Lapopolo:
X: https://x.com/Laoplo
LinkedIn: [VERIFY - LinkedIn URL]

Where to find Aakash:
X: https://x.com/aakashgupta
LinkedIn: https://www.linkedin.com/in/aagupta/
Newsletter: https://www.news.aakashg.com

#AIPM #OpenAI

---

🧠 About Product Growth: The world's largest podcast focused solely on product + growth, with over 200K listeners.
🔔 Subscribe and turn on notifications.

Host: Aakash Gupta
Guest: Ryan
May 24, 2026 · 1h 14m

At a glance

WHAT IT’S REALLY ABOUT

Harness-driven agentic coding turns teams into high-leverage product factories

  1. Ryan argues “code is a liability” because LLMs make code generation cheap while long-term maintenance and validation remain the real costs to manage.
  2. He describes a “harness” approach—docs, tests, lints, review agents, and repo structure—that injects team judgment into the agent so humans stop micromanaging generation.
  3. OpenAI built a production agentic app from an empty repo to ~1M LOC with effectively zero human-written code, using engineers as system designers who prevent repeat failures rather than writing features line-by-line.
  4. Roles shift from strict PM/Design/Eng ownership to shared codebase contribution, with guardrails like package boundaries, modular architecture, and “painted door” UI experiments to keep changes safe.
  5. He claims leaders should aggressively scale token spend (up to “billion tokens/day”) because long-horizon autonomy and parallel orchestration (e.g., Symphony) can 10× PR throughput when quality gates are strong.
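For a sense of scale, the token figures quoted in the show notes (a single PR that consumed 350 million tokens over 60 hours, set against a billion-tokens-a-day benchmark) can be turned into rates. This is only a rough sketch: the function and key names are illustrative, and it assumes tokens were consumed at a uniform rate, which the episode does not state.

```python
def run_stats(total_tokens: int, hours: float, daily_benchmark: int) -> dict:
    """Convert a long agent run into average token rates (uniform rate assumed)."""
    tokens_per_hour = total_tokens / hours
    tokens_per_day = tokens_per_hour * 24
    return {
        "tokens_per_hour": tokens_per_hour,
        "tokens_per_second": tokens_per_hour / 3600,
        "tokens_per_day": tokens_per_day,
        # Fraction of the daily benchmark that this single run would account for.
        "share_of_daily_benchmark": tokens_per_day / daily_benchmark,
    }


# Figures from the show notes: 350M tokens, 60 hours, 1B tokens/day benchmark.
stats = run_stats(350_000_000, 60, 1_000_000_000)
for name, value in stats.items():
    print(f"{name}: {value:,.2f}")
```

Under that assumption the run averages about 5.8 million tokens per hour (roughly 1,600 per second), which is about 14% of a billion-token day. Reaching the benchmark therefore implies several such runs in parallel rather than one long run.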

IDEAS WORTH REMEMBERING

5 ideas

Treat code generation as abundant; treat validation and maintainability as scarce.

With Codex/GPT-5, the bottleneck shifts from typing code to ensuring it’s correct, secure, maintainable, and aligned with product intent—so organizations must invest in verification, not hero coding.

The “harness” is the product your team builds for the agent.

A harness is the repo-embedded system of docs, conventions, tests, lints, CI review agents, and observability that continuously supplies the agent the context and constraints it needs across planning, implementation, and review.

Make failures impossible to repeat by converting feedback into guardrails.

When the agent makes a mistake, engineers should diagnose the root cause and encode the fix into tests, lint rules, docs, or architecture so the repository itself “teaches” future runs—reducing slop and rework.

PRD → code works only after you modularize and enforce boundaries.

Their PM could ship from Markdown PRDs because the codebase had strong interfaces, fakes for tests, and package-layering rules that prevent spaghetti dependencies and keep changes local and testable.
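A package-layering rule of this kind can be enforced mechanically. The sketch below is one possible shape for such a check, not the team's actual lint: the layer names (`core`, `services`, `ui`) and the rule "a package may import only from its own layer or a lower one" are assumptions made for illustration.

```python
import ast
from typing import List, Optional

# Hypothetical layers, lowest first. These names are not from the episode.
LAYERS = ["core", "services", "ui"]


def layer_of(module: str) -> Optional[int]:
    """Return the layer index of a dotted module name, or None if unlayered."""
    top = module.split(".")[0]
    return LAYERS.index(top) if top in LAYERS else None


def find_violations(package: str, source: str) -> List[str]:
    """List imports in `source` that reach into a higher layer than `package`."""
    own = layer_of(package)
    if own is None:
        return []
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]
        else:
            continue
        for name in names:
            target = layer_of(name)
            if target is not None and target > own:
                violations.append(f"{package} must not import {name}")
    return violations


example = "import json\nfrom ui.widgets import Button\nimport core.models\n"
violations = find_violations("services.billing", example)
print(violations)
```

Run in CI, a check like this turns an architectural convention into a failure the agent sees on every run, so a boundary violation is caught before review instead of being explained again in a comment.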

Use “painted doors” to let design explore without contaminating the backend.

A failed designer-led cron/scheduler feature produced a spaghetti mess before harness maturity; the fix was to separate UI experimentation from backend reality via no-op implementations plus instrumentation to gather user signal.
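The "no-op implementation plus instrumentation" idea can be sketched as an interface with a painted-door implementation behind it. Everything named here (`Scheduler`, `PaintedDoorScheduler`, the event fields, the user-facing message) is hypothetical and chosen only to illustrate the pattern; the episode does not describe the actual code.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Protocol


class Scheduler(Protocol):
    """Interface the UI depends on; a real backend would implement it later."""

    def schedule(self, user_id: str, cron_expr: str) -> str: ...


@dataclass
class PaintedDoorScheduler:
    """No-op implementation: records interest, schedules nothing."""

    events: List[Dict[str, str]] = field(default_factory=list)

    def schedule(self, user_id: str, cron_expr: str) -> str:
        # Instrumentation only. In a real system this would go to an analytics sink.
        self.events.append(
            {"event": "scheduler_requested", "user_id": user_id, "cron_expr": cron_expr}
        )
        return "Scheduling is not available yet. Thanks for your interest."


def handle_schedule_click(scheduler: Scheduler, user_id: str, cron_expr: str) -> str:
    """UI handler that only knows about the interface, not the implementation."""
    return scheduler.schedule(user_id, cron_expr)


door = PaintedDoorScheduler()
message = handle_schedule_click(door, "user-1", "0 9 * * 1")
print(message)
print(len(door.events), "interest event(s) recorded")
```

Because the UI code depends only on the interface, the design experiment never touches backend packages, and swapping in a real scheduler later does not require changing the handler.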

WORDS WORTH SAVING

5 quotes

In this world we are today, where the token prediction machine, these lovely, highly advanced models that we have, Codex is fantastic at taking all the best parts of GPT-5 and putting those, uh, you know, codes and words into the world. The code is trivial to generate. It can be generated at arbitrarily parallelism...

Ryan

...we produced a repository with about a million lines of code... But the important thing is not just that we produced an app with a million lines of code. It's that when doing so, literally zero of those lines of code were written by humans.

Ryan

...we can actually move to sort of PRD as code input with the app as compiled output.

Ryan

The new engineering job, I think, is to have everybody be staff engineers.

Ryan

...in multi-agent systems, the correct amount of agents to want to optimize toward is not multi, it is one.

Ryan

“Code is a liability” framing
Harness engineering and context injection
PRD-in-Markdown to shipped PR workflow
Agent autonomy: long-horizon runs and parallelization
Repo legibility: docs trees, exec plans, llms.txt references
Quality gates: tests, lints, review-agent CI matrices
Org redesign: everyone as “staff engineer,” interns named Codex

High quality AI-generated summary created from speaker-labeled transcript.
