Lenny's Podcast

OpenAI's Sherwin Wu: How Codex reviews 100% of OpenAI's PRs

How OpenAI runs Codex on every code review, sharply cutting review time, while its engineers shift to managing fleets of AI agents.

Guest: Sherwin Wu · Host: Lenny Rachitsky
Feb 11, 2026 · 1h 19m

At a glance

WHAT IT’S REALLY ABOUT

AI agents are rapidly reshaping engineering, management, startups, and enterprise automation

  1. Wu reports that Codex is deeply embedded at OpenAI: ~95% of engineers use it daily and 100% of PRs are reviewed by it, with Codex-heavy engineers opening ~70% more PRs.
  2. He argues the engineer role is shifting from writing code to “managing fleets of agents,” requiring new skills in prompting, context management, and preventing agent drift—like “sorcery” with real consequences.
  3. For companies, he warns many AI deployments likely have negative ROI due to top-down mandates without bottoms-up champions; he recommends “tiger teams” of internal power users to drive adoption and best practices.
  4. He predicts major near-term shifts: longer-horizon agents (multi-hour tasks) and big gains in audio/multimodal models, enabling business process automation and potentially a boom of micro-startups supporting “one-person billion-dollar” outcomes.

IDEAS WORTH REMEMBERING

5 ideas

AI is already the default authoring and review layer at OpenAI.

Wu says nearly all code is AI-generated first, ~95% of engineers use Codex daily, and 100% of PRs are reviewed by Codex—making human effort more about steering and verification than typing.

High performers compound faster with AI, widening productivity gaps.

Codex-heavy engineers open ~70% more PRs, and Wu expects the gap to grow as power users learn better workflows and trust models more.

The new core engineering skill is agent management, not syntax.

Engineers run many parallel threads, supervise multiple agent tasks, and must prevent “Sorcerer’s Apprentice” failure modes where agents go off the rails without sufficient guidance.

When agents fail, missing context—not “model stupidity”—is often the root cause.

An internal experiment with a 100% Codex-written codebase shows teams must encode tribal knowledge into repos (docs, comments, structure, .md guidance) because they can’t rely on the “escape hatch” of manual coding.

AI can shrink the pain of code review and CI if you automate the boring parts first.

Wu notes Codex review can cut review time dramatically (e.g., from 10–15 minutes to a few minutes), and AI can auto-fix lint/test issues and rerun CI, collapsing the “get it into production” toil.
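
The "auto-fix and rerun CI" idea above can be sketched as a bounded retry loop. This is a minimal illustration, not OpenAI's pipeline or any Codex API: `CheckResult`, `fix_until_green`, and `demo` are hypothetical names, and the "agent" is a stand-in function that resolves one failure per attempt.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CheckResult:
    passed: bool
    failures: List[str]


def fix_until_green(
    run_checks: Callable[[], CheckResult],
    propose_fix: Callable[[List[str]], None],
    max_attempts: int = 3,
) -> bool:
    """Run checks, hand failures to a fixer, and rerun, up to max_attempts fixes."""
    for _ in range(max_attempts):
        result = run_checks()
        if result.passed:
            return True
        propose_fix(result.failures)
    # One final run so the last proposed fix is actually verified.
    return run_checks().passed


def demo(initial_failures: List[str], max_attempts: int) -> bool:
    """Simulate CI with a hypothetical fixer that resolves one failure per attempt."""
    outstanding = list(initial_failures)

    def fake_checks() -> CheckResult:
        return CheckResult(passed=not outstanding, failures=list(outstanding))

    def fake_fix(failures: List[str]) -> None:
        # Stand-in for an AI agent: pretend the first reported failure gets fixed.
        outstanding.remove(failures[0])

    return fix_until_green(fake_checks, fake_fix, max_attempts)
```

The attempt cap is the design point: when the loop runs out of attempts without going green, the change goes back to a human rather than letting the agent keep iterating unsupervised.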

WORDS WORTH SAVING

5 quotes

Ninety-five percent of engineers use Codex. One hundred percent of our PRs are reviewed by Codex.

Sherwin Wu

Engineers are becoming tech leads. They're managing fleets and fleets of agents.

Sherwin Wu

It literally feels like we're wizards casting all these spells.

Sherwin Wu

This team doesn't have that escape hatch.

Sherwin Wu

The models will eat your scaffolding for breakfast.

Sherwin Wu

Codex usage at OpenAI (AI-authored code, PR review)
Engineers as agent managers (“sorcerers” metaphor)
Stress/failure modes when agents stall; context as bottleneck
Scaling code review and CI via AI automation
Management changes: leverage, top performers, larger spans
One-person billion-dollar startup and second/third-order effects
Enterprise AI ROI: top-down vs bottoms-up adoption
“Models eat scaffolding”: shifting tool/architecture bets
Build for where models are going (capability trajectory)
API/platform stack: Responses API, Agents SDK, evals, UI kits
Near-term model direction: longer tasks + audio breakthroughs
Business process automation outside tech as major opportunity

High quality AI-generated summary created from speaker-labeled transcript.
