Lenny's Podcast: OpenAI's Sherwin Wu on how Codex reviews 100% of OpenAI's PRs
How OpenAI uses Codex on every code review, sharply shrinking review time, and how its engineers now manage fleets of AI agents at scale.
CHAPTERS
Codex inside OpenAI: near-universal usage, AI-first authoring, and PR review at scale
Sherwin shares how deeply Codex is embedded in OpenAI’s engineering workflow: most engineers use it daily, and every pull request gets a Codex review. They discuss how AI changes not just code generation, but throughput and quality gates across the org.
The new engineer job: tech lead as “wizard” managing fleets of agents
Lenny and Sherwin explore how the software engineer role is shifting from writing code to orchestrating many parallel agent threads. Sherwin frames it as a return to the classic “programming as sorcery” metaphor—now made literal by natural-language “incantations.”
Agent friction and “stress when agents aren’t working”: the context problem
They discuss the anxiety of depending on agents and the failure modes that create bottlenecks. Sherwin highlights an internal experiment maintaining a 100% Codex-written codebase, which forces the team to solve problems without manually coding as an escape hatch.
Code review, CI, and “boring work” automation: collapsing time-to-merge
Sherwin explains how Codex shifts engineering effort away from repetitive, annoying tasks like code review and CI/lint triage. With automated review and fixes, the “last mile” of getting code into production becomes dramatically faster.
Trust, verification, and avoiding the “brooms go wild” failure mode
Lenny probes the circularity of Codex reviewing Codex-written code. Sherwin describes a pragmatic stance: humans still inspect, attention is reduced—not eliminated—and internal model variants can provide alternative perspectives.
Engineering management in the AI era: leverage, larger teams, and top performer amplification
Sherwin explains that the EM role hasn’t transformed as radically as IC engineering—yet—though AI is increasing managerial leverage. He argues the productivity spread widens as high-agency, top performers become dramatically more effective with AI tools.
A “surgeon” management model and AI as a corner-looking unblocker
Sherwin shares a core leadership framework inspired by The Mythical Man-Month: make engineers feel like surgeons supported by a team. They brainstorm using AI to proactively detect blockers by scanning internal artifacts and anticipating future constraints.
Unpriced second- and third-order effects: the one-person billion-dollar startup and a boom in micro-SaaS
Sherwin argues people focus too narrowly on the headline concept of a one-person unicorn. The deeper impact may be an explosion of bespoke, vertical tools and a golden age of B2B SaaS enabling ultra-lean companies through outsourcing to specialized micro-startups.
Why many AI deployments have negative ROI: the top-down mandate trap
Sherwin explains why many organizations struggle to see returns: they’re outside the Silicon Valley AI bubble and lack practical adoption skills. Top-down “AI-first” mandates fail without bottom-up champions who adapt tools to real workflows and teach others.
Who belongs on the AI tiger team: technical-adjacent operators, not just engineers
They dive into the composition of effective internal AI evangelist groups, especially at non-tech companies. Sherwin notes the best champions are often “coding-adjacent” operators (e.g., Excel power users) who can translate AI into real operational gains.
When “listening to customers” misleads in AI: models eat scaffolding
Sherwin argues customer feedback can anchor teams to today’s constraints in a space where the models rapidly obsolete tooling. He describes how successive waves (agent frameworks, vector stores, skills files) can become less necessary as models improve.
Build for where models are going: timing product bets and riding capability jumps
Sherwin advises founders to design products that are “80% possible today” and become great as models improve. The goal is to align product architecture with the trajectory of capability gains rather than overfitting to current limitations.
What’s next in 6–18 months: longer coherent tasks and audio’s enterprise breakthrough
Sherwin highlights two near-term arcs: agents that can reliably execute multi-hour tasks, and major improvements in native multimodal audio. He argues audio is underrated because so much real business work happens in conversation.
OpenAI as a platform: startup fears, ecosystem commitments, and API stack overview
Sherwin addresses founder anxiety about being “squashed” by OpenAI and emphasizes the market’s size and OpenAI’s platform orientation. He then outlines OpenAI’s developer stack, from low-level model sampling to agent orchestration and deployment tooling.
Closing mindset: engage without overwhelm, and enjoy the rare “gold rush” window
Sherwin encourages listeners to lean in—build, experiment, and develop intuition for what AI can and can’t do—without trying to track every piece of news. He frames the next few years as an unusually creative, high-energy period in tech.