I Made an OpenAI PM Teach Me Codex For 67 Minutes

Abhi Muchhal is an International Growth PM at OpenAI, the person responsible for how ChatGPT grows in India, Brazil, Japan, and every market outside the US. Before OpenAI, he was a PM at Meta on election integrity, a growth PM at Nubank across Brazil, Mexico, and Colombia, and a founder building real-time translation tools on the OpenAI API. In this episode, he opens his actual Codex setup on camera: the harness, the automations, the prompts that work, and the ones that failed before he figured it out.

Full Writeup: https://www.news.aakashg.com/p/how-to-use-codex-like-an-openai-pm
Transcript: https://www.aakashg.com/how-an-openai-pm-uses-codex-and-image-gen-at-work-and-in-his-personal-life/

---

Timestamps:
00:00 - Intro
01:55 - Episode begins
03:48 - What has Codex unlocked for your PM work?
05:34 - Live demo, building the international growth dashboard
10:04 - Ads
11:32 - How to build in Codex, inputs, outputs, and Playwright
14:52 - Moving away from PRDs to Codex prototypes
21:23 - The three automations running before his day starts
28:37 - WhatsApp computer use demo setup
30:06 - Ads
33:05 - Codex takes action inside WhatsApp in 68 seconds
37:00 - Building a 1040 tax filing app in Codex
43:42 - What drove ChatGPT to 900M weekly active users
47:18 - ImageGen 2, the biggest ELO jump of any model
59:26 - How to break into OpenAI as a PM
1:05:27 - Outro

---

Thanks to our sponsors:
1. Bolt.new - Ship AI-powered products 10x faster - https://bolt.new/solutions/product-manager/?utm_source=Promoted&utm_medium=email&utm_campaign=aakash-product-growth
2. Product Faculty - Get $550 off their #1 AI PM Certification with code AAKASH550C7 - https://maven.com/product-faculty/ai-product-management-certification?promoCode=AAKASH550C7
3. Customer.io - Send smarter messages using your product data - http://customer.io/productgrowth
4. Ariso - Ship AI agents and features faster, with fewer regressions - https://ariso.ai/aakash
5. Jira Product Discovery - Plan with purpose, ship with confidence - https://www.atlassian.com/software/jira/product-discovery

---

Key Takeaways:
1. The harness is what separates Codex users from Codex runners - The connectors, the permissions model, and the skills layer are the three components that make Codex a system rather than a chat tool. Without all three, you are using an expensive autocomplete.
2. Generic prompts hit the wrong data - Abhi's team had separate B2C and B2B tables that both matched "tell me about weekly active users." The generic query returned the wrong answer every time. Specificity is the skill: name the exact dashboard and the exact metric. It looks simple, but it saves a lot of time at scale.
3. Three permission levels - Read tasks and synthesis tasks get full autonomy. Anything going to another human gets your eyes first. Treating permissions as binary, all control or all autonomy, breaks down.
4. The person who cares most builds the skill - One OpenAI growth team built a skill that automates their entire experiment review process. It writes the hypothesis, monitors the run, and prepares the review doc.
5. Real automations run without you - Abhi runs three automations before he opens a single dashboard: a Slack triage, a 9:30 AM self-refreshing growth dashboard pulling from 7-8 sources, and a weekly stakeholder update that writes its own first draft. He reviews, makes edits if needed, and sends.
6. Prototype before you document - Build the working prototype first, then write the companion FAQ. Showing engineers something that runs changes the conversation from whether to build to how to build it.
7. India is OpenAI's second-largest market, and under 10% of working adults are knowledge workers - The ChatGPT use case that drove US growth does not reach the same share of people in the markets driving the most new users.
8. The WhatsApp computer use loop ran in 68 seconds - Point Codex at the WhatsApp desktop app. It reads what you missed, identifies action items, checks your calendar, and types the draft. One tap to send. Every PM building for international markets should run this workflow.
9. Speaking evals is the key to breaking into a frontier lab - Name a capability you care about. Describe how you would measure it. Say how you would know if the model improved. You need to understand why evals exist and what a good one measures.
10. Building something real is non-negotiable for frontier lab applications - Abhi had a live Chrome extension running on the OpenAI API at the time of his application.

---

Where to find Abhi Muchhal:
LinkedIn: https://www.linkedin.com/in/abhimuchhal/

OpenAI:
LinkedIn: https://www.linkedin.com/company/openai/

Where to find Aakash:
X: https://x.com/aakashgupta
LinkedIn: https://www.linkedin.com/in/aagupta/
Newsletter: https://www.news.aakashg.com

#AIPM #OpenAI #Codex

---

About Product Growth: The world's largest podcast focused solely on product + growth, with 200K+ listeners. Subscribe and turn on notifications.

Guest: Abhi Muchhal
Host: Aakash Gupta

May 31, 2026, 1h 7m

CHAPTERS

0:00 – 1:55
Why this episode matters: a rare look at how OpenAI PMs actually work
Aakash frames the episode around OpenAI’s explosive growth and the lack of public examples showing real OpenAI PM workflows. He introduces Abhi Muchhal (International Growth PM at OpenAI) and previews live Codex-based building and automation demos.
1:55 – 3:48
Codex as an “agent”: what it unlocks for PM leverage
Abhi explains the evolution from chatbot → collaborator → agent, and why Codex changes his day-to-day PM work. He highlights two main value streams: automating repetitive PM tasks and enabling PMs to build prototypes/features to 70–80% without waiting on engineering bandwidth.
3:48 – 5:34
Live demo: an international growth dashboard that replaces “seven dashboards and a headache”
Abhi demos a web app he built with Codex that consolidates multiple internal analytics sources into one international growth cockpit. The dashboard supports per-country views, headline metrics, strengths/risks, and deeper competitive benchmarking, updated daily via automation.
5:34 – 10:04
Generalizable takeaway: synthesis + TL;DR beats raw dashboards
Aakash challenges the “why not just use Databricks?” question, and Abhi clarifies the core principle. Codex adds value by aggregating cross-tool data and generating interpretation—reducing cognitive load and turning metrics into prioritized takeaways.
10:04 – 11:32
How to build with Codex: defining inputs/outputs, previews, and Playwright testing
Abhi walks through the build flow inside Codex: specify desired output behavior and list data inputs/connectors, then let Codex scaffold the app. He shows how Codex runs local previews and uses Playwright/browser screenshots to self-diagnose UI issues and iterate quickly.
11:32 – 14:52
From PRDs to prototypes: a new PM workflow (with a companion FAQ doc)
Abhi describes replacing long PRDs with prototypes as the primary artifact for alignment and feedback. He still maintains a lightweight companion document to cover hypotheses, success metrics, and stakeholder concerns, but the prototype becomes the “main show.”
14:52 – 21:23
Shipping mechanics: working locally vs. real repo, and how PMs get to 80%
Abhi explains when he builds locally (internal tools) versus working against the actual ChatGPT codebase. A key tactic is asking engineers for the closest existing reference in the repo, pointing Codex to it, and iterating to a real PR engineers can refine.
21:23 – 28:37
Codex use cases and honest limitations: repetitive automation vs. net-new building
Abhi buckets Codex use into (1) automating repetitive PM work and (2) unlocking new capabilities like dashboards and prototypes. He also shares failure modes, namely ambiguity in data definitions and imperfect signal-to-noise in summaries, both of which require human review.
28:37 – 30:06
OpenAI’s Slack-heavy operating system: three pre-day-start automations
Abhi shows how Codex fits into OpenAI’s communication workflow by automating Slack triage, dashboard refresh, and weekly updates. These automations reduce missed messages across time zones and create reliable, repeatable reporting cadences.
30:06 – 33:05
Harness + skills: why “the Codex harness” is the real differentiator
Abhi argues the biggest unlock isn’t just the model—it’s the harness: connectors, skills, and repeatable workflows. He describes a team-built “experiment review” skill that ingests Statsig, tracks progress, drafts hypotheses/postmortems, and proposes recommendations.
33:05 – 37:00
Personal agent demos: Codex “computer use” with WhatsApp + Calendar actions
Abhi demonstrates Codex computer use to triage WhatsApp messages and identify actionable items. He then pushes it further: Codex reads a message, checks Google Calendar availability, drafts a reply in WhatsApp, and leaves final sending to the user for control.
37:00 – 43:42
Building a 1040 tax-filing app with Codex—and how to think about safety
Abhi shares a personal project: a web app that ingests tax documents and outputs a complete 1040 form. He cross-checked it with his accountant, found a missed income source, and discusses safe usage via data controls and permissioned action levels.
43:42 – 47:18
International growth at OpenAI: serving the majority of humanity
Abhi explains the mission-driven rationale for international growth: most users live outside the US and have different contexts and workflows. He outlines his scope across three layers—model improvements, product use-case surfacing, and top-of-funnel storytelling/partnerships.
47:18 – 59:26
What drove ChatGPT to ~900M weekly actives: search, multimodality, and ImageGen
Abhi frames growth as expanding beyond knowledge workers and students, especially in markets where knowledge workers are a small minority. Feature breakthroughs like Search (fresh info) and ImageGen (multimodal, low-text friction) broaden relevance globally.
59:26 – 1:05:27
ImageGen 2 deep dive: biggest quality jump, multilingual text, and pro editing workflows
Abhi showcases ImageGen 2’s improvements: realism, multi-image storytelling, better multilingual character rendering, and finer edit control. He shares practical tips like using “thinking” mode, region-based edits, and ratio formatting, while noting steerability remains an area to improve.
1:05:27 – 1:07:06
Breaking into OpenAI as a PM: core skills + living AI + speaking evals
Abhi outlines what matters for PM candidates: core PM fundamentals still apply, but you must actively use AI tools and understand frontier dynamics. He emphasizes evals as the “currency of progress” and shares his personal path—international experience plus a builder mindset from side projects.
