EVERY SPOKEN WORD
70 min read · 13,577 words
- 0:00 – 1:55
Intro
- AMAbhi Muchhal
Just this week, I got asked to write a PRD about a new platform investment we wanted to make. After writing the doc for 20 minutes, I was like, "You know, this is boring. Like, I want to build the thing."
- AGAakash Gupta
ChatGPT's growth has become a global story. They crossed over 900 million weekly active users. It might be the fastest-growing app ever to that number. It is a wild story, and part of that story is their international growth story. Meet Abhi Muchhal, a PM on international growth at OpenAI. In this episode, he's gonna show you how he uses Codex at work, how he has used it to help drive the growth, and some of its latest features so that you can use it like a pro, not a beginner. I don't think there is a single piece of video on the internet that actually shows how OpenAI PMs work.
- AMAbhi Muchhal
But with Codex, there's now many examples where I have been able to take a feature to 70, 80%. Most of humanity does not live in the US. Most of humanity does not live in the developed world. It lives in India, Latin America, Southeast Asia, and we care deeply about making sure that all the tools we build benefit those users.
- AGAakash Gupta
If you stay till the end, you'll get to hear what his international growth PM role is actually like, as well as how being a PM at OpenAI is different from his prior roles. Before we go any further, do me a favor and check that you are subscribed on YouTube and following on Apple and Spotify Podcasts. And if you want to get access to amazing AI tools, check out my bundle, where if you become an annual subscriber to my newsletter, you get a full year free of the paid plans of Maven, Arise, Relay app, Dovetail, Linear, Magic Patterns, DeepSky, Reforge Build, Descript, and Speechify. So be sure to check that out at bundle.aakashg.com. And now into today's episode.
- 1:55 – 3:48
Episode begins
- AGAakash Gupta
Today we're in for a treat. As you know, OpenAI is at the center [laughs] of the AI revolution that has been affecting us all as PMs since they launched ChatGPT in 2023. What the team has done is pretty crazy, and it's rare to get an inside look at how an OpenAI [laughs] PM actually works. In fact, I don't think there is a single piece of video anywhere on the internet that actually shows how OpenAI PMs work. Today we get the pleasure to sit down with Abhi Muchhal. He has been a PM at Meta. He has been a PM at Nubank. He has been a head of product at Tenet as well, and now he is an international growth PM at OpenAI. He's gonna open the covers like has never been done before, show his actual Codex setup, show an actual app he built and uses inside OpenAI as a PM. So you're gonna get to see how OpenAI PMs work, actual work products that they have, as well as learn about the PM job. If you stay till the end, you'll get to hear what his international growth PM role is actually like, how they've driven that international growth, as well as how being a PM at OpenAI is different from his prior roles. Abhi, welcome to the podcast.
- AMAbhi Muchhal
Thanks for having me, man, and followed your work for a long time, so super honored to be here.
- AGAakash Gupta
Oh, that means a lot. I really appreciate [laughs] it. Sometimes as content creators we don't know how the work is landing with those at the frontier, and that's what I wanna talk about today. In fact, where I wanna start with is if anybody has been observing OpenAI at all, they have seen that in the last month or two Sam has been talking about Codex. Codex, Codex, Codex. It seems like you guys have invested a lot of effort into Codex. Six months ago, I told PMs Codex is the best way to use ChatGPT for their use cases.
- 3:48 – 5:34
What has Codex unlocked for your PM work?
- AGAakash Gupta
So what I want to understand is, as someone inside OpenAI, a PM inside OpenAI, what has Codex unlocked for your PM work?
- AMAbhi Muchhal
Yeah. So I think taking a step back, when ChatGPT launched a few years ago, it started as a chatbot. Last year, as we added different tools to it, including connectors, it became a collaborator. And with Codex, it has truly become an agent. And what that means is I can give it parts of my job to do, and it will do it end to end. And the meaningful differences for me there are I'm saving hours of my day, um, be- uh, especially on repetitive tasks that I'm doing it ag- again and again, but also it is enabling me to do the things that I've never been able to do beforehand. Um, so on the former, there's, in a PM's life cycle, there's things that you have to do every week or every month, right? You have to maybe prepare for a review or write a weekly update or send a note to people on following up on things. All of that is automated by Codex now. In addition to that, I was never a great engineer myself. I studied computer science at Michigan, but haven't really been coding since then, and I always felt intimidated, especially with the caliber of engineers that are here, as to wh- how I could contribute. But with Codex, there's now many examples where I have been able to take a feature to 70, 80%. It's like, cool, I'm excited about this. Engineering does not have bandwidth. I'm just gonna build it, and I'm able to take it to 80% and then let the engineers take it from, to the last final, uh, to the final mile. And that has been super empowering, both because I feel like I've turned from just a product manager to now also a builder, and I think it is inspiring to my team because it allows me to go from just writing docs to giving them functional prototypes about what we should be building.
- AGAakash Gupta
What would you say is, like, the single highest leverage thing that Codex has enabled you to do? That wasn't possible before.
- 5:34 – 10:04
Live demo, building the international growth dashboard
- AMAbhi Muchhal
Absolutely. So, um, I'd love to actually show you this. But before that, some, some context. So my job is at leading international growth, involves thinking about how ChatGPT is growing around the world. We are many, many different countries. And as your audience would know, growth has many layers of the funnel, right? We think about top of funnel acquisition, then activation, then engagement, then retention, then resurrection. We also think of competition. All of this information is in different dashboards at OpenAI. So me and my team would have to spend minutes and hours loading these different Databricks dashboards, seven, eight different sources, trying to figure out how to synthesize this. It was crazy. I feel like we were just getting lost in the noise. A few weeks ago, as I was seeing other people starting to use Codex, I was like, "You know what? What if I could build a dashboard that just one single web app that combines all these sources, and then on top of that synthesizes what are the important elements?" And voila, I was actually able to do it with Codex. So I wanna show you a little bit. We can get a virtual drum roll.
- AGAakash Gupta
[laughs]
- AMAbhi Muchhal
Uh, let's see if this works. Okay, cool. Let's hope the demo gods are there. So what you're seeing on my screen is a version of this dashboard. Now, uh, as much as I'd love to let you see the e- external data, uh, the internal data, I've, I've modified it a little bit so that you can see the structure without revealing any of the internal knowledge here. So what you see in this dashboard is basically the exact dashboard we use internally to think about international, right? And at the top here, I've made up, you know, seven or eight countries that we can think through. Now, all these are made up numbers, but you can see how these, uh, correspond to real life. So I can flip between different countries and see, okay, what is happening? What is, like, the top line metrics I care about? How are weekly actives doing? How is the penetration? How has it been growing from one thing to another? Then I can go one level deeper, and Codex has categorized for me what are the strengths and the risks. So here's the things that are going well, and then relative to the rest of the world and other peer countries, which Codex has figured out what is the peer set for this country, what are the places we could be improving? And that gives me and my team a snapshot of here's the areas we should be focusing on, or what is not trending well. We can go even deeper and do deep dives on, okay, here's how our market share is. Here's how new user growth has been. Here's how we're doing relative to other benchmarks, and a little bit more detail. What is cool about this is that I can give someone an exact snapshot, but then if they wanna go deeper, they see all the relevant stats and how it's comparing to competition and how it is comparing to our performance in other countries. And the final element that is awesome about this is that this is updated every single day through an automation at 9:00 AM every morning. So Codex runs it, I don't have to do anything. 
This has been a game changer for not only me, but for my entire team, because now so many more people can look at this data in one place and are able to make better decisions about how we should be investing our time and what user problems we should be solving.
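A minimal sketch of the synthesis step described above: merge per-country metrics from several sources, then label each metric a strength or a risk relative to a peer set. The source names, country codes, numbers, and the "at or above the peer median is a strength" rule are all assumptions made for illustration (the rule also assumes higher is better for every metric); the episode does not show how the real dashboard computes this.

```python
from statistics import median

# Hypothetical per-source extracts, keyed by country. In the episode these
# live in separate Databricks and Tableau dashboards; the numbers are invented.
engagement = {"A": {"wau_growth": 0.12}, "B": {"wau_growth": 0.03}, "C": {"wau_growth": 0.07}}
competition = {"A": {"market_share": 0.40}, "B": {"market_share": 0.55}, "C": {"market_share": 0.48}}


def combine(*sources):
    """Merge several {country: {metric: value}} extracts into one table."""
    combined = {}
    for source in sources:
        for country, metrics in source.items():
            combined.setdefault(country, {}).update(metrics)
    return combined


def strengths_and_risks(table, country, peers):
    """Label each metric a strength or a risk versus the peer-set median."""
    strengths, risks = [], []
    for metric, value in table[country].items():
        peer_median = median(table[p][metric] for p in peers)
        # Assumption: a higher value is better for every metric.
        (strengths if value >= peer_median else risks).append(metric)
    return strengths, risks


table = combine(engagement, competition)
print(strengths_and_risks(table, "A", peers=["B", "C"]))
```

In this toy data, country A grows faster than its peers but trails them on market share, so the first metric lands in strengths and the second in risks.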
- AGAakash Gupta
So I'll act as a skeptic here. This already lived in Databricks. What is the unlock here? How should PMs be thinking about, okay, I have some scattered dashboards. What's the generalized principle that people should be using Codex for?
- AMAbhi Muchhal
Yeah. I think that the broader takeaway here is a lot of times, especially in working in growth, you're making decisions by looking at multiple different sources of information, and there's a cognitive overload of trying to piece together how all these things fit together. So if it was just one Databricks dashboard, I wouldn't have to do this. But because it's seven, eight different sources, all with different cadences, and by the way, some of them are different tools. There's a Tableau dashboard for two of these things. There's a Databricks dash for six of these things. I was able to bring it together in one place and also provide the TLDR summary of what are the things that are important, because that doesn't exist in a dashboard. So it's a combination of the synthesis and the takeaways that have really become a game changer for my work and my team's work.
- AGAakash Gupta
Okay, so now I understand the value prop here. What we're doing is we're synthesizing data across multiple sources. We're leveraging LLMs and the connectors. If I wanted to build something like this myself, can you maybe walk us through, even if we can't see the end-to-end workflow, the steps, the prompts, what it would look like in Codex to end up with this type of a work product?
- AMAbhi Muchhal
Yeah, absolutely. I'd love to show you. So, um, I w- I'm gonna walk you through the journey that it took me to build it, and hopefully this inspires others that you could do the same with Codex today.
- 10:04 – 11:32
Ads
- AGAakash Gupta
Here's the dirty secret about prototyping. You spend two weeks building a prototype. You validate your assumptions. Engineering loves the direction. Then what happens? You throw the whole thing away. Bolt changes this completely. When you prototype in Bolt, you're not building throwaway mockup. You're building real front-end code that integrates with your existing design system. So when you hand it to engineering, they don't throw it away, they ship on top of what you've built. I use Bolt every single day. I host my LAN PM job cohort on it, and honestly, I'm up till 2:00 AM some days just vibing in the tool, having fun, and building. That's when you know a product is good, when you're using it past midnight, not because you need to, but because you want to. Check out Bolt at bolt.new. Link in the show notes. I hope you're enjoying today's episode. Are you interested in becoming an AI product manager, making hundreds of thousands of dollars more, joining OpenAI and Anthropic? Then you might wanna do a course that I've taken myself, the AI PM Certificate ran by OpenAI product leader Miqdad Jaffer. If you use my code and my link, you get a special discount on this course. It is a course that I highly recommend. We have done a lot of collaborations together on things like AI product strategy, so check out our newsletter articles if you want to see the quality of the type of thinking you'll get. One of my frequent collaborators, Pavel Hern, is the Build Labs leader, so you're gonna live build an AI product with Pavel's feedback if you take this AI PM certificate. So be sure to check that out. Be sure to use my code and my link in order to get a special discount. And now back into today's episode.
- 11:32 – 14:52
How to build in Codex, inputs, outputs, and Playwright
- AMAbhi Muchhal
If you're able to see my screen here, I basically ask Codex, "Hey, I wanna create a web app which shows how I monitor growth across key priority markets with ChatGPT. The audience are internal stakeholders without, within OpenAI." Then I identify what I want the output to be. Here's the things that I care about, right? Let me switch between different tabs in different countries, as I showed you. Show me headline stats, highlight key strengths and weaknesses, and then have, like, a red and green thing of what's going well and what's not. I'm a visual learner. I want it to stand out. So that was the output. Then I said, "Here's the inputs." So I use a few different connectors, and I was like, "Hey, the competition dashboard with Tableau, the exec dashboard in Databricks," and so on and so forth. And the, for this case, I just said, "Hey," since I want to demo to your audience, like, "This is an external demo, so we're not using internal data." But I didn't do that when I was trying it out.
- AGAakash Gupta
Yeah.
- AMAbhi Muchhal
But the key thing I wanna highlight here was just to clarify to C- Codex, what are the inputs and what are the outputs, and then it sort of runs with it. So from that point forward, it went ahead and started building it. Um, you can see that it was, like, thinking a bunch. It added, it created a synthetic demo, added country tabs, added the views for the four sources of data. It added a few notes, and then the cool part is it ran a smoke test to validate that it looked that it was working itself, without me even telling it anything. So that's, that's pretty cool. Then I was like, "Okay, we've got something cooking here. How do I make sure that it's working well for me?" So I said, "Run it on terminal locally and show it to me." So it spun it up on localhost, got a web preview. Great. Then I was like, "All right." Um, I opened it, but, like, the background was, like, not what I wanted. It was, like, too dark. I felt like it wasn't, like, on brand. So I was like, "Make the background, like, openai.com. Let's stick to the brand that we have, and-
- AGAakash Gupta
Yeah
- AMAbhi Muchhal
... test it with the fixes so it can, so I can see that it's working."
- AGAakash Gupta
And so pe- if people don't know Playwright, it comes up a lot when you're AI coding. It basically is allowing Codex to see what it actually looks like to users, right? It kinda takes, like, a screenshot almost of it so that it can see it.
- AMAbhi Muchhal
Exactly, yeah. So C- Codex opens up the web browser, it takes a screenshot, sends that information, or maybe it's a series of screenshots, sends that information back to Codex for it to self-identify what are the UI issues that have been there. And so the prior to pl- you know, being able to use tools like Playwright, I would have to go open the browser myself, be like, "This nit, this nit, this nit." Now it's just, like, made it much more seamless. But what's also cool is that since, you know, we've also now created a browser within Codex, and so I can just see this preview right here. So I asked it to, you know, open the preview, and I could see, okay, here's the things that are coming up. The brand colors look okay. The data looks okay. So without going anywhere else, I can just do an end-to-end workflow in Codex.
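For readers who have not used Playwright directly, here is a rough sketch of the screenshot loop described above, written against Playwright's Python sync API. It is not how Codex invokes Playwright internally, which the episode does not show. The `screenshot_name` helper, the `screenshots` folder, and any local URL passed in are assumptions for illustration, and running `capture` requires `pip install playwright` followed by `playwright install chromium`.

```python
from pathlib import Path
from urllib.parse import urlparse


def screenshot_name(url: str) -> str:
    """Turn a URL into a flat file name for the saved screenshot."""
    parsed = urlparse(url)
    stem = (parsed.netloc + parsed.path).strip("/").replace("/", "_").replace(":", "_")
    return f"{stem or 'page'}.png"


def capture(url: str, out_dir: str = "screenshots") -> Path:
    """Open the page in headless Chromium and save a full-page screenshot."""
    # Imported here so the helper above works without Playwright installed.
    from playwright.sync_api import sync_playwright

    target = Path(out_dir) / screenshot_name(url)
    target.parent.mkdir(parents=True, exist_ok=True)
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto(url)
        page.screenshot(path=str(target), full_page=True)
        browser.close()
    return target
```

With a local preview already running, calling `capture` on its URL would write a PNG that an agent (or a person) can inspect for UI issues instead of opening the browser by hand.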
- AGAakash Gupta
And I think this is a really important point for people, because a lot of PMs, their sort of first entry point into AI tools was around February last year, when they were told, "Use Bolt. Use Lovable." At this point, Codex is now able to build previews and show you previews directly in the app. So if you want, you can actually use Codex for prototyping. I'm curious, do you use Codex for prototyping?
- 14:52 – 21:23
Moving away from PRDs to Codex prototypes
- AMAbhi Muchhal
100%. Um, just this week I got asked to write a PRD about a new platform investment we wanted to make. After writing the doc for, like, 20 minutes, I was like, "You know, this is boring. I just wanna build, like, I wanna build the thing," right? So I built a prototype and then showed it to people, and that created a better discussion, 'cause I think everyone's a visual learner and wants to see what the end product looks like. And so I've moved away from writing PRDs to just creating prototypes.
- AGAakash Gupta
Whoa. We gotta pause there, because that's, like, the hottest topic [laughs] in product management. You moved away from writing PRDs and just creating prototypes. I guess someone might say, "Hey, well, the PRD, you probably still need that to figure out what's your null hypothesis, what your success metrics and your guardrails are, use it as a document for some stakeholder alignment, and make sure that privacy and legal and compliance and, for you guys, probably the safety, integrity, red team concept is all checked off on it." If you're just on prototype, how do you take over those key tasks of a PRD?
- AMAbhi Muchhal
Yeah. I think that maybe the key point I wanna make is that the end output isn't the document, it's the product you're trying to build, and that is conveyed with the prototype. Along with the prototype, what I would call, I have, like, a companion doc that explains, "Here's what's happening," and it sort of plays the role of what you're talking about in a PRD. It's like a quick spec of, "Here's the big..." It's kind of like an FAQ. As you look at this prototype, you may have these 10 questions. Cool. And these are the 10 questions, and it sort of covers that. So yes, I still have an accompanying document, but it is a companion, but the main show is the product itself, and that's what people are initially reacting to.
- AGAakash Gupta
Okay. It makes a lot of sense. So you have been a PM in the pre-AI era. So how would you compare and contrast, like, let's rewind back to when you were at Meta, for instance, versus now what you're doing at OpenAI. Maybe you can walk us through the product development process and life cycle, like when does prototyping come in, what does it replace along the steps of product development?
- AMAbhi Muchhal
Absolutely. So I think the pre-AI era, the usual process was you started with a hypothesis. You co- collected some data, built some conviction, and then to convince everyone around you that this was worth doing, you wrote something like a spec or PRD. Different companies have different names. Which covered all the key elements of what needed to happen. After that, there was some internal alignment, then you convince the designer, "Hey, let's visualize this to people." Then you would work with the designer, do a few revs. "Okay, this is what it looks like." Then you go talk to engineers. "Can we build this? Here's the things we iterate," right? And so I would say that a lot of the same thinking still needs to happen. But we're moving faster now because I, without needing to leverage valuable designer engineering time, are, am able to start with, "Here's what I think this could look like." And obviously everyone knows, like I'm not a great engineer, I'm not, I'm definitely not a good designer, so everyone takes it with a grain of salt, but here's... I want you to react to something. And that shortcuts a lot of the, the steps that I was taking beforehand, 'cause I can show, here's what I think this need can look like, without having to waste eng- designer engineering time. And then we're just talking about how we could build it, or, and if this makes sense.
- AGAakash Gupta
Got it. And at OpenAI, like you mentioned you'd take the code to 80%, so would you be literally working off, would your Codex instance be hooked up to the main GitHub and be working off the main code base so it could use the real components and everything, and would you be shipping like a pull request, or would you just like kind of have like a local repo inside your local so that you could show them the 80% done? What are the mechanics of that?
- AMAbhi Muchhal
Yeah, great question. So depends on what the purpose is. For what I built, this demo website that I just showed you all, I was mostly using, uh, working locally, because it didn't need to integrate with the ChatGPT infra. It was something more for internal consumption, right? But let's say there's like a new feature that I want to ship for users in India, and I want to figure out, how can I, in a very scrappy way, get people to understand what this might look like? In that case, I would actually, um, use GitHub, s- pull from our broader repo with all the ChatGPT code base, then point it to, and this is the key part, point it to something similar that it should build upon. So the question I often ask my engineers when I'm going down this rabbit hole of like, "Abhi's gonna go build this thing," I'm like, "Hey, what's the most similar thing we have done to this?" And they're like, "Oh yeah, this. Just look at this GitHub." So I take that, point that to Codex, because then it knows, here's the thing it needs to build on top of, rather than spending a bunch of time navigating the code base. So I think that's the critical thing, is get a reference from an engineer of what is the most similar thing we've done. I do a few revs with Codex, I get to a point where I have a pull request out, and then I go to the engineers and be like, "Hey, help me understand if this is the best way to do this. Maybe there's something's failing in my merge. Can you help me out?" And so it just accelerates that entire process.
- AGAakash Gupta
Really helpful insight. Okay. Prototyping and dashboards, those seem like two major use cases for Codex. You talked about a couple of the other ones. Maybe you can just enumerate them for us into buckets, and what I'd also be interested in as you enumerate that is, what is it still not good at? Like, what are the honest limitations?
- AMAbhi Muchhal
Yeah, so I'm gonna start with talking about, a little bit about how I use Codex in work, but I also have some cool things I could show how I use it in my personal life that's inter- in- interesting to your audience. So I would say in, in work, I put it into two buckets. There is the se- set of things I was already doing but were repetitive tasks: updates, dashboard reviews, synthesis, preparing slides for a external or internal presentation, preparing slides for a deck, uh, for a presentation I need to give. All those things, Codex is now doing end-to-end, and the things of those that are repetitive, it is automating. All I do is point it to the right things. And then there's like the net new stuff, which I would call becoming a builder, and those are things I was not doing beforehand, such as creating prototypes or creating dashboards, and Codex has sort of enabled me to do that. So that's on the work front. Do you want me to talk about the personal front as well?
- AGAakash Gupta
So really quickly marinating on the work front for a little bit more, 'cause I wanna open up a couple more layers there. What is like the operating system for a PM at OpenAI? Is it Slack, or do you guys use a system like that, and does Codex help you with that?
- 21:23 – 28:37
The three automations running before his day starts
- AMAbhi Muchhal
Great question. So maybe I want to show you, actually, it's a great example that answers this. I want to show you a little bit of what are my automations every single day. As you suggested, we live and breathe in Slack. We use Slack obsessively. I've not seen a company that is so addicted to Slack as we are. And, um, it, like all forms of communication, including actually with some external partners, we've brought them into Slack 'cause we don't, we, we just operate that way. Now, every PM wakes up in the morning, and especially with me, I work with people in different time zones, and I'm just overwhelmed with like the amount of Slack notifications and pings, and I inevitably miss something. Like, and then I get, like you don't want to be the guy who's like-
- AGAakash Gupta
Yeah
- AMAbhi Muchhal
... gets a response, "Hey, I sent you that message three days ago. What happened?" You're like, "Oh, crap, I missed it," like in the s- all the Slack notifs. So what I actually built is an automation which is my, a daily Slack inbox triage for me. It looks at all the key channels. I've told it, "Here are the important people." Like, "Hey, make sure if Aakash sends me a missio- message, tag that. And tell me the things that I haven't read that I should read, and tell me the things that I haven't responded to that I should respond." I get this once a day in the morning, and that's how I start my day. So that's like a really cool, um, automation that has saved my life. The other automation you see here is the dashboard that I showed you earlier. I've got this on an automation where every morning, it's a daily at 9:30AM, here's the automation that powers that dashboard, pulls all the data. And then finally, the, I was talking about weekly updates, right? So we have to write a weekly update to our stakeholder group talking about what's going well, what's not, how's that. Even this one pulls from a lot of data sources. A lot of it is in Slack, some in Google Drive, some in Notion, some from dashboards. So I've got an automation running that pulls all these things, puts it together, creates a weekly update, and posts it into Slack for me to review and send it out to everyone.
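The triage logic described above can be sketched in a few lines. This is only an illustration of the rules mentioned (key channels, important people, unread versus unanswered); the `Message` fields, channel names, and sample messages are hypothetical, and a real automation would pull messages through a Slack connector rather than a hard-coded list.

```python
from dataclasses import dataclass


@dataclass
class Message:
    channel: str
    sender: str
    text: str
    read: bool         # has the user opened it?
    replied: bool      # has the user answered it?
    mentions_me: bool  # was the user asked something directly?


def triage(messages, priority_senders, key_channels):
    """Split relevant messages into 'read this' and 'respond to this' buckets."""
    to_read, to_respond = [], []
    for m in messages:
        if m.channel not in key_channels and m.sender not in priority_senders:
            continue  # noise: neither a key channel nor an important person
        if m.mentions_me and not m.replied:
            to_respond.append(m)
        elif not m.read:
            to_read.append(m)

    def rank(m):
        # False sorts before True, so priority senders float to the top.
        return m.sender not in priority_senders

    return sorted(to_read, key=rank), sorted(to_respond, key=rank)


# Hypothetical inbox for illustration.
inbox = [
    Message("#intl-growth", "Priya", "Launch checklist is ready", False, False, False),
    Message("#random", "Dana", "Lunch?", False, False, False),
    Message("dm", "Aakash", "Can you confirm the recording time?", True, False, True),
]
to_read, to_respond = triage(inbox, {"Aakash"}, {"#intl-growth"})
print([m.text for m in to_read], [m.text for m in to_respond])
```

The split between the two buckets mirrors the two asks in the prompt: things not yet read that should be, and things not yet answered that should be.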
- AGAakash Gupta
Wow. Is there like a art, because I've heard some people who connected it to Slack, and I think even sometimes I face this issue. Is there like an art to like giving it the right context to navigate Slack correctly? Like sometimes your decision to ship like a particular international growth feature might live inside the 38th message [laughs] on a Slack thread, where the legal team is like, "Yep, okay, now it's good to go." Is there any tips or tricks around the Codex-Slack integration to make that really work well?
- AMAbhi Muchhal
Yeah, this is a good question. So like with anything, when it comes to ChatGPT or Codex, context is king. And context does not only mean point it to these connectors, like use a Slack connector. It also doesn't only mean point it to these Slack channels. It's giving it information of what is the kinds of posts that it should index that are important. So anything, for example, I may say anything that talks about progress on evals or metrics is important. Anything that's like a net new learning, anything that's flagged as a block- blocker is an important thing. But to, connecting to a question you asked me earlier, I will be honest that I think this is one place where I don't think we're perfect yet. I still think that we struggle with the separation of signal to noise. And so this is why instead of asking Codex to just directly post the update to my stakeholders, I ask it to send a draft to me, and then oftentimes I'm like, "Okay, these three things totally made, made a ton of sense. This one thing, probably not that important. It missed this thing I should add. And then here's slightly different framing I'm gonna change because I've, I know that in a hallway conversation my boss asked me I, that she wanted me to cover this."
- AGAakash Gupta
So that's one limitation. I'm curious, are there any other mistakes or things that you tried to do in Codex that maybe you're not doing anymore that people could learn from just to accelerate their learning curve with Codex?
- AMAbhi Muchhal
I think another failure mode is regarding data sources. So as you might imagine, pretty big product. We have a B2C business, we have a B2B business as well, and we have some data tables that look very similar but are very, very different, right? You could talk about like weekly active users when it comes to consumer. You could talk about weekly active users when it comes to business and enterprise. And a failure mode is giving a very generic query to Codex saying, "Tell me about how weekly active user growth has changed." And that could be interpreted in many different ways, which are all correct. And so I think that the thing that I've learned for now is that I need to be very precise with it about what is the kinds of things I want, and ideally point it to what dashboard. I think s- I still think that right now, with an ambiguous prompt, it won't be effective at getting the right data.
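One way to picture the precision problem described above: the same metric name resolves to different tables depending on the business segment, so a request that omits the segment should be refused rather than guessed. The catalog, table names, and `resolve` function below are invented for illustration and do not reflect OpenAI's actual data model.

```python
# Hypothetical catalog: one metric name, two very different tables.
CATALOG = {
    ("weekly_active_users", "consumer"): "consumer.wau_daily",
    ("weekly_active_users", "business"): "business.wau_daily",
}


class AmbiguousMetric(ValueError):
    """Raised when a metric name matches more than one segment."""


def resolve(metric, segment=None):
    """Return the single table for a metric, or refuse if the request is ambiguous."""
    matches = {seg: table for (name, seg), table in CATALOG.items() if name == metric}
    if not matches:
        raise KeyError(f"unknown metric: {metric}")
    if segment is None:
        if len(matches) > 1:
            raise AmbiguousMetric(f"{metric} exists for segments {sorted(matches)}; pick one")
        return next(iter(matches.values()))
    return matches[segment]


print(resolve("weekly_active_users", segment="consumer"))
```

Naming the segment, or better the exact dashboard, in the prompt plays the same role as the `segment` argument here.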
- AGAakash Gupta
Awesome. And then final question on the work Codex setup. Have you developed, like Ryan Lopopolo wrote this amazing harness engineering piece that you guys had on your blog, for instance. Like, have you developed... What have you developed in your own personal PM harness for Codex? There's obviously this connectors component. Have you developed skills or like a really deep AGENTS.md file? What are those things within the harness that people should know about?
- AMAbhi Muchhal
Yeah, great question. I think that one of the takeaways for me, learning for the me for the last three months is that for the longest time, we as an industry talked all about the model. Um, we also talked a little bit about the product, and both of these continue to be important, but the big unlock has really been the Codex harness. 'Cause I think that is really what is powering a lot of what is being done today and is enabling me to pull from all these data sources to build these prototypes. I think the harness has truly been like a differentiator. In terms of how, how I've used it, um, I'm gonna give credit to some of my teammates 'cause I think they've used it even better than I have. On any growth team, you make decisions by ma- running experiments. And to make sure we're doing right by our users, we have a pretty rigorous experiment review process. Before you run an experiment, you write like a quick doc explaining, "Here's the hypothesis, here's the things you're, you're trying to go for." Then you set it up on Statsig, you run the experiment for a few days, and then after that you need to write a postmortem of how it went, what do you recommend the decision to be, and take it to a, a live meeting where we discuss the trade-offs. Now, what I describe here has a few different steps, and a, a few engineers on my team actually built a skill for growth engineers to do experiment reviews. So all you'd need to do is to point to the Statsig now. It writes the hypothesis itself, it monitors how the experiment is going, updates it, and then when whenever we feel like the engineer's ready to present an experiment review, provides a summary and comes up with recommendations and things that we should watch out for. So that's been an amazing skill that has really, um, been a game changer for productivity for our team.
- AGAakash Gupta
Awesome skill. And who is the right person to author that skill then? Is it like a product analytics expert that does like the, owns that skill and creates it as a shared team resource?
- AMAbhi Muchhal
I think the beautiful thing about Codex is that the person who cares the most is the, is the one who makes the skill. It doesn't matter if it's an engineer, an analyst, or even a PM. I've made some skills as well. And so I think the person who feels like, "Hey, this would be a game changer in my workflow and this could help others," is the one who ships the skill.
- AGAakash Gupta
Okay, so that's the work side of Codex. I'm really keen to hear how are you using Codex in your personal life, and what are some mind-blowing use cases we should be using it for?
- AMAbhi Muchhal
Absolutely. So I'll give you
- 28:37 – 30:06
WhatsApp computer use demo setup
- AMAbhi Muchhal
two examples. The first I wanna show you, and the other one I can talk about. As you might expect, we're doing international growth. A lot of the people I'm talking to outside the US use WhatsApp, right? Day and night, communication happens on WhatsApp, especially in countries like India and Brazil. And a lot of business communication and personal communication happens, so I, I have a large Indian family in a bunch of family WhatsApp groups. You wake up, you see like seventeen hundred messages. You're not sure what's important, what's funny, whose birthday did I miss, right? Just like, I always get yelled at for, for not being good at those things. So recently with Codex, we came out with computer use, and the idea there was being able to enable Codex to see what else is happening on your computer to give it the context to take actions. So I was like, "You know what? I've got the WhatsApp desktop app on my computer, and I'm gonna show you this because this happened on my personal PC. Let me see if it can make sense of it for me." So as an example, I recently went to India, was bombarded by a bunch of messages. So Codex, I'm getting back from a day of travel. Catch me up on WhatsApp by looking at the desktop app and share what are the most actionable things for me to take a look at. So it's taking the same mindset of what I do at work, but applying it to the personal life. In this example, it worked. It opened the, the app, and then it realized that there's two actionable things. There's a client meeting, which is actually work-related, 'cause I was doing that, but then there's a pl- personal meeting. In this case, my friend Claire, who is in town and wants to meet but cannot do Saturday dinner. What are the times that would work? And that was awesome, 'cause, like, I would've probably missed these two things, right? So that's step one. But then I was like, "Hmm, could it even
- 30:06 – 33:05
Ads
- AMAbhi Muchhal
go further?"
- AGAakash Gupta
I used to think I had a retention problem. Turns out I had a messaging problem. I was sending the same onboarding emails to every new user, whether they activated on day one or never logged in again. I had no idea who was slipping or why. Customer.io changed that. Every message I send is now based on what users actually do in the product. Someone hits a key activation moment, they get nudged to the next one. Someone goes quiet, they get a different path entirely. Their AI agent makes it fast. I describe the campaign I want, and it builds the full journey for me: triggers, timing, copy, even branching logic. And when I want to know how something is performing, I just ask the agent directly, and it tells me what to do next. They also have an MCP server, which means AI tools like Claude can see directly what's happening in your Customer.io workspace, your segments, your customer data, your attribution, all of it. So instead of explaining your business context every time you need help, Claude already knows it. Notion used Customer.io to personalize their onboarding and hit nearly 50% open rate, improved conversion by 6 to 7% with localized campaigns, and pushed open rates up another 20% through A/B testing. The idea is simple. Customer.io helps you deliver more impact from every message you send. If you're a PM or founder and your onboarding is still one size fits all, try Customer.io at customer.io. I'm notoriously bad at my inboxes. I guess there's a version of that where I seem cool and unavailable, but the reality is I miss sponsor emails, guest pitches, and stuff that my team actually needs me for. So I got an AI assistant, the sponsor of today's episode, Ariso. Ariso connects to my email, calendar, and Slack. Then I just chat with it over Slack, and it helps me with everything. It builds workflows to respond to emails, resolve customer issues, prep me for meetings. 
It actually comes to my meetings, updates its own knowledge, and remembers context from past conversations, so every time I talk to it, it already knows what I'm working on. I used to pay for Granola and Lindy separately. Ariso replaced both. One tool does more, and it lives right in Slack where I already work. Check it out at ariso.ai/aakash. That's A-R-I-S-O.A-I/A-A-K-A-S-H. Today's episode is brought to you by Jira Product Discovery. If you're like most product managers, you're probably in Jira, tracking tickets and managing the backlog, but what about everything that happens before delivery? Jira Product Discovery helps you move your discovery, prioritization, and even road mapping work out of spreadsheets and into a purpose-built tool designed for product teams. Capture insights, prioritize what matters, and create roadmaps you can easily tailor for any audience. And because it's built to work with Jira, everything stays connected from idea to delivery. Used by product teams at Canva, Deliveroo, and even The Economist, check out why and try it for free today at atlassian.com/product-discovery. That's A-T-L-A-S-S-I-A-N.com slash product dash discovery. Jira Product
- 33:05 – 37:00
Codex takes action inside WhatsApp in 68 seconds
- AGAakash Gupta
Discovery: build the right thing.
- AMAbhi Muchhal
Instead of just summarizing and reading things, could it take actions for me? And I was honestly not sure how this was gonna work, but let's try it out. So in the first example, this client has said, "Hey, I wanna meet up. What days and times work, work best?" So what's the, what's the normal process? You have to go look at the calendar, figure out what times work, figure out what the slots are. You have to, like, read the message. Since it's a little formal, you gotta respond in the right way, acknowledging it. That takes a lot of time. So I was like, "Why don't I throw Codex at this problem?" I asked at the beginning, "Hey, Codex, computer use: respond to the message from Client A on WhatsApp desktop app by acknowledging what they sent, then offer times available tomorrow afternoon by looking at Google Calendar. Then open the WhatsApp app and send the message." So here's a couple things that are happening, right? It's ingesting the information from WhatsApp, then it's looking at Google Calendar, which I've also synced with as a connector, and this is my personal calendar, to see what are the times available, also my work calendar, summarizing both, then opening the WhatsApp app and sending a message. So I'm gonna show you a little bit of how this happens. You can see it's looking at WhatsApp. You can see it's under- trying to understand what's going on. And then it's saying, "I'll place the draft in the composer for you." So instead of sending it itself, it realized that it might be something I wanna vet to make sure I don't send something that's crazy. It's typing the text, "Thanks for the follow-up," in WhatsApp. It tells me that the draft is ready. It opens it up for me to see it, and then all I have to do is press send. 
So that's, like, honestly a game changer that it was able to do that, and it was able to do it in a way where it gave me enough control to understand where I might need to intervene, open up the app, and all I need to do is press send and edit. So that was really cool.
- AGAakash Gupta
Wow. And I think that one of the things with AI is that we all need to keep pushing the latest models and latest feature toolkit to see what's possible. Many of us probably tried something like this six, seven months ago. The computer use was very lossy. It took, like, five, six minutes. I think the most mind-blowing thing for me in that demo was that it just took a minute and eight seconds. Like, you guys have really sped up the computer use, and so what's possible now is so much greater.
- AMAbhi Muchhal
Yeah, absolutely. I agree. And I think that even for me, working within OpenAI, it's hard for me to keep track of everything that's launching. Like yesterday, we launched the ability for you to use this seamlessly with Google Chrome. And so I think that a key thing for PMs, who are probably inundated with all the AI news, is to try to set aside maybe some time in your week, maybe 30 minutes, and you could actually use Codex to help with this. Like, tell Codex, "Hey," like, "what's going on in the world of AI that might be relevant to me? Tell me the two, three things I should try out," and then try it again. And a lot of times you'll still see that it doesn't work, and that's okay. But what I've learned from this is that the things that today almost work but don't are definitely getting solved soon. And so that's like the signal of where there is value coming in the future.
- AGAakash Gupta
Yeah, one of my favorite prompts now that Codex can do this is say, "Analyze all my past chats, look up all the latest features. What new features should I be using from how I use you?" And it seems like it's able to incorporate all that knowledge now where it wasn't able to before.
- AMAbhi Muchhal
Absolutely. I think this is something we've focused on with both Codex and ChatGPT, is like model self-knowledge. It's like, uh, helping the tool understand what it is capable of, which is a really interesting recursive problem. By doing that, it can help you onboard to the tool. You can come to it and be like, "I've never used Codex before. Here's what I do. Help me understand what I could be using. What are the skills that would be relevant?" Um, so I think that's super powerful, and it's a great suggestion.
- AGAakash Gupta
Is there anything else in people's personal lives that they should be using Codex for that they might not be using it for right now?
- 37:00 – 43:42
Building a 1040 tax filing app in Codex
- AMAbhi Muchhal
So I have a funny example for you. I'm not gonna say I recommend it to others, but, um, every year, like just a few weeks ago, we had, you know, you have to file taxes. Um, last year, I switched jobs, and I was like, "You know, like, I work with this accountant, and I think the accountant is great, but what if I could just do it myself, right? Like, what if I could just, like, get it working end-to-end?" And so if you go to ChatGPT today, and you ask ChatGPT just to, uh, to file your taxes, it will give you advice, but it won't create the end-to-end output of a 1040. So I was like, "You know what? I get why Chat doesn't let you do that. Get it. It's, like, risky, but what if I could build a web app?" So over the last few months, I built a 1040 filing web app, which what it does is it takes as input all your tax documents and then spits out an output, not just the analysis, but a full 1040 that you can submit to the IRS. All you need to do is sign it. I was so cautious, though. I was like, "You know, like, this is still new technology. This is a tax return. This is, like, Uncle Sam. I don't want to screw it up." So I did this, and then I also had my accountant do my taxes, and then I did, like, an A/B test comparison. And I was like, "Huh, like, the accountant is saying my refund is much higher. This, like, doesn't make sense." So I sent my accountant what Codex did, and he's like, "Oh, crap, I forgot one income source." And I was like, "Oh my God." That was like a mind-blowing moment, that, like, this Codex agent, which has no knowledge of accounting practices, was able to spot a mistake in what my accountant had done. So I'm still planning on using my accountant for liability purposes, but the fact that I was able to do this was, was insane. So I think anyone could do that.
- AGAakash Gupta
Yes, I also used it to check my accountant's work in this latest tax cycle, so I highly recommend that for people. For everybody, though, the kind of red alarm that goes off when we tell them that is, like, how do I do this safely? Like, how exactly do you feed sensitive information into an AI model while maintaining peace of mind?
- AMAbhi Muchhal
Yeah, so I think there's, uh, two elements to safety that people think about. Um, first is regarding data, and the second is probably regarding control. My sense is that for the data element, there are, you know, known solves. Like, if you use, if you turn training off, you delete that chat, it won't be in your memory. If you use an enterprise account with us, it comes with a lot of other protections that enable you to securely use enterprise data. So I think those elements have existed and will continue to harden. But I think the other element is about control. I think in this OpenClaw era, everyone's seen those, um, threads on Twitter where it's like, "Oh my God, OpenClaw went and deleted something for me," right? Um, I have a lot of respect for what the team has built here with Codex, because there's, like, different levels of permission. Um, you can go from, "I want to review every action," to completely, like, "I want you to just run with it and find the right steps in between." And so that's pretty awesome. Um, and what that enables is that every time it's connecting to a new data source, it's pulling something on your laptop, it can ask you for permission. And then in my prompts also, I usually tell it, "Here's where I want to give feedback. I want you to get it to WhatsApp, and before you send the message, give feedback." And I think, like, establishing that control relationship is really important. Um, and that's what's helped me use it well.
- AGAakash Gupta
So we just walked you guys through how an OpenAI PM uses Codex in his work and his personal life. Now I need to learn more, Abhi, about your PM job specifically, because you've already given us a couple really interesting hints. It's on international growth, and I've been on this personal agenda myself of kind of, like, getting rid of this term AI PM. I feel like it's too umbrella. I feel like realistically, there are many different types of AI PMs. So maybe can you help me understand your own personal taxonomy of AI PM and what type of AI PM you are within that?
- AMAbhi Muchhal
To the broader question, I think every PM needs to be an AI PM. Um, that's what, that's what this new era is enabling. To be able to operate at the ve- velocity that people do today, you need to use AI tools. And so I think that any product manager needs to be an AI product manager, and whether that means using it just to do parts of your job or using it to build things and push the frontier, that's kind of up to you. Um, as far as my role, um, I'll start with, like, one click up. Like, our mission statement is to make sure that AGI benefits all of humanity. Most of humanity does not live in the US. Most of humanity does not live in the developed world. It lives in India, Latin America, Southeast Asia, and all these other countries, and we care deeply about making sure that all the tools we build benefit those users. I think that the job I have is to figure out how we can be continuously providing value to users around the world that are outside the Silicon Valley tech bubble. We have grown a ton since we've launched ChatGPT. A lot of that growth has come from places like India. Like, India is one of our fastest growing and now our second-largest market. And so my job is to think about three layers of the stack. One, how do we improve our models? Um, how do we improve our product to surface relevant use cases? And then top of funnel, how do we tell the story of what ChatGPT could be doing or Codex could be doing for you?
- AGAakash Gupta
The traditional sort of growth and core split is that within core, PMs will own specific application surface areas. I imagine at OpenAI, there's also just the entire research model arm of product, so there's, like, research and then apps. And then within apps, is it, like, there's core teams that own specific features, and growth sits kind of cross-cutting, and you specifically are focused on the international part of it, or how is that structured?
- AMAbhi Muchhal
Yeah, I, I think that generally, we are still an extremely scrappy company, and we have way more to do than we have people to do it, and it just comes down to, like, what do you care about, and people are very willing for you to drive that end-to-end. Um, and so I think that the org structure and boundaries are more loose and more of a suggestion than a hard boundary. That said, we do try to follow the similar structure, where growth sort of is cutting across the work that is happening across the core app, across different features, and trying to figure out how we drive value. And my specific angle is, what are the unique pieces of value that, uh, people around the world might have that we were not thinking about day to day, right, and maybe our audience is not thinking about day to day, that we can think about. And that is at the model
- 43:42 – 47:18
What drove ChatGPT to 900M weekly active users
- AMAbhi Muchhal
layer, the product layer, as well as the top-of-funnel marketing and partnerships layer.
- AGAakash Gupta
Makes sense. So if we broadly think about, okay, you guys did this mind-blowing thing. You guys hit 900 million weekly active users, probably faster than anybody else ever. There's gonna be a lot of components that drove that growth, both from other PMs and yourself and those PMs in the growth org. Maybe can you walk us through, what have been, from, like, a product point of view, the most important things for that growth, and what people really wanna know is, like, what- which were the ones that the growth team drove within that? [laughs]
- AMAbhi Muchhal
Yeah, so I'll, I'll start by saying that, like, uh, most of the growth and the credit that I would give for what has happened is not because of me, it's not only because of my team, it's because of the amazing work that the rest of OpenAI is doing. And I would say my job is to sort of channel that and to help, help ground the work that they're doing in real-life user problems that may be outside the US. I think the narrative arc of ChatGPT is that a few years ago when it launched, it was really a chatbot tool, and I would say that it was used widely, but specifically we had sort of, I would call, like, early product market fit in two segments. One was knowledge workers, so people like you and I, and the, the other element was students, right? Now, what is interesting here as you think about it, and this is research done by ChatGPT, so I'm quoting ChatGPT stats. When you think about a country like Germany, about 60% of adults that have internet access are knowledge workers. If you think about a country like the US, it's somewhere like 40 to 50%. So great, in those places, if we wanna grow, like, catering to students and knowledge workers covers a lot of what we wanna do. But when I joined here, one of the things I looked at is, like, okay, how does that differ for, like, India or Brazil? So in Brazil, only 10 to 20% of the working population are actually knowledge workers. In India, it's lower than 10%.
- AGAakash Gupta
Wow.
- AMAbhi Muchhal
And so the thing I started thinking about is, like, these are all people that have jobs, they're adults, but they're not sitting day to day looking at a computer, right? That's not what their work entails. Maybe they're trading goods, maybe they're running a small business. Maybe work is happening on WhatsApp. And so what are the use cases that can drive value to them? So in 2024, um, one step change for us was launching Search, right? I think beforehand, if you remember early ChatGPT, it was like an arbitrary knowledge cutoff, where it was like, "Sorry, I don't remember things past this date," which made it hard to do simple queries about, "How do I get to work? What are the things I should be looking at?" Right? And these are the kinds of things that matter to a larger part of the world that is not just knowledge workers. So launching Search was one step change. And the other one was hands down ImageGen. When you think about how people interact with technology, a lot of people are not spending time typing and reading a lot of text. It's multimodal. They're talking, they're calling, they're viewing videos, they're viewing content. And image generation was our first breakthrough moment in that, in that vein, where we provided the rest of the world that wasn't knowledge workers an obvious way to experience the benefits of AI without having to be steeped in reading three paragraphs of text. And it was so visually different than anything that had been d- done beforehand, and that's what led to, like, you know, a breakout moment, um, for ChatGPT last year, and it has continued this year as we've launched ImageGen 2, uh, Images 2, which I can talk a little bit about as well.
- 47:18 – 59:26
ImageGen 2, the biggest ELO jump of any model
- AGAakash Gupta
Yeah. We've had this recurring theme throughout this pod about figuring out what are the latest and greatest capabilities of the latest models, and Image 2 is, like, the biggest ELO jump of any model. Like, if you look at 1 versus 2, you look at Gemini Nano Banana 2, Gemini Pro 3, whatever you wanna call that model, versus Image 2 that you guys just released, it's, like, insane jump, way bigger than the jumps between 2, 3, and 4. So can you show us what has just recently become possible with the ImageGen model?
- AMAbhi Muchhal
Yeah, absolutely. I'm gonna show you a few examples so you can see, like, how amazing it is, but I just wanna start by saying huge kudos to our research team that work day and night. Like, when I saw that LMArena chart, I was also stunned. Like, I've never seen such a step change improvement, and I can viscerally feel it when I use the product, 'cause I use ImageGen every day, been using it for months, and, like, it has been such a game changer. Um, why don't we do this? I'm gonna share my screen, and I'm gonna show you, like, an example of something that when I tried doing beforehand, it would not work, and then while that's loading, I'll talk you a little bit through, like, what are the things that we've focused on to make this model amazing. Here's an example of something that when I tried in the past did not work. So here I, um, do a lot of user research, when we go talk to people in the field, and this is a prompt that I learned from someone who, um, was in Bangalore and wanted to open a bookstore, and wanted to imagine what this bookstore would look like. And specifically, they wanted to have an image of book titles in multiple different Indian languages. If you know anything about India, the language changes, like, every hundred kilometers there. And so you have to think about Hindi, Bengali, Marathi, et cetera, and put these titles on books. Now, the first ImageGen model struggled with this. Um, character rendering, especially in different languages, is very difficult. Plus, we wanna make it really realistic. We don't wanna make it feel like AI. So why don't we give a stab at what this new model does? Okay, so it's gonna take a few seconds to run. In that time, I want to talk to you a little bit of, like, from our exact blog post, which I love, and it was, like, image forward, like, what are the things we worked on? So first, we've worked on providing greater precision and control. 
Every time you create an image, you probably don't single shot the final output. You want to, like, edit and provide fine-grained ideas of, "Here's what I want you to change here." It's kind of what you might do on a Figma file. And with ImageGen, we've now allowed you to do that, right? It's able to do these fine-grained edits to take something from being fun to, like, being work output. So that's pretty awesome. We've also allowed you to make multiple images at once, and so that tells a story. And the thing that I spend a lot of time with the team working on, and I care deeply about and is mission aligned, is working, uh, better on different languages. So this is one of my favorite examples. This is a Japanese manga comic, and this example brings together some of the amazing things we've done here. So first, in prior, uh, versions of the model, we would never get the l- character rendering right. Um, it would either not be in the right characters or mixing up languages. And while I can't speak Japanese, I don't know if you can-
- AGAakash Gupta
No
- AMAbhi Muchhal
... it is able to get all the characters, um, correct, and it is able to point out, you know, what are the things, what is happening in the story one place and another, um, in a, in a rend- rendering cool fashion. And you can see, like, the level of detail here in the image outputs is, like, so real, and it's stitching together multiple images at once. So that's, like, an incredible example of how it is able to do this. Now let's switch back for a second and s- let's see if my prompt worked.
- AGAakash Gupta
And I think you guys also got the character consistency a lot better now too. Like, we saw in that manga that it was the same guy [laughs] wasn't, like, a different guy in the, across the frames.
- AMAbhi Muchhal
Exactly, exact- yeah, that's definitely something that annoyed me in the past. So, so what you see here, which is really cool, is that it's imagined this bookstore actually in the, in, in the book format itself, and it's talked about, like, the stories across multiple different languages. Like, you have all these different states of India that are represented here. So Kerala, Himachal Pradesh, Assam, all of the different languages, and the rendering for each of these, to my knowledge, looks pretty spectacular. But then if you zoom out, this looks like something that is actually in a high-definition image that was taken of an actual book. Yet in the 30 seconds that we, I was showing you something else, ChatGPT just made that. So that's just been, like, mind-blowing. And then for those who are, you know, pro image creators, what you can do is you can switch from instant to thinking, and that enables the model to, like, even up, up its game on top of a very high bar, have more realistic outputs, provide high-definition images. So that's m- what my pro tip would be to all the image creators out there.
- AGAakash Gupta
Okay, so at least one key tip is use thinking, and in my personal use of it, I feel like what it's doing is it's, like, taking my prompt, understanding what is my actual goal, and basically, like, writing, like, a better prompt. That's how I've interpreted it. What are the other things one needs to know if they're about to go heavy into ImageGen? What are the limitations? Can I use this for charts in my upcoming product review, or is it still getting axes wrong? What can... Where should I be pushing the boundaries of usage for ImageGen?
- AMAbhi Muchhal
Yeah, great question. So one pro tip, first of all, if you want to describe edits, you can click this Edit button, and that opens up different options. So you can s- say, "Hey, I want this ratio of an image," right? So that's pretty cool, because a lot of times you got to edit it for a specific format if you're uploading it to Instagram. You can also select different areas of the image. And let's say I just specifically don't like this part. You can highlight it now and say, "ImageGen, everything else is good, but this part of the image, I'd like you to edit." So that's, like, a pretty, pretty cool example, and why, once again, it's going from using these images for fun to actually using it for, for work. And then to your other question, what are the breakout use cases for ImageGen, and we've seen this especially in countries like Japan, has been infographics. What we noticed even prior to bus- building the model was a lot of people in Japan and East Asia were trying to create infographics that they would share on social media. So how has Japanese growth been? Like, what is happening one place or another? And what ImageGen does is it is able to create studio-level outputs. Um, so that, that's, like, an amazing thing. I still think that the place that I know that we'll continue improving is steerability. So allowing users to say, "I want you to change this specific thing, but keep everything else the same." We've made a lot of improvements in that, both in the product and the model, but it's something that is still not perfect, and I expect us to continue improving.
- AGAakash Gupta
Okay. I cannot emphasize for you guys enough to, if you haven't yet used the latest image model, I'm literally using it multiple times a day, and my biggest use case is charts. Like, it's, it's finally good at those. [laughs] I feel like all the other AI tools that I'd been using, like, they would do random things, like just use an out-of-scale chart or something like that. So feels like it's really crossed over something. It almost feels like, to me, and I'm curious if you felt this way, by the way, that, like, somewhere around December of last year, the coding models and Codex got really good, and it feels like right now with Image 2, like in April, we, the images got really good.
- AMAbhi Muchhal
100%. Um, I think that it was truly a step change, and the evals show this, but sometimes evals are maybe disconnected from reality. But this time, I can see that every single use case that I was trying before has just gotten better. It becomes so much more realistic. Um, the edits are so much more precise. The model is able to search the web in real time and look at what's going on and bring that information into, um, into it, rather than using a knowledge cutoff from the prior part. And so I think we've really made a step change improvement there. Um, which is why I encourage not only people who want to use images for fun, which is awesome, but also small businesses and creators, people who can't afford working with an entire agency or hiring someone full-time to do this, that ImageGen can get you, like, beforehand, maybe 30% of the way. Today, I think it'd get you 90% of the way.
- AGAakash Gupta
Yeah, and I can't afford [laughs] creators and designers, and sadly, I've had to let some of them go because the difference with ImageGen is that it can make a full change for you in two minutes, right? And so your iteration speed gets really, really high versus when you worked with your designer, a lot of times, you know, you'd send that message to them, then they'd have to prioritize that work, then they'd have to do it. So there's always gonna be a couple hours of lag there. ImageGen's, like, instant, which is just crazy.
- AMAbhi Muchhal
What has been, I'm curious, uh, and, but this is useful feedback to our team, what has been your favorite use case of ImageGen so far?
- AGAakash Gupta
So what I do is [laughs] I tell it, "Build a chart that looks like it could have been in The Economist or Bloomberg, a very high-quality financial paper [laughs] of this recent data, and make sure that this data is up to date." And like you said, number one, it'll go find the updated version of the data. So, like, there's this very famous chart of how the price of goods has gone in America, and TVs have gone down and university has gone up. It was able to go find the extension of that data for 2026 for me, and it was able to make it look like beautiful journalist quality. So that has been the breakthrough for me. [laughs]
- AMAbhi Muchhal
Super cool. While we were talking, I tried another use case. I just wanna show you very quickly. So I recently went to Tokyo for work, and while all the business meetings were happening, I tried to sneak out and maybe, like, spend four or five hours just, like, absorbing the city and the culture. And I learned a lot about, like, the Meiji dynasty and how it was powerful to the history of Tokyo and Japan. So I asked ImageGen, as we were talking right now, and it was thinking, "Tell me about Tokyo's relationship with the Meiji dynasty in a Japanese language infographic that's relevant to that audience." And what's really cool is, uh, it went and searched for this information as we were talking on the web, looked at the history of the Meiji dynasty, and I didn't want to make it something that, like, a tourist would be impressed with. I wanted to make it something a local would be impressed with. So it shows the history of the Meiji dynasty over the period of time. So this is just, like, mind-blowing, and when I showed this to people in Japan, they were stunned. So that was, like, a big, uh, moment for me, 'cause I felt so proud of the work that we'd done.
- AGAakash Gupta
We've walked through Codex. We've walked through ImageGen. In my mind, those are the two things you guys really need to be trying. Now I wanna unpack for a little bit being a PM at OpenAI. So you've been a PM at Nubank. You've been head of product at Tenet. You were also a PM at Meta. What is fundamentally the difference working as a PM at a frontier lab?
- AMAbhi Muchhal
Yeah, a couple things. First, I would say that in some ways, the model is the product, right? The core thing we're building is the model. Then we build the product on top of it. The crazy thing about the model is that it is so general purpose, it can do so many different things. Also, the amazing thing is no one exactly knows what the next model is gonna be. Even we don't, right? We have a hypothesis of what's coming, but we're not sure. We have things we're trying to improve on, but there are always behaviors when we put it out to the world that people use that we didn't even expect, and we've seen this with not only, uh, image, but also text and our voice models as well. And so I think the key difference is that you operate in such an, uh, ambiguous environment where so many things could be changing, and your roadmap has to be extremely dynamic and flexible and receptive to what improvements the research team is making. When we put something out there to the world, and we give it to, like, beta testers, like, what is Aakash seeing? Like, maybe I should be highlighting that in the product, right? And so I think that's, like, the big change: it is extremely fluid, but in a good way, 'cause we want to adapt to where the research is going.
- 59:26 – 1:05:27
How to break into OpenAI as a PM
- AGAakash Gupta
So if I am a PM at a regular company, and I have it in my mind that I would love to break into OpenAI, what are the things that I need to be upskilling, learning? There's just too much noise [laughs] out there.
- AMAbhi Muchhal
Yeah.
- AGAakash Gupta
AI news is a category we just talked about. You can use Codex to help you digest AI news, but there's also AI fundamentals, and some people are telling me I need to learn how to vibe engineer. Other people are telling me I need to understand RAG and fine-tuning. What is the truth? What do I actually need to know as a PM on AI topics? What are the AI topics I need to understand?
- AMAbhi Muchhal
So I'll say that first, as a comfort to your audience, a lot of the core PM skills still matter at OpenAI, right? Structured thinking, analytical thinking, communication, storytelling. That matters wherever you go. But I'll add two things. First is, I think it's important that you're living and breathing AI. You're using AI tools to do your work and even outside of your work, so you're understanding what the frontier is today and where it's going. The second thing I would say is that the currency of progress, especially in a frontier lab, is evals. Um, you've probably heard this term being used a ton, but any time we think about a problem that we want to get our researchers excited to improve, the question they ask is, "Can we build an eval?" And simply, what an eval is, is a rubric which helps us understand, for a specific problem, how do we measure progress? What are the types of scenarios that we want to test? What are the expected outputs we wanna have? And then we look at where we are today, and we set a goal of where we wanna be, and we work with research to help climb on that. So I think speaking the language of evals is, like, another key skill for all PMs.
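The eval described above (scenarios to test, expected outputs, a score for where things stand today, and a goal to climb toward) can be sketched in a few lines of Python. This is only an illustrative sketch, not how OpenAI builds evals: `Scenario`, `exact_match`, `run_eval`, `toy_model`, and the 0.9 goal are all hypothetical names and values invented for this example, and the "model" is a canned lookup table standing in for a real system.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Scenario:
    """One test case: an input prompt and the output we expect."""
    prompt: str
    expected: str


def exact_match(output: str, expected: str) -> float:
    """Hypothetical rubric: 1.0 if the answer matches (ignoring case/whitespace), else 0.0."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0


def run_eval(
    model: Callable[[str], str],
    scenarios: List[Scenario],
    grade: Callable[[str, str], float] = exact_match,
) -> float:
    """Run every scenario through the model and return the mean rubric score."""
    scores = [grade(model(s.prompt), s.expected) for s in scenarios]
    return sum(scores) / len(scores)


# Assumed scenarios, loosely inspired by the translation use case in this episode.
SCENARIOS = [
    Scenario("Translate to Portuguese: good morning", "bom dia"),
    Scenario("Translate to Spanish: thank you", "gracias"),
    Scenario("Translate to Portuguese: thank you", "obrigado"),
    Scenario("Translate to Spanish: good morning", "buenos días"),
]


def toy_model(prompt: str) -> str:
    """Stand-in for a real model: a canned lookup with one wrong and one missing answer."""
    canned = {
        "Translate to Portuguese: good morning": "Bom dia",
        "Translate to Spanish: thank you": "gracias",
        "Translate to Portuguese: thank you": "gracias",  # wrong language on purpose
    }
    return canned.get(prompt, "")


GOAL = 0.9  # hypothetical target score

if __name__ == "__main__":
    baseline = run_eval(toy_model, SCENARIOS)
    print(f"baseline={baseline:.2f} goal={GOAL:.2f} gap={GOAL - baseline:.2f}")
```

The baseline is "where we are today," `GOAL` is "where we wanna be," and the gap between them is what gets worked on. In practice the rubric is usually richer than exact match, but the shape (scenarios, expected outputs, a score, a target) stays the same.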
- AGAakash Gupta
I can already see the pitchforks out, because I have had enough comments to my evals articles and evals episodes. What is the actual level of depth that a PM needs to go on evals? What do they actually... Where is the line between, "Okay, this is what the PM is doing, and they speak the language," versus, "This is the research team taking it over"?
- AMAbhi Muchhal
I would say that the roles are quite fluid, and there's different PM archetypes here, and different people who spike in different things. There are some PMs who are working very closely with the researchers, co-embedded with them, and they're going the entire way. They're coming with a hypothesis. They're writing the eval. They're running the eval. So end-to-end workflows. And then there's others who may be a little less involved and are basically helping research understand, here's the problems we wanna solve, and working with them to figure out, how do we go from that to an eval that maybe research is driving? So it's a little flexible depending on what you're working on. I wouldn't say that you need to come in as an end-to-end expert who has run many of these, but I think you gotta understand the value of it and what it's trying to achieve.
- AGAakash Gupta
Can you tell us your story? You're now the fourth person I've had on from OpenAI. I've been collecting the stories of how you all have broken in, because when I ask my audience, "Hey, where do you guys wanna work?" OpenAI always ranks first. So we always wanna hear the actual personal stories of how you guys broke into this hottest company in the world.
- AMAbhi Muchhal
Yeah, I consider myself lucky every day. I pinch myself. Um, everyone here is, like, definitely way smarter than me, so I think it's, like, astounding that I even got here. You know, thankful to those who took a bet on me. But if you take a step back, I think the through line through my career is trying to figure out how we take technology to the next billion people. I grew up in India, spent a lot of my childhood there, and I was always trying to figure out how we can build technology that helps people like that. After that, you know, when I started working, I worked at Meta, and one of the things I did was actually work on the election integrity team. That was trying to stop misinformation around elections around the world, so Brazil, India, EU, Africa. And so that's, like, when I first started interacting with this at work and the bug kicked in. And then I heard about this, like, crazy company in São Paulo, Brazil, called Nubank, which was then a growth stage startup, and they reached out and were like, "Hey, would you like to come work here?" And it was a wild thing in 2019 to consider that, but I packed up my bags and was ready to move to Brazil. I spent some of that time working remotely from the US because of COVID, but then I went there. I learned Portuguese. I worked in our Brazil business, our Mexico business, and our Colombia business, just hopping around. And so this is, like, a through line through my career. And so when OpenAI was starting to think about, um, this international role, I think that I had thought about these problems for a while, and in fact was building an early prototype company around trying to solve real-time language translation. At my previous job, some people spoke Portuguese, some spoke English, some spoke Spanish, but no one spoke all of them. It was, like, a big chaos at work. So I was like, "Okay, let me just build a Chrome extension on top of OpenAI APIs that would help solve this problem." And so I'd sorta gotten in touch with the team while I was building. And I think a combination of my experience and then the builder attitude is what led me to be here.
- AGAakash Gupta
Okay, so the ingredients are really coming together now, which is obviously world-class experience at two amazing companies, in Meta and Nubank, training you on the fundamentals of PM and specifically international growth. So you have that extremely strong career base. Obviously, you're gonna be top of your field there. But then the other component, which I think probably some people are missing and now they hear validated from your story, is you went out and you actually built an AI product.
- AMAbhi Muchhal
Yeah, 100%, and I think that the latter is important, not only from the perspective of a resume or an application, but for your own skill set and learning. When I started building, I started realizing what can work, what cannot work. I didn't even know this word then, but I needed to create evals myself. Like, the translation thing wasn't working, so we had to create, like, a rubric. And so as I got into the interview process, I was like, "Okay, a lot of the things that we're talking about now, I haven't done it to this standard, but I understand operationally," and so that really helped.
- 1:05:27 – 1:07:06
Outro
- AGAakash Gupta
I could talk to you for another hour, two, three, four, but thank you so, so much for opening yourself up for this time, sharing so much information. I have not gotten these insights after talking to so many people, so thank you so much.
- AMAbhi Muchhal
Yeah, thanks for having me, Aakash. And final note, like, I went to Michigan like you, and when I was there, like, 10 years ago, there weren't that many people in the product management industry. So I remember when I started learning about it, I, like, looked at you on LinkedIn. So it's funny, like, a decade later, full circle, being able to talk to you in this format. So thanks for all the help that you give, uh, everyone who's interested in product and growth.
- AGAakash Gupta
Aw, thank you. If people wanna find you online, where should they find you online?
- AMAbhi Muchhal
Yeah, LinkedIn is probably the best place. Um, not a big Twitter person yet, but LinkedIn, and my name is the same as my handle.
- AGAakash Gupta
All right, guys, if you guys are working on international AI products, you now know the person who's responsible for international growth. Until the next episode, we'll see you later. I hope you enjoyed that episode. If you could take a moment to double-check that you have followed on Apple and Spotify Podcasts, subscribed on YouTube, left a rating or review on Apple or Spotify, and commented on YouTube, all these things will help the algorithm distribute the show to more and more people. As we distribute the show to more people, we can grow the show, improve the quality of the content and the production to get you better insights to stay ahead in your career. Finally, do check out my bundle at bundle.aakashg.com to get access to nine AI products for an entire year for free. This includes Dovetail, Maven, Linear, Reforge Build, Descript, and many other amazing tools that will help you as an AI product manager or builder succeed. I'll see you in the next episode.
Episode duration: 1:07:06
Transcript of episode j1IOG8WoW1A
