How I AI: The power user’s guide to Codex | Alexander Embiricos (product lead)
55 min read · 11,239 words
- 0:00 – 2:06
Introduction to Alex and Codex
- AEAlexander Embiricos
people love how thorough and diligent Codex is. It's not the fastest tool out there, but it is the most thorough and best at hard, complex tasks.
- CVClaire Vo
If you're a software engineer or somebody who's even just new to using some of these AI tools, where would you get started with Codex?
- AEAlexander Embiricos
We're building it into a full software engineering teammate. One of the things that Codex is great at is simply answering questions. If you have a chat where Codex is producing these plans and you want to change something, it's actually really nice for the model if you just use the same chat to ask for changes to the plan, and that way, it has all this context in its head when it's ready to get going.
- CVClaire Vo
This is a great starter flow that shows how flexible this platform is and how it can meet a bunch of people at a variety of levels of tasks. How is OpenAI using this for bigger features and bigger products?
- AEAlexander Embiricos
We used Codex to build the Sora app for Android in twenty-eight days, and it immediately became the number one app in the App Store. [upbeat music]
- CVClaire Vo
Welcome back to How I AI. I'm Claire Vo, product leader and AI obsessive, here on a mission to help you build better with these new tools. Today, we have Alexander Embiricos, product lead for Codex from OpenAI, and he's gonna show us how you get the most out of Codex, whether you're a non-technical user trying to make changes to an existing code base or want the power tips and tricks for getting the most out of it in the terminal. Let's get to it. This episode is brought to you by Brex. If you're listening to this show, you already know AI is changing how we work in real, practical ways. Brex is bringing that same power to finance. Brex is the intelligent finance platform built for founders. With autonomous agents running in the background, your finance stack basically runs itself. Cards are issued, expenses are filed, and fraud is stopped in real time without you having to think about it. Add Brex's banking solution with a high-yield treasury account, and you've got a system that helps you spend smarter, move faster, and scale with confidence. One in three startups in the US already runs on Brex. You can, too, at brex.com/howiai.
- 2:06 – 4:54
Getting started with Codex
- CVClaire Vo
Alex, thanks for joining How I AI. I'm excited about today's episode because we actually haven't seen a deep dive into Codex yet, and we are gonna get the expert take on how to get the most out of this tool. And I love that we're just gonna dive in and do a zero to one "Hello, World!" with Codex. So if you're a software engineer or somebody who's even just new to using some of these AI tools, where would you get started with Codex?
- AEAlexander Embiricos
Codex is a coding agent. We're building it into a full software engineering teammate. But to get started, let's just talk about where most people use it, which is in their IDE. Uh, I happen to use VS Code, so I'll show you Codex in VS Code. You can also use it, uh, the Codex extension in any VS Code fork, like Cursor, et cetera. So let's say that I just installed Codex from the VS Code extension marketplace. Do you want me to show that, by the way, or-
- CVClaire Vo
Yeah, let's do it.
- AEAlexander Embiricos
Okay.
- CVClaire Vo
Let's true- truly zero to one.
- AEAlexander Embiricos
All right. All right. Okay, truly zero to one.
- CVClaire Vo
Yeah.
- AEAlexander Embiricos
I mean, I'm not going to uninstall and log out, but we can pretend I did that.
- CVClaire Vo
Yes, love it.
- AEAlexander Embiricos
So pretend I clicked Uninstall, and then I [laughs] clicked Install. Right. So what would happen then is I'm gonna get this glyph here, uh, which is the Codex extension. I have to click through some steps and log in. So in case you didn't know, Codex is included in your ChatGPT plan, so you need a paid plan. So if you have a Plus plan, Pro, uh, Business, Team, or Edu, you can use Codex, um, and the limits are really generous. Okay, so let's say I have this thing up, and, uh, truly zero to one, let's pretend I like... Actually, I just heard that this is a game, but I don't even know how to play this game. Uh, one of the things that Codex is great at is simply answering questions. I'm the product lead for Codex, so I, I actually use Codex a lot for asking questions, probably more than most engineers do, 'cause I don't want to bother engineers with silly questions. So I might ask, "How do I play this game?" We just launched a new model today, so I'm actually curious what model that used. 5.2. Cool. We're gonna talk about that, I guess. So I'm just gonna run npm run dev here, as it's saying, boot up the server, and let's take a look at the game. Okay, so what I have here is, like, a simple commander-type game. I can move my character around. Um, I can recruit troops. It looks like planting windmills is not implemented yet, and I've heard there's something wrong with the jump. Okay, that's way too high. So let's get to work fixing some of these issues. So what I can do here is I can just go and ask... Let's just say, "That jump is way too big. Uh, lower, please." And so, uh, for those of you who are new to coding agents, I mean, this is pretty, pretty basic. I just wrote in plain, natural language, plain English, the change that I wanted, and we can see Codex getting to work, thinking up a plan. "Okay, I need to figure out how the jump works. I need to then reduce it, and then I need to, like, make sure this whole thing works."
So let's do that, and, uh, while we're at it, let's make some more changes as well. How about we, uh, implement the windmill planting? Um, I'm just doing these in new chats so they can go in parallel.
- CVClaire Vo
Yeah,
- 4:54 – 7:34
Using Codex for parallel tasks
- CVClaire Vo
I w- I wanna call out some stuff for, for folks listening or not watching. So what you're basically showing is the process of starting with an existing code base, and you as a... Let's just pretend you're a semi-technical user. You're, like, a product manager on this, and you're like, "Eh," you know, they ship something, but not exactly what you want. What you're using Codex for is, one, how, how do I even run this thing locally? Which I think is just such a... You know, people forget these basic use cases because I know there are a lot of software engineers that listen to this podcast, but people forget, like, not everybody knows how to run every repo locally. So one little thing you can do is, like, just how do I get this code base running so I can test it? And then, two, you're setting up little parallel tasks, which I think is really nice, and I'm curious, you know, how many of these do you find yourself doing on any one, any one code base to just fix little things? And so I guess my question for you is, on these parallel tasks, which in this example are very small, do you feel like it's a better approach to set up parallel tasks and just have, you know, individual ones running, or to do them in sort of a serial basis? Like, why one or the other?
- AEAlexander Embiricos
Yeah. This totally depends. So, like-... This is a bit of a toy project, but realistically, like the way that I, I typically work, if I'm, like, running around, um, so this is, like, very tactical to, like, I guess, PMs. So I'm looking at a terminal here, and I often just have some, like, question that I wanna know about. So actually, like, literally this morning, I was like, "Okay, I'm gonna do a demo. I know we just launched a new feature that makes it easier to pick models. Can I disable that?" And so I ran Codex, which popped me in. Uh, this is just some internal auto-update logic, but I ran Codex, and then I asked, "Hey, uh, we have this new feature..." By the way, I don't mind telling you guys about it because Codex is open source, so a lot of new things are just, like, out there in public. "We have this new feature that offers, um, balanced reasoning settings." I'm gonna quote it actually to give the model a clue that I want it to search that string. Uh, "Reasoning settings, how do I disable that for the demo?" So, you know, I might do this kind of thing, like, super frequently, or I might be like, "Hey, I heard a customer report about this behavior. Is that done?" Or I might ask a question like, "Hey, uh, did we ship this feature? I th- I, I like, I lost track of whether or not we shipped something." So I ask these kinds of questions a lot, and when I do this, running them in parallel is, like, great. There's no reason to, to do anything else. On the other hand, if I'm making changes, then I'm more likely to think about, "Okay, how likely is this change to conflict with another change?"
- CVClaire Vo
Yep.
- AEAlexander Embiricos
And typically, I'll either do one at a time, or I'll use something called a worktree, which is, I guess, a bit more of an advanced concept. Uh, we can get into that if you're interested. Uh, I'll use a worktree, and I'll just send Codex off to do its work on a separate worktree.
- 7:34 – 9:51
Understanding Git worktrees
- CVClaire Vo
Now, let's take a minute to look at worktrees, because I think this is something that, um, most folks that are new to these tools aren't really using particularly well. I see sort of the two paths that you showed, which is, one, I'm just gonna do, like, one big branch, do these things in serial, and then commit them in, or two, I'm just going to, like, kick off a bunch of different tasks, but they're all going [chuckles] in the same conflicting space and creating issues. And so maybe we take a minute and talk about worktrees and how those work within Codex, or how you set them up and use them to make sure that you can run parallel changes, but that they don't conflict with each other and can be reviewed separately.
- AEAlexander Embiricos
Basically, if we're gonna have Codex make changes... Uh, maybe I can come up with an example on the fly.
- CVClaire Vo
Mm-hmm.
- AEAlexander Embiricos
Let's say that we want to change, like, the language of this input to, like, French or German.
- CVClaire Vo
Mm-hmm.
- AEAlexander Embiricos
Like, obviously, those both can't be true at the same time. This is a very contrived example, right? But maybe I wanna try both. Maybe I'm prototyping something.
- CVClaire Vo
Mm-hmm.
- AEAlexander Embiricos
What I need then is I need two different copies of the code base. So I could just copy the code base twice, right? Like, just Command+C, Command+V in Finder, or I could git clone the repo twice. But Git has this really nice affordance called a worktree, which basically lets one Git instance track multiple copies of the code base. So, uh, you know, as sort of a classic mammal, I am lazy, and so I don't wanna remember the commands for worktree, even though they're very simple. So typically, the way that I would actually do this is I would just ask Codex to create worktrees. So I might say something like, "Codex," and, uh, I could launch Codex like I just did and type the prompt, but a shorthand that's kind of nice in Codex is you can just put your prompt right in. So I might say, "Codex, in here, uh, create two new worktrees, uh, off the main branch." I might not be this super explicit if I was actually doing it. "Main branch, one called French and one called German." And so what you can see that happened just here is that Codex just launched and went straight into this prompt. I do this all the time. In fact, I, I've gotten so used to running this that sometimes I will forget to type codex, quote. [laughing] Um, and I will literally just do something like this. Uh, I'll be in terminal, I'll say, "In here, do this."
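The worktree mechanics Alex describes can be sketched in a few plain Git commands (a minimal sketch: the throwaway repo and lowercase branch names are just for illustration; in real use you'd run only the `git worktree add` lines inside your own repo):

```shell
# Throwaway repo so the sketch is self-contained; in real use, skip to the
# `git worktree add` lines inside your own project.
tmp=$(mktemp -d) && cd "$tmp"
git init -q demo && cd demo
git config user.email demo@example.com   # identity is needed for the commit below
git config user.name demo
git commit -q --allow-empty -m "initial commit"

# One .git directory, multiple working copies: each worktree is a sibling
# folder checked out on its own branch, so parallel edits can't collide.
git worktree add ../french -b french
git worktree add ../german -b german
git worktree list   # shows the main checkout plus the two new ones
```

When one experiment wins, you merge its branch back as usual and clean up the other with `git worktree remove`.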
- CVClaire Vo
Yeah,
- 9:51 – 12:16
Terminal shortcuts and command-line efficiency
- CVClaire Vo
I was, I was just looking at this because we're in a very meta repository right now. You're in a folder called Codex, running Codex, talking to Codex. It's, you know, Codex all the way down, as they would say. But yeah, I think this is an important one to call out for folks that are not watching the YouTube and maybe are listening, is you can type, as you open a new Codex instance in the CLI, you can type this dash, dash and your first prompt right in one line, and this is like classic developer productivity. It's like I cannot be expected to press enter, wait, and then type my words. [chuckles]
- AEAlexander Embiricos
Exactly.
- CVClaire Vo
Um, so I love this. Okay, so then what you've done here is you've, um, used Codex, um, to do what we all use Codex for, which is not have to memorize Git CLI commands.
- AEAlexander Embiricos
Mm-hmm.
- CVClaire Vo
And you're creating two new Git worktrees off main. And then I'm presuming, as you work inside Codex, what you're saying is, like, "Okay, in the French worktree, do A, B, C, and in German, do X, Y, Z."
- AEAlexander Embiricos
Yeah, so let's, let's actually show working in those worktrees. So if you see here, I now have two folders.
- CVClaire Vo
Mm-hmm.
- AEAlexander Embiricos
I have one called French and one called German. So what I might just do is I might cd into French, and then, um, I might run Codex. Uh, let's go in, and I'll just say, "Translate the input field placeholder strings to French." Again, very contrived example.
- CVClaire Vo
Yep.
- AEAlexander Embiricos
So now I can open a new tab, and what I will do is I will now cd into the German folder instead, and I'll run Codex. I'll use my nice shortcut to just immediately give it the command, and I'll say, "Translate the input placeholder string to German." And so now Codex can go work on both of these changes at the same time. Here's the French one going, figuring out where to do this. Here's the German one going, and so that's awesome.
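The two-tab flow above boils down to this pattern (a sketch; it assumes the `codex` CLI is on your PATH and accepts a prompt as a command-line argument, as shown in the demo, and that the worktree folders already exist):

```shell
# Hypothetical helper: run one Codex prompt inside a given worktree folder.
# The subshell keeps the cd from leaking into your current shell.
run_in_worktree() {
  ( cd "$1" && codex "$2" )
}

# In practice, each call lives in its own terminal tab so both run at once:
#   run_in_worktree ../french "Translate the input placeholder strings to French"
#   run_in_worktree ../german "Translate the input placeholder strings to German"
```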
- CVClaire Vo
You know, we have a lot of people that, you know, go, go on social media, and they're like, "We're running... I'm running 15 instances of Codex across my terminal." They show all these tabs, but they're not sharing practically how they're creating separation of concerns across this code. And, you know, uh, I, I love that we're showing AI tools, but I think also what these coding tools are allowing people to do is come to software engineering a lot of times without the basics of things like Git. And so-... you know, if in addition to learning some of these AI tools, if I could tell anybody, like, learn the fundamentals of Git, [chuckles] and then you will be in a safe space when you're, when, when you're, when you're running with the power of these tools, I think is, is really
- 12:16 – 15:37
How OpenAI built the Sora Android app with Codex
- CVClaire Vo
important. Okay, so you know, what we've shown is you can spin up Codex kind of in one of two ways, which I also wanna call out. One, in your IDE through an extension plugin. Two, if you just wanna go straight into the terminal experience, great. You can, you know, ask it to do either explanatory tasks, which is, I, I use it a lot, even on code I have written myself. [chuckles] Like-
- AEAlexander Embiricos
Yep
- CVClaire Vo
... what, what, you know, what did I do here? Remind me how this works. As well as sort of discrete tasks, and then you can parallelize these, especially by using worktrees. So I think this is a, is a great, you know, kind of like foundational, how you would use some of the basics of this. But how is OpenAI using this for bigger features and bigger products?
- AEAlexander Embiricos
Totally. So actually, we just published a blog post about this that I think could be cool for people to know about, about how we used Codex to build Sora, the Sora app for Android, in twenty-eight days, and it immediately became the number one app in the App Store. So, you know, four engineers, twenty-eight days, number one app in the App Store, and it's not a trivial app. I was super impressed by the speed as I was watching this team go. And, uh, this article has a bunch of, like, really practical advice for how to do it. And, um, you know, this is really written for professional software engineers building, like, big production apps-
- CVClaire Vo
Yeah
- AEAlexander Embiricos
... like working in, like, complex code bases. And, um, a really cool sort of headline to take away here is that with coding agents, it doesn't get easier, but you just move way faster. And sort of the idea here is that, you know, the-- we didn't have four engineers just, like, purely vibe code this app in twenty-eight days. They didn't go in and just say, "Hey, Codex, build the Sora Android app and have it work." Actually, slight correction, they did try that, and it didn't work. [chuckles]
- CVClaire Vo
[chuckles]
- AEAlexander Embiricos
It didn't go in one prompt and build the entire Sora Android app. But instead, what they did is they thought really hard about the architecture that they wanted the app to have, and they used a technique called planning, which I would say is just a super practical thing that you can do. So let me see. I'm gonna pull up, uh, the Codex code base here, which is a slightly bigger code base, and what I might do in order to do this is I might start a task here. Oh, yes, sorry. One thing that I actually wanted to share: when you first install the Codex extension, it will appear here on the left, and I highly recommend dragging it over to here.
- CVClaire Vo
[chuckles]
- AEAlexander Embiricos
It's just a nice place for it to live. So there you go.
- CVClaire Vo
Use your news.
- AEAlexander Embiricos
Um, yeah, in VS Code, that's easy. In IDEs like Cursor, it is hard to find that, so I will let you explore where it is 'cause it, I feel like it even changes. But so I might say something like: Hey, we wanna make this, like, nontrivial change. Like, for instance, we have a, a TypeScript SDK, and maybe we want to write a Python SDK.
- CVClaire Vo
Mm-hmm.
- AEAlexander Embiricos
Right? So that's-- I don't necessarily want a one-shot Codex on that, although I could. It might work. Uh, so I might say something like this: Make a plan to build a, you know, Python SDK based off our TypeScript SDK. And this is a reasonable prompt. I could just send this. This would be fine, but actually, some of our power users at OpenAI have gotten fairly opinionated about how they like their plans to work, and we've actually published a blog post on really effective planning.
- 15:37 – 20:23
Using PLANS.md for problem solving
- AEAlexander Embiricos
So Aaron posted a blog post about using Plans.md, and it's super easy to use this technique. Basically, what you do is you go to this blog post, and you just copy this description. It's kind of like a meta plan. It's like, "Hey, when you plan, this is what a good plan looks like."
- CVClaire Vo
Yep.
- AEAlexander Embiricos
So for instance, a good plan is, like, self-contained, a good plan has milestones, and, you know, the agent should update the plan as it goes. So I have done so. I have copied that into, uh, a, a markdown file, Plans.md. There you go. I just copy-pasted that from the website. And so what I might actually do instead here is I might say, "Using Plans.md, make a plan."
- CVClaire Vo
Yep.
- AEAlexander Embiricos
Right. And so I might just send this prompt. Um, this will take a while because, if you look at the spec for Plans.md, it's very thorough, and this is actually something that Codex is really good at. Like, people love how thorough and, like, diligent Codex is. It's not the fastest tool out there, but it is the most thorough and, like, best at hard, complex tasks. And so I actually... Yeah, I could say, uh, let's say, put it in temp.md. I'm asking it to put it in a random file, uh, mostly because I did this ahead of time. So here is, uh, the plan that it came up with, and so you can see it's about a hundred and twenty lines that we could read through together. We see these to-dos that it wanted. Um, we see that it's identified the TypeScript naming conventions. This is great. Codex, thread, et cetera. Like, we actually are really intentional about how we name our SDK parameters, so it's really important for me to read these and verify them, make sure that it didn't get that wrong. Um, it's making, you know, various decisions in here, uh, that I might be happy with. Okay, great. And so now what I can do is I could start a new chat and say, "Implement the plan in sdkplan.md," if I'm just happy with it.
- CVClaire Vo
Yep.
- AEAlexander Embiricos
Um, and it would just go, and this would take... This is probably, like, a thirty-minute to one-hour task, but I would be pretty confident in the results of this task. Um, and that's how they built the Sora Android app as well. One very concrete recommendation is if you have a chat where Codex is, um, producing these plans, and you wanna change something, it's actually really nice for the model if you just use the same chat to ask for changes to the plan, and that way, it has all this context in its head when it's ready to get going.
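The PLANS.md flow can be sketched end to end like this (the meta-plan text below is a paraphrased placeholder, not the blog post's actual wording, and the prompts assume the `codex` CLI takes a prompt argument; the file names are just the ones used in the demo):

```shell
# Step 1: save the meta-plan describing what a good plan looks like.
# (Paraphrased placeholder; copy the real description from the blog post.)
cat > PLANS.md <<'EOF'
When asked to plan, write a plan that is:
- self-contained: executable without extra context
- broken into milestones with verifiable checkpoints
- kept up to date by the agent as work progresses
EOF

# Step 2: plan against it, writing the result to a scratch file for human review:
#   codex "Using PLANS.md, make a plan for a Python SDK based on our TypeScript SDK. Put it in temp.md"

# Step 3: read temp.md, verify naming conventions and decisions, then, in a fresh chat:
#   codex "Implement the plan in temp.md"
```

The point of the scratch file is that the plan gets a human review pass before any implementation starts.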
- CVClaire Vo
First of all, I, I love that you said that it has a head, so [chuckles] the model has its brain. Yeah, I mean, we see this, we see this a lot, this sort of, like, build a plan. Obviously, I love a spec. I love a PRD. I love a technical design document. I'm curious, just if we take the Sora app example. I'm presuming that you had a plan of plans, which is essentially like you look across the, the architecture of an app, and then you do kind of what we've always done in software engineering, which is you spec out the full thing you wanna do, you break it down into components or initiatives that you can execute on. And then where I think you're suggesting the velocity comes from, is any one engineer can do a detailed, you know, like technical spec and plan in, in partnership with Codex, and then have Codex execute the kinda like V1 of that plan for review very quickly. And so you don't kind of get to bypass the architectural thinking of, like, "How should this app be set up, and what, what capabilities do we want it to have?" And all that stuff, although you can use AI as a brainstorming partner for that. Um, but then once you have the kinda right-sized chunks of work, and they can be pretty meaty, I mean, like building an entire TypeScript SDK is not like a small initiative. It's not like adding a method to something. Um, then you can use this planning, kind of, uh, planning strategy to then get what you're gonna do all laid out and then have Codex execute it.
- AEAlexander Embiricos
Yeah, I think that's, I think that's, like, similar to how I think about it. So I would say right now, I, I kind of like the terms vibe coding and vibe engineering-
- CVClaire Vo
Mm
- AEAlexander Embiricos
... to be honest. My sense is that right now you have a lot of agency in how you spend your time as a developer or, you know, as a product manager. I think when you're gonna build a production app, like the Sora app, that you know you have to scale, it's really important that, you know, maybe you have a bunch of, like, Codex senior engineers, but you want that, like, architect, right, or staff engineer to think about the shape of the app. Um, and so that's critically important. So, you know, you're gonna have to think a lot about the shape of the app, and you're gonna wanna be really careful with review, and we can actually talk about some of that, how we've accelerated review at OpenAI, which is kind of becoming the bottleneck. You know, now that we can write so much code, like, the bottlenecks are kind of like thinking about what code to write and then making sure that code is good, reviewing it and landing it. At the same time, though, Codex can be really powerful for those places where you just wanna learn, and you don't actually need, like, a scalable production-ready app. So, for instance,
- 20:23 – 22:22
Using Codex for prototyping
- AEAlexander Embiricos
we use Codex a lot for prototyping. The designers on Codex actually have a mostly vibe-coded prototype of, like, all the Codex surfaces that they can just, like, design into with code, and then we use that to play around and see if we like things, and then if we do, then we'll often vibe code, like, a branch in the actual product. So a lot of things are just tried by designers there. And then, you know, sometimes the vibe-coded prototype is, like, pretty close to what we want, and so they'll just, like, land it with the help of an engineer, uh, or by themselves even. And then sometimes we're like, "Okay, this direction was good," or, "We, we learned some stuff. We iterated on the vibe-coded prototype. Now we know what we wanna build," and then we can actually go and give that, like, really well-defined spec to an engineer, who might, you know, rethink some of the fundamental assumptions and so end up having to use Codex to rebuild a lot of it from scratch. So I think there's, like, two flavors of acceleration. I think there's massive acceleration on learning, and then there's also massive acceleration on, like, executing.
- CVClaire Vo
Execution.
- AEAlexander Embiricos
Yeah.
- CVClaire Vo
Yeah. This episode is brought to you by Graphite, the AI-powered code review platform helping engineering teams ship higher-quality software faster. As developers adopt AI tools, code generation is accelerating, but code review hasn't caught up. PRs are getting larger, noisier, and teams are spending more time blocked on review than building. Graphite fixes this. Graphite brings all your code review essentials into one streamlined workflow: stacked diffs, a cleaner, more intuitive PR page, AI-powered reviews, and an automated merge queue, all designed to help you move through review cycles faster. Thousands of developers rely on Graphite to move through review faster, so you can focus on building, not waiting. Check it out at graphitedev.link/howiai to get started. That's graphitedev.link/howiai.
- 22:22 – 26:42
Deciding between what needs a plan and what doesn’t
- CVClaire Vo
So I, I have to ask one question, and then I do wanna go to code review, which is, y- you know, like, it's sort of this, you know, when you know, you know. Um, but how do you decide between what needs a plan and what doesn't?
- AEAlexander Embiricos
To some extent, it depends more on me than on the task. So obviously, the harder the task, the more likely you wanna have a spec. Um, but I also think it kind of depends, like, what you're up to at that time. For instance, like, if I just wanna get something done, like, quickly, I might not have time to, like, wait for a plan and then go back and forth, so I might kind of throw Codex at it, but I might just do it four times in parallel instead. This is actually a thing we do. Like, you can also use Codex, um, in the cloud, so on web, where it'll, like, run on its own computer, and that has a feature called Best of N, where it'll just, like, do the same task four times. And so often, like, you can just have Codex explore, uh, instead of, like, exploring to do a plan, and then you collaborate, you just have it try four different attempts-
- CVClaire Vo
[chuckles]
- AEAlexander Embiricos
... and find out what works best. And, you know, I also do that with worktrees locally as well. So I guess the better answer to your question is, the harder the task, uh, the more you want to plan. But the lazy answer to your question is, uh, also it depends if I have time to wait for a plan or not.
- CVClaire Vo
[chuckles] I like that. I know one of the things that I have found myself doing, which I think is really funny, is as these, you know, longer-running coding models come out, GPT-5.2 being among them, I'm, like, waiting a lot more. I gotta f- I, I'm, like, trying to find ways to fill my time, and as somebody who used to have this, like, fancy executive job where I, like, really had a manager's schedule, and then over the past two years, I've been, like, builder life, and now I'm like, "Damn, I'm back to manager schedule." [chuckles] Like, I send the task off, and somebody else, you know, quote-unquote, "does it," and then I gotta find something to do with my time, and I refuse to add more meetings to, to my list. So, um, I am with you. Do I have the time and patience for a plan?
- AEAlexander Embiricos
Right.
- CVClaire Vo
Yeah.
- AEAlexander Embiricos
I mean, a lot of the engineers on the Codex team will basically run two things that they're building in parallel, not more than two.
- CVClaire Vo
Yeah.
- AEAlexander Embiricos
Like, it's usually two.
- CVClaire Vo
Yeah.
- AEAlexander Embiricos
Um, and so they'll kind of, like, be thinking about what to do on, on one side, and then- and you, by the way, this might just be, like, two different worktrees and two different-
- CVClaire Vo
Yeah
- AEAlexander Embiricos
... instances of their IDE open. It could be something like that. And, uh, they might just be thinking about and collaborating with Codex in one while it's working in the other, and just juggling two seems to be manageable for sort of normal people.
- CVClaire Vo
Yeah.
- AEAlexander Embiricos
Uh, juggling more than two seems quite hard for normal people. But my view on this from a product direction perspective is we don't really wanna ask humans to juggle. Like, that's not fun for many people. Some people like StarCraft and code, but, uh-
- CVClaire Vo
Quick pause!
- AEAlexander Embiricos
We're training... Yeah.
- CVClaire Vo
Quick pause. I love StarCraft, which is why I feel like [laughing] I'm really good at all this right now.
- AEAlexander Embiricos
Yeah, yeah, yeah. Um, I think it's actually kind of, it's kind of an apt analogy, but that's not-- I didn't come up with it, I forget who did. Um, but, uh, what we're trying to do is just make Codex faster and faster.
- CVClaire Vo
Yeah.
- AEAlexander Embiricos
And we are also trying to basically set it up so that you don't have to do the waiting. Like, as the models get smarter and smarter, they can take on harder and harder tasks. Like I just heard from Navin at Every this morning, who was sharing a demo of a bug that no model could fix, and then 5.2 came out, um, yesterday, and, um, he threw 5.2 at it, and it thought for 37 minutes and was like, "This is the bug." And then, in fact, that was the bug, and he got the bug fixed, right? So as we have smarter and smarter models, there are gonna be more instances where you wanna wait. But I, I think that's our job as the product builders around the model to make it so that even when the model is thinking, you're not waiting for it to think.
- CVClaire Vo
Yeah.
- AEAlexander Embiricos
Okay?
- CVClaire Vo
Or you know that you're waiting, and you feel good about it. I think that's one of the challenges I've had with, with some of these, where the thinking time is long, is I find myself, and bless it, it's like when I, you know, worked with human software engineers, I find myself being like: "How, how's it going? How's, how's it going? You still, you still on it? Like, you still good?" And so I do think it's a really interesting product problem because there is, you know, useful latency in, in these models. Um, but as a product and designer, being able to expose that latency and that reasoning and the progress in a way that makes people not feel antsy about it, I think is still a challenge out there for, for you all and for everybody else building these kinds of tools.
- 26:42 – 28:08
How to multiply the impact of Codex
- CVClaire Vo
You know, this was- this has been super helpful on sort of like the basics of Codex. I would love to hear one or two integrations with other systems or tools that you've found, like, really multiply the impact you can get out of Codex.
- AEAlexander Embiricos
I think the biggest one by far is going to be GitHub and code review. Um, and then there are some, are some others as well. This is just a nifty graph while I'm here, about 5.2, uh, the model. Um, well, let's take a quick digression, and I'll show you because I just think it's super cool. Basically, what this graph shows is that, uh, 5.2, when you give it as long as it wants to think, uh, is an amazingly intelligent model at SWE-bench Pro, which is an eval of software engineering tasks. But the x-axis is pretty interesting. Basically, what it shows is the number of output tokens that the model took to perform these tasks, and so it's kind of like, how long did the model have to think? And when we say: "Hey, you can think as long as you want-ish," uh, it's really smart. But the other cool thing is we're able to say: "Hey, like, we actually don't want you to think that hard," or, "We want you to, like, kind of answer quickly." And so it's getting, like, even higher results than, like, say, this previous model here, 5.1 Thinking, but in significantly less time. And so this is kind of what we're trying to build, going back to what we were saying about waiting, right? We wanna just, uh, get you the same result much faster, and then get you more intelligent results, you know, the more time you give the model.
- CVClaire Vo
Yeah.
- AEAlexander Embiricos
Um-
- CVClaire Vo
Get, get the right result in the appropriate amount of time, right?
- AEAlexander Embiricos
Yeah. Exactly.
- 28:08 – 30:01
Implementing automated code review with GitHub
- AEAlexander Embiricos
So one thing you can do with Codex is, uh, you can ask it for code review. This is actually super easy to do, even without, uh, integrating with GitHub. Uh, we could just be in here. Let's just say that it had- it's written some code. Um, I'm gonna kind of ignore, uh, what, what happened there, and I'm just gonna pretend that I wanted to review this code. Uh, so I could type /review, uh, and basically ask it to start reviewing this code, and this is something that people really love. And, you know, right now it does feel like when you put the model in a certain mindset, like, "Hey, you are a reviewer," and you give it a different conversation context than the model that wrote the code, you know, you'll just get, like, even better critiques than you might get from a human engineer. Partially because this model, you know, has a lot of time to read all the code and, like, maybe even execute code to, like, validate those changes. So this is, this is just something super useful that I, I recommend doing, and, like, many engineers on Codex will have Codex do work and then multiple times ask it to either review its code or critique its code or just make the code more elegant, and that's just, like, a massive accelerant. So, so let's say that you like this, right, and, and your team has a practice of doing review. Something that you can go do is actually enable automated code review in GitHub. And so here, when this PR was pushed, Codex automatically, without anyone having to prompt it, and without anyone having to have a computer running, this is just, you know, in the cloud, Codex went, took a look, and found an issue with the code. And the hit rate on these is, is really high. Like, we built this feature so that it only points out issues that it's, like, very confident are issues, 'cause, you know, the sort of the principle here is just, like, human attention is so scarce, we really wanna protect it.
But when it finds a, a really important issue, it'll post here, and this is where you start to feel the AGI a little bit. It found this issue, and then Romain basically replied, like: "Hey, Codex, can you fix it?" And then Codex went and fixed it, right?
- CVClaire Vo
Yeah.
- AEAlexander Embiricos
And so we can get into this kind of loop like that. So that would be my number one integration to give. The number two might be Slack and Linear.
- 30:01 – 32:08
Codex adoption at OpenAI
- CVClaire Vo
Well, l- I, so I, I love this flow, and I think, again, um, optimizing... I'm actually running a 5.2 branch review right now. Um, I'm not running it in Codex, but I will do a, a comparison on the two experiences, but I'm doing a very much, like, compare this to base. Tell me-... what, what we did, what, what, what we did here, if there's anything-
- AEAlexander Embiricos
Mm
- CVClaire Vo
-I need to be aware of. I do like these in GitHub code reviews. I do feel like where I have found the highest quality is reducing noise in these, so it's really great that you have kind of focused on confidence, focused on what, what bugs really are gonna matter, and then this loop of kind of, can, can you fix it? And so are you running this on kinda all your repos? Has this become, like, how code gets reviewed?
- AEAlexander Embiricos
Yeah. I mean, Codex is, is just everywhere at OpenAI, which has been really cool to see. You know, I feel like when I hear a story of s- of some tool being used everywhere, I'm always like, "Ah, a little skeptical, these people are biased." So the thing I can tell, tell the audience here is that earlier this summer, Codex was used by around half the company, which is like, I don't know, a pretty low number, right? Half. Um, we're now at the point where, um, basically, like, all of technical staff, nearly all of technical staff, is using Codex constantly. And so it's funny, 'cause we don't even have this comparison point. Since everyone's using Codex, it's, like, hard to compare the people using Codex and the people not. But there was this period of time where we were seeing that, like, the people using Codex were, like, 70% more productive, if you looked at PR volume. Obviously, PR volume isn't, like, the best thing to measure, but it's a thing you can look at. Um, and now with... That metric doesn't mean anything anymore, 'cause everyone uses Codex. Codex code review itself is enabled on pretty much every single repo at the company and re- reviews, like, pretty much all PRs. Um, and, you know, it's one of those features where we were a little bit nervous when we shipped it about how people would feel, but it was just an immediate hit, and, like, people really like it. Uh, you know, maybe this is a bit of a segue, but, you know, speak product person to product person, like, something that we've been thinking about is, if our mission is to deliver the benefits of AGI to all humanity,
- 32:08 – 36:38
Challenges and innovations in AI integration
- AEAlexander Embiricos
I, I believe one of the biggest limiting factors is, like, do people wanna type the prompt or not? [laughing] You know? And, like, I don't like typing, so, you know, I, I would be too lazy to do this. And so we were thinking a lot about, like, okay, what are things we can do for teams that are just useful without anyone having to do any work? And so we, we tried a few things. Code review, as I showed you here, is one of the things we tried, big hit. Um, we tried some other things that were pretty interesting. Like, we built a feature that would automatically attempt to revise, um, the PR when, you know, you got a code review feedback-
- CVClaire Vo
Yeah
- AEAlexander Embiricos
... from someone else. And, uh, you know, maybe I'd, I'd be interested to try that again, but interestingly enough, that feature was not super popular. The hit rate was lower, 'cause, like, PR feedback, I mean, sometimes it's nits, but often it's, like, kind of in there. You need a lot of human context to understand the PR feedback, and so Codex wasn't acing it, and the sort of hit rate wasn't high enough to be worth the, like, email you get every time an event happens on GitHub. Whereas code review, you know, we were really careful with, like, how often it did things, and we made sure, for instance, this is one where Codex didn't find any issues. It doesn't even notify you, it just thumbs up emoji. You know you're done.
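The design principle described here, that the reviewer only surfaces findings it is very confident about and stays silent (a thumbs-up) otherwise, can be sketched as a toy filter. This is an editor's illustration of the gating idea, not Codex's actual implementation; the names and the threshold value are made up:

```python
# Toy sketch of "only surface high-confidence findings" gating.
# HIGH_CONFIDENCE and triage() are illustrative, not Codex internals.
HIGH_CONFIDENCE = 0.9

def triage(findings):
    """Keep only findings the reviewer is very confident about.

    `findings` is a list of (message, confidence) pairs. Anything below
    the threshold is dropped, so scarce human attention isn't spent on
    noise; a clean result collapses to a silent thumbs-up.
    """
    kept = [msg for msg, conf in findings if conf >= HIGH_CONFIDENCE]
    if not kept:
        return "👍"  # nothing confident enough to report
    return kept

# A noisy review collapses to the single confident comment:
print(triage([("possible off-by-one in loop bound", 0.97),
              ("style: prefer f-strings", 0.4)]))
# → ['possible off-by-one in loop bound']
# A PR with only low-confidence nits gets the thumbs-up:
print(triage([("maybe rename variable", 0.3)]))
# → 👍
```

The same shape explains why the auto-revise feature flopped: when the model's confidence on PR-feedback interpretation is rarely above the bar, almost everything gets filtered, and what's left isn't worth a notification.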
- CVClaire Vo
Yeah, I, I mean, I've had this experience, too, where it's, it's interesting. I have also used automated code review, and I have attempted that, like, full closed loop, and I have also been dissatisfied with not just the, the bug fixes. Sometimes they're fine, sometimes they're not. But I, I often feel like putting a human in the loop that says, "I'm pretty sure this is fine because X, Y, Z," or, "Just remember we did this because A, B, C," and sometimes there's a little context lost. But the other thing from a product perspective, I think is interesting on these, like, proactive versus reactive agentic experiences, is if you have a full agent loop, the human bar for quality is extremely high, and so dissatisfaction and frustration can bubble up very quickly. It's one thing if, like, your code review bo- you know, bot says, "This is broken," you go fix it, and then you get it reviewed again and it nits you. You're like, "Okay, I didn't do exactly what you wanted." But if you had the experience where, like, your code review bot, you know, raises something, it fixes something, and it nits itself, that gets [chuckles] really annoying.
- AEAlexander Embiricos
Yeah, totally.
- CVClaire Vo
And so I do... Again, this is, like, a little bit of an agentic product design challenge that, that we're now gonna have that people, I think, need to pay attention to, which is, like, how do you design for latency? How do you design for perceived quality and bars when humans are involved, when they're not involved? And then one last thing I wanna say to your comment, which is, I mean, if, if our human fingers are no longer valuable and used, what role are we gonna play? Like, protect the typers. [laughing] But, um, I, I, I mean, I get what you're saying, which is almost all the friction right now in my product development, software development flow is, like, literally writing the first prompt.
- AEAlexander Embiricos
Right.
- CVClaire Vo
It's like sitting down and just going, "Uh, now let's, like, let's do the thing now," and it's such a funny shift from where we've been before.
- AEAlexander Embiricos
Yeah, I mean, it's, it's interesting to think about, right? Like, you know, obviously, our mission is to just, like, massively accelerate every single developer and, you know, more broadly, anyone who's doing anything where an agent using a computer can help. And so yeah, the question of what our role is int- is interesting. Like, I like to joke that the limiting factor is typing speed. I think that's half true. So, like, I think for things like, "Review this code, please," the limiting factor is, is typing speed. There's a lot of just, like, micro places where an agent could help you. Um, but then I think the other half is actually thinking, right? Like, now that we can just have ubiquitous code, and we can basically prototype things, like, trivially, the hard parts, I think, become, like, deciding what actually should make it in, thinking, like, what a product should do, like, knowing your customer, actually, and then, you know, if you're building a complex system, like, being really thoughtful about, like, the architecture of that system-
- CVClaire Vo
Yeah
- AEAlexander Embiricos
... and kind of curating the agent work. So-
- CVClaire Vo
Yeah
- AEAlexander Embiricos
... you know, I think it's-- I talk to developers who are really motivated by, like, seeing people use what they build, and I think those developers are increasingly just, like, thrilled, right? And then I also talk to developers who just love, like, the feeling of writing code. I, I, I don't know, I experience joy from both, and I think that's a place where, you know, we think constantly about, like, "Okay, how do we make this feel as fun as possible?"... uh, right? Like setting up dependencies kind of sucks, but you have to do that for your agent, so, like, how do we make that easy? Reviewing kind of sucks, so, like, how do we make
- 36:38 – 43:49
Recap and the Codex harness
- AEAlexander Embiricos
that easy?
- CVClaire Vo
Great. Well, just to recap, 'cause we've done a lot here so far, before we get to lightning round, we have shown installing Codex as an extension, setting up tasks, setting up worktrees, using Plan, and in particular, the post on how to plan on the OpenAI [chuckles] blog, to generate plans for more complex implementations, especially when you need longer-running tasks, um, and then automations around, uh, code review and bug fixes. The things we didn't call out, but I think are really important, are you can basically use Codex wherever you want. So we showed it in VS Code, we showed it in the terminal. You can get it on the web, you can get it in Slack, [chuckles] you can get it in Linear, it can kick off in GitHub. Like, I do love this idea of kind of like Codex anywhere is also really nice. So again, if you're intimidated by or don't understand VS Code, great, kick it off in the web. If you love your command line tool, great, let's use, use a couple of those keyboard shortcuts that we showed. And so I just think this is a great kind of like starter flow that shows how flexible this platform is, and how it can meet, you know, kind of like a bunch of people at a, a, a variety of levels of tasks. So I'm actually gonna start there with our lightning round questions, which is, you know, you just released, um, 5.2. You showed SWE-bench. We're clearly, you know, like model wars every week. Um, I am really curious about harness wars. Like, why does the interface to the model, like Codex, matter so much? And, you know, we've seen a couple things that you've built into the platform, you know, PR review or code review, you know, little, like, small UX fixes that make, make it easy to use. But, like, where do you feel like harness differentiates in your experience using these coding tools?
- AEAlexander Embiricos
I think there's two places. One is just the quality of model work, and then the other is in the user experience. Right, so taking the quality of model work, I, I presume many people listening to this podcast, you know, tinker with models or have friends who are building models, and, like, some people just are... You know, you've- they're model whisperers and they know, they know how to use a model, and some people are model whisperers for, like, a specific model, but then maybe not for another model. And one of the things that's, I think, very true is that these models are changing all the time, right? Like, we've been-- I've lost track. I think we're shipping a new model, like, every two weeks recently at OpenAI, right? And they're-- Each model we ship is better than the last model. It's, like, super exciting. Um, but they're also evolving, right? They have new capabilities that are kind of hard to keep track of unless you spend all day on, you know, Twitter/X. And so I think that there's kind of two things here. The first is, like, you need to know how to get the most out of the model. You need to know, for example, that OpenAI models, we kind of have this, like, very, again, AGI-pilled way that we train models. We kind of just give them access to a shell, and we say, like, "Go and do whatever you think you should do in the shell." And so we see these, like, really interesting emergent behaviors where, you know, sometimes the model will decide to write a Python script to make many, many edits to code, and then we have debates about, well, is that a good idea or not? Like, would we prefer it if the model was, like, plodding through the edits, or do we like it running a script? But either way, that's, like, a thing you should know. And so one of the cool things about Codex is that we're building the harness in open source, and so you get to see the updates we're making for each model that we ship to make the most out of the model. 
And I can say, like, every time we ship a model, engineers from the Codex team will go, like, test it, think about it, talk to the research team. I mean, it's their-- it's something they're just working super closely together to figure out how to make the most out of the model. And sometimes we'll, we'll ship a new capability, like model's getting better at parallel tool calling, let's see how that works, or recently we shipped something called compaction, where the model can basically-- where we can basically have the model, like, prep-- like, start a new conversation with itself with a fresh context, and what it'll do is just give itself just enough, just the right information, so that to the user, it just feels like it's one conversation rather than, like, two conversations. And so when, when, when we build features like that, by building both the harness and the model together, we can be much more opinionated about what to do with the model, and then we can make the actual outcomes, like, way better for users. And so part of why the Codex CLI is open source is so that anyone who wants to get the best out of Codex models, and actually just OpenAI models generally, can just go observe how it works. They can use the Codex SDK if they want and just, like, not even touch the harness, just, like, delegate to us for that. Or if you want to build your own harness, you can just go copy/paste parts of code. So we do this all the time, like, I'm in a bunch of, like, Slack DMs with customers, and we'll just send them code pointers, like, "Oh, yeah, this is how we do this. Just, like, just copy this code, please." So that's on the just, like, having higher quality model outputs, but also keeping pace with the pace of innovation from our research team. That would be, like, why I think the Codex har- harness is awesome there. The other side is, is, um, is product overhead. 
For instance, um, earlier this year, most of the sort of, like, really, uh, powerful agentic flows that people were using, like Codex and other, were in the CLI, right? And this is super basic, but, like, I, I, I spoke to many people who don't really like spending all day in terminal. Like, I love using the terminal, but I spoke to many people who don't, and I spoke to many people who really like seeing the code that is being edited at the same time-
- CVClaire Vo
That, that's me. I'm, I'm a co- I'm a code reader. I like to read my code.
- AEAlexander Embiricos
Right. And so, you know, that's a very, very basic thing, but it's a place where just building the right product experience unlocked a ton of growth for us. And so, you know, empirically, I could say that we see many, many more people who like looking at the code that's being written at the same time as the agent, uh, than we see, like, just, like, running in terminal, like, on the side. Um, so-... uh, I think, you know, that's a very basic example, and then we were kind of touching on this point of, like, latency for a more advanced example. Like, I think if we can harness the model right, we can make it so that you can deploy the model to help you with, like, hundreds of thousands of things a day, but without you having to type, right? But also without it being annoying to, like, filter these outputs, and without you having to wait, because whenever it's helping you with something, proactively, it's sort of doing so on its own computer and only letting you know when it has something great. So my view is actually that even as all these models are progressing, if, like, let's just say that stops, which it won't, there are many years of product building to do to just, like, get the harness right and useful for people.
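The compaction feature Alex describes, folding an older conversation into just enough carried-forward information that the next model call starts from a fresh, smaller context, can be sketched as a toy. This is an editor's illustration of the general technique only; the function names are made up, and the summarizer here is a stub standing in for an actual model call:

```python
def summarize(turns):
    """Stub standing in for a model-written summary of the older turns."""
    return "summary of " + str(len(turns)) + " earlier turns"

def compact(history, budget_chars=200, keep_recent=2):
    """Once the transcript exceeds the budget, fold everything except
    the most recent turns into one synthetic summary message, so the
    conversation still reads as continuous to the user while the next
    call sees a much smaller context."""
    if sum(len(t["content"]) for t in history) <= budget_chars:
        return history  # still fits: no compaction needed
    old, recent = history[:-keep_recent], history[-keep_recent:]
    return [{"role": "system", "content": summarize(old)}] + recent

history = [
    {"role": "user", "content": "a" * 100},
    {"role": "assistant", "content": "b" * 100},
    {"role": "user", "content": "c" * 100},
    {"role": "assistant", "content": "d" * 100},
]
print(len(compact(history)))  # → 3 (one summary message plus the two recent turns)
```

The design point from the transcript is that because the harness and model are built together, the team can be opinionated about what "just the right information" means for each model, which a generic wrapper can't be.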
- CVClaire Vo
Well, that's great, and I do wanna make sure people did not miss this tip, which is, if you're trying to figure out how to get the most out of, out of these new models, you go, go peek under the hood at, at Codex open source and just see what ki- I mean, I think that's the other thing, is what kinds of changes do you have to make when a new model comes out? Especially if you're a builder out there, maybe you're not building a coding tool, but you're building a SaaS product that uses these models. When they come out, being able to observe how the creators of the models actually maximize their unique strengths, I think is a really valuable thing that I think people really
- 43:49 – 49:09
Atlas and personalized AI interactions
- CVClaire Vo
underestimate. Okay, my second question: I spied with my little eye, Atlas. Tell me your favorite Atlas use cases that you think people underappreciate.
- AEAlexander Embiricos
Ooh, the first one is kind of boring, but it's, it's really tr- really true, which is I have started just asking Chat for everything instead... Chat, Chat being ChatGPT. Um, I just ask Chat for everything, uh, because I get answers that are, like, really catered to me because I talk to Chat about everything. You know, I, I'm a weird person. Like, if I ask a question, and then, um, I, like, make a decision based off it, I tell it what decision I made, because then it remembers, and so then the next question I- answer I get is even better. And so just simply, like, my workflow for anything that's not code, is I go to Atlas, I Command+T to open a new tab, and then I just type whatever I want, and I get an answer, and I often follow up. Like, for me, that's the sort of magic thing about using an LLM. Um, so being able to just, like, ask your question and then follow up, and then maybe navigate to links, like, super boring, but I, but I love that. Um, my other sort of favorite feature is, um, the fact... is Sidechat. So basically, any page you open... Like, should I sh- maybe I can show this. Um-
- CVClaire Vo
Yeah, yeah, sure.
- AEAlexander Embiricos
Yeah.
- CVClaire Vo
A- and while you're doing that, I have to call out that you have settled a debate that I saw today on X, which is, like, should you tell your AI when it's done something right or wrong after it's done it? Like, once something's fixed a bug, I'm the person that's like, "Great, it looks awesome. Thanks. That fixed [chuckles] ..."
- AEAlexander Embiricos
Yeah.
- CVClaire Vo
But I do think that I, I, in my mind, I've convinced myself that closed loop creates some context that's like, "Yes, I did this particular thing, right or wrong, user has accepted it, and that in some future world, it's gonna make my life better." [chuckles]
- AEAlexander Embiricos
Yeah, I mean, so I think there's two reasons to do that. The first is just, like, memory.
- CVClaire Vo
Yeah.
- AEAlexander Embiricos
Right? Like, if you have memory enabled-
- CVClaire Vo
Mm
- AEAlexander Embiricos
... which most people do, then you'll get a better answer. A very concrete example is I was on holiday, and some plans changed, so we were deciding where to get dinner. Uh, and so we just asked Chat for dinner recommendations. Uh, to be clear, we also, we like food a lot, so we also, like, searched. Um, but it's interesting to get- ask Chat because it knew, like, where we were staying, and it knew what food we'd had, like, the day before or whatever, and so it just gave, like, a really bespoke recommendation, and that was cool. Uh, I think my other sort of hot take reason to do this is I think it's important to be polite to AI.
- CVClaire Vo
I agree!
- AEAlexander Embiricos
I know that this is not an official company stance, just to be clear.
- CVClaire Vo
[laughing]
- AEAlexander Embiricos
But my sort of meta reason here is I just think it's important to be polite to everyone.
- CVClaire Vo
I agree.
- AEAlexander Embiricos
And I think that if you start not being polite to Chat, I think it can wear off on you, and you just start not being polite to other people in your life. And, like, we're adults. Like, imagine kids, right? They hear us, like, talking to our AI in some way. They're gonna go treat someone who they, like, you know, they're children, they don't know, in some, like, not polite way. So that's kind of my, my hot take thing, like, be polite.
- CVClaire Vo
I could not agree more. I... You know, P- I am a "please, thank you, good job." And, and honestly, it's not because of the AI's humanity, it's to protect my own humanity, which is if I get used to being a jerk to anything human-like, there is no way that does not bleed into how I think about people, speak to people. Uh, clip this, pin it to the top of the YouTube channel, be polite for you, if not for AI.
- AEAlexander Embiricos
Hundred percent. That, that resonates a lot. It's like our, our humanity is defined by how we treat others, not how they treat us, right?
- CVClaire Vo
Exactly.
- AEAlexander Embiricos
So Sidechat. So basically, um, I can go in here, I can click this button, and I can ask questions about the page, right? So I could be like, "What's great about GPT-5.2?" Uh, you might joke, like, and be like, "Why are you asking AI to summarize this article?"
- CVClaire Vo
[chuckles]
- AEAlexander Embiricos
But, like, oftentimes I'll be at work, and someone will send me a thing, and be like, "Thoughts?" And I just like, "I don't have time." [laughing] So it's like, what is this, right? And then, and, you know, and then I can have a conversation, though, right? So then I can be like, "Oh, like, interesting," like, you know, um-
- CVClaire Vo
"Hey, Chat-
- AEAlexander Embiricos
Yeah.
- CVClaire Vo
... what do I think about this?" [laughing]
- AEAlexander Embiricos
I mean, when I ask-
- CVClaire Vo
I know
- AEAlexander Embiricos
... my questions, it often grounds itself on what I've ha- talked to it about before. It's like: Well, since you are a person who likes, you know, like this, you probably would be interested in this detail. Um, so I mean, this, this is not maybe the best example because I'm asking for a summary, but oftentimes if I'm looking at, like, numbers or math-
- 49:09 – 53:04
Conclusion and final thoughts
- CVClaire Vo
Um, okay, last question, and we maybe already covered this, but, um, you- we've established you're polite to AI, but when it is not replying, not doing what you want, not remembering, what is your prompting technique to get it back on track?
- AEAlexander Embiricos
Yeah, I mean, so first off, I have a bit of a weird job in that if I notice the AI not replying, I have to go probably file a bug or, like, start a sev, sev being, like, a word for an incident. So I, you know, yeah, I have to go do those things, but, um, I think context is everything. So if I see the agent not doing what I want... So I, well, I guess one really tactical tip is I, I don't usually ask for things from the agent without asking for context, uh, without giving context. Because I'll say, like, "Hey, I want you to like, like, m- change this UI from this to this so that, you know, users do this," or like, "because we don't want people to be confused about X, Y, Z." And I often- it's funny, I think pe- like, another hot take, I think PMs are the best prompters, um, because we're used to not being the expert in what we're doing, and we're used to not being the most intelligent person in the room, right? And so usually we can just, like, maybe suggest, but we don't even know if that's right. So I, I, I, I sort of work with Codex in that way. I'll be like: Hey, can you, like, make this more elegant? And I won't say what I want, because I actually- it'll look at the code, and it'll know better than me. Um, so tip number one is give a lot of context and actually get really good at describing the level of ambiguity of your request. Like, do not create false precision in your prompt if you don't actually care exactly about what, what the outcome is. And then the second thing is, like, if that doesn't work, and you explain why again, and it still doesn't work, then I just start a new chat. Um, and you can do things, these... This is a very, like, advanced user thing that I don't think anyone listening to this will ever do, but Codex is a very open product. It stores its conversation logs, uh, in, like, your home directory in a .codex folder in a subfolder called Sessions, so like .codex/sessions. 
So you could just go say, like: "Hey, I started a new session 'cause you got confused. I wanted you to do this because of this. Go read your previous session-
- CVClaire Vo
Sure
- AEAlexander Embiricos
... understand what's going on, and then, and then like, you know, continue from there."
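The tip above relies on the `~/.codex/sessions` path Alex mentions. A small sketch for finding the most recent log to point a fresh chat at; the recursive search and the `latest_session` helper name are editor's assumptions, so adjust to whatever layout you actually see on disk:

```python
from pathlib import Path

def latest_session(root=None):
    """Return the most recently modified file under the Codex sessions
    folder (the ~/.codex/sessions path mentioned above), or None if the
    folder doesn't exist or holds no logs yet."""
    root = Path(root) if root else Path.home() / ".codex" / "sessions"
    if not root.is_dir():
        return None
    files = (p for p in root.rglob("*") if p.is_file())
    return max(files, key=lambda p: p.stat().st_mtime, default=None)

# Paste the result into a fresh chat, roughly as described above:
#   "You got confused, so I started a new session. Read <that file>,
#    understand what's going on, and continue from there."
print(latest_session())
```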
- CVClaire Vo
Ah, I love it. Ending this episode with a hidden hot tip, which we didn't get to through our, our Codex walkthrough, which is all your, all your sessions are stored locally, so just ask it, ask it to go read them. This has been really fun. Alex, where can we find you? And other than reporting bugs, how can we be helpful?
- AEAlexander Embiricos
I am hiring PMs, so if you're interested, please, uh, apply on the job site and also hit me up on socials. Uh, we are hiring generally a lot on Codex. We do love bug reports, and we do love feedback, and actually, it's already in open source, so I don't mind talking about it. We actually are also releasing a bunch of new configuration abilities for Codex, like the ability to allowlist commands, or skills. So if you wanted to, like, help build Codex skills or, like, tell us what configuration you want, that would be very helpful. And lastly, just check it out. Codex is awesome. So, uh, you can find me, uh, on Twitter. I'm @Embirico, E-M-B-I-R-I-C-O, and, uh, at the r/codex subreddit, we are there all the time and also love chatting there.
- CVClaire Vo
Amazing. Well, thank you for joining How I AI.
- AEAlexander Embiricos
Cool. Thanks for having me. [upbeat music]
- CVClaire Vo
Thanks so much for watching. If you enjoyed the show, please like and subscribe here on YouTube, or even better, leave us a comment with your thoughts. You can also find this podcast on Apple Podcasts, Spotify, or your favorite podcast app. Please consider leaving us a rating and review, which will help others find the show. You can see all our episodes and learn more about the show at howiaipod.com. See you next time! [upbeat music]
Episode duration: 53:04
Transcript of episode xeZDHGjG5zM