
GPT-5 and Agents Breakdown – w/ OpenAI Researchers Isa Fulford & Christina Kim

GPT-5 just launched, marking a major milestone for OpenAI and the entire AI ecosystem. Fresh off today's live stream, a16z's Erik Torenberg was joined in the studio by three people who played key roles in making this model a reality:

- Christina Kim, Researcher at OpenAI, who leads the core models team on post-training
- Isa Fulford, Researcher at OpenAI, who leads the deep research and ChatGPT agent team on post-training
- Sarah Wang, General Partner at a16z, who has helped lead a16z's investment in OpenAI since 2021

They discuss what's actually new in GPT-5, from major leaps in reasoning, coding, and creative writing to meaningful improvements in trustworthiness, behavior, and post-training techniques. We also discuss:

- How GPT-5 was trained, including RL environments, and why data quality matters more than ever
- The shift toward agentic workflows: what "agents" really are, why async matters, and how it's empowering a new golden age of the "ideas guy"
- What GPT-5 means for builders, startups, and the broader AI ecosystem going forward

Whether you're an AI researcher, founder, or curious user, this is a deep-dive conversation you won't want to miss.
Timecodes:
00:00 ChatGPT Origins
02:13 Model Capabilities & Coding Improvements
04:11 Model Behaviors & Sycophancy
06:15 Usage, Pricing & Startup Opportunities
08:03 Broader Impact & AGI Discourse
16:59 Creative Writing & Model Progress
31:50 Training, Data & Reflections
36:25 Company Growth & Culture
41:39 Closing Thoughts

Resources:
Find Christina on X: https://x.com/christinahkim
Find Isa on X: https://x.com/isafulf
Find Sarah on X: https://x.com/sarahdingwang

Stay Updated:
Let us know what you think: https://ratethispodcast.com/a16z
Find a16z on Twitter: https://twitter.com/a16z
Find a16z on LinkedIn: https://www.linkedin.com/company/a16z
Subscribe on your favorite podcast app: https://a16z.simplecast.com/
Follow our host: https://x.com/eriktorenberg

Please note that the content here is for informational purposes only; should NOT be taken as legal, business, tax, or investment advice or be used to evaluate any investment or security; and is not directed at any investors or potential investors in any a16z fund. a16z and its affiliates may maintain investments in the companies discussed. For more details, please see a16z.com/disclosures.

Christina Kim (guest) · Erik Torenberg (host) · Isa Fulford (guest)
Aug 8, 2025 · 42m · Watch on YouTube ↗

EVERY SPOKEN WORD

  1. 0:00–2:13

    ChatGPT Origins

    1. SP

      I mean, I think it's pretty unique at OpenAI to be able to work on something that's so generally useful. I mean, it's like everything they tell you not to do at a startup. It's just like your user is anyone.

    2. CK

      You just kind of take it for granted that you literally have this, like, wizard in your pocket.

    3. SP

      We're trying to make the most capable thing, and we're also trying to have as- make it useful to as many people as possible and accessible to as many people as possible.

    4. CK

      If this exponential is true, like, there's not really much else I want to spend my life working on. I think we hear this with GPT-5 internally when people are testing it. They're like, "Oh, I thought I asked like a really hard question."

    5. SP

      Yeah. [laughs]

    6. CK

      I feel, like, a little bit insulted that it-

    7. SP

      [laughs]

    8. CK

      ... thought for like two seconds.

    9. SP

      [laughs]

    10. CK

      Or like when it doesn't even wanna think at all.

    11. SP

      Yeah. [upbeat music]

    12. ET

      It's a slow news day. Uh, not much going on for you guys.

    13. CK

      [laughs]

    14. ET

      Thank you for, uh, thank you for coming on. No, obviously, uh, you know, Tina, you were just on the, on the live stream. We're recording, uh, d- day of. Congratulations.

    15. CK

      Thank you.

    16. ET

      Um, for those who are unfamiliar, why don't you introduce, uh, what you guys do at, at OpenAI?

    17. CK

      Yeah. Uh, I'm Christina. I lead, uh, the core models team on post-training.

    18. SP

      I'm Isa. Um, I lead the deep research, like ChatGPT agent team on post-training.

    19. ET

      And, and Tina, you've been here for... Or you've both been here for, for, for a while now. And, Tina, why don't you give a little bit of your history at the company?

    20. CK

      Yeah. Uh, I've been at OpenAI for about four years now. Um, I originally worked on WebGPT, which was the original, uh, first LLM using tool use. Um, but it was just one question. So the model learned how to use the browser tool, but you only asked one question, you got an answer back. And then we kinda just had this realization, like, oh, you... Normally when you have questions, you have more questions after that. [laughs] And so we started building this chatbot, um, and then that's what eventually became ChatGPT.

    21. ET

      And w- it's, what have been the reactions s- so far, uh, you know, it's only been a few hours, but in your live stream, like what are, any reflections? Any re- what, what can you, uh, what can you tell us day of?

    22. CK

      I'm honestly really excited. Obviously we have some great eval numbers, and numbers are always really exciting, but I think the thing I'm, like, really excited about with this model is just that it's way more useful, like, across all the things that people actually use ChatGPT for. Um, the eval numbers look good, but when people use it, I think they'll notice quite a big difference in the utility of it.

    23. ET

      And, and say more about these. What are, what are you noticing? What are you seeing? What, what are you hoping?

  2. 2:13–4:11

    Model Capabilities & Coding Improvements

    1. CK

      Yeah. I think for me, the two top... I mean, this is my personal use cases. I use it for coding and writing all the time, and it's just a huge step change.

    2. ET

      Yeah. S- Sarah, you, you've been involved in, in the, in helping lead our investment since, since 2021. What, why don't you, uh, either share more or tee up how, how you, how you've been thinking about, uh, sort of this as it relates to, to coding or more broadly?

    3. SP

      Yeah. Well, well actually just on the topic of coding, it, it was a huge deal to have Michael Truell come on there and, um, not only showcase the, uh, the capabilities, but also say this is the best coding mark- uh, model on the market. Um, and so just curious, to the extent that you can share, what did you do differently to get these results?

    4. CK

      Yeah. I think huge shout-out to the team, um, especially Michelle Pokrass.

    5. SP

      Mm-hmm.

    6. CK

      Like we, I think to get these things right and with like eval numbers is one thing, like I said, but to get the actual usability and like how great it is at coding, I think it just takes a lot of detail and care. Um, I think the team put a lot of effort into datasets and thinking about the reward models for this.

    7. SP

      Mm.

    8. CK

      Um, but I think it's just literally just caring so much about getting coding working well.

    9. SP

      And, and maybe actually just to double-click on front-end web development, I mean, we've seen as sort of investors in the ecosystem, that's obviously taken off in the last six to eight months. Um, if you could pinpoint, uh, the improvement to that piece specifically, is it around, is it more around aesthetics, um, or is there sort of a- another capability, um, leap forward in terms of what we can do with front-end, um, web development?

    10. CK

      I think there's gonna be a lot more we can do with front end. I think the way we've gotten this big leap, I mean, if you compare it to o3's front-end c- coding capability, this is just totally next level.

    11. SP

      Yeah, totally.

    12. CK

      Um, it feels very different. And I think it kinda just goes back to what I was saying. The team just really cared about like nailing front end. Um, and that means like getting the best data, like thinking about the aesthetics of the model and all of these things. Um, I think it's just all those details that are really coming together and making the model like great at front end.

    13. SP

      Really exciting to see. Loved, loved the demos in the, in the live stream too. I wanted to,

  3. 4:11–6:15

    Model Behaviors & Sycophancy

    1. SP

      uh, ask about model behaviors, 'cause I know you, you worked on that too. Um, but how did you guys think about that for GPT-5? And there are a lot of things that, you know, um, we've talked about in prior models of like sycophancy and characteristics like that. Um, how did you guys think about for this? What did you guys change or tweak?

    2. CK

      Yeah. The design of this model has been very, very intentional for model behavior, especially with the sycophancy issues that we had like a few months ago with 4o. Um, and we've just spent a lot of time thinking about like, yeah, what is the ideal behavior? Um, and I think for post-training, what's really f- or one of the reasons I really like post-training is it feels more like an art than maybe even like other areas of research, 'cause you kinda have to make all these trade-offs, right? Like, you have to think about like for my rewards, like all these different rewards I could be optimizing during the run, like how do, like how does that trade off against it, right? Like, I want the assistant to be like super, like helpful and engaging, but maybe that's like a bit too engaging.

    3. SP

      Right.

    4. CK

      And getting too engaging gets you to the overly effusive, like, assistant that we have. Um, so I think it's really, like, a balancing act of trying to figure out, like, what are, like, the characteristics and, like, what do we want this model to actually feel like. And I think we were really excited with GPT-5 because it's kind of a time to, like, reset and rethink, um, especially since it's so easy to make something very engaging in, in an unhealthy way. How can we make this, like, a very healthy, helpful assistant?

    5. ET

      S- say more about how, how you achieved such a just kind of reduction in hallucinations, but, but also, also deception. W- what's the relationship between those?

    6. CK

      I guess like for me, I find hallucinations, deceptions like pretty related. So the model, um, and we kinda saw this a lot with the reasoning models. Like the, the reasoning model would understand that it didn't have some ability, but then it still really wanted to respond.

    7. SP

      Hmm.

    8. CK

      I think if we really baked it into the models that they want to be helpful, and so they're like, "Whatever I can say to be helpful in that moment." Um, and that's kind of what we consider for like deception. Versus hallucinations, sometimes the model like liter- uh, it seems that they will just say something quickly. Um, and we kinda see a lot of this reduction with the thinking, with when the models are able to think step by step, they actually can like pause and before blurting out an answer is kind of what I, it feels like with a lot of the previous models for hallucinations.

    9. ET

      Over the next few weeks as, as you're evaluating usage,

  4. 6:15–8:03

    Usage, Pricing & Startup Opportunities

    1. ET

      what, what are the biggest questions that you're having or that you're sort of anticipating, uh, being potent- potentially answered?

    2. CK

      I'm just really curious to see how all of these things, um, reflect in usage, right? Like, I think coding is way, way better. Like, what does this actually unlock for people? And I think we're really excited to be offering these models at the price points that we have.

    3. SP

      Yeah, totally.

    4. CK

      'Cause I think this actually, like, unlocks, like, a lot more use cases that really weren't there before. Maybe, like, previous competitor models were, are good at coding, but the price point is not as exciting. And so I think with this number of capabilities that we have in this model and the price point, I'm kind of excited to see, like, all the new startups and, like, developers, like, doing things on top of it.

    5. SP

      Yeah. We're excited too. But by the way, just on the topic of usage, um, you obviously have a lot of products with a ton of usage already, and since we have one of the, uh, deep research gurus here too, um, how did deep research, ChatGPT, Operator, sort of your existing products inform how you went about approaching GPT-5?

    6. IF

      One thing that's interesting is with reinforcement learning, um, training a model to be good at a specific capability is very data efficient. You don't need that many examples to teach it something new. And so the way that we think about it on my team is we're trying to push capabilities and things that are, like, useful to people. So, like, deep research, it was the first model to do, like, very comprehensive browsing, but then when o3 came out, it was also good at comprehensive browsing, and that's because we're able to, um, take the datasets that we've created for, um, the, you know, frontier agent models and then contribute them back to the f- um, frontier reasoning models. We always wanna make sure that the capabilities that, um, we're pushing with agents make it into, into the flagship models as well.

    7. SP

      Yeah. That's great. Very self-reinforcing.

    8. ET

      Uh, you mentioned all the, the startups that you're excited to see come as... Like,

  5. 8:03–16:59

    Broader Impact & AGI Discourse

    1. ET

      flesh out what you think that, that could look like, or even just high-level so- some opportunities you're, you're more excited about because of this.

    2. CK

      I mean, like, people always say vibe coding. I think basically, like, non-technical people, like, have such a powerful tool at their hands, and I think really you just need some good idea and, like, you're not gonna be limited by the fact that, like, you don't know how to code something. Like, you saw two of our demos which were front-end coding, or in the beginning, and that just literally took minutes. It lit- I think that would have honestly taken me, like, a week to actually build, like, fully interactive. Um, and so I think we're just gonna have a lot more... I would expect, like, maybe a lot more, like, indie type of, like, businesses built around this because of the fact that, like, you just need to have the idea, write a simple prompt, and then you get the full-fledged app.

    3. ET

      It's the golden age of the ideas guy.

    4. CK

      Yeah.

    5. ET

      It's our time.

    6. SP

      I think so. [laughs]

    7. CK

      Yeah. Finally.

    8. ET

      Yeah. Um, h- how about in the, in the broader sort of, uh, AGI discourse, like, w- what is this, um, what, what does this mean or accelerate or, or not, or, like, h- how do we think about sort of the broader, um, A- AI discourse in terms of w- what does GPT-5 mean here? Or, or ch- change the conversation in any sort of way?

    9. CK

      I think with GPT-5, um, it kind of sets, like, a new... W- It's obviously state-of-the-art in, like, all the things we talked about, um, but I think if you're showing that, like, you know, we're con- continue pushing the frontier here, and I feel like there's always people who are like, "Oh, we're hitting a wall," like things aren't actually improving. Um, and I think the interesting thing is I feel like we've almost saturated a lot of these evals, and the real, like, metric of, like, how good our models are getting is I think gonna be, like, usage, right?

    10. SP

      Mm.

    11. CK

      Like, who, what are the new use cases that are being unlocked, and, like, how many more people are using this in their daily lives to help them, like, across multiple tasks? So I feel like that's actually, like, the ultimate metric that I'm excited about in terms of, like, are we getting to AGI?

    12. SP

      Uh, I had a question about that just because, uh, I think Greg made this comment about how he was comparing the last model to this model and the, the benchmark went from 98 to 99. He's like, "Clearly we've saturated the benchmarks." Um, at least on that, that front, which I think is instruction following. Um, what benchmarks do you pay attention to? Like, how do you guys think about evals, right? 'Cause given you're already saturating what's out there, um, to a large extent or, or doing very well along those dimensions, um, what actually gets you to push the frontier? Is that, um, before the... I mean, so usage would be kind of post the model release, but before you get there, what are you guys looking to internally to help guide you? Is it a lot of internal evals that you've created? Um, you know, is it early access to startups, seeing what they think? Maybe it's a combo of all the above, but how do you weigh all those things?

    13. IF

      Yeah, I mean, I think on, on our team, we really work backwards from the capabilities we want the models to have.

    14. SP

      Mm.

    15. IF

      So maybe we want it to be good at creating slide decks or something, or, um, spread- good at edi- editing spreadsheets. And then if evals for those things don't exist, we try to make evals that are representative measures of that capability in a way that's actually gonna be useful for users. Um, and then we'll, um, a lot of those are, are internal. We'll collect them maybe from human experts or, um, you know, try and synthetically create examples, or we'll actually look at usage data. Um, and then for us, we'll just try and hil- hill climb on those. Um, and yeah.

    16. CK

      Yeah. I, we, I think we make this joke a lot internally that, like, if you wanna nerd snipe someone into working on something, you just need to make a good eval and then-

    17. ET

      [laughs]

    18. CK

      ... people are gonna be so happy to try to hill climb that.

    19. IF

      Yeah. [laughs]

    20. SP

      [laughs] I like what you said about starting with the capabilities first. How do you prioritize what you actually are, are shooting for? Let's say there's this dimension of maybe deeper into everyday use versus getting much deeper into the expert use cases.

    21. IF

      Yeah.

    22. SP

      How do you think about that trade-off? What does that trade-off mean practically speaking, and what do you guys prioritize when?

    23. IF

      I mean, I think it's pretty unique at OpenAI to be able to work on something that's so generally useful. I mean, it's like everything they tell you not to do at a startup. It's just like your user is anyone.

    24. SP

      [laughs]

    25. ET

      [laughs]

    26. IF

      Like, for deep research, we wanted it to be good across, like, every single domain someone might wanna do research in, and I think you only have the, like, privilege of doing that if you work at a company that has, like, huge distribution and, like, all different kinds of users. So, um, yeah, I mean, I think if you choose a capability that's quite general, like online research, you just have to make sure that you represent, like, a distribution of tasks across, um, loads of different domains if you wanna get good at all of them. But then, yeah, sometimes it's hard to decide to focus on one specific thing, um, because, uh, there are just so many different verticals that you could choose from. But I think in some cases, maybe, like, coding will be really important, so then, um, you know, a specific team will focus on coding. But I think in general, um, because the capabilities are so general, usually, like, the, uh, the next model improvement just kind of, you know, improves performance on a, a pretty broad range-

    27. CK

      Yeah

    28. IF

      ... of things.

    29. CK

      I think we've kind of seen this, like with the progression of even the models that we've had in ChatGPT. Like, as the model gets smarter, it's better at instruction following, it's better at tool use-

    30. IF

      Mm
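The eval workflow described in this chapter, pick a target capability, build a representative eval if one doesn't exist, then hill-climb on the score, can be sketched as a tiny harness. Everything below (the `Task` format, the toy graders, `toy_model`) is a hypothetical illustration of the idea, not OpenAI's internal tooling:

```python
# Minimal sketch of an eval harness: define tasks for a target capability,
# run a model over them, and report a pass rate to hill-climb on.
# All names here are hypothetical illustrations, not OpenAI's actual tooling.

from dataclasses import dataclass
from typing import Callable

@dataclass
class Task:
    prompt: str
    grade: Callable[[str], bool]  # returns True if the response passes

def run_eval(model_fn: Callable[[str], str], tasks: list[Task]) -> float:
    """Score a model on a task set; the pass rate is the hill-climbing metric."""
    passed = sum(1 for t in tasks if t.grade(model_fn(t.prompt)))
    return passed / len(tasks)

# Toy "capability": answer arithmetic questions with a bare number.
tasks = [
    Task("What is 2 + 2? Answer with just the number.", lambda r: r.strip() == "4"),
    Task("What is 7 * 6? Answer with just the number.", lambda r: r.strip() == "42"),
]

def toy_model(prompt: str) -> str:
    # Stand-in for a real model call; gets one task right and one wrong.
    return "4" if "2 + 2" in prompt else "41"

print(run_eval(toy_model, tasks))  # 0.5
```

In practice the tasks would come from human experts, synthetic generation, or usage data, as Isa notes above, and the graders would be far richer than string matches, but the loop, measure, train, re-measure, is the same shape.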

  6. 16:59–31:50

    Creative Writing & Model Progress

    1. ET

      Let's talk about, uh, creative writing. Maybe you can talk about the, the improvements there, h- how you think about it.

    2. CK

      That's one of my favorite improvements in GPT-5. Um, the writing, I honestly find it's very tender and touching, especially for a lot of the creative writing that we wanna do. Um, uh, we were thinking through like a bunch of different samples for the live stream, and like every time I was like, "Oh, that's like actually, like, that like hits, like [laughs]

    3. IF

      [laughs]

    4. CK

      ... it's like, it's like good." Um, and it's like spooky, and I'm just like, "Oh, this feels like someone, like someone should've written this." [laughs] Um, but I think it's really cool 'cause you can actually really use it for, um, like helping you with things. Like, like I, like my example I did in the live stream was like writing, helping me write the eulogy.

    5. IF

      Yeah.

    6. CK

      Something that, like that's like kind of hard to write, especially since writing isn't really something a lot of people are good at. Like, I'm personally a very, very bad writer.

    7. IF

      That's not true. [laughs]

    8. CK

      [laughs] I think it's-

    9. ET

      But it makes a better story.

    10. IF

      [laughs]

    11. CK

      [laughs] I think it's, compared to maybe other things I'm better at. Um, but it's so great to have this tool, um, to help me like craft whenever... Like, I use it literally for s- simple things as like Slack messages to figure out like how to phrase this well.

    12. IF

      [laughs]

    13. CK

      It'll help me give me some iterations on how to, how to say something to the team.

    14. ET

      I wanna see those prompts.

    15. CK

      Yeah. [laughs]

    16. SP

      We're now all just looking for em dashes in what I'm-

    17. ET

      I was gonna say

    18. SP

      ... writing.

    19. CK

      Yeah, right?

    20. SP

      And we're like, "Mm, ChatGPT." [laughs]

    21. ET

      Where do you stand on the em dash discourse?

    22. CK

      I like em dashes.

    23. SP

      Yeah.

    24. ET

      'Cause I, I do that normally, and now people think I'm just using ChatGPT.

    25. CK

      Yeah, I know, I know.

    26. SP

      [laughs] I know, me too.

    27. ET

      Going back to the, the, the discourse for a second, Sam said in his interview with Jack, "If you had said 10 years ago that we would get models at the level of, uh, sort of PhD students, um, I would think, wow, the world looks so different, and yet we've basically taken it for granted." Um, uh, do you think basically the improvements are similar... Like, as soon as we get them, we're just going to be like, "Oh, yeah, now this is the standard," or do you think at some point there's gonna be like, "Oh my God, this is like..." Um, how do you think about sort of people's ability to, um, sort of, uh, acclimate or adjust?

    28. CK

      Yeah, I mean, it seems like people adjust really quickly, don't you think? Uh-

    29. ET

      Yeah, like whatever happens-

    30. CK

      I feel like ChatGPT got released, and everyone was like, "Wow, that's so cool." But then you just kind of take it for granted that you literally have this, like, wizard in your pocket.

  7. 31:50–36:25

    Training, Data & Reflections

    1. ET

      can, can you explain what mid-training is and how it sort of, what does it achieve that pre or post doesn't?

    2. CK

      So I think with your pre-training runs, these are, like, your, the big runs. These are the massive ones. Like, that's what we're building all these giant clusters for. Um, so you can kinda think of mid-training as literally in the middle: we do it after pre-training but before post-training. Um, you can think of it as a way to, like, extend the model's, like, intelligence without having to do a whole new pre-training run. So this is mostly just focused on data, and it's off of the pre-trained models. Um, so this is a way for us to do things like updating the knowledge cutoff of these models, right? So when you pre-train it, you're kinda like, "Okay, shoot, now we're kinda stuck in this date, and we can't ever update it again." It doesn't quite make sense to put all that data into post-training. Um, and so mid-training is just a smaller pre-training run to help expand, like, the model's intelligence and, like, up-to-dateness.

    3. ET

      Christina, did you work on WebGPT?

    4. CK

      Yes, I did.

    5. ET

      Okay, so you're basically, like, an AI historian, um [laughs] .

    6. SP

      [laughs]

    7. CK

      Yes, yes.

    8. IF

      She also watched some Confucius.

    9. ET

      Oh [laughs] .

    10. SP

      [laughs]

    11. CK

      I'm an elder. [laughs]

    12. SP

      [laughs]

    13. ET

      So can you, like, reflect back a little bit to, you know, four years ago, five years ago, and sort of reflect on, like, what, what are the biggest thing, like, if you were to predict the, the five years out, like, what are the inflection points or, or biggest things that would've surprised you?

    14. CK

      H- honestly, with WebGPT, the main thing we were just excited about was, like, trying to ground these language models. Like, we had so many issues with, like, hallucinations and the model just saying random things. And, like, we didn't really do a mid-training then, so the question was, like, how do we make sure the model is actually, like, most factually up to date? So that's kinda how we thought about, like, oh, let's give it a browsing tool, if that makes sense. Um, and then, yeah, like I said, that kind of went on from, like, oh, you actually wanna keep asking questions, so what would a chatbot look like? But at this point, I think there had been a few chatbots by a few other companies, and I feel like a chatbot is also, like, a very common AI thing to think of. Um, but they were quite unpopular at the time, so we weren't really even sure that, like, this is actually something useful for people to work on or, like, people to use, or will people be excited about this? Is this really, like-

    15. SP

      [laughs]

    16. CK

      ... a research innovation that we, like, are we making the Turing test here? Like-

    17. SP

      [laughs]

    18. CK

      ... um, but I think it kinda clicked into me that, like, maybe there was actually something interesting happening here. Um, we gave early access to about 50 people, most of those people being, like, people I lived with at the time.

    19. SP

      [laughs]

    20. CK

      Uh, and they're, two of my roommates just used it all the time. They just, like, would never stop using it, and they would just have these long conversations, and they would ask it, like, quite technical things 'cause they're also AI researchers. And so then I was just like, "Oh, this is, like, kind of interesting." Like, I don't know. And at the time we were kinda thinking like, "Okay, we kind of have this chatbot. Should we, do we make this, like, a really specific, like, meeting bot type of thing? Do we, like, make it a coding helper?" Um, but it was interesting to see my two roommates just use it just, like, for anything and everything and just, like, literally be chatting with it, like, the whole workday as they were using it. Then I was like, "Oh, this is kind of interesting." But then it was also interesting to see, like, the majority of the people that I gave access to on that 50-

    21. SP

      Yeah

    22. CK

      ... person list, like, didn't really use it that much. But I was like, "Oh, it's like, there's clearly, like, something here, but it's, like, not quite maybe for everyone yet. Um, but there's something here."

    23. ET

      When, when did you realize, like, I'm working at one of the most important companies o- of this generation? Like, like, when, when was the moment where you were like, "Hey, this is something that I obviously believe is important, that's why I joined," but that, that you realized, like, the scale and significance?

    24. CK

      Honestly, I kinda had this moment before I joined OpenAI. Like, like, I think with the scaling laws paper with GPT-3, I was just, like, kind of hit me that, like, if this exponential is true, like, there's not really much else I want to spend my life working on. Um, and, like, I wanna be part of this, like, story. Like, I think there's, there's gonna be so many interesting things unlocked with this, and I think this is, this is probably the next, like, step level in terms of, like, technology, that it kind of made me realize, like, "Oh, I, I should probably go start reading about deep learning and [chuckles] figure out how I can get into one of these labs."

    25. ET

      Isa, what was your moment?

    26. IF

      I think, I think for me it was also before I started working at OpenAI, um, using... I think I first learned about OpenAI in a, in a AI class or something, or some kind of computer science class, and they were saying like, "Oh, they trained on the whole internet." I was like, "Oh, that's so crazy. Like, what is this company?"

    27. SP

      [laughs]

    28. IF

      And then started using GPT-3. Like, I think I was, like, a power user of the OpenAI playground, and at, at a certain point, like, had early access to these, like, different OpenAI features, like embeddings and things like that, and just became this, like, big OpenAI fan, um, which is, like, a little embarrassing, but, you know, it's fine-

    29. SP

      [laughs]

    30. IF

      ... 'cause it got me here [laughs] . And eventually they're like, "Okay, like, you're stalking us."
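The pre-/mid-/post-training split Christina describes at the top of this chapter can be sketched as a staged pipeline: a massive pre-training run, a much smaller data-focused mid-training pass over the pre-trained checkpoint (used, for example, to push the knowledge cutoff forward), then post-training. The stage names are from the conversation; the relative compute fractions and dataset descriptions below are made up for illustration:

```python
# Sketch of where mid-training sits: after pre-training, before post-training.
# A smaller, data-focused pass over a pre-trained checkpoint, used e.g. to
# extend the knowledge cutoff without a whole new pre-training run.
# Compute fractions and data descriptions are hypothetical.

from dataclasses import dataclass

@dataclass
class Stage:
    name: str
    data: str
    relative_compute: float  # fraction of a full pre-training run

PIPELINE = [
    Stage("pre-training",  "broad web-scale corpus (up to old cutoff)", 1.00),
    Stage("mid-training",  "smaller fresh corpus (extends knowledge cutoff)", 0.05),
    Stage("post-training", "RL + preference data (behavior, tool use)", 0.01),
]

def knowledge_cutoff_stage(stages: list[Stage]) -> str:
    """The effective cutoff is set by the last stage that adds fresh corpus data."""
    fresh = [s for s in stages if "corpus" in s.data]
    return fresh[-1].name

print(knowledge_cutoff_stage(PIPELINE))  # mid-training
```

The point of the sketch is the ordering and the cost asymmetry: mid-training buys updated knowledge at a small fraction of pre-training compute, which is why it exists as a separate stage at all.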

  8. 36:25–41:39

    Company Growth & Culture

    1. SP

      Maybe a, a question more on the company building front. Um, we all sort of read and reread Calvin French-Owen's piece, uh, just his reflections on working at OpenAI. Um, curious, and you don't have to comment on that piece unless you want to, but, um, would love your reflections on the change that you've seen over the last four years or, um, or, you know, or even less than that, given I think that was only covering one year of change. Um, but what are the biggest things that you've seen change at OpenAI?

    2. CK

      I mean, when I first joined OpenAI, the applied team was 10 engineers or something. It just, like, we didn't really have this, like, product arm. We had just launched the API. It was just a completely different world. And I think AI is, in most people's mind now, after ChatGPT, but I think pre-ChatGPT, like, people didn't really know what AI was or really, like, thought about it as much. Um, it's kinda cool working at a place that, like, my parents know what I do now, and, like, it's like that's really cool. Um, and I think the company obviously is just a lot bigger, but I think with that we can just take a lot more bets. I think when I first joined OpenAI, there were obviously way less, um, people. Like, it was much, much smaller.

    3. IF

      Mm-hmm.

    4. CK

      It was around, like, 200-ish people, and I think we're close to, like-

    5. IF

      A few thousand, for sure.

    6. CK

      Yeah.

    7. IF

      Yeah, when I joined, it was also a few hundred people, before ChatGPT. So it's obviously very different now that all of your friends have heard of what you work on. But culturally, even though the company's much bigger, I still think we've maintained this... it still feels very much like a startup. Some people who come from a startup are surprised. They're like, "Oh, I'm working even harder than when I was working [laughs] at the startup that I founded."

    8. CK

      [laughs]

    9. IF

      I think ideas can still come from anywhere, and if you just take initiative and want to make something happen, you can. It doesn't really matter how senior you are or anything like that. I think we've been able to maintain that culture, which is pretty special.

    10. CK

      Yeah, we definitely reward agency, and I think that's always been true. Especially on the research side, the teams are quite small. Like, when Isa was working on deep research, it was still just two people.

    11. IF

      Seriously, yeah.

    12. CK

      So, like-

    13. IF

      Wow

    14. CK

      ... I think we still do that on the research side. Most research teams are quite small and nimble for that reason.

    15. ET

      And earlier you said you do something at OpenAI which startups never do, which is try to appeal to every single person with a product. Are there other things that come to mind that OpenAI just does differently than all your peers or other startups, or things that we may not appreciate being on the-

    16. IF

      I mean, I think it's different for different teams. But my team collaborates so closely with the applied side, the engineering team and the product team and the design team, in a way that I think is unusual; sometimes research can be quite separate from the rest of the company, but for us it's so integrated. We all sit together. Sometimes the researchers will help with implementing something. I'm not sure the engineers are always happy about it, but [laughs] we'll try.

    17. ET

      [laughs]

    18. IF

      Like, they'll tell us to get out of the front-end code.

    19. ET

      [laughs]

    20. IF

      But, and vice versa, they'll help us with things we're doing for model training runs and things like that. So some of the product teams are quite integrated. For post-training, it's a pretty common pattern, which I think just lets you move really quickly.

    21. CK

      Hmm.

    22. SP

      I guess one thing that I think is unique about OpenAI is that you're very much a consumer company, by revenue and products, but also an enterprise company. Internally, what would you consider yourselves? Or is that even the wrong paradigm to think about?

    23. IF

      Yeah, I mean, I guess if you tie it to the mission, we're trying to make the most capable thing, and we're also trying to make it useful and accessible to as many people as possible. So in that framing, I think it makes a lot of sense.

    24. SP

      The concept of taste has become very widely used. What does good taste mean within OpenAI? How do you know it when you see it? And in a world where the cost to produce everything just keeps going down and down, is taste the one thing that's not commoditizable, or is that also shifting, given that maybe it can go into the training data?

    25. CK

      No, I think taste is quite important, especially now that, like I said, our models are getting smarter and it's easier to use them as tools. So having the right direction matters a lot now.

    26. SP

      Mm.

    27. CK

      And having the right intuitions, and the right questions you want to ask. So I would say it maybe matters more now than before.

    28. IF

      I think I've also been surprised by how often the thing that is the most simple, the easiest to explain, is the thing that works best. Sometimes it seems very obvious, but it's quite hard to get the details of something right. But I think usually good research, or taste, is just-

    29. SP

      Yeah

    30. IF

      ... like simplifying the problem to-

  9. 41:39–42:47

    Closing Thoughts

    1. SP

      Hmm. Very cool. Taste is Occam's razor. [laughs]

    2. ET

      So, in closing here, obviously a historic day. Do you want to contextualize what this means in the context of the mission, and where you've been, to get to now, to where you're going?

    3. CK

      Yeah. With GPT-5, the word that's been in my mind throughout all of this is "usable." The thing we're excited about is getting this out to everyone: getting our best reasoning models out to free users now, getting our smartest model yet to everyone. And I'm just excited to see what people are actually going to use it for.

    4. ET

      That's a great place to wrap. Tina, Isa, thanks so much for coming on the podcast.

    5. CK

      Yeah, thank you.

    6. IF

      Thank you for having us.

    7. CK

      Yeah. [upbeat music]

Episode duration: 42:47

Transcript of episode k6DM-sgYu8M