
Inside ChatGPT, AI assistants, and building at OpenAI — the OpenAI Podcast Ep. 2

Why was OpenAI surprised by ChatGPT’s success? What does it really mean to “reason” in an AI system? And what’s next for agentic coding and multimodal assistants? OpenAI Head of ChatGPT Nick Turley and Chief Research Officer Mark Chen unpack it all in a conversation that pulls back the curtain on the making of OpenAI’s most iconic product.

00:00 Intro: Meet Nick Turley and Mark Chen
00:40 Origin of the name "ChatGPT"
03:50 ChatGPT’s viral takeoff
07:00 Internal debate before launch
09:40 Evolution of OpenAI’s launch approach
11:00 The sycophancy incident and RLHF
14:45 Balancing usefulness vs. neutrality in model behavior
20:00 Memory and the future of personalization
22:50 ImageGen’s breakthrough moment
29:00 Cultural shifts in safety and the freedom to explore
33:10 Code, Codex, and the rise of agentic programming
37:45 Coding with taste
41:45 Internal adoption of Codex
43:40 Skills that matter: curiosity, agency, adaptability
46:45 OpenAI’s “Do Things” culture
51:30 Adapting to an AI future
55:15 The opportunities ahead: healthcare, research
01:01:00 Async workflows and the superassistant
01:05:40 Favorite ChatGPT tips

Andrew Mayne (host) · Mark Chen (guest) · Nick Turley (guest)
Jul 1, 2025 · 1h 7m · Watch on YouTube ↗

EVERY SPOKEN WORD

  1. 0:00–0:40

    Intro: Meet Nick Turley and Mark Chen

    1. AM

      Hello, I'm Andrew Mayne, and this is the OpenAI Podcast. My guests today are Mark Chen, who is the Chief Research Officer at OpenAI, and Nick Turley, who is the Head of ChatGPT. We're gonna be talking about the early viral days of ChatGPT. We're gonna talk about ImageGen, how OpenAI looks at code and tools like Codex, what kind of skills they think that we might need for the future, and we're gonna find out how ChatGPT got its totally normal name.

    2. MC

      Even half of research doesn't know what those three letters stand for.

    3. NT

      You know, you're gonna have an intelligence in your pocket, that it can be your tutor, it can be your advisor, it can be your software engineer.

    4. MC

      There was a real decision the night before. Do we actually launch this thing?

  2. 0:40–3:50

    Origin of the name "ChatGPT"

    1. AM

      First off, how did OpenAI decide on that awesome name?

    2. NT

      Uh, it was gonna be Chat with GPT-3.5, and we had a late-night decision to simplify [chuckles] -

    3. AM

      Wait, wait, so say that again- say that name again.

    4. NT

      Uh, it was gonna be Chat with GPT-3.5-

    5. AM

      Chat-

    6. NT

      ... which rolls off the tongue even, even more nicely.

    7. AM

      That's, uh- and, and you said that was a late-night decision, meaning, like, weeks before you finally decided what to call it, right?

    8. NT

      Right, right, right. No, weeks before, we hadn't started on the project yet, I think.

    9. AM

      Oh, goodness.

    10. NT

      But, you know, I think we, we realized that that would be hard to pronounce and, um, came up with a great name instead.

    11. AM

      So that was the night before? [chuckles]

    12. NT

      Roughly.

    13. AM

      Yeah.

    14. NT

      Might have been the day before.

    15. AM

      Yeah.

    16. NT

      It was all kind of a blur at that point.

    17. AM

      I would imagine a lot of that was a blur, and I remember here, uh... I remember being in a meeting where we talked about the low-key research preview, which, like, really was, like, we really thought like, "Oh, this is..." 'Cause it's- it was the 3.5. 3.5 was a model that had been out for months, and from a capabilities point of view, when you just look at the evals, you're like, "Yeah, it's the same thing, but we just put the interface in here and made it so you didn't have to prompt as much." And then ChatGPT comes out, and when, when was the first sign that this thing was blowing up?

    18. NT

      I mean, I'm curious for- every- everyone has their slightly own recollection of that, that era, because it was a very confusing time. But for me, day one was sort of, you know, is the dashboard broken? Classic, like, uh, the logging can't be right. Day two was like, "Oh, weird." I guess, like, J- Japanese Reddit users discovered- [chuckles] ... this thing. Maybe it's, like, a local phenomenon. Day three was like, "Okay, it's going viral, but it's definitely gonna die off." And then by day four, you're like, "Okay, yeah, it's gonna, gonna change the world."

    19. AM

      Mark, did you have any expectation about that, about-

    20. MC

      No, honestly, I mean, we've had so many launches, so many previews over time, and yeah, this one really was something else, right? The takeoff ramp was huge, and yeah, my parents just stopped asking me to go work for Google. [chuckles]

    21. AM

      Wait, so wait, wait, wait a second. Up until ChatGPT, your parents were asking, like, what you were doing here?

    22. MC

      Yeah. No, I mean, um, they, they just never heard of OpenAI. Um-

    23. AM

      Right

    24. MC

      ... I think for many years, thought AGI was this pie-in-the-sky thing, and I wasn't having a serious job. So-

    25. AM

      [chuckles]

    26. MC

      ... it was a real revelation for them. Yeah.

    27. AM

      Uh, what was your job title at the time?

    28. MC

      Um, I think just member of technical staff.

    29. AM

      Member of technical staff?

    30. MC

      Yeah.

  3. 3:50–7:00

    ChatGPT’s viral takeoff

    1. AM

      it accelerate, I knew it was gonna happen, and then when it did, it was when it was on South Park. And remember that, when South Park made fun of the name, and-

    2. NT

      That was the first time I'd watched South Park in-

    3. AM

      Oh

    4. NT

      ... uh, let's just say a while. And that episode, I still think it's magic, and-

    5. AM

      Yeah

    6. NT

      ... it was obviously profound to watch and see, you know, something you, you helped make show up in pop culture. But there's the punchline in the end, where it's like, "Oh, this was co-written by ChatGPT?" That was so-

    7. AM

      I think they took that off, though.

    8. NT

      I think they did.

    9. AM

      I think in later episodes, 'cause it used to say, I think, "Written by, like, uh-

    10. NT

      Oh, man

    11. AM

      ... Trey Parker and, like, ChatGPT"

    12. NT

      I was not remembering that.

    13. AM

      And then... No, it was. And then I think later, I think they may have pulled that off at some point. I don't remember, like-

    14. NT

      Oh, I strongly feel that you shouldn't have to give credit to... It's-

    15. AM

      Yeah, that was-

    16. NT

      It's your business, whether or not you're using the-

    17. AM

      If I had to give credit to ChatGPT for every aspect of my life, um, well, might as well just say ChatGPT maybe with Andrew.

    18. NT

      True.

    19. AM

      So it's-

    20. MC

      Do you use it for prep for your interviews?

    21. AM

      You know, one of my, my co-producers, Justin, probably uses it. I haven't asked him yet, 'cause I'd like to think that he's handcrafting every single question that we're thinking about here, but I am sure. You say it was a bit of a blur, and I'll tell you, like, a standout moment for me at the launch of ChatGPT was, I don't know if you remember this, but the Christmas party. And we'd had several weeks of ChatGPT out there, and Sam Altman went up and said, "Hey, this has been exciting to watch this, but the Internet being the Internet," and I think we all felt this way, "it's gonna die down."

    22. NT

      Mm.

    23. AM

      Spoiler alert: It did not die down, and it just kept accelerating. What were the things you had to do internally to sort of keep this thing up and running as more people wanted to use it?

    24. NT

      We had, you know, quite a few constraints. And if, if, if... For those of you who remember, you know, I, I think you guys remember ChatGPT was down all the time-

    25. AM

      Yeah

    26. NT

      ... in the beginning. Um, and that was... Yeah, we'd said, "Hey, this is a research preview. No, no guarantees, and maybe it goes down." But the minute you had people loving and using this thing, that didn't feel super good. So, you know, people were certainly working around the clock to keep the site up. I remember, you know, we obviously ran out of GPUs, we ran out of database connections. We had, you know, um... We're getting rate limited in some of our providers. It- nothing was really set up to run a product. So in the beginning, we just built this thing, we called it the Fail Whale, and it would just tell you- [chuckles] ... kind of nicely that the thing was down, and made a little poem, I think it was generated by GPT-3, um, about being down, and, and it was sort of tongue-in-cheek.... and that got us through the winter break, 'cause we did want people to have some sort of a holiday. And then when we came back, we were like, "Okay, this is clearly not viable. You can't just go down all the time." Um, and eventually we got to something we could serve everyone.

    27. MC

      Yeah, and I think, you know, the demand really speaks to the generality of ChatGPT, right? Um, we had this thesis that ChatGPT embodied what we wanted in AGI, just because it was so general.

    28. NT

      Mm-hmm.

    29. MC

      And I think, you know, you're seeing that demand ramp just because people are realizing, you know, any use case that I want to, to give or to throw to the model, it can handle.

    30. AM

      We were kind of known as the company working on AGI, and I think prior to ChatGPT, the API was certainly the first time we had a public offering where people could go use it and do it, but then it was more for developers-

  4. 7:00–9:40

    Internal debate before launch

    1. AM

      was everybody at OpenAI on board with ChatGPT being useful or being ready to launch?

    2. MC

      Yeah, I don't think so. You know-

    3. AM

      Yeah

    4. MC

      ... um, even the night before, I mean, there's this very famous story at OpenAI of, uh, you know, Ilya-

    5. AM

      Mm

    6. MC

      ... taking 10 cracks at the model, you know, 10 tough questions, and my recollection is maybe only on five of them, he got answers that he thought were acceptable.

    7. AM

      Mm-hmm.

    8. MC

      And so there's a real decision the night before, "Do we actually launch this thing? Is the world actually gonna respond to this?" And I think it just speaks to when you build these models in-house, uh, you so rapidly adapt to the capabilities.

    9. AM

      Mm-hmm.

    10. MC

      And it's hard for you to kind of put yourself in the shoes of someone who hasn't kind of been in this model training loop, and see that the- there is real magic there.

    11. AM

      Yeah.

    12. MC

      Yeah.

    13. NT

      Yeah, I think to build on that, like, the controversy internally-

    14. AM

      Mm

    15. NT

      ... about, you know, is this thing good enough to launch? I think it is humbling, right? Because it, it's just a reminder of how wrong we all are when it comes to AI. It's why, you know, frequent contact with reality-

    16. MC

      Yeah

    17. NT

      ... is so important.

    18. AM

      Could you elaborate more on that contact with reality?

    19. NT

      Mm-hmm.

    20. AM

      What does that mean?

    21. MC

      Yeah, I mean, when you think about iterative deployment, uh, one way I like to frame it is, you know, there's no point everyone agrees where it's suddenly useful, right?

    22. AM

      Mm-hmm.

    23. MC

      Um, and I think usefulness is this big spectrum. Um, and so, you know, there's not one capability level or one bar that you meet, and suddenly, you know, the model is useful for everyone.

    24. AM

      Were there any hard decisions about what to include or what to focus on?

    25. NT

      We were very, very principled on ChatGPT-

    26. AM

      Mm

    27. NT

      ... to not balloon the scope. Um, we were-

    28. AM

      Mm

    29. NT

      ... we were adamant to get feedback and data, um, as quickly as we could, so there's a lot of things that-

    30. AM

      I'm always in Slack telling you things, by the way. [chuckles]

  5. 9:40–11:00

    Evolution of OpenAI’s launch approach

    1. MC

      Yeah, I think over time, you know, feedback really has become an integral part of how we build the product, and it's also become an integral part of safety.

    2. AM

      Mm-hmm.

    3. MC

      And so you always feel the time cost of losing out on feedback. You know, you can deliberate in a vacuum, right?

    4. AM

      Mm-hmm.

    5. MC

      Uh, are they gonna respond to this better? Are they gonna respond to that better? Um, but it's just not a substitute for just bringing it out there, right? Um, I think our philosophy is let the models have contact with the world-

    6. AM

      Mm

    7. MC

      ... and if you need to revert something, that's fine. But I think, uh, there's really no substitute for this fast feedback, and it's become one of the big levers for how we improve model performance, too.

    8. NT

      It's sort of funny, like, I feel like we started with shipping these models in a way that is more similar to hardware, where you make, like, one launch-

    9. MC

      Mm

    10. NT

      ... very rarely, and it has to be right. And, you know, you're not gonna update the thing, and then you're gonna work on the next big project, and it's capital intensive, and the timelines are long. And over time, and I think ChatGPT was kind of the beginning, it's looked-

    11. MC

      Mm-hmm

    12. NT

      ... more like software to me, where you make these frequent updates.

    13. MC

      Mm-hmm.

    14. NT

      Um, you have kind of a constant pace the world can adopt. If something doesn't work, you roll it back, and you sort of lower the stakes in doing that, and you low- you increase the empiricism, and, and of course, just operationally, too, you can innovate faster in a, in a way that is more and more in touch with what users want.

    15. AM

      Yeah, one of the examples we had of that was the, the model becoming, uh, too obsequious or sycophantic.

    16. NT

      Mm.

    17. AM

      Could you explain what

  6. 11:00–14:45

    The sycophancy incident and RLHF

    1. AM

      happened there? Well, that was where people all of a sudden say, "Hey, it's, it's telling me I've got 190 IQ, and I'm the most handsome person in the world," which I had no problem with personally, but other people did. [chuckles] And what was going on there?

    2. MC

      Yeah, so I think, um, one important thing is we rely on user feedback-

    3. AM

      Mm-hmm

    4. MC

      ... to improve the models, right? And it's this very complicated mix of reward models, which we use in, uh, a procedure we call RLHF, right? Uh, using human feedback to use RL to improve the models. And-

    5. AM

      Could you give me just, like, a brief example what that would mean?

    6. MC

      Yeah, yeah. So I think, um, one way to think about it is, you know, when a user enjoys a conversation, you know-

    7. AM

      Mm

    8. MC

      ... they provide some positive signal, um, and-

    9. AM

      Thumbs up.

    10. MC

      Yeah, a thumbs up, for instance.

    11. AM

      Okay.

    12. MC

      And, uh, we train the model to prefer to respond in a way that would elicit more thumbs up, right?

    13. AM

      Mm-hmm.

    14. MC

      And this may be obvious in retrospect, but, um, stuff like that, if balanced incorrectly, can lead to the model being more sycophantic, right? Um, you can imagine users might want that kind of, uh, that feeling of, you know, a model saying good things about them.

    15. AM

      Mm-hmm.

    16. MC

      But, um, I don't think it's a very good long-term outcome. And actually, when we look at kind of our response to-... uh, sycophancy and, and the rollout that resulted there. Um, I think there were a lot of good points about it. You know, this was something that was flagged just by a small fraction of our power users. It wasn't, you know, something that a lot of people who generally use the models noticed, and I think we really picked that out fairly early. We responded to it, I think, with the appropriate level of gravity.

    17. AM

      Mm-hmm.

    18. MC

      And, um, yeah, I think it, it just shows that, you know, we really do take these issues quite seriously, and we want to intercept them very early.

    19. AM

      Yeah, it felt like there was maybe 48 hours since the model came out, and then-

    20. MC

      Mm

    21. AM

      ... Joanne Jiang had a response explaining exactly what happened, and I think that that's the, that's the hard part. How do you navigate that? Because the problem with social media is you're basically monetized by engagement time. You want to keep people on there longer-

    22. MC

      Mm

    23. AM

      ... so you can show them more ads, and certainly the more people use ChatGPT, obviously there's a cost to OpenAI. The idea is maybe use it once and stay around forever, but that's not practical. How do you weigh that, the idea of making people happy with what they're getting versus making the model, you know, be broadly more useful than just pleasing?

    24. NT

      I feel very lucky in this regard because we have a product that's very utilitarian, and people use it to either achieve things that they do know how to do but don't feel like doing, um, faster or with less effort, um, or they're, they're using it to do things that they couldn't do at all. Um, you know, first example is maybe, you know, writing an email that you've been dreading. Second example might be, you know, running a data analysis that you didn't actually know how to do- [chuckles] ... um, in Excel, you know. Um, um, true story. So, so, you know, th- those are very utilitarian things, and fundamentally, as you improve, you actually spend less time on the product, right?

    25. AM

      Mm.

    26. NT

      Because, you know, ideally it takes less turns back and forth, or maybe you actually delegate to the AI, so you're not in the product at all. So for us, you know, time spent, it's very much not the, not the, not the thing we optimize for. You know, we do care about, um, your long-term retention-

    27. AM

      Mm

    28. NT

      ... because we do think that's a sign of value. Um, if you're coming back three months later-

    29. AM

      Mm

    30. NT

      ... that clearly means we did something right. Um, but what that means is, you know, um, I always say, um, "Show me the incentive, and I'll show you the outcome." We, we have, I think, the right fundamental, um, incentives to build something great. Um, that doesn't mean we'll always get it right. Um, the sycophancy, um, events were, were really, really important and good learning for us, and I'm, I'm proud of how we acted on it. Um, but fundamentally, I think we have the right, um, the right setup to build something awesome.
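The feedback loop Mark describes earlier in this chapter (thumbs-up signals feeding a reward model that the assistant is then trained against via RLHF) can be sketched as a toy pairwise preference update. Everything below is an illustrative assumption, not OpenAI's actual pipeline: the two hand-made feature dimensions, the linear reward model, and the training pairs are invented purely to show how over-weighting "flattering" preferences pulls the learned reward in that direction.

```python
# Toy sketch of preference learning from thumbs-up data: a reward model is
# fit so that responses users preferred score higher than ones they rejected.
# All names and numbers are illustrative; this is not OpenAI's RLHF stack.
import math

def reward(features, w):
    """Linear reward model: score a response from its feature vector."""
    return sum(f * wi for f, wi in zip(features, w))

def train_reward_model(pairs, dim, lr=0.1, epochs=200):
    """Fit weights so preferred responses outscore rejected ones,
    using a Bradley-Terry-style logistic pairwise loss."""
    w = [0.0] * dim
    for _ in range(epochs):
        for preferred, rejected in pairs:
            margin = reward(preferred, w) - reward(rejected, w)
            # gradient of -log(sigmoid(margin)) w.r.t. the margin
            g = 1.0 / (1.0 + math.exp(margin))
            for i in range(dim):
                w[i] += lr * g * (preferred[i] - rejected[i])
    return w

# Hypothetical features: [helpfulness, flattery]. If raters consistently
# thumbs-up flattering answers, the learned reward drifts toward flattery --
# the kind of imbalance Mark says can make a model sycophantic.
pairs = [
    ([0.2, 0.9], [0.8, 0.1]),  # flattering answer preferred over helpful one
    ([0.1, 0.8], [0.9, 0.2]),
]
w = train_reward_model(pairs, dim=2)
print(w[1] > w[0])  # True: flattery weight exceeds helpfulness weight
```

The point of the toy is only the direction of the drift: whatever the raters systematically reward is what the reward model, and then the policy trained against it, will chase.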

  7. 14:45–20:00

    Balancing usefulness vs. neutrality in model behavior

    1. AM

      on when, you know, ChatGPT came out, there was, like, the, the allegations, "It's woke. It's woke, and people are trying to promote some sort of, like, agenda from it." And my argument always been, like, you train a model on, you know, kind of on corporate speak, you know, average news, and a lot of academia, that's gonna kind of follow into that. And I remember Elon Musk was very critical about it, and then when he trained the first version of Grok, it did the same thing. And then he's like, "Oh, yeah, when you trained it on this sort of thing, it did that." And internally, at OpenAI, there were discussions about how do we make the model not try to push you, not try to steer you? Could you go a little bit how you try to make that work?

    2. MC

      Yeah. So I think, um, at its core, it's a measurement problem, right?

    3. AM

      Mm.

    4. MC

      And I think it's actually bad to downplay these kind of concerns because they are very important things.

    5. AM

      Mm-hmm.

    6. MC

      Right? And, um, we need to make sure that the model, the default behavior that you get is something that's centered, that, you know, doesn't reflect bias-

    7. AM

      Mm

    8. MC

      ... um, on the political spectrum, um, or in, in many other, you know-

    9. AM

      Mm

    10. MC

      ... uh, axes of bias. And at the same time, you know, you do want to allow the user the capability to... You know, if you, you wanted to talk to, a, a reflection of something with more conservative values, to be able to steer that a little bit, right?

    11. AM

      Mm-hmm.

    12. MC

      Um, or liberal values, right? And, and so I think the thing is, you want to make sure the defaults are meaningful, and they're centered, and that's a measurement problem.

    13. AM

      Mm-hmm.

    14. MC

      And you also want to give ability, some flexibility, right, within bounds, to steer the model to be a persona that you want to talk to.

    15. NT

      I think that's right. Um, I think, you know, in addition to neutral defaults, ability to sort of bring your own values to some extent-

    16. AM

      Mm

    17. NT

      ... I think, you know, being transparent about the whole thing is I think really, really important. Uh, I'm not a fan of, of, you know, secret system messages that, you know, try to, like-

    18. AM

      Mm

    19. NT

      ... you know, hack the model into saying or not saying something.

    20. AM

      Mm-hmm.

    21. NT

      Um, what we've tried to do is publish our spec, so you can go look at, you know, if you're getting certain model behavior, is that a bug? Um, you know, is it in violation of our own stated spec, or is it actually in the spec, in which case you know who to criticize and who to-

    22. AM

      Mm

    23. NT

      ... who to yell at, or is it just underspecified in the spec, in which case that allows us to improve it and add more specificity into that document. So by sort of publishing the rules of the AI that it's supposed to be following, um, I think that's an important step to have more people contribute to the conversation than just the people inside of OpenAI.

    24. AM

      So we're talking about, like, the system prompt, the part of the instruction that the model gets before the user puts the input in.

    25. MC

      Well, I think it's more than that.

    26. NT

      Yeah.

    27. MC

      Yeah.

    28. AM

      Yeah.

    29. NT

      Um, the system prompt is one way to steer the model-

    30. AM

      Mm
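As a concrete anchor for Andrew's definition above: a "system prompt" is simply an instruction message placed ahead of the user's turn in the conversation the model receives. The sketch below uses the common chat "messages" convention; the steering text is invented for illustration, and, as Nick and Mark note, the system prompt is only one steering layer among several (specs, defaults, user customization).

```python
# Minimal sketch of what a "system prompt" is: an instruction message that
# sits before the user's input in the conversation a chat model actually
# sees. The instruction text here is purely illustrative.
def build_conversation(system_instructions, user_input):
    """Assemble the message list passed to a chat model."""
    return [
        {"role": "system", "content": system_instructions},  # set before user input
        {"role": "user", "content": user_input},
    ]

convo = build_conversation(
    "Answer neutrally; do not promote any political viewpoint.",
    "Summarize the arguments for and against this policy.",
)
print(convo[0]["role"])  # prints "system"
```

Publishing the spec that governs this layer, as Nick describes, is what lets outsiders tell a model bug apart from intended behavior.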

  8. 20:00–22:50

    Memory and the future of personalization

    1. AM

      I, I find myself having longer conversations with it. I like the memory function.

    2. NT

      Mm.

    3. AM

      I like the fact you can turn it off if you don't want. And I think about, like, you know, what's this gonna be two years from now or three years from now when it has a much longer memory, much more context with this? I like the idea to have these sort of like, you know, memento anonymous modes too-

    4. NT

      Mm

    5. AM

      ... where it's not gonna store this.

    6. NT

      Mm.

    7. AM

      But I, I kind of wonder how much you've been thinking about two years, three years down the road. What, what's that going to be like when ChatGPT knows way more about you?

    8. MC

      Yeah, I mean, I think memory is just such a powerful feature.

    9. AM

      Mm.

    10. MC

      In fact, it's one of the most requested features when we-

    11. AM

      Mm

    12. MC

      ... talk to people externally. Um, it's like, "This is the thing I really want to pay, pay more for." And I think, um, you know, you liken it to, if you've ever kind of had a personal assistant, you know, you-

    13. AM

      No, I have not. [laughing]

    14. MC

      [laughing] Well, you, you do need to build up context-

    15. NT

      Not relatable, Mark.

    16. MC

      Over time.

    17. AM

      Me neither. [laughing]

    18. NT

      [laughing]

    19. MC

      I'm sorry, guys. I'm sorry, guys. But, you know, it's, yeah, it's just like, it's, um, kind of in any kind of relationship that you have with a person, right?

    20. AM

      Mm.

    21. MC

      You, you build up context with them over time.

    22. AM

      Mm-hmm.

    23. MC

      Um, and I think just the more they know about you, right, the richer the relationship, the more, you know, um, they can also help you, right? Uh, you can, uh, work together to collaborate on tasks together.

    24. AM

      I, I do become-

    25. MC

      Yeah

    26. AM

      ... self-conscious of the fact that, like, it knows everything about me when I'm grumpy, and I've, I've, I've argued with it recently, by the way. Um-

    27. NT

      That's good.

    28. AM

      Yeah.

    29. NT

      Um, you should be able to argue with it, and-

    30. AM

      Yeah

  9. 22:50–29:00

    ImageGen’s breakthrough moment

    1. AM

      seemed like it preferred a certain kind of image, and a lot of the utility and the capabilities for variable binding was sort of, kind of hidden away.

    2. NT

      Mm-hmm.

    3. AM

      And then ImageGen was kind of just this breakthrough moment that it caught me off guard.

    4. NT

      Mm.

    5. AM

      How did you guys feel about the launch of that?

    6. MC

      Yeah, honestly, it caught me off guard, too. Um, and this is really props to the research team. You know, um, Gabe, in particular, did a ton of work here.

    7. NT

      Mm.

    8. MC

      Um, Kenji, many others on the team-

    9. NT

      So amazing

    10. MC

      ... did phenomenal work. And, um, I think it really spoke to this thesis, that when you get a model which is good enough that in one shot it can generate an image that fits your prompt, that's gonna create immense value.

    11. AM

      Mm.

    12. MC

      And I think we never quite had that before, right? Um, that you just get the perfect generation, oftentimes on the first, first try. Um, and I think that's something very powerful. You know, like, uh, people don't want to pick the best out of a grid. I think, uh, yeah, you just got very good prompt following and, you know, just great style transfer, too, right?

    13. AM

      Yeah.

    14. MC

      Um, this ability to kind of put images, um, as context for the models and to modify and to change, and the fidelity that you could do that with, um, I think that was really powerful for people.

    15. NT

      I think, I think this ImageGen experience, um, it was just kind of another mini-ChatGPT moment-

    16. MC

      Mm-hmm

    17. NT

      ... um, all over again, where-

    18. MC

      Mm

    19. NT

      ... you know, you have kind of this, you've been staring at this for a while, you're like, "Yeah, it's gonna be cool. I think people are really gonna like it." Um, but you kind of, you know, you're launching, like, 20 different things, and then suddenly the world is going crazy in a way that you, you kind of only find out, um, um, by shipping. Like, I remember distinctly, you know, we had, like, t-... 5% of the Indian internet population tried- [laughs] - um, ImageGen over the weekend. And I was like, "Oh, wow, we're reaching new types of users-

    20. AM

      Mm

    21. NT

      -who we wouldn't even have thought, you know, who, who might not have thought of using ChatGPT. That's really cool. And, and, um, to Mark's point, I, I think a lot of this is, um, because there's just this discontinuity where something suddenly works so well, and truly the way you expected, um, where I think it, it blows people's minds, you know? And I think we're gonna have those moments in other modalities, too. You know, I think voice-

    22. AM

      Mm.

    23. NT

      -you know, it, it hasn't quite passed the Turing test yet-

    24. AM

      Mm

    25. NT

      ... but I think the minute it does, people are gonna, um, I think, find that immensely powerful and valuable. You know, video is gonna have its own moment, where it-

    26. AM

      Mm

    27. NT

      ... starts meeting the expectations that users have. So I'm really excited about the future because I think there's so many of these magical moments coming that are really gonna transform people, um, people's lives and, and also, you know, change, um, sort of ChatGPT's relevance for people. Because, um, you know, there's, um... I've always felt like there's text people and there's image people, and, like, some of them are a little bit different. Um, and now they're all using the product and discovering the value, um, across the board.

    28. AM

      The moment when it launched, I think it kind of illustrated the, the problem that had been with image models before.

    29. NT

      Mm.

    30. AM

      And, you know, when DALL-E came out, it was super exciting 'cause you're like, "I'm, I'm, like, doing pictures of space monkeys," and all these sorts of things. The moment you try to do a really complex image, and that's the, the phrase I brought up before-

  10. 29:00–33:10

    Cultural shifts in safety and the freedom to explore

    1. AM

      the technological ability to control for things, and how much of that was just saying, "We've got to push the norms?"

    2. NT

      I would say it was both cultural shift and an improvement-

    3. AM

      Mm

    4. NT

      ... in our ability to control things. The culture shift, you know, I'm, I'm, I'm not gonna deny it. I think when I joined OpenAI, there was, um, a lot of conservatism, um, around, you know, what capabilities we should give to users-

    5. AM

      Mm

    6. NT

      ... maybe for good reason.

    7. AM

      Yeah.

    8. NT

      The technology is really new. Um, a lot of us were new to working on it, and, you know, if you're gonna have a bias, you know, biasing towards safety and being careful, it's not a bad, you know, in, in DNA to have. But I think over time, we, we learned that there's so many positive use cases that you, um, e- effectively prevent when you make arbitrary restrictions of the model.

    9. AM

      What about faces? Why not? Why can't I make any face I want?

    10. NT

      Um, so this is a good example of, of, um, a, you know, capability that, that's got pros and cons, and you can err on one side or the other. But, you know, um, when we, um, first shipped, um, image uploads-

    11. AM

      Mm

    12. NT

      ... um, into ChatGPT, uh, we had some debates, you know, about what, what capabilities do you allow versus where are you conservative? And I think one debate we had is, like, do we upload- uh, allow the upload of images with faces? Or rather, when you upload an image that contains a face, do you, um, you know... Um, should we just, like, gray out the face? Because you avoid so many problems, right?

    13. AM

      Yeah.

    14. NT

      You can make inferences about people based on, on their face. Um, you could say mean things to people based on, uh, um, their face. Um, um, and, and, you know, you would just take a giant shortcut on all the gnarly issues if you didn't allow that. But, um, I've always felt we need to-

    15. NT

      ... um, err on the side of freedom, and we need to do the hard work. And I think in this case, you know, there's so many valid ways. You know, if I want feedback on makeup or on my haircut or anything like that, um, I wanna be able to talk to ChatGPT about it. That- those are valuable and benign use cases, and I would prefer to allow and then study, you know-

    16. AM

      Mm-hmm

    17. NT

      ... where does that, um, um, fall short? Where is that harmful? And then iterate from there versus taking a default stance on disallow. And I think that's one of those ways in which our stance and posture has changed a bit over time in terms of where we set, you know, where we start.

    18. AM

      Yeah, we- we're very good, I think, at imagining worst-case scenarios. What if I use this, these faces to evaluate hires for a company or whatever? But also it's like, "Hey, is this eczema?" [chuckles] You know, like, you know, there's a lot of utility there.

    19. NT

      And, and honestly, I think there are certain domains of, of AI safety where worst-case scenario thinking is very appropriate.

    20. AM

      Mm-hmm.

    21. NT

      So I think that is an important way of thinking about risk-

    22. AM

      Mm-hmm

    23. NT

      ... when it comes to certain forms of risks that are existential or even just very, very bad. You know, uh, we have the preparedness framework, which helps us reason through some of those things. You know, um, can the AI let you make a, a bioweapon? It's good to think about the worst case there-

    24. AM

      Mm-hmm

    25. NT

      ... because it could be really, really bad. So you kind of have to have that way-

    26. AM

      Mm

    27. NT

      ... of thinking in the company, and you have to have certain topics where you think about, um, safety in that way. But you can't let that kind of thinking spill over onto other domains of safety, where the stakes are lower, because you end up, I think, making very, very, um, um, um, conservative decisions that, that block out many valuable use cases. So I think being sort of principled about different types of safety on different time horizons and with different levels of stakes is very important for us.

    28. AM

      I think I want a blunt mode sometimes, and just... 'Cause, like, right now-

    29. NT

      Where it actually roasts you?

    30. AM

      Well, I mean, like, yeah, 'cause I'll ask the model, like, 'cause with the, the, the voice-in, speech-out model, I'll be like: "Do I sound tired?" And it's like, "Well, you know, I don't really wanna, you know..." And I'll be like, "Yeah, you know..." You're just trying to get it to be honest.

  11. 33:10–37:45

    Code, Codex, and the rise of agentic programming

    1. AM

      Codex is somehow back [chuckles] and, you know, a new, new form, uh, same name, but the capabilities keep increasing. And we've seen code work its way first into, uh, VS Code via Copilot and then, uh, Cursor, and then, like, Windsurf, which I use all the time now. What uh- how much pressure has there been in the code space? Because I'd say that if we asked people, "Who made the top code model?" we might get different answers.

    2. MC

      Yeah, and I think it reflects that when people talk about coding, uh, they're talking about a lot of different things-

    3. AM

      Mm

    4. MC

      ... right? I think there's coding in a specific paradigm. Like, if you pull up an IDE, and you wanna kind of get a completion on a, on a function, that's very different from, you know, agentic-style coding-

    5. AM

      Mm

    6. MC

      ... where, you know, you ask, uh, you know, "I want, I want this PR," and, you know, um, and I think we've done a lot of focus-

    7. AM

      Could you- uh, sorry, could you, uh, unpack a little bit what you mean by agentic coding?

    8. MC

      Yeah, yeah. So I think, um, when you... You draw a distinction between more kind of real-time response models.

    9. AM

      Mm.

    10. MC

      Um, you can think of ChatGPT, uh, to first order, uh, a- as you ask a, um, a, a prompt, and then you get a response fairly-

    11. AM

      Mm

    12. MC

      ... fairly quickly, and then a more agentic-style model-

    13. AM

      Mm

    14. MC

      ... where you give it a fairly complicated task.

    15. AM

      Mm-hmm.

    16. MC

      You let it work in the background, and after some amount of time, it comes back to you with what it thinks is something close to the best answer.

    17. AM

      Mm-hmm.

    18. MC

      Right? And I think we see increasingly that the future will look like more of an async kind of, uh, you know, where you're asking it very difficult, hard things.

    19. AM

      Mm-hmm.

    20. MC

      And, um, you're letting the model think and reason and come back to you with really the best version of what, what it can come back with. And we see the evolution of code in that way, too. I think eventually we do see a world where you'll kind of give a very high-level description of what you want-

    21. AM

      Mm-hmm

    22. MC

      ... and the model will take time, and, um, it'll come back to you. And so I think, uh, our, our first launch, Codex, really, um, reflects that kind of paradigm, where, uh, we are giving it PRs, units of fairly heavy work-

    23. AM

      Mm-hmm

    24. MC

      ... um, that encapsulate, you know, a, a new feature or, you know, a big bug fix, and we want the model to spend a lot of time thinking about how to accomplish this thing, rather than kind of give you a fast response.
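
Chen's synchronous-versus-agentic distinction maps onto a familiar programming pattern: submit a long-running task, keep working, and collect the result only when you need it. A minimal sketch in Python, where `agentic_task` is a hypothetical stand-in for the model's background work on a PR:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def agentic_task(description: str) -> str:
    """Hypothetical stand-in for an agent grinding on a PR-sized task."""
    time.sleep(0.1)  # simulates minutes of background reasoning and tool use
    return f"PR ready: {description}"

with ThreadPoolExecutor() as pool:
    # Fire-and-continue: kick off the task, do other work, collect later.
    future = pool.submit(agentic_task, "fix the flaky login test")
    # ... the caller is free to do synchronous work here ...
    result = future.result()  # block only when the answer is finally needed

print(result)  # → PR ready: fix the flaky login test
```

The same shape, a future you can walk away from, is what a Codex-style task queue exposes as work you check back on.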

    25. AM

      Mm-hmm.

    26. NT

      And to your question, you know, there, there's, there's, there's... Coding is such a giant space.

    27. AM

      Mm.

    28. NT

      There's so many different angles at it. Kind of like talking about knowledge work or something-

    29. AM

      Mm

    30. NT

      ... incredibly broad-

  12. 37:45–41:45

    Coding with taste

    1. AM

      how are we going to adapt to all that?

    2. MC

      Right. Yeah, I mean, specifically in code, right, I think there's more beyond, did it get you the right answer? With code, you know, people care about the style of the code.

    3. AM

      Mm.

    4. MC

      They care about, you know, how verbose it was in the comments. They care about, um, you know, how much proactive work did the model do for you, right, um, on other functions? And so, I think, you know, there's a lot to get right, and users often have very different preferences here.

    5. NT

      Yeah, it's funny, I used to, I used to- you know, people used to ask me, "Well, what domains are gonna, like, you know, be transformed by AI?" You know, fastest, and I used to say, "Yeah, it's code," because, like, similar to math and other things, it's very, very verifiable and testable, and I think those are the domains that are particularly great to do-

    6. MC

      Mm

    7. NT

      ... RL on, and, you know, you're therefore gonna see all this, this awesome, you know, agentic stuff just suddenly work. I still think that's true, but the thing that surprised me about code is that, you know, there is still so much of an element of taste in terms of-

    8. MC

      Mm

    9. NT

      ... what makes good code.

    10. MC

      Mm.

    11. NT

      And there's, you know, there's a reason that, you know, people train to be a professional software engineer. It's not because their IQ gets better, because they- but rather because they learn, you know, um, how, how to build software inside an organization. What does it mean to write good tests? What does it mean to write good n- um, documentation? How do you respond when someone disagrees with your code?

    12. MC

      Mm-hmm.

    13. AM

      Mm.

    14. NT

      Those are all actual elements of being a real software engineer that we're gonna have to teach these models, um, to do. Uh, so I, I expect progress to be fast, and I still think code-

    15. MC

      Mm

    16. NT

      ... has a ton of nice properties that make it very ripe, um, for agentic, um, products. But, but, but I, I do think it's very interesting to the degree that, you know, the, the, the, the element of taste and style and, um, real-world, um, um, software engineering matters.
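
Turley's point about verifiability is what makes code unusually RL-friendly: a candidate program can be scored mechanically against tests, giving a reward signal with no human grader. A toy sketch, with the spec and candidates invented for illustration:

```python
def reward(candidate, tests) -> float:
    """Fraction of test cases a candidate solution passes."""
    passed = sum(1 for arg, want in tests if candidate(arg) == want)
    return passed / len(tests)

# Spec: square the input. One correct candidate, one buggy one.
tests = [(2, 4), (3, 9), (-1, 1)]
good = lambda x: x * x
buggy = lambda x: x * 2

print(reward(good, tests))   # 1.0
print(reward(buggy, tests))  # ~0.33: only the (2, 4) case passes
```

Taste, style, and proactivity are precisely what this automatic check cannot see, which is his caveat.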

    17. AM

      It's interesting, too, because y- with ChatGPT and, and the other models, you're kind of dealing with having to bridge the divide between consumer and pro.

    18. NT

      Mm.

    19. AM

      I open up ChatGPT, and I tell my friends, like, "Oh, yeah," 'cause I'll plug it into whatever code model I'm working, 'cause I can actually connect it to there. And I think about, you know, well, that's a very different use case to a lot of other people. Although I've shown people, like, how to go in and use, you know, uh, an IDE, and actually have it just write documents [chuckles] for you and create folders and stuff, which people don't realize, like, yeah, you can do that. You can have ChatGPT actually control it and do that, which is cool, but then you think about, like, okay, we've got a tab now for images. There's the Codex tab, so if I want to connect to GitHub-

    20. NT

      Mm

    21. AM

      ... and have it work through there.

    22. NT

      Mm-hmm.

    23. AM

      And there's, uh, Sora-

    24. NT

      Mm-hmm

    25. AM

      ... into there. So it's kind of interesting to see how all of these things are coalescing into there. How do you differentiate between a consumer feature, a professional feature, and maybe, like, an enterprise feature?

    26. NT

      Look, um, we build very general purpose technology, and it's gonna be used by a whole range of folks. And u- unlike m- many companies, which have this kind of founding user type, and then-

    27. AM

      Mm

    28. NT

      ... they use technology to solve that user's problems, we do start often with the technology, observe who finds value in it, and then iterate for them. Now, with Codex, um, our goal was very much to build for, for professional software engineers, knowing, though, that there's sort of a splash zone, where I think a lot of other people will find value in it, and we'll try to make it accessible for those people as well. Um, there are a lot of opportunities to target non-engineers. I'm personally really motivated to create a world where, you know, or help, help build a world where, um, anyone can make software. Codex is not that product, but you could imagine those products existing over, over, over time. Um, but, you know, as a general principle, it's really hard to predict exactly who the target user is, um, until we made some of these general-purpose technologies, um, available, because it gets back to the empiricism I was talking about. Um, uh, we just never exactly know, um, where the value is gonna lie.

    29. MC

      Yeah, and I think even, um, to, to dig deeper into that, like, you know, you could have a person who's mostly using ChatGPT for coding, right?

    30. NT

      Mm.

  13. 41:45–43:40

    Internal adoption of Codex

    1. AM

      me, there are some tools you see that there's a lot of excitement about because there's a lot of internal demand for that.

    2. NT

      Mm.

    3. AM

      How much are you using it internally, or tools like that?

    4. MC

      More and more.

    5. AM

      [chuckles] Okay.

    6. NT

      I've, I've been really excited, um, to, to see the internal adoption. It's, it's everything from, you know, exactly what you'd expect, you know, people using, um, Codex to offload their tasks to... You know, we have a, um, um, analyst, um, workflow that will look at, you know, logging errors and automatically flag them and Slack people about it. Um, so there's all these, these ways that... Or I've actually heard, heard some people are using it as a to-do, where, like, future tasks-

    7. AM

      Mm

    8. NT

      ... they're, they're hoping to do, they're starting to fire off Cod- Codex tasks. So this is the perfect type of thing that I think you can, you can dogfood internally. Um, and, and, you know, um, I, I'm very excited about, you know, the leverage that engineers are gonna get out of a tool like this. I think it's gonna allow us to move faster-... um, uh, with, with, with the people we have, and make each engineer that we hire, um, you know, yeah, like 10 times more productive. So, so in some ways, internal, uh, usage is, is a very good predictor of, of where we want to take this.
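
The analyst workflow Turley describes (scan logs, flag errors, notify people) reduces to a small filter-and-alert loop. This sketch is a hypothetical illustration, not OpenAI's actual tooling; in practice `notify` would post to a Slack channel rather than print:

```python
import re

ERROR = re.compile(r"\bERROR\b")

def flag_errors(log_lines):
    """Pick out the log lines worth escalating to a human."""
    return [line for line in log_lines if ERROR.search(line)]

def notify(lines):
    # Stand-in for a Slack/pager integration.
    for line in lines:
        print(f"[escalate] {line}")

logs = [
    "INFO  request served in 12ms",
    "ERROR payment webhook timed out",
    "INFO  cache warmed",
]
notify(flag_errors(logs))  # → [escalate] ERROR payment webhook timed out
```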

    9. MC

      Yeah. I mean, we don't want to ship something to other people that we don't find value in ourselves.

    10. NT

      Mm-hmm.

    11. MC

      And I think, you know, leading up to the launch, we-

    12. AM

      Laundry Buddy.

    13. MC

      Laundry Buddy.

    14. NT

      Laundry Buddy is an essential partner.

    15. MC

      I think you and me use that today.

    16. NT

      Okay, [chuckles] sorry, sorry. [chuckles]

    17. MC

      Um, I mean, yeah, we- I mean, we had some power users, though, that, you know, hundreds of PRs a day, um-

    18. NT

      Mm

    19. MC

      ... that they were generating personally, right?

    20. NT

      Mm-hmm.

    21. MC

      So I think, you know, uh, there are people internally finding a lot of utility from what we're building.

    22. NT

      Also, the, uh, if, if you think about internal adoption, it's also a good reality check because, you know, people are busy. You know, adopting new to- tools-

    23. MC

      Mm

    24. NT

      ... tools takes some activation energy. So actually, um, the thing you find when you try to dogfood things internally is, is, is some of the reality component of how long it takes people to actually adjust to a, a new workflow, and it's, it's been, it's been humbling to, to, to watch, right?

    25. MC

      Mm-hmm.

    26. NT

      So, so I think you learn both about the technology, but you also learn about

  14. 43:40–46:45

    Skills that matter: curiosity, agency, adaptability

    1. NT

      some of the adoption patterns when you're trying to get a bunch of busy people to change the way they write code.

    2. AM

      As you build these tools, internally, people have to learn how to use them and are having to adapt, and there's a lot of question now about kind of what kind of skills do people need in the future.

    3. NT

      Mm.

    4. AM

      You know, what kind of skills do you look for on your teams?

    5. NT

      I've thought about this a lot. Um, hiring is hard, especially if you want to have a small team that is very, very good and humble and able to move fast, et cetera.

    6. AM

      Mm-hmm.

    7. NT

      And I think curiosity has been the number one thing that I've, I've looked for, and it's actually my advice to, you know, students when they ask me, "What do I do in this world where everything's changing?" Because, I mean, for us, there's so much that we don't know. There's a certain amount of humility you have to have about building on this technology, uh, because you don't know what's valuable, you don't know what's risky until you really study and go deep and, and try to understand. And when it comes to working with AI, which, you know, we obviously do a lot, not just in code, but in kind of every facet of, of our work, um, it's asking the right questions that is the bottleneck, not necessarily getting the answer. So I really fundamentally believe that we need to hire people who are deeply curious about, um, the world and what we do. I care a little bit less about their experience in AI. Mark, um, [chuckles] presumably feels a bit different about that one. [chuckles]

    8. MC

      [chuckles]

    9. NT

      But for the product side, uh, it's been curiosity that I've, I've, um, found the most- the best predictor of success.

    10. MC

      No, I mean, even on research, I think we increasingly index less on, uh, you have to have a PhD in AI, right? I think, uh, this is a field that people can pick up fairly quickly.

    11. NT

      Mm-hmm.

    12. MC

      I also came into the company as a resident without much formal AI training. And I think correlated to what Nick said, I think one important thing is for our new hires to have agency, right?

    13. NT

      Mm.

    14. MC

      OpenAI is a place where you're not gonna get so much of a, "Oh, here's... Today, you're gonna do thing one, thing two, thing three." Um, it's really about being kind of driven to find, "Hey, here's the problem. You know, no one else is fixing it. I'm just gonna go dive in and fix it." Um, and also adaptability, right?

    15. NT

      Mm-hmm.

    16. MC

      It's a very fast-changing environment. That's just the nature of the field right now, and you need to be able to quickly figure out what's important and pivot what you need to do.

    17. NT

      The agency thing is re- uh, is real, you know.

    18. MC

      Mm.

    19. NT

      I think we often get asked, you know, "How does OpenAI, you know, keep shipping, and, you know, you- it feels like you're, you're pushing something out every, every week," or something like that. It's, A, funny, because it never feels that way to me. I always feel like, you know, we could go... be going even faster. [chuckles]

    20. MC

      [chuckles]

    21. NT

      Um, uh, but, but, you know, I, I think fundamentally, we just have a lot of pe- people with agency who can ship.

    22. AM

      Mm-hmm.

    23. NT

      Um, when that comes to product, that comes to research-

    24. MC

      Yeah

    25. NT

      ... that comes to policy. Shipping can mean different things. Uh, we all do very different things at OpenAI, but I think the ratio of people who can actually do things, um, and, you know, the lack of red tape, except where it matters-

    26. AM

      Mm-hmm

    27. NT

      ... you know, there's a couple areas where I think red tape is very, very important. But, you know, um, um, I think, I think that is what makes OpenAI very unique, and it obviously affects the type of people who we want to hire, too.

    28. AM

      I was brought into the company- 'cause I was originally given access to GPT-3.

    29. NT

      Mm-hmm.

    30. AM

      And I just started showing all these use cases

  15. 46:45–51:30

    OpenAI’s “Do Things” culture

    1. AM

      for it and making videos every week for it.

    2. NT

      I remember.

    3. AM

      Yeah, and I was annoying people, I'm sure. [chuckles] But I was-

    4. MC

      No, it was not. It was really fascinating.

    5. AM

      It was exciting.

    6. MC

      [chuckles]

    7. AM

      It was an exciting time. I, I described it to people like, "They, you know, they... I think they built a UFO, and I get to play with it," you know?

    8. NT

      Yeah.

    9. AM

      And then I make it hover, and like: "Oh, you made it hover!" I'm like, "Well, they built it."

    10. NT

      [chuckles]

    11. AM

      "I just pressed the button-

    12. MC

      [chuckles]

    13. AM

      ... and got it to do that." But that was just what I found very empowering, was the fact that I, I- I'm self-taught. I learned to code by Udemy courses and stuff.

    14. NT

      Mm-hmm.

    15. AM

      And then to be a member of the engineering staff and be told, "Just go, just go do stuff."

    16. NT

      Mm-hmm.

    17. AM

      Um, nothing too critical. I didn't break anything or anybody. Um, and that's good to know that that kind of spirit is still there, and I think that is part of the reason why OpenAI is able to ship, even though, you know, it was like 150, 200 people worked on GPT-4.

    18. NT

      Mm-hmm.

    19. AM

      I think people forget about that, you know?

    20. NT

      Totally. And, and honestly, this is how, and e- even ChatGPT, this is how, how, how it came together. You know, we, we had a research team. They'd been working, you know, um, for, for a while on instruction following, and then the successor to that, and, you know, uh, post-training these models, um, to be good at chat. Uh, but the product effort came together as a, as a hackathon. I remember distinctly we said, like, "Who, who, who, who's excited to, you know, go build consumer products?"

    21. AM

      Mm.

    22. NT

      And we had all these different people. Like, we had a guy from the supercomputing team, uh, who, you know, was like, "I'll make an iOS app. I've done that-

    23. AM

      Mm

    24. NT

      ... in a past life." Or we had, you know, a researcher who wrote some backend code, and it was this convergence of people who were excited to do stuff, and I think the ability to do so, and I think that's how you get the next ChatGPT is, is, is running an organization where, where, where that is possible and continues to be possible as you scale.

    25. AM

      Hackathons were my favorite thing, 'cause, one, being a performer and loving show and tell, but it was just neat to be able to see things that you knew were gonna be a product or something later on. 'Cause when you're playing with a technology this advanced and all that. Do you guys still do them?

    26. MC

      Yeah, absolutely.

    27. AM

      Okay.

    28. MC

      Yeah. Um, we've had some fairly recently, and they are typically tied-

    29. NT

      Last week, actually.

    30. MC

      Yeah, I know. [chuckles]

  16. 51:30–55:15

    Adapting to an AI future

    1. AM

      of an optimist because I see all these opportunities or places to go in there. What advice do you give people, you know, where- whatever point they are in life, about preparing for or adapting to or being part of the future?

    2. NT

      You know-

    3. AM

      I like how Mark just looked right to Nick and said, "You take this." [laughing]

    4. MC

      [laughing] Oh, no. I can go. Okay, I will jump in right now.

    5. AM

      Oh, yeah, right. [laughing]

    6. MC

      Yeah, no, I think the-

    7. AM

      Go, go

    8. MC

      ... important thing is you have to really lean into using the technology, right?

    9. AM

      Yeah.

    10. MC

      And you have to see how your own capabilities can be enhanced, how you can be more productive, more effective, by using the technology. I fundamentally do think that the way this is gonna evolve is you will still have your human experts, but what AI helps the most is the people who don't have that capability at a very advanced level, right? So if you imagine, right, like, uh, as these models get much better at healthcare advice, um, they're gonna help people who don't have access to care the most.

    11. AM

      Mm-hmm.

    12. MC

      Right? Uh, image generation, right? It's not producing, you know, an alternative for, you know, experts or, you know, professional artists. It's allowing people like me and Nick to create creative expressions, right? Um, and so I think it's kind of a rising tide that allows people to be competent and effective at a lot of things all at once, and I think that's kind of how we're gonna see a lot of these tools bootstrap people.

    13. NT

      The world's gonna change a lot, and I think truly everyone has a moment where the AI does something that they considered sacred and human. Um, um-

    14. AM

      I know a guy that got vested and, or felt very threatened about his achievements in code and abilities.

    15. NT

      Well, that, that happened for me a long time ago. [laughing] Let's talk about someone else in the room.

    16. MC

      Oh, yeah. I mean, yeah.

    17. NT

      [laughing]

    18. MC

      It's definitely better than me at a lot of code problem-solving, for sure.

    19. AM

      Yeah.

    20. NT

      Right. So I think it's deeply human to, to, to feel some level of-

    21. AM

      Mm

    22. NT

      ... um, awe, respect, uh, and maybe even fear. And I think to Mark's point, be- actually using this thing can demystify it. I think we all grew up or, you know, learned about the word AI, um, in a world where AI meant something pretty different from what we have today.

    23. AM

      Mm-hmm.

    24. NT

      You've got these algorithms that, you know, try to sell you things, try to do things, and... or you've got movies, you know, where the AI takes over, et cetera. You know, like, that term means so many things to different people, that I'm entirely unsurprised that, you know, um, there's fear. So-

    25. AM

      Mm

    26. NT

      ... actually using the thing is, is I think the best way to have a grounded conversation, um, about it. And then I think from there, the best way to prepare, I, I think there's some degree to which you need to understand the products and keep up, sure, but I think things like prompt engineering or sort of understanding the intricacies of this AI, they're kind of not the right direction. I, I, I think sort of there's fundamental human things, like learning how to delegate.

    27. AM

      Mm-hmm.

    28. NT

      Um, that is incredibly important because increasingly, you know, you're gonna have an intelligence in your pocket that it can be your tutor, it can be your, um, advisor, it can be your software engineer.

    29. AM

      Mm.

    30. NT

      Um, it's much more about you understanding yourself and the problems you have and how someone else might help than a specific understanding of AI. Um, so I think that's gonna be important. Curiosity, I mentioned it earlier, I think asking the right questions. You'll get- you only get what you put in, right? Um, that's important. And I think fundamentally being ready to learn new things. I think the more you learn, understand how to pick up new topics, i- and, and, and domains, et cetera, um, the more you're gonna be prepared for a world where, you know, the, the nature of work is shifting much faster than it's ever shifted before. So, um, I'm prepared that my job, you know, in product, is, is gonna look different or not exist at all... but, um, I am looking forward to p- picking up something new, and, and, and I think as long as you, you bring that perspective, um, you're well set up to leverage AI.

  17. 55:15–1:01:00

    The opportunities ahead: healthcare, research

    1. AM

      for coders or people to create code-

    2. NT

      Mm

    3. AM

      ... however it's done. Um, and you mentioned, like, the health field, and that's one of the things I hear people like, "Oh, when, you know, when we replace everything with AI," like, well, I mean, I would be very happy having an AI diagnose me, operate on me, and probably do everything else, but I do want somebody there to talk me through the procedure and hold my hand. But also, I want people asking questions, like, like, you know, every day I take a bunch of vitamins.

    4. NT

      Yeah.

    5. AM

      Is this the right time of day to take it? You know, I can't bother my doctor with all these silly little questions.

    6. NT

      I, I really don't think you end up displacing doctors. You'd end up displacing not going to the doctor.

    7. AM

      Yeah.

    8. NT

      You end up democratizing the ability to get a second opinion.

    9. AM

      Mm-hmm.

    10. NT

      Uh, very few people have that resource or know to, you know, take advantage of a resource like that. You end up bringing medical care into, uh, pockets of the world, uh, where that is not readily available, and you end up helping doctors gain confidence.

    11. AM

      Mm-hmm.

    12. NT

      You know, um, I think I oft- I've often heard from doctors that, you know, they already talk to existing colleagues, uh, to get a second opinion. In some cases, that's not possible, and I think you'd be surprised by the number of doctors that use ChatGPT. Um, now, on things like medicine, there's work to make the model really, really good, and we're excited to do that work.

    13. AM

      Mm.

    14. NT

      There's also work to prove that the model-

    15. AM

      Mm

    16. NT

      ... is really good because I think you're not gonna trust it until there's some degree of sort of legitimacy.

    17. AM

      Mm-hmm.

    18. NT

      And then there's work to explain the areas where the model might not be good, because increasingly, once it gets to human- and then superhuman-level performance, um, it's hard to frame exactly where it will fall short, which is also, um, hard, hard to sort of reckon with. But nonetheless, I think that opportunity is one of the things that gets me up in the morning. Education might be the other one.

    19. AM

      [chuckles]

    20. NT

      And, um, I think there's a tremendous opportunity to help people.

    21. AM

      What do you think is gonna surprise us the most in the next year to 18 months?

    22. MC

      I honestly think, um, it's gonna be the amount of research results that are powered, even in some small way-

    23. AM

      Mm

    24. MC

      ... by the models that we've built. And, um, one of the kind of quiet things that's taken the field by storm is the ability of the models to reason.

    25. AM

      Mm-hmm.

    26. MC

      And you already see some research papers-

    27. AM

      Could you... I'm gonna make you explain-

    28. MC

      Yeah, yeah

    29. AM

      ... when you say reason.

    30. MC

      Yeah. So this fits into the-

  18. 1:01:00–1:05:40

    Async workflows and the superassistant

    1. AM

      of view capability and then, uh, UI, was Deep Research. And Deep Research is probably the best example we maybe have of probably agentic sort of-

    2. MC

      Mm-hmm

    3. AM

      -model use right now, because it used to be you would ask for a model to tell you about a topic.

    4. MC

      Mm-hmm.

    5. AM

      It would-- you'd either get the data or just do a big search of the Internet, and then it would just summarize all that. Where Deep Research will go find some set of data, look at it, ask a question, then go find some new data, and come back to it-

    6. MC

      Mm

    7. AM

      ... and keep going on. And I think the first time I used it, other people used it, they were like: "Wow, this is taking a while." And then you added a UI change, so I can actually go away and go do something else.
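
The search-read-refine loop Mayne describes can be sketched as a bounded agent loop: query, read, derive a follow-up, query again, then synthesize. The tiny corpus, the follow-up heuristic, and the stopping rule here are all invented for illustration:

```python
def search(query: str) -> str:
    """Hypothetical retrieval step over a toy two-document corpus."""
    corpus = {
        "solar panel efficiency": "record cell hit 33% (see tandem cells)",
        "tandem cells": "perovskite-silicon tandems stack two absorbers",
    }
    return corpus.get(query, "")

def enough_evidence(notes: list) -> bool:
    return len(notes) >= 2  # toy stopping rule

def deep_research(question: str) -> str:
    notes, query = [], question
    for _ in range(5):  # bounded number of search-read-refine rounds
        finding = search(query)
        if finding:
            notes.append(finding)
        if enough_evidence(notes):
            break
        # derive the next query from what was just read
        query = "tandem cells" if "tandem" in finding else question
    return " | ".join(notes)

print(deep_research("solar panel efficiency"))
```

The value comes from the middle step: each round's reading shapes the next round's query, rather than one big search followed by a summary.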

    8. MC

      Mm-hmm.

    9. AM

      And then the lock screen on my phone will show me this is working, which was a paradigm shift. And I talked to Sam here about that, and Sam said that was a surprise to him-

    10. MC

      Mm

    11. AM

      ... was the fact that people would be willing to wait for answers. And now I've seen, uh, a new metric for models, is how long a model can spend trying to solve a problem-

    12. MC

      Mm

    13. AM

      ... which is a good metric if it ultimately solves it, and that's-- Has this been an update to you in how you think about these things? The idea of, like, "Oh, we don't just want..." And I guess you talked about this before, about agentic, and the idea that it's not just: "Give me the answer." It's like, "Take your time. Get back to me."

    14. MC

      Mm-hmm.

    15. NT

      I think, you know, to build a superassistant, you've got to relax constraints. Like, today, you have a product that is, you know, en- entirely synchronous, you have to initiate everything. Um, that's just not the maximally best way to help people.

    16. AM

      Mm-hmm.

    17. NT

      Like, if you think about a real-world, um, intelligence that you might get to work with, um, it has to be able to go off and do things over a long period of time. It has to be able to p- be proactive. Um, so I think there's like-- we're, we're sort of in this process of relaxing a lot of the, the constraints on the product and on the technology to better mimic a very, very helpful, um, entity.

    18. AM

      Mm-hmm.

    19. NT

      Um, the ability to go do five-minute tasks, you know, five-hour tasks, eventually five-day tasks, is, like, a very, very fundamental thing that I think is gonna-

    20. AM

      Yeah

    21. NT

      ... unlock a different degree of value in, in the product. So I've actually not been that surprised that people are willing to do that. Um, like, I, I don't really want to be sitting around, um, waiting for my coworker either. Um, and I think if the value is there, um, I'd, I'd gladly be doing other stuff and, and come back.

    22. MC

      Yeah, and we really don't do it just because, right? We do it out of necessity. The model needs that time-

    23. NT

      Mm

    24. MC

      ... to solve the really hard coding problem or the really hard math problem, and it's not gonna do it with less time, right? Uh, you can think about this as, I give you some kind of brain teaser, right? Your quick answer is probably, like, the intuitive wrong one, and you need the actual time to kind of work through other cases to, like, are there any gotchas here? Um, and I think it's that kind of stuff that ultimately makes robust agents.

    25. NT

      Mm.

    26. AM

      We, we've seen kind of there's, like, the, the paper of the moment where somebody comes out and says: "Ah, I found a, a blocker." And I remember there was one a month or so ago-

    27. MC

      Mm

    28. AM

      ... and they said models couldn't solve certain kinds of problems, and it wasn't hard to figure out a prompt that you could train into a model and it could solve those kinds of problems. And we had a new one that talked about how they would fail at certain kinds of problem-solving ones, and that was kind of quickly, I think, debunked by showing-

    29. MC

      Mm

    30. AM

      ... that, you know, the paper kind of had flaws in there. But there are limitations. There are things that there, there might be some blockers and thing-- or things we don't know are gonna be there. I think brittleness is one of the things. There is a point where models can only spend so much time solving a problem. We're probably at a point where we're only having the model, you know, maybe two systems watch each other, and we have to think about how a third system steps in, you know, to watch for things to break down. But do you see kind of any blockers between here and where I'm getting the models that are gonna be solving-- You know, doing things like coming up with interesting scientific discoveries?

  19. 1:05:401:07:17

    Favorite ChatGPT tips

    1. NT

      worthwhile and part of our mission to do this all.

    2. AM

      All right, last question, and I'll begin. It's, uh: What's your favorite user tip for ChatGPT?

    3. MC

      Mm.

    4. AM

      Mine is I take a photograph of a menu, and I'm like: "Yeah, help me plan a meal," or whatever, if I'm trying to like, you know, stick to a diet or whatever.

    5. NT

      See, I really want that use case, but, like, I've been trying it for wine lists, and that is my eval on multimodality.

    6. AM

      Interesting.

    7. NT

      It still doesn't work, like-

    8. AM

      Really?

    9. NT

      It keeps embarrassing me with, like, hallucinated wine recommendations, and I go order it, and they're like, "Never heard of this" [laughing]

    10. MC

      [laughing] Oh.

    11. NT

      So I'm glad yours works.

    12. AM

      I see.

    13. NT

      Uh, but for me, that's the, that's still the use case-

    14. AM

      Well, I mean, it c- I-- maybe the wine list is too dense.

    15. NT

      [laughing]

    16. AM

      That was a problem, that was a problem with Operator-

    17. NT

      Yeah

    18. AM

      ... was that, like, originally-

    19. MC

      Mm

    20. AM

      ... was that with the vision models, too much dense text, it just loses its placement.

    21. NT

      Yeah.

    22. MC

      Yeah, I mean, speaking to DeepResearch, I love using DeepResearch. And, you know, when I go meet someone new, um, when I'm gonna talk to someone about AI, right? I just pre-flight topics, right? I, I think the model can do a really good job of contextualizing who I am, who I'm about to meet-

    23. AM

      Mm-hmm

    24. MC

      ... and what things we might find interesting. Um, and I think it, it really just helps with that whole process.

    25. AM

      Very cool.

    26. NT

      Yeah. I'm a voice believer.

    27. AM

      Mm.

    28. NT

      I, I, it's still got... Um, I, I don't think it's entirely mainstream yet because it's got, it's got many little kinks that all add up.

    29. AM

      Mm-hmm.

    30. NT

      But, for me, you know, half of the value of voice is actually just having someone to talk to and forcing yourself to articulate, um, um, yourself. And I, I find that to sometimes be very difficult to do in writing. So, on my way to work, I'll use it to process my own thoughts. And, like, with some luck, and I think this works most days, I'll have sort of a structured list of to-dos by the time I actually get there, so...

Episode duration: 1:07:17
