Andrej Karpathy on Code Agents, AutoResearch, and the Loopy Era of AI

What happens when AI agents can design experiments, collect data, and improve, without a human in the loop? Andrej Karpathy joins Sarah Guo on the state of models, the future of engineering and education, thinking about impact on jobs, and his project AutoResearch: where agents close the loop on a piece of AI research (experimentation, training, and optimization, autonomously).

00:00 Andrej Karpathy Introduction
02:55 What Capability Limits Remain?
06:15 What Mastery of Coding Agents Looks Like
11:16 Second Order Effects of Natural Language Coding
15:51 Why AutoResearch
22:45 Relevant Skills in the AI Era
28:25 Model Speciation
32:30 Building More Collaboration Surfaces for Humans and AI
37:28 Analysis of Jobs Market Data
48:25 Open vs. Closed Source Models
53:51 Autonomous Robotics
1:00:59 MicroGPT and Agentic Education
1:05:40 Conclusion

Sarah Guo (host), Andrej Karpathy (guest)

Mar 20, 2026 · 1h 6m

0:00 – 2:55
Andrej Karpathy Introduction
1. SGSarah Guo
  Code's not even the right verb anymore, right?
2. AKAndrej Karpathy
  [laughs] Yeah.
3. SGSarah Guo
  But I have to, um, express my will to my agents for-
4. AKAndrej Karpathy
  Manifest
5. SGSarah Guo
  ... sixteen hours a day. Manifest.
6. AKAndrej Karpathy
  How can I have not just a single session of Claude Code or Codex or some of these agent harnesses? How can I have more of them? How can I do that appropriately? The agent part is now taken for granted. Now the claw-like entities are taken for granted, and now you can have multiple of them, and now you can have instructions to them, and now you can have optimization over the instructions. But-
7. SGSarah Guo
  [laughs]
8. AKAndrej Karpathy
  ... there-- I mean, this is why it gets to the psychosis, is that this is, like, infinite and everything is skill issue.
9. SGSarah Guo
  [upbeat music] Hi, listeners. Welcome back to No Priors. Today, I'm here with Andrej Karpathy, and we have a wide-ranging conversation for you about code agents, the future of engineering and AI research, how more people can contribute to research, what's happening in robotics, his prediction for how agents can reach out into the real world, and education in this next age. Welcome, Andrej. Andrej, thanks for doing this.
10. AKAndrej Karpathy
  Yeah, thank you for having me. [laughs]
11. SGSarah Guo
  Uh, so it's been a very exciting couple of months in AI.
12. AKAndrej Karpathy
  Uh, yeah, [laughs] you could say that.
13. SGSarah Guo
  I remember, um, walking into the office at some point, and you were, like, really locked in, and I was asking what you were up to, and you're like, "I just... I have to code for sixteen hours a day," or code's not even the right verb anymore, right?
14. AKAndrej Karpathy
  [laughs] Yeah.
15. SGSarah Guo
  But I have to, um, express my will to my agents for-
16. AKAndrej Karpathy
  Manifest
17. SGSarah Guo
  ... sixteen hours a day. Manifest. Um, because, like, there's been a jump in capability.
18. AKAndrej Karpathy
  Yeah.
19. SGSarah Guo
  Uh, what's happening? Tell me about your experience.
20. AKAndrej Karpathy
  Yeah, I kinda feel like I was just in this perpetual... I still am often, uh, in this state of AI psychosis just, like, all the time, uh, because there was a huge unlock in what you can achieve as a person, as an individual, right? Because you were bottlenecked by, you know, your typing speed and so on. But now with these agents, it really... I would say in December is when it really just... something flipped, where I kinda went from eighty-twenty of like, you know, uh, to, like, twenty-eighty of writing code by myself versus just delegating to agents. And I don't even think it's twenty-eighty by now. I think it's a lot more than that. I don't think I've typed, like, a line of code probably since December, basically, [laughs] um, which is, like, an extremely large, uh, change. Um, I was talking to, like, for example, I was talking about it to, for example, my parents and so on, and I don't think, like, a normal person actually realizes that this happened or how dramatic it was. Like, literally, like, if you just find a random software engineer or something like that at their, at their desk and what they're doing, like, their default workflow of, you know, building software is completely different as of basically December. Uh, so I'm just, like, in this state of psychosis of trying to figure out, like, what's possible, uh, trying to push it to the limit. How is it-- how can I have not just a single session of, you know, um, Claude Code or Codex or some of these agent harnesses? How can I have more of them? How can I do that, uh, appropriately? And then how can I use these claws? What are these claws? Uh, [laughs] and, uh, so there's, like, a lot of new things. I wanna be at the forefront of it, you know, and I'm very antsy that I'm not at the forefront of it. And I see lots of people on Twitter doing all kinds of things, and they all sound like really good ideas, and I need to be at the forefront, or I feel
2:55 – 6:15
What Capability Limits Remain?
1. AKAndrej Karpathy
  extremely nervous. And so I guess I'm just in this psychosis of, like, what's possible? Like, because it's unexplored fundamentally.
2. SGSarah Guo
  Well, if you're nervous, the rest of us are, are nervous.
3. AKAndrej Karpathy
  [laughs]
4. SGSarah Guo
  We have a, uh, we have a team that we work with at Conviction that their setup is everybody is like... You know, none of the engineers write code by hand.
5. AKAndrej Karpathy
  Yeah.
6. SGSarah Guo
  And they, they, they're all microphoned, and they just, like, whisper to their agents all the time.
7. AKAndrej Karpathy
  Mm-hmm, mm-hmm.
8. SGSarah Guo
  It's the strangest work setting ever.
9. AKAndrej Karpathy
  Yeah. [laughs]
10. SGSarah Guo
  Uh, and I thought they were crazy, and now I like-
11. AKAndrej Karpathy
  [laughs]
12. SGSarah Guo
  ... I fully accept. I was like, "Oh, this was the way."
13. AKAndrej Karpathy
  Uh-huh.
14. SGSarah Guo
  Like, you're just ahead of it.
15. AKAndrej Karpathy
  Yes.
16. SGSarah Guo
  Um, what, uh, w- how do you think about your own capacity now to, like, explore or to do projects? Like, what is it limited by?
17. AKAndrej Karpathy
  Yeah, what is it limited by? Uh, just I think everything, like so many things, even if they don't work, I think to a large extent you feel like it's a skill issue. It's not that the capability's not there-
18. SGSarah Guo
  [laughs]
19. AKAndrej Karpathy
  ... it's that you just haven't found a way-
20. SGSarah Guo
  Yeah
21. AKAndrej Karpathy
  ... to string it together of what's available. Like, I just don't... I didn't give good enough instructions in the AGENTS.md file or whatever it may be. I don't have a nice enough memory tool that I put in there or something like that. So it all kind of feels like skill issue when it doesn't work to some extent. You wanna see how you can parallelize them, et cetera, and you wanna be Peter Steinberger, basically. Uh, so Peter is famous. He has a funny photo where he's in front of a monitor with lots of, uh, like, uh, he uses Codex, so lots of Codex agents tiling the, the, the monitor. And they all take about twenty minutes if you prompt them correctly and you use the high effort. And so they all take about twenty minutes. So you have multiple, you know, ten, uh, repos checked out. And so he's just, um, going between them and giving them work. It's just like you can, you can, you can move in much larger macro actions. It's not just like, "Here's a line of code. Here's a new function." It's like, "Here's a new functionality, and delegate it to agent one. Here's a new functionality that's not gonna interfere with the other one. Give it to agent two." And then try to, uh, review their work as best as you can [laughs] depending on how much you care about that code. Like, what are these macro actions that I can, like, manipulate my software repository by? And, like, another agent is doing some, like, research. Another agent is writing code. Another one is coming up with a plan for some new implementation. And so everything just, like, happens in these, like, macro actions over your repository. Um, and you're just trying to become, like, really good at it and develop, like, a muscle memory for it is extremely, um, yeah, it's very rewarding, number one, because it actually works. Uh, but it's also kinda like the new thing to learn, so that's why, hence the psychosis. [laughs]
22. SGSarah Guo
  Yeah, I, I do feel like my instinct is like whenever I am waiting for an agent to complete something-
23. AKAndrej Karpathy
  Mm-hmm
24. SGSarah Guo
  ... the obvious thing to do is like, well, I can do more work.
25. AKAndrej Karpathy
  Yeah.
26. SGSarah Guo
  Right? Like, if I have access to more tokens-
27. AKAndrej Karpathy
  Yeah
28. SGSarah Guo
  ... then, like, I should just parallelize-
29. AKAndrej Karpathy
  Oh, yeah
30. SGSarah Guo
  ... add more tasks. And so that's, that's very stressful because if you-
6:15 – 11:16
What Mastery of Coding Agents Looks Like
1. SGSarah Guo
  feel like they, they, they felt resource-bound.
2. AKAndrej Karpathy
  Mm-hmm.
3. SGSarah Guo
  Uh, and now that you have this big capability jump, you're like, "Oh, actually, it's not, you know, my ability to access the compute anymore-
4. AKAndrej Karpathy
  Yeah.
5. SGSarah Guo
  ... Like I'm, I'm the binding constraint."
6. AKAndrej Karpathy
  Yes. Yes, a skill issue.
7. SGSarah Guo
  Yeah.
8. AKAndrej Karpathy
  Which is very empowering 'cause, um, yeah, 'cause you could be getting better, so that's why, that's why I think it's very addictive, because there's unlocks when you, when you get better.
9. SGSarah Guo
  Where do you think it goes? Like if you just think about like, okay, you know, Andrej's iterating and everybody-
10. AKAndrej Karpathy
  [laughs]
11. SGSarah Guo
  ... else is for 16 hours a day, getting better at using coding agents, like what does it look like in a year of like you've reached mastery?
12. AKAndrej Karpathy
  [laughs] Yeah, what does mastery look like, right? At the end of the year or like two, three years, five years, 10 years, et cetera.
13. SGSarah Guo
  Yeah.
14. AKAndrej Karpathy
  Well, I think everyone is basically interested in like going up the stack. So I would say it's, yeah, it's not about a single session with your agent, um, multiple agents, how do they collaborate, and teams and so on. So everyone's trying to figure out what that looks like. And then I would say Claw is also kind of an interesting direction because it really... When I say a Claw, I mean this like layer that, uh, kind of takes persistence to a whole new level. Like it's something that like keeps looping. It's, it's like, um, it's not something that you are interactively in the middle of. It kind of like has its own little sandbox, its own little... You know, it, it kind of like does stuff on your behalf, even if you're not looking kind of thing. Um, and then also has like maybe more sophisticated memory systems, et cetera, that are not yet implemented in agents. So, uh, um, OpenClaw has a lot more sophisticated memory, I would say, than what you would get by default, uh, which is just, uh, memory compaction when your context runs out, right?
15. SGSarah Guo
  You think that's the piece that resonated for more users versus like perhaps like broader tool access?
16. AKAndrej Karpathy
  For OpenClaw?
17. SGSarah Guo
  Yeah.
18. AKAndrej Karpathy
  Uh, there, there's like... I think there's at least five things that resonated with me. [laughs]
19. SGSarah Guo
  There's a lot of really good ideas in here. Yeah. Good job, Peter.
20. AKAndrej Karpathy
  I mean-
21. SGSarah Guo
  Yeah
22. AKAndrej Karpathy
  ... Peter has done a really amazing job. Um, I saw him recently, uh, and I talked to him about it, and I... He's very humble about it, but I think he innovated simultaneously in like five different ways and put it all together. Um, so for example, like the SOUL.md document, like he actually really crafted a personality that is kind of compelling and interesting. And I feel like a lot of the current agents, they don't get this correctly. I actually think, uh, Claude has a pretty good personality. It feels like a teammate.
23. SGSarah Guo
  Mm-hmm.
24. AKAndrej Karpathy
  Uh, and, uh, it's excited with you, et cetera. Uh, I would say, um, for example, Codex is a lot more dry.
25. SGSarah Guo
  Mm-hmm.
26. AKAndrej Karpathy
  Um, which is kind of interesting because in ChatGPT, Codex is like a lot more upbeat and highly sycophantic. But I would say Codex the coding agent is very dry. It doesn't, it doesn't seem to care about what you're creating.
27. SGSarah Guo
  Mm-hmm.
28. AKAndrej Karpathy
  It's kind of like, "Oh, I implemented it." It's like, "Okay, but do you understand what we're building?"
29. SGSarah Guo
  Mm-hmm.
30. AKAndrej Karpathy
  [laughs]
11:16 – 15:51
Second Order Effects of Natural Language Coding
1. AKAndrej Karpathy
  so, so Dobby is in charge of the house. I text through with it through WhatsApp. Um, and it's been like really fun to have these macro actions that maintain my house. I haven't like really pushed it, uh, like way more beyond that, and I think people are doing a lot more crazy things with it. Uh, but for me, even just the home automation setup, I used to use like six apps-
2. SGSarah Guo
  Yeah
3. AKAndrej Karpathy
  ... completely different apps, and I don't have to use these apps anymore. Like Dobby controls everything in natural language. It's amazing. Um, and so I think like I haven't even pushed the paradigm fully, but already that is so helpful and so inspiring, I would say.
4. SGSarah Guo
  Do you think that's indicative of like what people want from a user experience perspective with software, right? Because I, I don't-
5. AKAndrej Karpathy
  Yeah
6. SGSarah Guo
  ... think... You know, it's pretty ignored that it takes humans effort to like learn new software-
7. AKAndrej Karpathy
  Uh-huh. Yeah
8. SGSarah Guo
  ... like new UI.
9. AKAndrej Karpathy
  Yeah. I think, uh, to some extent- That's right. It's like working backwards from how people think an AI should be, because the, what people have in their mind of like what an AI is, is not actually what an LLM is by, by, like in a raw sense. Like LLM is a token generator, you know, like more tokens come out. But what they think of is like this pers- this persona identity that they can tell stuff and it remembers it, you know? And, uh, it's just kind of an entity behind the WhatsApp. It's like a lot more understandable.
10. SGSarah Guo
  Mm-hmm.
11. AKAndrej Karpathy
  Uh, so I think to some extent it's like matching the expectations that humans already have for what an AI should behave. But under the hood, there's like a lot of technical details go into that, and LLMs are too raw of a primitive, uh, to actually, um, type check as AI, I think, for most people, if that makes sense.
12. SGSarah Guo
  Yeah. Um, I think that's like how we understand what the AI is and like the, um, description of it as Dobby or some personality-
13. AKAndrej Karpathy
  Mm-hmm
14. SGSarah Guo
  ... obviously resonates-
15. AKAndrej Karpathy
  Mm
16. SGSarah Guo
  ... with people. Um, I also think that it, it, the, uh, the unification that you did across your six different software systems for your home automation-
17. AKAndrej Karpathy
  Yeah
18. SGSarah Guo
  ... speaks to a different question of like, do people really want all the software that we have today?
19. AKAndrej Karpathy
  Mm-hmm. Yeah.
20. SGSarah Guo
  Right? Um, because I, I would argue like, well, you have the hardware-
21. AKAndrej Karpathy
  Yeah
22. SGSarah Guo
  ... but you've now thrown away the software, or the-
23. AKAndrej Karpathy
  Yes
24. SGSarah Guo
  ... the U- the UX layer of it.
25. AKAndrej Karpathy
  Yeah.
26. SGSarah Guo
  Um, do you think that's what people want?
27. AKAndrej Karpathy
  Yeah, I think there's this like, there's this sense that these apps that are on the App Store for using these smart home devices, et cetera, uh, these shouldn't even exist kind of in a certain sense. Like shouldn't it just be APIs, and shouldn't agents be just using it directly? And, um, wouldn't it... Like, I can do all kinds of home automation stuff that, uh, in any individual app will not be able to do, right?
28. SGSarah Guo
  Mm-hmm.
29. AKAndrej Karpathy
  Um, and an LLM can actually drive the tools and call all the right tools and do, uh, do pretty complicated things. Um, and so in a certain sense, it does point to this, like maybe there's like an overproduction of lots of custom bespoke apps that shouldn't exist because agents kinda like crumble them up, and everything should be a lot more just like exposed API endpoints, and agents are the glue of the intelligence that actually like tool calls all the, all the parts. Um, another example is like my treadmill. Uh, there's an app for my treadmill, and I wanted to like keep track of how often I do my cardio, uh, but like I don't want to like log into a web UI and-
30. SGSarah Guo
  Yeah
15:51 – 22:45
Why AutoResearch
1. SGSarah Guo
  us, we're all just busier, unfortunately.
2. AKAndrej Karpathy
  Yeah. Yeah. Uh, I didn't really take advantage of a lot of like email and calendar and all this other stuff, and I didn't give it access because I'm still a little bit like suspicious, and it's still very new and rough around the edges. So I didn't wanna give it like full access to my digital life yet, and part of it is just the security, privacy, and, uh, just being very cautious in that, uh, in that realm. And, um, so some of it is like held back by that, I would say. Yeah, maybe that's like the dominant, dominant feature, but some of it is also just I feel so distracted because I feel like I had a week of Claw, and then other stuff is happening and-
3. SGSarah Guo
  What was the, um... I, I mean, you've talked about like being able to train or at least optimize a, a, a model as a task you want to see agents do for a long time.
4. AKAndrej Karpathy
  Mm-hmm.
5. SGSarah Guo
  Like what was the motivation behind AutoResearch?
6. AKAndrej Karpathy
  AutoResearch, yeah. So I think like I had a tweet earlier where I kind of like said something along the lines of, to get the most out of the tools that have become available now, you have to remove yourself as the, as the bottleneck. You can't be there to prompt the next thing. You're... You need to take yourself outside. Um, you have to arrange things such that they are completely autonomous. And the more, you, you know, how can you maximize your token throughput and not be in the loop? This is the, this is the goal. And so I kind of mentioned that the, the name of the game now is to increase your leverage. Uh, I put in just very few tokens just once in a while, and a huge amount of stuff happens on my behalf. And so AutoResearch, like I tweeted that, and I think people liked it and whatnot, but that, they haven't like maybe worked through like the implications of that. And for me, AutoResearch is an example of like an implication of that.
7. SGSarah Guo
  Hmm.
8. AKAndrej Karpathy
  Where it's like I don't wanna be like the researcher in the loop like looking at results, et cetera. Like I'm, I'm holding the system back. So the question is how do I refactor all the abstractions so that I'm not... I have to arrange it once and hit go. The name of the game is how can you get more agents running for longer periods of time without your involvement doing stuff on your behalf? And AutoResearch is just, yeah, here's an objective, here's a metric, here's your boundaries of what you can and cannot do, and go. And, uh, yeah-
9. SGSarah Guo
  You were surprised at its effectiveness
10. AKAndrej Karpathy
  ... yeah, I, I didn't expect, uh, it to work because so I have the project Nanochat. Um, and fundamentally, like, I think a lot of people are very confused with my obsession for, like, training GPT-2 models and so on. But for me, uh, training GPT models and so on is just a little harness, a little playground for training LLMs. And fundamentally, what I'm more interested in is, like, this idea of recursive self-improvement and to what extent you can actually have LLMs improving LLMs. Because I think all the frontier labs, this is, like, the thing-
11. SGSarah Guo
  Mm-hmm
12. AKAndrej Karpathy
  ... uh, for obvious reasons, and they're all trying to recursively self-improve, roughly speaking. And so for me, this is kinda like, um, a little playpen of that. Um, and I guess I'd, like, tuned Nanochat already quite a bit by hand in a good old-fashioned way that I'm used to. Like, I'm a researcher. I've done this for, like, you know, two decades. I have some amount of, like, what is the opposite of hubris?
13. SGSarah Guo
  Instinct for it. Uh, yeah.
14. AKAndrej Karpathy
  [laughs]
15. SGSarah Guo
  Earned confidence.
16. AKAndrej Karpathy
  [laughs] Okay.
17. SGSarah Guo
  Yeah.
18. AKAndrej Karpathy
  I have, like, two decades of, like, oh, I've trained this model, like, thousands of times of, like, um... So I've done a bunch of experiments. I've done hyperparameter tuning. I've done all the things I'm very used to and I've done for two decades.
19. SGSarah Guo
  Yeah.
20. AKAndrej Karpathy
  And I've gotten to a certain point, and I thought it was, like, fairly well tuned. And then I let AutoResearch go for, like, overnight, and it came back with, like, tunings that I didn't see.
21. SGSarah Guo
  Mm-hmm.
22. AKAndrej Karpathy
  And yeah, I did forget, like, the weight decay on the value embeddings, and my Adam betas were not sufficiently tuned, and these things jointly interact, so, like, once you tune one thing, the other things have to potentially change, too. You know, I shouldn't be a bottleneck. I shouldn't be running these hyperparameters or optimizations. I shouldn't be looking at the results. There's objective criteria in this case, uh, so you just let, you just have to arrange it so that it can just go forever. So that's a single sort of version of AutoResearch, of, like, a single loop trying to improve. And I was surprised that it, um, it found these things that I, you know, the repo is already fairly well tuned and still found something. And that's just a single, it's a single loop. Like these frontier labs, they have GPU clusters of tens of thousands of them. And so it's very easy to imagine how you would basically get a lot of this automation on, um, smaller models. And fundamentally, everything around, like, frontier-level intelligence is about extrapolation and scaling laws. And so you basically do a ton of the exploration on the smaller models, and then you try to, um, extrapolate out.
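
The single loop described here (an objective, a metric, boundaries on what may change, then go) can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the AutoResearch code: `toy_validation_loss` stands in for a real training run, the two settings and their ranges in `BOUNDS` are hypothetical, and a random perturbation stands in for an agent proposing a change.

```python
import random

# Hypothetical boundaries: the only settings the loop may touch, with allowed ranges.
BOUNDS = {"weight_decay": (0.0, 0.2), "beta2": (0.90, 0.999)}


def toy_validation_loss(cfg):
    """Stand-in for a training run; lower is better.

    The x * y term makes the two settings interact, so the best value of one
    depends on the other (an assumption made purely for illustration).
    """
    x = cfg["weight_decay"] - 0.1
    y = cfg["beta2"] - 0.95
    return 3.0 + 100.0 * (x * x + x * y + y * y)


def propose(cfg, rng):
    """Stand-in for the agent: perturb one setting, clipped to its bounds."""
    name = rng.choice(sorted(BOUNDS))
    lo, hi = BOUNDS[name]
    step = rng.gauss(0.0, 0.1 * (hi - lo))
    new_cfg = dict(cfg)
    new_cfg[name] = min(hi, max(lo, cfg[name] + step))
    return new_cfg


def auto_loop(start_cfg, n_trials, seed=0):
    """Keep a proposal only when the metric improves; no human in the loop."""
    rng = random.Random(seed)
    best_cfg, best_loss = dict(start_cfg), toy_validation_loss(start_cfg)
    for _ in range(n_trials):
        candidate = propose(best_cfg, rng)
        loss = toy_validation_loss(candidate)
        if loss < best_loss:
            best_cfg, best_loss = candidate, loss
    return best_cfg, best_loss
```

Calling `auto_loop({"weight_decay": 0.0, "beta2": 0.999}, n_trials=200)` runs unattended and returns the best setting it found. A real setup would swap the toy loss for a fixed-budget training run and the random perturbation for an agent editing code.
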
23. SGSarah Guo
  So you're saying our research efforts are gonna get more efficient, like we're gonna have better direction for when we scale as well-
24. AKAndrej Karpathy
  Um-
25. SGSarah Guo
  ... if we can do this experimentation better.
26. AKAndrej Karpathy
  Yeah, I would say that, like, the most interesting project and probably what the frontier labs are working on is, uh, you know, you experiment on the smaller models. You try to make it as autonomous as possible, remove researchers [laughs] -
27. SGSarah Guo
  Mm-hmm
28. AKAndrej Karpathy
  ... from the loop. Uh, they have way too much... What is the, what is the opposite of, uh-
29. SGSarah Guo
  Earned confidence?
30. AKAndrej Karpathy
  ... way too much confidence?
22:45 – 28:25
Relevant Skills in the AI Era
1. AKAndrej Karpathy
  it's like, [laughs] you know, I think, like, you sort of go one step at a time, where you sort of have one process and then second process and then the next process, and these are all layers of an onion.
2. SGSarah Guo
  Mm-hmm.
3. AKAndrej Karpathy
  Uh, like the LLM sort of part is now taken for granted. The agent part is now taken for granted. Now the claw-like entities are taken for granted, and now you can have multiple of them, and now you can have instructions to them, and now you can have optimization over the instructions, and it's just like, it's a little too much, you know? But they're-
4. SGSarah Guo
  [laughs]
5. AKAndrej Karpathy
  I mean, this is why it gets to the psychosis, is that this is, like, infinite, and everything is skill issue, and that's why I feel like, yeah, that's just coming back to [laughs] this is why it's so insane.
6. SGSarah Guo
  Okay.
7. AKAndrej Karpathy
  [laughs]
8. SGSarah Guo
  Well, if we're, we're just trying to, like, diagnose the current moment and, uh, what is a relevant skill right now, what do you, like, what do you think is the implication that this, um, that this is the loop we should be trying to achieve-
9. AKAndrej Karpathy
  Yeah
10. SGSarah Guo
  ... in different areas-
11. AKAndrej Karpathy
  Mm-hmm
12. SGSarah Guo
  ... and that it works, right?
13. AKAndrej Karpathy
  Yeah.
14. SGSarah Guo
  Like, you know, remove- Create the metric or create the ability for, um, agents to continue working-
15. AKAndrej Karpathy
  Mm-hmm
16. SGSarah Guo
  ... on it without you.
17. AKAndrej Karpathy
  Yeah.
18. SGSarah Guo
  Do we still have performance engineering? Like what- [chuckles]
19. AKAndrej Karpathy
  Uh-huh. Yeah. I mean, so there's a few caveats that I would put on top of the LLM psychosis. So number one-
20. SGSarah Guo
  Mm-hmm
21. AKAndrej Karpathy
  ... uh, this is extremely well-suited to anything that has objective, uh, metrics that are easy to evaluate.
22. SGSarah Guo
  Mm-hmm.
23. AKAndrej Karpathy
  So for example, like writing kernels for more efficient CUDA, you know, code for various parts of a model, et cetera, the perfect fit.
24. SGSarah Guo
  Mm-hmm.
25. AKAndrej Karpathy
  Because you have inefficient code, and then you want efficient code that has the exact same behavior-
26. SGSarah Guo
  Yeah
27. AKAndrej Karpathy
  ... but it's much faster. Perfect fit. Uh, so a lot of things like, like are perfect fit for AutoResearch, but many things will not be. And so they... It's just if you can't evaluate it, then you can't AutoResearch it, right? Uh, so that's like caveat number one. And then maybe caveat number two I would say is, you know, we're, we're kind of talking about next steps, and we kind of see what the next steps are, but fundamentally the, the whole thing still doesn't... It's still kind of like bursting at the seams a little bit and there's cracks, and it doesn't fully work. And if you kind of try to go too far ahead, the whole thing is actually net not useful, if that makes sense.
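
The first caveat, that a task fits this loop only when there is an objective check, can be made concrete with a small sketch. It assumes a slow reference function and a proposed replacement, and accepts the replacement only if its outputs match on a set of test inputs, then reports the measured speedup. The prefix-sum functions and the name `evaluate` are hypothetical stand-ins for a real kernel and its harness, and agreement on sample inputs is evidence of equivalence, not proof.

```python
import time
from itertools import accumulate


def reference_prefix_sums(xs):
    """Deliberately slow O(n^2) reference implementation."""
    return [sum(xs[: i + 1]) for i in range(len(xs))]


def candidate_prefix_sums(xs):
    """Proposed faster implementation with the same intended behavior."""
    return list(accumulate(xs))


def evaluate(reference, candidate, test_inputs, repeats=5):
    """Return (same_behavior, speedup); speedup is None when outputs differ."""
    for xs in test_inputs:
        if reference(xs) != candidate(xs):
            return False, None

    def best_time(fn):
        # Take the fastest of several runs to reduce timing noise.
        timings = []
        for _ in range(repeats):
            start = time.perf_counter()
            for xs in test_inputs:
                fn(xs)
            timings.append(time.perf_counter() - start)
        return min(timings)

    # Guard against a zero reading from the timer.
    return True, best_time(reference) / max(best_time(candidate), 1e-12)
```

A loop like this gives an agent a number to push on. Without the equality check, "faster" could be satisfied by code that simply does less.
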
28. SGSarah Guo
  Hmm.
29. AKAndrej Karpathy
  Um, because these models like still are not, you know, they've improved a lot, but they're still are like rough around the edges is maybe the way I would describe it. I simultaneously feel like I'm talking to an extremely brilliant PhD student who's been like a systems programmer for their entire life and a 10-year-old, and it's so weird because humans, like there's... [chuckles] I feel like they're a lot more coupled. Like you have, you know, um, everything is a lot more coupled.
30. SGSarah Guo
  Yes, you wouldn't, you wouldn't encounter that combination. Yeah.
28:25 – 32:30
Model Speciation
1. AKAndrej Karpathy
  fundamentally what's going on, and there's some blind spots, and some, some things are not being optimized for, and this is all clustered up in these neural opaque models, right? So you're either on rails of what it was trained for, and everything is like you're going at speed of light, or you're not. Um, and so it's the jaggedness. So, um, so that's why I think like even though the, the progression is obvious what should happen, you can't let it fully go there yet because it doesn't fully work, or it's a skill issue, and we just haven't like figured out how to use it. So, you know, it's hard to tell.
2. SGSarah Guo
  Can I ask kind of a blasphemous question, which is like if this jaggedness is persisting, um, and it's all rolled up in a, uh, at least monolithic interface-
3. AKAndrej Karpathy
  Mm
4. SGSarah Guo
  ... right? But-
5. AKAndrej Karpathy
  Mm
6. SGSarah Guo
  ... you know, single model-
7. AKAndrej Karpathy
  Mm
8. SGSarah Guo
  ... um, does that make sense, or do, do you... should, should it be unbundled into things that are... can be optimized or improved against different domains of intelligence?
9. AKAndrej Karpathy
  Uh, like unbundling the models into multiple experts in different areas, et cetera? Um-
10. SGSarah Guo
  More directly, yeah.
11. AKAndrej Karpathy
  Um-
12. SGSarah Guo
  Instead of just MoE that we have no exposure to-
13. AKAndrej Karpathy
  Yeah
14. SGSarah Guo
  ... because that can be like confusing as a user from the outside-
15. AKAndrej Karpathy
  Uh-huh
16. SGSarah Guo
  ... which is like, why is it so good at this but not at this other thing?
17. AKAndrej Karpathy
  Yeah. I think currently my impression is, uh, the labs are trying to have a single sort of like monoculture of a model that is, uh, arbit-arbit-arbitrarily intelligent in all these different domains-
18. SGSarah Guo
  Mm-hmm
19. AKAndrej Karpathy
  ... and they just stuff into the parameters. I do think that we will, we... I, I do think we should expect more speciation in the, um, intelligences. Um- Like, you know, the animal kingdom is extremely diverse in the brains that exist, and there's lots of different niches of, uh, of nature, and some animals have overdeveloped visual cortex or other part-- kind of parts. And I think we, we should be able to see more speciation. And, um, you don't need like this oracle that knows everything. You kind of speciate it, and then you put it on a specific task, and we should be seeing some of that because you should be able to have like much smaller models that still have the cognitive core, like they're still competent, but then they specialize and then, um, and then they can become more efficient, uh, in terms of latency or throughput on, uh, specific tasks that you really care about. Like if you're a mathematician working in Lean, I saw, for example, there's a few releases that really like target that as a domain. Um, uh, so there's probably gonna be a few examples like that where the unbundling kind of makes sense.
20. SGSarah Guo
  One question I have is whether or not, uh, the capacity constraint on available compute infrastructure-
21. AKAndrej Karpathy
  Mm-hmm.
22. SGSarah Guo
  -drives more of this-
23. AKAndrej Karpathy
  Oh.
24. SGSarah Guo
  -because efficiency-
25. AKAndrej Karpathy
  Yeah.
26. SGSarah Guo
  -actually matters more, right?
27. AKAndrej Karpathy
  Yeah. Yeah.
28. SGSarah Guo
  Like y-your-- if you... Financing aside, though financing's involved in all of this, if you have access to full compute for anything you do, like even-
29. AKAndrej Karpathy
  Mm.
30. SGSarah Guo
  -one single model, right?
32:30 – 37:28
Building More Collaboration Surfaces for Humans and AI
1. AKAndrej Karpathy
  continual learning maybe, or how you, um, how you fine-tune in a certain area, how you get better in a certain area, or like how you actually touch the weights, not just the context windows. And so it's a lot more tricky, I would say, to touch the weights than just the context windows, uh, because you're actually fundamentally changing the full model and potentially its intelligence. And so, um, so maybe it's just like not a fully developed science, if that makes sense, of speciation.
2. SGSarah Guo
  A-a-and it also has to be like cheap enough-
3. AKAndrej Karpathy
  Yeah.
4. SGSarah Guo
  -for that speciation to be worthwhile-
5. AKAndrej Karpathy
  Yeah.
6. SGSarah Guo
  -in these given-
7. AKAndrej Karpathy
  Yeah.
8. SGSarah Guo
  -contexts. Can I ask a question about, uh, like, uh, an extension to AutoResearch that you described in terms of, um, open ground? You said, "Okay, well, you know, we have this thing, um, we need more collaboration surface around it-
9. AKAndrej Karpathy
  Mm.
10. SGSarah Guo
  -essentially-
11. AKAndrej Karpathy
  Mm.
12. SGSarah Guo
  -for people to-
13. AKAndrej Karpathy
  Oh, yeah.
14. SGSarah Guo
  -contribute, um, to research overall." Can you talk about that?
15. AKAndrej Karpathy
  Yeah. So we talked about AutoResearch as a single thread of like, I'm gonna try stuff in a loop.
16. SGSarah Guo
  Mm-hmm.
17. AKAndrej Karpathy
  But fundamentally, uh, the parallelization of this is like the interesting component. Um, and I guess I was trying to like play around with a few ideas, but I don't have anything that like clicks as simply as like... I don't have something that I'm like super happy with just yet, but it's something I'm like working on on the side when I'm not working on my claw. [laughs] Um, so I think like one issue is if you have a bunch of nodes, uh, of parallelization available to you, then it's very easy to just have multiple AutoResearchers talking through a s- um, a common system or something like that. What I was more interested in is how you can have an untrusted pool of workers out there on the internet.
18. SGSarah Guo
  Mm-hmm.
19. AKAndrej Karpathy
  So, for example, in AutoResearch, uh, you're just trying to find, um, the piece of code that trains a model to a very low validation loss. If anyone gives you a candidate commit, it's very easy to verify that that commit is correct, is good. Like they-- someone could claim from the internet that this piece of code will optimize, uh, much better and give you much better performance. You could just check.
20. SGSarah Guo
  Yeah.
21. AKAndrej Karpathy
  Very easy. But probably a, a lot of work goes into that checking. Uh, but fundamentally, they could lie and et cetera. So you're basically dealing with a similar kind of pro-- It almost actually like looks a little bit like... My s- my designs that incorporate an untrusted pool of workers, uh, actually look a little bit more like a blockchain a little bit, uh, because instead of blocks, you have, uh, commits-
22. SGSarah Guo
  Mm-hmm.
23. AKAndrej Karpathy
  -and these commits can build on each other, and they contain like changes to the code as you're improving it. Um, and, uh, the proof of work is basically doing tons of experimentation to find the commits that work. Um, and that's hard. Um, and then the reward is just being on the leaderboard right now. [laughs] There's no monetary reward whatsoever. Uh, but I don't wanna push the analogy too far, but it fundamentally has this issue where you-- a huge amount of search goes into it, but it's very cheap to verify that a candidate solution is indeed good because you can just train a single... You know, someone had to try ten thousand ideas, but you just have to check that the thing that they produced actually works.
24. SGSarah Guo
  Mm-hmm.
25. AKAndrej Karpathy
  Because the ninety-nine thousand of them didn't work, you know? Um, and so basically, long story short, is like you have to come up with a system where an untrusted pool of workers can collaborate with a trusted pool of workers, uh, that do the verification, and the whole thing is kinda like asynchronous and works and, um, and so on. And, uh, it's, it's like safe from a security perspective because if anyone sends you arbitrary code and you're gonna run it, that's very sketchy and dodgy. So, um, but fundamentally, it should be totally possible. So you're familiar with projects like SETI@home and Folding@home. All of these problems have a similar kind of, uh, setup. So Folding@home, you're folding a protein, um, and it's very hard to find a configuration that is low energy. But if someone finds a configuration that they evaluate to be low energy, that's perfect. You can just use it. You can easily verify it. So a lot of things have this property that, you know, very expensive to come up with but very cheap to verify. And so in all those cases, things like Folding@home or SETI@home or AutoResearch@home-
26. SGSarah Guo
  Mm-hmm.
27. AKAndrej Karpathy
  -will be good fits. And so, um, long story short, a swarm of agents on the internet could collaborate to improve LLMs and could potentially even, like, run circles around frontier labs. Like, who knows, you know? Um, yeah, like maybe that's even possible. Like, frontier labs have a huge amount of trusted compute, but the Earth is much bigger and has a huge amount of untrusted compute. But if you put systems in check, uh, systems in place that, you know, deal with this, then maybe it is possible that the swarm out there could, uh, could come up with, with better, with better solutions. And people kind of, like, contribute cycles, um, to, to a thing that they care about. And so, sorry, so, so the last thought is [laughs] lots of companies or whatnot, they could maybe have like their own, uh, things that they care about. And you, if you have compute capacity, you could contribute to different kind of AutoResearch tracks. Like maybe you care about certain, you know... Like, you care about, like, cancer or something like that of a certain type. You don't have to just donate money to an institution. You actually could, like, purchase compute, and then you could join the AutoResearch swarm for that project, you know? Uh, so if everything is rebundled into AutoResearchers, then compute becomes the thing that you're contributing to the pool.
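The "expensive to find, cheap to verify" property described in this exchange can be illustrated with a minimal sketch. Everything here is a hypothetical stand-in, not AutoResearch's actual design: `validation_loss` replaces a real training run with a toy function, a "candidate" is a single number rather than a code commit, and `Leaderboard` plays the trusted verifier. Safely sandboxing untrusted code is out of scope.

```python
from dataclasses import dataclass


def validation_loss(candidate: float) -> float:
    """Hypothetical stand-in for 'train the model and measure validation loss'."""
    return (candidate - 1.5) ** 2


@dataclass
class Submission:
    worker: str          # untrusted contributor name
    candidate: float     # stands in for a commit
    claimed_loss: float  # what the worker says the candidate achieves


class Leaderboard:
    """Trusted side: re-evaluates each claim once instead of repeating the search."""

    def __init__(self, baseline: float):
        self.best_candidate = baseline
        self.best_loss = validation_loss(baseline)
        self.accepted = []  # accepted submissions, each improving on the last

    def submit(self, sub: Submission, tol: float = 1e-9) -> bool:
        verified = validation_loss(sub.candidate)  # one evaluation
        claim_holds = abs(verified - sub.claimed_loss) <= tol
        if claim_holds and verified < self.best_loss:
            self.best_candidate, self.best_loss = sub.candidate, verified
            self.accepted.append(sub)
            return True
        return False  # false claim, or no improvement over the current best


def worker_search(n_trials: int = 1000) -> Submission:
    """Untrusted side: many evaluations to find one good candidate."""
    trials = [i / 100 for i in range(-n_trials // 2, n_trials // 2)]
    best = min(trials, key=validation_loss)
    return Submission("searcher", best, validation_loss(best))


board = Leaderboard(baseline=0.0)
honest_ok = board.submit(Submission("alice", 1.0, 0.25))  # true claim, improves
liar_ok = board.submit(Submission("mallory", 3.0, 0.0))   # claim does not verify
worse_ok = board.submit(Submission("bob", 0.5, 1.0))      # true claim, no improvement
search_ok = board.submit(worker_search())                 # 1000 trials, 1 check
```

The worker pays for a thousand evaluations while the verifier pays for one per submission, which is the asymmetry the SETI@home and Folding@home comparison relies on.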
28. SGSarah Guo
  Yeah. That's very inspiring, and it's also interesting. Like, I don't, I don't know how far this goes-
29. AKAndrej Karpathy
  Yeah
30. SGSarah Guo
  ... but it is interesting that at least some audience of people, you know, here in Silicon Valley or lining up at, um, you know, retail stores in China-
37:28 – 48:25
Analysis of Jobs Market Data
1. AKAndrej Karpathy
  but is flop the thing that actually everyone cares about in the future? Like, is there gonna be like a flipping almost of like what's the thing that you care about? Like right now, for example, it's really hard to get compute even if you have money. [laughs]
2. SGSarah Guo
  Yeah.
3. AKAndrej Karpathy
  So actually, it almost seems like the flop is, like, dominant [laughs] in a certain sense. Um, yeah, so, uh, so maybe that's kinda like, kinda like that. Like, how much, how many flops do you control instead of, like, what wealth do you control? I don't actually think that's true, but it's kind of interesting to think about.
4. SGSarah Guo
  The last thing you released was, like, a little bit of jobs data analysis.
5. AKAndrej Karpathy
  Yeah.
6. SGSarah Guo
  Is that right? What, um... And might have touched a nerve even though you're just, like, visualizing some public data.
7. AKAndrej Karpathy
  Yeah. [laughs]
8. SGSarah Guo
  Uh, what was, you know, what were you curious about?
9. AKAndrej Karpathy
  Yeah, I guess I was curious to, um... I mean, everyone is like real-- it's-- everyone is really thinking about the impacts of AI on the job market and what it's gonna look like. So I was just interested to take a look, like what does the job market look like? Where are the different roles, um, and how many people are in different professions? And I was like really just interested to like look through, uh, the individual cases and try to think to myself about like, you know, with these AIs and how they're likely to evolve, like are these gonna be tools that people are using? Are these gonna be displacing tools for these, uh, professions? And like what are the current professions, and h-how are they gonna change? Are they gonna grow or, or, uh, adjust to a large extent? Or like what could be new professions? So it's really just like a way to fuel my own chain of thought about the industry, I suppose.
10. SGSarah Guo
  Mm-hmm.
11. AKAndrej Karpathy
  Um, and so, uh, yeah, the jobs data basically is just from the Bureau of Labor Statistics. Uh, they actually have a, um, percent outlook for each profession about how much it's con-
12. SGSarah Guo
  Grow
13. AKAndrej Karpathy
  ... expected to grow-
14. SGSarah Guo
  Yeah
15. AKAndrej Karpathy
  ... over the next, I think, almost decade. Uh, yeah, I think it's a decade, but it was made in twenty twenty-four.
16. SGSarah Guo
  Mm-hmm.
17. AKAndrej Karpathy
  Uh, so-
18. SGSarah Guo
  We need a lot of healthcare workers.
19. AKAndrej Karpathy
  Yeah. So, so they've already made those projections, and I'm not sure actually a hundred percent what the methodology was that they, that they put into the projections. Um, I guess I was interested to color things by... Like, if people think that what's, like, primarily being r- um, developed now is this kinda like more digital AI, uh, that it's kind of like almost like these ghost or spirit entities that can, like, interact, uh, in the digital world and manipulate a lot of, like, digital information, and they currently don't really have a physical embodiment, uh, or presence. And the physical stuff is probably gonna go slightly slower because you're manipulating atoms. So flipping, flipping bits and y- and the ability to copy-paste, uh, digital information is, like, makes everything a million times faster than accelerating matter, you know? So, um, so energetically, I just think we're gonna see a huge amount of activity in digital space, huge amount of rewriting, huge amount of activity, boiling soup, and I think the-- we're gonna see something that, that in the digital space goes at the speed of light compared to, I think, what's gonna happen in the physical world to some extent if, uh, it would be the extrapolation. And so I think, like, [clears throat] there's currently kind of like, I think, a, a overhang where there can be like a lot of unhobbling, uh, almost potentially of like a lot of digital information processing that used to be done by computers and people, and now with AIs as like a third kind of manipulator of digital information, there's gonna be a lot of refactoring in those b- in those, uh, disciplines. Um, but the physical world is actually gonna be like, I think, um, behind that by some amount of time. And so I think what's really fascinating to me is like... So that's why I was highlighting the, the professions that fundamentally manipulate digital information. This is work you could do from your home, et cetera. 
Uh, because I feel like those will be... like things will change. And that doesn't mean that there's gonna be less of those jobs or more of those jobs because it-- that has to do with like demand elasticity and many other factors. But things will change in these professions because of these new tools and, um, because of this upgrade to the nervous system of the human superorganism, [laughs] if you wanna think about it that way.
20. SGSarah Guo
  Given the look you had at the data, do you have either any observations or, um, uh, guidance for people facing the job market or thinking about what to study now or what skills to develop? I mean, we can all go get like... I'm very thankful that I have to, like, meet people for my job right now.
21. AKAndrej Karpathy
  Mm. Yeah. [laughs]
22. SGSarah Guo
  Be more physical. Yeah.
23. AKAndrej Karpathy
  Could you do your work from home, though? Uh, I could. [laughs]
24. SGSarah Guo
  I think there are relationship parts of it that are hard.
25. AKAndrej Karpathy
  [laughs]
26. SGSarah Guo
  But most of it I could.
27. AKAndrej Karpathy
  Yeah. I think it's really hard to tell because, again, like the job market is extremely diverse, and I think the answers will probably vary. But, uh, to a large extent, like these tools are extremely new, extremely powerful, and so just being, uh, you know, just trying to keep up with it is like the first thing. Um, and, um, yeah, because I think a lot of people kinda like dismiss it or-
28. SGSarah Guo
  Or they're afraid of it
29. AKAndrej Karpathy
  ... or they're afraid of it, et cetera, uh, which is totally understandable, of course. Yeah, I, I think like, um, it's fundamentally an empowering tool at the moment. Um, and these jobs are bundles of tasks, and some of these tasks can go a lot faster, and so people should think of it as primarily a tool that it is right now. Um, and I think the long-term future of that is uncertain. Yeah, it's kind of really hard to forecast, to be honest. [laughs] And like I'm not professionally like doing that really, and I think this is a job of like economists to do properly.
30. SGSarah Guo
  You are an engineer, though. Uh, and like one thing I thought was interesting is that, like, th-the, uh, demand for engineering jobs is continuing to increase.
48:25 – 53:51
Open vs. Closed Source Models
1. AKAndrej Karpathy
  well be outside of OpenAI.
2. SGSarah Guo
  Noam, that's a call to be an independent researcher-
3. AKAndrej Karpathy
  [laughs]
4. SGSarah Guo
  ... with AutoResearch. [laughs]
5. AKAndrej Karpathy
  Yeah, there's many things to do on the outside, and it's, uh, it's a... And I think ultimately, I think the ideal solution maybe is like, yeah, going back and forth, uh, or, um, yeah, and I think fundamentally, you can have really am-amazing impact in both places. So very compli- I don't know, like, it's a very loaded question a little bit, but I mean, I joined the frontier lab, and I'm outside, and then maybe in the future, I'll want to join again, and I think, um, uh, that's kinda like how I look at it.
6. SGSarah Guo
  One question related to what visibility to-- does the world or the AI ecosystem have into, uh, the frontier is, like, how, how close open source is-
7. AKAndrej Karpathy
  Mm-hmm
8. SGSarah Guo
  ... to the frontier-
9. AKAndrej Karpathy
  Mm-hmm
10. SGSarah Guo
  ... um, and how sustainable that is.
11. AKAndrej Karpathy
  Mm-hmm.
12. SGSarah Guo
  I, I think-
13. AKAndrej Karpathy
  Yeah
14. SGSarah Guo
  ... I think it is quite surprising, the entire sequence of events actually from, like, having a handful of Chinese models-
15. AKAndrej Karpathy
  Mm-hmm
16. SGSarah Guo
  ... and global models, and I think people are gonna continue releasing here in the near term that are closer than much of the industry anticipated-
17. AKAndrej Karpathy
  Mm-hmm
18. SGSarah Guo
  ... from a capability perspective.
19. AKAndrej Karpathy
  Yeah.
20. SGSarah Guo
  Um, I don't know if you're surprised by that, but you're a long-term contributor to open source. Like, what's your prediction here?
21. AKAndrej Karpathy
  Yeah. So roughly speaking, basically the, um, yeah, the closed models are ahead, but, like, people are monitoring the number of months that sort of like open source models are behind, um-
22. SGSarah Guo
  And it started with there's nothing, and then it went to eighteen months, and now it's like-
23. AKAndrej Karpathy
  Yeah, and there's been a convergence-
24. SGSarah Guo
  Yes
25. AKAndrej Karpathy
  ... right?
26. SGSarah Guo
  Yeah.
27. AKAndrej Karpathy
  So maybe they're behind by, like, what is the latest? Maybe like eight-- six months, eight months kind of thing right now. Yeah, I'm a huge fan of open source, obviously. So for example, in operating systems, you have, like, closed sou-- like, you know, Windows and macOS. These are large software projects, kind of like what LLMs are gonna become, and there's Linux.
28. SGSarah Guo
  Mm-hmm.
29. AKAndrej Karpathy
  But Linux is very easy. Like, actually, Linux is extremely successful project. It runs on the vast majority of computers. Like, last time I checked, was it, like, sixty percent or something, like, run Linux? Um, and that's because there is a need in the industry to have a common open platform that everyone feels, uh, sort of safe using. I would say, like, the industry has always felt a demand for that kind of a project to exist.
30. SGSarah Guo
  Mm-hmm.
53:51 – 1:00:59
Autonomous Robotics
1. AKAndrej Karpathy
  And so I want there to be ensembles of people thinking about all the hardest problems, and I want there to be ensembles of people in a room when they, um, to be all well-informed and to make all those decisions, you know? So, uh, I don't want it to be like a closed doors with two people or three people. I feel like that's, like, not a good, not a good feature. I almost wish like there were more labs, is long story short, and I, I, I do think that-
2. SGSarah Guo
  They're virtually competitive
3. AKAndrej Karpathy
  ... open source has a-
4. SGSarah Guo
  Yeah
5. AKAndrej Karpathy
  ... uh, has a place to play. I hope it sticks around, and I basically... I- It's currently slightly behind, and that's actually kind of like a good thing.
6. SGSarah Guo
  Okay. You worked on the precursor to generalized robotics autonomy-
7. AKAndrej Karpathy
  Yeah
8. SGSarah Guo
  ... um, in cars, right?
9. AKAndrej Karpathy
  Right.
10. SGSarah Guo
  Uh, a, a lot has happened in the last couple months with robotics companies as well-
11. AKAndrej Karpathy
  Mm
12. SGSarah Guo
  ... like acceleration of really impressive generalization of environment, of tasks-
13. AKAndrej Karpathy
  Mm
14. SGSarah Guo
  ... like increasing long horizon tasks, lots of money going into the space. Like, is it gonna happen?
15. AKAndrej Karpathy
  Mm.
16. SGSarah Guo
  Has anything, in your view, changed recently?
17. AKAndrej Karpathy
  Uh, so like my view is kind of informed by what I saw in self-driving, and I do feel like self-driving is the first robotics application. So probably what I saw is, at the time, like 10 years ago, there were a large number of startups. And I kinda feel like, um, like most of them basically, like didn't long-term make it. Um, and what I saw is that like a lot of capital expenditure had to go in and a lot of time. And so, um, I think it's like, I think robotics, because it's so difficult and so messy and requires a huge amount of capital investment and a lot of like con- conviction [laughs] um, just it's like a big problem, and I think atoms are really hard. So I kinda feel like they will lag be- it will lag behind what's gonna happen in digital space. And in digital space, there's gonna be a huge amount of unhobbling, uh, basically like things that weren't super efficient becoming a lot more efficient by like a factor of 100-
18. SGSarah Guo
  Mm-hmm
19. AKAndrej Karpathy
  ... because bits are so much easier. And so I think currently, in terms of what's gonna change and, like where the activity is, I kinda feel like digital space is going to like change a huge amount, and then the physical space will lag behind. And what I find very interesting is like this interface in between them as, as well because I think in this like... If you- we do have more agents acting on behalf of humans and more agents kinda like talking to each other ac- uh, and, and doing tasks and participating in this kind of economy of agents, et cetera, um, you're gonna run out of things that you're gonna do purely in the digital space. At some point you have to go to the universe, and you have to ask it questions. Um, you have to run an experiment and see what the universe tells you to get back to learn something. And so we currently have a huge amount of like digital work, uh, because there's an overhang in how much we collectively thought about what already is digital. So we just didn't have enough thinking cycles among the humans to think about all the information that is already digital and already uploaded. Um, and so we're gonna start running out of stuff that is actually like, um, already up- uploaded. Uh, so you're gonna at some point read all the papers and process them and have some ideas about what to try, but, um, yeah, we're just gonna, uh... I don't actually know how much you can like get intelligence that's like fully closed off and with just the information that's available through it, you know? And so I think what, what's gonna happen is first there's gonna be a huge amount of unhobbling, and I think there's a huge amount of work there. Then actually it's going to move to like the interfaces between physical and digital. So I... And that's like sensors of like seeing the world and actuators of like doing something to the world.
20. SGSarah Guo
  Mm-hmm.
21. AKAndrej Karpathy
  So I think a lot of interesting companies will actually come from that interface of like can we feed the super intelligence in a certain sense, uh-
22. SGSarah Guo
  Okay
23. AKAndrej Karpathy
  ... data, and can we actually like take data out and manipulate the physical world, um, per its bidding, [laughs] if you wanna like anthropomorphize the whole thing, right? And then the, the physical world, actually I almost feel like the, the total addressable market, et cetera, in terms of like the amount of work and so on, is, is massive, possibly even much larger maybe what can happen in digital space. So I actually think it's like a much bigger opportunity as well. But, um, I do feel like it's a huge amount of work, and b- and in my, in my mind, the atoms are just like a, a million times harder. So, um, so it will lag behind, but it's also, I think, a little bit of a bigger market. So it's kinda like, uh, yeah, I think the opportunity is kind of like follow that kind of trajectory. So right now is digital is like my main interest. Then interfaces would be like after that, and then maybe like some of the physical things, um, like their time will come, and they'll be huge, uh, when they do come.
24. SGSarah Guo
  Well, there's, there's an interesting framework for it too because, uh, certain things, not the things I'm working on right now, but certain things are much easier even in the world of atoms.
25. AKAndrej Karpathy
  Mm-hmm.
26. SGSarah Guo
  Right? Like, if you just think about like read and write-
27. AKAndrej Karpathy
  Mm-hmm
28. SGSarah Guo
  ... to the physical world, like read, like sensors-
29. AKAndrej Karpathy
  Mm
30. SGSarah Guo
  ... cameras, like there's a lot of existing hardware.
1:00:59 – 1:05:40
MicroGPT and Agentic Education
1. AKAndrej Karpathy
  training, it actually is like very easily... It like really fits the paradigm.
2. SGSarah Guo
  Mm-hmm.
3. AKAndrej Karpathy
  Um, so you'd actually expect-
4. SGSarah Guo
  Yeah, clean metric. [chuckles]
5. AKAndrej Karpathy
  Yeah, like LLM training actually fits the paradigm really well, really easily.
6. SGSarah Guo
  Mm-hmm.
7. AKAndrej Karpathy
  Like all the optimization of all the code, and so it runs faster. And then you also have like metrics that you can optimize against. I do think that if you had an autonomous loop over those metrics, there's gonna be a lot of like Goodharting going on, where the system will like overfit to those metrics.
8. SGSarah Guo
  Mm-hmm.
9. AKAndrej Karpathy
  And so, um, but then you can use the system to devise more metrics, and you just have really good coverage. So it's kinda hard to tell. But, um, in a certain sense, it's like a pretty, pretty good fit.
10. SGSarah Guo
  I wanna talk about a little, uh, tiny side project you have before we end.
11. AKAndrej Karpathy
  Mm-hmm.
12. SGSarah Guo
  Um, tell me about the MicroGPT.
13. AKAndrej Karpathy
  [chuckles] Oh, yeah. Okay, so MicroGPT. So I have this like running obsession of like maybe a decade or two of just like simplifying and boiling down the, uh, basically LLMs, uh, to like their bare essence, and I've had a number of projects along these lines, so like NanoGPT and, um, MakeMore and, uh, MicroGP- MicroGrad, et cetera. [chuckles] So I feel like MicroGPT is now the state of the art of me trying to like just boil it down to just the essence. Because the thing is, like training neural nets and LLMs specifically, um, is a huge amount of code, but all of that code is actually complexity from efficiency.
14. SGSarah Guo
  Mm-hmm.
15. AKAndrej Karpathy
  It's just because you need it to go fast.
16. SGSarah Guo
  Mm-hmm.
17. AKAndrej Karpathy
  If you don't need it to go fast and you just care about the algorithm, then that algorithm actually is, uh, 200 lines of Python, very simple to read, and this includes comments and everything. Um, because you just have like, uh, your dataset, which is a text, um, and you need your neural network architecture, which is like 50 lines. You need to do your forward pass, and then you have to do, uh, your backward pass to calculate the gradients.
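The forward pass and backward pass mentioned here can be illustrated with a minimal scalar autograd sketch. This is an illustration of the general technique (reverse-mode differentiation over a computation graph), not the actual MicroGPT code; the `Value` class and its field names are hypothetical and support only addition and multiplication.

```python
class Value:
    """A scalar that remembers how it was computed so gradients can flow back."""

    def __init__(self, data, parents=(), local_grads=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents          # Values this one was computed from
        self._local_grads = local_grads  # d(self)/d(parent), one per parent

    def __add__(self, other):
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        return Value(self.data * other.data, (self, other),
                     (other.data, self.data))

    def backward(self):
        # Order the graph so each node is processed after everything
        # that depends on it (reverse topological order).
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0  # d(self)/d(self)
        for node in reversed(order):
            for parent, local in zip(node._parents, node._local_grads):
                parent.grad += local * node.grad  # chain rule, accumulated


# Forward pass: c = a * b + a
a, b = Value(2.0), Value(3.0)
c = a * b + a

# Backward pass: dc/da = b + 1, dc/db = a
c.backward()
```

After `c.backward()`, `a.grad` holds dc/da and `b.grad` holds dc/db; a fuller engine would add more operations (exponentials, nonlinearities) in the same pattern.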
18. SGSarah Guo
  Mm.
19. AKAndrej Karpathy
  And so a little autograd engine, uh, to calculate the gradients is like 100 lines. And then you need an optimizer, and Adam, for example, uh, which is a very state-of-the-art optimizer, is like, again, 10 lines really. And so putting everything together in the training loop is like, yeah, 200 lines. And what's interesting to me, like normally before, like maybe a year ago or more, if I had come up with MicroGPT, I would be tempted to basically explain to people, like I have a video like stepping through it or something like that, uh, and I actually tried to make that video a little bit, and I tried to make like a little guide to it and so on.
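The claim that an Adam optimizer fits in about ten lines can be illustrated with a minimal sketch. This is the standard Adam update rule applied to a single scalar parameter, not code from MicroGPT; the function name `adam_minimize`, the toy objective, and the hyperparameter values are assumptions chosen for illustration.

```python
import math


def adam_minimize(grad_fn, x0, steps=500, lr=0.1,
                  beta1=0.9, beta2=0.999, eps=1e-8):
    """Minimize a 1-D function given its gradient, using the Adam update rule."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad_fn(x)
        m = beta1 * m + (1 - beta1) * g       # running mean of gradients
        v = beta2 * v + (1 - beta2) * g * g   # running mean of squared gradients
        m_hat = m / (1 - beta1 ** t)          # bias correction for early steps
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


# Toy objective f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x_min = adam_minimize(lambda x: 2.0 * (x - 3.0), x0=0.0)
```

With these settings `x_min` ends up close to 3, the minimizer of the toy objective. In a real training loop the same update is applied independently to every parameter, using gradients produced by the backward pass.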
20. SGSarah Guo
  Mm-hmm.
21. AKAndrej Karpathy
  But I kind of realized that this is, is not really, is not really adding too much because people-- 'cause the... it's already so simple, that it's 200 lines, that anyone could ask their agent to explain it in various ways. And the agents-- Like, I'm not explaining to people anymore. I'm explaining it to agents. If you can explain it to agents, then agents can be the router, and they can actually target it to the human in their language, uh, with infinite, uh, you know, uh, uh, patience-
22. SGSarah Guo
  Mm-hmm
23. AKAndrej Karpathy
  ... and, uh, just at their capability and so on.
24. SGSarah Guo
  Right. If I don't understand, um, this particular function, I can ask the agent to explain it to me like three different ways.
25. AKAndrej Karpathy
  Yeah, exactly.
26. SGSarah Guo
  And I'm not gonna get that from you.
27. AKAndrej Karpathy
  Exactly.
28. SGSarah Guo
  Yeah.
29. AKAndrej Karpathy
  And so I kinda feel like, you know, what is education? Like it used to be guides, it used to be lectures, it used to be this thing. But now, I feel like now more I'm explaining things to agents, and maybe I'm coming up with skills, uh, where like, um, uh... So basically skill is just a way to instruct the agent how to teach the thing. So maybe I could have a skill for MicroGPT of the progression I imagine the agent should take you through if you're interested in understanding the code base, and it's just like hints to the model to like, "Oh, first start off with this, and then with that." And so I could just script the curriculum a little bit as a skill. Uh, so, um, so I, I don't feel like, um... Yeah, I feel like there's gonna be less of like explaining things directly to people, and it's gonna be more of just like, does the agent get it? And if the agent gets it, they'll do the explanation. And we're not fully there yet because they, y- I still can-- I still think I can probably explain things a little bit better than the agents, but I still feel like the models are improving so rapidly that, um, I feel like it's a losing battle to some, to some extent. [chuckles] Um, and so I think, uh, education is gonna be kinda like reshuffled by this, uh, quite substantially, uh, where it's the end of like teaching each other things almost a little bit. Like if I have a, um, library, for example, of code or something like that, it used to be that you have documentation for other people-
30. SGSarah Guo
  Mm-hmm
1:05:40 – 1:06:31
Conclusion
1. AKAndrej Karpathy
  these few bits, but everything else in terms of like the education that goes on after that is like not my domain anymore. So maybe, yeah, it's like education kinda changes in those ways where you kinda have to infuse the few bits that you feel strongly about the curriculum or the, the best, the better way of explaining it or something like that. The things that agents can't do is your job now. The things that agents can do, they can probably do better than you, or like very soon. And so you should, um, be strategic about what you're actually spending time on.
2. SGSarah Guo
  Well, we appreciate the few cents.
3. AKAndrej Karpathy
  [chuckles]
4. SGSarah Guo
  Thank you, Andrej.
5. AKAndrej Karpathy
  Okay. [upbeat music]
6. SGSarah Guo
  Find us on Twitter at nopriorspod. Subscribe to our YouTube channel if you wanna see our faces. Follow the show on Apple Podcasts, Spotify, or wherever you listen. That way you get a new episode every week. And sign up for emails or find transcripts for every episode at no-priors.com.

Episode duration: 1:06:31

Transcript of episode kwSVtQ7dziU
