
Andrej Karpathy on Code Agents, AutoResearch, and the Loopy Era of AI
Sarah Guo (host), Andrej Karpathy (guest)
In this episode of No Priors, Sarah Guo talks with Andrej Karpathy about code agents, AutoResearch, and what he calls the “loopy era” of AI.
Karpathy maps AI’s loopy era: agents, claws, autoresearch, robotics, education
Karpathy describes a recent workflow shift where he rarely types code and instead coordinates multiple coding agents in parallel, making human “token throughput” and instruction quality the new bottlenecks.
He frames “claws” as persistent, looping agent systems with memory and tool access, illustrating their power via a WhatsApp-controlled home automation setup that discovers devices, reverse engineers APIs, and orchestrates household actions.
AutoResearch is presented as removing the researcher from the loop by defining objectives, metrics, and boundaries so agents can run experiments autonomously, including meta-optimization where models could eventually improve the very “Program.md” that defines the research process.
He argues current models remain “jagged,” excelling in verifiable, RL-optimized domains (e.g., code/tests) while stagnating in softer domains (e.g., humor/nuance), motivating both better evaluation scaffolds and eventual model “speciation” into specialized intelligences.
The conversation connects these trends to labor-market shifts (digital work changes first; Jevons paradox may expand software demand), open-vs-closed ecosystem dynamics (open source trailing by months but covering most use cases), and a robotics timeline where atoms lag bits and the key opportunity is the sensor/actuator interface layer.
Key Takeaways
Engineering leverage is shifting from typing speed to orchestration skill.
Karpathy reports moving from mostly hand-coding to mostly delegating, where the key competency becomes decomposing work into parallelizable “macro actions,” writing effective instructions, and reviewing outputs at the right fidelity.
Maximizing output now looks like maximizing token throughput, not CPU/GPU utilization.
He likens unused agent quota to idle GPUs in a PhD lab: if an agent is running, the human should queue the next task or spin up another agent, making the person the primary bottleneck.
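The “idle GPU” analogy amounts to treating agent sessions like a worker pool you keep saturated. A minimal sketch of that queueing discipline, where `run_agent` is a hypothetical stand-in for a real coding-agent session, not any actual API:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import time

def run_agent(task):
    """Stand-in for a long-running coding-agent session (Claude Code,
    Codex, etc.); here it just sleeps briefly to simulate work."""
    time.sleep(0.01)
    return f"done: {task}"

def saturate(tasks, max_agents=5):
    """Keep max_agents sessions busy at all times: as soon as one
    finishes, the next queued task starts. The human's job becomes
    filling the queue with well-specified tasks, not typing code."""
    results = []
    with ThreadPoolExecutor(max_workers=max_agents) as pool:
        futures = [pool.submit(run_agent, t) for t in tasks]
        for f in as_completed(futures):
            results.append(f.result())
    return results
```

The point is the discipline, not the concurrency primitive: the human is the bottleneck exactly when the queue runs dry.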
Persistent “claws” are a UX re-architecture: fewer apps, more intent-driven APIs.
His Dobby home claw replaces multiple vendor apps by discovering local devices, finding/deriving endpoints, and exposing a single natural-language control surface, suggesting software may refactor toward agent-consumable APIs over human-first UIs.
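A persistent claw of this kind can be pictured as a simple sense-plan-act loop with memory and tools. The sketch below is an illustration only, not Karpathy's actual Dobby setup: the device entry, the `llm_plan` helper, and the message source are all hypothetical, and the planner is a crude keyword stub where a real claw would call a model.

```python
import json

# Hypothetical persistent memory: facts the claw has accumulated,
# e.g. devices it discovered and endpoints it reverse-engineered.
memory = {
    "devices": {"living_room_light": "http://192.168.1.40/api/state"},
    "notes": [],
}

def llm_plan(message, memory):
    """Stand-in for a model call that maps a natural-language request
    plus remembered context to a concrete tool action."""
    if "light" in message.lower():
        return {"tool": "http_post",
                "url": memory["devices"]["living_room_light"],
                "body": {"on": "on" in message.lower()}}
    return {"tool": "reply", "text": "I don't know how to do that yet."}

def run_tool(action):
    """Dispatch the planned action; a real claw would carry many tools
    (HTTP, shell, device discovery) rather than this toy pair."""
    if action["tool"] == "http_post":
        # requests.post(action["url"], json=action["body"])  # real call
        return f"POST {action['url']} {json.dumps(action['body'])}"
    return action["text"]

def claw_loop(incoming_messages):
    """What makes it a 'claw': it loops, it remembers, and it acts
    through tools rather than a human-first UI."""
    results = []
    for msg in incoming_messages:              # e.g. a WhatsApp feed
        action = llm_plan(msg, memory)
        result = run_tool(action)
        memory["notes"].append((msg, result))  # persist what happened
        results.append(result)
    return results
```

The single natural-language surface replaces the per-vendor apps; each app's functionality survives only as an endpoint the claw knows about.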
AutoResearch works best where evaluation is cheap, objective, and automatable.
He emphasizes kernels/perf work and model training loops as ideal because correctness and improvement can be verified via tests or metrics, while domains without clear evaluators resist full autonomy.
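When the evaluator is cheap and objective, the outer loop of AutoResearch can be almost trivially simple. A hedged sketch under toy assumptions: `run_benchmark` stands in for an objective metric (kernel throughput, validation loss) and `propose_change` for an agent's experiment; changes are kept only when verifiably better.

```python
import random

def run_benchmark(config):
    """Stand-in for an objective, automatable evaluator; this toy
    metric peaks at lr=0.01 (higher is better)."""
    return -(config["lr"] - 0.01) ** 2

def propose_change(config, rng):
    """Stand-in for an agent proposing an experiment: perturb one knob."""
    new = dict(config)
    new["lr"] = max(1e-5, config["lr"] * rng.choice([0.5, 0.9, 1.1, 2.0]))
    return new

def auto_research(config, steps=200, seed=0):
    """Researcher-out-of-the-loop hill climb: run experiments and keep a
    change only if the metric verifiably improves."""
    rng = random.Random(seed)
    best, best_score = config, run_benchmark(config)
    for _ in range(steps):
        candidate = propose_change(best, rng)
        score = run_benchmark(candidate)
        if score > best_score:          # the objective verification gate
            best, best_score = candidate, score
    return best, best_score
```

Domains without such a gate resist this loop entirely: with no `run_benchmark`, there is nothing for the `if` to check.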
Jaggedness persists because labs optimize what they can verify.
He argues RL pipelines strongly improve tasks with clear rewards (tests, benchmarks) but leave softer capabilities under-optimized, producing systems that can “move mountains” in coding yet still default to stale, low-diversity jokes.
Research organizations may become tunable codebases (and eventually self-tuning).
Program.md ...
Open-source and closed frontier can form a healthy power balance—if pluralism persists.
Karpathy expects the current pattern—closed frontier ahead, open models trailing by months but covering broad needs—to continue, arguing it mitigates systemic risk from centralized intelligence while still funding expensive frontier progress.
Notable Quotes
“I don't think I've typed, like, a line of code probably since December, basically.”
— Andrej Karpathy
“Now it's not about FLOPs, it's about tokens. What is your token throughput, and what token throughput do you command?”
— Andrej Karpathy
“I simultaneously feel like I'm talking to an extremely brilliant PhD student... and a 10-year-old.”
— Andrej Karpathy
“A research organization is a set of Markdown files that describe all the roles and how the whole thing connects.”
— Andrej Karpathy
“In a certain sense, these apps... shouldn't even exist... shouldn't it just be APIs, and shouldn't agents be just using it directly?”
— Andrej Karpathy
Questions Answered in This Episode
What specific practices make you effective at reviewing agent-generated changes when you’re coordinating 5–10 parallel repos (tests, diffs, invariants, risk tiers)?
Karpathy describes a recent workflow shift where he rarely types code and instead coordinates multiple coding agents in parallel, making human “token throughput” and instruction quality the new bottlenecks.
In your Dobby home claw, what security boundaries did you enforce (sandboxing, network isolation, secrets handling), and what scared you enough to avoid email/calendar access?
He frames “claws” as persistent, looping agent systems with memory and tool access, illustrating their power via a WhatsApp-controlled home automation setup that discovers devices, reverse engineers APIs, and orchestrates household actions.
If jaggedness is driven by verifiability, what new evaluation signals would you add to train better “nuance” behaviors like asking clarifying questions at the right time?
AutoResearch is presented as removing the researcher from the loop by defining objectives, metrics, and boundaries so agents can run experiments autonomously, including meta-optimization where models could eventually improve the very “Program.md” that defines the research process.
What would a concrete AutoResearch@home protocol look like for untrusted contributors—how do you safely execute arbitrary commits, prevent exfiltration, and handle compute fraud?
He argues current models remain “jagged,” excelling in verifiable, RL-optimized domains (e.g., code/tests) while stagnating in softer domains (e.g., humor/nuance), motivating both better evaluation scaffolds and eventual model “speciation” into specialized intelligences.
Where did AutoResearch find improvements in Nanochat that surprised you most, and what does that imply about how much “researcher intuition” is actually leaving performance on the table?
The conversation connects these trends to labor-market shifts (digital work changes first; Jevons paradox may expand software demand), open-vs-closed ecosystem dynamics (open source trailing by months but covering most use cases), and a robotics timeline where atoms lag bits and the key opportunity is the sensor/actuator interface layer.
Transcript Preview
How can I have not just a single session of Claude Code or Codex or some of these agent harnesses? How can I have more of them? How can I do that appropriately? The agent part is now taken for granted. Now the claw-like entities are taken for granted, and now you can have multiple of them, and now you can have instructions to them, and now you can have optimization over the instructions. But-
[laughs]
... there-- I mean, this is why it gets to the psychosis, is that this is, like, infinite and everything is skill issue.
[upbeat music] Hi, listeners. Welcome back to No Priors. Today, I'm here with Andrej Karpathy, and we have a wide-ranging conversation for you about code agents, the future of engineering and AI research, how more people can contribute to research, what's happening in robotics, his prediction for how agents can reach out into the real world, and education in this next age. Welcome, Andrej. Andrej, thanks for doing this.
Yeah, thank you for having me. [laughs]
Uh, so it's been a very exciting couple of months in AI.
Uh, yeah, [laughs] you could say that.
I remember, um, walking into the office at some point, and you were, like, really locked in, and I was asking what you were up to, and you're like, "I just... I have to code for sixteen hours a day," or code's not even the right verb anymore, right?
[laughs] Yeah.
But I have to, um, express my will to my agents for-
Manifest
... sixteen hours a day. Manifest. Um, because, like, there's been a jump in capability.
Yeah.
Uh, what's happening? Tell me about your experience.
Yeah, I kinda feel like I was just in this perpetual... I still am often, uh, in this state of AI psychosis just, like, all the time, uh, because there was a huge unlock in what you can achieve as a person, as an individual, right? Because you were bottlenecked by, you know, your typing speed and so on. But now with these agents, it really... I would say in December is when it really just... something flipped, where I kinda went from eighty-twenty of like, you know, uh, to, like, twenty-eighty of writing code by myself versus just delegating to agents. And I don't even think it's twenty-eighty by now. I think it's a lot more than that. I don't think I've typed, like, a line of code probably since December, basically, [laughs] um, which is, like, an extremely large, uh, change. Um, I was talking to, like, for example, I was talking about it to, for example, my parents and so on, and I don't think, like, a normal person actually realizes that this happened or how dramatic it was. Like, literally, like, if you just find a random software engineer or something like that at their, at their desk and what they're doing, like, their default workflow of, you know, building software is completely different as of basically December. Uh, so I'm just, like, in this state of psychosis of trying to figure out, like, what's possible, uh, trying to push it to the limit. How is it-- how can I have not just a single session of, you know, um, Claude Code or Codex or some of these agent harnesses? How can I have more of them? How can I do that, uh, appropriately? And then how can I use these claws? What are these claws? Uh, [laughs] and, uh, so there's, like, a lot of new things. I wanna be at the forefront of it, you know, and I'm very antsy that I'm not at the forefront of it. And I see lots of people on Twitter doing all kinds of things, and they all sound like really good ideas, and I need to be at the forefront, or I feel extremely nervous. 
And so I guess I'm just in this psychosis of, like, what's possible? Like, because it's unexplored fundamentally.