No Priors: Andrej Karpathy on Code Agents, AutoResearch, and the Loopy Era of AI
At a glance
WHAT IT’S REALLY ABOUT
Karpathy maps AI’s loopy era: agents, claws, AutoResearch, robotics, education
- Karpathy describes a recent workflow shift where he rarely types code and instead coordinates multiple coding agents in parallel, making human “token throughput” and instruction quality the new bottlenecks.
- He frames “claws” as persistent, looping agent systems with memory and tool access, illustrating their power via a WhatsApp-controlled home automation setup that discovers devices, reverse engineers APIs, and orchestrates household actions.
- AutoResearch is presented as removing the researcher from the loop by defining objectives, metrics, and boundaries so agents can run experiments autonomously, including meta-optimization where models could eventually improve the very “Program.md” that defines the research process.
- He argues current models remain “jagged,” excelling in verifiable, RL-optimized domains (e.g., code/tests) while stagnating in softer domains (e.g., humor/nuance), motivating both better evaluation scaffolds and eventual model “speciation” into specialized intelligences.
- The conversation connects these trends to labor-market shifts (digital work changes first; Jevons paradox may expand software demand), open-vs-closed ecosystem dynamics (open source trailing by months but covering most use cases), and a robotics timeline where atoms lag bits and the key opportunity is the sensor/actuator interface layer.
IDEAS WORTH REMEMBERING
Engineering leverage is shifting from typing speed to orchestration skill.
Karpathy reports moving from mostly hand-coding to mostly delegating, where the key competency becomes decomposing work into parallelizable “macro actions,” writing effective instructions, and reviewing outputs at the right fidelity.
Maximizing output now looks like maximizing token throughput, not CPU/GPU utilization.
He likens unused agent quota to idle GPUs in a PhD lab: if an agent is running, the human should queue the next task or spin up another agent, making the person the primary bottleneck.
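The GPU-utilization analogy can be sketched as a simple dispatch loop. This is a hypothetical illustration, not Karpathy's actual setup: `run_agent` is a stand-in for whatever API or CLI invokes a coding agent, and `lanes` plays the role of available agent quota.

```python
from concurrent.futures import ThreadPoolExecutor

def run_agent(task: str) -> str:
    """Placeholder for a call to a coding agent (an API or CLI in practice)."""
    return f"result of: {task}"

def orchestrate(tasks: list[str], lanes: int = 4) -> list[str]:
    # Keep every agent "lane" busy, analogous to keeping GPUs saturated.
    # Once lanes outnumber queued tasks, the human becomes the bottleneck:
    # throughput is limited by how fast new tasks can be specified.
    with ThreadPoolExecutor(max_workers=lanes) as pool:
        return list(pool.map(run_agent, tasks))
```

The design choice mirrors the point in the text: decompose work into independent "macro actions" up front so they can run in parallel, and spend human attention on writing the task list and reviewing outputs rather than on any single execution.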
Persistent “claws” are a UX re-architecture: fewer apps, more intent-driven APIs.
His Dobby home claw replaces multiple vendor apps by discovering local devices, finding/deriving endpoints, and exposing a single natural-language control surface, suggesting software may refactor toward agent-consumable APIs over human-first UIs.
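The "claw" pattern described here, a persistent loop with memory and tool access behind one natural-language surface, can be sketched minimally. All names below are illustrative assumptions; the internals of the Dobby setup were not shown.

```python
# Hypothetical claw skeleton: event history persists across turns, and a
# tool registry replaces per-vendor apps with one intent-driven surface.
memory: list[tuple[str, str]] = []   # (message, result) history across turns
tools = {
    "lights": lambda arg: f"lights set to {arg}",
    "thermostat": lambda arg: f"thermostat set to {arg}",
}

def plan(message: str) -> tuple[str, str]:
    """Stand-in for the model call that maps an intent to a tool invocation."""
    for name in tools:
        if name in message:
            return name, message.split()[-1]
    return "lights", "on"   # arbitrary fallback for the sketch

def handle(message: str) -> str:
    tool, arg = plan(message)
    result = tools[tool](arg)
    memory.append((message, result))   # persistence: the loop remembers
    return result
```

In a real claw, `plan` would be an LLM call and the tool registry would be populated by device discovery and reverse-engineered endpoints, as the summary describes; the skeleton only shows the loop-plus-memory-plus-tools shape.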
AutoResearch works best where evaluation is cheap, objective, and automatable.
He emphasizes kernels/perf work and model training loops as ideal because correctness and improvement can be verified via tests or metrics, while domains without clear evaluators resist full autonomy.
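The "cheap, objective, automatable evaluation" condition can be shown with a toy inner loop: propose a change, score it with an automatic metric, and keep it only if the verifier confirms an improvement. The objective below is an invented stand-in for something like kernel speed or validation loss.

```python
import random

def evaluate(config: float) -> float:
    """Cheap, objective, automatable metric (lower is better); a toy stand-in."""
    return (config - 3.0) ** 2

def auto_search(steps: int = 200, seed: int = 0) -> float:
    # No human in the loop: the verifier alone gates every change,
    # which is exactly why domains without clear evaluators resist this.
    rng = random.Random(seed)
    best = 0.0
    for _ in range(steps):
        candidate = best + rng.uniform(-0.5, 0.5)   # the agent's proposed tweak
        if evaluate(candidate) < evaluate(best):    # automatic verification
            best = candidate
    return best
```

Because acceptance is gated purely by the metric, progress is monotone and needs no human review; swap in a subjective quality like humor, where `evaluate` has no reliable implementation, and the loop has nothing to gate on.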
Jaggedness persists because labs optimize what they can verify.
He argues RL pipelines strongly improve tasks with clear rewards (tests, benchmarks) but leave softer capabilities under-optimized, producing systems that can “move mountains” in coding yet still default to stale, low-diversity jokes.
WORDS WORTH SAVING
I don't think I've typed, like, a line of code probably since December, basically.
— Andrej Karpathy
Now it's not about FLOPs, it's about tokens. What is your token throughput, and what token throughput do you command?
— Andrej Karpathy
I simultaneously feel like I'm talking to an extremely brilliant PhD student... and a 10-year-old.
— Andrej Karpathy
A research organization is a set of Markdown files that describe all the roles and how the whole thing connects.
— Andrej Karpathy
In a certain sense, these apps... shouldn't even exist... shouldn't it just be APIs, and shouldn't agents be just using it directly?
— Andrej Karpathy