Dwarkesh Podcast

Grant Sanderson (@3blue1brown) – AI and the future of math

Always so much fun to chat with @3blue1brown.

AI has been making much faster progress in math than in other fields. As a result, mathematics is showing us, very concretely, what AI progress in other fields will look like. Even within mathematics, there's a jagged landscape. What does it look like? What is the nature of the most important conceptual breakthroughs in the history of mathematics, and how different are they from what AIs are currently able to do? Does AI (on net) increase or decrease human understanding of the field? How big is the overhang from having AIs systematically try to connect ideas already in the literature? And what advice does Grant have for aspiring mathematicians, coders, and other students who are passionate about the fields that are being most transformed by AI?

EPISODE LINKS

* Transcript: https://www.dwarkesh.com/p/grant-sanderson-2
* Apple Podcasts: https://podcasts.apple.com/us/podcast/grant-sanderson-ai-and-the-future-of-math/id1516093381?i=1000774870615
* Spotify: https://open.spotify.com/episode/0X3t4uRlpVT4MXPYDIrNYX?si=HZf_0Ky2Q42tOWYZNvWi6w

SPONSORS

* Gemini 3.5 Live Translate is what I wished I'd had on my last trip to China. It detects more than 70 languages and translates them in near real-time, and it preserves your original pacing and intonation. If you're building an app that needs live translation, you should check out Gemini 3.5 Live Translate. Get started at https://ai.studio/live
* Cursor’s harness lets me use models for a huge range of tasks at the podcast. For example, Cursor cuts out the ads from each episode I produce so I can post them on Bilibili. It also helps me prep for interviews: I have a repo full of books and papers that Cursor sorts through to find the exact right file for any given question. Try Cursor yourself at https://cursor.com/dwarkesh
* Jane Street sponsors 3Blue1Brown, so Grant has gotten to spend a lot of time with various Jane Streeters. He actually just recorded an interview with a few of them, so when we sat down for this episode, he told me about some of the things he learned, like how Jane Street keeps their role definitions fuzzy to make sure their people keep learning and growing. Go check out Grant’s full interview at https://3b1b.co/janestreet

To sponsor a future episode, visit https://dwarkesh.com/advertise.

TIMESTAMPS

00:00:00 – AI is discovering new proofs. Is that AGI?
00:11:32 – The verification loop on conceptual breakthroughs can be a century long
00:26:12 – Will we understand an AI proof of the Riemann hypothesis?
00:38:08 – Can AI find the hidden bridges between fields?
00:53:48 – Why real-world tasks don’t fit into RL environments
01:07:07 – Good writing requires theory of mind that AI still lacks
01:16:02 – Why learning will still depend on human curation

Dwarkesh Patel, host

Jun 30, 2026 · 1h 33m

WHAT IT’S REALLY ABOUT

AI’s rapid math progress reshapes proofs, discovery, and curation roles

Math looks like a leading-edge “spike” for AI, but even within math, capability is fractal and uneven, which makes single benchmarks (like IMO gold) poor proxies for general intelligence.
Sanderson and Patel distinguish three modes of major mathematical progress: connecting existing fields (“lightning bolts”), building new theory (“mountain building”), and brute-force long proofs. Each has different implications for human understanding and downstream economic impact.
Breakthrough-quality work (good conjectures, definitions, and new conceptual frameworks) is hard to benchmark or reward-train because its value can take decades to validate, as illustrated by the long arc from Lagrange to Galois to modern group theory.
AI math progress is driven not only by verifiability but by “grindability” (cheap parallel rollouts in stable environments), which helps explain why code and math advance faster than real-world computer use tasks.
Even if AI becomes excellent at proofs and explanations, humans may retain a durable role as curators/mentors—selecting what ideas matter, motivating learners, and providing social trust—though AI may increasingly assist or outperform on many explanatory tasks.

IDEAS WORTH REMEMBERING

5 ideas

Benchmark wins don’t imply AGI because capability is uneven and task-specific.

IMO success can hinge on categories (e.g., geometry brute-force vs combinatorics creativity), so crossing a headline benchmark may not translate to broad competence or economic automation.

The hardest-to-train math skills are “what to study” and “how to define,” not “solve.”

Great mathematicians are credited for conjectures and especially definitions; these are subjective, slow to validate, and lack clean pass/fail scoring, making them resistant to current benchmark-driven training.

Math progress can arrive in three qualitatively different forms with different interpretability.

Field-bridging ideas are often human-parsable; new “mountain” theories can be alien and slow to digest; brute-force long proofs risk being correct but unenlightening—each affects whether humans gain understanding.

Long verification loops make “conceptual breakthroughs” hard to reward, even for humans.

Galois’ symmetry-based insights were rejected, rediscovered, and only later became foundational with far-off applications (physics/cryptography), showing that immediate reviewer feedback is a poor proxy for value.

Grindability is a major hidden driver of AI progress—often more than verifiability alone.

Coding/math can be containerized and parallelized with deterministic feedback, enabling massive rollouts and credit assignment; real-world computer use is verifiable but not easily repeatable at scale due to cost, variability, and bot defenses.

WORDS WORTH SAVING

5 quotes

Good mathematicians prove theorems, great mathematicians, um, come up with conjectures, and the greatest mathematicians come up with definitions.

— Grant Sanderson

I wanna propose the idea of an unsolved expository problem- where like, sure, we've proven it, but we don't really know why it's true.

— Grant Sanderson

It's like an alien trying to empathize. Like how, how could it have theory of mind? It would be like this very emergent thing to have theory of mind.

— Grant Sanderson

I think teaching is one of the most stable, uh, like- post-AGI jobs that there is because it's so relational.

— Grant Sanderson

Like, mostly it feels like a random drunken walk where you're, like, doing a thing and then, oh, you're wrong- ... and, like, constantly discovering wrong.

— Grant Sanderson

* Spiky/fractal capability frontier in math
* IMO vs combinatorics and benchmark fragility
* Lightning-bolt connections vs theory-building vs brute-force proofs
* Century-long verification loops for new concepts
* Conjectures/definitions as “highest-tier” math outputs
* Grindability vs verifiability; sample efficiency limits
* Formalization (Lean/Mathlib) and automated correctness guarantees
* Human understanding vs proof; “unsolved expository problems”
* Multi-agent entropy, biasing, and context-reset strategies
* Why AI struggles with writing: insight, theory-of-mind, non-modularity
* Future roles: curator, teacher, mentor; economics of math labor

High quality AI-generated summary created from speaker-labeled transcript.
