No Priors Ep. 90 | With Google DeepMind's AlphaProof Team

In this week’s episode of No Priors, Sarah and Elad sit down with the Google DeepMind team behind AlphaProof, Laurent Sartran, Rishi Mehta, and Thomas Hubert. AlphaProof is a new reinforcement learning-based system for formal math reasoning that recently reached a silver-medal standard in solving International Mathematical Olympiad problems. They dive deep into AI and its role in solving complex mathematical problems, featuring insights into AlphaProof and its capabilities. They cover its functionality, unique strengths in reasoning, and the challenges it faces as it scales. The conversation also explores the motivations behind AI in math, practical applications, and how verifiability and human input come into play within a reinforcement learning approach. The DeepMind team shares advice and future perspectives on where math and AI are headed.

Sign up for new podcasts every week. Email feedback to show@no-priors.com

Follow us on Twitter: @NoPriorsPod | @Saranormous | @EladGil | @Rishicomplex | @LaurentSartran | @ThomasHubert

Show Notes:
0:00 Personal introductions
2:19 Achieving silver medal in IMO competition
3:52 How AlphaProof works
5:56 AlphaProof’s strengths within mathematical reasoning
8:56 Challenges in scaling AlphaProof
13:40 Why solve math?
17:50 Pursuing knowledge versus practical applications
21:30 Insights on verifying correctness within reinforcement learning
28:27 How AI could foster more collaboration among mathematicians
30:28 Surprising insights from AI proof generation
34:17 Future of math and AI: advice for math enthusiasts and researchers

Sarah Guo (host), Laurent Sartran (guest), Thomas Hubert (guest), Rishi Mehta (guest), Elad Gil (host)

Nov 13, 2024 · 39m

WHAT IT’S REALLY ABOUT

DeepMind’s AlphaProof Pushes AI Toward Rigorous Mathematical Reasoning Frontier

The episode features members of DeepMind’s AlphaProof team explaining how they adapted AlphaZero-style reinforcement learning and search to discover and verify formal mathematical proofs, reaching a silver-medal standard at the 2024 IMO (4 of 6 problems solved, counting the geometry problem handled by the companion system AlphaGeometry 2).
They describe AlphaProof’s architecture, its use of formal proof languages like Lean, and a key innovation, Test-Time RL, which lets the system iteratively generate and solve problem variants to crack very hard problems over days of compute.
The discussion covers current strengths (algebra and number theory), weaknesses (combinatorics, geometry, and lack of theory-building), and the long-term goal of enabling systems that “think more” to develop new mathematical theories.
They also explore broader implications for AGI, code verification, mathematical collaboration and education, and how human expertise and “taste” in posing good questions will matter even more as AI becomes better at finding answers.

IDEAS WORTH REMEMBERING

5 ideas

Formal proof languages like Lean are becoming central to AI–math collaboration.

AlphaProof operates in a formal language so its proofs can be mechanically verified, enabling self-improvement loops and opening the door to large-scale human–AI collaboration where machines check correctness and humans focus on ideas.
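To make "mechanically verified" concrete, here is a minimal Lean 4 sketch. It is not taken from AlphaProof or the episode; the theorem name `add_comm_example` is purely illustrative. The point is that the Lean checker accepts the file only if every proof step is valid, which is what lets a machine confirm correctness without human review.

```lean
-- Illustrative only: a statement and a proof that Lean's kernel can check.
-- `Nat.add_comm` is a lemma from Lean 4's core library.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- A concrete arithmetic fact, verified by direct computation.
example : 2 + 2 = 4 := rfl
```

If either proof were wrong (say, claiming `2 + 2 = 5`), Lean would reject it, so a search procedure can use "the proof checks" as an unambiguous success signal.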

Test-Time RL lets AI substantially improve on a single hard problem by ‘thinking more’.

When AlphaProof gets stuck, it generates many nearby problem variants, learns from solving those, and gradually hill-climbs toward a solution of the original problem—sometimes over several days of compute.

AlphaProof already matches top high-school competition level in certain domains but lacks theory-building.

It is strongest in algebra and number theory at IMO level, but it does not yet invent new mathematical frameworks or deep theories, which are likely required for tackling grand challenges like the Riemann Hypothesis.

Human expert data plus RL-generated data are complementary for superhuman performance.

Small amounts of high-quality human proofs can efficiently seed behavior; then reinforcement learning and large-scale search let the system develop its own, sometimes ‘alien’, styles that can exceed human problem-solving on specific tasks.

Formal verification could transform software engineering by scaling beyond human-written proofs.

The same techniques used to prove math theorems can prove program properties, potentially making rigorous code verification far more common and reducing bugs and security vulnerabilities.
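As a small illustration of proving a program property, the Lean 4 sketch below defines a tiny function and proves something about every possible input. The function `doubleAll` and the theorem name are invented for this example and do not come from the episode; real code verification targets far larger programs and richer properties.

```lean
-- Illustrative "program": double every element of a list.
def doubleAll : List Nat → List Nat
  | [] => []
  | x :: xs => (2 * x) :: doubleAll xs

-- Property: the program never changes the length of its input.
-- Proved for all lists by induction, not by testing a few cases.
theorem doubleAll_length (xs : List Nat) :
    (doubleAll xs).length = xs.length := by
  induction xs with
  | nil => simp [doubleAll]
  | cons x xs ih => simp [doubleAll, ih]
```

Unlike a unit test, which checks a handful of inputs, the theorem covers every list, which is the kind of guarantee that could reduce bugs if such proofs became cheap to produce.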

WORDS WORTH SAVING

5 quotes

Math seems to be a perfect domain for systems that can spend more compute either to tackle harder problems or to think more.

— Thomas Hubert

Maybe the main thing that AlphaProof doesn’t do is theory building.

— Rishi Mehta

We can learn general mathematics almost from scratch and arrive at impressive high school level.

— Laurent Sartran

As machines get better at finding the answers, we’re going to have to get better at finding the questions.

— Rishi Mehta

Formal math is going to be an increasingly important thing going forward.

— Rishi Mehta

AlphaProof’s architecture and adaptation of AlphaZero to formal mathematics
Test-Time Reinforcement Learning (RL) as a way to ‘think more’ at inference
Performance on International Mathematical Olympiad (IMO) problems and domain strengths/weaknesses
Current limitations: lack of theory-building, auto-formalization, and handling combinatorics/geometry
Implications for AGI, reasoning, and transfer to other domains (science, language)
Applications in formal methods, code verification, and mathematical collaboration
Role of human mathematicians, formal proof languages (Lean), education, and “taste” in problem selection

High quality AI-generated summary created from speaker-labeled transcript.
