No Priors Ep. 90 | With Google DeepMind's AlphaProof Team

No Priors · Nov 14, 2024 · 39m

Sarah Guo (host), Laurent Sifre (guest), Thomas Hubert (guest), Rishi Mehta (guest), Elad Gil (host)

- AlphaProof’s architecture and adaptation of AlphaZero to formal mathematics
- Test-Time Reinforcement Learning (RL) as a way to ‘think more’ at inference
- Performance on International Mathematical Olympiad (IMO) problems and domain strengths/weaknesses
- Current limitations: lack of theory-building, auto-formalization, and handling combinatorics/geometry
- Implications for AGI, reasoning, and transfer to other domains (science, language)
- Applications in formal methods, code verification, and mathematical collaboration
- Role of human mathematicians, formal proof languages (Lean), education, and “taste” in problem selection

In this episode of No Priors, hosts Sarah Guo and Elad Gil talk with Thomas Hubert, Rishi Mehta, and Laurent Sifre of Google DeepMind's AlphaProof team.

DeepMind’s AlphaProof Pushes AI Toward Rigorous Mathematical Reasoning Frontier

The episode features members of DeepMind’s AlphaProof team explaining how they adapted AlphaZero-style reinforcement learning and search to discover and verify formal mathematical proofs, achieving IMO-level problem solving (4 of 6 problems in 2024).

They describe AlphaProof’s architecture, its use of formal proof languages like Lean, and a key innovation—Test-Time RL—which lets the system iteratively generate and solve problem variants to crack very hard problems over days of compute.

The discussion covers current strengths (algebra and number theory), weaknesses (combinatorics, geometry, and lack of theory-building), and the long-term goal of enabling systems that “think more” to develop new mathematical theories.

They also explore broader implications for AGI, code verification, mathematical collaboration and education, and how human expertise and “taste” in posing good questions will matter even more as AI becomes better at finding answers.

Key Takeaways

Formal proof languages like Lean are becoming central to AI–math collaboration.

AlphaProof operates in a formal language so its proofs can be mechanically verified, enabling self-improvement loops and opening the door to large-scale human–AI collaboration where machines check correctness and humans focus on ideas.
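To illustrate what mechanical verification means, here is a tiny Lean 4 proof (my own toy example, not AlphaProof output). Lean's kernel accepts the theorem only if every step checks, which is what makes machine-checked correctness possible at scale:

```lean
-- A toy theorem: addition of natural numbers is commutative.
-- `Nat.add_comm` is the standard library lemma `a + b = b + a`;
-- the Lean kernel verifies that it proves exactly this statement.
theorem my_add_comm (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```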

Test-Time RL lets AI substantially improve on a single hard problem by ‘thinking more’.

When AlphaProof gets stuck, it generates many nearby problem variants, learns from solving those, and gradually hill-climbs toward a solution of the original problem—sometimes over several days of compute.

AlphaProof already matches top high-school competition level in certain domains but lacks theory-building.

It is strongest in algebra and number theory at IMO level, but it does not yet invent new mathematical frameworks or deep theories, which are likely required for tackling grand challenges like the Riemann Hypothesis.

Human expert data plus RL-generated data are complementary for superhuman performance.

Small amounts of high-quality human proofs can efficiently seed behavior; then reinforcement learning and large-scale search let the system develop its own, sometimes ‘alien’, styles that can exceed human problem-solving on specific tasks.

Formal verification could transform software engineering by scaling beyond human-written proofs.

The same techniques used to prove math theorems can prove program properties, potentially making rigorous code verification far more common and reducing bugs and security vulnerabilities.
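To make the connection concrete, the same proof machinery that checks theorems can check program properties. A minimal Lean 4 sketch (illustrative, not AlphaProof output):

```lean
-- A program and a proved property about it. `double_spec` is accepted by
-- the kernel only because the equality holds by definition, so the property
-- is verified for all inputs rather than merely tested on a few.
def double (n : Nat) : Nat := n + n

theorem double_spec (n : Nat) : double n = n + n := rfl
```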

Math is a powerful testbed for “thinking more” and for general reasoning progress.

Because math is purely cognitive and perfectly verifiable, it’s an ideal domain to study systems that get better by using more compute and search—insights that can later transfer to science, engineering, and even complex language tasks.

As AI improves at finding answers, human value shifts toward asking the right questions.

The guests foresee a future where machines handle many proof and detail-level tasks, while humans’ comparative advantage is in theory-building, problem selection, and developing ‘taste’ for which questions and directions matter.

Notable Quotes

Math seems to be a perfect domain for systems that can spend more compute either to tackle harder problems or to think more.

Thomas Hubert

Maybe the main thing that AlphaProof doesn’t do is theory building.

Rishi Mehta

We can learn general mathematics almost from scratch and arrive at impressive high school level.

Laurent Sifre

As machines get better at finding the answers, we’re going to have to get better at finding the questions.

Rishi Mehta

Formal math is going to be an increasingly important thing going forward.

Rishi Mehta

Questions Answered in This Episode

What specific mechanisms or architectures might enable AlphaProof—or its successors—to progress from proof-finding to genuine theory-building?

How could Test-Time RL be adapted to open-ended, less formally verifiable domains like natural language, scientific discovery, or creative writing?

What are the practical steps needed to make formal methods and tools like Lean mainstream in mathematics departments and software engineering teams?

How should the mathematical community decide which areas or conjectures to target first when collaborating with systems like AlphaProof?

In a world where AI can verify and even generate most proofs, what new skills and forms of ‘taste’ will define an outstanding mathematician or researcher?

Transcript Preview

Sarah Guo

Hi, listeners, and welcome to No Priors. Today, we have Thomas Hubert, Rishi Mehta, and Laurent Sifre from DeepMind's AlphaProof team. AlphaProof is a new AI system that can find and verify mathematical proofs, building on DeepMind's earlier successes in chess and Go to tackle one of AI's greatest challenges, mathematical reasoning. In today's episode, we'll explore how AlphaProof works, its implications for math and AI, um, more about test-time RL, and what this reveals about machine learning's capability to reason rigorously. Really happy to have you guys. Welcome.

Laurent Sifre

Thank you for having us. Yeah.

Sarah Guo

Maybe you can start by just talking a little bit about your backgrounds and how you came to be working on AlphaProof together.

Rishi Mehta

I'm Rishi. Uh, I was one of the tech leads on AlphaProof. I've been working in sort of computer science and- and machine learning for a while. I'm a chess player, and I came across, um, the AlphaZero paper and saw some of the- the chess games, uh, that that agent produced, and I found it really inspiring. And I thought, like, "This is the kind of thing I need to work on." Like, coming up with something beautiful and superhuman and almost alien, uh, felt magical. I came over to DeepMind and, uh, the AlphaZero team, which Thomas was leading, uh, was working on, um, math, and that's how I got into math.

Thomas Hubert

My background, so yeah, started working in, uh, in industry, uh, in- in my early career. Um, I worked on anomaly detection in computer networks, I worked on, uh, ad targeting, then switched to AI research. Uh, and there, a constant interest of mine has been, uh, systems that can, uh, spend more compute to either tackle harder problems, uh, or to think more. And, uh, math seemed to be a perfect domain for that.

Laurent Sifre

Yeah, on my side, I, um, I was actually a Go player. Um, so instead of doing programming since the age of 10, I was actually playing Go. And I played a lot of Go during my youth. And, um, and then at some point, like, it was also my- my dad's dream to build a computer Go program. And so I was kind of figuring out, like, "What do I need to know to be able to build a computer Go program?" And then I realized that maybe it was being built at that time. And so that's how I discovered DeepMind and how I discovered AGI. And, uh, that's how I joined the company, and I've been kind of involved with AlphaGo and AlphaZero, MuZero, this line of work. Um, and recently, we had worked on AlphaCode and AlphaTensor. So, you know, like, that's way before, uh, ChatGPT, but we already knew that transformers were- were kind of changing a little bit how things were done. And so we, you know ... I found that in math, yes, you could get this perfect verifiability, and with AlphaCode, we realized we can generate a lot of good code. And so it was very natural at that time to think about, um, the potential, uh, there was for- for mathematics.
