No Priors

No Priors Ep. 120 | With Google DeepMind’s Pushmeet Kohli and Matej Balog

Much of the scientific process involves searching. Rather than continue to rely on the luck of discovery, Google DeepMind has engineered an AI agent that searches complex spaces more efficiently to facilitate scientific breakthroughs. Sarah Guo speaks with Pushmeet Kohli, VP of Science and Strategic Initiatives, and research scientist Matej Balog at Google DeepMind about AlphaEvolve, an autonomous coding agent they developed that finds new algorithms through evolutionary search. Pushmeet and Matej talk about how AlphaEvolve tackles the problem of matrix multiplication efficiency, scaling and iteration in problem solving, and whether this means we have reached self-improving AI. Together, they also explore the implications of AlphaEvolve for sciences beyond mathematics and computer science.

Sign up for new podcasts every week. Email feedback to show@no-priors.com

Follow us on Twitter: @NoPriorsPod | @Saranormous | @EladGil | @pushmeet | @matejbalog

Chapters:
00:00 Pushmeet Kohli and Matej Balog Introduction
00:48 Origin of AlphaEvolve
02:31 AlphaEvolve’s Progression from AlphaGo and AlphaTensor
08:02 The Open Problem of Matrix Multiplication Efficiency
11:18 How AlphaEvolve Evolves Code
14:43 Scaling and Predicting Iterations
16:52 Implications for Coding Agents
19:42 Overcoming Limits of Automated Evaluators
25:21 Are We At Self-Improving AI?
28:10 Effects on Scientific Discovery and Mathematics
31:50 Role of Human Scientists with AlphaEvolve
38:30 Making AlphaEvolve Broadly Accessible
40:18 Applying AlphaEvolve Within Google
41:39 Conclusion

Sarah Guo (host), Matej Balog (guest), Pushmeet Kohli (guest)
Jun 26, 2025 · 42m

CHAPTERS

  1. 0:05 – 1:16

    What AlphaEvolve is and why it matters (algorithmic creativity)

    Sarah introduces AlphaEvolve as an autonomous coding agent that combines Gemini models with evolutionary search to discover new algorithms. The framing positions it as a step beyond boilerplate code generation—closer to the kind of surprising “Move 37”-style technical creativity previously seen in AlphaGo.

    • AlphaEvolve: autonomous coding agent for algorithm discovery
    • Uses Gemini + evolutionary search to explore program space
    • Aims at new algorithms in CS/math, with potential beyond
    • Compared to AlphaGo’s Move 37 as a creativity benchmark
  2. 1:16 – 2:39

    Origin story: DeepMind’s long-running push for algorithm discovery

    Matej explains that AlphaEvolve comes from a multi-year effort to use AI to discover more efficient algorithms with broad real-world impact. The team’s earlier breakthrough, AlphaTensor (2022), provided evidence that AI can exceed human discoveries in certain algorithmic domains.

    • Algorithm discovery as a core pathway to AI benefiting humanity
    • AlphaTensor as an early breakthrough demonstrating superhuman algorithm search
    • Focus on efficiency gains for fundamental computational problems
    • Goal: high-impact improvements in ubiquitous algorithms
  3. 2:39 – 8:33

    From AlphaGo to AlphaTensor to FunSearch to AlphaEvolve (the lineage)

    Pushmeet connects AlphaEvolve’s philosophy to AlphaGo: efficient search in a massive space guided by feedback. AlphaTensor proved the concept in a narrow domain (matrix multiplication), FunSearch generalized to program search with LLMs, and AlphaEvolve extends that approach further.

    • AlphaGo demonstrated scalable search guided by evaluation
    • Matrix multiplication chosen as a fundamental, ubiquitous target
    • AlphaTensor: domain-specific superhuman result; question became generalization
    • FunSearch: LLM-driven search in program space; first “scientific discovery” framing
    • AlphaEvolve: extension of FunSearch toward broader, more practical impact
  4. 8:33 – 11:08

    Why matrix multiplication remains hard: vast search space and non-intuitive constructions

    Sarah asks why valuable improvements weren’t found earlier; Pushmeet and Matej argue it’s not complacency but difficulty. The constructions are intricate and counterintuitive, and the space of candidate algorithms grows explosively with matrix size—making discovery unlikely by chance.

    • Strassen-style gains are ingenious and non-obvious
    • Search space becomes ‘unbelievably vast’ at larger sizes
    • Solutions can be highly non-intuitive and hard to stumble upon
    • Emphasis: prior researchers were not complacent; problems were heavily optimized
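
To make the "ingenious and non-obvious" point concrete, here is a small sketch of Strassen's classic construction for 2×2 matrices, which uses 7 scalar multiplications instead of the 8 in the schoolbook method. Applied recursively to matrix blocks, that one saved multiplication lowers the cost from O(n^3) to roughly O(n^2.81). The function names are illustrative only; this is the textbook algorithm, not anything discovered by AlphaTensor or AlphaEvolve.

```python
def naive_2x2(A, B):
    """Schoolbook 2x2 product: 8 scalar multiplications."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


def strassen_2x2(A, B):
    """Strassen's 2x2 product: 7 scalar multiplications (m1..m7)."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B

    # Seven products of sums and differences; none is an "obvious" term
    # of the result, which is why the construction is hard to stumble upon.
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)

    # The entries of the product are recovered using additions only.
    return [
        [m1 + m4 - m5 + m7, m3 + m5],
        [m2 + m4, m1 - m2 + m3 + m6],
    ]


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(strassen_2x2(A, B) == naive_2x2(A, B))
```

Searching for constructions like this at larger sizes means searching over which products of sums to form and how to recombine them, which is where the space of candidates grows so quickly.
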
  5. 11:08 – 14:43

    How AlphaEvolve evolves code: evaluation functions, LLM proposals, and evolutionary search

    Matej walks through the concrete workflow using data-center scheduling: the user supplies an evaluation function (often a simulator), and AlphaEvolve searches for better code. The system blends LLM creativity, strict scoring, and population-based evolution to maintain diversity and combine ideas over generations.

    • User specifies the problem via an evaluation function (e.g., scheduling simulator)
    • AlphaEvolve fills in the ‘how’: proposes code changes and tests them
    • Can start from scratch or from a strong existing baseline
    • Evolutionary algorithm maintains a diverse pool; recombines strong ideas
    • End product is deployable code that passed evaluation
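
The loop described above (a user-supplied evaluator, proposed changes, and a pool of candidates that is recombined over generations) can be sketched in a few lines. This is a toy illustration under loud assumptions, not AlphaEvolve's implementation: the job list, the `(a, b)` priority rule, and every function name are hypothetical; a random perturbation stands in for the LLM's proposals; and plain keep-the-best selection stands in for whatever diversity-preserving scheme the real system uses.

```python
import random

# Hypothetical toy workload: (duration, weight) pairs for jobs on one machine.
JOBS = [(3, 1), (1, 4), (2, 2), (5, 3), (4, 5), (2, 1)]


def evaluate(candidate):
    """User-supplied evaluator (the 'what'): higher is better.

    A candidate (a, b) defines a job priority a*weight - b*duration.
    The score is the negative total weighted completion time.
    """
    a, b = candidate
    order = sorted(JOBS, key=lambda job: a * job[1] - b * job[0], reverse=True)
    clock, cost = 0, 0
    for duration, weight in order:
        clock += duration
        cost += weight * clock
    return -cost


def propose(parent, rng):
    """Stand-in for an LLM proposal: randomly perturb the parent."""
    return tuple(x + rng.gauss(0, 0.5) for x in parent)


def recombine(p1, p2, rng):
    """Combine ideas: take each component from one of the two parents."""
    return tuple(rng.choice(pair) for pair in zip(p1, p2))


def evolve(generations=30, pool_size=8, seed=0):
    rng = random.Random(seed)
    pool = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(pool_size)]
    initial_best = max(evaluate(c) for c in pool)
    for _ in range(generations):
        children = []
        for _ in range(pool_size):
            p1, p2 = rng.sample(pool, 2)
            children.append(propose(recombine(p1, p2, rng), rng))
        # Score parents and children together and keep the top scorers.
        pool = sorted(set(pool + children), key=evaluate, reverse=True)[:pool_size]
    return initial_best, pool[0], evaluate(pool[0])


initial_best, best, best_score = evolve()
print(initial_best, best_score)
```

Because parents compete with their children for a place in the pool, the best score can never get worse from one generation to the next; the evaluator alone decides what survives.
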
  6. 14:43 – 16:52

    Scaling behavior: continual improvement, plateaus, and why iteration counts are hard to predict

    Sarah probes compute and iteration limits; Matej explains AlphaEvolve adapts to difficulty—easy problems resolve quickly, hard ones may require long-running exploration. Predicting required iterations is inherently difficult because true problem difficulty is often surprising, especially for scientific questions.

    • Adaptive scaling: runtime expands with problem difficulty
    • Key engineering challenge: sustaining improvement without early plateau
    • Iteration prediction is hard; difficulty is often only learned empirically
    • Progress can continue long enough to tackle decades-old challenges
  7. 16:52 – 19:42

    Implications for everyday coding agents: the evaluator as the missing ingredient

    Sarah contrasts AlphaEvolve with common developer agents that get lost or hallucinate. Pushmeet argues typical coding prompts are partial specs; without strong evaluation you can’t reliably tell right from wrong. AlphaEvolve turns “hallucinations” into useful exploration by filtering with evaluators and careful stress-testing.

    • General coding agents often operate on incomplete specifications
    • Hallucinations become an asset if you can robustly evaluate candidates
    • Evaluation design includes how deeply to test ideas and on what distributions
    • Key balance: explore creative candidates while filtering for robustness
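
A rough sketch of why a strong evaluator changes the picture: below, three candidate implementations of a sort routine are proposed, two of them plausible but wrong, and an evaluator that tests edge cases as well as random inputs keeps only the correct one. The candidates, the test distributions, and all names are hypothetical examples, not taken from the episode.

```python
import random


def candidate_correct(xs):
    return sorted(xs)


def candidate_drops_duplicates(xs):
    # Plausible-looking but wrong: silently removes repeated values.
    return sorted(set(xs))


def candidate_fast_path_bug(xs):
    # Wrong on short inputs: returns them without sorting.
    return list(xs) if len(xs) < 3 else sorted(xs)


def make_test_inputs(rng):
    """Hand-picked edge cases plus a batch of random lists."""
    inputs = [[], [1], [2, 1], [3, 3, 1]]
    inputs += [
        [rng.randint(0, 5) for _ in range(rng.randint(0, 10))]
        for _ in range(50)
    ]
    return inputs


def passes(candidate, inputs):
    """Evaluator: a candidate must match the reference on every input."""
    return all(candidate(list(xs)) == sorted(xs) for xs in inputs)


def filter_candidates(candidates, seed=0):
    inputs = make_test_inputs(random.Random(seed))
    return [c.__name__ for c in candidates if passes(c, inputs)]


survivors = filter_candidates(
    [candidate_correct, candidate_drops_duplicates, candidate_fast_path_bug]
)
print(survivors)
```

The choice of test distribution matters here: without the short and duplicate-heavy inputs, either wrong candidate could slip through, which is the stress-testing concern raised in this chapter.
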
  8. 19:42 – 25:22

    Beyond perfect simulators: relaxing evaluator requirements with LLM critiques and hybrid signals

    Sarah asks how to overcome evaluator limitations; Matej and Pushmeet outline a spectrum from strict simulators to model-based critique. They cite AI co-scientist as evidence that LLMs can meaningfully evaluate natural-language ideas, and suggest that hybrid or auxiliary evaluators correlated with the true objective can still drive progress.

    • Strict evaluators are powerful, but needing one is not a fundamental limitation
    • LLMs can provide critique/selection signals (AI co-scientist example)
    • Hybrid spectrum: imperfect simulators, auxiliary metrics, correlated signals
    • Potential need for proof-oriented agents to validate properties in some domains
  9. 25:22 – 28:10

    Are we seeing self-improving AI? Infrastructure speedups vs capability gains

    Sarah raises recursive self-improvement, noting AlphaEvolve improved parts of its own training infrastructure. Pushmeet and Matej agree this is an early form of self-improvement, but emphasize it’s currently focused on efficiency (speed/compute) with long feedback loops; improving core cognitive capability remains to be validated and depends on evaluation quality.

    • Demonstrated loop: AlphaEvolve improves training efficiency (compute/time)
    • Harder next step: improvements that make the model fundamentally better at tasks
    • Self-improvement feedback cycles are currently long (months)
    • Open question: do gains saturate, taper, or compound over time?
  10. 28:10 – 31:50

    Scientific discovery outlook: search as the common substrate across disciplines

    Matej discusses applying AlphaEvolve beyond math/CS, noting those fields are easiest today due to readily available evaluators. Pushmeet generalizes: much of science is search over ideas/candidates; as systematic evaluation and simulation improve (e.g., rational drug/material discovery), these agents can provide a ‘search superpower.’

    • Math/CS are first targets due to abundant automated evaluation
    • Other sciences can use simulators/predictive models (e.g., molecule design)
    • AlphaEvolve is an early version; roadmap includes broader applicability
    • Core thesis: AI accelerates science by scaling structured search
  11. 31:50 – 38:29

    Human scientists’ role: problem framing, constraints, and collaboration through interpretable code

    Sarah asks what humans should focus on as real-world evaluation expands (labs/robotics). Pushmeet highlights defining evaluators, constraints, and multi-objective requirements (safety, side effects, delivery, etc.). Matej emphasizes the workflow is collaborative: AlphaEvolve outputs algorithms/code that humans can inspect, learn from, and decide to deploy—often making the algorithmic insight more valuable than the raw solution.

    • Humans define evaluators, constraints, and desirable properties of solutions
    • Multi-objective requirements often require expert judgment and framing
    • AlphaEvolve outputs code/algorithms that can be inspected and understood
    • Collaboration with mathematicians: understanding the ‘how’ drives insight
    • Interpretability varies; learning from discovered structure is a key future area
  12. 38:29 – 40:17

    Access and deployment: trusted tester program, evaluator requirement, and compute costs

    Sarah asks about broader accessibility; Pushmeet describes efforts to expand availability via a trusted tester program. Key gating factors are having suitable evaluation functions and sufficient compute, since AlphaEvolve requires many evaluations and iterations for hard problems.

    • Goal: broaden access beyond internal Google use
    • Trusted tester program to learn best use cases and shape release strategy
    • Main prerequisites: good evaluation functions + significant compute resources
    • Hard problems demand extensive search and repeated evaluation
  13. 40:17 – 42:08

    Applying AlphaEvolve across Google’s stack and closing thoughts

    Sarah asks about future internal applications; Matej notes the white paper aimed to show versatility across data centers, hardware, and software. They hint at more results coming but don’t share specifics, then wrap up the conversation and sign off.

    • Demonstrated breadth: data center efficiency, hardware design, core software
    • AlphaEvolve already used internally on many problems
    • Future applications anticipated; specifics not yet public
    • Conversation wrap-up and podcast outro
