Dwarkesh Podcast

Ilya Sutskever on the Dwarkesh Podcast: Why RL Overfits the Evals

Why RL training aimed at benchmark evals produces models that ace the tests yet cycle between bugs in real use: Sutskever links this to skipping value functions in the training mix.

Ilya Sutskever (guest) · Dwarkesh Patel (host)
Nov 25, 2025 · 1h 36m · Watch on YouTube ↗

Episode Details

EPISODE INFO

Released
November 25, 2025
Duration
1h 36m
Channel
Dwarkesh Podcast
Watch on YouTube

EPISODE DESCRIPTION

Ilya & I discuss SSI’s strategy, the problems with pre-training, how to improve the generalization of AI models, and how to ensure AGI goes well. 𝐄𝐏𝐈𝐒𝐎𝐃𝐄 𝐋𝐈𝐍𝐊𝐒

SPONSORS

  • Gemini 3 is the first model I’ve used that can find connections I haven’t anticipated. I recently wrote a blog post on RL’s information efficiency, and Gemini 3 helped me think it all through. It also generated the relevant charts and ran toy ML experiments for me with zero bugs. Try Gemini 3 today at https://gemini.google
  • Labelbox helped me create a tool to transcribe our episodes! I’ve struggled with transcription in the past because I don’t just want verbatim transcripts, I want transcripts reworded to read like essays. Labelbox helped me generate the *exact* data I needed for this. If you want to learn how Labelbox can help you (or if you want to try out the transcriber tool yourself), go to https://labelbox.com/dwarkesh
  • Sardine is an AI risk management platform that brings together thousands of device, behavior, and identity signals to help you assess a user’s risk of fraud & abuse. Sardine also offers a suite of agents to automate investigations so that as fraudsters use AI to scale their attacks, you can use AI to scale your defenses. Learn more at https://sardine.ai/dwarkesh

To sponsor a future episode, visit https://dwarkesh.com/advertise

TIMESTAMPS

00:00:00 – Explaining model jaggedness
00:09:39 – Emotions and value functions
00:18:49 – What are we scaling?
00:25:13 – Why humans generalize better than models
00:35:45 – Straight-shotting superintelligence
00:46:47 – SSI’s model will learn from deployment
00:55:07 – Alignment
01:18:13 – “We are squarely an age of research company”
01:29:23 – Self-play and multi-agent
01:32:42 – Research taste

SPEAKERS

  • Ilya Sutskever – guest
  • Dwarkesh Patel – host

EPISODE SUMMARY

In this episode of the Dwarkesh Podcast, Ilya Sutskever looks beyond scaling laws toward deeply generalizing superintelligence. He argues that the era of simply scaling pre-training is ending and that we are re-entering an era where genuine research and new training recipes matter more than raw compute. He highlights a glaring gap between benchmark performance and real-world usefulness, blaming overfitting to evals, weak generalization, and poorly understood RL fine-tuning. Much of the discussion contrasts human learning and robustness with current models, exploring value functions, emotions, evolution, and why humans generalize so much better from far less data. Sutskever outlines SSI’s bet on a different technical path to human-like continual learners, the societal implications of such systems, and his views on alignment, superintelligence, and what “AI going well” might require.

RELATED EPISODES

David Reich – Bronze Age shock, the Neanderthal puzzle, & the sudden spread of farming

Jensen Huang – TPU competition, why we should sell chips to China, & Nvidia’s supply chain moat

Dario Amodei — “We are near the end of the exponential”

Andrej Karpathy — “We’re summoning ghosts, not building animals”

Why Leonardo was a saboteur, Gutenberg went broke, and Florence was weird – Ada Palmer

Richard Sutton – Father of RL thinks LLMs are a dead end
