Dwarkesh Podcast: Ilya Sutskever on Why RL Overfits the Evals
Why RL training that targets benchmark evals produces models that ace the tests yet cycle through the same bugs: Sutskever ties this to the absence of value functions in the training mix.
Ilya Sutskever (guest) · Dwarkesh Patel (host)
CHAPTERS
- 0:00 – 9:39
Explaining model jaggedness
- 9:39 – 18:49
Emotions and value functions
- 18:49 – 25:13
What are we scaling?
- 25:13 – 35:45
Why humans generalize better than models
- 35:45 – 46:47
Straight-shotting superintelligence
- 46:47 – 55:07
SSI’s model will learn from deployment
- 55:07 – 1:18:13
Alignment
- 1:18:13 – 1:29:23
“We are squarely an age of research company”
- 1:29:23 – 1:32:42
Self-play and multi-agent
- 1:32:42 – 1:36:03
Research taste