Dwarkesh Podcast

Ilya Sutskever on Dwarkesh Patel: Why RL Overfits the Evals

Why RL that targets benchmark evals produces models that ace GPT-3 tests yet cycle between bugs: Sutskever links this to skipping value functions in the training mix.

Ilya Sutskever (guest), Dwarkesh Patel (host)
Nov 25, 2025 · 1h 36m

CHAPTERS

  1. 0:00 – 9:39

    Explaining model jaggedness

  2. 9:39 – 18:49

    Emotions and value functions

  3. 18:49 – 25:13

    What are we scaling?

  4. 25:13 – 35:45

    Why humans generalize better than models

  5. 35:45 – 46:47

    Straight-shotting superintelligence

  6. 46:47 – 55:07

    SSI’s model will learn from deployment

  7. 1:18:13 – 1:29:23

    “We are squarely an age of research company”

  8. 1:29:23 – 1:32:42

    Self-play and multi-agent

  9. 1:32:42 – 1:36:03

    Research taste
