Dwarkesh Podcast

Richard Sutton on Dwarkesh Patel: Why LLMs Lack a Goal

How temporal difference learning gives AI a ground truth that LLMs lack: Sutton argues that without reward signals, there is no right or wrong action.

Guest: Richard Sutton
Host: Dwarkesh Patel
Sep 26, 2025 · 1h 7m

CHAPTERS

  1. Are LLMs a dead end? (0:00 – 13:51)
  2. Do humans do imitation learning? (13:51 – 23:57)
  3. The Era of Experience (23:57 – 34:25)
  4. Current architectures generalize poorly out of distribution (34:25 – 42:17)
  5. Surprises in the AI field (42:17 – 47:28)
  6. Will The Bitter Lesson still apply after AGI? (47:28 – 54:35)
  7. Succession to AI (54:35 – 1:07:08)
