Dwarkesh Podcast

Sholto Douglas & Trenton Bricken — How LLMs actually think

Had so much fun chatting with my good friends Trenton Bricken and Sholto Douglas on the podcast. No way to summarize it, except:

  • This is the best context dump out there on how LLMs are trained, what capabilities they're likely to soon have, and what exactly is going on inside them.
  • You would be shocked how much of what I know about this field, I've learned just from talking with them.
  • To the extent that you've enjoyed my other AI interviews, now you know why.

There's a transcript with links to all the papers the boys were throwing down - may help you follow along.

EPISODE LINKS

  • Transcript: https://www.dwarkeshpatel.com/p/sholto-douglas-trenton-bricken
  • Spotify: https://open.spotify.com/episode/2dtDauiE4v8ldNRqPFq0uP?si=7S4n69QuTjeYz0lZwW4xIw
  • Apple Podcasts: https://podcasts.apple.com/us/podcast/sholto-douglas-trenton-bricken-how-to-build-understand/id1516093381?i=1000650748087
  • Trenton Bricken's twitter: https://twitter.com/TrentonBricken
  • Sholto Douglas's twitter: https://twitter.com/_sholtodouglas

TIMESTAMPS

  • 00:00:00 - Long contexts
  • 00:17:04 - Intelligence is just associations
  • 00:33:27 - Intelligence explosion & great researchers
  • 01:07:44 - Superposition & secret communication
  • 01:23:26 - Agents & true reasoning
  • 01:35:32 - How Sholto & Trenton got into AI research
  • 02:08:08 - Are feature spaces the wrong way to think about intelligence?
  • 02:22:04 - Will interp actually work on superhuman models
  • 02:45:57 - Sholto's technical challenge for the audience
  • 03:04:49 - Rapid fire

Dwarkesh Patel (host), Trenton Bricken (guest), Sholto Douglas (guest)
Mar 28, 2024 · 3h 13m

Episode Details

EPISODE INFO

Released: March 28, 2024
Duration: 3h 13m
Channel: Dwarkesh Podcast


SPEAKERS

  • Narrator (other)
  • Dwarkesh Patel (host)
  • Trenton Bricken (guest)
  • Sholto Douglas (guest)

EPISODE SUMMARY

Inside LLM Minds: Context Windows, Features, and Future Superintelligence

In this episode of Dwarkesh Podcast, Dwarkesh Patel interviews Google’s Sholto Douglas and Anthropic’s Trenton Bricken about how large language models work internally, why long context windows matter, and what an “intelligence explosion” might actually look like from the perspective of frontier researchers.

RELATED EPISODES

Machiavelli is the most misunderstood thinker of all time – Ada Palmer

The better AI gets, the smaller its share of the economy might get – Alex Imas and Phil Trammell

What rebuilding AlphaGo teaches us about self-play, RL, and future of LLMs - Eric Jang

How GPT, Claude, and Gemini are actually trained and served – Reiner Pope

Jensen Huang – Will Nvidia’s moat persist?

Terence Tao – How the world’s top mathematician uses AI
