Episode Details
EPISODE INFO
- Released
- October 31, 2023
- Duration
- 3h 7m
- Channel
- Dwarkesh Podcast
- Watch on YouTube
EPISODE DESCRIPTION
Talked with Paul Christiano (world’s leading AI safety researcher) about:
- Does he regret inventing RLHF?
- What do we want the post-AGI world to look like (do we want to keep gods enslaved forever)?
- Why he has relatively modest timelines (40% by 2040, 15% by 2030)
- Why he’s leading the push to get labs to develop responsible scaling policies, and what it would take to prevent an AI coup or bioweapon
- His current research into a new proof system, and how this could solve alignment by explaining models’ behavior
- and much more.
𝐎𝐏𝐄𝐍 𝐏𝐇𝐈𝐋𝐀𝐍𝐓𝐇𝐑𝐎𝐏𝐘
Open Philanthropy is currently hiring for twenty-two different roles to reduce catastrophic risks from fast-moving advances in AI and biotechnology, including grantmaking, research, and operations. For more information and to apply, please see this application: https://www.openphilanthropy.org/research/new-roles-on-our-gcr-team/ The deadline to apply is November 9th; make sure to check out those roles before they close.
𝐄𝐏𝐈𝐒𝐎𝐃𝐄 𝐋𝐈𝐍𝐊𝐒
- Transcript: https://www.dwarkeshpatel.com/p/paul-christiano
- Apple Podcasts: https://podcasts.apple.com/us/podcast/paul-christiano-preventing-an-ai-takeover/id1516093381?i=1000633226398
- Spotify: https://open.spotify.com/episode/5vOuxDP246IG4t4K3EuEKj?si=VW7qTs8ZRHuQX9emnboGcA
- Follow me on Twitter: https://twitter.com/dwarkesh_sp
𝐓𝐈𝐌𝐄𝐒𝐓𝐀𝐌𝐏𝐒
- 00:00:00 - What do we want the post-AGI world to look like?
- 00:24:25 - Timelines
- 00:45:28 - Evolution vs gradient descent
- 00:54:53 - Misalignment and takeover
- 01:17:23 - Is alignment dual-use?
- 01:31:38 - Responsible scaling policies
- 01:58:25 - Paul’s alignment research
- 02:35:01 - Will this revolutionize theoretical CS and math?
- 02:46:11 - How Paul invented RLHF
- 02:55:10 - Disagreements with Carl Shulman
- 03:01:53 - Long TSMC but not NVIDIA
SPEAKERS
- Dwarkesh Patel (host)
- Paul Christiano (guest)
- Narrator (other)
EPISODE SUMMARY
In this episode of the Dwarkesh Podcast, "Paul Christiano — Preventing an AI Takeover," host Dwarkesh Patel interviews Paul Christiano about timelines, AI coups, and alignment research. Christiano discusses how advanced AI could reshape economics, war, and governance, emphasizing that the most likely failure mode is a gradual handover of real-world control to opaque AI systems rather than a sudden, sci‑fi style ‘escape’.