Paul Christiano — Preventing an AI takeover

Talked with Paul Christiano (the world’s leading AI safety researcher) about:

* Does he regret inventing RLHF?
* What do we want the post-AGI world to look like (do we want to keep gods enslaved forever)?
* Why he has relatively modest timelines (40% by 2040, 15% by 2030)
* Why he’s leading the push to get labs to develop responsible scaling policies, and what it would take to prevent an AI coup or bioweapon
* His current research into a new proof system, and how this could solve alignment by explaining a model's behavior
* and much more.

𝐎𝐏𝐄𝐍 𝐏𝐇𝐈𝐋𝐀𝐍𝐓𝐇𝐑𝐎𝐏𝐘

Open Philanthropy is currently hiring for twenty-two different roles to reduce catastrophic risks from fast-moving advances in AI and biotechnology, including grantmaking, research, and operations. For more information and to apply, please see: https://www.openphilanthropy.org/research/new-roles-on-our-gcr-team/

The deadline to apply is November 9th; make sure to check out those roles before they close.

𝐄𝐏𝐈𝐒𝐎𝐃𝐄 𝐋𝐈𝐍𝐊𝐒

* Transcript: https://www.dwarkeshpatel.com/p/paul-christiano
* Apple Podcasts: https://podcasts.apple.com/us/podcast/paul-christiano-preventing-an-ai-takeover/id1516093381?i=1000633226398
* Spotify: https://open.spotify.com/episode/5vOuxDP246IG4t4K3EuEKj?si=VW7qTs8ZRHuQX9emnboGcA
* Follow me on Twitter: https://twitter.com/dwarkesh_sp

𝐓𝐈𝐌𝐄𝐒𝐓𝐀𝐌𝐏𝐒

00:00:00 - What do we want post-AGI world to look like?
00:24:25 - Timelines
00:45:28 - Evolution vs gradient descent
00:54:53 - Misalignment and takeover
01:17:23 - Is alignment dual-use?
01:31:38 - Responsible scaling policies
01:58:25 - Paul’s alignment research
02:35:01 - Will this revolutionize theoretical CS and math?
02:46:11 - How Paul invented RLHF
02:55:10 - Disagreements with Carl Shulman
03:01:53 - Long TSMC but not NVIDIA

Dwarkesh Patel (host), Paul Christiano (guest)

Oct 31, 2023 · 3h 7m

