David Silver: AlphaGo, AlphaZero, and Deep Reinforcement Learning | Lex Fridman Podcast #86

David Silver leads the reinforcement learning research group at DeepMind. He was lead researcher on AlphaGo and AlphaZero, co-lead on AlphaStar, and contributed to MuZero and a lot of other important work in reinforcement learning.

Support this podcast by signing up with these sponsors:
- MasterClass: https://masterclass.com/lex
- Cash App - use code "LexPodcast" and download:
  - Cash App (App Store): https://apple.co/2sPrUHe
  - Cash App (Google Play): https://bit.ly/2MlvP5w

EPISODE LINKS:
Reinforcement learning (book): https://amzn.to/2Jwp5zG

PODCAST INFO:
Podcast website: https://lexfridman.com/podcast
Apple Podcasts: https://apple.co/2lwqZIr
Spotify: https://spoti.fi/2nEwCF8
RSS: https://lexfridman.com/feed/podcast/
Full episodes playlist: https://www.youtube.com/playlist?list=PLrAXtmErZgOdP_8GztsuKi9nrraNbKKp4
Clips playlist: https://www.youtube.com/playlist?list=PLrAXtmErZgOeciFP3CBCIEElOJeitOr41

CONNECT:
- Subscribe to this YouTube channel
- Twitter: https://twitter.com/lexfridman
- LinkedIn: https://www.linkedin.com/in/lexfridman
- Facebook: https://www.facebook.com/LexFridmanPage
- Instagram: https://www.instagram.com/lexfridman
- Medium: https://medium.com/@lexfridman
- Support on Patreon: https://www.patreon.com/lexfridman

Host: Lex Fridman
Guest: David Silver

Apr 3, 2020 · 1h 48m

CHAPTERS

0:00 – 4:09: Introduction
4:09 – 11:11: First program
11:11 – 21:42: AlphaGo
21:42 – 25:37: Rule of the game of Go
25:37 – 30:15: Reinforcement learning: personal journey
30:15 – 43:51: What is reinforcement learning?
43:51 – 53:40: AlphaGo (continued)
53:40 – 1:06:12: Supervised learning and self play in AlphaGo
1:06:12 – 1:08:57: Lee Sedol retirement from Go play
1:08:57 – 1:14:10: Garry Kasparov
1:14:10 – 1:31:29: Alpha Zero and self play
1:31:29 – 1:35:21: Creativity in AlphaZero
1:35:21 – 1:37:59: AlphaZero applications
1:37:59 – 1:40:51: Reward functions
1:40:51 – 1:48:00: Meaning of life
