Stanford Online

Stanford CS230 | Autumn 2025 | Lecture 5: Deep Reinforcement Learning

For more information about Stanford’s Artificial Intelligence professional and graduate programs, visit: https://stanford.io/ai

October 21, 2025
This lecture covers deep reinforcement learning.

To learn more about enrolling in this course, visit: https://online.stanford.edu/courses/cs230-deep-learning
To follow along with the course schedule and syllabus, visit: https://cs230.stanford.edu/syllabus/
More lectures will be published regularly. View the playlist: https://www.youtube.com/playlist?list=PLoROMvodv4rNRRGdS0rBbXOUGA0wjdh1X

Andrew Ng
Founder of DeepLearning.AI
Adjunct Professor, Stanford University’s Computer Science Department

Kian Katanforoosh
CEO and Founder of Workera
Adjunct Lecturer, Stanford University’s Computer Science Department

Kian Katanforoosh (host)
Oct 31, 2025 · 1h 45m

Episode Details

EPISODE INFO

Released: October 31, 2025
Duration: 1h 45m
Channel: Stanford Online

SPEAKERS

  • Kian Katanforoosh (host): Co-instructor for Stanford CS230 (Deep Learning) and CEO/co-founder of Workera.

EPISODE SUMMARY

In this Stanford Online episode, Kian Katanforoosh presents Stanford CS230 (Autumn 2025) Lecture 5: Deep Reinforcement Learning, which covers deep reinforcement learning fundamentals, deep Q-networks, and RLHF for LLMs. Reinforcement learning (RL) is framed as learning good sequences of decisions from experience, especially when labels are delayed and supervised learning targets are ill-defined (e.g., Go).
