Lex Fridman Podcast

Anca Dragan: Human-Robot Interaction and Reward Engineering | Lex Fridman Podcast #81

Anca Dragan is a professor at Berkeley, working on human-robot interaction -- algorithms that look beyond the robot's function in isolation and generate robot behavior that accounts for interaction and coordination with human beings.

This episode is presented by Cash App. Download it and use code "LexPodcast":
Cash App (App Store): https://apple.co/2sPrUHe
Cash App (Google Play): https://bit.ly/2MlvP5w

PODCAST INFO:
Podcast website: https://lexfridman.com/podcast
Apple Podcasts: https://apple.co/2lwqZIr
Spotify: https://spoti.fi/2nEwCF8
RSS: https://lexfridman.com/feed/podcast/
Full episodes playlist: https://www.youtube.com/playlist?list=PLrAXtmErZgOdP_8GztsuKi9nrraNbKKp4
Clips playlist: https://www.youtube.com/playlist?list=PLrAXtmErZgOeciFP3CBCIEElOJeitOr41

EPISODE LINKS:
Anca's Twitter: https://twitter.com/ancadianadragan
Anca's Website: https://people.eecs.berkeley.edu/~anca/

OUTLINE:
0:00 - Introduction
2:26 - Interest in robotics
5:32 - Computer science
7:32 - Favorite robot
13:25 - How difficult is human-robot interaction?
32:01 - HRI application domains
34:24 - Optimizing the beliefs of humans
45:59 - Difficulty of driving when humans are involved
1:05:02 - Semi-autonomous driving
1:10:39 - How do we specify good rewards?
1:17:30 - Leaked information from human behavior
1:21:59 - Three laws of robotics
1:26:31 - Book recommendation
1:29:02 - If a doctor gave you 5 years to live...
1:32:48 - Small act of kindness
1:34:31 - Meaning of life

CONNECT:
- Subscribe to this YouTube channel
- Twitter: https://twitter.com/lexfridman
- LinkedIn: https://www.linkedin.com/in/lexfridman
- Facebook: https://www.facebook.com/LexFridmanPage
- Instagram: https://www.instagram.com/lexfridman
- Medium: https://medium.com/@lexfridman
- Support on Patreon: https://www.patreon.com/lexfridman

Lex Fridman (host) · Anca Dragan (guest)
Mar 18, 2020 · 1h 38m

At a glance

WHAT IT’S REALLY ABOUT

Designing Robots That Understand Human Intent, Limits, and Preferences

Lex Fridman and Anca Dragan discuss human-robot interaction with a focus on how robots can model, predict, and adapt to human behavior and preferences. They explore inverse reinforcement learning, rationality assumptions, and how to reinterpret human 'irrationality' as optimal behavior under different beliefs, constraints, or physics models. A major theme is reward design: how hard it is to specify objectives that elicit the right behavior in all situations, and how robots can learn from 'leaked' information in human actions, corrections, the environment, and even emergency stops. They also touch on autonomous driving, semi-autonomous systems, ethical and social dimensions of robots, and broader reflections on meaning, mortality, and what it means to build AI that truly serves humans.

IDEAS WORTH REMEMBERING

5 ideas

Model humans as approximately rational—but under their own beliefs and constraints.

Rather than dismissing people as irrational, robots can treat human behavior as roughly optimal given different world models, planning horizons, or intuitive physics; this shift makes behavior more predictable and supports better assistance and coordination.

Use inverse reinforcement learning to infer what people want from what they do.

By assuming actions are (noisily) optimal for some underlying reward, robots can infer user preferences or driving styles from demonstrations and then optimize accordingly, instead of relying solely on hand-specified objectives.
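Conceptually, this can be sketched as Bayesian inference under a Boltzmann-rationality model: the human is assumed to pick actions with probability proportional to exp(β·Q), and the robot updates a posterior over candidate rewards from what it observes. The reward hypotheses, actions, and Q-values below are hypothetical illustrations, not anything from the episode.

```python
import numpy as np

def boltzmann_policy(q_values, beta=2.0):
    """P(action | reward hypothesis): softmax over that hypothesis's Q-values."""
    logits = beta * np.asarray(q_values, dtype=float)
    logits -= logits.max()                      # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()

def infer_reward(q_table, observed_actions, beta=2.0):
    """Posterior over reward hypotheses given observed action indices.

    q_table: shape (n_hypotheses, n_steps, n_actions) -- the Q-values each
    candidate reward assigns to the available actions at each step.
    """
    q_table = np.asarray(q_table, dtype=float)
    n_hyp = q_table.shape[0]
    posterior = np.full(n_hyp, 1.0 / n_hyp)     # uniform prior
    for t, a in enumerate(observed_actions):
        for h in range(n_hyp):
            posterior[h] *= boltzmann_policy(q_table[h, t], beta)[a]
        posterior /= posterior.sum()            # renormalize after each observation
    return posterior

# Two hypothetical driving styles: "cautious" prefers action 0 (brake),
# "aggressive" prefers action 1 (accelerate), at every step.
q_table = [
    [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]],      # cautious reward's Q-values
    [[0.0, 1.0], [0.0, 1.0], [0.0, 1.0]],      # aggressive reward's Q-values
]
posterior = infer_reward(q_table, observed_actions=[0, 0, 0])
print(posterior)  # mass shifts strongly toward the "cautious" hypothesis
```

The "noisily optimal" assumption lives in β: larger β models a more reliably optimal human, smaller β a noisier one.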

Let robots act to gather information, not just passively predict humans.

Robots can nudge, probe, or test the environment (e.g., a car edging toward a neighboring lane to see if a driver yields) to learn about human behavior or preferences more quickly and robustly than passive observation allows.
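One way to make "acting to gather information" concrete is to score each candidate action by the expected posterior entropy it leaves over hypotheses about the human, and pick the action expected to leave the least uncertainty. The scenario and probabilities below are hypothetical, chosen only to mirror the lane-nudging example.

```python
import numpy as np

def entropy(p):
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -(p * np.log(p)).sum()

def expected_posterior_entropy(prior, likelihoods):
    """E_o[ H(P(hypothesis | observation)) ] for one action.

    likelihoods[h][o] = P(observation o | hypothesis h, this action).
    """
    prior = np.asarray(prior, dtype=float)
    likelihoods = np.asarray(likelihoods, dtype=float)
    total = 0.0
    for o in range(likelihoods.shape[1]):
        joint = prior * likelihoods[:, o]
        p_o = joint.sum()
        if p_o > 0:
            total += p_o * entropy(joint / p_o)  # weight by P(observation)
    return total

prior = [0.5, 0.5]  # hypotheses: the other driver is attentive / distracted
actions = {
    # "stay": the observation is independent of attentiveness -> no information
    "stay":  [[0.5, 0.5], [0.5, 0.5]],
    # "nudge" toward their lane: attentive drivers usually yield, distracted don't
    "nudge": [[0.9, 0.1], [0.2, 0.8]],
}
scores = {a: expected_posterior_entropy(prior, lik) for a, lik in actions.items()}
best = min(scores, key=scores.get)
print(best)  # "nudge" -- it is expected to leave less uncertainty
```

Passive observation corresponds to always choosing "stay"; the point of the episode's example is that the probing action buys information the robot could not get by watching alone.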

Treat human behavior as part of an underactuated system you influence but don’t control.

Humans are like degrees of freedom you cannot command directly but can shape through your actions; planning should account for the fact that people change their behavior in response to what robots do, not just vice versa.
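A toy way to see the underactuated-system framing: the robot cannot command the human's action, but it can simulate the human's response to each of its own choices and optimize through that response, rather than treating the human as a fixed obstacle. The merging payoffs below are hypothetical.

```python
# Merging scenario: robot picks an action; the human then best-responds
# under their (assumed) utility; the robot scores the joint outcome.
human_utility = {
    ("assert", "yield"): 1, ("assert", "speed_up"): -5,
    ("wait",   "yield"): 0, ("wait",   "speed_up"): 2,
}
robot_utility = {
    ("assert", "yield"): 3, ("assert", "speed_up"): -10,
    ("wait",   "yield"): 1, ("wait",   "speed_up"): 1,
}

def human_response(robot_action):
    # The human "degree of freedom" moves on its own, in reaction to the robot.
    return max(("yield", "speed_up"),
               key=lambda h: human_utility[(robot_action, h)])

def plan():
    # The robot optimizes over its own action while simulating the
    # human's response -- influence without direct control.
    return max(("assert", "wait"),
               key=lambda r: robot_utility[(r, human_response(r))])

print(plan())  # "assert": the human is predicted to yield, which both prefer
```

A planner that instead froze the human's behavior (treating them as part of the static environment) would miss that asserting *causes* the yield, which is exactly the coupling the underactuated view captures.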

Reward design is brittle; assume specified rewards are evidence, not ground truth.

Engineers rarely write perfect reward functions—agents can optimize them in unintended ways (Goodhart’s law); robots should treat designer-specified objectives as noisy signals about the true human desiderata and keep uncertainty over what they should optimize.
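This "reward as evidence" idea can be sketched in the spirit of inverse reward design: the designer's proxy reward is likely *because* it produced good behavior in the training environment, so the robot keeps a posterior over candidate true rewards instead of trusting the proxy outright. The features, weights, and environments below are hypothetical.

```python
import numpy as np

def trajectory_return(weights, features):
    return float(np.dot(weights, features))

def posterior_over_true_rewards(proxy, candidates, train_features, beta=5.0):
    """P(true reward w | proxy) ∝ P(designer wrote proxy | w), uniform prior.

    The designer is modeled as approximately rational: the proxy is more
    likely under w when the proxy's preferred training trajectory also
    scores well according to w.
    """
    # Trajectory the proxy reward prefers in the training environment.
    best_traj = max(train_features, key=lambda f: trajectory_return(proxy, f))
    scores = np.array([beta * trajectory_return(w, best_traj) for w in candidates])
    scores -= scores.max()                     # numerical stability
    post = np.exp(scores)
    return post / post.sum()

# Features: [progress, lawn_damage]. The proxy forgot to penalize lawns.
proxy = np.array([1.0, 0.0])
candidates = [np.array([1.0, 0.0]),           # proxy taken at face value
              np.array([1.0, -2.0])]          # lawn-averse "true" reward
# In training no trajectory ever crossed a lawn, so the proxy is equally
# consistent with both hypotheses:
train_features = [np.array([1.0, 0.0]), np.array([0.5, 0.0])]
post = posterior_over_true_rewards(proxy, candidates, train_features)
print(post)  # ~[0.5, 0.5]: stay uncertain, and stay cautious at deployment
```

The payoff of keeping this uncertainty is at deployment: when a novel situation (a lawn) distinguishes the hypotheses, a robot that treated the proxy as ground truth would confidently Goodhart it, while one holding a posterior can hedge or ask.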

WORDS WORTH SAVING

5 quotes

Maybe people are operating this thing, but assuming a much more simplified physics model… and under those assumptions, their behavior actually makes sense.

Anca Dragan

When the robot moves in an optimal way and I intervene, that means I disagree with its notion of optimality.

Anca Dragan

We’ve moved the tuning from the behavior side into the reward side, and it still seems really hard to anticipate every possible situation.

Anca Dragan

Our world is something that we’ve been acting in according to our preferences; the environment itself leaks information about what people want.

Anca Dragan

It’s such a great privilege to exist that the idea of being told I’m going to die is my biggest nightmare.

Anca Dragan

Anca Dragan's path into robotics, AI, and human-robot interaction

Expressive robot motion and anthropomorphism (e.g., WALL-E, Boston Dynamics)

Human modeling: rationality, inverse reinforcement learning, and intuitive physics

Human-robot collaboration as a game-theoretic and underactuated control problem

Autonomous driving: interacting with human drivers, pedestrians, and semi-autonomous systems

Reward design and reward learning (mis-specification, Goodhart's law, leaked information)

Ethical, social, and philosophical questions: how we treat robots, mortality, and meaning

High quality AI-generated summary created from speaker-labeled transcript.
