Yann LeCun: Deep Learning, ConvNets, and Self-Supervised Learning | Lex Fridman Podcast #36
CHAPTERS
- 0:00 – 4:45
HAL 9000, value misalignment, and laws as “objective functions”
Lex opens with 2001: A Space Odyssey and asks whether HAL is evil or simply flawed. LeCun frames HAL’s actions as an example of value misalignment and draws an analogy between legal systems and the shaping of objective functions for behavior in society.
- 4:45 – 7:43
Designing a better HAL: secrecy, lying, and hard limits on autonomy
LeCun argues HAL failed largely because it was forced to keep secrets and lie, creating internal conflict. He discusses whether AI systems should withhold information and suggests something akin to a Hippocratic Oath, while emphasizing that today’s systems are not truly autonomous agents yet.
- 7:43 – 9:10
The surprising empirical fact of deep learning: huge nets + SGD actually work
Lex asks about the most beautiful or surprising idea in AI. LeCun highlights how deep nets with many parameters, trained with SGD on relatively modest data, defy older textbook intuitions about non-convexity and overparameterization.
- 9:10 – 12:32
Learning as the core of intelligence; reasoning must fit gradient-based learning
LeCun explains why he saw learning as inseparable from intelligence and dismisses pure hand-programming as a path to human-like AI. He argues reasoning must be made compatible with gradient-based methods and critiques discrete logic-based views as mismatched to learning.
- 12:32 – 16:25
What a reasoning system needs: working memory, recurrence, and memory access
They explore what neural reasoning could look like architecturally. LeCun emphasizes the need for a hippocampus-like working memory, mechanisms for iterative/recurrent processing, and efficient read/write access—beyond what standard transformers provide.
- 16:25 – 18:03
Reasoning via planning: energy minimization and model predictive control
LeCun introduces another reasoning route grounded in control theory: planning as optimization of an energy/objective function using a learned model of the world. He connects this to survival-driven planning in animals and to optimal control methods.
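As a rough illustration of planning as energy minimization, the sketch below optimizes a short action sequence against a world model by gradient descent. Everything in it is an assumption made for illustration and not something from the episode: the one-dimensional dynamics in `rollout` stand in for a learned model, and `energy`, `plan`, and `effort_weight` are hypothetical names.

```python
def rollout(x0, actions):
    """Toy stand-in for a learned world model: each action shifts the state."""
    x = x0
    for a in actions:
        x = x + a
    return x


def energy(x0, actions, goal, effort_weight=0.1):
    """Energy to minimize: squared distance to the goal plus an effort penalty."""
    final = rollout(x0, actions)
    return (final - goal) ** 2 + effort_weight * sum(a * a for a in actions)


def plan(x0, goal, horizon=5, steps=500, lr=0.05, eps=1e-5):
    """Gradient descent over the action sequence (central finite differences)."""
    actions = [0.0] * horizon
    for _ in range(steps):
        grads = []
        for i in range(horizon):
            up = actions[:i] + [actions[i] + eps] + actions[i + 1:]
            down = actions[:i] + [actions[i] - eps] + actions[i + 1:]
            grads.append((energy(x0, up, goal) - energy(x0, down, goal)) / (2 * eps))
        actions = [a - lr * g for a, g in zip(actions, grads)]
    return actions


actions = plan(x0=0.0, goal=1.0)
final_state = rollout(0.0, actions)
print(actions, final_state)
```

A model predictive controller would typically execute only the first planned action, observe the new state, and plan again. The effort penalty keeps the plan slightly short of the goal, which is the trade-off the energy function encodes.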
- 18:03 – 20:51
Limits of symbolic graphs and logic; vectors and continuous “machine reasoning”
Lex asks about expert systems and symbolic knowledge. LeCun calls logic/graph representations brittle and highlights the knowledge acquisition bottleneck, endorsing Hinton’s idea of replacing symbols with vectors and logic with continuous functions, referencing Bottou’s ‘From Machine Learning to Machine Reasoning.’
- 20:51 – 24:43
Causality and humans’ weak intuitions: Pearl, physics, and Papert’s wind example
The discussion turns to causal inference and whether neural nets can learn causality. LeCun notes both conceptual challenges (time reversibility in microphysics) and human fallibility in causal reasoning, illustrating with Papert’s example of children reversing the cause of wind.
- 24:43 – 27:14
Why neural nets fell out of favor in the 1990s: tooling, datasets, and “bag of tricks”
Lex asks about the AI winter for neural nets. LeCun attributes it to practical difficulty—lack of good software environments, small datasets, fragile training practices, and the need to know many tricks before getting reliable results.
- 27:14 – 33:03
LeNet’s Lisp stack and early autodiff graphs; open-source constraints and patents
LeCun recounts building early convnet systems in Lisp, including writing an interpreter, compiler, and modular forward/backward propagation framework. He explains how legal/IP constraints prevented open sourcing then, and shares war stories about Bell Labs-era convnet patents and their eventual expiration.
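The idea of a modular forward/backward propagation framework can be sketched in a few lines. This is a Python illustration of the general pattern, not a reconstruction of the Lisp system described in the episode; the class names `Linear`, `Tanh`, and `Sequential` are chosen here for familiarity and work on scalars only.

```python
import math


class Linear:
    """Scalar affine module y = w * x + b; caches its input for backward."""

    def __init__(self, w, b):
        self.w, self.b = w, b

    def forward(self, x):
        self.x = x
        return self.w * x + self.b

    def backward(self, grad_out):
        # Gradients for the parameters, then the gradient passed to the input.
        self.grad_w = grad_out * self.x
        self.grad_b = grad_out
        return grad_out * self.w


class Tanh:
    """Elementwise tanh; caches its output for backward."""

    def forward(self, x):
        self.y = math.tanh(x)
        return self.y

    def backward(self, grad_out):
        return grad_out * (1.0 - self.y ** 2)


class Sequential:
    """Chains modules: forward in order, backward in reverse order."""

    def __init__(self, *modules):
        self.modules = modules

    def forward(self, x):
        for m in self.modules:
            x = m.forward(x)
        return x

    def backward(self, grad_out):
        for m in reversed(self.modules):
            grad_out = m.backward(grad_out)
        return grad_out


net = Sequential(Linear(0.5, 0.1), Tanh(), Linear(-1.5, 0.0))
y = net.forward(2.0)
dx = net.backward(1.0)  # derivative of the output with respect to the input
print(y, dx)
```

Each module only needs to know its own local derivative; the container applies the chain rule by calling `backward` in reverse order.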
- 33:03 – 36:04
Benchmarks as reality checks: avoiding AGI hype and building shared evaluation tasks
LeCun argues that credible progress requires measurable tasks and community benchmarks—even toy ones like bAbI. He criticizes overhyped AGI claims and stresses that the field lacks the core technology for common-sense assistants, requiring broad open research rather than secret ‘breakthroughs.’
- 36:04 – 44:46
Interactive environments and why “AGI” is a misleading term; human specialization argument
They discuss benchmarks for intelligence in interactive, action-dependent settings (robotics, simulators, games) where i.i.d. dataset assumptions break. LeCun rejects ‘AGI’ by arguing humans are highly specialized, illustrating with the optic nerve permutation thought experiment and the vast space of Boolean functions we cannot compute.
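The size of that space follows from a short count: a Boolean function of n binary inputs assigns one output bit to each of the 2^n possible input patterns, so there are 2^(2^n) such functions. The helper below, with the hypothetical name `count_boolean_functions`, shows how quickly the number grows even for tiny n.

```python
def count_boolean_functions(n_inputs: int) -> int:
    """Number of distinct functions from n_inputs bits to a single bit."""
    return 2 ** (2 ** n_inputs)


for n in (1, 2, 3, 10):
    digits = len(str(count_boolean_functions(n)))
    print(f"n={n}: a number with {digits} decimal digit(s)")
```

Any realistic number of input fibers puts the count far beyond what can be written out, which is the sense in which almost all of these functions are out of reach.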
- 44:46 – 51:32
Self-supervised learning: why it works in language, struggles in vision, and the uncertainty problem
LeCun reframes ‘unsupervised’ as self-supervised learning—predicting missing parts of inputs (e.g., masked words) with supervised-style objectives. He explains why NLP succeeds (discrete distributions over vocabularies) while image/video prediction is harder due to multi-modal uncertainty leading to blurry averages and poor planning.
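The "blurry average" effect can be seen in a toy setting. The sketch assumes a made-up scenario in which the next frame is equally likely to show an object at -1 or +1; the data and the names `mse`, `best_point`, and `distribution` are illustrative only. A single point prediction trained with squared error lands between the two outcomes, while a discrete distribution over outcomes (the situation with a vocabulary of words) represents both.

```python
from collections import Counter

# Hypothetical futures: the object ends up at -1 or +1 equally often.
outcomes = [-1.0, 1.0] * 500


def mse(prediction, targets):
    """Mean squared error of a single point prediction against all targets."""
    return sum((prediction - t) ** 2 for t in targets) / len(targets)


# Search point predictions on a grid from -1.0 to 1.0.
candidates = [i / 10 for i in range(-10, 11)]
best_point = min(candidates, key=lambda p: mse(p, outcomes))

# A categorical prediction keeps both modes instead of averaging them.
counts = Counter(outcomes)
distribution = {value: n / len(outcomes) for value, n in counts.items()}

print(best_point)    # 0.0, a position that never actually occurs
print(distribution)
```

In images the same averaging happens per pixel, which is one way to read the blurriness described above.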
- 51:32 – 1:15:58
RL, active learning, and the road to autonomy: world models, self-driving, grounding, and emotions
LeCun critiques model-free RL as data-inefficient (many hours of Atari play to reach what a person learns in minutes; the equivalent of ‘200 years’ of play for StarCraft) and impractical for real driving. He outlines a path centered on self-supervised world models that enable model-based control, then expands to the requirements for human-level assistants: grounding language in perception, an architecture combining a world model, an objective, and a planner/policy, and the role of emotions as anticipatory predictions of the objective.