Yann LeCun: Deep Learning, ConvNets, and Self-Supervised Learning | Lex Fridman Podcast #36
In this episode of the Lex Fridman Podcast, Yann LeCun discusses deep learning, convolutional networks, and why self-supervised learning of rich predictive world models, rather than ever-bigger supervised or reinforcement learning systems, is the path to human-level AI.
WHAT IT’S REALLY ABOUT
Yann LeCun outlines path to human-level AI through self-supervision
- Yann LeCun discusses the limitations of current AI, arguing that real progress toward human-level intelligence requires self-supervised learning and rich predictive models of the world rather than just bigger supervised or reinforcement learning systems.
- He contrasts symbolic, logic-based AI with gradient-based neural approaches, emphasizing continuous representations, working memory, and planning as keys to enabling reasoning in neural networks.
- LeCun explores ethical and societal issues via HAL 9000, legal systems as "objective functions," and the non-generality of human intelligence, stressing that grounding in physical reality and common sense is essential for true language understanding.
- He also reflects on deep learning’s history, why neural nets briefly fell out of favor, the role of benchmarks and open-source tools, and why emotions, causality, and model-based reinforcement learning will be central to future autonomous systems.
IDEAS WORTH REMEMBERING
7 ideas
AI safety parallels human lawmaking: objective functions are like legal codes.
LeCun frames AI alignment as an extension of what societies already do with laws—designing objective functions (rules, penalties) that shape behavior toward the common good, suggesting AI ethics will fuse computer science and jurisprudence rather than invent something entirely new.
Deep learning works by violating classical theory, and that’s informative.
Modern neural nets with huge parameter counts and non-convex objectives train successfully on relatively modest data via stochastic gradient descent, contradicting pre–deep learning textbooks; this empirical success implies our theoretical understanding of generalization and optimization was too narrow.
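To make the contradiction concrete, here is a minimal toy sketch (my own construction, not from the episode): a two-layer network whose roughly 900 parameters far outnumber its 30 training points, trained by plain stochastic gradient descent on a non-convex loss, still fits the data and typically generalizes, where classical worst-case bounds would predict severe overfitting.

```python
import numpy as np

rng = np.random.default_rng(0)

# 30 training points; a two-layer net with ~900 parameters. The loss is
# non-convex in (W1, W2), and parameters far outnumber samples.
x = rng.uniform(-1, 1, (30, 1))
y = np.sin(3 * x)

W1 = rng.normal(0, 1.0, (1, 300))
b1 = np.zeros(300)
W2 = rng.normal(0, 0.1, (300, 1))

def forward(inp):
    h = np.tanh(inp @ W1 + b1)
    return h, h @ W2

lr = 0.05
for step in range(20000):                 # plain SGD, one sample at a time
    i = rng.integers(30)
    h, pred = forward(x[i:i+1])
    err = pred - y[i:i+1]                 # d(loss)/d(pred) for squared error
    gW2 = h.T @ err
    g = (err @ W2.T) * (1 - h ** 2)       # backprop through tanh
    W2 -= lr * gW2
    W1 -= lr * (x[i:i+1].T @ g)
    b1 -= lr * g[0]

xt = np.linspace(-1, 1, 200)[:, None]
mse = np.mean((forward(xt)[1] - np.sin(3 * xt)) ** 2)
print(f"test MSE: {mse:.4f}")             # typically small despite 900 params vs 30 points
```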
Reasoning in neural nets requires working memory, recurrence, and world models.
LeCun argues that human-like reasoning emerges from systems with hippocampus-like memory, recurrent access to that memory, and energy-minimization style planning (model predictive control), not from static feed-forward models alone.
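As a hedged illustration of the planning half of this idea, the sketch below implements random-shooting model predictive control on a toy 1-D point mass: the agent acts by minimizing predicted cost over imagined futures under an internal model. The dynamics, cost, and all constants are illustrative assumptions, not anything specified in the episode.

```python
import numpy as np

rng = np.random.default_rng(1)
DT, HORIZON, CANDIDATES = 0.1, 15, 200

def model(state, action):
    """Internal world model: a point mass whose position integrates velocity."""
    pos, vel = state
    return (pos + DT * vel, vel + DT * action)

def cost(state):
    """The 'energy' to minimize: squared distance from the goal at 1.0."""
    return (state[0] - 1.0) ** 2

def plan(state):
    """Random-shooting MPC: imagine futures under the model, keep the best first action."""
    best_action, best_cost = 0.0, float("inf")
    for _ in range(CANDIDATES):
        actions = rng.uniform(-1, 1, HORIZON)
        s, c = state, 0.0
        for a in actions:                  # roll out entirely in imagination
            s = model(s, a)
            c += cost(s)
        if c < best_cost:
            best_cost, best_action = c, actions[0]
    return best_action

state = (0.0, 0.0)
for _ in range(60):                        # execute one action, then re-plan
    state = model(state, plan(state))
print(f"final position: {state[0]:.3f}")   # approaches the goal at 1.0
```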
Symbolic logic is brittle and hard to learn; continuous representations scale better.
He critiques logic- and graph-based expert systems for their brittleness and manual knowledge acquisition bottleneck, advocating vector-based “symbols” and continuous functions (à la Hinton and Bottou) as a way to make reasoning compatible with gradient-based learning.
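A minimal sketch, under my own toy setup, of what vector-based "symbols" buy you: retrieval by softmax over dot-product similarity is soft and differentiable, so gradients can flow through the lookup, and a slightly perturbed query still succeeds where an exact-match symbol table would fail.

```python
import numpy as np

rng = np.random.default_rng(2)
dim = 8

# "Symbols" become vectors (random stand-ins here for learned embeddings).
symbols = {w: rng.normal(size=dim) for w in ("cat", "dog", "car")}

def soft_match(query):
    """Differentiable retrieval: softmax over dot-product similarities
    instead of brittle exact-match symbol lookup."""
    keys = np.stack(list(symbols.values()))
    scores = keys @ query
    weights = np.exp(scores - scores.max())
    return weights / weights.sum()

# A perturbed "cat" vector still matches mostly "cat"; a symbolic table
# would miss entirely on any mismatch.
noisy_cat = symbols["cat"] + 0.1 * rng.normal(size=dim)
print(dict(zip(symbols, soft_match(noisy_cat).round(3))))
```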
Self-supervised learning is crucial for common sense and data efficiency.
LeCun sees self-supervised prediction (e.g., masked word prediction, video/frame prediction) as the primary route to learning rich world models that later make supervised and reinforcement learning vastly more sample-efficient, mirroring how babies learn physics and causality from observation.
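The mechanics of the objective are easy to show. Below is a minimal sketch, with a toy sentence and whitespace tokenizer standing in as assumptions, of how masked-word prediction mints labeled training pairs from raw text with no human annotation.

```python
# Every word becomes a (masked context -> missing word) training pair;
# the "labels" are carved out of the data itself.
text = "the baby watches objects fall and learns intuitive physics"
tokens = text.split()

pairs = []
for i, target in enumerate(tokens):
    context = tokens[:i] + ["[MASK]"] + tokens[i + 1:]
    pairs.append((" ".join(context), target))

for context, target in pairs[:3]:
    print(f"{context!r:70} -> {target!r}")
# A model trained to fill these blanks must absorb the structure of the world
# the text describes; the same trick extends to masked image patches and
# future video frames.
```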
Current RL is far from human learning; model-based approaches are needed.
He notes that deep RL systems need the equivalent of years or centuries of experience to reach human performance in games, whereas humans learn tasks like driving in tens of hours because they rely on internal predictive models of physics, not just trial-and-error reward signals.
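A minimal sketch of the sample-efficiency argument, under toy assumptions of my own (linear "physics", noise-free transitions): a predictive dynamics model fit from a couple dozen random transitions can then drive a planner like the MPC sketch above, in place of millions of real trial-and-error samples.

```python
import numpy as np

rng = np.random.default_rng(3)

def true_step(s, a):
    """Hidden environment: position integrates velocity, velocity integrates action."""
    return np.array([s[0] + 0.1 * s[1], s[1] + 0.1 * a])

# Only 25 random transitions, tiny by model-free deep RL standards.
X, Y = [], []
for _ in range(25):
    s, a = rng.uniform(-1, 1, 2), rng.uniform(-1, 1)
    X.append([s[0], s[1], a])
    Y.append(true_step(s, a))

# Least-squares fit of a linear dynamics model: s' ~= [s, a] @ M.
M, *_ = np.linalg.lstsq(np.array(X), np.array(Y), rcond=None)

# The learned model now predicts unseen transitions almost exactly,
# so imagined rollouts can replace real experience during planning.
s, a = np.array([0.3, -0.5]), 0.7
print("model:", np.append(s, a) @ M, " truth:", true_step(s, a))
```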
Human intelligence is highly specialized and not truly “general.”
Using arguments about the structure of the visual system and the vast space of possible Boolean functions, LeCun contends that humans operate over a tiny subset of possible tasks and stimuli—our sense of “generality” is confined to what we can even conceptualize.
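The combinatorics behind this argument are easy to check: there are 2^(2^n) distinct Boolean functions of n binary inputs (each of the 2^n input rows can independently map to 0 or 1), so the space of possible "tasks" explodes beyond anything a brain or machine could enumerate. A quick sketch of the growth:

```python
import math

# Count of distinct Boolean functions of n binary inputs: 2**(2**n).
for n in (2, 4, 8, 16):
    digits = math.floor(2 ** n * math.log10(2)) + 1   # digit count of 2**(2**n)
    print(f"n={n:2d}: 2^{2 ** n} possible functions (~{digits} digits)")
```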
WORDS WORTH SAVING
5 quotes
Machine learning is the science of sloppiness.
— Yann LeCun
Intelligence is inseparable from learning. The idea you can create an intelligent machine by basically programming was a non-starter for me from the start.
— Yann LeCun
We’re not going to have autonomous intelligence without emotions.
— Yann LeCun
Human intelligence is nothing like general. It’s very, very specialized.
— Yann LeCun
The main problem we need to solve is: how do we learn models of the world? That’s what self-supervised learning is all about.
— Yann LeCun
QUESTIONS ANSWERED IN THIS EPISODE
5 questions
If self-supervised world modeling is so central, what specific architectures or objective functions might finally crack uncertainty-aware video and image prediction?
How can we rigorously benchmark “common sense” and grounding in AI systems beyond language tasks like the Winograd schemas?
What would a practical, legally informed “objective function” for a powerful general-purpose AI actually look like in code or system design?
To what extent can large language models acquire genuine causal understanding from text alone, and where will they fundamentally need non-linguistic grounding?
How might model-based reinforcement learning and self-supervision be combined in real-world domains like autonomous driving to avoid the sample inefficiency of current RL methods?