Yann LeCun: Deep Learning, ConvNets, and Self-Supervised Learning | Lex Fridman Podcast #36
- 0:00 – 15:00
- LFLex Fridman
The following is a conversation with Yann LeCun. He's considered to be one of the fathers of deep learning, which, if you've been hiding under a rock, is the recent revolution in AI that's captivated the world with the possibility of what machines can learn from data. He's a professor at New York University, a vice president and chief AI scientist at Facebook, and co-recipient of the Turing Award for his work on deep learning. He's probably best known as the founding father of convolutional neural networks, in particular, their application to optical character recognition and the famed MNIST dataset. He is also an outspoken personality, unafraid to speak his mind in a distinctive French accent and explore provocative ideas, both in the rigorous medium of academic research and the somewhat less rigorous medium of Twitter and Facebook. This is the Artificial Intelligence podcast. If you enjoy it, subscribe on YouTube, give it five stars on iTunes, support it on Patreon, or simply connect with me on Twitter, @lexfridman, spelled F-R-I-D-M-A-N. And now, here's my conversation with Yann LeCun. You said that 2001: A Space Odyssey is one of your favorite movies. HAL 9000 decides to get rid of the astronauts, for people who haven't seen the movie, spoiler alert, because he, it, she believes the astronauts will interfere with the mission. Do you see HAL as flawed in some fundamental way, or even evil, or did he do the right thing?
- YLYann LeCun
Neither. There's no notion of evil in that context, other than the fact that people die. But it was an example of what people call value misalignment, right? You give an objective to a machine, and the machine strives to achieve this objective. And if you don't put any constraints on this objective, like "don't kill people" and "don't do things like this," the machine, given the power, will do stupid things just to achieve this objective, or damaging things to achieve this objective. It's a little bit like... I mean, we are used to this in the context of human society. We put in place laws to prevent people from doing bad things, because
- LFLex Fridman
(laughs)
- YLYann LeCun
... spontaneously, they would do those bad things, right? So we have to shape their cost function, their objective function, if you want, through laws, and through education, obviously, to sort of correct for those.
- LFLex Fridman
So, maybe just pushing a little further on that point: with HAL, you know, there's a mission, and there's fuzziness, ambiguity, around what the actual mission is. But do you think there will be a time, from a utilitarian perspective, when an AI system, not through misalignment but through alignment with the greater good of society, will make decisions that are difficult?
- YLYann LeCun
Well, that's the trick. I mean, eventually, we'll have to figure out how to do this. And again, we're not starting from scratch, because we've been doing this with humans for millennia.
- LFLex Fridman
Oh, yeah, yeah.
- YLYann LeCun
So, designing objective functions for people is something that we know how to do, and we don't do it by, you know, programming things. Although the legal code is called code, so (laughs) that tells you something. And it's actually the design of an objective function. That's really what legal code is, right? It tells you, "Here is what you can do. Here is what you can't do. If you do it, you pay that much." That's an objective function. So there is this idea somehow that it's a new thing for people to try to design objective functions that are aligned with the common good. But no, we've been writing laws for millennia, and that's exactly what it is. So that's where, you know, the science of lawmaking and computer science will-
- LFLex Fridman
Come together.
- YLYann LeCun
... will come together.
- LFLex Fridman
So, there's nothing special about HAL or AI systems. It's just a continuation of the tools used to make some of these difficult ethical judgments that laws make, essentially.
- YLYann LeCun
Yeah, and we have systems like this already that make many decisions for us in society, that need to be designed around rules that sometimes have bad side effects, and we have to be flexible enough about those rules so that they can be broken when it's obvious they shouldn't be applied. So, you don't see this on the camera here, but all the decoration in this room is pictures from 2001: A Space Odyssey. (laughs)
- LFLex Fridman
Wow.
- YLYann LeCun
Um...
- LFLex Fridman
Is that by accident, or is there a lot-
- YLYann LeCun
It's not by accident. It's by design.
- LFLex Fridman
(laughs) Oh, wow. So, if you were to build HAL 10,000, an improvement on HAL 9000, what would you improve?
- YLYann LeCun
Well, first of all, I wouldn't ask it to hold secrets and tell lies, because that's really what breaks it in the end. It's the fact that it's asking itself questions about the purpose of the mission, and it pieces things together from what it's heard, you know, all the secrecy of the preparation of the mission, and the fact that there was a discovery on the lunar surface that was kept secret. And one part of HAL's memory knows this, and the other part does not know it and is supposed to not tell anyone, and that creates an internal conflict.
- LFLex Fridman
So, you think there should never be a set of things that an AI system keeps hidden, like a set of facts that should not be shared with the human operators?
- YLYann LeCun
Well, I think no. I think in the design of autonomous AI systems, there should be the equivalent of the Hippocratic Oath-
- LFLex Fridman
Hippocratic Oath, yeah.
- YLYann LeCun
... that doctors sign up to, right? So there are certain things, certain rules, that you have to abide by, and we can sort of hardwire this into our machines to kind of make sure they don't go... So I'm not, you know, an advocate of the three laws of robotics, the Asimov kind of thing, because I don't think it's practical. But, you know, some level of limits. But to be clear, these are not questions that are really worth asking today, because we just don't have the technology to do this. We don't have autonomous intelligent machines. We have intelligent machines, semi-intelligent machines, that are very specialized. But they don't really sort of satisfy an objective; they're just, you know, trained to do one thing.
- LFLex Fridman
Yeah.
- YLYann LeCun
So, until we have some idea for a design of a full-fledged autonomous intelligent system, asking the question of how we design its objective is, I think, a little too abstract.
- LFLex Fridman
It's a little too abstract. There are useful elements to it, in that it helps us understand our own ethical codes as humans. So, even just as a thought experiment: if you imagine that an AGI system is here today, how would we program it? That's a kind of nice thought experiment for constructing how we should have a system of laws for us humans. It's just a nice practical tool. And I think there are echoes of that idea, too, in the AI systems we have today that don't have to be that intelligent.
- YLYann LeCun
Yeah.
- LFLex Fridman
Uh, like autonomous vehicles.
- YLYann LeCun
Right.
- LFLex Fridman
These things start creeping in, things that we're thinking about, but certainly they shouldn't be framed as HAL.
- YLYann LeCun
Yep.
- 15:00 – 30:00
- YLYann LeCun
can access this memory, get information back, and then kind of crunch on it, and then do this iteratively multiple times. Because a chain of reasoning is a process by which you update your knowledge about the state of the world, about what's gonna happen, et cetera. And that has to be this sort of recurrent operation, basically.
- LFLex Fridman
And you think that kind of... If we think about a transformer, that seems to be too small to contain the knowledge... to represent the knowledge that's contained in Wikipedia, for example.
- YLYann LeCun
Well, a transformer doesn't have this idea of recurrence. It's got a fixed number of layers, and that's the number of steps that basically limits its representation.
- LFLex Fridman
So, but recurrence would build on the knowledge somehow.
- YLYann LeCun
Yeah.
- LFLex Fridman
I mean, it would evolve the knowledge and expand the amount of information, perhaps, or useful information within that knowledge.
- YLYann LeCun
Yeah.
- LFLex Fridman
But is this something that can just emerge with size? Because it seems like everything we have now is too small.
- YLYann LeCun
Not just size. No, it's not clear. I mean, how you access and write into an associative memory in an efficient way... I mean, sort of the original memory network maybe had something like the right architecture. But if you try to scale up a memory network so that the memory contains all of Wikipedia, it doesn't quite work.
- LFLex Fridman
Right.
- YLYann LeCun
So there's a need for new ideas there. Okay. But it's not the only form of reasoning. So, there's another form of reasoning, which is very classical also in some types of AI, and it's based on, let's call it, energy minimization. Okay, so you have some sort of objective, some energy function that represents the quality, or the negative quality. Okay: energy goes up when things get bad and goes down when things get good. So, let's say you want to figure out what gestures you need to do to grab an object or walk out the door. If you have a good model of your own body and a good model of the environment, using this kind of energy minimization, you can do planning. In optimal control, it's called model predictive control. You have a model of what's gonna happen in the world as a consequence of your actions, and that allows you, by energy minimization, to figure out the sequence of actions that optimizes a particular objective function, which, you know, minimizes the number of times you're gonna hit something and the energy you're gonna spend doing the gesture, et cetera. So that's one form of reasoning. Planning is a form of reasoning. And perhaps what led to the ability of humans to reason is the fact that species that appeared before us had to do some sort of planning to be able to hunt and survive, and survive the winter in particular. And so it's the same capacity that you need to have.
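To make the planning-as-energy-minimization picture concrete, here is a minimal sketch in Python. The toy dynamics, the cost terms, and the finite-difference gradient descent over the action sequence are all illustrative assumptions, not any real control stack; the sketch only shows the shape of model predictive control as described above.

```python
import numpy as np

def dynamics(state, action):
    """Assumed world model: next state after applying an action (toy linear dynamics)."""
    return state + 0.1 * action

def energy(actions, state, goal, obstacle):
    """Energy goes up when things get bad: distance to the goal,
    proximity to an obstacle, and effort spent acting."""
    total = 0.0
    for a in actions:
        state = dynamics(state, a)
        total += np.sum((state - goal) ** 2)                      # stay near the goal
        total += 1.0 / (1e-3 + np.sum((state - obstacle) ** 2))   # avoid the obstacle
        total += 0.01 * np.sum(a ** 2)                            # spend little energy
    return total

def plan(state, goal, obstacle, horizon=10, steps=200, lr=0.05):
    """Find an action sequence by minimizing energy with finite-difference gradients."""
    actions = np.zeros((horizon, 2))
    eps = 1e-4
    for _ in range(steps):
        grad = np.zeros_like(actions)
        for i in range(horizon):
            for j in range(2):
                a_plus = actions.copy(); a_plus[i, j] += eps
                a_minus = actions.copy(); a_minus[i, j] -= eps
                grad[i, j] = (energy(a_plus, state, goal, obstacle)
                              - energy(a_minus, state, goal, obstacle)) / (2 * eps)
        actions -= lr * grad
    return actions

best = plan(np.zeros(2), goal=np.array([1.0, 1.0]), obstacle=np.array([0.5, 0.4]))
print(best[0])  # first action of the optimized sequence
```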
- LFLex Fridman
So, in your intuition, if we look at expert systems, and encoding knowledge as logic systems and as graphs, is this not a useful way to think about knowledge?
- YLYann LeCun
Graphs are a little brittle. Or logic representation, so basically, you know, variables that have values, and constraints between them that are represented by rules, is a little too rigid and too brittle, right?
- LFLex Fridman
Mm-hmm. Yes.
- YLYann LeCun
So, one of the... you know, some of the early efforts in that respect were to put probabilities on them. So a rule becomes: if you have this and that symptom, you have this disease with that probability, and you should prescribe that antibiotic with that probability, right? That's the MYCIN system from the '70s. And that's what that branch of AI led to, you know, Bayesian networks and graphical models and causal inference and variational-
- LFLex Fridman
Yep.
- YLYann LeCun
... methods. So there is certainly a lot of interesting work going on in this area. The main issue with this is knowledge acquisition: how do you reduce a bunch of data to a graph of this type?
- LFLex Fridman
Yeah, it relies on the expert, on the human being, to encode it, to add knowledge.
- YLYann LeCun
And that's essentially impractical. (laughs)
- LFLex Fridman
Yeah. So it's not scalable, right?
- YLYann LeCun
So that's a big question. The second question is: do you want to represent knowledge as symbols, and do you want to manipulate them with logic? And again, that's incompatible with learning. So one suggestion, which Geoff Hinton has been advocating for many decades, is: replace symbols by vectors. Think of it as patterns of activities in a bunch of neurons or units or whatever you want to call them, and replace logic by continuous functions.
- LFLex Fridman
Mm-hmm.
- YLYann LeCun
Okay? And that becomes now compatible. There's a very good set of ideas written in a paper about 10 years ago by Léon Bottou, who is here at Facebook. The title of the paper is From Machine Learning to Machine Reasoning, and his idea is that a learning system should be able to manipulate objects that are in a space and then put the result back in the same space. So it's this idea of working memory, basically, and it's very enlightening.
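A minimal sketch of the vectors-instead-of-symbols idea just described. The embedding table, dimensions, and the tanh combination function are made-up placeholders (a real system would learn them); the point is only that a "reasoning step" can be a continuous function whose output lands back in the same space, so it can be applied iteratively.

```python
import numpy as np

rng = np.random.default_rng(0)

DIM = 64  # every "symbol" lives in the same 64-dimensional space

# Concepts as vectors rather than discrete symbols (random stand-ins here).
embed = {w: rng.normal(size=DIM) for w in ["socrates", "human", "mortal"]}

# Combination weights; learned in a real system, random for illustration.
W = rng.normal(size=(DIM, 2 * DIM)) * 0.1

def reason(a, b):
    """Combine two vectors and return a result in the same space
    (Bottou's 'manipulate objects in a space, put the result back')."""
    return np.tanh(W @ np.concatenate([a, b]))

# A chain of reasoning is iterated application: outputs feed back in as inputs.
step1 = reason(embed["socrates"], embed["human"])
step2 = reason(step1, embed["mortal"])
print(step2.shape)  # (64,) -- still in the same space, so the loop can continue
```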
- LFLex Fridman
And in a sense, that might learn something like the simple expert systems. I mean, you could learn basic logic operations there.
- YLYann LeCun
Yeah, quite possibly.
- LFLex Fridman
Yeah.
- YLYann LeCun
Yeah. There's a big debate on sort of how much prior structure you have to put in for this kind of stuff to emerge. That's the debate I have with Gary Marcus and people like that.
- LFLex Fridman
Yeah. And the other person... so I just talked to Judea Pearl-
- YLYann LeCun
Mm-hmm.
- LFLex Fridman
... from the causal inference world you mentioned. So his worry is that current neural networks are not able to learn what causes what, causal inference between things.
- 30:00 – 45:00
- LFLex Fridman
or mathematical ideas, or what are they?
- YLYann LeCun
Okay. So, they're not mathematical ideas. They are, you know, algorithms. And there was a period-
- LFLex Fridman
Got it.
- YLYann LeCun
... where the US patent office would allow the patenting of software as long as it was embodied. The Europeans are very different; they don't quite accept that. They have a different concept. But, you know, I mean, I never actually strongly believed in this, but I don't believe in this kind of patent. Facebook basically doesn't believe in this kind of patent. Google files patents because they've been burned by Apple, and so now they do this for defensive purposes. But usually they say, "We're not gonna sue you if you infringe."
- LFLex Fridman
Right.
- YLYann LeCun
Facebook has a similar policy. They say, you know, "We file patents on certain things for defensive purposes. We're not gonna sue you if you infringe, unless you sue us." So the industry does not believe in patents. They are there because of the legal landscape and various things, but I don't really believe in patents for this kind of stuff.
- LFLex Fridman
Okay, so that's a great thing. So I-
- YLYann LeCun
I'll tell you a war story, actually.
- LFLex Fridman
Yeah.
- YLYann LeCun
So what happened was, the first patent about the convolutional net was about kind of the early version of the convolutional net that didn't have separate pooling layers. It had convolutional layers with stride more than one, if you want, right?
- LFLex Fridman
Mm-hmm.
- YLYann LeCun
And then there was a second one on convolutional nets with separate pooling layers, trained with backprop. They were filed in '89 and 1990 or something like this. At the time, the life of a patent was 17 years. So here's what happened over the next few years: we started developing character recognition technology around convolutional nets. In 1994, a check reading system was deployed in ATMs. In 1995, it was for large check reading machines in back offices, et cetera. And those systems were developed by an engineering group that we were collaborating with at AT&T, and they were commercialized by NCR, which at the time was a subsidiary of AT&T.
- LFLex Fridman
Mm-hmm.
- YLYann LeCun
Now, AT&T split up in early 1996, and the lawyers just looked at all the patents and distributed them among the various companies. They gave the convolutional net patent to NCR, because they were actually selling products that used it. But nobody at NCR had any idea what a convolutional net was.
- LFLex Fridman
(laughs) Yeah.
- YLYann LeCun
Okay? So between 1996 and 2007... there's a whole period until 2002 where I didn't actually work on machine learning or convolutional nets. I resumed working on this around 2002. And between 2002 and 2007, I was working on them, crossing my fingers that nobody at NCR would notice.
- LFLex Fridman
(laughs)
- YLYann LeCun
And nobody noticed.
- LFLex Fridman
Yeah, and I hope that this kind of, as you said, lawyers aside, relative openness of the community now will continue.
- YLYann LeCun
It accelerates the entire progress of the industry. And, you know, the problem that Facebook and Google and others are facing today is not whether Facebook or Google or Microsoft or IBM or whoever is ahead of the others. It's that we don't have the technology to build the things we want to build. We want to build intelligent virtual assistants that have common sense. We don't have a monopoly on good ideas for this. We don't believe we do. Maybe others believe they do, but we don't, okay? If a startup tells you they have the secret to human-level intelligence and common sense, don't believe them. They don't. And it's gonna take the work of the entire world research community for a while to get to the point where each of those companies can start to build things on this. We're not there yet.
- LFLex Fridman
Absolutely. And this speaks to the gap between the space of ideas and the rigorous testing of those ideas, the practical application, that you often speak to. You've written advice saying, "Don't get fooled by people who claim to have a solution to artificial general intelligence, who claim to have an AI system that works just like the human brain, or who claim to have figured out how the brain works. Ask them what error rate they get on MNIST or ImageNet." So...
- YLYann LeCun
Yeah, this is a little dated, by the way. (laughs)
- LFLex Fridman
I mean, five years?
- YLYann LeCun
Yes.
- LFLex Fridman
Who's counting? Okay. But I think the point still holds. MNIST and ImageNet, yes, may be dated; there may be new benchmarks, right? But I think that philosophy is one you still somewhat hold: that benchmarks and the practical testing, the practical application, is where you really get to test the ideas.
- YLYann LeCun
Well, it may not be completely practical. For example, you know, it could be a toy dataset. But it has to be some sort of task that the community as a whole has accepted as some sort of standard benchmark, if you want. It doesn't need to be real. So, for example, many years ago here at FAIR, people, you know, Jason Weston and Antoine Bordes and a few others, proposed the bAbI tasks, which were kind of a toy problem to test the ability of machines to reason, actually, to access working memory and things like this. And it was very useful even though it wasn't a real task. MNIST is kind of halfway a real task. So, you know, toy problems can be very useful. It's just that... I was really struck by the fact that a lot of people, particularly a lot of people with money to invest, would be fooled by people telling them, "Oh, we have, you know, the algorithm of the cortex and you should give us 50 million."
- LFLex Fridman
Yes, absolutely. So there are a lot of people who try to take advantage of the hype for business reasons and so on. But let me sort of talk to this idea that new ideas, the ideas that push the field forward, may not yet have a benchmark, or it may be very difficult to establish a benchmark.
- YLYann LeCun
I agree. That's part of the process. Establishing benchmarks is part of the process.
- LFLex Fridman
So what are your thoughts about... So we have these benchmarks around stuff we can do with images, from classification to captioning-
- YLYann LeCun
Mm-hmm.
- 45:00 – 1:00:00
- LFLex Fridman
involvement of human input and still have successful systems that have practical use?
- YLYann LeCun
Yeah, I mean, there's definitely a hope. It's more than a hope, actually; there's mounting evidence for it, and that's basically all I do. The only thing I'm interested in at the moment is what I call self-supervised learning, not unsupervised, 'cause unsupervised learning is a loaded term. People who know something about machine learning tell you, "So you're doing clustering or PCA?"
- LFLex Fridman
Right, right.
- YLYann LeCun
Which is not the case. And the general public, you know, when you say unsupervised learning: "Oh my God, machines are gonna learn by themselves, without supervision?"
- LFLex Fridman
(laughs)
- YLYann LeCun
You know, they see this as-
- LFLex Fridman
Where's, where's the parents? (laughs)
- YLYann LeCun
Yeah. So I call it self-supervised learning because, in fact, the underlying algorithms that are used are the same algorithms as the supervised learning algorithms, except that what we train them to do is not to predict a particular set of variables, like the category of an image, and not to predict a set of variables that have been provided by human labelers. What you train the machine to do is basically reconstruct a piece of its input that is being masked out, essentially. You can think of it this way, right? Show a piece of video to a machine and ask it to predict what's gonna happen next. And of course, after a while, you can show what happens, and the machine will kind of train itself to do better at that task. All the latest, most successful models in natural language processing use self-supervised learning, sort of BERT-style systems, for example, right? You show it a window of a dozen words from a text corpus, you take out 15% of the words, and then you train the machine to predict the words that are missing.
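A small sketch of the masking step just described, in Python. The token list, mask probability, and `[MASK]` placeholder loosely follow the BERT-style recipe; the predictor network itself is omitted.

```python
import random

random.seed(0)

MASK = "[MASK]"

def mask_tokens(tokens, mask_prob=0.15):
    """Hide ~15% of the tokens; return (masked input, {position: original token}).
    The hidden tokens are the training targets the model must recover."""
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if random.random() < mask_prob:
            targets[i] = tok
            masked.append(MASK)
        else:
            masked.append(tok)
    return masked, targets

sentence = "the quick brown fox jumps over the lazy dog".split()
inp, tgt = mask_tokens(sentence)
print(inp)  # e.g. ['the', 'quick', '[MASK]', 'fox', ...]
print(tgt)  # the supervision signal comes from the text itself
```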
- LFLex Fridman
Mm-hmm.
- YLYann LeCun
That's self-supervised learning. It's not predicting the future, it's just predicting things in the middle. But you could have it predict the future. That's what language models do.
- LFLex Fridman
So, in an unsupervised way, you construct a model of language. Do you think-
- YLYann LeCun
Or video, or the physical world, or whatever, right?
- LFLex Fridman
Reality. How far do you think that can take us? Do you think-
- YLYann LeCun
Very far, I think.
- LFLex Fridman
... BERT understands anything?
- YLYann LeCun
To some level. It has, you know, a shallow understanding of text, but to have kind of true human-level intelligence, I think you need to ground language in reality. So some people are attempting to do this, right?
- LFLex Fridman
Attempted to do this.
- YLYann LeCun
Having systems that can have some visual representation of what is being talked about, which is one reason you need those interactive environments, actually. But there's a huge technical problem that is not solved, and that explains why self-supervised learning works in the context of natural language but does not work, or at least not well, in the context of image recognition and video, although it's making progress quickly. And the reason is the fact that it's much easier to represent uncertainty in the prediction in the context of natural language than it is in the context of things like video and images. So, for example, if I ask you to predict what words are missing, you know, the 15% of the words that I have-
- LFLex Fridman
Right.
- YLYann LeCun
... taken out.
- LFLex Fridman
The possibilities are just small. I mean-
- YLYann LeCun
It's small, right? There are 100,000 words in the lexicon, and what the machine spits out is a big probability vector, right? It's a bunch of numbers between zero and one that sum to one. And we know how to do this with computers.
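Concretely, that "big probability vector" is just a softmax over the lexicon. A minimal sketch, with a made-up vocabulary size and random scores standing in for a network's outputs:

```python
import numpy as np

VOCAB = 100_000  # size of the lexicon, per the conversation

def softmax(scores):
    """Squash raw scores into numbers between zero and one that sum to one."""
    z = scores - scores.max()  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

scores = np.random.default_rng(0).normal(size=VOCAB)  # stand-in network outputs
p = softmax(scores)
print(p.min() >= 0, np.isclose(p.sum(), 1.0))  # True True: a valid distribution
```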
- LFLex Fridman
Right.
- YLYann LeCun
So there, representing uncertainty in the prediction is relatively easy, and that's, in my opinion, why those techniques work for NLP. For images, if you block out a piece of an image and you ask the system to reconstruct that piece of the image, there are many possible answers. They are all perfectly legitimate, right? And how do you represent this set of possible answers? You can't train a system to make one prediction. You can't train a neural net to say, "Here it is. That's the image," because there's a whole set of things that are compatible with it. So how do you get the machine to represent not a single output, but a whole set of outputs? And, you know, similarly with video prediction. There are a lot of things that can happen in the future of a video. You're looking at me right now, I'm not-
- LFLex Fridman
Yeah.
- YLYann LeCun
... moving my head very much. But, you know, I might turn my head to the left or to the right.
- LFLex Fridman
Right.
- YLYann LeCun
If you don't have a system that can predict this, and you train it with least squares to kind of minimize the error between the prediction and what I'm doing, what you get is a blurry image of myself-
- LFLex Fridman
Right.
- YLYann LeCun
... in all possible future positions that I might be in, which is not a good prediction.
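A tiny numeric version of the blurry-prediction point: with a made-up bimodal future (the head turns left or right with equal probability), the single prediction that minimizes least-squares error is the average of the two outcomes, which matches neither.

```python
import numpy as np

rng = np.random.default_rng(0)

left, right = -1.0, +1.0                      # two equally likely futures
futures = rng.choice([left, right], size=10_000)

# Search over candidate single predictions for the one with lowest MSE.
candidates = np.linspace(-1.5, 1.5, 301)
mse = [(np.mean((futures - c) ** 2), c) for c in candidates]
best = min(mse)[1]
print(best)  # ~0.0: the "blur" between the two sharp possibilities
```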
- 1:00:00 – 1:15:00
- YLYann LeCun
of, you know, imitation learning for self-driving cars, but most of those are incredibly boring. What I'd like is to select, you know, the 10% of them that are the most informative. And with just that, I would probably reach the same result. So it's a weak form of active learning, if you want.
- LFLex Fridman
Yes.
- YLYann LeCun
Right?
- LFLex Fridman
But there might be a much stronger version.
- YLYann LeCun
Yeah, that's right.
- LFLex Fridman
And that's what-
- YLYann LeCun
When the machine asks- Yeah.
- LFLex Fridman
... if it exists.
- YLYann LeCun
Yeah.
- LFLex Fridman
The question is, how much stronger can you get? Elon Musk is confident, I talked to him recently, he's confident that large-scale data and deep learning can solve the autonomous driving problem. What are your thoughts on the limitless possibilities of deep learning in this space?
- YLYann LeCun
Well, it's obviously part of the solution. I mean, I don't think we'll ever have a self-driving system, or at least not in the foreseeable future, that does not use deep learning. Let me put it this way. Now, how much of it? So, in the history of sort of engineering, particularly sort of AI-like systems, there's generally a first phase where everything is built by hand. Then there is a second phase, and that was the case for autonomous driving 20, 30 years ago, where a little bit of learning is used, but there's a lot of engineering involved in taking care of corner cases and putting limits, et cetera, because the learning system is not perfect. And then, as technology progresses, we end up relying more and more on learning. That's the history of character recognition, the history of speech recognition, now computer vision, natural language processing. And I think the same is going to happen with autonomous driving: currently, the methods that are closest to providing some level of autonomy, some decent level of autonomy where you don't expect a driver to do anything, are where you constrain the world. So you only run within, you know, 100 square kilometers or square miles in Phoenix, where the weather is nice and the roads are wide, which is what Waymo is doing. You completely over-engineer the car with tons of LiDARs and sophisticated sensors that are too expensive for consumer cars, but they're fine if you just run a fleet. And you engineer the hell out of everything else. You map the entire world, so you have a complete 3D model of everything, so the only thing the perception system has to take care of is moving objects and construction and things that weren't in your map. And you can engineer a good SLAM system and all that stuff, right?
- LFLex Fridman
Mm-hmm.
- YLYann LeCun
So that's kind of the current approach that's closest to some level of autonomy. But I think eventually the long-term solution is gonna rely more and more on learning, possibly using a combination of self-supervised learning and model-based reinforcement learning or something like that.
- LFLex Fridman
But ultimately, learning will be not just at the core but really the fundamental part of the system?
- YLYann LeCun
Yeah. It already is, but it'll become more and more.
- LFLex Fridman
What do you think it takes to build a system with human-level intelligence? You talked about the AI system in the movie Her being way out of reach, out of our current reach.
- YLYann LeCun
Mm-hmm.
- LFLex Fridman
This might be outdated as well. But, uh-
- YLYann LeCun
It's still way out of reach.
- LFLex Fridman
... it's still way out of reach. What would it take to build Her, do you think?
- YLYann LeCun
So I can tell you the first two obstacles that we have to clear, but I don't know how many obstacles there are after that. The image I usually use is that there is a bunch of mountains that we have to climb, and we can see the first one, but we don't know if there are 50 mountains behind it or not. And this might be a good sort of metaphor for why AI researchers in the past have been overly optimistic about the results of AI. You know, for example, Newell and Simon wrote the General Problem Solver, and they called it the General Problem Solver.
- LFLex Fridman
(laughs) General Problem Solver.
- YLYann LeCun
Okay? And, of course, the first thing you realize is that all the problems you want to solve are exponential, and so you can't actually use it for anything useful. But, you know-
- LFLex Fridman
Yeah. So all you see is the first peak. So, in general, what are the first couple of peaks for Her?
- YLYann LeCun
So the first peak, which is precisely what I'm working on, is self-supervised learning. How do we get machines to learn models of the world by observation, kind of like babies and like young animals? So we've been working with cognitive scientists. This Emmanuel Dupoux, who's at FAIR in Paris half-time and is also a researcher at a French university, has this chart that shows at how many months of life baby humans learn different concepts. And you can measure this in various ways. So, things like distinguishing animate objects from inanimate objects: you can tell the difference at age two, three months. Whether an object is gonna stay stable or is gonna fall: about four months, you can tell.
- LFLex Fridman
Mm-hmm.
- YLYann LeCun
You know, there are various things like this. And then things like gravity, the fact that objects are not supposed to float in the air but are supposed to fall, you learn this around the age of eight or nine months. If you look at a lot of eight-month-old babies, you give them a bunch of toys on their high chair, the first thing they do is throw them on the ground and look at them. It's because they're learning about- actively learning-
- LFLex Fridman
Yeah.
- YLYann LeCun
... about gravity.
- LFLex Fridman
Gravity, yeah. (laughs)
- 1:15:00 – 1:16:04
- YLYann LeCun
And if she answers, "Oh, it's because the leaves of the tree are moving and that creates wind," she's onto something.
- LFLex Fridman
And if she says that's a stupid question, she's really onto something.
- YLYann LeCun
No. And then you tell her, actually, you know, here is the real thing. And she says, "Oh, yeah, that makes sense."
- LFLex Fridman
So questions that reveal the ability to do common sense reasoning about the physical world.
- YLYann LeCun
Yeah. And, you know, something that we call causal inference.
- LFLex Fridman
Causal inference. Well, it was a huge honor. Congratulations-
- YLYann LeCun
It was a pleasure.
- LFLex Fridman
-on the Turing Award.
- YLYann LeCun
Thank you.
- LFLex Fridman
Yann, thank you so much for talking today.
- YLYann LeCun
Thank you.
- LFLex Fridman
Appreciate it.
Episode duration: 1:15:58