Lex Fridman Podcast
François Chollet: Keras, Deep Learning, and the Progress of AI | Lex Fridman Podcast #38
EVERY SPOKEN WORD
150 min read · 30,026 words
- 0:00 – 5:30
Challenging the “intelligence explosion” narrative
- LFLex Fridman
The following is a conversation with Francois Chollet. He's the creator of Keras, which is an open source deep learning library that is designed to enable fast, user-friendly experimentation with deep neural networks. It serves as an interface to several deep learning libraries, most popular of which is TensorFlow, and it was integrated into the TensorFlow main code base a while ago, meaning if you want to create, train, and use neural networks, probably the easiest and most popular option is to use Keras inside TensorFlow. Aside from creating an exceptionally useful and popular library, Francois is also a world-class AI researcher and software engineer at Google, and he's definitely an outspoken, if not controversial personality in the AI world, especially in the realm of ideas around the future of artificial intelligence. This is the Artificial Intelligence podcast. If you enjoy it, subscribe on YouTube, give us five stars on iTunes, support it on Patreon, or simply connect with me on Twitter at Lex Fridman, spelled F-R-I-D-M-A-N. And now here's my conversation with Francois Chollet. You're known for not sugarcoating your opinions and speaking your mind about ideas in AI, especially on Twitter. It's one of my favorite Twitter accounts. So what's one of the more controversial ideas you've expressed online and gotten some heat for? How do you pick?
- FCFrançois Chollet
(laughs) How do I pick? Yeah, no, I think if you have, um, if you go through the trouble of maintaining a Twitter account, you might as well speak your mind, you know? Otherwise it's, you know, what- what's even the point of having a Twitter account? It's like having a nice car and just leaving it- leave it in the- in the garage. Uh, yes, so what's one thing for which I got a lot of pushback, perhaps, you know, uh, that time I wrote something about, uh, the idea of intelligence explosion, and I was questioning, uh, the idea and the reasoning behind this idea, and, uh, I got a lot of pushback on that, uh, got a lot of flak for it. So yeah, so intelligence explosion, I'm sure you're familiar with the idea, but it's the idea that if you were to build general AI problem-solving algorithms, well, the problem of building such an AI, that itself is a problem that could be solved by your AI, and maybe it could be solved better than, uh, than what humans can do.
- LFLex Fridman
Right.
- FCFrançois Chollet
So your AI could start tweaking its own algorithm, could, uh, start being a better version of itself and so on, iteratively, in a- in a recursive fashion, and so you would end up with, um, an AI with exponentially increasing intelligence.
- LFLex Fridman
That's right.
- FCFrançois Chollet
And I was basically questioning this idea, first of all, because the notion of intelligence explosion uses an implicit definition of intelligence that doesn't sound quite right to me. It considers intelligence as a property of a brain that you can consider in isolation, like the height of a building for instance.
- LFLex Fridman
Right.
- FCFrançois Chollet
But that's not really what intelligence is. Intelligence, uh, emerges from the interaction between a brain, a body, like embodied intelligence, and an environment. And if you're missing one of these pieces, then you cannot really define intelligence anymore. So just tweaking a brain to make it smarter and smarter doesn't actually make any sense to me.
- LFLex Fridman
So first of all, you're crushing the dreams of many people, right? So there's, uh, let's look at, like, Sam Harris, actually a lot of physicists, Max Tegmark, people who think, you know, the universe is a information processing system, our brain is kind of an information processing system.
- FCFrançois Chollet
It is.
- LFLex Fridman
So what's the theoretical limit? Like, wh- it doesn't make sense that there should be some, uh... It seems naive to think that our own brain is somehow the limit of the capabilities of this information, it's just I'm playing devil's advocate here, uh, this information processing system, and then if you just scale it, if you're able to build something that's on par with the brain, you just, the process that builds it just continues and it'll improve exponentially. So that- that's the logic that's used actually by almost everybody that is worried about superhuman intelligence.
- FCFrançois Chollet
Yeah.
- LFLex Fridman
So you're- you're trying to make... So most people who are skeptical of that are kind of like, (sighs) "This doesn't..." Their thought process is, "This doesn't feel right." Like, that's for me as well. So I'm more like, it doesn't... (sighs) The whole thing is shrouded in mystery where you- you can't really say anything concrete, but you could say this doesn't feel right, this doesn't feel like that's how the brain works. And you're trying to, with your blog post and now making it a little more explicit. So one idea is that the brain isn't... It exists alone, it exists within the environment. So you can't exponentially... You would have to somehow exponentially improve the environment and the brain together almost, yeah, in order to create something that's much smarter in some kind of, uh... Of course, we don't have a definition of intelligence. But-
- 5:30 – 8:05
Intelligence as potential vs. expressed ability: the “right problem at the right time”
- FCFrançois Chollet
That's correct. That's correct. I- I- I don't think... If you look at very smart people today, even humans, not even talking about AIs, I don't think their brain and the performance of their brain is the bottleneck to their- to their expressed intelligence, to their achievements. You cannot just tweak one part of the system, like of this brain-body-environment system and expect the capabilities, like what emerges out of this system, to just, you know, uh, explode exponentially... because, um, any time you improve one part of a system with many interdependencies like this, uh, there's a new bottleneck that arises, right? And I don't think even today, for very smart people, their brain is not the bottleneck, uh, to the sort of problems they can solve, right? In fact, many very smart people today, uh, you know, they're, they're not actually solving any big scientific problems. They're not Einstein. They're like Einstein, but, you know, the, the patent clerk days. Um-
- LFLex Fridman
(laughs)
- FCFrançois Chollet
... like Einstein became Einstein because this was a meeting of a genius with a big problem at the right time, right? But maybe this meeting could have, you know, never happened and then Einstein would have just been a patent clerk, right? It's ... and in fact many people today are probably like genius level smart, but you wouldn't know because they're not expressing any of that.
- LFLex Fridman
Wow, that's brilliant. So, we can think of the world, Earth, but also the universe as just ... as a space of problems. So all these problems and tasks are roaming it of various difficulty and there's agents, creatures like ourselves and animals and so on that are also roaming it, and then you, you get coupled with a problem and then you solve it. But without that coupling, you can't demonstrate your "intelligence."
- FCFrançois Chollet
Exactly. Intelligence is the meeting of great problem-solving capabilities-
- LFLex Fridman
Mm.
- FCFrançois Chollet
... with a great problem, and if you don't have the problem, you're not really expressing intelligence. All y- all you're left with is potential intelligence-
- LFLex Fridman
Mm.
- FCFrançois Chollet
... like the performance of your brain or, you know, how high your IQ is, which in itself, uh, is just a number, right?
- LFLex Fridman
Right. So, you mentioned problem-solving capacity.
- FCFrançois Chollet
Yeah.
- LFLex Fridman
What, what do you think of as problem-solving capa- ... what ... can you try to define intelligence? Like what does it mean to be more or less intelligent? Is it completely coupled to a particular problem? Or is there something a little bit more universal?
- 8:05 – 9:47
Specialization, priors, and the limits of human generality
- FCFrançois Chollet
Yeah, I do believe all intelligence is specialized intelligence, even human intelligence has some degree of generality. Well, all intelligent systems have some degree of generality, but they're always specialized in, in one category of problems. So the, the human intelligence is specialized in the human experience and that shows at various levels. That shows in some prior knowledge that's innate that we have at birth, knowledge about, um, things like agents, uh, goal-driven behavior, uh, visual priors about what makes an object, priors about time, and so on. Uh, that shows also in the way we learn. For instance, it's very, very easy for us to pick up language, it's very, very easy for us to learn certain things because we are basically hardcoded to learn them, and we are specialized in solving certain kinds of problem and we are quite l- useless when it comes to other kinds of problems. For instance, we, we are not really designed to handle very long-term problems. We have no capability of seeing the, the, the very long-term. Um, we don't have, um, very much working memory, you know?
- LFLex Fridman
So, how do you think about long-term? Do you think long-term planning? Are we talking about a scale of years, millennia? What do you mean by l- long-term, we're not very good?
- FCFrançois Chollet
Well, human intelligence is specialized in the human experience and human experience is, is very short. Like one lifetime is short. Even within one lifetime, uh, we have a, a, a, a very hard time envisioning, you know, uh, things on a scale of years. Like it's very difficult to project yourself at a, at a scale of five year, at a scale of 10 years, and so on.
- LFLex Fridman
Right.
- 9:47 – 13:35
Civilization and science as distributed, superhuman problem-solving systems
- FCFrançois Chollet
We can solve only fairly narrowly scoped problems. So, when it comes to solving bigger problems, larger scale problems, we are not actually doing it on an individual level. So it's not actually our brain doing it. We, we, we have this thing called civilization, right, which is itself a sort of problem-solving system, a sort of, uh, artificial intelligent system, right?
- LFLex Fridman
Right.
- FCFrançois Chollet
And it's not running on one brain; it's running on a network of brains. In fact, it's running on, on much more than a network of brains. It's running on a lot of, uh, infrastructure like books and computers and the internet and human institutions and so on, and that is capable of, uh, handling problems, uh, on a, on a much greater scale than any individual human. If you look at, um, computer science for instance, that's an institution that solves problems and it's, it is superhuman, right? Uh, it operates on a, on a greater scale, it can solves, uh, it can solve much bigger problem than, uh, an individual human could. And science itself, science as a system, as an institution-
- LFLex Fridman
Mm.
- FCFrançois Chollet
... is a kind of a artificially intelligent, um, problem-solving algorithm that is superhuman.
- LFLex Fridman
Yeah. It's, uh, it's a (laughs) ... well, at least computer science is like a theorem prover (laughs) at a scale of, uh, thousands, maybe hundreds of thousands of human beings. At that scale, what do you think is a intelligent agent? So there's us humans at the individual level, there is, uh, millions, maybe billions of bacteria on our skin, uh, there is ... that's at the smaller scale. You can even go to the particle level as systems that behave, uh, you can say intelligently in some ways, uh, y- and then you can look at Earth as a single organism, you could look at our galaxy and even the universe as a single organism. Do you think ... how do you think about scale in defining intelligence systems? And we're here at Google, there is millions of devices doing computation dis- ... in a distributed way.
- FCFrançois Chollet
Right. (clears throat)
- LFLex Fridman
How do you think about intelligence versus scale?
- FCFrançois Chollet
You can always characterize, uh, anything as a system.
- LFLex Fridman
Right.
- FCFrançois Chollet
Um, I think people who talk about things like intelligence explosion tend to focus on one agent is basically one brain, like one brain considered in isolation, like a brain in a jar that's controlling a body in a very, like, top to bottom kind of fashion, and that body is pursuing goals, uh, into an environment. So it's a very hierarchical view. You have the brain at the top of the pyramid, then you have the body just plainly receiving orders, and then the body's manipulating objects and environment and so on. So everything is subordinate to this one thing, this epicenter, which is the brain.
- LFLex Fridman
Right.
- FCFrançois Chollet
But in real life, intelligent agents don't really work like this, right? There is no strong delimitation between the brain and the body to start with. You have to look not just at the brain but at the nervous system, but then the nervous system and the body are not really two separate entities. So you have to look at an entire animal as one agent, but then you start realizing as you observe an animal over a- a- a- a- a- any length of time that a lot of the intelligence of an animal is actually externalized. That's especially true for humans. A lot of our intelligence is externalized. When you write down some notes, that is externalized intelligence. When you write a computer program, you are externalizing cognition. So it's externalized in books. It's externalized in- in computers, the internet, in other humans. It's externalized in language and so on. So it's... There is no, like, hard delimitation of what makes an intelligent agent. It's all about context.
- 13:35 – 25:24
Why recursive self-improvement doesn’t imply an explosion: bottlenecks and ‘exponential friction’
- LFLex Fridman
Okay, but, uh, AlphaGo is better at Go than the best human player. W- uh, you know, there's levels of skill here. So do you think there's such a ability as- a such a concept as a intelligence explosion in a specific task? And then... Well, yeah, d- do you think it's possible to have a category of tasks on which you do have something like an exponential growth of ability to solve that particular problem?
- FCFrançois Chollet
I think if you consider a specific vertical, uh, it's probably possible to some extent. I also don't think we have to speculate about it because we have real world examples-
- LFLex Fridman
Mm.
- FCFrançois Chollet
... of recursively self-improving intelligent systems, right?
- LFLex Fridman
Right.
- FCFrançois Chollet
So for instance, science is a problem-solving system, a knowledge generation system, like a system that experiences the world in some sense and then gradually understands it and can act on it, and that system is superhuman and it is clearly, uh, recursively self-improving because science feeds into technology. Technology can be used to build better tools, better computers, better instrumentation and so on, which in turn, uh, can make science faster, right? So science is probably the closest thing we have today to a recursively self-improving superhuman AI. And you can just observe, you know, is science, is scientific progress today exploding? Which, you know, itself is- is- is an interesting question. You can use that as a basis to try to understand what will happen with a superhuman AI that has s- uh, uh, science-like behavior.
- LFLex Fridman
Right. Let me linger on it a little bit more. What is your intuition why an intelligence explosion is not possible? Like taking the scientific... all the s- semi-scientific revolutions, why can't we slightly accelerate that process?
- FCFrançois Chollet
So you- you can absolutely, uh, uh, accelerate any problem-solving process.
- LFLex Fridman
Yep.
- FCFrançois Chollet
So, uh, recursively, uh, uh, recursive self-improvement is absolutely a real thing, but what happens with a recursively self-improving system is typically not explosion because no system exists in isolation. And so tweaking one part of the system means that suddenly another part of the system becomes a bottleneck, and if you look at science for instance, which is clearly recursively self-improving, clearly a problem-solving system, scientific progress is not actually exploding. If you look at science, what you see is the picture of a system that is consuming an exponentially increasing amount of resources-
- LFLex Fridman
Right.
- FCFrançois Chollet
... but ha- uh, it's having a linear output in terms of scientific progress, and may- maybe that- that will seem like a very strong claim. Many people are- are- are actually saying that, you know, s- scientific progress is exponential, but when they're claiming this, they are actually looking at indicators of, uh, uh, resource consumptions, resource consumption by science. For instance, the number of, um, uh, papers being published, the number of pa- patents being filed and so on, which are just- just completely correlated with how many people are working o- on the, uh, on- on science today.
- LFLex Fridman
Yeah.
- FCFrançois Chollet
Right? So it's actually an indicator of resource consumption, but what you should look at is the output, is um, progress in terms of the knowledge that science generates, in terms of the- the scope and significance of the problems that we solve, and, uh, some people have actually been trying to measure that.
- LFLex Fridman
Mm-hmm.
- FCFrançois Chollet
Like, um, Michael Nielsen, for instance.
- LFLex Fridman
Mm-hmm.
- FCFrançois Chollet
He had a- a very nice paper, uh, I think it was last year about it. So his approach to measure scientific progress was to, uh, look at the timeline, uh, of scientific discoveries over the past, you know, 100, 150 years, and for, um, each measured discovery, ask a panel of experts to rate the significance of the discovery.
- LFLex Fridman
Hmm.
- FCFrançois Chollet
And if the output of science as an institution were exponential, you would expect the temporal density of significance to go up exponentially, maybe because there- there's a faster rate of discoveries, maybe because the discoveries are, you know, increasingly more important. And, uh, what actually happens if you, if you plot this temporal density of significance measured in this way is that you see very much a flat graph. You see a flat graph across all disciplines, across physics, biology, medicine, and so on. And it actually makes a lot of the- of sense if you think about it, because think about the progress of physics, uh, uh, 110 years ago, right? It was a time of crazy change.
- LFLex Fridman
Mm-hmm.
- FCFrançois Chollet
Think about the progress of technology, you know, uh, 130 years ago when we started having, you know, replacing horses with cars, when we started having electricity and so on.
- LFLex Fridman
Yeah.
- FCFrançois Chollet
It was a time of incredible change, and today is also a time of very, very fast change, but it would be, uh, an unfair characterization to say that today technology and science are moving way faster than they did 50 years ago or 100 years ago. And if you do try to rigorously plot the temporal density of the significance-
- LFLex Fridman
Significant ideas, yeah.
- FCFrançois Chollet
... yeah, of significance idea, of significant idea, sorry, you do see very flat curves.
- LFLex Fridman
That's fascinating.
- FCFrançois Chollet
And, and, and you can check out the paper that Michael Nielsen had, uh, about this idea. And so the way I interpret it is as you make progress in a, in a given field or in any given sub-field of science, it becomes exponentially more difficult to make further progress.
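The claim above (exponentially growing resources, roughly linear output) can be made concrete with a small worked model. This is only a sketch under assumptions that are not stated in the conversation: resources arrive at a rate $r(t) = r_0 e^{kt}$, and the marginal cost of one more unit of cumulative progress $P$ is $c\,e^{aP}$. The symbols $r_0$, $k$, $c$ and $a$ are hypothetical positive constants introduced for illustration.

```latex
\begin{align}
\frac{dP}{dt} &= \frac{r_0\, e^{kt}}{c\, e^{aP}}
  && \text{(progress rate = resource rate / marginal cost)} \\
e^{aP}\, dP &= \frac{r_0}{c}\, e^{kt}\, dt
  && \text{(separate variables)} \\
\frac{e^{aP} - 1}{a} &= \frac{r_0}{c\,k}\left(e^{kt} - 1\right)
  && \text{(integrate, with } P(0) = 0\text{)} \\
P(t) &= \frac{1}{a}\,\ln\!\left(1 + \frac{a\, r_0}{c\,k}\left(e^{kt} - 1\right)\right)
  \;\approx\; \frac{k}{a}\, t + \frac{1}{a}\ln\frac{a\, r_0}{c\,k}
  && \text{(for large } t\text{)}
\end{align}
```

Under these assumptions the exponential in the input and the exponential in the difficulty cancel inside the logarithm, leaving progress that grows linearly in time with slope $k/a$.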
- LFLex Fridman
Mm-hmm.
- FCFrançois Chollet
Like, the very first, uh, uh, person to work on information theory, if you enter a new field and it's still the very early years, there's a lot of, uh, low-hanging fruit you can pick.
- 25:24 – 28:03
Narratives, identity, and why singularity talk attracts pushback
- FCFrançois Chollet
... of anything. That's not my point. Like, I'm not, I'm not trying to prove anything. I'm just trying to make an argument to question the narrative-
- LFLex Fridman
Yeah.
- FCFrançois Chollet
... of intelligence explosion, which is quite a dominant narrative, and you do get a lot of pushback if you go against it because... So, for many people, right, AI is not just a subfield of computer science. It's more like a belief system.
- LFLex Fridman
Right.
- FCFrançois Chollet
Like, this belief that the world is headed towards an event, the singularity, past which, uh, you know, uh, AI will become, uh, will- will go exponential very much, and the world will be transformed, and humans, uh, will become obsolete. And if you, if you go against this narrative beca- because it is not really a scientific argument, but more of a belief system, uh, it is part of the identity of many people. If you go against this narrative, it's like you're attacking the identity of people who believe in it. It's almost like saying God doesn't exist or something.
- LFLex Fridman
Right.
- FCFrançois Chollet
Uh, so you do get a lot of pushback if you try to question these ideas.
- LFLex Fridman
First of all, I believe most people, uh, they might not be as eloquent or explicit as you're being, but most people in computer science or most people who actually have built anything that you could call AI, quote-unquote, would agree with you. They might not be describing in the same kind of way. It's more, uh... So, the pushback you're getting is from people who get attached to the narrative from, not from a place of science, but from a place of imagination and science fiction?
- FCFrançois Chollet
That is correct. That is correct.
- LFLex Fridman
So, why do you think that's so appealing? Because th- the usual dreams that people have when you create a superintelligence system past the singularity, uh, what people imagine is somehow always destructive. Do you have... If you were to put on your psychology hat, what's, why is it so appealing to imagine the ways that all of human civilization will be destroyed?
- FCFrançois Chollet
I think it's a good story, you know? It's a good story, and very interestingly, it mirrors religious stories, right? Religious mythology. If you look at the mythology of most civilizations, it's about the world being headed towards some final events in which the world will be destroyed and some new world order wi- will arise that will be mostly spiritual, like the apocalypse followed by, uh, uh, paradise probably, right?
- LFLex Fridman
Yeah.
- FCFrançois Chollet
It's a very appealing story-
- LFLex Fridman
Yeah.
- FCFrançois Chollet
... on a fundamental level. And we all need stories. We all need stories to structure the way we see the world, especially at time scales that are, uh, uh, beyond our ability to make predictions.
- 28:03 – 30:59
Deep learning surprises (2013–2014) and the meaning of “human-level intelligence”
- LFLex Fridman
Right. So, on a more serious, non-exponential explosion question, do you think there will be a time when, uh, we'll create something like human-level intelligence or intelligent systems that will make you sit back and be just surprised at, "Damn, how smart this thing is"? Uh, that doesn't require exponential growth or exponential improvement, but what- what's your sense and a timeline and so on, uh, that's wh- you'll be really surprised at certain capabilities? And we'll talk about limitations in deep learning and so on. When do you th- do you think in your lifetime you'll be really damn surprised?
- FCFrançois Chollet
Around 2013, 2014, I was many times surprised by the capabilities of deep learning, actually.
- LFLex Fridman
Right. Yeah.
- FCFrançois Chollet
That was before we had assessed exactly what deep learning could do and could not do, and it felt like a time of, uh, immense potential, and then we started, you know, narrowing it down. But I was very surprised, so we'd say it's, it's, it's, it has already happened.
- LFLex Fridman
Was there a moment... There must have been a day in there where your surprise was almost bordering on the belief of the narrative that we just discussed. Wha- was there a moment, 'cause you've written quite eloquently about the limits of deep learning, was there a moment that you thought that maybe deep learning is limitless?
- FCFrançois Chollet
No, I don't think I've ever believed this. What was really shocking is that it, it worked, right?
- LFLex Fridman
It worked at all, yeah.
- FCFrançois Chollet
Yeah. But there- there's a, there's a big jump between being able to do really good computer vision and human-level intelligence. So, I- I don't think, at any point, I was under the impression that the results we got in computer vision meant that we were very close to human-level intelligence. I don't think we are very close to human-level intelligence. I do believe that there's no reason why we won't achieve it at some point. I also believe that... You know, it's... The problem with, with talking about human-level intelligence is that implicitly you're considering like an axis of intelligence with different levels.
- LFLex Fridman
Right.
- FCFrançois Chollet
But that's not really how intelligence works. Intelligence is very, uh, multidimensional. And so there's the question of... uh, capabilities, but there's, uh, also the question of being human-like and it's two very different things. Like, you can build potentially very, uh, uh, advanced intelligent agents that are not human-like at all, and you can also build very, uh, human-like agents. And these are very, two very different things, right?
- LFLex Fridman
Right. Let's go from the philosophical to the practical. Uh, can you give me a history of Keras and all the major deep learning frameworks that you kind of remember in relation to Keras and in general? TensorFlow, Theano, the old days. Can you give a brief overview, Wikipedia-style history and your role in it before we return to AGI discussions? (laughs)
- 30:59 – 35:18
Keras origin story and early deep learning frameworks (Caffe, Theano, Torch)
- FCFrançois Chollet
Yeah, that's, that's a broad topic. So I started working on Keras ... It wasn't named Keras at the time. I actually picked the name, like, uh, just the day I was gonna release it. So I started working on it in February 2015, and so at the time, there weren't too many people working on deep learning, maybe like fewer than 10,000. The software tooling was not really developed. So the main deep learning library was Caffe, which was mostly C++.
- LFLex Fridman
Why, why do you say Caffe was the main one?
- FCFrançois Chollet
Caffe was vastly more popular than Theano in, uh, in late 2014, early 2015. Caffe was the one library that everyone was using for computer vision.
- LFLex Fridman
And computer vision was the most popular problem-
- FCFrançois Chollet
Absolutely.
- LFLex Fridman
... in deep learning at the time.
- FCFrançois Chollet
Uh, com- Like, convnets was, like, the subfield of deep learning that everyone was working on.
- LFLex Fridman
Right.
- FCFrançois Chollet
So myself ... So in, in, in late 2014, I was actually interested in, uh, RNNs, in recurrent neural networks, which was a very niche topic at the time, right? It really, it really took off arou- around 2016. And so I was looking for good tools. I had l- I had used, uh, Torch 7. I had used Theano, used Theano a lot, uh, in, uh, Kaggle competitions. Hmm, I had used Caffe, and, uh, th- there was no, like, good solution for RNNs at the time. Like, there was no reusable open source implementation of an LSTM, for instance. So I decided to build my own, and at first, uh, the pitch for that was it was gonna be mostly around, uh, LSTM, recurrent neural networks. It was gonna be in Python. An important decision, uh, at the time that was kinda not obvious is that the models w- would be defined via, uh, Python code, which was kind of, like, going against, uh, the mainstream at the time because Caffe, Pylearn2 and so on, like, all the big libraries, were actually, uh, going with the approach of having static configuration files in YAML to define models.
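The design choice described here, defining models in code rather than in a static configuration file, can be sketched as follows. This is a minimal, dependency-free illustration, not Keras or Caffe code: `CONFIG`, `build_from_config`, `build_in_code` and `run` are hypothetical names, and the "layers" are plain functions on a single number.

```python
# Hypothetical toy example; none of these names come from a real library.

# Style 1: a static, declarative description (the kind of thing a config file holds).
CONFIG = {
    "layers": [
        {"type": "scale", "factor": 2.0},
        {"type": "shift", "offset": 1.0},
    ]
}


def build_from_config(config):
    """Turn the declarative description into a list of callables."""
    builders = {
        "scale": lambda spec: (lambda x: x * spec["factor"]),
        "shift": lambda spec: (lambda x: x + spec["offset"]),
    }
    return [builders[spec["type"]](spec) for spec in config["layers"]]


# Style 2: the model is ordinary code, so loops and conditionals are available.
def build_in_code(depth, use_shift=True):
    layers = []
    for _ in range(depth):
        layers.append(lambda x: x * 2.0)
    if use_shift:
        layers.append(lambda x: x + 1.0)
    return layers


def run(layers, x):
    """Apply each layer in order."""
    for layer in layers:
        x = layer(x)
    return x


config_model = build_from_config(CONFIG)
code_model = build_in_code(depth=1)
```

Both styles can describe the same model, but only the second can express "repeat this block `depth` times" or "add this layer conditionally" without inventing a mini-language inside the configuration format.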
- LFLex Fridman
Yeah.
- FCFrançois Chollet
So some libraries were using, uh, code to define models, like Torch 7 obviously, but that was not Python. Lasagne was, like, a Theano-based, uh, very early library that was, I think, developed, I'm not sure exactly, probably late 2014.
- LFLex Fridman
It's Python as well.
- FCFrançois Chollet
It's Python as well. It was, it was, like, on top of Theano. And so I started working on something and the, and the value proposition at the time was that not only did the, uh, what I think was the first reusable open source implementation of LSTM, you could combine RNNs and convnets with the same library, which was not really possible before. Like, Caffe was only doing convnets. And it was kinda easy to use because ... So before I was using Theano, I was actually using scikit-learn, and I loved scikit-learn for its usability, so, uh, I drew a lot of inspiration from scikit-learn when, when I made Keras. It's almost like scikit-learn for neural networks.
- LFLex Fridman
Yep. The fit function.
- FCFrançois Chollet
Exactly, the fit function. Like, reducing, uh, a complex training loop to a single function call, right?
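The idea of reducing a training loop to a single call can be illustrated with a toy model. This is a sketch in plain Python rather than actual Keras code: `TinyLinearModel` is a hypothetical class, and the default `epochs` and `lr` values are arbitrary choices that happen to converge on this small dataset.

```python
class TinyLinearModel:
    """Fits y = w * x + b with plain gradient descent (hypothetical toy class)."""

    def __init__(self):
        self.w = 0.0
        self.b = 0.0

    def predict(self, xs):
        return [self.w * x + self.b for x in xs]

    def fit(self, xs, ys, epochs=2000, lr=0.05):
        """The whole training loop lives behind this one call."""
        n = len(xs)
        history = []
        for _ in range(epochs):
            preds = self.predict(xs)
            errors = [p - y for p, y in zip(preds, ys)]
            # Mean squared error and its gradients with respect to w and b.
            loss = sum(e * e for e in errors) / n
            grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
            grad_b = 2.0 * sum(errors) / n
            self.w -= lr * grad_w
            self.b -= lr * grad_b
            history.append(loss)
        return history


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # generated by y = 2x + 1
model = TinyLinearModel()
history = model.fit(xs, ys)  # caller never sees the loop, the loss, or the gradients
```

The caller only writes `model.fit(xs, ys)`; the forward pass, loss, gradients and parameter updates are all hidden, which is the trade-off being discussed.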
- LFLex Fridman
Yeah.
- FCFrançois Chollet
And of course, you know, some people will say this is hiding a lot of details, but that's exactly the point, right?
- LFLex Fridman
Right.
- FCFrançois Chollet
The magic is the point.
- LFLex Fridman
Right.
- FCFrançois Chollet
So it's magical, but in a good way. It's magical in the sense that it's delightful, right?
- LFLex Fridman
Yeah, yeah. I'm, I'm actually quite surprised. I didn't know that it was born out of desire to, uh, implement RNNs and LSTMs.
- FCFrançois Chollet
It was. It was.
- LFLex Fridman
That's fascinating. So you were actually one of the first people to really try to attempt, um, to get the major architectures together, and it's also interesting, you made me realize that that was a design decision at all is defining the model in code. Just I'm, I'm putting myself in your shoes, whether the YAML, especially if Caffe was the most popular.
- FCFrançois Chollet
It was the most popular by far at the time.
- LFLex Fridman
If I was ... If I were ... Yeah, I don't ... It ... I didn't like the YAML thing, but it makes more sense that you would put in a configuration file the definition of a model. That's an interesting gutsy move to stick with defining it in code, just if, if you look back.
- FCFrançois Chollet
O- O- Other libraries were, were doing it as well, but it was definitely the more niche option.
- LFLex Fridman
Yeah. Okay, Keras and then-
- 35:18 – 39:47
Joining Google, porting to TensorFlow, and the path to tf.keras integration
- FCFrançois Chollet
Keras ... So I released Keras in March 2015, and it got users pretty much from the start. So the deep learning community was very, very small at the time. Uh, lots of people were starting to be interested in LSTM, so it was kinda released at, at the right time because it was offering an easy-to-use LSTM implementation exactly at the time where lots of people started to be, uh, intrigued by the capabilities of RNN, uh, RNNs for NLP, so it, it grew from there. Then, uh, I joined Google ba- about six months later, and that was actually completely unrelated to, to Keras.
- LFLex Fridman
(laughs)
- FCFrançois Chollet
I actually joined a research team working on, uh, image classification mostly, like computer vision. So I was doing computer vision research, like, at Google initially, and, uh, immediately when I joined Google, I was exposed to the early internal version of TensorFlow.
- LFLex Fridman
Okay.
- FCFrançois Chollet
And the way it appeared to me at the time, and it was definitely the way it was at the time, is that this was an improved version of Theano.
- LFLex Fridman
Mm-hmm.
- FCFrançois Chollet
So I immediately knew I had to port Keras to this new TensorFlow thing.
- LFLex Fridman
Mm-hmm.
- FCFrançois Chollet
And I- I was actually very busy as a, as a Noogler, as a new Googler.
- LFLex Fridman
Mm-hmm.
- FCFrançois Chollet
Uh, so I had no time to work on that. But then in November, I think it was November, uh, 2015, uh, TensorFlow, uh, got released and it was kind of like my- my wake-up call that, "Hey, I- I have to actually, you know, go and make it happen." So in December, I- I- I ported Keras to run on top of TensorFlow, but it was not exactly a port, it was more like a refactoring, where I was, uh, abstracting away all the backend functionality into one module, so that the same code base could run on top of multiple backends, right?
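As an illustration of the refactoring described here, the following is a minimal sketch of the idea of routing all low-level operations through one backend interface. It is not Keras code: `NaiveBackend`, `FsumBackend`, and `dense_unit` are hypothetical names invented for this example, and plain Python floats stand in for tensors.

```python
import math


class NaiveBackend:
    """Hypothetical backend: simple loop-based arithmetic."""
    name = "naive"

    def dot(self, a, b):
        total = 0.0
        for x, y in zip(a, b):
            total += x * y
        return total

    def sigmoid(self, x):
        return 1.0 / (1.0 + math.exp(-x))


class FsumBackend:
    """Hypothetical second backend: same operations, different implementation."""
    name = "fsum"

    def dot(self, a, b):
        return math.fsum(x * y for x, y in zip(a, b))

    def sigmoid(self, x):
        # Mathematically identical to 1 / (1 + exp(-x)).
        return 0.5 * (1.0 + math.tanh(0.5 * x))


def dense_unit(backend, inputs, weights, bias):
    """Library-level code: written once, it only talks to the backend object."""
    return backend.sigmoid(backend.dot(inputs, weights) + bias)


# The same library code runs unchanged on either backend.
outputs = {
    backend.name: dense_unit(backend, [1.0, 2.0], [0.5, -0.25], 0.1)
    for backend in (NaiveBackend(), FsumBackend())
}
```

The design point is that `dense_unit` never imports a numerical library directly, so swapping the backend does not require touching the layer code.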
- LFLex Fridman
Yeah.
- FCFrançois Chollet
So on- on top of TensorFlow or Theano. And for the next year, Theano, you know, stayed as the default option. It was, uh, you know, it was easier to use, uh, somewhat less buggy, uh, it was much faster especially when it came to
- LFLex Fridman
And TensorFlow, the early TensorFlow has similar architectural decisions as Theano, right?
- FCFrançois Chollet
Yeah.
- LFLex Fridman
So what- is there- it was a natural- it was a natural transition?
- FCFrançois Chollet
Yeah, absolutely.
- LFLex Fridman
So what, uh ... I mean, that's still Keras is a- is a side, almost fun project, right?
- FCFrançois Chollet
Yeah, so it, uh, it was not my- my job assignment. It was not. I was doing it on the side. So I'm, uh ... And even though it- it grew to have, you know, uh, a lot of users for a deep learning library at the time, like throughout 2016, but I wasn't doing it as my main job.
- LFLex Fridman
Right.
- FCFrançois Chollet
So things started changing in, I think it must have been maybe October 2016, so one year later. So Rajat, who was the lead in TensorFlow, basically showed up one day, uh, in- in- in our building-
- LFLex Fridman
Yeah.
- FCFrançois Chollet
... uh, where I was doing like ... So I was doing research and things alike, so I- I did a lot of computer vision research, um, also collaborations with, uh, Christian Szegedy on, uh, uh, deep learning for theorem proving, which was, um, uh, uh, a really interesting research topic.
- LFLex Fridman
Mm-hmm.
- FCFrançois Chollet
And so, uh, R- Rajat was saying, "Hey, uh, um, we saw Keras, we like it, we saw that you're at Google, why don't you come over for like a quarter and, uh, and- and work with us?" And I was like, "Yeah, that sounds like a great opportunity, let's do it." And so I started working on, uh, uh, integrating, uh, the Keras API into TensorFlow more tightly, um, so what followed up is, um, a sort of like temporary, uh, TensorFlow only version of Keras, uh, that was in, uh, tensorflow.contrib for a while, uh, and finally moved to TensorFlow Core and, you know, I- I've never actually gotten back to my old team doing research. (laughs)
- LFLex Fridman
(laughs) Well, it's- it's- it's kind of funny that somebody, uh, like you who dreams of ... or at least sees the- the power of AI systems that reason and theorem proving we'll talk about, has also created a system that makes the- the most basic kind of Lego building that is deep learning super accessible, super easy, so beautifully so. It's- it's a- it's a funny irony that you're b- there's- there's both-
- 39:47 – 43:07
TensorFlow 2.0: usability plus flexibility (eager execution, custom loops, workflow spectrum)
- LFLex Fridman
you're responsible for both things, but, uh, so TensorFlow 2.0 is kind of, there's a sprint, I don't know how long it'll take, but there's a sprint towards the finish. What do you look ... What are you, uh, working on these days? What are you excited about? What are you excited about in 2.0? I mean, eager execution, there's so many things that just make it a lot easier to- to work.
- FCFrançois Chollet
Yeah.
- LFLex Fridman
Uh, what are you excited about and what's also really hard? What are the problems you have to kind of solve?
- FCFrançois Chollet
So I've spent the past year and a half working on TensorFlow 2 and it's been a long journey. I'm ac- actually extremely excited about it. I think it's a great product. It's a delightful product compared to TensorFlow 1. We've made, uh, huge progress. So on the Keras side, uh, what I'm really excited about is that ... So, you know, previously, Keras has been this very, uh, easy-to-use, high-level interface to do deep learning, uh, but if you wanted to, you know, uh ... if you wanted a lot of flexibility, uh, the Keras framework, you know, was probably not the optimal way to do things compared to just writing everything from scratch.
- LFLex Fridman
Right.
- FCFrançois Chollet
So in some way, the framework was getting in the way, and in TensorFlow 2.0, you don't have this at all actually. You have the usability of the high-level interface, but you have the flexibility of this lower level interface, and you have this spectrum of workflows-
- LFLex Fridman
Mm-hmm.
- FCFrançois Chollet
... where you can get more or less usability and flexibility trade-offs depending on your needs, right? You can, uh, uh, write everything from scratch and you get a lot of help doing so, uh, by, you know, subclassing models and writing, uh, custom train loops using eager execution. It's very flexible, it's very easy to debug, it's very powerful. But all of this integrates seamlessly with higher level features-
- LFLex Fridman
Mm-hmm.
- FCFrançois Chollet
... up to, you know, the classic Keras workflows, which- which are very Scikit-learn-like and, uh, and, uh, uh, you know, ideal for a- a data scientist, machine learning engineer type of profile. So now you- you can have the same framework offering the same set of APIs that enable a spectrum of workflows that are more or less lower level, more or less high level, that are suitable for, you know, profiles ranging from, uh, researchers to data scientists and everything in between.
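The spectrum of workflows described above can be sketched without any framework. This is a toy under stated assumptions, not the Keras or TensorFlow API: `LinearModel`, `train_step`, and `fit` are illustrative names, and gradients are written by hand for a one-variable linear model. The same object supports a one-call `fit` workflow and a hand-written loop.

```python
class LinearModel:
    """Toy model y = w * x + b, trained with mean-squared error."""

    def __init__(self):
        self.w = 0.0
        self.b = 0.0

    def predict(self, x):
        return self.w * x + self.b

    def train_step(self, xs, ys, lr):
        """One gradient-descent update; returns the loss before the update."""
        n = len(xs)
        errors = [self.predict(x) - y for x, y in zip(xs, ys)]
        loss = sum(e * e for e in errors) / n
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        self.w -= lr * grad_w
        self.b -= lr * grad_b
        return loss

    def fit(self, xs, ys, epochs=500, lr=0.1):
        """High-level path: the training loop is hidden behind one call."""
        return [self.train_step(xs, ys, lr) for _ in range(epochs)]


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # generated by y = 2x + 1

# Workflow 1: the "just call fit" end of the spectrum.
easy = LinearModel()
history = easy.fit(xs, ys)

# Workflow 2: a custom loop over the same building block,
# here with a hand-rolled learning-rate schedule.
custom = LinearModel()
for step in range(500):
    lr = 0.1 if step < 250 else 0.05
    custom.train_step(xs, ys, lr)
```

Both paths reuse `train_step`, which is the point of the trade-off being described: the lower-level loop gives control over every step, while `fit` keeps the simple case simple.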
- LFLex Fridman
Yeah, so that's super exciting. I mean, it's not just that, it's connected to all kinds of tooling. You can go on mobile, you can go with TensorFlow Lite, you can go on the cloud with serving and so on, and all is connected together. Now some of the best software written ever is often done by one person, sometimes two. So with- at Google, you're now seeing sort of Keras having to be integrated, and TensorFlow I'm sure has a ton of engineers working on, so. And there's, I am sure, were a lot of tricky design decisions to be made. How does that process usually happen, from at least your perspective? What are the- what are the debates like? What are- is there a lot of thinking, considering different options and so on?
- 43:07 – 46:23
API design at scale: constraints, simplicity, and matching users’ mental models
- FCFrançois Chollet
Yes. So a- a lot of the time I spend at Google is actually, uh, discussing design decisions, right? Uh, writing design docs, participating in design review meetings, and so on. Uh, this is, you know, as important as actually writing the code.
- LFLex Fridman
Right. What's, uh-
- FCFrançois Chollet
So there's a lot of thoughts, there's a lot of thought and- and a lot of care that is, uh, taken in coming up with these decisions, and taking into account, uh, all of our users, because TensorFlow has this extremely diverse user base, right? It's not- it's not like just one user segment where everyone has the same needs.
- LFLex Fridman
Mm-hmm.
- FCFrançois Chollet
Uh, we have small scale production users, large scale production users, we have, uh- uh- uh, startups, we have researchers, um, you know, it's all over the place and we have- we have to cater to all of their needs.
- LFLex Fridman
If I just look at the standard, uh, debates of C++ or Python, there's some heated debates. Do you have those at Google? I mean, they're not heated in terms of emotionally, but there's probably multiple ways to do it, right? So, how do you arrive through those design meetings at the best way to do it? Especially in deep learning where the field is evolving as you're doing it. Is there some magic to it? Is there some magic to the process?
- FCFrançois Chollet
I don't know if there's magic to the process, but there definitely is a process. So, making design decisions is about satisfying a set of constraints, but also trying to do so in the simplest way possible, because this is, uh- uh, what can be maintained, this is what can be, uh, expanded in the future. So, you don't want to naively, uh, satisfy the- the constraints by just, you know, for each capability you need available, you're gonna come up with one argument in your API and so on.
- LFLex Fridman
Right.
- FCFrançois Chollet
You want to design APIs, um, that are, uh- uh, modular and hierarchical so that they- they- they have, uh, an- an API surface that is, uh, as small as possible, right?
- LFLex Fridman
Mm-hmm.
- FCFrançois Chollet
And- and you want, uh- uh, this modular hierarchical architecture to reflect the way that domain experts think about the problem. 'Cause like as- as a domain expert, when- when you're reading about a new API, you're reading a tutorial or- or some docs pages, um, you already have a way that you're thinking about the problem.
- LFLex Fridman
Right.
- FCFrançois Chollet
You already have like, uh- uh, certain concepts in mind, uh, and- and- and you're thinking about, uh, how they relate together, and when you're reading docs you're trying to build, uh, as quickly as possible a mapping between the concepts...
- LFLex Fridman
Mm-hmm.
- FCFrançois Chollet
... featured in new API and the concepts in your mind, so you're trying to map your mental model as a domain expert to the way things, uh, uh, work in the API.
- LFLex Fridman
Mm-hmm.
- FCFrançois Chollet
So, you need a- an API and an underlying implementation that are reflecting the way people think about these things.
- LFLex Fridman
So you're minimizing the time it takes to do the mapping?
- FCFrançois Chollet
Yes.
- LFLex Fridman
All right.
- FCFrançois Chollet
Minimizing the time, the cognitive load there is in ingesting this new knowledge about your API. An API should not be self-referential or- or ref- referring to implementation details; it should only be referring to domain-specific concepts that people already kn- uh, understand.
- 46:23 – 47:41
Beyond Keras: AutoML, hyperparameter tuning, and ‘model building itself’
- LFLex Fridman
Brilliant. So, what's the future of Keras and TensorFlow look like? What does TensorFlow 3.0 look like?
- FCFrançois Chollet
So, that's kind of too far in the future for me to answer, especially, uh- uh, since I'm not- I'm not even the one making these decisions.
- LFLex Fridman
Okay.
- FCFrançois Chollet
But, so from my perspective, which is, you know, just one perspective among many different perspectives on the TensorFlow team, I'm really excited by developing, uh, even higher level APIs. Higher level than Keras. I'm really excited by hyper-parameter tuning, by, uh, automated machine learning, AutoML. I think the future is not just, you know, defining a model like, uh- uh, like you were assembling Lego blocks and then
- NANarrator
(laughs)
- FCFrançois Chollet
... calling fit on it. It's more like an automagical model that will just look at your data and optimize the objective you- you're after, right? So that's- that's, uh- uh, what- uh, what I'm looking into.
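One small piece of the direction described here, hyperparameter tuning, can be sketched as a random search. This is a hedged toy, not any particular AutoML product: `train_and_score` and `random_search` are hypothetical helpers, the "model" is a single weight fitted by gradient descent, and the search ranges are arbitrary assumptions.

```python
import random


def train_and_score(lr, epochs):
    """Fit y = w * x by gradient descent and return the final training loss."""
    xs, ys = [1.0, 2.0, 3.0], [3.0, 6.0, 9.0]
    w = 0.0
    for _ in range(epochs):
        grad = sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def random_search(trials, seed=0):
    """Sample configurations at random and keep the one with the lowest loss."""
    rng = random.Random(seed)
    best = None
    for _ in range(trials):
        config = {
            "lr": 10 ** rng.uniform(-4, -1),  # log-uniform learning rate
            "epochs": rng.randint(5, 50),
        }
        loss = train_and_score(**config)
        if best is None or loss < best[0]:
            best = (loss, config)
    return best


best_loss, best_config = random_search(trials=30)
```

The user states the objective (lowest loss) and the search space; the loop, rather than the user, picks the configuration.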
- LFLex Fridman
Yeah, so you, uh, put the baby into a room with the problem and come back a few hours later, uh, with a f- with a fully solved problem.
- FCFrançois Chollet
Exactly. It's not like a box of Legos.
- LFLex Fridman
Right.
- FCFrançois Chollet
It's more like the combination of a kid that's really good at Legos...
- LFLex Fridman
(laughs) Yeah.
- FCFrançois Chollet
... and a box of Legos.
- LFLex Fridman
Yeah. Exactly.
- FCFrançois Chollet
And just building the thing on its own.
- 47:41 – 51:20
Limits of deep learning: interpolation, data hunger, and the need for abstract rules
- LFLex Fridman
Nice. Uh, very nice. So that's- that's an exciting future and I think there's a huge amount of applications and, uh, revolutions to be had, uh, under the constraints of the discussion we previously had. But what do you think are the current limits of deep learning? If we look specifically at these, uh, function approximators that try to generalize from data. So you've, uh, you've talked about local versus extreme generalization. You mentioned the neural networks don't generalize well and humans do, so there's this gap. So w- and you've also mentioned that ex- generalization, extreme generalization requires something like reasoning to fill those gaps. So how can we start trying to build systems like that?
- FCFrançois Chollet
Right. Yeah, so this is- this is by design, right? Deep learning models are like huge parametric models, differentiable, so continuous, uh, that go from an input space to an output space. And they're trained with gradient descent. So they are trained pretty much point by point.
- LFLex Fridman
Mm-hmm.
- FCFrançois Chollet
Uh, they are learning a continuous geometric morphing from, from an, uh, an input vector space to an output vector space. All right? And because this is done point by point, a deep neural network can only make sense of points in experience space that are very close to things that, that it has already seen in its training data. At best, it can do interpolation across points.
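The claim about staying close to the training points can be illustrated with a deliberately simple stand-in. The sketch below uses a nearest-neighbor lookup, not a neural network, as an assumed proxy for a learner that only knows its training points; the function `target` and the sampling range are arbitrary choices for the example.

```python
def nearest_neighbor_predict(train, x):
    """Return the target of the training input closest to x."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]


def target(x):
    """The function being learned (an arbitrary choice for this example)."""
    return x * x


# Dense sampling of the input space, but only on the interval [0, 1].
train = [(i / 100, target(i / 100)) for i in range(101)]

# A query between training points is handled well...
inside_error = abs(nearest_neighbor_predict(train, 0.505) - target(0.505))

# ...while a query far from anything seen in training is not.
outside_error = abs(nearest_neighbor_predict(train, 3.0) - target(3.0))
```

Inside the sampled interval the error is small; at `x = 3.0` the lookup can only return the value of the nearest training point, so the error is large.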
- LFLex Fridman
Mm-hmm.
- FCFrançois Chollet
But that means, you know, it means in order to train your network, you need a dense sampling of the input cross-output space, almost a point-by-point sampling, which can be very expensive if you're dealing with complex real-world problems like autonomous driving, for instance, or, or robotics. Uh, it's, it's doable if you're looking at the subset of the, of the visual space, but even then, it's still fairly expensive. You still need millions of examples. And it's only gonna be able to make sense of things that are very close to what it has seen before. And in contrast to that, well, of course, you have human intelligence, but even if you're not looking at human intelligence, um, you can look at very simple rules, algorithms. Uh, if you have a symbolic rule, it can actually, uh, apply to, um, a very, very large set of inputs because it is abstract. It is not obtained, um, uh, by doing a point-by-point mapping, right? Uh, for instance, if you try to learn a sorting algorithm using a deep neural network, well, you're very much limited to learning point by point what the sorted representation of this specific list-
- LFLex Fridman
Mm-hmm.
- FCFrançois Chollet
... is like, but instead, you could, uh, uh, have a very, very, uh, simple sorting algorithm written in a few lines. Maybe it's just, you know, uh, two nested loops. Um, and it can process any, uh, list at all because it is abstract, because it is a set of rules. So deep learning is really like point-by-point geometric morphings-
- LFLex Fridman
Mm-hmm.
- FCFrançois Chollet
... uh, morphings trained with gradient descent. And meanwhile, abstract rules can generalize, uh, uh, much better, and I think the future is really to combine the two.
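The "two nested loops" remark can be made concrete. The sketch below is bubble sort, chosen here as one example of such an algorithm; the transcript does not name a specific one.

```python
def sort_with_two_nested_loops(items):
    """Bubble sort: an abstract rule that works for any list of comparable items."""
    result = list(items)  # copy, so the input is left untouched
    for i in range(len(result)):
        # After pass i, the last i elements are already in their final place.
        for j in range(len(result) - 1 - i):
            if result[j] > result[j + 1]:
                result[j], result[j + 1] = result[j + 1], result[j]
    return result


numbers = sort_with_two_nested_loops([5, 2, 9, 1, 5])
words = sort_with_two_nested_loops(["pear", "apple", "fig"])
```

Nothing in the function refers to a particular list, length, or element type, which is the sense in which the rule is abstract rather than learned point by point.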
- LFLex Fridman
So how do we, do you think, combine the two? How do we combine good point-by-point functions with programs, which is what the symbolic AI-type systems?
- FCFrançois Chollet
Yeah.
- LFLex Fridman
At which levels does the combination happen? I mean, o- obviously, we're, we're jumping into the realm of where there's no good answers, it just kind of ideas and intuitions-
- FCFrançois Chollet
Yeah.
- LFLex Fridman
... and so on.
- 51:20 – 1:00:35
Hybrid AI and program synthesis: learning rules, search, and genetic programming prospects
- FCFrançois Chollet
Well, if you look at the really successful AI systems today, I think they are already hybrid systems that are combining symbolic AI with deep learning.
- LFLex Fridman
Mm-hmm.
- FCFrançois Chollet
For instance, successful robotics systems are already mostly model-based, rule-based-
- LFLex Fridman
Mm-hmm.
- FCFrançois Chollet
... uh, things like planning algorithms and so on. At the same time, uh, they're using deep learning as perception modules. Sometimes they're using deep learning as a way to inject a fuzzy intuition into a rule-based process. If you look at a system like in a self-driving car, it's not just one big end-to-end neural network, you know, that wouldn't work at all, precisely because in order to train that, you would need a dense sampling of, uh, experience space when it comes to driving, which is completely unrealistic, obviously. Uh, instead, the self-driving car, uh, is mostly symbolic, uh, you know, it's, it's, it's software, it's programmed by hand, so it's mostly, uh, uh, based on explicit models, in this case, mostly 3D models of the, of the environment around the car-
- LFLex Fridman
Mm-hmm.
- FCFrançois Chollet
... but it's interfacing with the real world using deep learning modules. Right?
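The division of labor described here, learned perception feeding a hand-written symbolic layer, can be sketched as follows. Everything in it is a placeholder assumption: `perceive` uses a threshold where a real system would use a trained network, and the rules in `plan` are invented for illustration, not taken from any actual driving stack.

```python
def perceive(pixel_brightness):
    """Stand-in for a learned perception module: raw numbers in, symbols out.

    A real system would run a trained network here; this threshold is a
    placeholder so the example runs without any ML library.
    """
    obstacle_score = sum(pixel_brightness) / len(pixel_brightness)
    return {"obstacle_ahead": obstacle_score > 0.5}


def plan(state, speed):
    """Hand-written symbolic rules operating on the perceived state."""
    if state["obstacle_ahead"]:
        return "brake"
    if speed < 30:
        return "accelerate"
    return "cruise"


action_near_obstacle = plan(perceive([0.9, 0.8, 0.7]), speed=20)
action_clear_slow = plan(perceive([0.1, 0.2]), speed=20)
action_clear_fast = plan(perceive([0.1]), speed=40)
```

Only `perceive` touches raw sensor values; `plan` sees a small symbolic state, so its rules can be read, tested, and changed independently of the perception model.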
- LFLex Fridman
Right. So the deep learning there serves as a way to convert the raw sensory information to something usable by symbolic systems. Okay, well, let's linger on that a little more. Uh, so dense sampling from input to output. You said it's obviously very difficult. Is it possible?
- FCFrançois Chollet
In the case of self-driving, you mean?
- LFLex Fridman
Uh, let's say self-driving, right? Self-driving for ma- for many people, let- let's not even ta- talk about self-driving, let's talk about steering.
- FCFrançois Chollet
Mm-hmm. Mm-hmm.
- LFLex Fridman
So staying inside the lane.
- FCFrançois Chollet
Lane following, yeah, it's definitely a problem you can solve with an end-to-end deep learning model, but that's like one small subset-
- LFLex Fridman
H- hold on a second. Yeah, I don't know-
- FCFrançois Chollet
... of self-driving.
- LFLex Fridman
... how you're jumping from the extreme so easily 'cause I disagree with you on that. Uh, I think, well, it's, it's not obvious to me that you can solve lane following.
- FCFrançois Chollet
It's, no, it's not, it's not obvious. I think it's doable. I think, in general, you know, there is no, uh, hard limitations to what you can learn with a deep neural network as long as, uh, th- the search space, like, um, is, is rich enough, is flexible enough, and as long as you have this dense sampling-
- LFLex Fridman
Yeah.
- FCFrançois Chollet
... of the input cross-output space. The problem is that, you know, this dense sampling could mean anything from 10,000 examples to like trillions and trillions.
- LFLex Fridman
So the- that's, that's my question. So what's your intuition? And if you could just give it a chance and think what kind of problems can be solved by getting a huge amounts of data and thereby creating, uh, a dense mapping. So let's think about natural language dia- dialogue, the Turing test. Do you think the Turing test can be solved with a neural network alone?
- FCFrançois Chollet
Well, the, the, the Turing test is all about tricking people into believing they're talking to a human, and I don't think that's actually very difficult because it's more about exploiting, uh, human perception and not so much about intelligence. There's a big difference between mimicking intelligent behavior and actual intelligent behavior.
- LFLex Fridman
So, okay, let's look at maybe the Alexa Prize and so on, the different formulations of the natural language conversation that are less about mimicking and more about maintaining a fun conversation that lasts for 20 minutes.
- FCFrançois Chollet
Mm-hmm. Mm-hmm.
- LFLex Fridman
That's a little less about mimicking, and that's more about... I mean, that's still mimicking, but it's more about being able to carry forward a conversation with all the tangents that happen, dialogue, and so on. Do you think that problem is learnable with this kind of, uh, with a neural network that does the point-to-point mapping?
- FCFrançois Chollet
So, I think it would be very, very challenging to do this with deep learning. I don't think it's out of the question either. Uh, I wouldn't rule it out.
- LFLex Fridman
Th- the space of problems that could be solved with a large neural network, what's your sense about the space of those problems?
- FCFrançois Chollet
So-
- LFLex Fridman
Useful problems for us.
- FCFrançois Chollet
In theory, it's, it's infinite, right? You can solve any problem. In practice... Well, deep learning is, is a great fit for, uh, perception problems. In general, any, any problem which is not naturally amenable to explicit handcrafted rules or, uh, uh, rules that you can generate by exhaustive search over some program space. So, perception, uh, artificial intuition, uh, as long as you have a, a sufficient training dataset.
- LFLex Fridman
And that's the qu- question, I mean, perception, there's interpretation and understanding of the scene-
- 1:00:35 – 1:08:38
Data, priors, and ‘The Bitter Lesson’: when compute stops being the bottleneck
- LFLex Fridman
Hmm. So let's talk a little about, about data. You've tweeted (laughs) -
- FCFrançois Chollet
(laughs)
- LFLex Fridman
... uh, about 10,000 deep learning papers have been written about hard coding priors about a specific task in a neural network architecture. It works better than a lack of a prior. Basically summarizing all these efforts, they- they put a name to an architecture, but really what they're doing is hard coding some priors that improve the performance of the system.
- FCFrançois Chollet
Yes, yes.
- LFLex Fridman
But we're... (laughs) Uh, get straight to the point is- is probably true, so you say that you can always buy performance, "buy" in quotes, performance by either training on more data, better data, or by injecting task information-
- FCFrançois Chollet
Yeah.
- LFLex Fridman
... to the architecture as a pre-processing. Uh, however, this isn't informative about the generalization power of the techniques used, the fundamental ability to generalize. Do you think we can go far by coming up with better methods for this kind of cheating, for better methods of large-scale annotation of data, so building better priors?
- FCFrançois Chollet
If you- if you'd have made it, it's not cheating anymore.
- LFLex Fridman
Right. Uh, um, I'm joking-
- FCFrançois Chollet
(laughs)
- LFLex Fridman
... about the cheating, but large scale... So basically, I'm asking, um, about something that hasn't, uh, from my perspective, been researched too- too much is, uh, exponential improvement in annotation of data.
- FCFrançois Chollet
Yeah.
- LFLex Fridman
Do you- do you often think about that?
- FCFrançois Chollet
I think it's- it's actually been- been researched quite a bit, you just don't see publications about it because, you know, people who publish papers are gonna publish about known benchmarks.
- LFLex Fridman
Right.
- FCFrançois Chollet
Sometimes they're going to release a new benchmark.
- LFLex Fridman
Right.
- FCFrançois Chollet
People who actually have real-world large-scale-
- LFLex Fridman
Rules, yeah.
- FCFrançois Chollet
... deep learning problems, they're gonna spend a lot of resources into data annotation and good data annotation pipelines, but you don't see any papers about it.
- LFLex Fridman
That's interesting. So do you think there are certainly resources, but do you think there's innovation happening?
- FCFrançois Chollet
Oh, yeah, definitely.
- LFLex Fridman
... as
- FCFrançois Chollet
To clarify, uh, uh, the point in the tweet, so machine learning in general is the science of generalization. You want to generate knowledge that can be reused across different datasets, across different tasks.
- LFLex Fridman
Right.
- FCFrançois Chollet
And if instead you're looking at one dataset and then you are hard coding, uh, knowledge about this task into your architecture, this is no more useful than training a network and then saying, "Oh, I found these weight values, uh, uh, perform well." Right?
- LFLex Fridman
Right.
- FCFrançois Chollet
So, uh, uh, David Ha, I don't know if- if you know w- uh, uh, David, he had a paper the other day about, uh, weight agnostic neural networks.
- LFLex Fridman
(laughs)
- FCFrançois Chollet
And this was- this was very interesting paper because it really illustrates the fact that an architecture, even without weights, an architecture is a- a knowledge about a task, it encodes knowledge. And when it comes to, uh, architectures that are hand crafted by researchers, they're... in some cases, it is- it is very, very clear that all they are doing is, uh, artificially, um, uh, re-encoding the template that corresponds to the- the proper way to solve a- a- a- a task encoded in any given dataset. For instance, um, I don't know if- if- if you've looked at the, uh, uh, bAbI dataset, which is about, uh, natural language question answering, it is generated by an- by an algorithm. So this is, uh, question-answer pairs that are generated by an algorithm. The algorithm is following a certain template. Turns out if you craft a network that literally encodes this template, you can solve this dataset with nearly 100% accuracy.
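The point about template-generated data can be shown with a toy. The generator below is only in the spirit of such datasets; it is not the actual bAbI format, and `generate_story` and `template_solver` are hypothetical names. The solver is not a neural network: it simply hard-codes the same template the generator uses, which is the kind of prior being criticized.

```python
import random

PEOPLE = ["Mary", "John", "Sandra"]
PLACES = ["kitchen", "garden", "office"]


def generate_story(rng, length=4):
    """Generate statements from a fixed template, plus one question and answer."""
    location = {}
    statements = []
    for _ in range(length):
        person, place = rng.choice(PEOPLE), rng.choice(PLACES)
        statements.append(f"{person} went to the {place}.")
        location[person] = place
    asked = rng.choice(sorted(location))
    return statements, f"Where is {asked}?", location[asked]


def template_solver(statements, question):
    """Hard-codes the generator's template: track the last place each person went."""
    location = {}
    for sentence in statements:
        words = sentence.rstrip(".").split()
        location[words[0]] = words[-1]
    return location[question.rstrip("?").split()[-1]]


rng = random.Random(0)
stories = [generate_story(rng) for _ in range(200)]
accuracy = sum(template_solver(s, q) == a for s, q, a in stories) / len(stories)
```

The perfect score says nothing about language understanding: the solver works only because it mirrors the generating procedure, and it would fail on any sentence outside the template.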
- 1:08:38 – 1:16:28
Near-term AI risks: surveillance, recommender systems, and mass behavioral manipulation
- LFLex Fridman
Do you have concerns about short-term or long-term threats from AI, from artificial intelligence?
- FCFrançois Chollet
Yes, definitely to some extent. This-
- LFLex Fridman
And what's the shape of those concerns?
- FCFrançois Chollet
Thi- this is actually something I've, uh, I've briefly, uh, uh, written about, but the, the capabilities of deep learning technology can be used in many ways that are concerning from, uh, you know, mass surveillance with things like facial recognition, uh, in general, you know, tracking lots of data about everyone and then being able to make sense of this data-
- LFLex Fridman
Mm-hmm.
- FCFrançois Chollet
... to do identification, to do prediction. That's concerning, that's something that's, uh, being very aggressively, uh, pursued by totalitarian states like, you know, China. One thing I am, I am very much concerned about is that, you know, our lives, uh, uh, uh, are increasingly, uh, online, are increasingly digital, made of information, made of, uh, information consumption and information production-
- LFLex Fridman
Mm-hmm.
- FCFrançois Chollet
... uh, our, our, our digital, um, footprint, I would say. And if you absorb all of this data and, uh, and you are in control of, uh, where you consume information, you know, social networks and so on, uh, recommendation engines, uh, then you can build a sort of reinforcement loop, uh, for human behavior. You can observe the state of your mind at time T, uh, you can predict how you would react to different pieces of content, how to get you to move your mind, uh, you know, in a certain direction and then you, then, uh, you can feed, can feed you, uh, uh, the specific piece of content that would move you in a, in a specific direction. And you can do this at scale, you know, uh, uh, uh, at scale in terms of, uh, uh, doing it continuously in real time, you can also do it at scale in terms of, uh, scaling this to many, many people, to entire populations. So, uh, potentially, artificial intelligence even in its current state, if you combine it with the internet, with the fact that we have, uh, all of our lives are, are moving to, uh, digital devices and digital information consumption and creation, um, what you get is the possibility to, to, to achieve mass manipulation of behavior and mass, uh, mass psy- psychological control. And this is a very real possibility.
- LFLex Fridman
Yeah. So you're talking about any kind of recommender system.
- FCFrançois Chollet
Yeah.
- LFLex Fridman
Or let's look at the YouTube algorithm, Facebook, anything that recommends content you should watch next.
- FCFrançois Chollet
Yeah.
- LFLex Fridman
And it's fascinating to think that there's some aspects of human behavior that you can, you know, um, say a problem of, does this person hold Republican beliefs or Democratic beliefs? And it's just, uh, trivial, uh, that's an objective function, and you can optimize and you can measure, and you can turn everybody into a Republican or everybody into a Democrat.
- FCFrançois Chollet
Yeah, absolutely. Yeah.
- LFLex Fridman
And that's what-
- FCFrançois Chollet
I do believe it's true, b- so the human mind is, is very... If, if, if you look at the human mind as a kind of computer program, it has a very large, uh, exploit surface, right? It has many, many v- vulnerabilities.
- LFLex Fridman
Exploit surfaces, yeah.
- FCFrançois Chollet
Ways, ways you can, uh, um, uh, control it. For instance, when it comes to your political beliefs-
- LFLex Fridman
Yes.
- FCFrançois Chollet
... this is very much tied to your identity. So, for instance, if I'm in control of, uh, your newsfeed on your favorite social media platforms-
- LFLex Fridman
Mm-hmm.
- FCFrançois Chollet
... uh, this is actually where, uh, you're getting your news from. And I can, of course, I can, I can choose to only, uh, show you news that will, uh, make you see the world in this, in a specific way, right?
- LFLex Fridman
Mm-hmm.
- FCFrançois Chollet
But I can also, um, uh, you know, uh, create incentives for you to, to, to post about some political beliefs. And then when I, when I get you to express a statement, if it's a statement that, uh, me as the, as the controller, I, I want to, I want to reinforce, I can just show it to people who will agree and they will like it, and that will reinforce the statement in your mind. If this is a statement I want you to, uh, this is a belief I, I want you to abandon, I can, uh, uh, on the other hand, show it to, uh, opponents, right, who will attack you. And then because they attack you, at the very least, next time, you will think twice, uh, uh, about posting it, but maybe you will even, uh, uh, you know, stop believing this because you, you got pushback, right? So, there, there are many ways in which, uh, um, social media platforms can potentially control your opinions, and today, uh, the... So all of these things are already being, uh, controlled by AI algorithms. These algorithms do not have a- any explicit political goal today.
- LFLex Fridman
Mm-hmm.
- FCFrançois Chollet
While potentially they could. Like, uh, if some totalitarian government takes over, you know, social media, uh, platforms and decides that, you know, now we are gonna use this not just for mass surveillance but also for mass opinion control and behavior control-
- LFLex Fridman
Mm-hmm.
- FCFrançois Chollet
... you know, very bad things could happen. Um, but what's really, uh, uh, fascinating and, and actually quite concerning is that even without an explicit intent to manipulate, you're already seeing very dangerous dynamics in terms of, uh, how these content recommendation, uh, algorithms behave because right now, uh, the, the goal, the objective function of these algorithms-
- LFLex Fridman
Mm-hmm.
- FCFrançois Chollet
... is to maximize engagement, right, which seems very, uh, uh, innocuous at first, right? Uh, however, uh, it is not because content that will, uh, maximally engage people, you know, uh, get people to, to react in an emotional way, get people to click on something, it is very often, um, uh, content that, uh, you know, is not, um, uh, healthy to, to the public discourse. For instance, um, fake news are far more likely to get you to click on them than real news, simply because they are not, um, constrained to reality, so they can be as atrocious, as, as, as, as surprising, as, as good stories as you want because they're artificial, right?
- 1:16:28 – 1:29:11
Giving users control: objective functions, interface design, and ‘loss function engineering’
- LFLex Fridman
Is there, is it even possible to have that kind of philosophical discussion?
- FCFrançois Chollet
Um, I think you can definitely try. So from my perspective, I would feel rather uncomfortable with, uh, companies that are in control of these, uh, newsfeed algorithms, uh, with them making explicit decisions to manipulate, uh, uh, people's opinions or behaviors, even if the intent is good, because that's- that's a very totalitarian mindset. So instead, what I would like to see, and it's probably never gonna happen because it's- it's not super realistic, but that's actually something I really care about, I would like, uh, all these algorithms, uh, to present, uh, configuration settings to their users so that the users-
- LFLex Fridman
Ah, yeah.
- FCFrançois Chollet
... can actually make the decision about how they want to be impacted, uh, by these, uh, uh, information recommendation, content recommendation algorithms. For instance, as a user of something like YouTube or Twitter-
- LFLex Fridman
Mm-hmm.
- FCFrançois Chollet
... maybe I want to maximize learning-
- LFLex Fridman
Learning.
- FCFrançois Chollet
... about a specific topic, right? So I want the, the algorithm, um, to, uh, uh, feed my curiosity, right, which is in itself a very interesting problem. So instead of maximizing my engagement, it will maximize how fast and how much I'm learning, and it will also take into account the accuracy, uh, hopefully, you know, of the information I'm learning. So yeah, uh, the user should be able to determine exactly how these algorithms are affecting their lives. I, I don't want actually any entity, uh, making decisions about, uh, in which direction they're gonna, uh, uh, try to manipulate me, right? I want, uh, I want technology. So AI, these algorithms are increasingly gonna be our interface to a world that is increasingly made of information.
- LFLex Fridman
Right.
- FCFrançois Chollet
And I want... I want everyone to be in control of this interface, to interface with the world on their own terms. So if someone wants, uh, these algorithms to serve, you know, their own personal growth goals, they should be able to configure these algorithms in such a way.
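[Editor's illustration, not part of the spoken conversation.] The idea of user-configurable objectives described above can be sketched as a re-ranking step whose weights are chosen by the user rather than fixed by the platform. This is a minimal sketch under stated assumptions: the `Item` fields, the `UserObjective` weights, and all scores are hypothetical and do not reflect how YouTube, Twitter, or any real recommender works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """A candidate piece of content with hypothetical model scores in [0, 1]."""
    title: str
    predicted_engagement: float
    predicted_learning: float
    predicted_accuracy: float


@dataclass(frozen=True)
class UserObjective:
    """Weights chosen by the user (hypothetical configuration settings)."""
    engagement: float = 1.0  # default mimics an engagement-only objective
    learning: float = 0.0
    accuracy: float = 0.0


def score(item: Item, objective: UserObjective) -> float:
    """Weighted sum of the predicted scores under the user's objective."""
    return (
        objective.engagement * item.predicted_engagement
        + objective.learning * item.predicted_learning
        + objective.accuracy * item.predicted_accuracy
    )


def rank(items: list[Item], objective: UserObjective) -> list[Item]:
    """Order candidates from highest to lowest score."""
    return sorted(items, key=lambda it: score(it, objective), reverse=True)


# Made-up candidates and scores, for illustration only.
candidates = [
    Item("Shocking rumor", 0.9, 0.1, 0.2),
    Item("Intro lecture", 0.4, 0.9, 0.9),
    Item("Cat compilation", 0.7, 0.0, 0.5),
]

engagement_first = rank(candidates, UserObjective())
learning_first = rank(
    candidates, UserObjective(engagement=0.0, learning=1.0, accuracy=0.5)
)

print([it.title for it in engagement_first])
print([it.title for it in learning_first])
```

The same candidates come out in a different order depending only on the weights the user picks, which is the sense in which the user, rather than the platform, would control the objective.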
- LFLex Fridman
Yeah, but so I know it's painful to have explicit decisions, but there is underlying explicit decisions, which is some of the most beautiful fundamental philosophy that, uh, that we have before us, which is personal growth. If I want to watch videos from which I can learn, what does that mean? So if I have a checkbox that wants to emphasize learning... there's still an algorithm with explicit decisions in it that would promote learning. What does that mean for me? Like for example, I've watched a documentary on, um, flat earth theory, I guess. It- it was very like, th- I learned a lot. I- I'm really glad I watched it. It was... a friend recommended it to me. Not (laughs) 'cause I don't have s- such an allergic reaction to c- to crazy people as my fellow colleagues do, but it was very wi- it was very eye-opening and for others it might not be. For others, they- they might just get, uh, turned off with that.
- FCFrançois Chollet
Mm-hmm.
- LFLex Fridman
Same with Republican-Democrat, like what ... It's a non-trivial problem. And f- first of all, if it's done well, I don't think it's something that wouldn't happen, that, uh, that YouTube wouldn't be promoting or Twitter wouldn't be. It's just a really difficult problem, how do we do, how to give people control?
- FCFrançois Chollet
Well, it's mostly an- an interface design problem.
- LFLex Fridman
Right. (laughs)
- FCFrançois Chollet
The- the way I see it, you want to create technology that's like, um, a mentor or a coach-
- LFLex Fridman
Right.
- FCFrançois Chollet
... or an assistant.
- LFLex Fridman
Right.
- FCFrançois Chollet
So that it's not your boss, right? You are in- in control of it.
- LFLex Fridman
Right.
- FCFrançois Chollet
You are telling it what to do for you. And if you feel like it's manipulating you, it's not actually, it's not actually doing what you want, you should be able to switch to a different algorithm, you know?
- LFLex Fridman
Right. So that fine-tune control, and you kinda learn, you're trusting the human collaboration. I mean that's how I see autonomous vehicles too, is giving as much information as possible and you learn that dance yourself.
- FCFrançois Chollet
Mm-hmm.
- LFLex Fridman
Yeah, um, Adobe, I don't know if you use Adobe product for like photos-
- FCFrançois Chollet
Yeah, I use Photoshop.
- LFLex Fridman
Yeah. They're trying to see if they can inject YouTube into their interface, but basically allow you to show you all these videos that ... 'Cause everybody's confused about what to do, uh, with features, so basically teach people by linking to ... In that way it's an assistant-
- FCFrançois Chollet
Mm-hmm. Mm-hmm.
- LFLex Fridman
... that shows, uses videos as a basic element of information.
- FCFrançois Chollet
Yeah.
- 1:29:11 – 1:40:24
AGI, consciousness, embodiment, and a benchmark for intelligence via generalization efficiency
- LFLex Fridman
Does the possibility of creating an AGI system excite you or scare you or bore you?
- FCFrançois Chollet
So intelligence can never really be general, you know? At best it can have some degree of generality, like human intelligence. It also always has some specialization in the same way that human intelligence is specialized in a certain category of problems, is specialized in the human experience. And when people talk about AGI, I'm never quite sure if they're talking about very, very smart AI, so smart that it's even smarter than humans or they're talking about human-like intelligence because it- these are different things.
- LFLex Fridman
Let's say presumably I'm impressing you today with my humanness. So imagine that I was in fact a robot. So what does that mean? Uh, I'm impressing you with natural language processing. Maybe if you weren't able to see me, maybe this is a phone call.
- FCFrançois Chollet
Yeah.
- LFLex Fridman
So that kind of system.
- FCFrançois Chollet
Okay. So-
- LFLex Fridman
Companion.
- FCFrançois Chollet
So that- that's very much about building human-like AI and you're asking me, you know, is this- is this an exciting perspective?
- LFLex Fridman
Yes.
- FCFrançois Chollet
I think so, yes. Not so much because of- of- of what, uh, uh, artificial human-like intelligence could do but, you know, from an intellectual perspective, I think if you could build truly human-like intelligence, that means you could actually understand human intelligence, which is fascinating, right? Uh, human-like intelligence is gonna require emotions, it's gonna require consciousness, which are not things that- that would normally be required by an intelligent, uh, system. If you look at, you know, we were mentioning earlier, like science as- as super- superhuman problem-solving, uh, uh, um, agent or system, it does not have consciousness, it doesn't have emotions. In general, so emotions, I see consciousness as being on the same spectrum as emotions. It is, uh, a component of the subjective experience that is meant very much to, uh, guide, uh, behavior generation, right?
- LFLex Fridman
Mm-hmm.
- FCFrançois Chollet
It's meant to guide your behavior. Uh, in general, um, human intelligence and animal intelligence, uh, has evolved for the purpose of behavior generation, right?
- LFLex Fridman
Mm-hmm.
- FCFrançois Chollet
Uh, including in a social context, so that's why we actually need emotions. That's why we need consciousness... An artificial intelligence system developed in a different context may well never need them, may well- may well never be conscious, like science.
Episode duration: 1:59:49
Transcript of episode Bo8MY4JpiXE