Lex Fridman Podcast
Andrej Karpathy: Tesla AI, Self-Driving, Optimus, Aliens, and AGI | Lex Fridman Podcast #333
- 0:00 – 0:58
Introduction
- AKAndrej Karpathy
... think it's possible that physics has exploits and we should be trying to find them, uh, arranging some kind of a crazy quantum mechanical system that somehow gives you buffer overflow, uh, somehow gives you a rounding error in the floating point. Synthetic intelligences are kind of like the next stage of development. And I don't know where it leads to. Like, at some point, I suspect the universe is some kind of a puzzle. These synthetic AIs will uncover that puzzle and solve it.
- LFLex Fridman
The following is a conversation with Andrej Karpathy, previously the director of AI at Tesla, and before that, at OpenAI and Stanford. He is one of the greatest scientist-engineers and educators in the history of artificial intelligence. This is the Lex Fridman podcast. To support it, please check out our sponsors. And now, dear friends, here's Andrej Karpathy.
- 0:58 – 6:01
Neural networks
- LFLex Fridman
What is a neural network and why does it seem to, uh, do s- such a surprisingly good job of learning?
- AKAndrej Karpathy
What is a neural network? It's a mathematical abstraction of the brain. I would say that's how it was originally developed. At the end of the day, it's a mathematical expression and it's a fairly simple mathematical expression when you get down to it. It's basically a sequence of, uh, matrix multiplies, which are really dot products mathematically and, uh, some non-linearities thrown in. And so it's a very simple mathematical expression and it's got knobs in it.
- LFLex Fridman
Many knobs.
- AKAndrej Karpathy
Many knobs. And these knobs are loosely related to basically the synapses in your brain. They're trainable, they're modifiable. And so the idea is, like, we need to find the setting of the knobs that makes the neural net, uh, do whatever you want it to do, like classify images and so on. And so there's not too much mystery I would say in it. Like, um, you might think that basically don't want to endow it with too much meaning with respect to the brain and, uh, how it works. Uh, it's really just a complicated mathematical expression with knobs and those knobs need a proper setting, uh, for it to do something, uh, desirable.
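The "sequence of matrix multiplies with some non-linearities and knobs" description can be sketched in a few lines of plain Python. This is only an illustration, not any particular network discussed in the conversation: the layer sizes, the tanh non-linearity, the random initialization, and the helper names (`matvec`, `tanh_layer`, `init_layer`, `forward`) are all assumptions made for the example.

```python
import math
import random

def matvec(W, x):
    # A matrix multiply is a stack of dot products: one per row of W.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def tanh_layer(W, b, x):
    # Matrix multiply, add a bias, then apply a non-linearity (tanh here).
    return [math.tanh(z + bi) for z, bi in zip(matvec(W, x), b)]

def init_layer(n_out, n_in, rng):
    # The "knobs": weights start random, biases start at zero (assumed scheme).
    W = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    b = [0.0] * n_out
    return W, b

def forward(layers, x):
    # The whole network is just the layers applied one after another.
    for W, b in layers:
        x = tanh_layer(W, b, x)
    return x

rng = random.Random(0)
layers = [init_layer(4, 3, rng), init_layer(2, 4, rng)]  # 3 inputs -> 4 -> 2 outputs
output = forward(layers, [0.5, -1.0, 2.0])

# Count the knobs: every weight and every bias is one trainable number.
n_knobs = sum(len(W) * len(W[0]) + len(b) for W, b in layers)
```

Training, which is not shown here, would amount to searching for the setting of those `n_knobs` numbers that makes `output` match what you want.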
- LFLex Fridman
Yeah, but poetry is just a collection of letters with spaces-
- AKAndrej Karpathy
(laughs) .
- LFLex Fridman
... but it can make us feel a certain way.
- AKAndrej Karpathy
Yeah.
- LFLex Fridman
And in that same way when you get a large number of knobs together, whe- whether it's in a, inside the brain or inside a computer, they seem to, they seem to surprise us with the, with their power.
- AKAndrej Karpathy
Yeah. I think that's fair. So basically, uh, I'm underselling it by a lot because-
- LFLex Fridman
Yes (laughs) .
- AKAndrej Karpathy
... you definitely do get very surprising emergent behaviors out of these neural nets when they're large enough and trained on complicated enough problems. Like say, for example, the next, uh, word prediction in a massive data set from the internet. And, uh, then these neural nets take on, uh, pretty surprising magical properties. Yeah, I think it's kind of interesting how much you can get out of even very simple mathematical formalism.
- LFLex Fridman
When your brain right now is talking, is it doing next word prediction?
- AKAndrej Karpathy
Um-
- LFLex Fridman
Or is it doing something more interesting?
- AKAndrej Karpathy
Well, it's definitely some kind of a generative model that's GPT-like and prompted by you. Um-
- LFLex Fridman
Yes.
- AKAndrej Karpathy
... so you're giving me a prompt and (laughs) -
- LFLex Fridman
And-
- AKAndrej Karpathy
... I'm kind of, like, responding to it in a generative way.
- LFLex Fridman
... and by yourself perhaps a little bit? Like, a- are you adding extra prompts from your own memory inside your head?
- AKAndrej Karpathy
Mm-hmm.
- LFLex Fridman
Or no?
- AKAndrej Karpathy
Well, it definitely feels like you're referencing some kind of a declarative structure of, like, memory and so on. And then, uh, you're putting that together with your prompt and giving away some answers.
- LFLex Fridman
Like, how much of what you just said has been said by you before?
- AKAndrej Karpathy
Uh, nothing basically, right?
- LFLex Fridman
No, but if you actually look at all the words you've ever said in your life and you do a search, you probably said a lot of the same words in the same order before.
- AKAndrej Karpathy
Yeah. Could be. I mean, I'm using phrases that are common, et cetera, but I'm remixing it into a pretty, uh, sort of unique sentence at the end of the day. But you're right, definitely, there's, like, a ton of remixing.
- LFLex Fridman
Why... You didn't... You (laughs) ... It's like Magnus Carlsen said, uh, "I'm, I'm rated 2,900 whatever, which is pretty decent."
- AKAndrej Karpathy
(laughs) .
- 6:01 – 11:32
Biology
- LFLex Fridman
impressive thing is biology do- doing to you that computers are not yet? That gap.
- AKAndrej Karpathy
I would say I'm definitely on- I'm much more hesitant with the analogies to the brain than I think you would see, potentially, in the field. Um, and I kind of feel like certainly the way neural networks started is everything stemmed from inspiration of the br- by the brain but at the end of the day, the artifacts that you get after training, uh, they are arrived at by a very different optimization process than the optimization process that gave rise to the brain. And so I think, uh, I kind of think of it as a very complicated alien artifact. Um, it's something different.
- LFLex Fridman
The brain?
- AKAndrej Karpathy
I'm sorry, the, uh, the neural nets that we're training.
- LFLex Fridman
Okay.
- AKAndrej Karpathy
They are a complicated, uh, alien artifact. Uh, I do not make analogies to the brain because I think the optimization process that gave rise to it is very different from the brain. So there was no multi-agent self-play kind of, uh, setup, uh, and evolution.
- LFLex Fridman
(laughs) .
- AKAndrej Karpathy
It was an optimization that is basically a, what amounts to a-
- LFLex Fridman
Yeah.
- AKAndrej Karpathy
... compression objective on a massive amount of data.
- LFLex Fridman
Okay. So artificial neural networks are doing compression and biological neural networks are not-
- AKAndrej Karpathy
Are trying to survive and they're-
- LFLex Fridman
... are not really doing any, they're-
- AKAndrej Karpathy
(laughs) .
- LFLex Fridman
... they're an agent in a multi-agent, self-play system that's been running for a very, very long time.
- AKAndrej Karpathy
Yes. That said, evolution has found that it is very useful to, to predict and have a predictive model in the brain. And so I think our brain utilizes something that looks like that as, as a part of it but it has a lot more, you know, gadgets and gizmos and, uh, value functions and ancient nuclei that are all trying to, like, make you survive, and reproduce, and everything else.
- LFLex Fridman
And the whole thing through embryogenesis is built from a single cell. I mean it's just, the code is inside the DNA.
- AKAndrej Karpathy
Mm-hmm.
- LFLex Fridman
And it just builds it up, like the entire organism with arms.
- AKAndrej Karpathy
It's totally crazy.
- LFLex Fridman
And a head and legs.
- AKAndrej Karpathy
Yes.
- LFLex Fridman
And, like, it does it pretty well.
- AKAndrej Karpathy
It should not be possible.
- LFLex Fridman
So there's some learning going on, there's some, there's some, there's some kind of computation going through that building process. I mean, I, I, I don't know where, if you were just to look at the entirety of history of life on Earth, where do you think is the most interesting invention? Is it the origin of life itself? Is it just jumping to eukaryotes? Is it mammals? Is it humans themselves, homo sapiens?
- AKAndrej Karpathy
Mm-hmm.
- LFLex Fridman
The, the, the origin of intelligence or highly complex intelligence? Or, what do, what, or is it all just a continuation of the same kind of process?
- AKAndrej Karpathy
Hmm. Certainly, I would say it's an extremely remarkable story that I'm only, like, briefly learning about recently. Uh, all the way from, um, actually, like, you almost have to start at the formation of Earth and all of its conditions and the entire solar system and how everything is arranged with Jupiter, and Moon, and the habitable zone, and everything. And then you have an active Earth that's turning over material.
- LFLex Fridman
Mm-hmm.
- AKAndrej Karpathy
And, um, and then you start with abiogenesis and everything. And so it's all, like, a pretty remarkable story. I'm not sure that I can pick, like, a u- single unique piece of it, uh, that I find most interesting. Um, I guess for me as an artificial intelligence researcher, it's probably the last piece. We have lots of animals that, uh, you know, are, are not building technological society but we do. And, um, it seems to have happened very quickly, it seems to have happened very recently, and, uh, something very interesting happened there that I don't fully understand. I almost understand everything else, kind of, I think, intuitively, uh, but I don't understand exactly that part and how quick it was.
- 11:32 – 21:43
Aliens
- LFLex Fridman
Gotta ask you, how many intelligent alien civilizations do you think are out there and, uh, is their intelligence different or similar to ours?
- AKAndrej Karpathy
Yeah. I've been preoccupied with this question quite a bit recently. Uh, basically, the firm- Fermi paradox and just thinking through. And, and the reason actually that I am very interested in, uh, the origin of life is fundamentally trying to understand how common it is that there are technological societies out there, uh, um, in space. And the more I study it, the more I s- I think that, um... Uh, there should be quite a few, quite a lot.
- LFLex Fridman
Why haven't we heard from them? 'Cause I, I agree with you. It feels like I just don't see why what we did here on Earth is so difficult to do.
- AKAndrej Karpathy
Yeah, and especially when you get into the details of it. I used to think origin of life was very, um, it, it was this magical rare event but then you read books, like for example, Nick Lane, um, uh, The Vital Question, uh, Life Ascending, et cetera, and he really gets in and he really makes you believe that this is not that rare.
- LFLex Fridman
Basic chemistry.
- AKAndrej Karpathy
You have an active Earth and you have your alkaline vents and you have lots of alkaline waters, uh, mixing with the ocean and you have your proton gradients and you have the little porous pockets of these alkaline vents that concentrate chemistry. And, um, basically as he steps through all of these little pieces you start to understand that actually this is not that crazy. You could see this happen on other systems. Um, and he really takes you from just a geology to primitive life and he makes it feel like it's actually pretty plausible. And also, like, uh, the origin of life, um, didn't, uh, was actually fairly fast after formation of Earth. Um, if I remember correctly, just a few hundred million years or something like that after basically when it was possible, life actually arose. And so that makes me feel like that is not the constraint, uh, that is not the limiting variable and that life should actually be fairly common. Um, and then, you know, where the drop-offs are is, is very, um, is very interesting to think about. I c- currently think that there's no major drop-offs basically.
- LFLex Fridman
Yeah.
- AKAndrej Karpathy
And so there should be quite a lot of life. And basically, what it, uh, where that brings me to then is the only way to reconcile the fact that we haven't found anyone and so on is that, um, we just can't, we can't see them, we can't observe them.
- LFLex Fridman
Just a quick brief comment. Nick Lane and a lot of biologists I talk to, they really seem to think that the jump from bacteria to more complex organisms is the hardest jump.
- AKAndrej Karpathy
Mm-hmm. The eukaryotic life, basically.
- LFLex Fridman
Y- yeah. Which I don't... I get it, they're much more knowledgeable, uh, than me about, like, the intricacies of biology but that seems, like, crazy 'cause how much, how many single cell organisms are there, like, and how much time you have. Surely, it's not that difficult.
- AKAndrej Karpathy
Yeah.
- LFLex Fridman
Like and, and a billion years is not even that long of a time, really. Just all these bacteria under constrained resources, battling it out. I'm sure they can invent more complex... Like, I don't understand-
- AKAndrej Karpathy
(laughs) .
- LFLex Fridman
It's like how to move from a hello world program to, like, uh, like invent a function or something like that. I don't-
- AKAndrej Karpathy
Yeah.
- LFLex Fridman
(laughs) So I don't... Yeah, so I'm with you. I just feel like I don't see any... If the origin of life, that would be my intuition, that's the hardest thing but if that's not the hardest thing 'cause it happened so quickly, then it's gotta be everywhere. And yeah, maybe we're just too dumb to see it.
- AKAndrej Karpathy
Well, it's just, uh, we don't have really good mechanisms for seeing this life. I mean, uh, by what... How, um... So I'm not an expert, just to preface this, but just from what I think about it-
- LFLex Fridman
On aliens? (laughs)
- AKAndrej Karpathy
(laughs) .
- LFLex Fridman
Who's... I wanna meet an expert on alien intelligence and how they communicate.
- AKAndrej Karpathy
I'm very suspicious of our ability to, to find these intelligences out there and to find these, or it's like, uh, radio waves for example, are, are terrible. Uh, their power drops off as basically one over R squared. Uh, so I remember reading that our current radio waves would not be, uh, the ones that we, we are broadcasting, would not be, uh, measurable by our devices today, only like, was it like one-tenth of a light year away? Like not even... Basically, a tiny distance because, uh, you'd really need, like, a targeted transmission of massive power directed somewhere for this to be picked up on long, long distances. And so I just think that our ability to measure is, um, is not amazing. I think there's probably other civilizations out there. And then the big question is why don't they build Von Neumann probes and why don't they interstellar travel across the entire galaxy? And my current answer is, it's probably interstellar travel is, like, really hard. Uh, you have the interstellar medium if you wanna move at close to the speed of light, you're going to be encountering bullets along the way, uh, because even, like, tiny hydrogen atoms and little particles of dust are basically have im- like massive kinetic energy at those speeds. And so basically you need some kind of shielding. You need, you have all the cosmic radiation. Uh, it's just like brutal out there. It's really hard. And so my thinking is maybe interstellar travel is just extremely hard and you have to be very slow.
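The "one over R squared" drop-off can be made concrete with a small calculation. This is a sketch under simplifying assumptions: the transmitter radiates equally in all directions, nothing absorbs the signal along the way, and the 1 MW power figure is a made-up illustrative number, not a measurement of any real broadcast.

```python
import math

LIGHT_YEAR_M = 9.4607e15  # metres in one light year (approximate)

def flux_w_per_m2(power_w, distance_m):
    # Power spreads over a sphere of area 4*pi*r^2, so flux falls as 1/r^2.
    return power_w / (4.0 * math.pi * distance_m ** 2)

power_w = 1.0e6  # hypothetical 1 MW isotropic transmitter (assumption)

near = flux_w_per_m2(power_w, 0.1 * LIGHT_YEAR_M)
far = flux_w_per_m2(power_w, 1.0 * LIGHT_YEAR_M)

# Ten times the distance leaves one hundredth of the flux.
ratio = near / far
```

Whether a given signal is detectable also depends on the receiver and the background noise, which this sketch leaves out. It only shows why a targeted, high-power transmission helps so much over long distances.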
- LFLex Fridman
Like billions of years to build hard? It feels like, uh, it feels like we're not a billion years away from doing that.
- AKAndrej Karpathy
It just might be that it's very, you have to go very slowly, potentially, as an example, through space. Um-
- LFLex Fridman
Right. As opposed to close to the speed of light.
- AKAndrej Karpathy
Yeah. So I'm suspicious basically of our ability to measure life and I'm suspicious of, uh, the ability to, um, just permeate all of space in the galaxy or across galaxies. And that's the only way that I can c- I can currently see a way around it.
- LFLex Fridman
Yeah. It's kinda mind-blowing to think that there's trillions of intelligent alien civilizations out there kinda slowly traveling through space.
- AKAndrej Karpathy
Mm-hmm. Maybe.
- LFLex Fridman
To meet each other. And some of them meet, some of them go to war, some of them collaborate.
- AKAndrej Karpathy
Mm-hmm.
- 21:43 – 33:34
Universe
- LFLex Fridman
So you, uh, famously tweeted, "It looks like if you bombard Earth with photons for a while, you can emit a roadster." So if like in Hitchhiker's Guide to the Galaxy we would summarize the story of Earth. So in, in that book it's mostly harmless. Uh, what do you think is the, all the possible stories, like a paragraph long or a sentence long, that Earth could be summarized as a- once it's done its computation?
- AKAndrej Karpathy
Mm.
- LFLex Fridman
So, like, all the possible full... if Earth is a book, right?
- AKAndrej Karpathy
Yeah.
- LFLex Fridman
Uh, it could... probably there has to be an ending.
- AKAndrej Karpathy
Mm.
- LFLex Fridman
I mean, there's going to be an end to Earth and w- it could end in all kinds of ways. It can end soon, it can end later.
- AKAndrej Karpathy
Yeah.
- LFLex Fridman
What do you think are the possible s- stories?
- AKAndrej Karpathy
Well, definitely there seems to be... yeah, you're sort of... it's pretty incredible that these self-replicating systems will basically arise from the dynamics-
- LFLex Fridman
Mm-hmm.
- AKAndrej Karpathy
... and then they perpetuate themselves and become more complex and eventually become conscious and build a society. And I kind of feel like in some sense it's kind of like a deterministic wave, uh, that, you know, that kind of just like happens on any, you know, any sufficiently well-arranged system like Earth. And so I kind of feel like there's a certain sense of inevitability in it. Um, and it's really beautiful.
- LFLex Fridman
And it ends somehow, right? So it's a, it's a chemically a diverse environment where complex dynamical systems can, uh, evolve-
- AKAndrej Karpathy
Yeah.
- LFLex Fridman
... and become more, more, further and further complex but then there's a certain, um, what is it? There's certain terminating conditions. (laughs)
- AKAndrej Karpathy
Yeah. I don't know what the terminating conditions are but definitely there's a trend line of something and we're part of that story and, like, where does that... where does it go? So, you know, we're famously described often as a biological boot loader for AIs.
- LFLex Fridman
Mm-hmm.
- AKAndrej Karpathy
And that's because humans, I mean, you know, we're an incredible, uh, biological system and we're capable of computation and, uh, you know, a- and love and so on. Um, but, uh, we're extremely inefficient as well. Like, we're talking to each other through audio. It's just kind of embarrassing honestly that we're manipulating, like, seven symbols, uh, serially, w- we're using vocal cords. It's all happening over, like, multiple seconds.
- LFLex Fridman
Yeah.
- AKAndrej Karpathy
It's just, like, kind of embarrassing when you step down to the, um, uh, frequencies at which com- computers operate or are able to c- operate on. And so basically it does seem like, um, synthetic intelligences are kind of like the next stage of development. And, um, I don't know where it leads to, like, at some point I suspect, uh, the universe is some kind of a puzzle and, uh, these, uh, synthetic AIs will uncover that puzzle and, um, solve it.
- LFLex Fridman
... and then what happens after, right? Like what, 'cause if you just like fast-forward Earth many billions of years, it's like, uh, it's- it's quiet and then it's like t- turmoil, you see like city lights and stuff like that.
- AKAndrej Karpathy
Yeah.
- LFLex Fridman
And then what happens at, like at the end? Like is it like a , or is it-
- AKAndrej Karpathy
Yeah.
- LFLex Fridman
... like a calming? Is it explosion? Is it like Earth like open- like a giant... 'cause you said, uh, emit roadsters.
- AKAndrej Karpathy
Yeah.
- LFLex Fridman
Like will it start emitting like a, like a giant number of-
- AKAndrej Karpathy
Yeah.
- LFLex Fridman
... like satellites and systems on fire?
- 33:34 – 41:50
Transformers
- LFLex Fridman
very meta.
- AKAndrej Karpathy
(laughs)
- LFLex Fridman
Looking back, what is the most beautiful or surprising idea in deep learning or AI in general that you've come across? You've seen this field explode, uh, and grow in interesting ways. Just what- what cool ideas, like- like what made- made you sit back and go, "Hmm." Small, big or small?
- AKAndrej Karpathy
Well, the one that I've been thinking about recently the most probably is the- the transformer architecture. Um, so basically, uh, neural networks have, uh, a lot of architectures that were trendy, have come and gone for different, uh, in- sensory modalities like for vision, audio, text. You would process them with different looking neural nets. And recently we've seen the- this con- this convergence towards one architecture, the transformer. And, uh, you can feed it video or you can feed it, you know, images, or speech, or text and it just gobbles it up. And it's kind of like a, bit of a general-purpose, uh, computer that is also trainable and very efficient to run on our hardware. And so, uh, this paper came out in 2016, I wanna say, um-
- LFLex Fridman
Attention Is All You Need.
- AKAndrej Karpathy
Attention Is All You Need.
- LFLex Fridman
You criticized the paper title in retrospect that it wasn't, um, it didn't foresee the bigness of the impact-
- AKAndrej Karpathy
Yeah.
- LFLex Fridman
... that it was going to have.
- AKAndrej Karpathy
Yeah. I'm not sure if the authors were aware of the impact that that paper would go on to have. Probably they weren't. Uh, but I think they were aware of some of the motivations and design decisions behind the transformer and they chose not to, I think, uh, expand on it in that way in the paper. And so I think they had an idea that- that there was more, um, than just the surface of just like, oh, we're just doing translation and here's a better architecture. You're not just doing translation. This is like a really cool differentiable, optimizable, efficient computer that you've proposed. And maybe they didn't have all of that foresight but I think it's really interesting.
- LFLex Fridman
Isn't it funny, sorry to interrupt, that that title is memeable, that they went, for such a profound idea, they went with a, I don't think anyone used that kind of title before, right?
- AKAndrej Karpathy
Attention Is All You Need? Yeah.
- LFLex Fridman
Yeah.
- AKAndrej Karpathy
It's like a meme or something simply, yeah.
- LFLex Fridman
Yeah. Isn't that funny, that one? Like, uh, maybe if it was a more serious title-
- AKAndrej Karpathy
Yeah.
- LFLex Fridman
... it wouldn't have the impact.
- AKAndrej Karpathy
Honestly, I-
- LFLex Fridman
(laughs)
- AKAndrej Karpathy
Yeah, there is an element of me that honestly agrees with you and prefers it this way.
- LFLex Fridman
Yes. (laughs)
- AKAndrej Karpathy
Uh, (laughs) if it was too grand, it would overpromise and then underdeliver potentially.
- LFLex Fridman
Yeah.
- AKAndrej Karpathy
So you want to just, uh, meme your way to greatness. (laughs)
- LFLex Fridman
(laughs) That should be a T-shirt. So you- you tweeted that, "Transformer is a magnificent neural network architecture because it is a general-purpose differentiable computer. It is simultaneously expressive, in the forward pass, optimizable via bra- prop- back-propagation and gradient descent, and efficient, high parallelism compute graph." Can you discuss some of the- those details? Expressive, optimizable, efficient-
- AKAndrej Karpathy
Yeah. Um-
- LFLex Fridman
... from memory or- or in general. Whatever comes to your heart.
- AKAndrej Karpathy
You want to have a general-purpose computer that you can train on arbitrary problems, uh, like say the task of next word prediction or detecting if there's a cat in the image or something like that. And you want to train this computer so you want to set its- its weights. And I think there's a number of design criteria that sort of overlap in the transformer simultaneously that made it very successful. And I think the authors were kind of, um, deliberately trying to, uh, make this really... uh, powerful architecture. And, um, so in a, basically it's very powerful in the forward pass because it's able to express, um, very, uh, general co- uh, computation as a sort of something that looks like message passing. Uh, you have nodes and they all store vectors. And, uh, these nodes get to basically look at each other and it's, uh, each other's vectors, and they get to communicate and basically, nodes get to broadcast, "Hey, I'm looking for certain things." And then other nodes get to broadcast, "Hey, these are the things I have." Those are the keys and the values.
- LFLex Fridman
So it's not just attention?
- AKAndrej Karpathy
Yeah, exactly. Transformer is much more than just the attention component. It's got many pieces, architectural, that went into it. The residual connection of the way it's arranged, there's a multi-layer perceptron in there, the way it's, uh, stacked and so on. Um, but basically there's a message-passing scheme where nodes get to look at each other, decide what's interesting, and then update each other. And, uh, so I think the, um, when you get to the details of it, I think it's a very expressive function. Uh, so it can express lots of different types of algorithms in a forward pass. Not only that but the way it's designed with the residual connections, layer normalizations, the softmax attention and everything, it's also optimizable. This is a really big deal because there's lots of computers that are powerful that you can't optimize, um, or they are not easy to optimize using the techniques that we have, which is back-propagation and gradient descent. These are first-order methods, very simple optimizers really. And so, um, you also need it to be optimizable. Um, and then lastly, you want it to run efficiently on our hardware. Our hardware i- is a massive throughput machine like GPUs. Uh, they prefer lots of parallelism, so you don't want to do lots of sequential operations, you want to do a lot of operations in parallel. And the transformer is designed with that in mind as well. And so it's designed, uh, for our hardware and it's designed to both be very expressive in a forward pass but also very optimizable in the backward pass.
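The message-passing picture, where each node says what it is looking for and what it has, then updates itself from the others, corresponds roughly to a single head of scaled dot-product attention. The sketch below is a simplification under stated assumptions. In a real transformer the queries, keys, and values are learned linear projections of each node's vector, and the result passes through residual connections, layer normalization, and a multi-layer perceptron, all of which are omitted. The toy vectors are hand-picked for illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    # Subtract the max before exponentiating for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    d = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # "What I'm looking for" (query) is scored against "what I have" (keys).
        scores = [dot(q, k) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Each node's update is a weighted mix of every node's value vector.
        mixed = [sum(w * v[j] for w, v in zip(weights, values))
                 for j in range(len(values[0]))]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

# Three nodes with hand-picked (hypothetical) queries, keys, and values.
queries = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
keys = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]

outputs, weights = attention(queries, keys, values)
```

The loop over queries is written one node at a time for readability. Every node's update is independent of the others, which is why the same computation is normally expressed as a few large matrix multiplies.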
- 41:50 – 52:01
Language models
- LFLex Fridman
What do you think about one flavor of it which is language models? Have you been surprised? Uh, has your sort of imagination been captivated by, you- you mentioned GPT and all the bigger and bigger and bigger language models, and, uh, what are the limits o- of those models do you think? So just for the task of natural language.
- AKAndrej Karpathy
Basically, the way GPT is trained, right, is you just download a massive amount of, uh, text data from the internet and you try to predict the next, uh, word in the sequence roughly speaking. Uh, you're predicting little word chunks, uh, but, uh, roughly speaking, that's it. Um, and what's been really interesting to watch is... uh, basically, it's a language model. Language models have actually existed for a very long time. Um, there's papers on language modeling from 2003, even earlier.
- LFLex Fridman
Can you explain, in that case, what a language model is?
- AKAndrej Karpathy
Uh, yeah. So language model, just, uh, basically, the rough idea is, um, just predicting the next word in a sequence-
- LFLex Fridman
Mm-hmm.
- AKAndrej Karpathy
... roughly speaking. Uh, so there's a paper from, for example, uh, Bengio, uh, and the team from 2003, where for the first time, they were using a, uh, neural network to take, say, like three or five words and predict the, um, next word. And they're doing this on much smaller data sets. And the neural net is not a transformer, it's a multi-layer perceptron. But it, uh, but it's the first time that a neural network has been applied in that setting. But even before neural networks, there were, um, language models, except they were u- uh, using, um, n-gram models. So n-gram models are just, uh, count-based models. So, um, if you try to, if you start to take two words and predict a third one, uh, you'll just count up how many times you've seen any, uh, two-word combinations and what came next.
- LFLex Fridman
Mm-hmm.
- AKAndrej Karpathy
And, uh, what you predict as coming next is just what you've seen the most of in the training set. And so, uh, language modeling has been around for a long time. Neural networks have done language modeling for a long time. So really what's, uh, new, or interesting, or exciting is just realizing that when you scale it up, uh, with a powerful enough neural net, a transformer, you have all these emergent properties where, uh, basically what happens is if you have a large enough data set of text, you are, in the task of predicting the next, uh, word, you are multitasking a huge amount of different kinds of problems. You are multitasking understanding of, you know, chemistry, physics, human nature. Lots of things are sort of clustered in that objective. It's a very simple objective but actually, you have to understand a lot about the world to pr- to make that prediction.
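The count-based n-gram idea described here, take two words, count what came next in the training set, and predict the most frequent follower, can be sketched as follows. The tiny corpus and the function names `train_trigram` and `predict_next` are invented for illustration. Real n-gram models also add smoothing and back-off for unseen contexts, which this sketch leaves out.

```python
from collections import Counter, defaultdict

def train_trigram(tokens):
    # For every pair of consecutive words, count which word followed it.
    counts = defaultdict(Counter)
    for a, b, c in zip(tokens, tokens[1:], tokens[2:]):
        counts[(a, b)][c] += 1
    return counts

def predict_next(counts, a, b):
    # Predict the follower seen most often in training, or None if the
    # two-word context never appeared.
    followers = counts.get((a, b))
    if not followers:
        return None
    return followers.most_common(1)[0][0]

corpus = ("the cat sat on the mat and the cat sat on the sofa "
          "and the cat ran").split()
model = train_trigram(corpus)

guess = predict_next(model, "the", "cat")      # "sat" followed twice, "ran" once
unseen = predict_next(model, "purple", "dog")  # context never seen in training
```

A neural language model keeps the same objective, predicting the next word, but replaces the lookup table of counts with a trained network that can generalize to contexts it has never seen.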
- LFLex Fridman
You just said the U-word, understanding. Uh, w-
- AKAndrej Karpathy
(laughs) .
- LFLex Fridman
... are you, uh, in terms of chemistry and physics and so on, w- w- what do you feel like it's doing? Is it searching for the right context, uh, in, in, like what-
- AKAndrej Karpathy
Yeah.
- LFLex Fridman
... what is it, what is the actual process happening here?
- AKAndrej Karpathy
Yeah, so basically, it gets 1,000 words and it's trying to predict the 1,001st. And, uh, in order to do that very, very well over the entire data set available on the internet, you actually have to basically kind of understand, uh, the context of, of what's going on in there.
- LFLex Fridman
Yeah.
- AKAndrej Karpathy
Um, and, uh, it's a sufficiently hard problem that you, uh, if you have a powerful enough computer, like a transformer, you end up with, uh, interesting solutions. (laughs) . And, uh, you can ask it, uh, to, uh, do all kinds of, uh, things and, um, it i- it shows a lot of, uh, emergent properties like in-context learning. That was the big deal with GPT and the original paper when they published it, is that you can just sort of, uh, prompt it in various ways and ask it to do various things and it will just kind of complete the sentence. But in the process of just completing the sentence, it's actually solving all kinds of really, uh, interesting problems that we care about.
- LFLex Fridman
Do you think it's doing something like understanding?
- AKAndrej Karpathy
Um, yeah.
- LFLex Fridman
Like when we use the word understanding for us humans.
- AKAndrej Karpathy
I think it's doing some understanding. It ha- in its way, it's- it understands, I think, a lot about the world and it has to in order to predict the next word in a sequence.
- LFLex Fridman
So it's trained on the data from the internet. Uh, what do you think about this, this approach in terms of data sets, of using data from the internet? Do you think the internet has enough structured data to teach AI about human civilization?
- AKAndrej Karpathy
Yeah, so I think the internet has a huge amount of data. I'm not sure if it's a complete enough set. I don't know that, uh, text is enough for having a sufficiently powerful AGI as an outcome. Um-
- LFLex Fridman
Of course, there is audio, and video, and images-
- AKAndrej Karpathy
Yeah.
- LFLex Fridman
... and all that kind of stuff.
- AKAndrej Karpathy
Yeah, so text by itself, I'm a little bit suspicious about. There's a ton of things we don't put in text, in writing, uh, just because they're obvious to us about how the world works and the physics of it and that things fall. We don't put that stuff in text because why would you? We share that understanding. And so text is a communication medium between humans. And it's not a, uh, all-encompassing medium of knowledge about the world but as you pointed out, we do have video, and we have images, and we have audio. And so I think that, uh, that definitely helps a lot. But we haven't trained models, uh, sufficiently, uh, across both, across all those modalities yet. Uh, so I think that's what a lot of people are interested in.
- LFLex Fridman
But I wonder what that shared understanding of c- like what we might call common sense has to be learned, inferred in order to complete the sentence correctly. So maybe the fact that it's implied on the internet, the model's gonna have to learn that. Not by reading about it, by inferring it in the representation. So, like, common sense, just like we, I don't think we learn common sense, like nobody says, tells us explicitly, we just figure it all out-
- AKAndrej Karpathy
Mm-hmm.
- LFLex Fridman
... by interacting with the world.
- AKAndrej Karpathy
Right.
- 52:01 – 58:21
Bots
- AKAndrej Karpathy
today.
- LFLex Fridman
Do you worry about bots on the internet? Given, given these ideas, given how exciting they are, do you worry about bots on Twitter being not the, the stupid bots that we see now or the crypto bots, but the bots that might be out there actually that we don't see, that they're interacting in interesting ways? So this kind of system feels like it should be able to pass the, "I'm not a robot" click button, whatever.
- AKAndrej Karpathy
Hmm.
- LFLex Fridman
Um, which do you actually understand how that test works? I don't quite...
- AKAndrej Karpathy
(laughs)
- LFLex Fridman
Like, uh, there's, there's a, there's a checkbox or whatever that you click.
- AKAndrej Karpathy
Yeah.
- LFLex Fridman
It's presumably tracking...
- AKAndrej Karpathy
Oh, I see.
- LFLex Fridman
... s- like, mouse movement and the timing and so on.
- AKAndrej Karpathy
Yeah.
- LFLex Fridman
So e- exactly this kind of system we're talking about should be able to pass that. So w- yeah, what do you feel about, um, bots that are language models plus have some interactability and are able to tweet and reply and so on, do you worry about that world?
- AKAndrej Karpathy
Uh, yeah, I think it's always been a bit of an arms race, uh, between sort of the attack and the defense. Uh, so the attack will get stronger but the defense will get stronger as well, uh, our ability to detect that.
- LFLex Fridman
How do you defend? How do you detect? How do you know that your Karpathy account on Twitter is, is human? How would you approach that? Like if people were claim, you know, uh... How would you defend yourself in the court of law that I'm a human?
- AKAndrej Karpathy
Um...
- LFLex Fridman
This account is human.
- AKAndrej Karpathy
Yeah, at some point, I think, uh-
- LFLex Fridman
(laughs)
- AKAndrej Karpathy
... it might be... I think the society, uh, society will evolve a little bit, like, we might start signing, digitally signing, uh, some of our correspondence or, you know, things that we create. Uh, right now, it's not necessary but maybe in the future it might be. I do think that we are going towards a world where we share, we share the digital space with, uh, AIs.
- LFLex Fridman
Synthetic beings.
- AKAndrej Karpathy
Yeah. And, uh, they will get much better and they will share our digital realm and they'll eventually share our physical realm as well. It's much harder. Uh, but that's kind of, like, the world we're going towards. And most of them will be benign and helpful and some of them will be malicious and it's going to be an arms race trying to detect them.
- LFLex Fridman
So, I mean, the worst isn't the AIs, the worst is the AIs pretending to be human.
- AKAndrej Karpathy
Mm-hmm.
- LFLex Fridman
So it might... I don't know if it's always malicious. There's o- obviously a lot of malicious applications but...
- AKAndrej Karpathy
Yeah.
- LFLex Fridman
... it could also be... You know, if I was an AI, I would try very hard to pretend to be human because-
- AKAndrej Karpathy
Yeah.
- LFLex Fridman
... we're in a human world.
- AKAndrej Karpathy
Yeah.
- LFLex Fridman
I wo- I wouldn't get any respect as an AI.
- 58:21 – 1:05:44
Google's LaMDA
- LFLex Fridman
There was a Google engineer that claimed that, uh, LaMDA was sentient. Do you think there's any inkling of truth to what he felt? And more importantly, to me at least, do you think language models will achieve sentience or the illusion of sentience soon-ish? Ish.
- AKAndrej Karpathy
Yeah. To me it's a little bit of a canary in a coal mine kind of moment, honestly, a little bit. Uh, because, uh, so this engineer spoke to like a chatbot at Google-
- LFLex Fridman
Mm-hmm.
- AKAndrej Karpathy
... and, uh, became convinced that, uh, this bot is sentient.
- LFLex Fridman
Yeah, asked it some existential philosophical questions.
- AKAndrej Karpathy
Right, and it gave like reasonable answers and looked real and, uh, and so on. So to me it's a, uh... He was, he was, uh, he wasn't sufficiently trying to stress the system, I think, and, uh, exposing the truth of it as it is today.
- LFLex Fridman
Mm-hmm.
- AKAndrej Karpathy
Um, but, uh, I think this will be increasingly harder over time. Uh, so, uh, yeah, I think more and more people will basically, uh, become, um... Yeah, I think more and more, uh, there will be more people like that over time as- as this gets better.
- LFLex Fridman
Like form an emotional connection to-
- AKAndrej Karpathy
Yeah.
- LFLex Fridman
... to- to an AI chat bot?
- AKAndrej Karpathy
Yeah, perfectly plausible in my mind. I think these AIs are actually quite good at human c- human connection, human emotion. A ton of text on the internet is about humans and connection and love and so on. So I think they have a very good understanding in some, in some sense of- of how people speak to each other about this.
- LFLex Fridman
Mm-hmm.
- AKAndrej Karpathy
And, um, they're very capable of creating a lot of that kind of text. The, um, there's a lot of like sci-fi from the '50s and '60s that imagined AIs in a very different way. They are calculating, cold Vulcan-like machines. That's not what we are getting today. We're getting pretty emotional AIs (laughs) that actually, uh, are very competent and capable of generating, you know, plausible sounding text with respect to all these topics.
- LFLex Fridman
See, I'm really hopeful about AI systems that are like companions that help you grow, develop as a human being, uh, help you maximize long-term happiness. But I'm also very worried about AI systems that figure out from the internet that humans get attracted to drama and so these will just be like shit talking AIs.
- AKAndrej Karpathy
(laughs)
- LFLex Fridman
They're just constantly, "Did you hear?" Like they'll do gossip. They'll do, uh, they'll try to plant seeds of suspicion to like other humans that you love and trust and, uh, just kind of mess with people, uh, and you know, 'cause- 'cause that's going to get a lot of attention. So drama, maximize drama-
- AKAndrej Karpathy
Yeah.
- LFLex Fridman
... on the path to maximizing, uh, engagement and us humans will feed into that machine.
- AKAndrej Karpathy
Yeah.
- LFLex Fridman
And get, it'll be a giant drama shit storm of-
- AKAndrej Karpathy
Yeah.
- LFLex Fridman
... (laughs) ... so I'm worried about that. So it's the objective function really defines, uh, the way that human civilization progresses with AIs in it.
- AKAndrej Karpathy
Yeah. I think right now, at least, today, they are not sort of, it's not correct to really think of them as goal-seeking agents that want to do something.
- LFLex Fridman
Mm-hmm.
- AKAndrej Karpathy
They have no long-term memory or anything. They, it's literally, a good approximation of it is you get 1,000 words and you're trying to predict the 1,001st, and then you continue feeding it in. And you are free to prompt it in whatever way you want. So in text. So you say, "Okay. Uh, you are a psychologist, and you are very good, and you love humans. And, uh, here's a conversation between you and another human. Human, colon, something."
- LFLex Fridman
Mm-hmm.
- AKAndrej Karpathy
"You something." And then it just continues the pattern. And suddenly you're having a conversation with a fake psychologist who's like trying to help you.
- LFLex Fridman
Mm-hmm.
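A minimal sketch of the prediction loop Karpathy describes above, added as an illustration rather than anything shown in the episode. It assumes a toy bigram counter as a stand-in for a real language model, so it only looks at the last word instead of the last 1,000; the corpus, the function names, and the greedy word choice are all hypothetical. The point is the loop itself: predict one word, append it, and feed the longer context back in, with the prompt setting the pattern to continue.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count which word follows which: a toy stand-in for a language model."""
    words = text.split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts, context):
    """Predict the next word. This toy model only uses the last word of the context."""
    followers = counts.get(context[-1])
    if not followers:
        return None  # the model has never seen this word, so it cannot continue
    # Greedy choice: take the most frequent follower.
    return followers.most_common(1)[0][0]


def generate(counts, prompt, max_new_words=5):
    """Autoregressive loop: predict one word, append it, feed the result back in."""
    context = prompt.split()
    for _ in range(max_new_words):
        nxt = predict_next(counts, context)
        if nxt is None:
            break
        context.append(nxt)
    return " ".join(context)


# Hypothetical training text in the "Human: ... You: ..." pattern.
corpus = "Human: I feel stuck . You: tell me more . Human: I feel tired . You: tell me more ."
model = train_bigram(corpus)

# The prompt ends with "You:", so the model continues the pattern from there.
completion = generate(model, "Human: I feel lost . You:", max_new_words=3)
print(completion)
```

Nothing in the loop stores goals or memory between calls. The only state is the text handed in as the prompt, which is why changing the prompt changes the apparent persona.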
- 1:05:44 – 1:16:44
Software 2.0
- LFLex Fridman
So you've spoken a lot about the idea of Software 2.0.
- AKAndrej Karpathy
Mm-hmm.
- LFLex Fridman
Um, all good ideas become like cliches so quickly, like the terms-
- AKAndrej Karpathy
Yeah. (laughs)
- LFLex Fridman
... it's, it's, it's kind of hilarious. Um, it's like I think Eminem once said that, like, if he gets annoyed by a song he's written very quickly, that means it's gonna be a big hit-
- AKAndrej Karpathy
Mm-hmm.
- LFLex Fridman
... 'cause it's, it's too catchy.
- AKAndrej Karpathy
Mm-hmm.
- LFLex Fridman
But, uh, can you describe this idea and how your thinking about it has evolved over the months and years since, since you coined it?
- AKAndrej Karpathy
Yeah. Yeah, so I had a blog post on Software 2.0, I think, several years ago now. Um, and the reason I wrote that post is because I kept, I kind of saw something remarkable happening in, like software development and how a lot of code was being transitioned to be written not in sort of like C++ and so on, but it's written in the weights of a neural net.
- LFLex Fridman
Mm-hmm.
- AKAndrej Karpathy
Basically just saying that neural nets are taking over software, the realm of software-
- LFLex Fridman
Yeah.
- AKAndrej Karpathy
... and, um, taking more and more and more tasks. And at the time, I think not many people understood, uh, this, uh, deeply enough that this is a big deal, this is a big transition. Uh, neural networks were seen as one of multiple classification algorithms you might use for your data set problem on Kaggle. Like, this is not that. This is a change in how we program computers. And, uh, I saw neural nets as, uh, this is going to take over. Uh, the way we program computers is going to change. It's not going to be people writing, uh, software in C++ or something like that and directly programming the software. It's going to be accumulating, uh, training sets and data sets and crafting these objectives by which we train these neural nets. And at some point there's going to be a compilation process from the data sets, and the objective, and the architecture specification into the binary which is really just, uh, the neural net, uh, you know, weights and the forward pass of the neural net. And then you can deploy that binary. And so I was talking about that sort of transition and, uh, that's what the post is about. And I saw this sort of play out in a lot of, uh, fields, uh, you know, autopil- autopilot being one of them but also just a simple image classification. People thought originally, you know, in the '80s and so on, that they would write the algorithm for detecting a dog in an image. And they had all these ideas about how the brain does it, and first, we detect corners, and then we detect lines, and then we stitched them up. And they were, like, really going at it. They were, like, thinking about how they're gonna write the algorithm and this is not the way you build it. (laughs) Um, and there was a smooth transition where, okay, uh, first we thought we were gonna build everything.
Then we were building the features, uh, so like, HOG features and things like that, uh, that detect these little statistical patterns from image patches, and then there was a little bit of, uh, learning on top of it, like, a support vector machine or binary classifier, uh, for cat versus dog in images on top of the features. So we wrote the features but we trained the last layer, sort of the- the classifier. And then people are like, "Actually, let's not even design the features because we can't. Honestly, we're not very good at it. So let's also learn the features." And then you end up with basically a convolutional neural net where you're learning most of it, you're just specifying the architecture and, uh, the architecture has tons of the fill-in-the-blanks, which is all the knobs, and you let the optimization write most of it. And so this transition is happening across the industry everywhere. And, uh, suddenly we end up with a ton of code that is written in neural net weights. And I was just pointing out that the analogy is actually pretty strong and we have a lot of developer environments for Software 1.0, like we have, uh, IDEs, um, how you work with code, how you debug code, how do you- how do you run code, uh, how do you maintain code? We have GitHub. So I was trying to make those analogies in the neural net, like what is the GitHub of Software 2.0? Turns out that something that looks like Hugging Face right now.
- LFLex Fridman
(laughs)
- AKAndrej Karpathy
Uh... You know? And so I think some people took it seriously and built cool companies and, uh, many people originally attacked the post. It actually was not well-received when I wrote it.
- LFLex Fridman
Mm-hmm.
- AKAndrej Karpathy
And I think maybe it has something to do with the title but the post was not well-received and I think more people sort of have been coming around to it over time.
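A minimal sketch of the intermediate stage described above, where people wrote the features by hand and trained only the last layer. This is an illustration under stated assumptions, not code from the episode: the two features (mean brightness and a crude horizontal edge measure) are a hypothetical stand-in for HOG features, a perceptron is swapped in for the support vector machine he mentions, and the tiny "images" are made up.

```python
def hand_written_features(image):
    """Human-designed features for a tiny grayscale image (a list of rows):
    mean brightness, and mean absolute horizontal difference (a crude edge measure)."""
    pixels = [p for row in image for p in row]
    brightness = sum(pixels) / len(pixels)
    diffs = [abs(row[i + 1] - row[i]) for row in image for i in range(len(row) - 1)]
    edges = sum(diffs) / len(diffs)
    return (brightness, edges)


def train_perceptron(examples, epochs=20):
    """The only learned part: a linear classifier on top of the fixed features."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for image, label in examples:  # label is +1 or -1
            f = hand_written_features(image)
            score = w[0] * f[0] + w[1] * f[1] + b
            if label * score <= 0:  # misclassified: nudge the weights toward the label
                w[0] += label * f[0]
                w[1] += label * f[1]
                b += label
    return w, b


def classify(model, image):
    """Apply the fixed features, then the learned linear layer."""
    w, b = model
    f = hand_written_features(image)
    return 1 if w[0] * f[0] + w[1] * f[1] + b > 0 else -1


# Hypothetical toy data: "striped" images (+1) versus "flat" images (-1).
striped_a = [[0, 1, 0, 1], [1, 0, 1, 0]]
striped_b = [[1, 0, 1, 0], [1, 0, 1, 0]]
flat_a = [[1, 1, 1, 1], [1, 1, 1, 1]]
flat_b = [[0, 0, 0, 0], [0, 0, 0, 0]]
examples = [(striped_a, 1), (flat_a, -1), (striped_b, 1), (flat_b, -1)]

model = train_perceptron(examples)
print(classify(model, [[0, 1, 0, 1], [0, 1, 0, 1]]))  # an unseen striped image
```

In the step he describes next, `hand_written_features` would also be replaced by learned parameters, leaving the human to specify only the architecture.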
- LFLex Fridman
Yeah. So y- you were the director of AI at Tesla where I think this idea was really implemented at scale which is how you have engineering teams doing Software 2.0. So can you sort of linger on that idea of... I think we're in the really early stages of everything you just said which is like GitHub IDEs, like how- how do we build engineering teams that- that work in Software 2.0 systems? And, uh, the an- the data collection and the data annotation which is all part of that Software 2.0, like, what do you think is the task of programming in Software 2.0? Is it debugging in the space of hyper parameters or is it also debugging in the space of data?
- AKAndrej Karpathy
Yeah. The way by which you program the computer and influence its algorithm is not by writing the commands yourself, you're changing mostly the data set, uh, you're changing the, um, loss functions of, like, what the neural net is trying to do, how it's trying to predict things. But, yeah, basically, the data sets and the architecture, so the neural net. And, um, um, so in the case of the autopilot, a lot of the data sets had to do with, for example, detection of objects, and lane line markings, and traffic lights, and so on. So you accumulate massive data sets of here's an example, here's the desired label, and then, uh, here's roughly how the archite- here's roughly what the algorithm should look like and that's a convolutional neural net. So the specification of the architecture is like a hint as to what the algorithm should roughly look like, and then the fill-in-the-blanks, uh, process of optimization is, uh, is the training process. And then you take your neural net that was trained, it gives all the right answers on your data set, and you deploy it.
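A minimal sketch of the workflow described above, where the data set, the loss function, and the architecture are the things a person writes and the optimizer fills in the weights. It is an illustration only: the six labeled points, the single-layer architecture, and the learning rate are hypothetical and far smaller than anything used for detection in a car.

```python
import math

# The "source code" in the Software 2.0 sense: examples paired with desired labels.
# Hypothetical toy data: the label is 1 when the two inputs sum to more than 1.
dataset = [((0.0, 0.0), 0), ((0.2, 0.3), 0), ((0.4, 0.1), 0),
           ((0.9, 0.8), 1), ((0.7, 0.9), 1), ((1.0, 0.6), 1)]


def forward(weights, x):
    """Architecture: one linear layer plus a sigmoid. The weights are the blanks to fill in."""
    w0, w1, b = weights
    return 1.0 / (1.0 + math.exp(-(w0 * x[0] + w1 * x[1] + b)))


def loss(weights, data):
    """Objective: mean cross-entropy between predictions and desired labels."""
    eps = 1e-12  # avoids log(0)
    total = 0.0
    for x, y in data:
        p = forward(weights, x)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(data)


def train(data, steps=2000, lr=0.5):
    """The 'compilation' step: gradient descent turns data, objective, and architecture into weights."""
    w0 = w1 = b = 0.0
    n = len(data)
    for _ in range(steps):
        g0 = g1 = gb = 0.0
        for x, y in data:
            err = forward((w0, w1, b), x) - y  # gradient of cross-entropy w.r.t. the logit
            g0 += err * x[0]
            g1 += err * x[1]
            gb += err
        w0 -= lr * g0 / n
        w1 -= lr * g1 / n
        b -= lr * gb / n
    return (w0, w1, b)


weights = train(dataset)  # the artifact you would deploy
predictions = [round(forward(weights, x)) for x, _ in dataset]
print(predictions)
```

To change what this program does, you would edit `dataset` or `loss` and retrain instead of editing `forward` by hand, which is the sense in which debugging moves into the space of data.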
- LFLex Fridman
So there's, uh, in that case, perhaps, in all machine learning cases, there's a lot of tasks. So is coming up formulating a task like a- for a multi-headed neural network, is formulating a task part of the programming?
- AKAndrej Karpathy
Yeah. Very much so.
- LFLex Fridman
How do you break down a problem-
- AKAndrej Karpathy
Yeah.
- LFLex Fridman
... into a set of tasks?
- AKAndrej Karpathy
Yeah. Um, on a high level I would say if you look at the software running in- in the autopilot, I gave a number of talks on this topic, I would say originally a lot of it was written in Software 1.0. There's, uh, imagine lots of C++, uh, right? And then, uh, gradually there was a tiny neural net that was, for example, predicting given a single image is there, like, a traffic light or not or is there a lane line marking or not? And this neural net didn't have, uh, too much to do in the s- in the scope of the software. It was making tiny predictions on individual little image and then the rest of the system stitched it up. So, okay, we're actually- we don't have just a single camera, we have eight cameras. We actually have eight cameras over time. And so what do you do with these predictions? How do you put them together? How do you do the fusion of all that information and how do you act on it? All of that was written by humans, um, in C++. And then we decided, okay, we don't actually want, uh, uh, to do all of that fusion in, uh, C++ code because we're actually not good enough to write that algorithm.
- LFLex Fridman
Mm-hmm.
- AKAndrej Karpathy
We want the neural nets to write the algorithm and we want to port, uh, all of that software into the 2.0 stack. And so then we actually had neural nets that now take all the eight camera images simultaneously and make predictions for all of that. Uh, so, um, and- and it's- and actually they don't make predictions in the- in the space of images, they now make predictions directly in 3D.
- LFLex Fridman
Mm-hmm.
Episode duration: 3:28:47
Transcript of episode cdiD-9MMpb0