
Jim Keller: The Future of Computing, AI, Life, and Consciousness | Lex Fridman Podcast #162

Jim Keller is a legendary microprocessor engineer, previously at AMD, Apple, Tesla, Intel, and now Tenstorrent.

Please support this podcast by checking out our sponsors:
- Athletic Greens: https://athleticgreens.com/lex and use code LEX to get 1 month of fish oil
- Brooklinen: https://brooklinen.com and use code LEX to get $25 off + free shipping
- ExpressVPN: https://expressvpn.com/lexpod and use code LexPod to get 3 months free
- Belcampo: https://belcampo.com/lex and use code LEX to get 20% off first order

EPISODE LINKS:
Jim's Twitter: https://twitter.com/jimkxa
Jim's Wiki: https://en.wikipedia.org/wiki/Jim_Keller_(engineer)
Tenstorrent: https://www.tenstorrent.com/

PODCAST INFO:
Podcast website: https://lexfridman.com/podcast
Apple Podcasts: https://apple.co/2lwqZIr
Spotify: https://spoti.fi/2nEwCF8
RSS: https://lexfridman.com/feed/podcast/
Full episodes playlist: https://www.youtube.com/playlist?list=PLrAXtmErZgOdP_8GztsuKi9nrraNbKKp4
Clips playlist: https://www.youtube.com/playlist?list=PLrAXtmErZgOeciFP3CBCIEElOJeitOr41

OUTLINE:
0:00 - Introduction
1:33 - Good design is both science and engineering
7:33 - Javascript
11:40 - RISC vs CISC
15:39 - What makes a great processor?
17:09 - Intel vs ARM
18:58 - Steve Jobs and Apple
21:36 - Elon Musk and Steve Jobs
27:21 - Father
31:03 - Perfection
37:18 - Modular design
42:52 - Moore's law
49:50 - Hardware for deep learning
56:44 - Making neural networks fast at scale
1:04:22 - Andrej Karpathy and Chris Lattner
1:08:36 - How GPUs work
1:12:43 - Tesla Autopilot, NVIDIA, and Mobileye
1:17:23 - Andrej Karpathy and Software 2.0
1:23:43 - Tesla Dojo
1:26:20 - Neural networks will understand physics better than humans
1:28:33 - Re-engineering the human brain
1:33:26 - Infinite fun and the Culture Series by Iain Banks
1:35:20 - Neuralink
1:40:43 - Dreams
1:44:37 - Ideas
1:54:49 - Aliens
1:59:46 - Jordan Peterson
2:04:44 - Viruses
2:07:52 - WallStreetBets and Robinhood
2:15:55 - Advice for young people
2:17:45 - Human condition
2:20:14 - Fear is a cage
2:25:04 - Love
2:31:27 - Regrets

SOCIAL:
- Twitter: https://twitter.com/lexfridman
- LinkedIn: https://www.linkedin.com/in/lexfridman
- Facebook: https://www.facebook.com/LexFridmanPage
- Instagram: https://www.instagram.com/lexfridman
- Medium: https://medium.com/@lexfridman
- Reddit: https://reddit.com/r/lexfridman
- Support on Patreon: https://www.patreon.com/lexfridman

Lex Fridman (host) · Jim Keller (guest)
Feb 18, 2021 · 2h 39m

EVERY SPOKEN WORD

  1. 0:00 - 1:33

    Introduction

    1. LF

      The following is a conversation with Jim Keller. His second time on the podcast. Jim is a legendary microprocessor architect and is widely seen as one of the greatest engineering minds of the computing age. In a peculiar twist of space time in our simulation, Jim is also a brother-in-law of Jordan Peterson. We talk about this and about computing, artificial intelligence, consciousness, and life. Quick mention of our sponsors: Athletic Greens all-in-one nutrition drink, Brooklinen Sheets, ExpressVPN, and Belcampo grass-fed meat. Click the sponsor links to get a discount and to support this podcast. As a side note, let me say that Jim is someone who, on a personal level, inspired me to be myself. There was something in his words, on and off the mic, or perhaps that he even paid attention to me at all, that almost told me, "You're all right, kid." A kind of pat on the back that can make the difference between a mind that flourishes and a mind that is broken down by the cynicism of the world. So, I guess that's just my brief few words of thank you to Jim, and in general, gratitude for the people who have given me a chance on this podcast, in my work, and in life. If you enjoy this thing, subscribe on YouTube, review it on Apple Podcasts, follow on Spotify, support on our Patreon, or connect with me on Twitter @lexfridman. And now, here's my conversation with Jim Keller.

  2. 1:33 - 7:33

    Good design is both science and engineering

    1. LF

      What's the value and effectiveness of theory versus engineering, this dichotomy, in, uh, building good software or s- hardware systems?

    2. JK

      Well, it's... Good design is both. I guess that's pretty obvious. By engineering, do you mean, you know, reduction to practice of known methods? And then science is the pursuit of discovering things that people don't understand or solving unknown problems.

    3. LF

      Definitions are interesting here, but I was thinking more in theory, constructing models that kind of generalize about how things work.

    4. JK

      Mm-hmm.

    5. LF

      Engineering is, uh, like actually building stuff. The pragmatic, like-

    6. JK

      Mm-hmm.

    7. LF

      ... okay, we have these nice models, but how do we actually get things to work? Maybe economics is a nice example. Like economists have all these models of how the economy works and how different policies will have an effect, but then there's the actual, okay, let's call it engineering of like-

    8. JK

      Yeah.

    9. LF

      ... actually deploying the policies.

    10. JK

      So, computer design is almost all engineering and reduction to practice of known methods. Now, because of the complexity of the computers we build, you know, you- you could think you're, well, we'll just go write some code and then we'll verify it and then we'll put it together, and then you find out that the combination of all that stuff is complicated, and then you have to be inventive to figure out how to do it. Right? So that's- that's definitely ha- happens a lot. And then every so often some big idea happens, but it might be one person.

    11. LF

      And that idea is in what? In the space of engineering or is it a to- in the space of ideas?

    12. JK

      Well, I'll give you an example. So, one of the limits of computer performance is branch prediction. So... And there's- there's a whole bunch of ideas about how good you could predict a branch. And people said there's a limit to it, that it's an asymptotic curve, and somebody came up with a better way to do branch prediction. It was a lot better. And he published a paper on it, and every computer in the world now uses it. And it was one idea. So the- the engineers who build branch prediction hardware were happy to drop the one kind of training array and put in another one.

    13. LF

      Mm-hmm.

    14. JK

      So, it was- it was a real idea.

    15. LF

      And branch prediction is- is one of the key problems underlying all of sort of the lowest level of software. It boils down to branch prediction.

    16. JK

      Boils down to uncertainty. Computers are limited by... You know, single thread computer's limited by two things. The- the predictability of the path of the branches and the predictability of the locality of- of data. So, we have predictors that now predict both of those pretty well.
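
To make the predictor idea concrete: below is a minimal sketch of the classic two-bit saturating-counter branch predictor, the textbook scheme that more sophisticated predictors build on. This is illustrative only; the conversation doesn't name the specific published scheme, and the table size and indexing here are assumptions.

```python
# A minimal two-bit saturating-counter branch predictor (textbook scheme).
# Each branch's address indexes a table of counters in [0, 3]:
# 0-1 predict "not taken", 2-3 predict "taken".

TABLE_SIZE = 1024  # assumed size, purely illustrative

class TwoBitPredictor:
    def __init__(self):
        self.counters = [1] * TABLE_SIZE  # start weakly "not taken"

    def predict(self, pc: int) -> bool:
        return self.counters[pc % TABLE_SIZE] >= 2

    def update(self, pc: int, taken: bool) -> None:
        i = pc % TABLE_SIZE
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)

p = TwoBitPredictor()
hits = 0
for taken in [True] * 99 + [False]:  # a typical loop branch: taken 99x, then exit
    hits += p.predict(0x400) == taken
    p.update(0x400, taken)
print(f"{hits}/100 correct")  # 98/100 — wrong only at warm-up and loop exit
```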

    17. LF

      Yeah.

    18. JK

      So, memory is, you know, a couple hundred cycles away. Local cache is a couple cycles away. When you're executing fast, virtually all the data has to be in the local cache. So, a simple program says, you know, add one to every element in an array. It's really easy to see what the stream of data will be.

    19. LF

      Mm-hmm.

    20. JK

      But you might have a more complicated program that's, you know, says get a- get an element of this array, look at something, make a decision, go get another element, it's kind of random. And you can think, that's really unpredictable. And then you make this big predictor that looks at this kind of pattern and you realize, well, if you get this data and this data, then you probably want that one. And if you get this one and this one and this one, you probably want that one.
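
A rough sketch of the two access patterns being contrasted: a streaming loop whose address sequence a prefetcher can anticipate, versus a data-dependent walk where each load determines the next address. (Pure Python won't show the cache effects themselves; the point is the shape of the address stream.)

```python
import random

N = 1 << 20
data = [1] * N

# Predictable: "add one to every element in an array".
# Addresses go 0, 1, 2, ... — a stride a hardware prefetcher can see coming.
for i in range(N):
    data[i] += 1

# Unpredictable: "get an element, make a decision, go get another".
# The next address depends on the data just loaded, so nothing simple can
# anticipate it — unless a big pattern-based predictor learns the sequence.
next_idx = list(range(N))
random.shuffle(next_idx)
i = total = 0
for _ in range(100_000):
    total += data[i]
    i = next_idx[i]  # next address is decided by loaded data
```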

    21. LF

      And is that theory or is that engineering? Like the paper that was written, was it, uh-

    22. JK

      Well, that was prob-

    23. LF

      ... asymptotic kinda- kinda discussion or is it more like here's a hack that works well?

    24. JK

      Um, it's a little bit of both. Like there's information theory in it, I think somewhere.

    25. LF

      Okay. So it's-

    26. JK

      Yeah.

    27. LF

      ... it's actually trying to prove some kind of stuff.

    28. JK

      Yeah. But once- once you know the method, implementing it is an engineering problem. Now, there's a flip side of this which is in a big design team, what percentage of people think their- their- their, uh, their- their- their plan or their life's work is engineering versus design... inventing things? So, lots of companies will reward you for filing patents.

    29. LF

      Yes.

    30. JK

      Some... Many big companies get stuck because to get promoted you have to come up with something new. And then what happens is everybody's trying to do some random new thing, 99% of which doesn't matter, and the basics get neglected. And... Or they get to... There's a dichotomy. They think like the cell library and the basic CAD tools, you know, or basic, you know, software validation methods, that's simple stuff, you know. They want to work on the exciting stuff. And then they- they- they spend lots of time trying to figure out how to patent something. And that's mostly useless.

  3. 7:33 - 11:40

    Javascript

    1. JK

    2. LF

      I don't know if you know Brendan Eich, he wrote JavaScript in 10 days.

    3. JK

      Uh-huh.

    4. LF

      And that's an interesting story. It makes me wonder... And it was, you know, famously for many years considered to be a pretty crappy programming language.

    5. JK

      Mm-hmm.

    6. LF

      It still is perhaps. It's been improving sort of consistently. But the interesting thing about that guy is, you know, he doesn't get any awards. (laughs)

    7. JK

      (laughs)

    8. LF

      You don't get a Nobel Prize or a Fields Medal or-

    9. JK

      Uh-huh.

    10. LF

      ... uh, he might not even-

    11. JK

      For inventing a- a crappy piece of, you know, software code that-

    12. LF

      That- well, that is currently the number one programming language in the world and runs, now is cons- i- increasingly running the backend of the internet, the frontend of the internet.

    13. JK

      Well, does he, does he know why everybody uses it? Like, that would be an interesting thing. Was it the right thing at the right time? 'Cause like when stuff like JavaScript came out, like there was a move from, you know, writing C programs and C++ to, let's call it what they call managed code frameworks.

    14. LF

      Mm-hmm.

    15. JK

      Where you write simple code, it might be interpreted, it has lots of libraries, productivity is high and you don't have to be an expert. So, you know, Java was supposed to solve all the world's problems. It was complicated. JavaScript came out, you know, after a bunch of other scripting languages. I'm not an expert on it but-

    16. LF

      Yeah.

    17. JK

      ... but was it the right thing at the right time?

    18. LF

      The right thing at the right-

    19. JK

      Or was there something, you know, clever? 'Cause he wasn't the only one.

    20. LF

      There's a few elements. One is-

    21. JK

      And maybe if he figured out what it was-

    22. LF

      No, I think-

    23. JK

      ... then he'd get a prize. (laughs)

    24. LF

      (laughs)

    25. JK

      Like that's-

    26. LF

      Constructive theory. (laughs)

    27. JK

      Yeah. You know, maybe this problem is he hasn't defined this, or he just needs a good promoter.

    28. LF

      (laughs) Well, I think there's a bunch of blog posts written about it, which is like worse is better. Which is like doing the crappy thing fast, just like hacking together the thing that answers some of the needs-

    29. JK

      Mm-hmm.

    30. LF

      ... and then iterating over time, listening to developers, like listening to people who actually use the thing.

  4. 11:40 - 15:39

    RISC vs CISC

    1. JK

    2. LF

      Well, I mean, isn't- isn't that also the story of RISC versus CISC? I mean, is that simplicity? There's something about simplicity that, uh, in this evolutionary process is valued.

    3. JK

      Yeah.

    4. LF

      If it's simple, it's, uh, gets... It spreads faster, it seems like.

    5. JK

      Yeah.

    6. LF

      Or is that not always true?

    7. JK

      That's not always true. Yeah. It could be simple is good, but too simple is bad.

    8. LF

      So why did RISC win, you think, so far?

    9. JK

      Did RISC win?

    10. LF

      (laughs)

    11. JK

      We don't know.

    12. LF

      In the long arc of history, maybe not. (laughs)

    13. JK

      We, we don't know.

    14. LF

      So, who, who's gonna win? What, what's RISC, what's CISC, and who's gonna win in that space in these instruction sets?

    15. JK

      Well, A-, AI software's gonna win, but there'll be little computers that run little programs like normal all over the place. But, but we're, we're going through another transformation, so.

    16. LF

      B- but y- you think instruction sets underneath it all will change?

    17. JK

      Yeah, they evolve slowly. They, they don't matter very much.

    18. LF

      They don't matter very much, okay.

    19. JK

      Yeah. I mean, the, the limits of performance are, you know, predictability of instructions and data. I mean, that's the big thing. And then the usability of it is some, you know, quality of design, quality of tools, availability. Like, right now, x86 is proprietary with Intel and AMD, but they can change it any way they want independently.

    20. LF

      Mm-hmm.

    21. JK

      Right? Arm is proprietary to Arm, and they won't let anybody else change it. So, it's like a sole point. And RISC-V is open source, so anybody can change it, which is super cool. But that also might mean it gets changed in too many random ways that there's no common sub- subset of it that people can use.

    22. LF

      Do you like open or do you like closed? Like, if you were to bet all your money on one or the other, RISC-V versus A?

    23. JK

      No idea.

    24. LF

      It's case dependent?

    25. JK

      Well, x86, oddly enough, when Intel first started developing it, they licensed it to, like, seven people. So, it was the open architecture.

    26. LF

      (laughs)

    27. JK

      And then they moved faster than others and also bought one or two of them. But there was seven different people making x86. 'Cause at the time, there was 6502 and Z80s and, you know, 8086. And you could argue everybody thought Z80 was the better instruction set, but that was propriety to, proprietary to one place. Oh, and the 6800.

    28. LF

      So-

    29. JK

      There was, like, five differe- four or five different microprocessors. Intel went open, got the market share 'cause people felt like they had multiple sources from it, and then over time, it narrowed down to two players.

    30. LF

      So, why ... You as a historian, uh, wh- (laughs) why did Intel win for so long with, uh, with their processors? I mean, I-

  5. 15:39 - 17:09

    What makes a great processor?

    1. JK

      And al-

    2. LF

      What, what makes a great processor in that? What, you know?

    3. JK

      Oh, if you just look at its performance versus everybody else, it's, you know, the size of it, the, you know, usability of it.

    4. LF

      So, it's not specific, some kind of element that makes it beautiful, it's just, like, literally just raw performance. Is that how you think about processors, as just, like, raw performance?

    5. JK

      Of course. (laughs) It's like a horse race.

    6. LF

      So-

    7. JK

      The fastest one wins. Now-

    8. LF

      You don't care how. (laughs) Just as long as it wins.

    9. JK

      Well, well, there's the, the fastest in an environment, like-

    10. LF

      Right.

    11. JK

      ... you know, for years, you made the fastest one you could, and then people started to have power limits. So, then you made the fastest at the right power point.

    12. LF

      Yeah.

    13. JK

      And then, and then when we started doing multiprocessors, like, if you could scale your processors more than the other guy, you could be 10% faster on, like, a single thread, but you have more threads. So, there's lots of variability. And then Arm really explored, like, you know, they have the A series and the R series and the M series, like a family of processors for all these different design points, from, like, unbelievably small and simple. And so then when you're doing the design, it's sort of like this big palette of CPUs.

    14. LF

      Mm-hmm.

    15. JK

      Like, they're the only ones with a credible, you know, top-to-bottom palette, and ...

    16. LF

      Wh- wh- what do you mean a credible, uh, top-to-bottom palette?

    17. JK

      Well, there's people who make microcontrollers that are small, but they don't have a fast one. There's people who make fast processors, but don't have a litt- a medium one or a small one.

    18. LF

      Is that hard to do, that full palette? That's, that seems like a ...

    19. JK

      Yeah, it's a lot of different

  6. 17:09 - 18:58

    Intel vs ARM

    1. JK

      ...

    2. LF

      So, what's the difference between, uh, the Arm folks and Intel in terms of the way they're approaching this problem?

    3. JK

      Well, Intel, almost all their processor designs were, you know, very custom, high-end, you know, for the last 15, 20 years.

    4. LF

      For the fastest horse possible.

    5. JK

      Yeah.

    6. LF

      (laughs) In one horse race.

    7. JK

      And ... Yeah, and they, they, architecturally, they're really good, but the company itself was fairly insular to what's going on in the industry with CAD tools and stuff. And there's this debate about custom design versus synthesis, and how do you approach that? I, I'd say Intel was slow on the getting to synthesize processors. Arm came in from the bottom, and they generated IP, which went to all kinds of customers, so they had very little say in how the customer implemented their IP. So, Arm is super friendly to the synthesis IP environment, whereas Intel said, "We're gonna make this great client chip or server chip with our own CAD tools, with our own process, with our own, you know, other supporting IP, and everything only works with our stuff."

    8. LF

      So, was that, um... Is Arm winning the mobile platform space in terms of process?

    9. JK

      Of course, yeah.

    10. LF

      And so, i- in that, in... What you're describing is why they're winning.

    11. JK

      Well, they had lots of people doing lots of different experiments, so they controlled the processor architecture and IP, but they let people put in lots of different chips. And there was a lot of variability in what happened there. Whereas Intel, when they made their mobile, their foray into mobile, they had one team doing one part, right? So, it wasn't 10 experiments. And then their mindset was PC mindset, Microsoft software mindset, and that brought a whole bunch of things along that, uh, the mobile world and the embedded world don't do.

    12. LF

      Do you think it was possible for Intel to pivot hard and win the mobile market?

  7. 18:58 - 21:36

    Steve Jobs and Apple

    1. LF

    2. JK

      Sure.

    3. LF

      That's a hell of a difficult thing to do, right? For a huge company to just pivot. I mean, it's so interesting to... 'Cause we'll talk about your current work. It's like, it's clear that PCs were dominating for several decades, like desktop computers, and then mobile, it's unclear.

    4. JK

      It's a, it's a leadership question. Like Ap- like Apple under Steve Jobs, when he came back, they pivoted multiple times.

    5. LF

      Yeah.

    6. JK

      You know, they built iPads and iTunes and phones and tablets and great Macs. Like, like, who knew computers could be made out of aluminum? Nobody knew that. But they're great. It's super fun.

    7. LF

      That was Steve?

    8. JK

      Yeah, Steve Jobs. Like, they pivoted multiple times, and, uh, you know, the old Intel, they, they did that multiple times. They made DRAMs and processors and processes and...

    9. LF

      I gotta ask this. What was it like working with Steve Jobs?

    10. JK

      I didn't work with him.

    11. LF

      Did you interact with him?

    12. JK

      Twice.

    13. LF

      (laughs)

    14. JK

      I said hi to him twice in the cafeteria.

    15. LF

      What did he say? Hi?

    16. JK

      He said, "Hey, fellas."

    17. LF

      (laughs)

    18. JK

      He was friendly.

    19. LF

      (laughs)

    20. JK

      He was wandering around, uh, with somebody. He couldn't find a table 'cause the cafeteria was, was packed, and I gave him my table. But I worked for Mike Culbert, who talked to... Like, Mike, Mike was the unofficial CTO of Apple and a brilliant guy, and he worked for Steve for 25 years, maybe more. And he talked to Steve multiple times a day, and he was one of the people who could put up with Steve's, let's say, brilliance and intensity. And, and Steve really liked him, and Steve trusted Mike to translate the shit he thought up into engineering products that worked. And then Mike ran a group called Platform Architecture, and I was in that group. So, many times, I'd be sitting with Mike, and the phone would ring, and it'd be Steve. And Mike would hold the phone like this 'cause Steve would be yelling about something or other.

    21. LF

      Yeah, and then he would translate.

    22. JK

      And he'd translate, and then he would say, "Steve wants us to do this." So...

    23. LF

      Was Steve a good engineer or no?

    24. JK

      I don't know. He was a great idea guy.

    25. LF

      Idea person.

    26. JK

      And he was a really good selector for talent.

    27. LF

      Yeah. That seems to be-

    28. JK

      So, again-

    29. LF

      ... one of the key elements of leadership, right?

    30. JK

      And then he was a really good first principles guy. Like, like, somebody would say something couldn't be done, and he would just think, "That's obviously wrong," right? But, you know, maybe it's hard to do. Maybe it's expensive to do. Maybe we need different people. You know, there's like a whole bunch of... You know, like, if you want to do something hard, you know, maybe it takes time. Maybe you have to iterate. There's a whole bunch of things y- you could think about. But saying it can't be done is stupid.

  8. 21:36 - 27:21

    Elon Musk and Steve Jobs

    1. JK

    2. LF

      How would you compare... So, it seems like Elon Musk is more engineering centric, but is also... I think he considers himself a designer, too. He has a design mind.

    3. JK

      Yeah.

    4. LF

      Steve Jobs feels like he is much more idea space, design space versus engineering.

    5. JK

      Yeah.

    6. LF

      Just make it happen. Like, the world should be this way. Just figure it out.

    7. JK

      But, but he used computers. You know, he had computer people talk to him all the time. Like, Mike was a really good computer guy. He knew what computers could do.

    8. LF

      Computer meaning computer hardware, like low-level stuff?

    9. JK

      Yeah, hardware, software, all the pieces.

    10. LF

      The whole thing.

    11. JK

      And then he would, you know, have an idea about what could we do with this next that was grounded in reality. It wasn't like he was, you know, just finger painting on the wall and wishing somebody would interpret it. Like, so he had this interesting connection because, you know, he wasn't a computer architect or a designer, but he had an intuition from the computers we had to what could happen. And...

    12. LF

      It's interesting you say intuition, because it seems like he was pissing off a lot of engineers in his intuition about what can and can't be done. Those, the, like, the... What is it? All these stories about, like, floppy disks and all that kind of stuff like that.

    13. JK

      Yeah, so in, in Steve's the first round, like, he'd go into a lab and look at what's going on and hate it and, and, uh, fire people or, or ask somebody in the elevator what they're doing for Apple and, you know, not be happy. When he came back, my impression was, is he surrounded himself with a relatively small group of people.

    14. LF

      Yes.

    15. JK

      And didn't really interact outside of that as much. And then the joke was you'd see like a little, somebody moving a prototype through the, the quad with a, with a black blanket over it.

    16. LF

      (laughs)

    17. JK

      And that was 'cause it was secret, you know, partly from Steve 'cause they didn't want Steve to see it until it was ready.

    18. LF

      Yeah, the dynamic with Jony Ive and Steve is interesting. It's like you don't wanna... He ruins as many ideas as he generates.

    19. JK

      Yeah. Yeah.

    20. LF

      It's a dangerous kind of line to walk. I, I-

    21. JK

      And if you have a lot of ideas, like... Like, Gordon Bell was famous for ideas, right? And it wasn't that the percentage of good ideas was way higher than anybody else.

    22. LF

      (laughs)

    23. JK

      It was he had so many ideas, and, and he was also good at talking to people about it and getting the filters right and, you know, seeing through stuff. Whereas Elon was like, "Hey, I want to build rockets." So, Steve would hire a bunch of rocket guys, and Elon would go read rocket manuals.

    24. LF

      So, Elon is a better engineer, in a sense. Like, or, like, more, uh... like, a love and passion for the manuals. (laughs)

    25. JK

      Yeah. And the details-

    26. LF

      The details.

    27. JK

      ... and the data and the other stuff.

    28. LF

      The craftsmanship too, right? Well, I guess Steve had craftsmanship too, but of a different kind.

    29. JK

      Yeah.

    30. LF

      What do you make of the, just to stay in there for just a little longer, what do you make of, like, the anger and the passion and all of that? The, the firing and the mood swings and the madness, the im-, you know, being emotional and all of that, that's Steve? And I, I guess Elon too. So, what, is that a, is that a bug or a feature?

  9. 27:21 - 31:03

    Father

    1. LF

      Well, you, you're, you're probably looking for somebody's approval.

    2. JK

      Mm.

    3. LF

      Uh, i- i- even still.

    4. JK

      Yeah, maybe. I should think about that.

    5. LF

      Maybe somebody who's no longer with us kind of thing.

    6. JK

      Mm.

    7. LF

      I don't know.

    8. JK

      I used to call up my dad and tell him what I was doing. He was, he was very excited about engineering and stuff.

    9. LF

      You got his approval?

    10. JK

      Uh, yeah, a lot. I was lucky. Like, he, he decided I was smart and unusual as a kid, and that was okay, when I was really young. So when I d- like did poorly in school, I was dyslexic, I didn't read until I was third or fourth grade and they didn't care. My parents were like, "Oh, he'll be fine." So-

    11. LF

      Cool.

    12. JK

      ... I was lucky. That was cool.

    13. LF

      Is he still with us?

    14. JK

      No.

    15. LF

      You miss him?

    16. JK

      Mm-hmm. Sure, yeah. He had Parkinson's and then cancer. His last 10 years were tough. And it killed him. Killing a man like that's hard.

    17. LF

      The mind?

    18. JK

      Well, it was pretty good. Um, Parkinson's caused the slow dementia. And, uh, the c- the chemotherapy, I think, accelerated it. But it was like hallucinogenic dementia. So he was clever and funny and interesting and was, it was pretty unusual.

    19. LF

      Do you remember conversations, uh-

    20. JK

      Oh, yeah, of course.

    21. LF

      ... from that time? Like what, do you have fond memories of the guy?

    22. JK

      Yeah. Oh, yeah.

    23. LF

      Anything come to mind?

    24. JK

      Uh, a friend told me one time I could draw a computer on the whiteboard faster than anybody he'd ever met. And I said, "You should meet my dad." Like, when I was a kid, he'd come home and say, "I was driving by this bridge, and I was thinking about it." And he'd pull out a piece of paper and he'd draw the whole bridge.

    25. LF

      (laughs)

    26. JK

      He was a mechanical engineer.

    27. LF

      Yeah.

    28. JK

      And he would just draw the whole thing and then he would tell me about it and then tell me how he would've changed it. And he had this, you know, idea that he could understand and conceive anything. And I, I just grew up with that, so that was natural. So if, you know, like, when I interview people, I ask them to draw a picture of something they did on a whiteboard.

    29. LF

      Mm-hmm.

    30. JK

      And it's really interesting. Like, some people draw a little box, you know, and then they'll say, "And then this talks to this, and..."

  10. 31:03 - 37:18

    Perfection

    1. JK

    2. LF

      (laughs) Do you think the perfect is the enemy of the good in hardware and software engineering? It's like, we were talking about JavaScript a little bit, and the messiness of the 10-day building process.

    3. JK

      Yeah, it's- that's, you know, creative tension, right?

    4. LF

      Hmm.

    5. JK

      Th- the, so creative tension is when you have two different ideas that you can't do both, right?

    6. LF

      Right.

    7. JK

      And the- and, but the fact that you want to do both causes you to go try to solve that problem. That's the creative part. So, if you're building computers, like some people say, "We have this schedule, and anything that doesn't fit in the schedule, we can't do." Right? And so they, they throw out the perfect because they have a schedule. I hate that.

    8. LF

      (laughs)

    9. JK

      Right? Then there's other people that say, "We need to get this perfectly right and no matter what." You know, more people, more money, right? And there's a really clear idea about what you want, and some people are really good at articulating it, right? So, so let's call that the perfect, yeah.

    10. LF

      Yeah.

    11. JK

      All right, but that's also terrible because then you never ship anything and you never hit any goals. So, now you have the, now you have your framework.

    12. LF

      Yes.

    13. JK

      You can't throw out stuff because you can't get it done today because maybe you'll get it done tomorrow with the next project, right? You can't... Y- so you have to... I worked with a guy that I really liked working with, but he over filters his ideas.

    14. LF

      Over filters?

    15. JK

      He'd start thinking about something, and as soon as he figured out what was wrong with it, he'd throw it out.

    16. LF

      Hmm.

    17. JK

      And then I start thinking about it, and like, you know, you come up with an idea, and then you find out what's wrong with it, and then you le- give it a little time to set because sometimes, you know, you figure out how to tweak it, or maybe that idea helps some other idea. So, idea generation is really funny. So, you have to give your ideas space, like spaciousness of mind is key, but you also have to execute programs and get shit done. And then it turns out computer engineering is fun because it takes, you know, 100 people to build a computer, 200 or 300, whatever the number is. And people are so variable about, you know, temperament and, you know, skill sets and stuff that in a, in a big organization, you find the, the people who love the perfect ideas and the people that want to get stuff done yesterday, and people like that- that come up with ideas, and people who like to, let's say, shoot down ideas. And it takes the whole... It takes a large group of people.

    18. LF

      So, some are good at generating ideas, some are good at filtering ideas, and then all th- in that, uh, giant mess, you're somehow... I guess the goal is for that giant mess of people to, uh, find the perfect path through the-

    19. JK

      Mm-hmm.

    20. LF

      ... the tension, the creative tension. But like, how do you know when... You said there's some people good at articulating what perfect looks like, what a good design is.

    21. JK

      Mm-hmm.

    22. LF

      Like, if you're sitting in a, in a room, and, uh, you have a set of ideas about like how to design, uh, a better processor. How do you know this is, this is something special here, this is a good idea, let's try this?

    23. JK

      So, have you ever brainstormed an idea with a couple of people that were really smart? And you kind of go into it, and you, you don't quite understand it, and you're working on it, and then you start, you know, talking about it, putting it on the whiteboard. Maybe it takes days or weeks, and then your brains start to kind of synchronize. It's really weird.

    24. LF

      (laughs) What's the-

    25. JK

      And like you start to see what each other is thinking.

    26. LF

      Yeah.

    27. JK

      And, and it starts to work. Like, you can see it work. Like, my talent in computer design is I can, I can see how computers work in my head, like really well. And I know other people can do that too. And when you're working with people that can do that, like, it, it is kind of a, an amazing experience. And then... And every once in a while, you, you get to that place, and then you find the flaw in it, which is kind of funny because you, you can, you can fool yourself in, but-

    28. LF

      The two of you kind of drifted a- a long, uh-

    29. JK

      Yeah, yeah, you got these-

    30. LF

      ... into a direction that was useless. (laughs)

  11. 37:18 - 42:52

    Modular design

    1. LF

      What, uh, computing hardware or, um, just any kind, even software design are you, uh, do you find beautiful? From your own work, from o- o- o- other people's work that you're just, uh... We were just talking about the, the battleground of flaws and mistakes and errors, but things that were just beautifully done. Is there something that pops to mind?

    2. JK

      Well, when things are beautifully done, usually there's a well-thought-out set of abstraction layers. Like-

    3. LF

      So, the whole thing works qu- like, in unison nicely.

    4. JK

      Yes. And, and when I, when I say abstraction layer, that means two different components, when they work together, they work independently. They don't have to know what the other one is doing.

    5. LF

      Hmm. So, that decoupling.

    6. JK

      Yeah. So, the, the famous one was, uh, the network stack. Like, there's a seven-layer network stack.

    7. LF

      Yep.

    8. JK

      You know, data transport and protocol and all the layers. And the innovation was, when they really got that right, 'cause networks before that didn't define those very well, the layers could innovate independently, and occasionally the layer boundary would, would, you know, the interface would be upgraded. And that, that let, you know, the, the design space breathe.

    9. LF

      Mm-hmm.

    10. JK

      And pe- you could do something new in layer seven without having to worry about how layer four worked.

    11. LF

      Right.

    12. JK

      And so good design does that. And you see it in processor designs. When we did, um, the Zen design at AMD, we made several components very modular. And, you know, my insistence at the top was I wanted all the interfaces defined before we wrote the RTL for the pieces. One of the verification leads said, "If we do this right, I can test the pieces so well independently, when we put it together, we won't find all these interaction bugs 'cause the floating point knows how the cache works." And I was a little skeptical, but he was mostly right, that the, the modularity of the design greatly improved the quality.
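
The Zen anecdote — interfaces pinned down before the pieces are written, so each piece can be verified alone — has a direct software analogue. Here is a minimal sketch; the names (CacheInterface, FPU, FakeCache) are invented for illustration, not AMD's actual design:

```python
from typing import Protocol

# Define the interface first, before any implementation exists.
class CacheInterface(Protocol):
    def read(self, addr: int) -> int: ...
    def write(self, addr: int, value: int) -> None: ...

# The floating-point unit is written and tested against the interface,
# never against a concrete cache, so it can't grow hidden dependencies
# on "how the cache works".
class FPU:
    def __init__(self, cache: CacheInterface):
        self.cache = cache

    def fused_add(self, a_addr: int, b_addr: int, out: int) -> None:
        self.cache.write(out, self.cache.read(a_addr) + self.cache.read(b_addr))

# For unit tests, a trivial stand-in satisfies the same interface.
class FakeCache:
    def __init__(self):
        self.mem = {}
    def read(self, addr):
        return self.mem.get(addr, 0)
    def write(self, addr, value):
        self.mem[addr] = value

fpu = FPU(FakeCache())
fpu.cache.write(0, 2); fpu.cache.write(1, 3)
fpu.fused_add(0, 1, 2)
assert fpu.cache.read(2) == 5  # the piece is testable in isolation
```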

    13. LF

      Is that universally true in general? Would you say about good designs, the modularity is, uh, like usually modular?

    14. JK

      Well, we talked about this before. Humans are only so smart, like, like-

    15. LF

      (laughs)

    16. JK

      ... and we're not getting any smarter, right? But the complexity of things is going up.

    17. LF

      Yeah.

    18. JK

      So, you know, a, a beautiful design can't be bigger than the person doing it. It's just, you know, their piece of it. Like, the odds of you doing a really beautiful design of something that's way too hard for you is low, right? If it's way too simple for you, it's not that interesting. It's like, "Well, anybody could do that." But when you get the right match of your, your expertise and, you know, mental power to the right design size, that's cool, but that's not big enough to make a meaningful impact on the world. So now, you have to have some framework to design the pieces-

    19. LF

      Yes.

    20. JK

      ... so that the whole thing is big and harmonious, but, you know, when you put it together, it's, you know, it's sufficiently interesting to, to be used and, you know. So that's what a good, beautiful design is.

    21. LF

      Matching the limits of that human cognitive capacity to, uh, to the modular you can create and creating a nice interface between those modules. And thereby, do you think there's a limit to the kind of beautiful complex systems we can build with this kind of modular design? It's like, uh, you know, if, if we build increasingly more complicated... You can think of, like, the internet. Okay, let's scale it down.

    22. JK

      Well-

    23. LF

      Like, you can think of, like, social network, like Twitter-

    24. JK

      Mm-hmm.

    25. LF

      ... as one computing system.

    26. JK

      Mm-hmm.

    27. LF

      And, but those are little modules.

    28. JK

      Yeah.

    29. LF

      Right? That's-

    30. JK

      But it's built on, it's built on so many components nobody at Twitter even understands.

  12. 42:52 - 49:50

    Moore's law

    1. JK

    2. LF

      Well, let's go... Let's talk about Moore's Law a little bit. It's, uh-

    3. JK

      Mm-hmm.

    4. LF

      Uh, the broad view of Moore's Law was just exponential improvement of, uh, computing capability. Uh, like, OpenAI, for example, recently, uh, published this kind of... papers looking at the exponential improvement in the training efficiency of neural networks.

    5. JK

      Mm-hmm.

    6. LF

      For, like, ImageNet and all that kind of stuff, we just got better on this... And this is purely software side-

    7. JK

      Mm-hmm.

    8. LF

      ... just figuring out better tricks and algorithms for training neural networks, and that seems to be improving, uh, significantly faster than the Moore's Law prediction, you know?

    9. JK

      Mm-hmm.

    10. LF

      So that's in the software space. Like, what do you think... If Moore's Law continues or if the general version of Moore's Law continues, do you think that comes mostly from the hardware, from the software, some mix of the two? Some interesting totally, uh... So not, not the reduction of the size of the transistor kind of thing, but more in the, uh, uh, in the totally interesting kinds of innovations-

    11. JK

      Mm-hmm.

    12. LF

      ... in the hardware space, all that kind of stuff?

    13. JK

      Well, there's, like, a half a dozen things going on in that graph. So one is, there's initial innovations that had a lot of headroom to be exploited. So, you know, the efficiency of the networks has improved dramatically. And then the decomposability of those and the, the, the use go... You know, they started running on one computer, then multiple computers, then multiple GPUs, and then arrays of GPUs, and they're up to thousands. And at some point... So, so it's sort of like they were consume- they were going from, like, a single computer application to a thousand computer application. So that's not really a Moore's Law thing, that's an independent vector. How many computers can I put on this problem?

    14. LF

      Yeah.

    15. JK

      'Cause the computers themselves are getting better on, like, a Moore's Law rate, but their ability to go from 1 to 10 to 100 to 1,000-

    16. LF

      Yeah.

    17. JK

      ... you know, was something. And then multiplied by, you know, the amount of computes it took to resolve like AlexNet to ResNet to transformers. It's, it's been quite, you know, steady improvements.

    18. LF

      But those are like S-curves, aren't they?

    19. JK

      Yeah.

    20. LF

      That's exactly the kind of-

    21. JK

      Yeah.

    22. LF

      ... S-curves that are underlying Moore's Law from the very beginning.

    23. JK

      Yeah, so-

    24. LF

      So what, what's the biggest... What's the most, uh, productive, uh, rich source of S-curves in the, in the future do you think? Is this hardware? Is it software? Or is it's-

    25. JK

      So hardware is gonna move along relatively slowly, like, you know, double performance every two years.

    26. LF

      (laughs)

    27. JK

      There, there's still-

    28. LF

      I like how you call that slow.

    29. JK

      Yeah, it's the slow version. The snail's pace of Moore's Law. Maybe we should, we should, uh-

    30. LF

      (laughs)

  13. 49:50 - 56:44

    Hardware for deep learning

    1. JK

    2. LF

      But, but speaking about this, uh...

    3. JK

      Yeah.

    4. LF

      ... uh, this walk along the path of innovation towards, uh, the dumb things being smarter than humans, you are now-

    5. JK

      Mm-hmm.

    6. LF

      ... the CTO of (laughs) of, uh, Tenstorrent.

    7. JK

      Mm-hmm.

    8. LF

      Two- as of two months ago. They, uh, build hardware for deep learning.

    9. JK

      Mm-hmm.

    10. LF

      Uh, how do you build scalable and efficient deep learning? This is such a fascinating space.

    11. JK

      Yeah, yeah. So it's interesting. So, um, up until recently, I thought there was two kinds of computers. There are serial computers that run like C programs, and then there's parallel computers. So the way I think about it is, you know, parallel computers y- have given parallelism. Like, GPUs are great because you have a million pixels.

    12. LF

      Mm-hmm.

    13. JK

      And modern GPUs run a program on every pixel. They call it a shader program, right? So, or, like, finite element analysis. You, you build something, you know, you make this into little tiny chunks, you give each chunk to a computer, so you're given all these chunks, you have parallelism like that. But most C programs, you write this linear narrative, and you have to make it go fast. To make it go fast, you predict all the branches, all the data fetches, and you run that more in parallel, but that's found parallelism.

    14. LF

      Mm-hmm.

    15. JK

      Um, AI is... I'm still trying to decide how fundamental this is. It's a given parallelism problem.

    16. LF

      Mm-hmm.

    17. JK

      But the way people describe the neural networks and then how they write them in PyTorch, it makes graphs.

    18. LF

      Yeah. That might be fundamentally different than the GPU kind of-

    19. JK

      Parallelism? Yeah, it might be. Because the, when you run the GPU program on all the pixels, you're running like, you know, depends, you know, this group of pixels, say it's background blue and it runs a really simple program. This pixel is, you know, some patch of your face, so you have some really interesting shader program to give you impression of translucency, but the pixels themselves don't talk to each other. There's no graph, right? So you, you do the image and then you do the next image and you do the next image and you run eight million pixels, eight million programs every time, and modern GPUs have like 6,000-

    20. LF

      Mm-hmm.

    21. JK

      ... thread engines in 'em. So, you know, they got eight million pixels. Each one runs a program on, you know, 10 or 20 pixels, and that's how, uh, th- that's how they work. There's no graph.

    22. LF

      But you think graph might be a totally, uh, new way to think about hardware?

    23. JK

      So, Raja Koduri and I have been having this good conversation about given versus found parallelism, and then the kind of walk as we got more transistors, like, you know, computers way back when did stuff on scalar data, then we did it on vector data, famous vector machines. Now we're making computers that operate on matrices, right? And then the, the ca- the category we, we said that was next was spatial. Like, imagine you have so much data that, you know, you want to do the compute on this data, and then when it's done, it says send the result of this pile of data, run some software on that.

    24. LF

      Mm-hmm.

    25. JK

      And it's better to, to think about it spatially than to move all the data to a central processor and do all the work.
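
The scalar-to-vector-to-matrix walk in miniature, with NumPy standing in for the hardware's vector and matrix units; each step performs the same arithmetic but hands the machine a bigger, more regular unit of work:

```python
import numpy as np

a = np.random.rand(512, 512)
b = np.random.rand(512, 512)

# Scalar era: one multiply-add per step.
s = 0.0
for x, y in zip(a[0], b[0]):
    s += x * y

# Vector era: one operation sweeps a whole row (a dot product).
s_vec = a[0] @ b[0]

# Matrix era: one operation is an entire matrix multiply.
c = a @ b

assert np.isclose(s, s_vec) and np.isclose(c[0, 0], a[0] @ b[:, 0])
```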

    26. LF

      So spatially, you mean moving in the space of data as opposed to moving the data?

    27. JK

      Yeah. You know, you have a, you have a petabyte data space spread across some huge array of computers, and when you do a computation somewhere, you send the result of that computation or maybe a pointer to the next program to some other piece of data and do it. But I think the... a better word might be graph, and all the AI neural networks are graphs. Do some computations, send the result here, do another computation, do a data transformation, do a merging, do a pooling, do another computation.
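
As code, "do some computations, send the result here" is a dataflow graph. The toy executor below is a conceptual sketch of graph execution in general, not Tenstorrent's scheduler; the node names and ops are invented:

```python
import numpy as np

# Each node: a compute function plus the names of its input nodes.
graph = {
    "x":      (lambda: np.ones((4, 4)), []),
    "w":      (lambda: np.eye(4) * 2.0, []),
    "matmul": (lambda x, w: x @ w,          ["x", "w"]),
    "relu":   (lambda m: np.maximum(m, 0),  ["matmul"]),
    "pool":   (lambda r: r.mean(axis=0),    ["relu"]),
}

def run(graph, out):
    """Execute nodes on demand; each result 'flows' to whoever needs it."""
    cache = {}
    def value(name):
        if name not in cache:
            fn, deps = graph[name]
            cache[name] = fn(*[value(d) for d in deps])
        return cache[name]
    return value(out)

print(run(graph, "pool"))  # [2. 2. 2. 2.]
```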

    28. LF

      Is it possible to compress and say how we make this thing efficient, this whole process efficient? There's different...

    29. JK

      So first, uh, the fundamental elements in the graphs are things like matrix multiplies, convolutions-

    30. LF

      Okay.

  14. 56:44 - 1:04:22

    Making neural networks fast at scale

    1. JK

      working on now.

    2. LF

      So, uh, the... I think it's called the Grayskull processor-

    3. JK

      Mm-hmm.

    4. LF

      ... uh, introduced last year. It's, uh, you know, there's a bunch of measures of performance, we were talking about-

    5. JK

      Mm-hmm.

    6. LF

      ... horses.

    7. JK

      Mm-hmm.

    8. LF

      It, uh, does 368 trillion operations per second, and seems to out- outperform NVIDIA's Tesla T4 system.

    9. JK

      Mm-hmm.

    10. LF

      So these are just numbers.

    11. JK

      Mm-hmm.

    12. LF

      What do they actually mean in real world perform- like what are the metrics for you that you're chasing? In, in your horse race, like what do you care about?

    13. JK

      Well, first the... So the, the native language of, you know, people who write AI network programs is PyTorch now, PyTorch, TensorFlow, there's a couple others. So-

    14. LF

      Do you think PyTorch has won over TensorFlow, or is it just a-

    15. JK

      I'm not an expert on that.

    16. LF

      Okay.

    17. JK

      I, I know many people who have switched from TensorFlow to PyTorch.

    18. LF

      Yeah.

    19. JK

      And there's technical reasons for it, and openness-

    20. LF

      I use both. Both are still awesome.

    21. JK

      Both are still awesome.

    22. LF

      But the deepest love is for PyTorch currently.

    23. JK

      Yeah. There, there's more love for that. And that, that may change. So the first thing is when they write their programs, can the hardware execute it pretty much as it was written?

    24. LF

      Mm-hmm.

    25. JK

      Right? So PyTorch turns into a graph, we have a graph compiler that makes that graph, then, like, it fractions the graph down, so if you have big matrix multiply, we turn it into right size chunks to run on the processing elements. It hooks all the graph up, it lays out all the data. There's a couple of mid-level in- representations of it that are also simulatable so that if you're writing the code you can see how it's gonna go through the machine, which is pretty cool. And then at the bottom it schedules kernels, like m- math, data manipulation, data movement kernels, which do this stuff. So we don't have to run, write a little program to do matrix multiply, 'cause we have a big matrix multiplier. Like there's no SIMD program for that. But, uh, there is scheduling for that, right? So the, the... One of the goals is, if you write a piece of PyTorch code that looks pretty reasonable, you should be able to compile it and run it on the hardware without having to tweak it and, and do all kinds of crazy things to get performance.
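
A toy version of the "fractions the graph down" step: one big matrix multiply split into tile-sized sub-multiplies that could each be handed to a processing element. The tile size is an arbitrary assumption, and the real compiler's partitioning and scheduling are far more involved; this is just the shape of the idea:

```python
import numpy as np

TILE = 64  # assumed processing-element-friendly chunk size

def tiled_matmul(A, B, tile=TILE):
    """Compute A @ B as a grid of tile-sized sub-multiplies ("kernels")."""
    n, k = A.shape
    k2, m = B.shape
    assert k == k2
    C = np.zeros((n, m))
    for i in range(0, n, tile):
        for j in range(0, m, tile):
            for p in range(0, k, tile):
                # Each of these is a right-sized chunk a single processing
                # element could run; a scheduler orders them all.
                C[i:i+tile, j:j+tile] += A[i:i+tile, p:p+tile] @ B[p:p+tile, j:j+tile]
    return C

A = np.random.rand(256, 256)
B = np.random.rand(256, 256)
assert np.allclose(tiled_matmul(A, B), A @ B)
```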

    26. LF

      There's not a lot of intermediate steps.

    27. JK

      Right.

    28. LF

      It's running directly as written.

    29. JK

      Like on a GPU, if you write a large matrix multiply naively, you'll get 5 to 10% of the peak performance of the GPU.

    30. LF

      Hmm.
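
The "5 to 10% of peak" point, in exaggerated miniature: the same matrix multiply written naively versus routed through a tuned kernel. Python overstates the gap wildly, but the shape of the problem is the same — code "as written" leaves most of the machine idle unless something reorganizes the work:

```python
import time
import numpy as np

n = 128
A, B = np.random.rand(n, n), np.random.rand(n, n)

def naive_matmul(A, B):
    """One scalar multiply-add at a time, exactly as the math is written."""
    C = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            s = 0.0
            for k in range(n):
                s += A[i, k] * B[k, j]
            C[i, j] = s
    return C

t0 = time.perf_counter(); C1 = naive_matmul(A, B)
t1 = time.perf_counter(); C2 = A @ B  # tuned BLAS kernel underneath
t2 = time.perf_counter()
assert np.allclose(C1, C2)
print(f"naive: {t1 - t0:.2f}s   tuned: {t2 - t1:.5f}s")
```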

  15. 1:04:22 - 1:08:36

    Andrej Karpathy and Chris Lattner

    1. LF

      Good. I love the idea of you inside a room with, uh, Karpathy, Andrej Karpathy and Chris Lattner.

    2. JK

      Mm-hmm.

    3. LF

      Uh, v- very, um, very interesting, very brilliant people, very out of the box thinkers-

    4. JK

      Mm-hmm.

    5. LF

      ... but also, like, first principles thinkers.

    6. JK

      Well, they both get stuff done. They not only get their own projects done, they, they talk about it clearly, they educate large numbers of people, and they've created platforms for other people to go do their stuff on.

    7. LF

      Yeah, the, the clear thinking that's able to be communicated-

    8. JK

      Yeah.

    9. LF

      ... is kind of i- impressive.

    10. JK

      It's kind of remarkable, the... Yeah, I'm a fan.

    11. LF

      Well, l- l- let me ask, 'cause, um, I, I talk to Chris actually a lot these days.

    12. JK

      Mm-hmm.

    13. LF

      He's been, uh... One, one of the c- just to give him a shout out in, in this-

    14. JK

      Mm-hmm.

    15. LF

      He's been so supportive as a human being. So everybody's quite different, like great engineers are different, but he's been, like, sensitive to the human element-

    16. JK

      Mm-hmm.

    17. LF

      ... in a way that's been fascinating. Like, he was one of the early people on, on this stupid podcast that I do to say, like-

    18. JK

      Yeah.

    19. LF

      ... "Don't quit this thing."

    20. JK

      Mm-hmm.

    21. LF

      And also, "Talk to whoever the hell you want to talk to."

    22. JK

      Mm-hmm.

    23. LF

      That kind of, from a legit engineer, to get, like, props-

    24. JK

      Mm-hmm.

    25. LF

      ... and be like, "You can do this."

    26. JK

      Mm-hmm.

    27. LF

      That was, I mean, that's what-

    28. JK

      That's good.

    29. LF

      ... a good leader does, right? Is they just kinda-

    30. JK

      Mm-hmm.

  16. 1:08:36 - 1:12:43

    How GPUs work

    1. JK

    2. LF

      On the, either the TPU or maybe the NVIDIA GPU side, how does Tenstorrent, you think, or the ideas underlying it... It doesn't have to be Tenstorrent, just this kind of graph-focused, uh, graph-centric hardware, deep learning-centric hardware beat NVIDIA's? Do, do you think it's possible for it to basically overtake NVIDIA?

    3. JK

      Sure.

    4. LF

      What's, what's that process look like? What's that, uh, journey look like, you think?

    5. JK

      Well, GPUs were built to run shader programs on millions of pixels, not to run graphs.

    6. LF

      Yes.

    7. JK

      So there's a hypothesis that says the way the graphs, you know, are built, it's going to be really, uh, inefficient to compute them on that. And then the, the primitive is not a simple program, it's matrix multiply, convolution. And then the data manipulations are, are fairly extensive about... Like how do you do a fast transpose with a program? I don't know if you've ever written a transpose program. They're ugly and slow, but in hardware you can do really well. Like, I'll give you an example. So when GPU accelerators first started doing triangles, like if you have a triangle which maps onto a set of pixels.

    8. LF

      Mm-hmm.

    9. JK

      So you build... It's very easy, straightforward to build a hardware engine that'll find all those pixels.

    10. LF

      Mm-hmm.

    11. JK

      And it's kind of weird because you walk along the triangle till you get to the edge, and then you have to go back down to the next row and walk along, and then you have to decide on the edge if the line of the triangle is like half on the pixel.

    12. LF

      Mm-hmm.

    13. JK

      What's the pixel color? Because it's half of this pixel and half the next one. That's called rasterization.

    14. LF

      Y- 'Cause... Y- you're saying that can be done in, uh, in hardware?

    15. JK

      No, I'm just... That's an example of an operation that, as a software program, is really bad. I've written a program that did rasterization. The hardware that does it has actually less code than the software program that does it, and it's way faster, right? So there are certain times when the abstraction you have, rasterize a triangle-

    16. LF

      Mm-hmm.

    17. JK

      ... you know, execute a graph, you know, components of a graph, the, the right thing to do in the hardware-software boundary is for the hardware to naturally do it.
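
For concreteness, here is a minimal software rasterizer of the kind being described: test every pixel center in the triangle's bounding box against the three edges and emit the covered ones. The half-covered-pixel question raised above is exactly the sample-at-the-center decision below; real hardware does much finer coverage and blending. A sketch, not production code:

```python
def rasterize(v0, v1, v2):
    """Yield integer pixel coordinates covered by a 2D triangle (edge-function test)."""
    xs = [v0[0], v1[0], v2[0]]
    ys = [v0[1], v1[1], v2[1]]

    def edge(a, b, p):  # signed area: which side of edge a->b is point p on?
        return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

    if edge(v0, v1, v2) == 0:  # degenerate triangle covers nothing
        return
    for y in range(int(min(ys)), int(max(ys)) + 1):
        for x in range(int(min(xs)), int(max(xs)) + 1):
            p = (x + 0.5, y + 0.5)  # sample at the pixel center
            w0, w1, w2 = edge(v1, v2, p), edge(v2, v0, p), edge(v0, v1, p)
            # Inside if all three edge tests agree with the triangle's winding.
            if (w0 >= 0 and w1 >= 0 and w2 >= 0) or (w0 <= 0 and w1 <= 0 and w2 <= 0):
                yield x, y

pixels = list(rasterize((0, 0), (8, 0), (0, 8)))
print(len(pixels), "pixels covered")
```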

    18. LF

      So the GPU is really optimized for the rasterization of triangles? (laughs)

    19. JK

      Well, no, that's just... Well, like in a modern, you know... That's a small piece of modern GPUs.

    20. LF

      Mm-hmm.

    21. JK

      What they did is... That... They still rasterize triangles when you're running the game, but for the most part, most of the computation area of the GPU is running shader programs, but they're single threaded programs on pixels, not graphs.

    22. LF

      I have to be honest and say I don't actually know the, the math behind shader, uh, sh- shading and lighting and all that kind of stuff. I don't know what...

    23. JK

      They look like little simple floating point programs, or complicated ones. You can have 8,000 instructions in a shader program.

    24. LF

      But I, I don't have a good intuition why it could be parallelized so easily.

    25. JK

      No, it's because you have eight million pixels in every single... So when you have a light, right?

    26. LF

      Yeah.

    27. JK

      That comes down, the angle... You know, the amount of light... Like, like say this is a line of pixels across this table, right? The amount of light on each pixel is subtly different, right?

    28. LF

      And each pixel is responsible for figuring out what

    29. JK

      ... figuring out. So that pixel says, "I'm this pixel. I know the angle of the light, I know the occlusion, I know the color I am."

    30. LF

      Mm-hmm.
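
The per-pixel independence is the whole trick: the same small program runs at every pixel, and none of them reads its neighbors, which is why millions of copies can run at once. A sketch of the simplest version of that light calculation (diffuse/Lambertian shading; all the per-pixel inputs are random placeholders):

```python
import numpy as np

H, W = 720, 1280
# Per-pixel inputs: surface normal, base color, occlusion — placeholders here.
normals = np.random.rand(H, W, 3) - 0.5
normals /= np.linalg.norm(normals, axis=2, keepdims=True)
albedo = np.random.rand(H, W, 3)
occlusion = np.random.rand(H, W, 1)
light_dir = np.array([0.0, 0.7071, 0.7071])  # one light shared by the frame

# "I'm this pixel. I know the angle of the light, I know the occlusion,
# I know the color I am." Every pixel computes its own answer independently.
ndotl = np.clip(normals @ light_dir, 0.0, None)[..., None]
image = albedo * ndotl * occlusion  # shape (H, W, 3)
```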

  17. 1:12:43 - 1:17:23

    Tesla Autopilot, NVIDIA, and Mobileye

    1. JK

      And, and NVIDIA invested for years in CUDA, first for HPC and then they got lucky with the AI trend.

    2. LF

      But do you think they're going to essentially not be able to hardcore pivot out of their...

    3. JK

      We'll see. That's always interesting. How often do big companies hardcore pivot? Occasionally.

    4. LF

      How much do you know about the NVIDIA folks?

    5. JK

      Some.

    6. LF

      Some?

    7. JK

      Yeah.

    8. LF

      Well, it's, um, I'm, I'm curious as well, who's ultimately... Is, uh-

    9. JK

      Oh, they've, they've innovated several times, but they've also worked really hard on mobile, they worked really hard on radios, you know. You know, they're fundamentally a GPU company.

    10. LF

      Well, they tried to pivot. There's an in- interesting little, uh, game and play in autonomous vehicles, right, with, or, uh, semi-autonomous, like playing with Tesla and so on and seeing that's a, dipping a toe into that kind of pivot.

    11. JK

      They came out with this platform, which was interesting technically.

    12. LF

      Yeah.

    13. JK

      But it was like a 3,000-watt, you know, it was 1,000-watt, three- $3,000, you know, GPU platform.

    14. LF

      I don't know if it's interesting technically. It's interesting philosophically. I, I, technically, I don't know if it's the execution, the craftsmanship was there. I'm not sure. But I, I didn't get a sense-

    15. JK

      I think they were repurposing GPUs for an automotive solution.

    16. LF

      Right, it's not a real pivot.

    17. JK

      They didn't, they didn't build a ground-up solution.

    18. LF

      Right.

    19. JK

      Like the, the, like the chips inside Tesla are pretty cheap, like Mobileye's been doing this. They're, they're doing the classic work from the simplest thing.

    20. LF

      Yeah.

    21. JK

      You know, they were building 40-, uh, square-millimeter chips. And NVIDIA, their solution had two 800-square-millimeter chips and two 200-square-millimeter chips, and you know, like boatloads of really expensive DRAMs. And, and, you know, it's a really different approach.

    22. LF

      And-

    23. JK

      So Mobileye fit the, let's say, automotive cost and form factor, and then they added features as it was economically viable and NVIDIA said, "Take the biggest thing, and l- we're gonna go make it work," you know. And, and that's also influenced, like Waymo, there's a whole bunch of autonomous startups where they have a 5,000-watt server in their trunk.

    24. LF

      Mm-hmm.

    25. JK

      Right? And, but that's, that's 'cause they think, "Well, 5,000 watts and, you know, $10,000 is okay, 'cause it's replacing a driver." Elon's approach was, "That board has to be cheap enough to put it in every single Tesla, whether they turn on a- autonomous driving or not." Which, and Mobileye was like, "We need to fit in the BOM and, you know, cost structure that car companies do," so they may sell you a GPS for 1,500 bucks, but the BOM for that's like $25.

    26. LF

      Well, and, uh, for Mobileye, it seems like neural networks were not first-class citizens, like the computation. They didn't start out as a-

    27. JK

      Yeah, it was a CV problem, you know, they-

    28. LF

      Yeah. And-

    29. JK

      ... they did classic CV and found stoplights and lines, and they were really good at it.

    30. LF

      Yeah, and they never, I mean, I don't know what's happening now, but they never fully pivoted. I mean, it's like, it's the NVIDIA thing. And then, as opposed to-

  18. 1:17:23 - 1:23:43

    Andrej Karpathy and Software 2.0

    1. LF

      Well, the one really important thing is also what they're doing well is how to iterate that quickly, which means like it's not just about one-time deployment, one building. It's constantly iterating the network and trying to automate as many steps as possible, right?

    2. JK

      Yeah.

    3. LF

      And that's actually the principles of the Software 2.0, like you mentioned with Andrej, is, uh, it's not just... I mean, I don't know what the actual, his description of Software 2.0 is, if it's just high-level philosophical or there are specifics, but the interesting thing about what that actually looks like in the real world is that, uh, what I think Andrej calls a data engine. It's like, it's the iterative improvement of the thing.

    4. JK

      Mm-hmm. Yeah.

    5. LF

      You have a neural network that, uh, does stuff, fails on a bunch of things, and learns from it over and over and over. So you're constantly discovering edge cases.

    6. JK

      Mm-hmm.

    7. LF

      So it's very much about, uh, like data engineering, like figuring out... It's, it's, it's kind of what you were talking about with Tenstorrent, is you have the data landscape, and you have to walk along that data landscape in a way that, uh, that's constantly improving the, the, the neural network, and that, that feels like that's the central piece that they can't solve.

    8. JK

      Yeah, so, and there's two pieces of it. Like, you, you find edge cases that don't work, and then you define something that goes get your data for that.

    9. LF

      Mm-hmm.

    10. JK

      But then the other constraint is whether you have to label it or not. Like the, the, the amazing thing about like the GPT-3 stuff is it's unsupervised.
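
A schematic of the "data engine" loop described above. Every function here is a stub invented for illustration — none of this is a real Tesla or Autopilot API; the shape of the loop is the point:

```python
# Placeholder stubs standing in for real machinery.
def deploy(model): pass                               # run the network in the field
def mine_failures(model): return ["hard case"]        # find inputs where it fails
def collect_similar(cases): return cases * 10         # go get more data like that
def needs_labels(data): return True                   # the labeling constraint
def label(data): return [(x, "label") for x in data]  # expensive: humans in the loop
def retrain(model, data): return model + len(data)

def data_engine(model, rounds=3):
    for _ in range(rounds):
        deploy(model)
        edge_cases = mine_failures(model)   # "you find edge cases that don't work"
        data = collect_similar(edge_cases)  # "...something that goes get your data for that"
        if needs_labels(data):              # unsupervised data (à la GPT-3) skips this
            data = label(data)
        model = retrain(model, data)        # fold it back in, redeploy, repeat
    return model

data_engine(model=0)
```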

Episode duration: 2:39:14
