Lex Fridman Podcast
Chris Lattner: Future of Programming and AI | Lex Fridman Podcast #381
- 0:00 – 2:20
Introduction
- CLChris Lattner
... on one axis, you have more hardware coming in. On the other hand, you have an explosion of innovation in AI.
- LFLex Fridman
Mm-hmm.
- CLChris Lattner
And so what happened with both TensorFlow and PyTorch is that the explosion of innovation in AI has led to... It's not just about matrix multiplication and convolution. These things have now, like, 2,000 different operators.
- LFLex Fridman
Mm-hmm.
- CLChris Lattner
And on the other hand, you have I don't know how many pieces of hardware there are out there. It's a lot. Part of my thesis, part of my belief of where computing goes if you look out 10 years from now is it's not gonna get simpler. Physics isn't going back to where we came from.
- LFLex Fridman
Mm-hmm.
- CLChris Lattner
It's only gonna get weirder from here on out, right? And so to me, the exciting part about what we're building is it's about building that universal platform, with which the world can continue to get weird, 'cause again, I don't think it's avoidable. It's physics. But we can help lift people, scale, do things with it, and they don't have to rewrite their code every time a new device comes out. And I think that's pretty cool.
- LFLex Fridman
The following is a conversation with Chris Lattner, his third time on this podcast. As I've said many times before, he's one of the most brilliant engineers in modern computing, having created the LLVM Compiler Infrastructure Project, the Clang compiler, the Swift programming language, a lot of key contributions to TensorFlow and TPUs as part of Google. He served as vice president of Autopilot software at Tesla, was a software innovator and leader at Apple, and now he has co-created a new full-stack AI infrastructure for distributed training, inference, and deployment on all kinds of hardware, called Modular, and a new programming language called Mojo that is a superset of Python, giving you all the usability of Python but with the performance of C/C++. In many cases, Mojo code has demonstrated over 30,000x speedup over Python. If you love machine learning, if you love Python, you should definitely give Mojo a try. This programming language, this new AI framework and infrastructure, and this conversation with Chris is mind-blowing. I love it. It gets pretty technical at times, so I hope you hang on for the ride. This is the Lex Fridman Podcast. To support it, please check out our sponsors in the description. And now, dear friends, here's Chris Lattner.
- 2:20 – 12:37
Mojo programming language
- LFLex Fridman
It's been, I think, two years since we last talked, and then in that time, you somehow went and co-created a new programming language called Mojo. Uh, so it's optimized for AI. It's a superset of Python. Let's look at the big picture. What is the vision, uh, for Mojo?
- CLChris Lattner
For Mojo? Well, so I mean, I think you have to zoom out. So I've been working on a lot of related technologies for many, many years. So I've worked on LLVM and a lot of things in mobile and servers and things like this. But the world's changing, and what's happened with AI is we have new GPUs and new machine learning accelerators and other ASICs and things like that that make AI go real fast. At Google, I worked on TPUs. That's one of the biggest large-scale deployed systems that exist for AI.
- LFLex Fridman
Mm-hmm.
- CLChris Lattner
And really, what you see is if you look across all of the things that are happening in the industry, there's this new compute platform coming, and it's not just about CPUs or GPUs or TPUs or NPUs or IPUs or whatever, all the PUs.
- LFLex Fridman
Mm-hmm.
- CLChris Lattner
(laughs) Right? It's about, how do we program these things, right? And so for software folks like us, right now, it doesn't do us any good if there's this amazing hardware that we can't use. And one of the things that you find out really quick is that having the theoretical capability of programming something, and then having the world's power and the innovation of all the, all the smart people in the world get unleashed on something can be quite different. And so really, where Mojo came from was starting from a problem of, we need to be able to take machine learning, take the infrastructure underneath it, and make it way more accessible, way more usable, way more understandable by normal people and researchers and other folks that are not themselves, like, experts in GPUs and things like this. And then, uh, through that journey, we realized, hey, we need syntax for this. We need to do a programming language.
- LFLex Fridman
Mm-hmm. So one of, one of the, the main features of the language, I say so fully in jest, is that, uh, it allows you to have the file extension to be, uh, an emoji or the fire emoji, which is one of the first emojis used as a file extension I've ever seen in my life. And then you ask yourself the question, "Why in the 21st century are we not using Unicode for file extensions?" Because, I mean, it's an epic decision. I think clearly the most important decision you made in the most... But, but you could also just use M-O-J-O as the file extension.
- CLChris Lattner
Also... Okay, so take-
- LFLex Fridman
(laughs)
- CLChris Lattner
... a step back. I mean, come on, Lex. Do you think that the world's ready for this? This is a big moment in the world, right?
- LFLex Fridman
This is, we're releasing this onto the world. (laughs)
- CLChris Lattner
This is innovation. (laughs)
- LFLex Fridman
(laughs) I mean, it really is kinda brilliant. Emojis are such a big part of our daily lives. Why is it not in programming?
- CLChris Lattner
Well, and, and, like, you take a step back and look, look at what file extensions are, right? They're basically metadata, right? And so why are we spending all the screen space on them and all that stuff? Also, you know, you have them stacked up next to text files and PDF files and whatever else. Like, if you're gonna do something cool, you want it to stand out, right? And emojis are colorful. They're visual.
- LFLex Fridman
(laughs)
- CLChris Lattner
And they're, they're beautiful.
- LFLex Fridman
Yeah.
- CLChris Lattner
Right?
- LFLex Fridman
What's been the response so far from, uh... Is, is there a support on, like, Windows on the operating system-
- CLChris Lattner
Yeah.
- LFLex Fridman
... in displaying, like, File Explorer?
- CLChris Lattner
Yeah, yeah.
- LFLex Fridman
Can they-
- CLChris Lattner
The, the one problem I've seen is that Git doesn't escape it right.
- LFLex Fridman
Uh-huh.
- CLChris Lattner
And so it thinks that the fire emoji is unprintable, and so it, like, prints out weird hex things if you use the command line Git tool.
- LFLex Fridman
Ah.
- CLChris Lattner
But everything else, as far as I'm aware, works fine, and I, I have faith that Git can be improved. So I'm not, I'm not worried.
- LFLex Fridman
So GitHub is fine?
- CLChris Lattner
GitHub is fine, yeah. GitHub is fine, Visual Studio Code, Windows, like, all the stuff, totally ready, because people have internationalization-
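The quoting behavior Chris mentions is governed by Git's `core.quotePath` setting, which by default escapes non-ASCII path bytes as backslash-octal sequences. A minimal sketch of reproducing and fixing it (the repo and filename here are illustrative):

```shell
# By default Git quotes non-ASCII paths, so a fire-emoji file typically
# shows up as backslash-escaped octal bytes in `git status` output.
mkdir mojo-demo && cd mojo-demo
git init --quiet .
touch "hello.🔥"
git status --short        # typically prints: ?? "hello.\360\237\224\245"

# Turning off path quoting makes Git print the emoji literally.
git config core.quotePath false
git status --short        # prints: ?? hello.🔥
```

The octal bytes are just the UTF-8 encoding of U+1F525; the setting only changes display, not what is stored.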
- 12:37 – 21:04
Code indentation
- LFLex Fridman
uh, tabs is an interesting design decision. And so you've really written a new programming language here. Yes, it is a super set of Python, but you can make a bunch of different interesting decisions here.
- CLChris Lattner
Totally, yeah.
- LFLex Fridman
And you chose actually to stick with Python as, um, in terms of some of the syntax.
- CLChris Lattner
Well, so, so let me explain why, right? So I mean, you can explain this in many rational ways.
- LFLex Fridman
Mm-hmm.
- CLChris Lattner
I think that the indentation is beautiful, but that's not a rational explanation, right? So, but I can defend it rationally, right? So first of all, Python won. It has millions of programmers.
- LFLex Fridman
Yeah.
- CLChris Lattner
It's huge. It's everywhere. It owns machine learning, right? So factually, it is the thing, right? Second of all, if you look at it, C code, C++ code, Java, whatever, Swift, curly brace languages also run through formatting tools and get indented.
- LFLex Fridman
Mm-hmm.
- CLChris Lattner
And so if they're not indented correctly, first of all, it will twist your brain around. (laughs) It can lead to bugs. There's notorious bugs that have happened across time where the, the indentation was wrong or misleading and it wasn't formatted right, and so it turned into an issue, right? And so what ends up happening in modern large scale code bases is people run automatic formatters. So now what you end up with is indentation and curly braces. Well, if you're gonna have, you know, the notion of grouping-
- LFLex Fridman
Mm-hmm.
- CLChris Lattner
... why not have one thing, right? And get rid of all the clutter and have a more beautiful thing, right? Also, you look at many of these languages, it's like, okay, well, you can have curly braces or you can omit them if there's one statement, or you just like enter this entire world of complicated design space that objectively you don't need if you have Python style indentation. So...
- LFLex Fridman
Yeah, I would love to actually see statistics on errors made because of indentation. Like how many errors are made in Python versus in C++ that have to do with basic formatting, all that kind of stuff.
- CLChris Lattner
Yeah.
- LFLex Fridman
I would love to see.
- CLChris Lattner
I think it's probably pretty minor because once you get, like you use VS Code, I do too.
- LFLex Fridman
Mm-hmm.
- CLChris Lattner
So if you get VS Code set up, it does the indentation for you generally.
- LFLex Fridman
Yep.
- CLChris Lattner
Right? And so you don't, you know, it's actually really nice to not have to fight it. And then what you can see is the editor is telling you how your code will work by indenting it, which I think is pretty cool.
- LFLex Fridman
I honestly don't think I've ever, I don't remember having an error in Python because I indented stuff wrong.
- CLChris Lattner
Yeah. So, I mean, I think that there's, again, this is a religious thing, and so I can joke about it and I love, I love to kind of, you know, I realize that this is such a polarizing thing and everybody wants to argue about it. And so I like poking at the bear a little bit, right?
- LFLex Fridman
(laughs)
- CLChris Lattner
But, but frankly, right, come back to the first point, Python won.
- LFLex Fridman
Yeah, yeah.
- CLChris Lattner
Like it's huge. It's in AI. It's the right thing for us. Like we see Mojo as being an incredible part of the Python ecosystem. We're not looking to break Python or change it or "fix it".
- LFLex Fridman
Mm-hmm.
- CLChris Lattner
We love Python for what it is. Our view is that Python is just not done yet. And so if you look at, you know, you mentioned Python being slow. Well, there's a couple of different things that go into that, which we can talk about if you want. But one of them is it just doesn't have those features that you would use to do C-like programming.
- LFLex Fridman
Mm-hmm.
- CLChris Lattner
And so if you say, "Okay, well, I'm forced out of Python into C for certain use cases," well then what we're doing is we're saying, "Okay, well why, why is that? Can we just add those features that are missing from Python back up to Mojo?" And then you can have everything that's great about Python, all the things you're talking about that you love.
- 21:04 – 30:54
The power of autotuning
- CLChris Lattner
cool.
- LFLex Fridman
Yeah. And, and for machine learning, I think meta-programming, I think we could generally say is extremely useful. And so you, you get features, I mean, that th- I'll jump around, but there's ... the feature of auto-tuning-
- CLChris Lattner
Yeah.
- LFLex Fridman
... and adaptive compilation just blows my mind.
- CLChris Lattner
Yeah. Well, so, okay, so let's come back to that.
- LFLex Fridman
Sure, all right.
- CLChris Lattner
So, so, what, what, what is, what is, what is machine learning? Like, what, or what is a machine learning model? Like, you take a PyTorch model off the internet, right?
- LFLex Fridman
Yeah.
- CLChris Lattner
Um, it's really interesting to me because what a Py- what PyTorch and what TensorFlow and all these frameworks are kind of pushing compute into is they're pushing into, like, this abstract specification of a compute problem, which then gets mapped in a whole bunch of different ways.
- LFLex Fridman
Mm-hmm.
- CLChris Lattner
Right? And so this is why it became a meta-programming problem, is that you wanna be able to say, "Cool, I have, I have this neural net. Now run it with batch size 1,000," (laughs) right? Do a, do a, do a, do a mapping across batch. Or, "Okay, I wanna take this problem, now run it across 1,000 CPUs-"
- LFLex Fridman
Mm-hmm.
- CLChris Lattner
"... or GPUs." Right? And so, like, this, this problem of, like, des- describe the compute and then map it and do things and transform it are, are, like, actually, it's very profound, and that's one of the things that makes machine learning systems really special.
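The "describe the compute, then map it" idea can be sketched in plain Python (the names and toy model here are illustrative, not Modular's API): define the computation once for a single example, then map that same specification across a batch and across workers.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sketch: describe the compute once, for a single example...
def forward_one(x, w):
    """One 'model': dot products plus ReLU for a single input vector."""
    return [max(sum(xi * wij for xi, wij in zip(x, col)), 0.0) for col in w]

# ...then *map* that same specification in different ways.

# 1) Map across a batch dimension (here batch size 1000):
def forward_batched(xs, w):
    return [forward_one(x, w) for x in xs]

# 2) Map across workers, a toy stand-in for "run it across 1,000 CPUs":
def forward_parallel(xs, w, workers=4):
    step = (len(xs) + workers - 1) // workers
    chunks = [xs[i:i + step] for i in range(0, len(xs), step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda c: forward_batched(c, w), chunks)
    return [row for part in parts for row in part]

xs = [[float(i + j) for j in range(4)] for i in range(1000)]
w = [[1.0, -1.0, 0.5, 0.0], [0.0, 1.0, -0.5, 1.0]]  # 2 output columns
assert len(forward_batched(xs, w)) == 1000
assert forward_parallel(xs, w) == forward_batched(xs, w)
```

The point is that `forward_one` never changes; only the mapping strategy around it does, which is the transformation problem the frameworks push into compilers.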
- LFLex Fridman
Uh, maybe can you describe auto-tuning and how do you pull off ... I mean, I guess adaptive compilation is what we're talking about as meta-programming.
- CLChris Lattner
Yeah.
- LFLex Fridman
How do you pull off auto-tune? I mean, is that, is that as profound as I think it is? It just seems like a really, like, uh, you know, we'll mention list comprehensions. To me, from a quick glance at Mojo, uh, which by the way, I have to absolutely, like, dive in, uh, th- as I realize how amazing this is, I, I absolutely must dive in. Uh, it, that looks like just an incredible feature for machine learning people.
- CLChris Lattner
Yeah. Well, so, so what is auto-tuning? So take a step back. Auto-tuning is a feature in Mojo. It's not ... So ve- very little of what we're doing is actually research. Like, many of these ideas have existed in other systems and other places, and so what we're doing is we're pulling together good ideas, remixing them, and making them into a, hopefully a beautiful system, right? And so auto-tuning, the observation is that turns out hardware systems, algorithms are really complicated. Turns out maybe you don't actually want to know how the hardware works (laughs) , right? A lot of people don't, right? And so there are lots of really smart hardware people, I know a lot of them, uh, where they know everything about, okay, the, the cache size is this and the number of registers is that, and if you use this l- length of vector, it's gonna be super efficient because it maps directly onto what it can do, and like all this kinda stuff. Or the GPU has SMs and it has a warp size of whatever, right? All this stuff that goes into these things, where the tile size of a TPU is 128, like these, these factoids, right? My belief is that most normal people, and I love hardware people also, I'm not trying to offend literally everybody on the internet, um, but, uh, most programmers actually don't wanna know this stuff.
- LFLex Fridman
Mm-hmm.
- CLChris Lattner
Right? And so if you come at it from the perspective of how do we allow people to build both more abstracted but also more portable code, because, you know, could be that the vector length changes or the cache size changes, or it could be that the tile size of your matrix changes, or the number, you know, an A100 versus an H100 versus a Volta versus a whatever GPU have different characteristics, right? A lot of the algorithms that you run are actually the same, but the parameters, these magic numbers you have to fill in, end up being really fiddly numbers that an expert has to go figure out. And so what auto-tuning does is it says, "Okay, well, guess what? There's a lot of compute out there." Right? So instead of having humans go randomly try all the things or do a grid search or go search some complicated multidimensional space-... how about we have computers do that?
- LFLex Fridman
Mm-hmm.
- CLChris Lattner
Right? And so what auto-tuning does is you can say, "Hey, here's my algorithm." If it's a, a matrix operation or something like that, you can say, "Okay, I'm gonna carve it up into blocks. I'm gonna do those blocks in parallel, and I wanna, this, this, with 128 things that I'm running on, I wanna cut it this way or that way," or whatever. And you can say, "Hey, go see which one's actually empirically better on this system."
- LFLex Fridman
And then the result of that, you cache for that system.
- CLChris Lattner
Yep.
- LFLex Fridman
You save it.
- CLChris Lattner
And so come back to twisting your compiler brain, right? So not only does the compiler have an interpreter that's used to do meta-programming, that compiler, that interpreter, that meta-programming now has to actually take your code and go run it on a target machine.
- LFLex Fridman
Mm-hmm.
- CLChris Lattner
(laughs) See, see which one it likes the best, and then stitch it in and then keep going, right?
- LFLex Fridman
So part of the compilation is machine-specific.
- CLChris Lattner
Yeah. Well, so I mean, this is an optional feature, right? So you don't have to use it for everything.
- LFLex Fridman
Yes.
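The loop described above, try candidate parameters, time them empirically on the target machine, and cache the winner per system, can be sketched in plain Python (a toy stand-in, not Mojo's actual autotune feature):

```python
import platform
import time

# Hypothetical autotuning sketch: try a few candidate tile sizes for a
# blocked computation, time each one empirically on *this* machine, and
# cache the winner keyed by the machine.

def blocked_sum(data, tile):
    """Sum `data` in chunks of `tile` elements (the 'algorithm')."""
    total = 0.0
    for i in range(0, len(data), tile):
        total += sum(data[i:i + tile])
    return total

def autotune(candidates, data, repeats=3):
    """Empirically pick the fastest tile size on this machine."""
    best_tile, best_time = None, float("inf")
    for tile in candidates:
        t0 = time.perf_counter()
        for _ in range(repeats):
            blocked_sum(data, tile)
        elapsed = time.perf_counter() - t0
        if elapsed < best_time:
            best_tile, best_time = tile, elapsed
    return best_tile

# Cache the search result per target system, as discussed above.
_tuning_cache = {}

def tuned_sum(data):
    key = (platform.machine(), len(data))
    if key not in _tuning_cache:
        _tuning_cache[key] = autotune([64, 256, 1024, 4096], data)
    return blocked_sum(data, _tuning_cache[key])

data = [0.5] * 100_000
assert tuned_sum(data) == 50_000.0
```

In the real feature the search runs at compile time and the chosen variant is stitched into the generated code, rather than being looked up in a dictionary at runtime.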
- 30:54 – 47:38
Typed programming languages
- LFLex Fridman
Before I talk about some of the interesting stuff with parallelization, all that, let's- let- let's first talk about, like, the basics. We talked indentation, right? So this thing looks like Python. It's sexy and beautiful like Python, as we mentioned.
- CLChris Lattner
Yep.
- LFLex Fridman
Uh, is it a typed language? So what's the role of types?
- CLChris Lattner
Yeah. Good question. So Python has types. It has strings, it has integers, it has dictionaries, and like all that stuff, but they all live at runtime.
- LFLex Fridman
Mm-hmm.
- CLChris Lattner
Right? And so because all those types live at runtime in Python, you never s- or you don't have to spell them. (laughs) Python also has, like, this whole typing thing going on now, and a lot of people use it.
- LFLex Fridman
Yeah.
- CLChris Lattner
I'm not talking about that. That's, that's kind of a different thing. We can come back to that if you want. But- but typically, the, um, you know, you just say, "I take... I have a def, and my def takes two parameters. I'm gonna call them A and B," and I don't have to write a type. Okay? So that is great, but what that does is that forces what's called a consistent representation. So these things have to be a pointer to an object with the object header, and they all have to look the same.
- LFLex Fridman
Mm-hmm.
- CLChris Lattner
And then when you dispatch a method, you go through all the same different paths no matter what the, the receiver, whatever that type is. So what Mojo does is, is it allows you to have more than one kind of type. And so what it does is it allows you to say, "Okay, cool. I have, I have an object," and objects behave like Python does, and so it's fully dynamic, and that's all great. And for many things, classes, like, that's all very powerful and very important. But if you wanna say, "Hey, it's an integer, and it's 32 bits or 64 bits," or whatever it is, or it's a floating point value and it's 64 bits, well, then the compiler can take that, and it can use that to do way better optimization, and it turns out, again, getting rid of the indirections. That's huge. It means you can get better code completion because you have, um, 'cause the compiler knows what the type is, and so it knows what operations work on it. And so that's actually pretty huge. And so what Mojo does is ha- allows you to progressively adopt types into your program. And so you can start, again, it's compatible with Python, and so then you can add however many types you want wherever you want them. And if you don't wanna deal with it, you don't have to deal with it.
- LFLex Fridman
Right.
- CLChris Lattner
And so one of- one of, you know, our opinions on this (laughs) is that it's not that types are the right thing or the wrong thing. It's that they're a useful thing.
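The progressive typing idea can be illustrated with Python syntax, which Mojo builds on. In Python the annotations are optional hints; Mojo's claim is that declared types additionally let the compiler pick efficient representations and dispatch directly:

```python
# Sketch of progressive typing using Python syntax.

def add_dynamic(a, b):
    # Fully dynamic: works on ints, floats, strings, lists...
    return a + b

def add_typed(a: int, b: int) -> int:
    # Progressively typed: same code, but the types are now spelled out,
    # so tooling (and, in Mojo, the compiler) knows what `+` means here.
    return a + b

assert add_dynamic(2, 3) == 5
assert add_dynamic("ab", "cd") == "abcd"   # dynamic dispatch on +
assert add_typed(2, 3) == 5
# Annotations are introspectable, which is what checkers build on:
assert add_typed.__annotations__ == {"a": int, "b": int, "return": int}
```

You can add annotations to one function at a time and leave the rest dynamic, which is the "adopt types progressively, wherever you want them" workflow.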
- LFLex Fridman
Well, so it's kind of optional. It's not strict typing. Like. you don't have to specify a type.
- CLChris Lattner
Exactly.
- LFLex Fridman
Okay, so it's starting from the thing that Python's kinda reaching towards right now with trying to inject types into it, but it's doing-
- CLChris Lattner
Yeah, with a very different approach, but yes.
- LFLex Fridman
Okay.
- CLChris Lattner
Yeah, so-
- LFLex Fridman
What's the- what's the different approach? I- I'm actually one of the people (sighs) that have not been using types very much in Python. So I haven't-
- CLChris Lattner
That's okay. Why- why do you say?
- LFLex Fridman
It just... Well, because I- I know the importance. It's like adults use-
- CLChris Lattner
(laughs)
- LFLex Fridman
... strict typing. And so I'm- I refuse to grow up in that sense. It's a- it's a kind of rebellion. But I- I- I just know that, um, it probably reduces the amount of errors, even if it's for... Forget about performance improvements. It probably reduces-
- CLChris Lattner
Well-
- LFLex Fridman
... errors when you do strict typing.
- CLChris Lattner
Yeah, so I mean, I think it's- it's interesting if you look at that, right? And the reason I- I'm giving you a hard time, man-
- LFLex Fridman
Yes.
- CLChris Lattner
... is that there- there's this- this cultural norm, this pressure, this like-
- LFLex Fridman
Mm-hmm.
- 47:38 – 59:56
Immutability
- LFLex Fridman
Yeah. Anyway, so there's a lot of amazing features on the roadmap and those already implemented. It'd be awesome-
- CLChris Lattner
Yeah.
- LFLex Fridman
... I could just ask you a few things.
- CLChris Lattner
Yeah. Go for it.
- LFLex Fridman
So, uh, the other performance improvement comes from immutability. So what's the, what's this var and this let thing that we got going on? What's-
- CLChris Lattner
Well, so-
- LFLex Fridman
... what's immutability?
- CLChris Lattner
Yeah. So one of the things that is useful, and it's not always required but it's useful, is knowing whether something can change out from underneath you. Right? And so in Python you have a pointer to an array, right? And so you pass that pointer to an array around to things. If you pass it into a function, they may take that and squirrel it away in some other data structure.
- LFLex Fridman
Mm-hmm.
- CLChris Lattner
And so you get your array back and you go to use it. Now somebody else is like putting stuff in your array.
- LFLex Fridman
Yeah.
- CLChris Lattner
How do you reason about that? It gets to be very complicated and leads to lots of bugs, right?
- LFLex Fridman
Mm-hmm.
- CLChris Lattner
And so one of the things that, you know, again, this is not something Mojo forces on you, but something that Mojo enables is a thing called value semantics. And what value semantics do is they take collections like arrays, like dictionaries, also tensors and strings and things like this that are much higher level and make them behave like proper values. And so it makes it look like if you pass these things around, you get a logical copy of all the data.
- LFLex Fridman
Mm-hmm.
- CLChris Lattner
And so if I pass you an array, it's your array. You can go do what you want to it. You're not gonna hurt my array. Now, that is an interesting and very powerful design prin- principle. It defines away a ton of bugs. You have to be careful to implement it in an efficient way. (laughs)
- LFLex Fridman
I- yes, is there a performance hit that's s- uh, significant?
- CLChris Lattner
Uh, generally not, if you implement it the right way.
- LFLex Fridman
Interesting.
- CLChris Lattner
But it requires a lot of very low level, uh, getting the language right bits.
- LFLex Fridman
I- I assumed that'd be a huge performance hit 'cause it's a really n- the benefit is really nice 'cause you don't get into the-
- CLChris Lattner
Absolutely. Well, well, the-
- LFLex Fridman
... complex ...
- CLChris Lattner
... the trick is, is you can't do, you can't do copies, so you have to provide the behavior of copying without doing the copy. (laughs)
- LFLex Fridman
Yeah. How do you do that? Is it ... How do you do that?
- CLChris Lattner
It's not magic.
- LFLex Fridman
Okay.
- CLChris Lattner
It's just ... It's- it's actually pretty cool. Well, so first, before we talk about how that works, let's talk about how it works in Python, right? So in Python, you define a person class, or maybe a person class is a bad idea. The ... You define a database class, right?
- LFLex Fridman
Mm-hmm.
- CLChris Lattner
And a database class has an array of records, something like that, right? And so what the problem is, is that if you pass in a record or a class instance into the database, it'll take ahold of that object, and then it assumes it has it.
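The aliasing problem with the database example, and the copy-without-copying trick mentioned above, can be sketched in Python (a hypothetical copy-on-write wrapper, not Mojo's implementation):

```python
# The reference-semantics problem Chris describes:
class Database:
    def __init__(self):
        self.records = []

    def add(self, record):
        self.records.append(record)   # keeps a reference, not a copy

db = Database()
my_record = ["alice"]
db.add(my_record)
db.records[0].append("admin")           # "somebody else" edits the object
assert my_record == ["alice", "admin"]  # our data changed underneath us

# Copy-on-write value: sharing is free, a copy happens only on mutation.
class CowList:
    def __init__(self, items=None):
        self._shared = list(items or [])
        self._owned = False

    def copy(self):
        other = CowList()
        other._shared = self._shared    # O(1): share the backing storage
        return other

    def append(self, item):
        if not self._owned:
            self._shared = list(self._shared)  # the one real copy
            self._owned = True
        self._shared.append(item)

    def items(self):
        return list(self._shared)

a = CowList(["alice"])
b = a.copy()           # logically a copy, physically shared
b.append("admin")      # triggers the real copy, only now
assert a.items() == ["alice"]
assert b.items() == ["alice", "admin"]
```

This is the "behavior of copying without doing the copy" idea: callers reason as if every pass is a copy, while the implementation defers the cost until someone actually mutates.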
- 59:56 – 1:34:23
Distributed deployment
- LFLex Fridman
What about the, uh, the deployment and the execution across multiple machines?
- CLChris Lattner
Yeah.
- LFLex Fridman
So, uh, you write that, "The modular compute platform dynamically partitions models with billions of parameters and distributes their execution across multiple machines, uh, enabling unparalleled efficiency." By the way, the, the use of "unparalleled" in that sentence... anyway-
- CLChris Lattner
(laughs)
- LFLex Fridman
"... enabling unparalleled efficiency, scale, and reliability for the largest workloads." So how, how do you do this, um, abstraction of, um, distributed deployment o- of, of, of large models?
- CLChris Lattner
Yeah, so one, one of the really interesting, um, tensions... So there's a whole bunch of stuff that goes into that. I'll pick a random walk through it.
- LFLex Fridman
Mm-hmm.
- CLChris Lattner
Uh, if you, if you go back and replay the history of machine learning, right? I mean, the brief, the brief most recent history of machine learning, 'cause this is ver- as you know, very deep.
- LFLex Fridman
Yeah.
- CLChris Lattner
I, uh, I knew Lex when he had an AI podcast, yes. (laughs)
- LFLex Fridman
Right? Yep.
- CLChris Lattner
(laughs)
- LFLex Fridman
Yeah.
- CLChris Lattner
Um, so, uh, so if you look at just TensorFlow and PyTorch, which is pretty recent history in the big picture, right? But TensorFlow is all about graphs. PyTorch, I think, pretty unarguably ended up winning. And why did it win? Mostly because of usability.
- LFLex Fridman
Mm-hmm.
- CLChris Lattner
Right, and the usability of PyTorch is, I think, huge. And I think, again, that's a huge testament to the power of taking abstract, theoretical, technical concepts and bringing it to the masses, right? Now, the challenge with what the TensorFlow versus the PyTorch n- design points was that TensorFlow's e- kind of difficult to use for researchers, but it was actually pretty good for deployment. PyTorch is really good for researchers, it kind of's not super great for deployment, right? And so I think that we as an industry have been struggling. And if you look at what deploying a machine learning model today means is that you'll have researchers who are, I mean, wicked smart, of course, but they're wicked smart at model architecture and data and calculus (laughs) and like all, uh, like, they're wicked smart in various domains. They don't wanna know anything about the hardware or deployment or C++ or things like this, right? And so what's happened is you get people who train the model, they throw over, throw it over the fence, and then you have people that try to deploy the model. Well, every time you have a team A does X, they throw it over the fence, and team B does Y, like, you have a problem, because of course it never works the first time. And so you throw it over the fence, they figure out, "Okay, it's too slow, it won't fit, doesn't use the right operator, m- the tool crashes," whate- whatever the problem is. Then they have to throw it back over the fence. And every time you throw a thing over a fence, it takes three weeks of project managers and meetings and things like this. And so, uh, what we've seen today is that getting models into production can take weeks or months.
- LFLex Fridman
Mm-hmm.
- CLChris Lattner
Like, it's not atypical, right? I talk to lots of people, and you talk about, like, VP of software at some internet company trying to deploy a model, and they're like, "Why do I need a team of 45 people?" (laughs) Like, "I, it's so easy to train a model. Why, why can't I deploy it," right? And if you dig into this, every layer is problematic. So if you look at the language piece, I mean, this is tip of the iceberg. It's a very exciting tip of the iceberg for folks, but you've got Python on one side and C++ on the other side. Python doesn't really deploy. I mean, it can theoretically, technically, in some cases, but often a lot of production teams will want to get things out of Python because they get better performance and control and whatever else. So Mojo can help with that. If you look at serving, so you talk about gigantic models. Well, a gigantic model won't fit on one machine (laughs) , right? And so now you have this model, it's written in Python, it has to be rewritten in C++. Now it also has to be carved up so that half of it runs on one machine, half of it runs on another machine.
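The carving-up described above, half of the model on one machine and half on another, can be sketched as a toy pipeline partition in Python (purely illustrative; a real deployment ships activations between machines over the network rather than calling local functions):

```python
# Hypothetical sketch of partitioning a model across "machines": split a
# stack of layers into per-machine stages and run them in sequence.

def make_layer(scale):
    """A stand-in for a real layer: multiply every element by `scale`."""
    return lambda x: [scale * v for v in x]

model = [make_layer(2.0), make_layer(3.0), make_layer(0.5)]

def partition(layers, num_machines):
    """Carve the layer stack into contiguous stages, one per machine."""
    step = (len(layers) + num_machines - 1) // num_machines
    return [layers[i:i + step] for i in range(0, len(layers), step)]

def run_stage(stage, x):
    for layer in stage:
        x = layer(x)
    return x

def run_pipeline(stages, x):
    for stage in stages:   # in reality: send x to the next machine here
        x = run_stage(stage, x)
    return x

stages = partition(model, num_machines=2)   # [[layer0, layer1], [layer2]]
out = run_pipeline(stages, [1.0, 2.0])
# 1*2*3*0.5 = 3.0 and 2*2*3*0.5 = 6.0
assert out == [3.0, 6.0]
```

The hard part in practice, and the part the transcript is about, is doing this partitioning automatically and efficiently instead of hand-rewriting the model per deployment target.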
- LFLex Fridman
Mm-hmm.
- CLChris Lattner
Or maybe it runs on 10 machines. Well, so now suddenly the complexity is exploding, right? And the reason for this is that if you, if you look into TensorFlow, PyTorch, these systems, they weren't really designed for this world, right? They were designed for, you know, back in the day when we were n- starting and doing things, where it was a different, much simpler world. Like, you wanted to run ResNet-50 or some ancient model architecture like this, it was just a, it was a completely different world than-
- LFLex Fridman
Train on one GPU-
- CLChris Lattner
Exactly.
- LFLex Fridman
... do inference on one GPU. (laughs)
- CLChris Lattner
AlexNet, yeah, AlexNet, right? The major breakthrough.
- LFLex Fridman
(laughs)
- CLChris Lattner
And, um, and the world has changed, right? And so now the challenge is, is that TensorFlow, PyTorch, these systems, they weren't actually designed for LLMs. Like, that, that was not, that was not a thing. And so what, where TensorFlow actually has amazing power in terms of scale and deployment and things like that, and I think Google is, I mean, maybe not unmatched, but they're, like, incredible in terms of their capabilities and, and gigantic scale. Um, m-many researchers using PyTorch, right? And so PyTorch doesn't have those same capabilities. And so what Modular can do is it can help with that. Now if you take a step back and you say, like, "What is Modular doing," right? So Modular has, like, a, a bitter enemy that we're fighting against in the industry, and it's one of these things where everybody knows it, but nobody is usually willing to talk about it.
- LFLex Fridman
The bitter enemy.
- CLChris Lattner
The bitter thing that we have to destroy...
- LFLex Fridman
Yeah.
Episode duration: 3:34:03
Transcript of episode pdJQ8iVTwj8