Lex Fridman Podcast

Elon Musk: Tesla Autopilot | Lex Fridman Podcast #18

Lex Fridman and Elon Musk: Elon Musk explains Tesla Autopilot's path to safer-than-human autonomy.

Lex Fridman (host) · Elon Musk (guest)
Apr 12, 2019 · 32m · Watch on YouTube ↗

TRANSCRIPT

  1. 0:00–15:00


    1. LF

      The following is a conversation with Elon Musk. He's the CEO of Tesla, SpaceX, Neuralink, and a co-founder of several other companies. This conversation is part of the Artificial Intelligence podcast. This series includes leading researchers in academia and industry, including CEOs and CTOs of automotive, robotics, AI, and technology companies. This conversation happened after the release of the paper from our group at MIT on driver functional vigilance during use of Tesla's autopilot. The Tesla team reached out to me, offering a podcast conversation with Mr. Musk. I accepted, with full control of questions I could ask and the choice of what is released publicly. I ended up editing out nothing of substance. I've never spoken with Elon before this conversation, publicly or privately. Neither he nor his companies have any influence on my opinion, nor on the rigor and integrity of the scientific method that I practice in my position at MIT. Tesla has never financially supported my research, and I've never owned a Tesla vehicle. I've never owned Tesla stock. This podcast is not a scientific paper. It is a conversation. I respect Elon as I do all other leaders and engineers I've spoken with. We agree on some things and disagree on others. My goal, always, with these conversations, is to understand the way the guest sees the world. One particular point of disagreement in this conversation was the extent to which camera-based driver monitoring will improve outcomes, and for how long it will remain relevant for AI-assisted driving. As someone who works on and is fascinated by human-centered artificial intelligence, I believe that if implemented and integrated effectively, camera-based driver monitoring is likely to be of benefit in both the short-term and the long-term. In contrast, Elon and Tesla's focus is on the improvement of autopilot, such that its statistical safety benefits override any concern with human behavior and psychology.
Elon and I may not agree on everything, but I deeply respect the engineering and innovation behind the efforts that he leads. My goal here is to catalyze a rigorous, nuanced, and objective discussion in industry and academia on AI-assisted driving, one that ultimately makes for a safer and better world. And now, here's my conversation with Elon Musk. What was the vision, the dream of autopilot when, uh, in the beginning, the big picture system level, when, uh, it was first conceived and started being installed in 2014 in the hardware and the cars? What was the vision, the dream?

    2. EM

      I wouldn't characterize it as a vision or dream, simply that there are obviously two massive revolutions in, in the, uh, automobile industry. One is the transition to elect- electrification, um, and then the other is autonomy. And, uh, it became obvious to me that, in the future, any, any car that does not have autonomy, uh, would be about as useful as a horse. Which is not to say that there's no use, it's just rare and somewhat idiosyncratic if somebody has a horse at this point. So, it's obvious that cars will drive themselves completely, it's just a question of time, and if we did not participate in the autonomy revolution, then our cars would not be useful to people, relative to cars that are autonomous. I mean, an autonomous car is arguably worth five to ten times more than a non- a car which is not autonomous.

    3. LF

      In the long term.

    4. EM

      Depends what you mean by long term, but let's say at least for the next five years, perhaps 10 years.

    5. LF

      So, there are a lot of very interesting design choices with autopilot early on. First is showing on the instrument cluster, or in the Model 3, on the center stack display, what the combined sensor suite sees. What was the thinking behind that choice? Was there debate? What was the process?

    6. EM

      The whole point of the t- display is to provide a health check on the r- the vehicle's perception of reality. So, the vehicle's, uh, taking in information from a bunch of sensors, primarily cameras, but also radar and ultrasonics, uh, GPS, and so forth. And then, uh, that, that information is then rendered into c- vector space, uh, and that, you know, with a bunch of objects with pr- with properties, like lane lines and traffic lights and other cars. Um, and then in vector space, that is re-rendered onto your display so you can confirm whether the car knows what's going on or not by looking out the window.
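The pipeline described here (raw sensors → a vector-space list of objects with properties → a scene re-rendered onto the display) can be sketched roughly as below. `TrackedObject` and `render_summary` are purely hypothetical names for illustration, not Tesla's actual software.

```python
from dataclasses import dataclass

@dataclass
class TrackedObject:
    """One entry in the vector-space scene: a recognized object with properties."""
    kind: str        # e.g. "car", "lane_line", "traffic_light"
    x: float         # longitudinal position in meters, vehicle frame
    y: float         # lateral position in meters, vehicle frame

def render_summary(objects):
    """Re-render the vector-space scene as lines a display could draw,
    so the driver can confirm the car's perception against the window view."""
    return [f"{o.kind} at ({o.x:.1f} m, {o.y:.1f} m)" for o in objects]

# A toy scene fused from cameras/radar/ultrasonics (illustrative values):
scene = [
    TrackedObject("car", 23.0, 0.2),            # lead vehicle
    TrackedObject("traffic_light", 80.0, -3.5),  # upcoming intersection
]
```

The point of the sketch is the two-stage shape: perception produces a compact object list, and the display is just one renderer over that list.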

    7. LF

      Right. I think that's a extremely powerful thing for people to get an understanding, sort of become one with the system and understanding what the system is capable of.

    8. EM

      Mm-hmm.

    9. LF

      Now, have you considered showing more? So, if we look at the computer vision, you know, like road segmentation, lane detection, vehicle detection, object detection underlying the system, there is at the edges some uncertainty. Have you considered revealing the parts that, uh, the- the uncertainty in the system, the sort of more-

    10. EM

      The probabilities associated with, with say image recognition or something like that?

    11. LF

      Yeah. So right now it shows like the vehicles in the vicinity, a very clean, crisp image, and people do confirm that there's a car in front of me and the system sees there's a car in front of me, but to help people build an intuition of what computer vision is by showing some of the uncertainty.

    12. EM

      Well, I think it's, uh, yeah, my car, I always look- look at the sort of the- the debug view, and there's, there's two debug views, uh, o- one is...... augmented vision, uh, where, which I'm sure you've seen, where it basically, uh, we, we draw boxes and labels around objects that are recognized. And then there's, uh, what we call the visualizer, which is basically a vector-based representation summing up, uh, the input from all sensors. That, that does, does not show b- any pictures, but it shows, uh, all of the ... it basically shows the car's view of, of, of the world in vector space. Um, but I think this is very difficult for people to kno- normal people to understand. They would not know what the heck they're looking at.

    13. LF

      So, it's almost an HMI challenge to... the current things that are being displayed is optimized for the general public understanding of what the system is capable of.

    14. EM

      Yeah. It, like, if you've no idea what, how computer vision works or anything, you can still look at the screen and p- and see if the car knows what's going on. And then if you're, you know, if you're a development engineer or if you're, you know, if you're, if you have the development build like I do, then you can see, uh, you know, all the debug information. But those would just be, like, total gibberish to most people.

    15. LF

      Right. What's your view on how to best distribute effort? So, there's three, I would say, technical aspects of autopilot that are really important. So, it's the underlying algorithms, like the neural network architecture, there's the data, so that to train on, and then there's the hardware development. There may be others, but ... so, look, algorithm, data, hardware. You on- you only have so much money, only have so much time. What do you think is the most important thing to, to, uh, allocate resources to? Or do you see it as pretty evenly distributed between those three?

    16. EM

      We automatically get vast amounts of data, because all of our cars have eight external-facing cameras and radar, and usually 12 ultrasonic sensors, uh, GPS obviously, um, and, uh, IMU. And so we, we basically have a fleet that has, um ... and we've got about 400,000 cars on the road that have that level of data. I g- actually, I think you keep quite close track of it, actually.

    17. LF

      Yes.

    18. EM

      Yeah. So, we're, we're approaching half a million cars on the road that have the full sensor suite.

    19. LF

      Yeah.

    20. EM

      Um, the ... so this is ... I, I'm n- I, I'm not sure how many other cars on the road have this sensor suite, but I would be surprised if it's more than 5,000, which means that we have 99% of all the data.
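The back-of-envelope behind the "99%" claim, assuming data volume scales with car count (the figures are the ones quoted above, not independent data):

```python
# Roughly 400,000 Teslas with the full sensor suite vs at most ~5,000
# comparably instrumented cars from everyone else combined.
tesla_cars = 400_000
other_cars = 5_000
share = tesla_cars / (tesla_cars + other_cars)
# share comes out just under 0.99, i.e. roughly 99% of the fleet-scale data.
```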

    21. LF

      So, there's this huge-

    22. EM

      Um-

    23. LF

      ... inflow of data.

    24. EM

      Absolutely. Massive inflow of data. And then we ... it's d- it's taken us about three years, but now we've finally developed our full self-driving computer, which can process, uh, an or- an order of magnitude as much as the NVIDIA system that we currently have in the, in the cars. And it's really just a ... to use it, you unplug the Nvi- NVIDIA computer and plug the Tesla computer in, and that's it. And it's, it's, uh ... in fact, we're not even qu- we're still exploring the boundaries of its capabilities. Uh, but we're able to run the cameras at full frame rate, full resolution, uh, not even crop the images, and, uh, it's still got headroom, even on one of the, the systems. The hard d- full, full self-driving computer is really two computers, two systems on a chip that are fully redundant, so you could put a bolt through basically any part of that system and it still works.

    25. LF

      The redundancy, are they perfect copies of each other, or ...

    26. EM

      Yeah.

    27. LF

      Oh, so it's purely for redundancy as opposed to an arguing machine kind of architecture where they're both making decisions. This is purely for redundancy.

    28. EM

      I- I think of it more like it's ... if you have, uh, a twin engine aircraft, um, commercial aircraft, the system will operate best if both systems are operating, but it's, it's capable of operating safely on one. So ... but a- a- as it is right now, we can just run ... we're h- we haven't even hit the, the, the e- the edge of performance, so there's no need to actually distribute functionality across both SOCs. We, we can actually just run a full duplicate on b- on, on each one.
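The twin-engine analogy (run a full duplicate of the driving computation on each SoC; the system is best with both, but can operate safely on one) can be sketched with purely illustrative names:

```python
def plan(frame):
    """Stand-in for the full driving computation running on one SoC."""
    return {"steer": frame * 0.1, "brake": 0.0}

def redundant_plan(frame, soc_ok=(True, True)):
    """Run a full duplicate of plan() on each healthy SoC.

    Hypothetical sketch: in a real system the duplicated outputs would be
    cross-checked; here we simply return the first healthy copy.
    """
    results = [plan(frame) for ok in soc_ok if ok]
    if not results:
        raise RuntimeError("both SoCs failed")
    return results[0]
```

Usage-wise, `redundant_plan(frame)` behaves identically whether both SoCs are up or only one is, which is the property the analogy is pointing at.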

    29. LF

      So, you haven't really explored or hit the limit of this-

    30. EM

      We have not yet hit the limit, no.

  2. 15:00–30:00


    2. LF

      But capable is an interesting word.

    3. EM

      Mm-hmm.

    4. LF

      Because, um ...

    5. EM

      Like the hardware is.

    6. LF

      Yeah, the hardware.

    7. EM

      And as we refine the software, it, the capabilities will increase dramatically. Um, and then the reliability will increase dramatically, and then it will receive regulatory approval. So essentially buying a car today is an investment in the future. You're, you're essentially buying a, a ca- you're, you're buying ... The, I think the most profound thing is that if you buy a Tesla today, I believe you are buying an appreciating asset, not a depreciating asset.

    8. LF

      So that's a really important statement there because if hardware is capable enough, that's the hard thing to upgrade-

    9. EM

      Yes.

    10. LF

      ... usually.

    11. EM

      Exactly.

    12. LF

      So then the rest is a software problem.

    13. EM

      Yes.

    14. LF

      Of-

    15. EM

      Software has like no marginal cost, really.

    16. LF

      But what's your intuition on the software side? How hard are the remaining steps to, to get it to where, um, you know, uh, the experience, uh, not just the safety but the-

    17. EM

      Mm-hmm.

    18. LF

      ... full experience is something that people would, uh, enjoy?

    19. EM

      I think people will enjoy it very much so on, on, on any- on the highways. It's, it's a total game changer for quality of life for using, you know, Tesla autopilot on the highways. Uh, so it's really just extending that functionality to city streets, adding in the, the traffic light, uh, traffic light recognition, uh, navigating complex intersections, and, um, and, and then, uh, being able to navigate complicated pa- parking lots. So the car can, uh, exit a parking space and come and find you even if it's in a, a complete maze of a parking lot. And, uh, and, and then if ... And then you can just ... It could just drop you off and find a parking spot by itself.

    20. LF

      Yeah, in terms of enjoyability and something that people would, uh, would actually find a lot of use from the parking lot is a, is a really, you know ... It's, it's rich of annoyance when you have to do it manually so there's a lot of benefit to be gained from automation there. So let me start injecting the human into this discussion a little bit. Uh, so let's talk, talk about full autonomy. If you look at the current level 4 vehicles being tested on road, like Waymo and so on, they're only technically autonomous. They're really level 2 systems with just a different f- design philosophy, because there's always a safety driver in almost all cases and they're monitoring the system.

    21. EM

      Right.

    22. LF

      Do you see Tesla's full self-driving as still for a time to come requiring supervision of the hu- the human being? So its capabilities are powerful enough to drive, but nevertheless requires a human to still be supervising just like a safety driver is in a ... other fully autonomous vehicles?

    23. EM

      I think it'll, it'll require detecting hands on wheel for at, at least, uh, six months or something like that from here. It, it really is a question of, like ...... from a regulatory standpoint, uh, what, h- how much safer than a person does autopilot need to be f- for it to, to be okay to not monitor the car? You know, and, and this is a, a debate that one can have. A- And then if you, but you need, you need a l- a, a large sample s- a l- large amount of data, um, so that you can prove with high confidence, statistically speaking, that the car is dramatically safer than a person, um, and that adding in the person monitoring does not materially affect the safety. So, it might n- need to be, like, 200% or 300% safer than a person.

    24. LF

      And how do you prove that?

    25. EM

      Incidents per mile.

    26. LF

      Incidents per mile?

    27. EM

      Yeah.

    28. LF

      So, crashes and fatalities? So-

    29. EM

      Yeah. I mean, f- uh, fat- fatalities would be a factor, but there, there're, there are just not enough fatalities to be statistically significant, uh, a- at scale. But there are enough crashes. Th- you know, there are much, far more crashes than there are fatalities. So you can just assess what is the probability of, uh, of a crash. The- then there's a- another step which is probability of injury, then probability of permanent injury, then pro- probability of death. And all of those need to be, uh, much better than a person, uh, by at least, perhaps, 200%.
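The "incidents per mile" argument amounts to comparing two crash rates with enough events that the confidence interval on their ratio clears the required safety multiple. A minimal sketch, using a normal approximation on the log rate ratio and made-up numbers (not real crash data from Tesla or anyone else):

```python
import math

def crash_rate_ratio_ci(crashes_a, miles_a, crashes_b, miles_b, z=1.96):
    """Approximate 95% CI for the ratio of crash rates (baseline / system).

    Uses the standard normal approximation on the log of a Poisson rate
    ratio: se = sqrt(1/x_a + 1/x_b). If the lower bound exceeds the target
    multiple (say 2.0), the 'dramatically safer' claim holds with high
    statistical confidence.
    """
    rate_a = crashes_a / miles_a          # baseline (human-driven) crashes per mile
    rate_b = crashes_b / miles_b          # system-engaged crashes per mile
    log_ratio = math.log(rate_a / rate_b)
    se = math.sqrt(1 / crashes_a + 1 / crashes_b)
    lo = math.exp(log_ratio - z * se)
    hi = math.exp(log_ratio + z * se)
    return rate_a / rate_b, lo, hi

# Illustrative numbers only: a 5x raw safety ratio over a billion miles each.
ratio, lo, hi = crash_rate_ratio_ci(
    crashes_a=2000, miles_a=1e9,
    crashes_b=400,  miles_b=1e9,
)
```

This also illustrates why fatalities alone are too rare: with event counts in the single digits, `1/crashes` terms blow up the interval, while crashes are frequent enough to make it tight.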

    30. LF

      And you think there is, uh, the ability to have a healthy discourse with the regulatory bodies on this topic?

  3. 30:00–32:29


    1. EM

      literally what it- how it appears right now. I could be wrong, but it appears to be the case that Tesla is vastly ahead of everyone.

    2. LF

      Do you think we will ever create an AI system that we can love and loves us back in a deep, meaningful way, like in the movie Her?

    3. EM

      I think AI will be capable of convincing you to fall in love with it very well.

    4. LF

      And that's different than us humans?

    5. EM

      You know, we start getting into a metaphysical question of, like, do emotions and thoughts exist in a different realm than the physical? And maybe they do, maybe they don't, I don't know. But, but from a physics standpoint, I tend to think, I tend to think of things, you know, like physics was my main sort of training, and, and from a physics standpoint, essentially, if, if it loves you in a way that is, that you can't tell whether it's real or not, it is real.

    6. LF

      That's a physics view of love.

    7. EM

      Yeah.

    8. LF

      (laughs)

    9. EM

      If there's no... If you, if you cannot dis- if you cannot prove that it does not, if there's no test that you can apply that would make it... allow you to tell the difference, then there is no difference.

    10. LF

      Right. And it's similar to, uh, seeing our world as simulation. There may not be a test to tell the difference between what the real world-

    11. EM

      Yes.

    12. LF

      ... and the simulation, and therefore, from a physics perspective, it might as well be the same thing.

    13. EM

      Yes. Uh, and there may be ways to test whether it's a simulation. There might be, I'm not saying there aren't. But you could certainly imagine that a simulation could, could correct wha- that once an entity in the simulation found a way to detect the simulation, it could either restart, you know, pause the root simulation, start a new simulation, or do one of many other things that then corrects for that error.

    14. LF

      So when maybe you or somebody else creates an AGI system, and you get to ask her one question, what would that question be?

    15. EM

      What's outside the simulation?

    16. LF

      Elon, thank you so much for talking today. It was a pleasure.

    17. EM

      All right. Thank you.

Episode duration: 32:44


Transcript of episode dEv99vxKjVI
