Lex Fridman Podcast

Vladimir Vapnik: Statistical Learning | Lex Fridman Podcast #5

Lex Fridman and Vladimir Vapnik on learning, intelligence, and the limits of deep learning.

Lex Fridman (host) · Vladimir Vapnik (guest)
Nov 16, 2018 · 54m · Watch on YouTube ↗


  1. 0:00-15:00

    The following is a…

    1. LF

The following is a conversation with Vladimir Vapnik. He's the co-inventor of support vector machines, support vector clustering, VC theory, and many foundational ideas in statistical learning. He was born in the Soviet Union and worked at the Institute of Control Sciences in Moscow. Then, in the United States, he worked at AT&T, NEC Labs, Facebook Research, and now is a professor at Columbia University. His work has been cited over 170,000 times. He has some very interesting ideas about artificial intelligence and the nature of learning, especially on the limits of our current approaches and the open problems in the field. This conversation is part of the MIT course on Artificial General Intelligence and the Artificial Intelligence podcast. If you enjoy it, please subscribe on YouTube or rate it on iTunes or your podcast provider of choice, or simply connect with me on Twitter or other social networks @lexfridman, spelled F-R-I-D. And now, here's my conversation with Vladimir Vapnik. Einstein famously said that God doesn't play dice.

    2. VV

      Yeah.

    3. LF

      You have studied the world through the eyes of statistics, so let me ask you in terms of the nature of reality, fundamental nature of reality, does God play dice?

    4. VV

We don't know some factors, and because we don't know some factors which could be important, it looks like God plays dice, but we should only describe it. In philosophy, they distinguish between two positions: the position of instrumentalism, where you create a theory for prediction, and the position of realism, where you try to understand what God did.

    5. LF

      Can you describe instrumentalism and realism a little bit?

    6. VV

For example, if you have some mechanical law, what is it? Is it a law which is true always and everywhere, or is it a law which allows you to predict the position of a moving element? Do you believe that it is God's law, that God created a world which adheres to this physical law-

    7. LF

      Yeah.

    8. VV

... or is it just a law for predictions?

    9. LF

      And which one is instrumentalism?

    10. VV

      For predictions.

    11. LF

      Just predict.

    12. VV

If you believe that this is the law of God-

    13. LF

      Mm-hmm.

    14. VV

... and it's always true everywhere, that means that you're a realist.

    15. LF

      So you-

    16. VV

You're trying to really understand God's thought.

    17. LF

So the way you see the world is as an instrumentalist?

    18. VV

      You know-

    19. LF

      Absolutely.

    20. VV

... I'm working with some models, models of machine learning. So in these models, we consider a setting, and we try to resolve the setting, to solve the problem. And you can do it in two different ways: from the point of view of the instrumentalist, and that's what everybody does now, because they say the goal of machine learning is to find the rule for classification.

    21. LF

      Mm-hmm.

    22. VV

That is true, but it is an instrument for prediction. But I can say the goal of machine learning is to learn about conditional probability: how God plays dice, and if he plays, what is the probability for one and what is the probability for another in a given situation. But for prediction, I don't need this. I need the rule.

    23. LF

      The rule.

    24. VV

      But for understanding, I need conditional probability.
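
      [Editor's note: Vapnik's distinction can be made precise. For binary classification, the instrumentalist needs only the point where the conditional probability crosses one half; the realist wants the whole conditional probability function. A minimal formalization, in notation not used in the episode:]

      ```latex
      % Bayes-optimal prediction rule: depends on the conditional
      % probability P(y = 1 | x) only through a single threshold.
      f^{*}(x) \;=\; \operatorname{sign}\!\Bigl( P(y = 1 \mid x) - \tfrac{1}{2} \Bigr)
      ```

      [A predictor can match f* everywhere while knowing nothing about P(y | x) away from the decision boundary, which is why estimating the full conditional probability is the strictly harder, "realist" problem.]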

    25. LF

So let me just step back a little bit first to talk about something you mentioned, which I read last night: parts of the 1960 paper by Eugene Wigner-

    26. VV

      Yeah.

    27. LF

... "The Unreasonable Effectiveness of Mathematics in the Natural Sciences." Such a beautiful paper-

    28. VV

      Yeah. Absolutely.

    29. LF

... by the way. It made me feel, to be honest, to confess, that my own work in the past few years on deep learning, heavily applied, was missing out on some of the beauty of nature, in a way that math can uncover. So let me just step away from the poetry of that for a second. How do you see the role of math in your life? Is it a tool? Is it poetry? Where does it sit? And does math for you have limits of what it can describe?

    30. VV

Some people say that math is the language which God uses. So I believe exactly-

  2. 15:00-30:00

    Okay. …

    1. VV

you want... and you have a model for recognition now, so you would like the theoretical description from the model to coincide with the empirical description which you saw in the

    2. NA

      Okay.

    3. VV

      ... text there.

    4. LF

      Mm-hmm.

    5. VV

So "looks like a duck" is general. But what about "swims like a duck"? You should know that a duck swims. You can't say, "It plays chess like a duck." Okay, a duck doesn't play chess. And it is a completely legal predicate, but it is useless. So how can the teacher recognize a predicate that is not useless? Up to now, we don't use such predicates in existing machine learning.

    6. LF

      And you think that's not necessary-

    7. VV

So why do we need zillions of data? (clears throat) But in this English proverb, they use only three predicates: looks like a duck, swims like a duck, and quacks like a duck.

    8. LF

      So you can't deny the fact that swims like a duck and quacks like a duck has humor in it, has ambiguity?

    9. VV

Let's talk about "swims like a duck." It does not say, "Jumps like a duck." Why? Because-

    10. LF

      It's not relevant, uh ...

    11. VV

      But that means that you know ducks, you know different birds, you know animals.

    12. LF

      Yeah.

    13. VV

And you derive from this that it is relevant to say "swims like a duck."

    14. LF

      So underneath, in order for us to understand swims like a duck, it feels like we need to know millions of other little pieces of information-

    15. VV

      I am not sure.

    16. LF

... which we pick up along the way. You don't think so? There doesn't need to be this knowledge base? In those statements there's some rich information that helps us understand the essence of a duck?

    17. VV

      Yeah.

    18. LF

      How far are we from integrating, uh, predicates?

    19. VV

Well, you know, when you consider the complete theory of machine learning-

    20. LF

      Yeah.

    21. VV

... so what it does: you have a lot of functions, and then you say, it looks like a duck. You see your training data. From the training data, you recognize how the expected duck should look.

    22. LF

      Mm-hmm.

    23. VV

Then you remove all functions which do not look the way you think they should look from the training data, so you decrease the set of functions from which you will pick one. Then you give a second predicate, and it again decreases the set of functions. And after that, you pick up the best function you can find; that is standard machine learning. So why do you not need too many examples?

    24. LF

Because your predicates are very good? (laughs) Or you're not-

    25. VV

Yeah, that means the predicates are very good.

    26. LF

      Yeah.

    27. VV

Because every predicate is invented to decrease the admissible set of functions.
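
      [Editor's note: a toy sketch, in Python, of the scheme Vapnik describes above: predicates shrink the admissible set of functions before any training data is used to pick the best survivor. The hypothesis class, predicates, and data below are invented for illustration and are not from Vapnik's work.]

      ```python
      # Predicates as filters that shrink the admissible set of functions
      # before training data is consulted at all.

      def admissible(functions, predicates):
          """Keep only the functions consistent with every predicate."""
          return [f for f in functions if all(p(f) for p in predicates)]

      # Hypothetical hypothesis class: 100 threshold classifiers on one feature.
      functions = [lambda x, t=t: x > t for t in range(100)]

      # Hypothetical predicates distilled from prior knowledge ("looks like
      # a duck", "swims like a duck"): each rules out many candidates at once.
      predicates = [
          lambda f: f(30),      # must accept inputs above 30
          lambda f: not f(10),  # must reject inputs at or below 10
      ]

      candidates = admissible(functions, predicates)

      # Only now does a small amount of training data pick the best survivor.
      train = [(5, False), (50, True), (80, True)]
      best = max(candidates, key=lambda f: sum(f(x) == y for x, y in train))
      print(len(functions), "->", len(candidates), "candidate functions")
      ```

      [Each predicate here removes far more than half of the functions, which is exactly why, in Vapnik's telling, a few examples then suffice.]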

    28. LF

      So you talk about admissible set of functions and you talk about good functions. So what makes a good function?

    29. VV

So the admissible set of functions is a set of functions which has small capacity, or small diversity; small VC dimension, exactly-

    30. LF

      Mm-hmm.

  3. 30:00-45:00

    Mm-hmm. …

    1. VV

training data. Actually, for mathematicians, when you are considering one variant, you need to use the law of large numbers. When you are doing training in an existing algorithm, you need the uniform law of large numbers-

    2. LF

      Mm-hmm.

    3. VV

... which is much more difficult and requires VC dimension and all this stuff. But nevertheless, if you use both the weak and strong ways of convergence, you can decrease the training data a lot.
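
      [Editor's note: the distinction Vapnik is drawing, stated in standard VC-theory notation. For one fixed function, the ordinary law of large numbers makes the empirical risk converge to the true risk; but training selects a function after seeing the data, so convergence must hold uniformly over the whole class F:]

      ```latex
      % Uniform law of large numbers over the class F
      P\Bigl\{ \sup_{f \in F} \bigl| R(f) - R_{\mathrm{emp}}(f) \bigr| > \varepsilon \Bigr\}
      \;\longrightarrow\; 0
      ```

      [For indicator functions, this distribution-free uniform convergence holds exactly when the VC dimension h of F is finite, and it yields the classic bound from Vapnik's The Nature of Statistical Learning Theory: with probability at least 1 - η, for all f in F,]

      ```latex
      R(f) \;\le\; R_{\mathrm{emp}}(f)
        + \sqrt{\frac{h\left(\ln\frac{2\ell}{h} + 1\right) - \ln\frac{\eta}{4}}{\ell}}
      ```

      [so a smaller admissible set (smaller h) buys the same guarantee from fewer training examples ℓ.]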

    4. LF

Yeah. You could do it with the three: the swims like a duck and quacks like a duck.

    5. VV

      Yeah, yeah.

    6. LF

But our... So let's step back and think about human intelligence in general. And clearly, that has evolved in a non-mathematical way. (laughs) As far as we know, God, or whoever, didn't come up with a model of admissible functions and place it in our brain. It kind of evolved. I don't know, maybe you have a view on this. But Alan Turing in the '50s, in his paper, asked and rejected the question, "Can machines think?" It's not a very useful question, but can you briefly entertain this useless question? Can machines think? So talk about intelligence and your view of it.

    7. VV

I don't know. I know that Turing described imitation: if a computer can imitate a human being, let's call it intelligent. And he understood that the computer is not thinking.

    8. LF

      Yes.

    9. VV

He completely understood what he was doing, but he set up the problem of imitation. So now, we understand that the problem is not in imitation. I'm not sure that intelligence is just inside of us. It might also be outside of us. I have several observations. So when I prove some theorem-

    10. LF

      Mm-hmm.

    11. VV

... a very difficult theorem. In a couple of years, in several places, people proved the same theorems; say, Sauer's lemma was done after us, and other guys proved the same theorem.

    12. LF

      Yeah.

    13. VV

In the history of science, it's happened all the time. For example, geometry.

    14. LF

      Mm-hmm.

    15. VV

It happened simultaneously: first Lobachevsky, and then Gauss and Bolyai and other guys, in approximately a ten-year period of time.

    16. LF

      Mm-hmm.

    17. VV

And I saw a lot of examples like that. And many mathematicians think that when they develop something-

    18. LF

      Mm-hmm.

    19. VV

... they are developing something general which affects everybody.

    20. LF

      Mm-hmm.

    21. VV

So maybe our model that intelligence is only inside of us is incorrect.

    22. LF

      It's our interpretation, yeah.

    23. VV

It might be that there exists some connection-

    24. LF

      Yeah.

    25. VV

... with this world intelligence. I don't know.

    26. LF

You're almost like plugging into, uh...

    27. VV

      Yeah, exactly.

    28. LF

      (laughs) And contributing to this, uh...

    29. VV

      Into big network.

    30. LF

      (laughs) Into- into a big, uh, maybe neural network.

  4. 45:00-54:02

    Mm-hmm. …

    1. VV

this predicate, we can do a good job with a small amount of observations. And the very first challenge is well-known digit recognition, and you know digits.

    2. LF

      Mm-hmm.

    3. VV

And please tell me the invariants. Thinking about that, I can say, for the digit three, I would introduce the concept of horizontal symmetry.

    4. LF

      Mm-hmm.

    5. VV

So the digit three has horizontal symmetry, more than, say, the digit two or something like that. But as soon as I get the idea of horizontal symmetry, I can mathematically invent a lot of measures of horizontal symmetry, or even vertical symmetry or diagonal symmetry, whatever, if I have the idea of symmetry. But what else? Looking at a digit, I see that it is a meta-predicate, which is not shape. It is something like symmetry, like how dark the whole picture is. Something like that, which can itself give rise to a predicate.
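
      [Editor's note: a minimal Python sketch of the kind of symmetry measure Vapnik gestures at here. The specific formula is our assumption for illustration, not his published construction.]

      ```python
      import numpy as np

      def horizontal_symmetry(img: np.ndarray) -> float:
          """Degree of symmetry about a horizontal axis, in [0, 1].

          Assumes nonnegative pixel intensities. 1.0 means the image equals
          its top-bottom mirror, as an idealized digit '3' nearly does;
          an idealized '2' scores lower.
          """
          img = img.astype(float)
          flipped = np.flipud(img)           # mirror top-to-bottom
          scale = img.sum() + flipped.sum()  # total "ink", for normalization
          if scale == 0:
              return 1.0                     # a blank image is trivially symmetric
          return 1.0 - np.abs(img - flipped).sum() / scale

      # Once the *idea* of symmetry exists, variants are mechanical to invent:
      def vertical_symmetry(img: np.ndarray) -> float:
          return horizontal_symmetry(img.T)  # transpose makes it a left-right mirror

      def darkness(img: np.ndarray) -> float:
          return img.astype(float).mean()    # "how dark is the whole picture"
      ```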

    6. LF

You think such a predicate could arise out of something that's not general? Meaning, it feels like, for me to be able to understand the difference between a two and a three, I would need to have had a childhood of 10 to 15 years playing with kids, going to school, being yelled at by parents. All of that. Walking, jumping, looking at ducks. And then I would be able to generate the right predicate for telling the difference between a two and a three. Or do you think there's a more efficient-

    7. VV

      (laughs)

    8. LF

      ... way?

    9. VV

I don't know. I know for sure that you must know something more than digits-

    10. LF

      Yes.

    11. VV

      ... to, to-

    12. LF

      And that's a powerful statement.

    13. VV

Yeah. But maybe there are several languages for describing these elements of digits. So I'm talking about symmetry, about something.

    14. LF

      Symmetry, right.

    15. VV

I'm talking about properties of geometry. Something abstract. I don't know, but this is a problem of intelligence. So in one of our articles, it is trivial to show that every example can carry no more than one bit of information in reality. Because when you show an example and you say this is a one, you can remove, say, the functions which do not tell you one.

    16. LF

      Yeah.

    17. VV

Say the best strategy, if you can do it perfectly, is to remove half of the functions. But when you use one predicate, like looks like a duck, you can remove many more functions than half. And that means that it contains a lot of bits of information from a formal point of view. But when you have a general picture of what you want to recognize and a general picture of the world, can you invent this predicate? And that predicate carries a lot of information.
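
      [Editor's note: the arithmetic behind "one bit per example". If each labeled example at best eliminates half of the admissible set F, then after n examples at least |F| / 2^n functions remain; a predicate that keeps only a fraction of the set supplies many bits at once:]

      ```latex
      |F_n| \;\ge\; \frac{|F|}{2^{n}},
      \qquad
      \mathrm{bits}(P) \;=\; \log_2 \frac{|F|}{|F_P|}
      ```

      [For example, a predicate with |F_P| = |F| / 1024 carries log2(1024) = 10 bits, the work of ten ideal examples.]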

    18. LF

Beautifully put. Maybe it's just me, but in all the math you show, in your work, which is some of the most profound mathematical work in the field of learning, AI, and just math in general, I hear a lot of poetry and philosophy. You really kind of talk about the philosophy of science. There's a poetry and music to a lot of the work you're doing and the way you're thinking about it. So where does that come from? Do you escape to poetry? Do you escape to music? Or not? (laughs)

    19. VV

I think that there exist ground truths.

    20. LF

      There exists ground truth? (laughs)

    21. VV

Yeah. And they can be seen everywhere.

    22. LF

      Yeah.

    23. VV

The smart guys, philosophers: sometimes I am surprised how deeply they see; sometimes I see that some of them are completely off the subject.

    24. LF

      Mm-hmm.

    25. VV

But in the ground truth, I see music.

    26. LF

      Music is a ground truth?

    27. VV

Yeah. And in poetry, many poets, they believe that they take dictation.

    28. LF

(laughs) So what piece of music, as a piece of empirical evidence, gave you a sense that they are touching something in the ground truth?

    29. VV

      It is structure.

    30. LF

      The structure (laughs) within-

Episode duration: 54:02
