Lex Fridman Podcast
David Ferrucci: IBM Watson, Jeopardy & Deep Conversations with AI | Lex Fridman Podcast #44
- 0:00 – 15:00
- LFLex Fridman
The following is a conversation with David Ferrucci. He led the team that built Watson, the IBM question-answering system that beat the top humans in the world at the game of Jeopardy. From spending a couple of hours with David, I saw a genuine passion, not only for abstract understanding of intelligence, but for engineering it to solve real-world problems under real-world deadlines and resource constraints. Where science meets engineering is where brilliant, simple ingenuity emerges. People who work at joining the two have a lot of wisdom earned through failures and eventual success. David is also the founder, CEO, and chief scientist of Elemental Cognition, a company working to engineer AI systems that understand the world the way people do. This is the Artificial Intelligence podcast. If you enjoy it, subscribe on YouTube, give it five stars on iTunes, support it on Patreon, or simply connect with me on Twitter @lexfridman, spelled F-R-I-D-M-A-N. And now, here's my conversation with David Ferrucci. Your undergrad was in biology, with an eye toward medical school, before you went on for the PhD in computer science. So let me ask you an easy question. What is the difference between biological systems and computer systems? In your... when you sit back, look at the stars, and think philosophically.
- DFDavid Ferrucci
I often wonder whether or not there is a substantive difference. I mean, I think the thing that got me into computer science and into artificial intelligence was exactly this presupposition, or I should say this philosophical question: if we can get machines to think, to understand, to process information the way we do, so if we can describe a procedure or describe a process, even if that process were the intelligence process itself, then what would be the difference? So, um, from a philosophical standpoint, I'm not sure I'm convinced that there is. I mean, you can go in the direction of spirituality or you can go in the direction of a soul, but in terms of, you know, what we can experience, uh, from an intellectual and physical perspective, I'm not sure there is. Clearly, there are different implementations. But if you were to say, is a biological information processing system fundamentally more capable than one we might be able to build out of silicon or some other, uh, substrate, I don't know that there is.
- LFLex Fridman
How distant do you think is the biological implementation? So fundamentally, they may have the same capabilities, but is it, um, really a far mystery where a huge number of breakthroughs are needed to be able to understand it? Or is it something that, for the most part, in the important aspects, echoes of the same kind of characteristics?
- DFDavid Ferrucci
Yeah, that's interesting. I mean, uh, so, you know, your question presupposes that there's this goal to recreate, you know, what we perceive as biological intelligence. I'm not- I'm not sure that's the- I'm not sure that- that's how I would state the goal. I mean, I think that studying-
- LFLex Fridman
What is the goal?
- DFDavid Ferrucci
Good. So I think there are a few goals. I think that understanding the human brain and how it works is important for us to be able to diagnose and treat issues, for us to understand our own strengths and weaknesses, um, both intellectual, psychological, and physical. So neuroscience and understanding the brain from that perspective has a ne- there's a clear, clear goal there. From the perspective of saying I want to m- I want to- I want to mimic human intelligence, that one's a little bit more interesting. Human intelligence certainly has, um, a lot of things we envy. It's also got a lot of problems too. So I think we're capable of sort of stepping back and saying, "What do we want out of it? Uh, what do we want out of an intelligence? Uh, how do we want to communicate with that intelligence? How do we want it to behave? How do we want it to perform?" Now, of course, it's- it's- it's somewhat of an interesting argument because I'm sitting here as a human with a biological brain and I'm critiquing the strengths and weaknesses of human intelligence and saying that we have the capacity to s- the capacity to step back and say, "Gee, what do- what is intelligence and what do we really want out of it?" And that even- in and of itself suggests that human intelligence is something quite enviable, that it could- it- you know, it can- it can- it can, um, introspect that- it can introspect that way.
- LFLex Fridman
And the flaws, you mentioned the flaws. That humans have flaws.
- DFDavid Ferrucci
Yeah. But I think the flaw that human intelligence has is that it's extremely, um, prejudicial and biased in the way it draws many inferences.
- LFLex Fridman
Do you think those are... Sorry to interrupt. Do you think those are features or are those bugs? Do you think the prejudice, the forgetfulness, the fear... What other flaws? List them all. What? Love? Maybe that's a flaw. Do you think those are all things that can get in the way of intelligence, or are they essential components of intelligence?
- DFDavid Ferrucci
Well, again, if you go back and you define intelligence as being able to sort of accurately, precisely, rigorously reason, develop answers, and justify those answers in an objective way, yeah, then human intelligence has these flaws in that it tends to be more influenced by some of the things you said.
- LFLex Fridman
Mm-hmm.
- DFDavid Ferrucci
Uh, and it's largely an inductive process, meaning it takes past data and uses that to predict the future. That's very advantageous in some cases, but fundamentally biased and prejudicial in other cases, 'cause it's gonna be strongly influenced by its priors, whether they're right or wrong from some, you know, objective reasoning perspective. You're gonna favor them because those are the decisions or those are the paths that succeeded in the past. And I think that mode of intelligence makes a lot of sense when your primary goal is to act quickly and survive and make fast decisions. And I think those create problems, uh, when you wanna think more deeply and make more objective and reasoned decisions. Of course, humans are capable of doing both.
- LFLex Fridman
Right.
- DFDavid Ferrucci
They do sort of one more naturally than they do the other, but they're capable of doing both.
- LFLex Fridman
You're saying they do the one that responds quickly more naturally?
- DFDavid Ferrucci
Right.
- LFLex Fridman
'Cause that's the thing we kinda need to not be eaten by, uh, the p- the predators-
- DFDavid Ferrucci
Well-
- LFLex Fridman
... in the world.
- DFDavid Ferrucci
... for example, but I mean, then we've learned to reason, uh, through logic. We've developed science. We've trained people to do that. I think that's harder for the individual to do. Uh, I think it requires training and, you know, teaching. I think the human mind certainly is capable of it, but we find it more difficult. And then there are other weaknesses, if you will, as you mentioned earlier, just memory capacity and, um, how many chains of inference can you actually, um, go through without, like, losing your way, so just focus and...
- LFLex Fridman
So the way you think about intelligence, and we're really sort of floating in this slightly philosophical space, but I think you're, like, the perfect person to talk about this because, uh, we'll get to Jeopardy! and beyond, that's like one of the most incredible accomplishments in AI, in the history of AI, hence the philosophical discussion. So let me ask, you've kind of alluded to it, but let me ask again: what is intelligence? Underlying the discussions we'll have about Jeopardy! and beyond, how do you think about intelligence? Is it a sufficiently complicated problem, being able to reason your way through solving that problem? Is that kinda how you think about what it means to be intelligent?
- DFDavid Ferrucci
So I think of intelligence in primarily two ways. One is the ability to predict. So in other words, if I have a problem, can I predict what's gonna happen next? Whether it's to, you know, predict the answer of a question or to say, "Look, I'm looking at all the market dynamics and I'm gonna tell you what's gonna happen next," or you're in a room and somebody walks in and you're gonna predict what they're gonna do next or what they're gonna say next.
- LFLex Fridman
So in a, in a highly dynamic environment full of uncertainty, be able to-
- DFDavid Ferrucci
Lots of, lot-
- LFLex Fridman
... predict.
- DFDavid Ferrucci
... you know, the more-
- LFLex Fridman
Yeah.
- DFDavid Ferrucci
... the more variables, the more complex, the more possibilities, the more complex. But can I take a small amount of prior data and learn the pattern and then predict what's gonna happen next accurately and consistently? That's a f- that's certainly a form of intelligence.
- LFLex Fridman
W- what do you need for that, by the way? You need to have an understanding of the way the world works in order to be able to unroll it into the future, right? Like, w- what do you think-
- DFDavid Ferrucci
Well-
- 15:00 – 30:00
- DFDavid Ferrucci
Um, otherwi- otherwise... In fact, there have been-
- LFLex Fridman
That's the metric of success, yeah.
- DFDavid Ferrucci
... several proofs. There have been, yeah. There have been several proofs out there where mathematicians would study it for a long time before they were convinced that it actually proved anything.
- LFLex Fridman
(laughs)
- DFDavid Ferrucci
Right? You never know if it proved anything until the community of mathematicians decided that it did. So I mean, so, yeah, it's, but it's, it's a real thing, right? And, and, and that's sort of the point, right? Is that ultimately un- you know, this notion of understanding, us understanding s- something is ultimately a social concept.
- LFLex Fridman
Yes.
- DFDavid Ferrucci
In other words, you, you, I have to convince enough people that I, I did this in a reasonable way. I could do this in a way that other people can understand and, and replicate, and, um, that makes sense to them. So we're ver- we're, our human intelligence is bound together in, in that way. We're bound up, um, in that sense. We sort of never really get away with it until we can con- sort of convince others, uh, that our thinking process, you know, makes sense.
- LFLex Fridman
Do you think the general question of intelligence is then also a social construct? So if we ask questions of an artificial intelligence system, "Is this system intelligent?" the answer will ultimately be a socially constructed concept.
- DFDavid Ferrucci
I think, I think, so I, so I think, yeah, I'm making two statements. I'm saying we can try to define intelligence in a super objective way that says, "Here, here's this data. I wanna predict this type of thing."
- LFLex Fridman
Right.
- DFDavid Ferrucci
"Um, learn this function, and then if you get it right, often enough, we consider you intelligent."
- LFLex Fridman
But that's more like a savant.
- DFDavid Ferrucci
More like a question of intelligence.
- LFLex Fridman
That's-
- DFDavid Ferrucci
I think it i-
- LFLex Fridman
Yeah.
- DFDavid Ferrucci
... I think it is. It doesn't mean it's not useful. It's f-
- LFLex Fridman
Right.
- DFDavid Ferrucci
... it could be incredibly useful. It could be solving a problem we can't otherwise solve, and, um, can solve it more reliably than we can. But then there's this notion of, can humans take responsibility for the decision that you're making? Can we make those decisions ourselves? Can we relate to the process that you're going through? And now you, as an agent, whether you're a machine or another human, frankly, are now obliged to make me understand how it is that you're arriving at that answer, and allow me, or, obviously, a community or a judge, to decide whether or not that makes sense. And by the way, that happens with humans as well.
- LFLex Fridman
Right.
- DFDavid Ferrucci
You're sitting down with your staff, for example, and you ask for suggestions about what to do next, and someone says, "Oh, I think you should buy, and I think you should buy this much," or whatever, or sell or whatever it is. Or, "I think you should launch the product today or tomorrow, or launch this product versus that product," whatever the decision may be. And you ask, "Why?" And the person says, "I just have a good feeling about it."
- LFLex Fridman
Yeah.
- DFDavid Ferrucci
And you're not very satisfied. Now, that person could be, you know, you might say, "Well, you've been right, you know, before, but I'm gonna put the company on the line. Can you explain to me why I should believe this?"
- LFLex Fridman
Right. And that explanation may have nothing to do with the truth. You just, the, the ultim-
- DFDavid Ferrucci
It's gotta convince the other person.
- LFLex Fridman
The ultimate metric is-
- DFDavid Ferrucci
It could still be wrong.
- LFLex Fridman
Yeah.
- DFDavid Ferrucci
It could still be wrong.
- 30:00 – 45:00
- DFDavid Ferrucci
perspective are you bringing to the table? What are your prior experiences with those artifacts? What are your fundamental assumptions and values? What, what is your ability to kind of reason, to chain together, um, logical implications as you're sitting there and saying, 'Well, if this is the case, then I would conclude this. And if that's the case, then I would conclude that'?" And, uh... So, your reasoning processes and how they work, your prior models and, uh, what they are, your values and your assumptions. All those things now come together into the interpretation. Getting in sync on that is, is hard.
- LFLex Fridman
And yet, humans are able to intuit some of that without any pre-
- DFDavid Ferrucci
Because they have the shared experience.
- LFLex Fridman
... mea- and we're not talking about shared, two people having a shared experience.
- DFDavid Ferrucci
No.
- LFLex Fridman
We, as a society-
- DFDavid Ferrucci
That's correct. We have the shared experience and we have similar brains, so in other words, part of our shared experience is our shared local experience. Like, we may live in the same culture, we may live in the same society, and therefore, we have similar educations. We have similar, what we like to call prior models about the world, prior experiences, and we use that as a... Think of it as a wide collection of interrelated variables, and they're all bound to similar things. And so we take that as our background, and we start interpreting things similarly. But as humans, we have, um, a lot of shared experience. We do have similar brains, similar goals, similar emotions under similar circumstances, because we're both humans. So now, one of the early questions you asked was, well, how are biological and, you know, computer, uh, information systems fundamentally different? Well, one is, humans come with a lot of pre-programmed stuff.
- LFLex Fridman
Yeah.
- DFDavid Ferrucci
A ton of pre-programmed stuff. And they're able to communicate because they share that stuff.
- LFLex Fridman
Do you think that shared knowledge... If, if, if we can maybe escape the hardware question, how much is encoded in the hardware? Just the shared knowledge and the software, the, the history, the many centuries of wars and so on that, that came to today. That shared knowledge. Uh, how hard is it to encode? And did you have a hope? Can you speak to how hard is it to encode that knowledge systematically in a way that could be used by a computer?
- DFDavid Ferrucci
So I think it is possible to program a machine to acquire that knowledge with a similar foundation. In other words, a similar interpretive foundation for processing that knowledge.
- LFLex Fridman
Uh, what do you mean by that? How-
- DFDavid Ferrucci
So in other, in other words-
- LFLex Fridman
... foundation?
- DFDavid Ferrucci
... we view the world in a particular way. And so, in other words, we, we have a, if you will, as humans, we have a framework for interpreting the world around us.
- LFLex Fridman
Mm-hmm.
- DFDavid Ferrucci
So we have multiple frameworks for interpreting the world around us. But, uh, if you're interpreting, for example, socio-political interactions, you're thinking about, well, there's people, there's collections and groups of people. They have goals. Goals largely built around survival and quality of life.
- LFLex Fridman
Mm-hmm.
- DFDavid Ferrucci
There are e- there are fundamental economics around scarcity of resources. And when, when humans come and start interpreting a situation like that, because you brought, you brought up, like, historical events. They start interpreting situations like that. They apply a lot of this, a lot of this, this fundamental framework for interpreting that. Well, who are the people? What were their goals? What resources did they have? How much power or influence did they have over the other... Like, just fundamental-
- LFLex Fridman
Yeah.
- DFDavid Ferrucci
... substrate, if you will, for interpreting and reasoning about that. So I think it is possible to imbue a computer with that, that stuff that humans, like, take for granted when they go and, and, and sit down and try to interpret things. And then, and then with that, with that foundation, they acquire, they start acquiring the details, the specifics in any given situation, are then able to interpret it with regard to that framework. And then given that interpretation, they can do what? They can predict. But not only can they predict. They can predict now with an explanation that can be given in those terms, in the terms of that underlying framework that most humans share.
- LFLex Fridman
Mm-hmm.
- DFDavid Ferrucci
Now, you could find humans that come and interpret events very differently than other humans, because they're, like, using a, a different s- different framework. You know, the movie Matrix comes to mind, where, you know, they decided that humans were really just batteries, and that's how they (laughs) interpreted the value of humans-
- LFLex Fridman
Mm-hmm.
- DFDavid Ferrucci
... um, as a source of electrical energy. So but, um, but I think that, you know, for the most part, we, we, we have a way of, of interpreting the events or at least social events around us, because we have this shared framework. It comes from, again, the fact that we're, we're similar beings that have similar goals, similar emotions, and we as... We can make sense out of these. These frameworks make sense to us.
- LFLex Fridman
So how much knowledge is there, do you think? So it's... You said it's possible.
- DFDavid Ferrucci
Well, there's always a tremendous amount of detailed knowledge in the world. You can imagine, you know, an effectively infinite number of unique situations and unique configurations of these things. But the knowledge that you need, what I refer to as, like, the frameworks you need for interpreting them, I think those are finite. Um-
- LFLex Fridman
You think the frameworks are more important than the bulk of the knowl- so, like, framing describes the-
- DFDavid Ferrucci
Yeah. Because what the frameworks do is they give you now the ability to interpret and reason over the specifics in ways that other humans would understand.
- LFLex Fridman
What about the specifics? You know-
- 45:00 – 1:00:00
- LFLex Fridman
I don't know if you've driven the vehicle or, are, are, are aware of what-
- DFDavid Ferrucci
Yeah.
- LFLex Fridman
... the, the... So you basically, the human and machine are working together there, and the human is responsible for their own life to monitor the system. And, you know, the system fails every few miles. And so the- there's, there's hundreds of... there's millions of those failures a day, and so that's like a moment of interaction. Do you see?
- DFDavid Ferrucci
Yeah. No, that's exactly right. That's a moment of interaction, um, where, you know, the machine has learned some stuff, uh, it has a failure. Somehow the failure is communicated. The human is now filling in the mistake, if you will, or maybe correcting it or doing something that is more successful. In that case, the computer takes that learning. So I believe in the collaboration between human and machine. I mean, that's sort of a primitive example. Another example is where the machine is literally talking to you and saying, "Look, I'm reading this thing. I know that, like, the next word might be this or that, but I don't really understand why. I have my guess. Can you help me understand the framework that supports this?" And then it can kind of acquire that, take that, and reason about it and reuse it the next time it's reading to try to understand something. Not unlike what a human, uh, student might do. I mean, I remember, uh, when my daughter was in first grade and she had a, um, reading assignment about electricity and, you know, somewhere in the text it says, "And electricity is produced by water flowing over turbines," or something like that. And then there's a question that says, "Well, how is electricity created?" And so my daughter comes to me and says, "I mean, I could..." You know, created and produced are kind of synonyms in this case. "So I can go back to the text and I can copy 'by water flowing over turbines,' but I have no idea what that means."
- LFLex Fridman
Mm-hmm.
- DFDavid Ferrucci
"Like, I don't know how to interpret water flowing over turbines and what electricity even is. I mean, I can get the answer right by matching the text, but I don't have any framework for understanding what this means at all."
- LFLex Fridman
And framework really is... I mean, it's a set of, not to be mathematical, but axioms of ideas that you bring to the table in interpreting stuff and then you build those up somehow.
- DFDavid Ferrucci
You, you, you build them up with the expectation that there's a shared understanding of what they are.
- LFLex Fridman
Share... Yeah, yeah. It's the social-
- DFDavid Ferrucci
Yeah, right.
- LFLex Fridman
... that, that us humans... Do you have a sense that humans on Earth, in general, share a set of f... Like, how many frameworks are there?
- DFDavid Ferrucci
I mean, it depends on how you bound them, right? So, in other words, how big or small, like their individual scope. Um, but there's lots and there are new ones. I think the way I think about it is kind of in layers. I think of the architecture as being layered in that there's a small set of primitives that give you the foundation to build frameworks, and then there may be, you know, many frameworks, but you have the ability to acquire them, and then you have the ability to reuse them. I mean, one of the most compelling ways of thinking about this is reasoning by analogy, where I can say, "Oh, wow, I've learned something very similar." Um, you know, I never heard of this game, uh, soccer, but, um, if it's like basketball in the sense that the goal is like the hoop and I have to get the ball in the hoop and I have guards and I have this and I have that, like, where are the similarities and where are the differences? And I have a foundation now for interpreting this new information.
- LFLex Fridman
And then, uh, different groups, like the millennials will have a framework and then, and then, and then-
- DFDavid Ferrucci
Well, that, you know that-
- LFLex Fridman
... and then ever-
- DFDavid Ferrucci
Yeah.
- LFLex Fridman
You know?
- DFDavid Ferrucci
Well, like that, that-
- LFLex Fridman
The Democrats and Republicans.
- DFDavid Ferrucci
Well-
- LFLex Fridman
Millennials, nobody wants that framework, I think.
- DFDavid Ferrucci
Well, well, I mean, I think-
- LFLex Fridman
No one understands it.
- DFDavid Ferrucci
Right. I mean, they're talking about political and social ways of interpreting the world around them, and I think these frameworks are still largely, largely similar. I think they differ in maybe what some fundamental-
- LFLex Fridman
Yeah.
- DFDavid Ferrucci
... assumptions and values are. Now, from a reasoning perspective, the ability to process the framework might not be that different. The implications of different fundamental values or fundamental assumptions in those frameworks may reach very different conclusions. So, from-
- LFLex Fridman
Mm-hmm.
- DFDavid Ferrucci
... so from a- a social perspective, the conclusions may be very different. From an intelligence perspective, I've- I, you know, I just followed where my assumptions took me.
- LFLex Fridman
Yeah, the pro- the process itself will look similar. But that's a fascinating idea that, um, frameworks really help carve how, uh, a statement will be interpreted.
- DFDavid Ferrucci
Mm-hmm.
- 1:00:00 – 1:15:00
- DFDavid Ferrucci
off of a, um... I was doing the, the open domain question answering stuff, but I was coming off a co- couple other projects. I had a lot more time to put into this. And I argued that it could be done, and I argued it would be crazy not to do this.
- LFLex Fridman
Can I, um... You could be honest at this point. So even though you argued for it, uh, what's the confidence that you had yourself, uh, privately, uh, that this could be done? What was... (laughs) We just told, told the story, how you tell stories to convince others.
- DFDavid Ferrucci
Mm-hmm.
- LFLex Fridman
How confident were you? What was your estimation of the problem at that time?
- DFDavid Ferrucci
So I thought it was possible, and a lot of people thought it was impossible. I thought it was possible.
- LFLex Fridman
Okay.
- DFDavid Ferrucci
The reason why I thought it was possible is because I did some brief experimentation. I knew a lot about how we were approaching open domain factoid question answering. We had been doing it for some years. I looked at the Jeopardy! stuff. I said, "This is gonna be hard," for a lot of the, uh, points that we mentioned earlier. Hard to interpret the question, um, hard to do it quickly enough, hard to compute an accurate confidence. None of this stuff had been done well enough before. But a lot of the technologies we were building were the kinds of technologies that should work. But more to the point, what was driving me was, I was in IBM Research. I was a senior leader in IBM Research, and this is the kind of stuff we were supposed to do.
- LFLex Fridman
Yeah, yeah.
- DFDavid Ferrucci
In other words, we were basically supposed to-
- LFLex Fridman
This is the moon shot. This is the-
- DFDavid Ferrucci
I mean, we were supposed to take things and say, "This is an active research area. It's our obligation to kind of, if we have the opportunity, to push it to the limits, and if it doesn't work, to understand more deeply why we can't do it." And so, I was very committed to that notion, saying, "Folks, this is what we do. It's crazy not, not to do it."
- LFLex Fridman
Yeah.
- DFDavid Ferrucci
"This is an active research area. We've been in this for years. Why wouldn't we take this grand challenge and, and push it as hard as we can?" At the very least, we'd be able to come out and say, "Here's why this problem is, is way hard."
- LFLex Fridman
Yeah.
- DFDavid Ferrucci
"Here's what we tried and here's how we failed." So, I was very driven, um, as a scientist from that perspective, and then I also argued, based on a feasibility study we did, uh, why I thought it was hard but possible, and I showed examples of, you know, where it succeeded, where it failed, why it failed, and sort of a high-level architectural approach for why we should do it. But for the most part, at that point, the execs really were just looking for someone crazy enough to say yes, because for several years at that point, everyone had said no.
- LFLex Fridman
Yeah.
- DFDavid Ferrucci
"Um, I'm not willing to risk my reputation and my (laughs) career, you know, on this thing."
- LFLex Fridman
Clearly, you did not have such fears. Okay.
- DFDavid Ferrucci
I, I did not.
- LFLex Fridman
So you dived right in, and yet, from what I understand, it was performing very poorly in the beginning. So, what were the initial approaches and why did they fail?
- DFDavid Ferrucci
Well, there were lots of hard aspects to it. I mean, one of the reasons why prior approaches that we had worked on in the past, um, failed was because the questions were difficult to interpret, like what are you even asking for, right?
- LFLex Fridman
Mm-hmm.
- DFDavid Ferrucci
Very often, if the question was very direct, like, "What city?" you know, even then it could be tricky, but, you know, "What city?" or "What person?", often it would name the type very clearly, and you would know that. And if there were just a small set of them, in other words, we're gonna ask about these five types. Like, the answer will be, uh, a city in this state or a city in this country. The answer will be a person of this type, right? Like an actor or whatever it is. But it turns out that in Jeopardy! there were like tens of thousands of these things, and it was a very, very long tail. Meaning, you know, that it just went on and on, and so even if you focused on trying to encode the types at the very top, like, let's say five of the most frequent, you'd still cover a very small percentage of the data. So you couldn't take that approach of saying, "I'm just going to try to collect facts about these five or 10 types or 20 types or 50 types or whatever." So, that was like one of the first things, like, what do you do about that? And so we came up with an approach toward that, and the approach looked promising, and we continued to improve our ability to handle that problem throughout the project. The other issue was that right from the outset I said, "We're not going to..." I committed to doing this in three to five years. So we did it in four, so I got lucky.
- LFLex Fridman
(laughs)
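[Editor's illustration] The long-tail point above can be made concrete with a small sketch. It assumes, purely for illustration, that answer-type frequencies follow a Zipf-like curve (the type at rank r gets weight 1/r); the function name `top_k_coverage`, the exponent, and the 10,000-type figure are hypothetical choices, not data from the Jeopardy! project.

```python
def top_k_coverage(num_types: int, k: int, exponent: float = 1.0) -> float:
    """Fraction of questions covered by the k most frequent answer types.

    Assumes (hypothetically) that the type at rank r has weight 1 / r**exponent.
    """
    weights = [1.0 / (rank ** exponent) for rank in range(1, num_types + 1)]
    return sum(weights[:k]) / sum(weights)


if __name__ == "__main__":
    # Coverage grows slowly as more of the head types are hand-encoded.
    for k in (5, 50, 500):
        print(k, round(top_k_coverage(10_000, k), 3))
```

Under these assumed parameters the five most frequent types cover well under half of the questions, which is the shape of the problem described: encoding only the head of the distribution leaves most of the data untouched.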
- DFDavid Ferrucci
Um, but one of the things that putting that, like, stake in the ground did was, and I knew how hard the language understanding problem was, I said, "We're not going to actually understand language to solve this problem. We are not going to interpret the question and the domain of knowledge the question refers to, and reason over that to answer these questions. We're not gonna be doing that." At the same time, simple search wasn't good enough to confidently answer with, you know, a single correct answer.
- LFLex Fridman
First of all, that's like brilliant, that's such a great mix of innovation an- and practical engineering, three, three, four. Uh, so you're not, you're not trying to solve the general NL- NLU problem. You're saying, "Let's solve this in any way possible."
- DFDavid Ferrucci
Oh, I, yeah, no.
- LFLex Fridman
But-
- DFDavid Ferrucci
No, I was committed to s- saying, "Look, we're gonna solve an open domain question answering problem. We're using Jeopardy! as a driver for that."
- LFLex Fridman
As a, as a big benchmark.
- 1:15:00 – 1:21:43
And speed. So better…
- DFDavid Ferrucci
you had a total of a thousand or 5,000 whatever passages. For each passage now you'd go and figure out whether or not there was a candidate, what we call a candidate answer in there. So you had a whole bunch of another ... a whole bunch of other algorithms that would find candidate answers, possible answers to the question. And so you had, called candidate answer generators, a whole bunch of those. So for every one of these components the team was constantly doing research, coming up with better ways to generate search queries from the questions, better ways to analyze the question, better ways to generate candidates.
- LFLex Fridman
And speed. So better is accuracy and, uh, speed.
- DFDavid Ferrucci
Correct. So right, speed and accuracy for the most part were separated. We handled that sort of in separate ways, like I focused purely on accuracy, end-to-end accuracy. Are we ultimately getting more questions right and producing more accurate confidences? And then I had a whole nother team that was constantly analyzing the workflow to find the bottlenecks and then figuring out how to both parallelize and drive the algorithm speed. But anyway, so- so now think of it like you have this big fan-out now, right? Because you have ... you had multiple queries, now you have, now you have thousands of candidate answers. For each candidate answer, you're gonna score it. So you're gonna use all the data that built up. You're gonna use the question analysis.
- LFLex Fridman
Mm-hmm.
- DFDavid Ferrucci
You're gonna use how the query was generated. You're gonna use the passage itself and you're gonna use the candidate answer that was generated, uh, and you're gonna score that. So now we have a group of researchers coming up with scorers. There are hundreds of different scorers.
- LFLex Fridman
Mm-hmm.
- DFDavid Ferrucci
So now you're getting a fan-out again from however many candidate answers you have to all the different scorers. So if you have a-a 200 different scorers and you have a thousand candidates, now you have two hundred thousand scores and-and so now you gotta figure out, you know, how do I now rank these, rank these answers based on the scores that came back? And I want to rank them based on the likelihood that they're a correct answer to the question. So every scorer was its own research project.
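[Editor's sketch] The fan-out described above, where every candidate answer goes through every scorer and the resulting scores are combined into a ranking, can be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not Watson's actual code: the two toy scorers, the function names, the weights, and the logistic combination are all hypothetical stand-ins chosen for illustration.

```python
import math
from typing import Callable, List, Tuple

# Hypothetical scorer signature: (question, passage, candidate) -> score in [0, 1].
Scorer = Callable[[str, str, str], float]


def term_overlap(question: str, passage: str, candidate: str) -> float:
    """Toy scorer: fraction of question words that also appear in the passage."""
    q_words = set(question.lower().split())
    p_words = set(passage.lower().split())
    return len(q_words & p_words) / len(q_words) if q_words else 0.0


def candidate_in_passage(question: str, passage: str, candidate: str) -> float:
    """Toy scorer: 1.0 if the candidate string occurs in its passage, else 0.0."""
    return 1.0 if candidate.lower() in passage.lower() else 0.0


def score_matrix(question: str,
                 candidates: List[Tuple[str, str]],
                 scorers: List[Scorer]) -> List[List[float]]:
    """Fan out: one row per (candidate, passage) pair, one column per scorer.

    With 1,000 candidates and 200 scorers this yields 200 * 1000 = 200,000 scores.
    """
    return [[scorer(question, passage, cand) for scorer in scorers]
            for cand, passage in candidates]


def rank(question: str,
         candidates: List[Tuple[str, str]],
         scorers: List[Scorer],
         weights: List[float],
         bias: float = 0.0) -> List[Tuple[str, float]]:
    """Combine each row of scores into one confidence and sort best-first.

    The weighted logistic combination is an assumption made for this sketch.
    """
    rows = score_matrix(question, candidates, scorers)
    ranked = []
    for (cand, _passage), row in zip(candidates, rows):
        z = bias + sum(w * x for w, x in zip(weights, row))
        ranked.append((cand, 1.0 / (1.0 + math.exp(-z))))
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked
```

Calling `rank(question, candidates, [term_overlap, candidate_in_passage], weights=[2.0, 1.0])` returns the candidates ordered by estimated likelihood of being correct; in a real system the weights would be learned rather than set by hand.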
- LFLex Fridman
What do you mean by scorer? So is that the annotation process of, of basically a human being saying that this, this answer is-
- DFDavid Ferrucci
Yeah, think of it, think of it as-
- LFLex Fridman
... has a quality of-
- DFDavid Ferrucci
... you can think of, think of it, if you want to think of it what you're doing, you know, if you want to think about what a human would be doing, a human would be looking at a possible answer. They'd be reading the, you know, Emily Dickinson, they'd be reading the passage in which that occurred, they'd be looking at the question and they'd be making a decision of how likely it is that Emily Dickinson, given this evidence in this passage, is the right answer to that question.
- LFLex Fridman
Ah, got it. So that, that's the annotation task. That's the annotation process.
- DFDavid Ferrucci
That's the scoring task.
- LFLex Fridman
So but scoring implies zero to one kind of continuous-
- DFDavid Ferrucci
That's right, you give it a zero to one score. (laughs)
- LFLex Fridman
So it's not a binary ...
- DFDavid Ferrucci
No.
- LFLex Fridman
So it's-
- DFDavid Ferrucci
You give it a score. You give it a zero to ... yeah, exactly, a zero to one score.
- LFLex Fridman
So but humans d- give different scores so you have to somehow normalize and all that kind of stuff, uh, deal with all that complexity.
- DFDavid Ferrucci
Depends on what your strategy is. We both, we- we-
- LFLex Fridman
It could be relative too. It could be, uh ...
- DFDavid Ferrucci
We- we actually looked at the raw scores as well as standardized scores because humans are not involved in this. Humans are not involved.
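[Editor's sketch] One common way to obtain "standardized scores" alongside raw ones is a per-scorer z-score computed across the candidates for a question. The transcript does not say which standardization was used, so the choice of z-scores and the helper names below are assumptions made purely for illustration.

```python
from statistics import mean, pstdev
from typing import List


def standardize(column: List[float]) -> List[float]:
    """Z-score one scorer's outputs across all candidates for a question."""
    mu = mean(column)
    sigma = pstdev(column)
    if sigma == 0:
        # Every candidate got the same score, so this scorer does not discriminate.
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]


def with_standardized(rows: List[List[float]]) -> List[List[float]]:
    """Append standardized features to the raw ones.

    `rows` has one row per candidate and one column per scorer; the result
    keeps the raw columns and adds one standardized column per scorer.
    """
    columns = list(zip(*rows))
    std_columns = [standardize(list(col)) for col in columns]
    return [list(raw) + [col[i] for col in std_columns]
            for i, raw in enumerate(rows)]
```

Keeping both versions lets a downstream ranker use the absolute value of a score as well as how a candidate compares with its competitors on the same question.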
- LFLex Fridman
Sorry, so I'm- I'm misunderstanding the- the- the process here. This- this is passages. Where is the ground truth coming from?
- DFDavid Ferrucci
Ground truth is only the answers to the questions.
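[Editor's sketch] If the only ground truth is the known answer to each question, training labels for candidates can be derived by checking each candidate against that answer, and a combiner can then be fit on the scorer outputs. The following is a hedged illustration of that idea using a small hand-written logistic regression; the exact-match labeling rule, the function names, and the training settings are assumptions, not details given in the conversation.

```python
import math
from typing import List, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def label_candidates(candidates: List[str], gold_answer: str) -> List[float]:
    """Label a candidate 1.0 if it matches the known answer, else 0.0.

    No human rates individual passages or scores; the label comes only
    from the question's known answer.
    """
    gold = gold_answer.strip().lower()
    return [1.0 if c.strip().lower() == gold else 0.0 for c in candidates]


def train_weights(rows: List[List[float]],
                  labels: List[float],
                  epochs: int = 500,
                  lr: float = 0.5) -> Tuple[List[float], float]:
    """Fit logistic-regression weights over scorer outputs with plain SGD."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for row, y in zip(rows, labels):
            p = sigmoid(bias + sum(w * x for w, x in zip(weights, row)))
            err = p - y  # gradient of the log loss with respect to z
            weights = [w - lr * err * x for w, x in zip(weights, row)]
            bias -= lr * err
    return weights, bias


def confidence(row: List[float], weights: List[float], bias: float) -> float:
    """Estimated probability that the candidate behind `row` is correct."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, row)))
```

Because the labels depend only on final answers, any change to an individual scorer can be judged by whether it improves the ranking of correct candidates overall.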
- LFLex Fridman
So it's end-to-end.
- DFDavid Ferrucci
It's end-to-end. So we al- ... so I was always driving end-to-end perform- ... it was a very interesting-
- LFLex Fridman
Wow.
- DFDavid Ferrucci
... a very interesting, you know, engineering, um, approach, and ult- ultimately scientific and research approach. We were always driving end-to-end. Now that's not to say we- we- we wouldn't make hypotheses that individual component performance was related in some way to end-to-end performance.
- LFLex Fridman
Right.
Episode duration: 2:24:31
Transcript of episode Whtt2H5_isM