Dwarkesh Podcast

Holden Karnofsky — History's most important century

Holden Karnofsky is the co-CEO of Open Philanthropy and co-founder of GiveWell. He is also the author of one of the most interesting blogs on the internet, Cold Takes. We discuss:

* Are we living in the most important century?
* Does he regret OpenPhil’s $30 million grant to OpenAI in 2016?
* How does he think about AI, progress, digital people, & ethics?

Highly recommend!

EPISODE LINKS

* Transcript: https://www.dwarkeshpatel.com/p/holden-karnofsky
* Spotify: https://spoti.fi/3Qi11SY
* Apple Podcasts: https://apple.co/3CmZXav

TIMESTAMPS

00:00:00 - Intro
00:00:58 - The Most Important Century
00:06:44 - The Weirdness of Our Time
00:21:20 - The Industrial Revolution
00:35:40 - AI Success Scenario
00:52:36 - Competition, Innovation, & AGI Bottlenecks
01:00:14 - Lock-in & Weak Points
01:06:04 - Predicting the Future
01:20:40 - Choosing Which Problem To Solve
01:26:56 - $30M OpenAI Investment
01:30:22 - Future Proof Ethics
01:37:28 - Integrity vs Utilitarianism
01:40:46 - Bayesian Mindset & Governance
01:46:56 - Career Advice

Holden Karnofsky (guest) · Dwarkesh Patel (host)
Jan 3, 2023 · 1h 56m · Watch on YouTube ↗


  1. 0:00 – 0:58

    Intro

    1. HK

      If we had AI systems that could do everything humans do to advance science and technology, that would be insane. We live in a weird time. Growth has been exploding, accelerating over the last blink of an eye. We really need to be kind of like nervous and vigilant about what comes next, and thinking about all the things that could radically transform the world. We just imagine a universe where there actually are some people who live in an especially important time, and then there's a bunch of other people who like tell stories to themselves about how, what, you know, whether they do. How would you want all those people to behave? And it's like, to me, the worst possible rule is all those people should just be like, "Nah, this is crazy," and forget about it.

    2. DP

      All right, today, I have the pleasure of speaking with Holden Karnofsky, who is the co-CEO of Open Philanthropy. In my opinion, Holden is one of the most interesting intellectuals alive, well, given your role. So Holden, welcome to The Lunar Society.

    3. HK

      Thanks for having me.

    4. DP

      Okay, so let's start off by talking about The Most Important Century thesis. Do you wanna explain what this is for

  2. 0:58 – 6:44

    The Most Important Century

    1. DP

      the audience?

    2. HK

      You know, my story is I, uh, originally co-founded an organization called GiveWell that helps people decide where to give as effectively as possible. I'm no longer there, but I, I'm on the board, and it's a website called GiveWell.org that I think makes good recommendations where to give to charity to help, uh, a lot of people. And, uh, you know, as we were working at GiveWell, we met Cari Tuna and Dustin Moskovitz, Dustin is the co-founder of Facebook and Asana, and started a project that became Open Philanthropy to try to help them give away their, uh, large fortune, again, to help as many people as possible. And so I've kinda spent my career looking for ways to do as much good as possible with a dollar or with an hour, with whatever resources you have, and especially with money. And so I've kind of developed this professional specialization in looking for ideas that are underappreciated, underrated, tremendously important, because a lot of the time, uh, that's where I think you can find just kind of outsized, what you might call outsized return on investment, opportunities to spend some money and just get an enormous impact because you're doing something very important that, that is being ignored by others. And so it's through that kind of professional specialization that I've actively looked for interesting ideas that are not getting enough attention, and then I encountered the Effective Altruist community, uh, which is a community of people basically built around the idea of, of doing as much good as you can. And so it's, it's through that community that I encountered the idea of The Most Important Century. It's not my idea at all. I got, got to it from a, a lot of people. And the basic idea is that if we developed, uh, the right kind of AI systems this century, and that looks reasonably likely, that could make this century the most important of all time for humanity. So the, the basic mechanics of why that might be or how you might think about that... So one thing is that if you look back at all of economic history, just the rate at which the world economy has grown, you see acceleration. You see that it's, it's growing a lot faster today than it ever was. And one theory of why that might be, or one way of thinking about it through the lens of basic economic growth theory is that in normal circumstances, you can imagine a kind of feedback loop where you have, uh, people have ideas, and the ideas lead to greater productivity and more resources. And then when you have more resources, you can also have more people, and then those people have more ideas. So you get this feedback loop that goes people, ideas, resources, people, ideas, resources. And starting a couple hundred years ago, you run a feedback loop like that, standard economic theory says you'll get accelerating growth. You'll get a rate of economic growth that goes faster and faster. And basically, if you take the story of our economy to date and you just kind of plot it on a chart and do the kind of simplest thing you can to project it forward, you project that it will go, that our economy will reach like an infinite growth rate, uh, this century. And the reason that I currently don't think that's a great thing to expect by default is that one of the steps of that feedback loop broke a couple hundred years ago. Um, so it goes more people, more ideas, more resources, more people, more ideas, more resources. A couple hundred years ago, people stopped having more children when they had more resources. 
      They got just more, they got richer instead of more populous. And this is all discussed in, uh, in The Most Important Century, uh, page on my blog, Cold Takes. And so what happens right now is that when we have more ideas and we have more resources, we don't end up with more people as a result. We don't have that same accelerating feedback loop. And if you had AI systems that could do all the things humans do to advance science and technology, meaning the AI systems could fill in that more ideas part of the loop, um, then you could get that feedback loop back, and then you could get sort of this unbounded, heavily accelerating explosive growth in science and technology. Uh, so that's like, that's the basic dynamic at the heart of it. So that's kind of a, a way of putting it that's trying to use familiar concepts from economic growth theory. A- a- another way of putting it might just be, "Gosh, if we had AI systems that could do everything humans do to advance science and technology, that would be insane." You know, what if we were to take the things that humans do to create new technologies that have transformed the planet so radically, and we were able to completely automate them so that every computer we have is potentially another mind working on advancing technology? So either way you think about it, you can imagine the world changing incredibly quickly and incredibly dramatically. And so I argue in The Most Important Century series that it looks reasonably likely, in my opinion, more than 50/50, that this century will see AI systems that can do all of the key tasks that humans do to advance science and technology, and that if that happens, we'll see explosive progress in science and technology. The world will quickly become extremely different from how it is today. You might think of it as if there was thousands of years of changes, uh, packed into a much shorter time period. And then if that happens, I argue that you, you could end up in a, in a deeply unfamiliar future. I give one example of what that might look like using this hypothetical technology idea called digital people. That would be sort of people that live in virtual environments, uh, that are kind of simulated, but also realistic and exactly like, exactly like us. And when you picture that kind of advanced world, I think there is, there is a decent reason to think that if we did get that rate of scientific and technological advancement, we could basically hit the limits of science and technology. We could basically find most of what there is to find and end up with a civilization that expands well beyond this planet, has a lot of control over the environment, and is very stable for very long periods of time, and basically looks sort of post-human in a lot of relevant ways. And if you think that, then this is, this is basically our last chance to shape how this happens. So that's The Most Important Century hypothesis in a nutshell is that if we develop... AI that can do all the things humans do to advance science and technology. We could quick, very quickly reach a very futuristic world, very different from today's, could be a very stable, very large world. This

  3. 6:44 – 21:20

    The Weirdness of Our Time

    1. HK

      is our last chance to shape it.
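A minimal sketch, in Python, of the feedback loop Karnofsky describes above (the parameter values are illustrative assumptions, not figures from the episode): when population scales with output, the loop produces super-exponential growth that blows up in finite time, which is the "infinite growth rate" extrapolation he mentions; when population is frozen, as after the demographic transition, you get ordinary exponential growth instead.

```python
import math

# Toy model of "more people -> more ideas -> more resources -> more people".
# All constants are made up for illustration.
def simulate(years=300, population_tracks_output=True):
    population, output = 1.0, 1.0
    for year in range(years):
        ideas = 0.02 * population        # idea production scales with people
        output *= 1.0 + ideas            # ideas raise productivity/resources
        if population_tracks_output:
            population = output          # more resources -> more people
        if math.isinf(output):           # finite-time singularity reached
            return f"blew up in year {year}"
    return f"grew about {output:.0f}x in {years} years"

print(simulate(population_tracks_output=True))   # blows up after roughly 60 steps
print(simulate(population_tracks_output=False))  # steady exponential: ~380x
```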

    2. DP

      Gotcha. Okay, so, uh, I and many other people are gonna find that very wild.

    3. HK

      Yeah.

    4. DP

      Um, so, uh, l- m- maybe you can walk us through the process by which you went from doing global development stuff to, uh, thinking this way. Um, so, uh, l- I, in 2014, for example, you-

    5. HK

      Yeah.

    6. DP

      ... you had an interview or a conversation, um, and th- this is a quote from there. Maybe you can walk me through how you got from there to where you are today. "I have looked at the situation in Africa, have understanding of the situation in Africa, and see a path of doing a lot of good in Africa. I don't know how to look into the far future situation, don't understand the far future situation, and don't see a path to doing good on that front I feel good about."

    7. HK

      Yeah, first, first, I think I just... I, I went on for a while, but I wanna come back and connect this back up to the, to the... how this relates to the work I was even doing at GiveWell-

    8. DP

      Yeah.

    9. HK

      ... why this is all kind of one theme. If we are kind of on the cusp, or this century, of creating these advanced AI systems, then we could be looking at a future that's like very good or very bad.

    10. DP

      Okay.

    11. HK

      Um, and I think there are decent arguments that if we move forward without caution, and we develop kind of sloppily-designed AI systems, they could end up with goals of their own, um, and basically, we would end up with a universe that contains very little that humans value-

    12. DP

      Mm-hmm.

    13. HK

      ... or a galaxy that does. We could also imaginably end up with a world in which, uh, very powerful technologies are used by just not very well-meaning governments to create a world that isn't very good, or a world where we kind of eliminate a lot of forms of material scarcity and have a world that's much better than today's. And so a lot of what I ask... a lot of what I asked at GiveWell was, how can we help the most people possible per dollar spent? And if you ask, how can we help the most people p- possible per dollar spent? Then if you think that funding some work to help shape that transition, to help make sure that we don't move forward too incautiously, to help make sure that we do, you know, increase the odds that we do get like a good future world instead of a bad future one, that's helping a huge number, uh, of people per dollar spent. So, that's the motivation, and now you're quoting me, you know, at a, a discussion-

    14. DP

      (laughs)

    15. HK

      ... an argument I was having, uh, where we posted a transcript back in 2014, and at that time, was, was... that was part of my journey of getting here. As I was talking to people who were saying, "Holden, you wanna help a lot of people with your resources, you should be focused on this massive event that could be coming this century that, that very few people are paying attention to, that there might be a chance to make this go well or poorly for humanity." And I was saying, "Gosh, like that sure is interesting," and I did think it was interesting, and that's why I was spending the time in doing the conversation. But I said, "You know, when I look at global poverty and global health, I see what I can do. I see the evidence, I see the actions I can take, and I'm not seeing that with this stuff." Um, so what changed? I would say a good chunk of what changed is, is maybe like the most boring answer possible is I just kept at it. (laughs) Uh, so I think I was sitting there in 2014 saying, "Gosh, this is really interesting, but it's all a bit overwhelming, it's all a bit crazy. I don't know how I would even think about this. I don't know how I would come up with a risk from AI that I actually believed was a risk and could do something about today." And now I've just been thinking about this for a much longer time period, and I do believe, you know, most things you could say about the far future are very unreliable and not worth taking action on, but I think there are a few things one might say about what a transition to very powerful AI systems could look like. There are some, some things I'm willing to say. I'm willing to say it would be bad if AI systems were poorly designed, had goals of their own, and ended up kind of running the world instead of humans. That seems bad, and I do... and I am more familiar today than I was then with the, the research and the work people can do to make that less likely, and the actions people can take to make that less likely. So, that's probably more than half the answer, but, uh, another thing that would be close to half the answer is I think the world has changed, and I think that basically there's been, uh, big changes in the world of AI since then. So, I think basically in 2014, that was the beginning of what's sometimes called the deep learning revolution, and since then, we've basically seen these, these very computation-intensive but fundamentally simple AI systems, d- and achieve a lot of progress on a lot of different unrelated tasks. And it's looking to me like not totally crazy to imagine that the current way people are developing AI systems, cutting-edge AI systems, could take us all the way to the kind of extremely powerful AI systems that automate roughly everything humans do to advance science and technology. It's not so wild to imagine that we could just keep on going with these systems, make them bigger, put more work into them, but basically stay on the same path, and you could get there. And if you, if you imagine doing that, uh, it becomes a little bit less daunting to imagine the risks that might come up and the things we could do about them. So I don't think it's necessarily the leading possibility, but it's, it's enough to sort of start thinking concretely about the problem.

    16. DP

      Another quote from the interview that, uh, t- today I find appealing because I haven't done the work you have yet, "Does even the upper crust of humanity have a track record of being able to figure out the kinds of things MIRI claims to have figured out?" And by the way, for context for the viewers, MIRI is the organization Eliezer was leading, which, uh, which is who you were talking to at the time.

    17. HK

      Yeah, well, I, I don't remember exactly what kinds of things MIRI was trying to figure out, and I'm not sure that if... that I even understood what they were that well. So, I, I definitely, I definitely think it is true that it is hard to predict the future no matter who you are, no matter how hard you think, no matter how much you've studied. I think that is true. I think parts of our kind of world, or memeplex, or whatever you wanna call it, overblow this at least a little bit, um, and I think I was, I was kind of buying into that a little bit more than I should. So, I think, you know, probably in 2014, I would have said something like, "Gosh, you know... really no one's ever done something like making smart statements about what the, you know, several decades out future could look like, or making smart statements about what we'd be doing today to prepare for it." Since then, I think a bunch of people have looked into this, um, and looked for like historical examples of people kind of making long-term predictions and long-term interventions. And I don't think it's amazing, but I think, I think, uh, I wrote a recent blog post entitled The Track Record of Futurists Seems... Fine. Uh, fine is how I put it, where I, I don't, uh, I don't think there's anyone who has demonstrated a real ability to predict the future with precision and know exactly what we should do, but I also don't think humans' track record of this is so bad and so devastating that we shouldn't think we are capable of at least giving it a shot. Um, and I think if you enter into this endeavor with self-awareness about the fact that everything is less reliable than it appears and feels at first glance, and you look for the few things that you would really bet on, I think it's worth doing. I think it's worth the bet. Uh, my job is to find things where we might do 10 things and have nine of them fail embarrassingly and one of them be such a big hit that it makes up for everything else, and a lot of my job is to find stuff like that. So I don't think it's totally crazy to think we could make meaningful statements about how things we do today could make these future events go better, especially if the future events aren't crazily far away, especially if they're within the next few decades. So that's, that is something I've changed my mind on at least some d- to some degree.

    18. DP

      Gotcha. Okay, so we'll get to the forecasting stuff in a second, but let's-

    19. HK

      Yeah.

    20. DP

      ... uh, let's continue on the object level conversation about-

    21. HK

      Sure.

    22. DP

      ... The Most Important Century. So, um, I, I wanna make sure I have the thesis right. So is the argument that because we're living in a weird time, we shouldn't be surprised if something like transformative AI happens this century? Or is the argument that since transformative AI could happen this century, it's a weird time?

    23. HK

      Yeah. Uh, so, so something we haven't covered yet but I think is worth throwing in now is that a significant part of The Most Important Century series is kind of just making the case that even if you ignore AI, there's a lot of things that are very strange about the time that our generation lives in. The reason I spent so much effort on this is because my number one, back in 2014 and before that, my number one objection to these stories about transformative AI is, you know, it's, it's not anything about whether the specific claims about AI or economic models or align- uh, you know, alignment research makes sense. It's just this whole thing sounds crazy, and it's just suspicious. It's just suspicious if someone says to you, "You know this could be the most important century of all time for humanity." I, I, I titled, uh, the series that way because I wanted people to know that I was saying something kind of crazy and that I should have to defend it. I- I didn't want to be backpedaling or soft pedaling or h- or hiding, uh, what a big claim I was making. And so I think my biggest source of skepticism has been just like I don't have any specific objection. It just sounds kind of crazy and suspicious to say that we might live in one of the most significant times or the most significant time for humanity ever. And so a lot of my series is just kind of saying it is weird, it is weird to think that, but we already have a lot of evidence that we live in an extraordinarily weird time that would be on the short list of contenders for most important time ever before you get into anything about AI, just using like completely commonly accepted facts about the world. For example, if you chart the history of economic growth, you see that the last couple hundred years have seen faster growth by a lot than like anything else in the history of humanity or the world. And if you chart anything about like scientific and technological developments, you can see that everything significant is packed together in the recent past. And there's almost no way to cut it. You know, I've looked at many different cuts of this. There's almost no way to cut it that won't give you that conclusion. One way to put it is that the, you know, the universe is, is something like 11 or 12 billion years old. Uh, humanity, or sorry, life on Earth is like three billion years old. And then humanity is just a blink of an eye, uh, com-

    24. DP

      Yeah.

    25. HK

      compared to that. You could call it 300,000, three, three million years old. Human civilization is a blink of an eye compared to that. And we're in this really, really tiny sliver of time, couple hundred years, when we've seen basically, you know, all or like just a huge amount of the technological advancement and economic growth. So that's weird. You know, I also talk about the fact that the current rate of economic growth seems high enough that we can't keep it going for that much longer. Uh, if it went for another 10,000 years, that's another blink of an eye in galactic timescales. It looks to me, and we can get to this, like we would run out of a- atoms in the galaxy, wouldn't have anywhere to go. And so I think there are a lot of signs that we just live in a really strange time. Uh, I think one, one more I'll just throw in there, then we can move on, is that, uh, you know, I think a lot of people who disagree with my take would say, "Look, I do believe eventually we will develop space colonization abilities, go to the stars, fill up the galaxy with life, you know, maybe have artificial general intelligence. I just... To say it's this century is crazy. I think it might be 500 years. I think it might be 1,000 years. I think it might be 5,000 years." And a big point I make in the series is I say, "Well, even if it's 100,000 years, that's still an extremely crazy time to be in, in the scheme of things." If you just make, you know, if you make a graphic timeline and, and you kind of show my view versus yours, they look exactly the same down to the pixel. And so there's already a lot of reasons to think we live in a very weird time. We're on this planet where there's, where there's no other sign of life anywhere in the galaxy. We believe that we could fill up the galaxy with life. That alone would make us among the earliest life that has ever exist- or it would make us the earliest life that has ever existed in the galaxy, a tiny fraction of it. Um, so that is a lot of what the series is about. And I have sort of answered your question, but I'll do it explicitly. You, you ask, you know, is this series about, uh, transformative AI could come and therefore this century could be weird? Or is it about this century could be weird, therefore transformative AI could come? The central claim is that transformative AI could be developed in this century, and the stuff about how weird a time we live in is just a response to an objection. It's a response to a point-

    26. DP

      I see.

    27. HK

      ... of skepticism. It's a way of saying there's already a lot of reasons to think we live in a very weird time. And so actually this thing about AI is only a moderate quantitative update, not a complete revolution in the way of thinking about things.
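As a back-of-envelope check on the growth-limits point a few turns above (the numbers are rough assumptions, not figures from the episode): roughly 2% annual growth sustained for 10,000 years multiplies the economy by about 10^86, dwarfing any plausible estimate of the number of atoms in the galaxy.

```python
import math

growth_rate = 0.02          # roughly today's world growth rate (assumption)
years = 10_000              # "another blink of an eye in galactic timescales"
atoms_in_galaxy = 1e70      # rough order-of-magnitude estimate (assumption)

log10_factor = years * math.log10(1 + growth_rate)
print(f"growth factor over {years} years: 10^{log10_factor:.0f}")   # ~10^86
print("exceeds atoms in the galaxy:", 10 ** log10_factor > atoms_in_galaxy)
```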

    28. DP

      Th- there's a famous comedian who has a bit where he's imagining what it must've been like to live in 10 BC.

    29. HK

      Yeah.

    30. DP

      (laughs) People are just numbering...

  4. 21:20 – 35:40

    The Industrial Revolution

    1. HK

    2. DP

      Mm-hmm. Uh, I'm glad you brought up the Industrial Revolution, because I feel like there's two implicit claims within the most important century thesis-

    3. HK

      Yeah.

    4. DP

      ... that don't seem perfectly compatible. One is that, you know, we live in an extremely wild time, that the transition here is potentially wilder than any other transition there has been before. And the second is, we have some sense of what we can be doing to make sure this transition goes well.

    5. HK

      Yeah.

    6. DP

      I'm curious if you think that somebody at the beginning of the Industrial Revolution, if knowing what they knew then could have done something significant to, you know, make sure th- that it went as favorably as possible. Or do you-

    7. HK

      Yeah.

    8. DP

      ... think that that's a bad analogy for some reason?

    9. HK

      It's, it's a pretty good analogy for, for being thought-provoking and for thinking, gosh, if, if you had seen the Industrial Revolution coming in advance, and, and this is, you know, when economic growth really reached a new level, uh, back in the 1700s and 1800s, you know, what, what could you have done? And I think, you know, part of the answer is, it's not that clear, and I think that is a bit, you know, a, a bit of an argument that we should maybe not, uh, not get too carried away today by thinking that we know exactly what we can do, but I don't think the answer is quite nothing. Um, so I have a, I have kind of a, a goofy Cold Takes post that I never, that I never published and, and may never publish because I, I don't know, I kind of lost track of it. But it's, it's kind of a, it's kind of saying well, what if, what if you'd been sitting i- in that time and you had known the Industrial Revolution was coming or you had thought it might be, um, you'd asked yourself what you could be doing? One answer you might have given is you might have said, "Well, gosh, if this happens, whatever country it happens in might be just like, disproportionately influential. And you know what would be great is if, is if I could help transform the thinking and the culture in that country to have a better handle on human rights and more value on human rights and individual liberties and a lot of other stuff." Um, and gosh, it kind of looks like people were doing that and it kind (laughs) of, kind of looks like it worked out.

    10. DP

      (laughs) Fair enough.

    11. HK

      Um, so this is the Enlightenment. And I think, um, you know, even, I even give this kind of goofy example, I can... I could look it up and it's, it's, it's all kind of a trollish post, but the example is, it's like someone's kind of thinking through, "Hey, you know, I'm thinking about this sort of esoteric question about what a government owes to its citizens or, you know, when a citizen has a right to overthrow a government or when it's acceptable to kind of enforce certain beliefs, uh, and not." And it's like the other person in the dialogue is just like, "This is the weirdest, most esoteric question. Why does this matter? Why aren't you helping poor people?" But, you know, these are the questions that the, the Enlightenment thinkers were thinking about. And I think there is a good case that they came up with a lot of stuff that, that really shaped the whole world since then because of the fact that the UK became so influential, really laid the groundwork for a lot of stuff about the rights of the governed and free speech and individual rights and human rights. And, uh, and then, and then I go to the next analogy and I'm like, okay, now we're sitting here today and someone is saying, "Well, instead of working on global poverty, I'm studying this kind of esoteric question about how you get an AI system to do what you want it to do instead of doing its own thing." And it's, you know, you could, you could... Yeah, it's not... I think it's not completely crazy to see them as analogous. Now, I don't, I don't think this is what the Enlightenment thinkers were actually doing. I don't think they were saying, "This could be the most important millennium, so let's do this." But it is interesting that you... It, it doesn't look like there was nothing to be had there. It doesn't look like there's nothing you could have come up with, and in many ways it looks like what the Enlightenment thinkers were up to had the same kind of esoteric, strange, overly cerebral feel at the time and ended up mattering a huge amount. So it doesn't feel like there's zero precedent either.

    12. DP

      Yeah. Maybe I'm a bit more pessimistic about that because people like, uh, the people who were working on individual rights, uh, frameworks, like Locke, I don't think they were, like, anticipating-

    13. HK

      Yeah, there were.

    14. DP

      ... an industrial revolution.

    15. HK

      Yeah.

    16. DP

      The, the... I mean, I feel like the person who actually did anticipate the industrial revolution probably (laughs) his political philosophy was actually probably a net negative given, um, you know... I'm talking about Marx.

    17. HK

      Who are we talking about?

    18. DP

      Karl Marx.

    19. HK

      Okay.

    20. DP

      Yeah, yeah.

    21. HK

      Yeah.

    22. DP

      So-... it's not obvious to me that e- even if you saw something like this happening that you're-

    23. HK

      Oh, it's totally not obvious.

    24. DP

      ... right.

    25. HK

      I, I, I mean, I think my basic position here is we, we... I'm not sitting here, like, highly confident. I'm not saying, you know, there's tons of precedent and we know exactly what to do. That's not what I believe. I believe we should be giving it a shot. I think we should be trying and I don't think we should be totally defeatist in saying, "Well, it's, it's so obvious that there's never anything you could have come up with throughout history, and humans have been helpless to predict the future." I don't think that is true. And so, yeah, I think, I think that's, like, enough of an example to kind of illustrate that. And I mean, gosh, like, you could, you could make the same statement today is you could say, "Look, doing research on how to get AI systems to behave as intended is a perfectly fine thing to do at any period in time." Um, it's not, like, a bad thing to do. And I think John Locke was doing his stuff because he felt it was a good thing to do at any period in time. But the thing is, that if we are at this crucial period of time, it becomes an even better thing to do and it becomes magnified to the point where it could be more important than other things.

    26. DP

      Now, the one reason I might be skeptical of this theory is that I could say, "Oh gosh, if you look throughout history, people were often convinced they were living in the most important time."

    27. HK

      Sure.

    28. DP

      Or at least an especially important time.

    29. HK

      Yeah.

    30. DP

      And if you go back, I mean, e- everybody can't be right about living in the most important time, so maybe I should just, uh, have-

  5. 35:40 – 52:36

    AI Success Scenario

    1. HK

    2. DP

      Gotcha, gotcha. Okay, so let- let's talk about transformative AI then.

    3. HK

      Yeah.

    4. DP

      Can you describe what success looks like concretely? Are humans part of the post-transformative AI world? W- are- are we hoping that these AIs become, uh, enslaved gods that help us create a utopia? W- what does the concrete success scenario look like?

    5. HK

      Uh, I mean, I think we've- we've talked a lot about the difficulty of predicting the future and I think I- I do want to emphasize that I really do believe in that. So my attitude to the most important century is not at all, "Hey, I know exactly what's gonna happen and I'm making a plan to get us through it." It's much more like there's a general fuzzy outline of a big thing, uh, that might be approaching us. There's maybe, like, two or three things we can come up with that seem good to do. Everything else we think about, we're gonna- not gonna know if it's good to do or bad to do. And so I'm just trying to find the things that are good to do so that I can make things go a little bit better or help things go a little bit better. That is my general attitude. So it's, um, you know, it's... I don't know. It's like if you were- if you were on a, on a ship in a storm and you saw some, like, very large fuzzy object obscured by the clouds, you might want to steer away from it. You- you might not want to say, "Well, what I think that is, is it's an island and I think there's probably, you know, a tiger on it and if we go and train the tiger in the right way," blah, blah, blah, blah, blah. You don't want to get into that, right? So that- that is the general attitude I'm taking. So what does success look like to me? I mean, success could look like a lot of things but one thing success would look like to me would frankly just be that we get something not too different from the trajectory we're already on. So in other words, if we could have AI systems that-... behaved as intended, acted as tools and amplifiers of humans, did the things they're supposed to do, and if we could avoid a world where those AI systems got sort of like, I don't know, all controlled by one government or one person, um, avoid a world where that caused a huge concentration of power, if we could just have a world where AI systems are just another technology, they help us do a lot of stuff, we invent lots of other technologies and everything is like, relatively broadly distributed and everything works roughly as it's supposed to work, um, then you might be in a world where we continue the trend we've seen over the last couple of hundred years, which is that we're all getting richer, we're all getting more tools, we all hopefully get increasing ability to kind of understand ourselves, study ourselves, understand what makes us happy, what makes us thrive. And hopefully the world just gets better over time and we have more and more new ideas, uh, the ideas make us hopefully wiser. And I, you know, I do think that in most respects, the world of today is just like a heck of a lot better than the world of 200 years ago. I don't think the only reason for that is wealth and technology, but I think they played a role. And I think that like, yeah, if you'd gone back to 200 years ago and said, "Holden, you know, how would you like the world to develop a bunch of new technologies? As long as they're like, sort of evenly distributed and they behave roughly as intended and people mostly just get richer and discover new stuff?" I'd be like, "That sounds great." I don't know exactly where we're gonna land. I can't predict in advance whether we're going to decide th- that, that we wanna treat our technologies as having their own rights. That's stuff that the world will figure out. 
But I'd like to avoid massive disasters that are identifiable, because I think if we can, we might end up in a world where, where the future is wiser than we are and is able to do better things.

    6. DP

      Okay. Um, the way you put it with AI enabling humans, that's, doesn't sound like something that could last for thousands of years. It almost sounds as-

    7. HK

      Sure. Oh, agreed. Yeah, yeah.

    8. DP

      ... weird as like, chimps saying, you know, what we would like is humans to be our tools.

    9. HK

      Yeah.

    10. DP

      At best, maybe they could hope we would give them nice zoos. But like what, what is the world for humans in this, in this future?

    11. HK

      I mean, a world I could easily imagine, although that doesn't mean it's realistic at all, is, is a world where we, we do build these AI systems, they do what they're supposed to do, and we kind of use them to gain, you know, more, more intelligence and wisdom. Um, I've talked a little bit about this, this hypothetical idea of digital people, maybe we develop something like that. And then, you know, after 100 years of this, we've been around and we've been, people have been having discussions in the public sphere and people kind of start to talk about whether the AIs themselves do have rights of their own and should be sharing the world with us. And then maybe they do get rights and maybe A-, you know, maybe some AI systems end up voting, or maybe we decide they shouldn't and they don't. And either way, you have this kind of world where there's a bunch of different beings that all have rights and interests that matter and they vote on how to set up the world so that we can all hopefully thrive and have a good time. We have less and less material scarcity so fewer and fewer trade-offs need to be made. That would be great. I don't know exactly where it ends (laughs) or what it looks like, but that does... I don't know. I mean, what, does anything strike you as like, as, as, as unimaginable about that?

    12. DP

      Yeah. The fact that you can have beings that can be copied at will, but also there is some method of voting that-

    13. HK

      Oh, yeah.

    14. DP

      Yeah, yeah. So...

    15. HK

      Yeah, that's a problem that would have to be solved. I mean, we, we have a lot of, today we have a lot of attention paid to, you know, how a voting system works, who gets to vote, and how we avoid things being unfair. And yeah, I mean, it's, it's, it's definitely true that if we had, if we decided there was some kind of digital entity that should have the right to vote and that digital entity was able to copy itself, well you could get some havoc right there, so you'd, you'd wanna come up with some system that maybe restricts how many copies you can make of yourself, or restricts how many of those copies can vote. These are problems that I, that I'm hoping can be handled in a, in a way that, while not perfect, could be non-catastrophic, uh, by a society that hasn't been derailed by some huge concentration of power or misaligned AI systems.

    16. DP

      So that sounds like that might take time.

    17. HK

      Yeah.

    18. DP

      But let's say you didn't have time. So let's say you get a call-

    19. HK

      Yeah.

    20. DP

      ... and somebody says, "Holden, next month my company is developing or deploying a model that might plausibly lead to AGI."

    21. HK

      Yeah.

    22. DP

      What, what does Open Philanthropy do? Wh- what do you do?

    23. HK

      Well, I, I need to distinguish. I mean, the, y- you may not have time, uh, to avoid some of these like catastrophes, like huge concentration of power or AI systems that don't behave as intended and have their own goals. If you can prevent those catastrophes from happening, you might then get more time after you build the AIs to have these tools that help us, you know, help us invent new technologies and help us perhaps figure things out better and ask better questions. And then you could have a lot of time where you could figure out a lot in a little time if, if you had those things. Um, but if someone said, "How, how long did you give me?"

    24. DP

      Um, a month.

    25. HK

      A month.

    26. DP

      Let's say three months. Let, let's say it's a, it's a little bit more...

    27. HK

      Yeah. I would, I would find that extremely scary and I think I would, I think that would do, I kind of, I kind of feel like that's one of the worlds in which I might not even be able to offer an enormous amount. I think, so, so I, my job is in philanthropy and a lot of what philanthropists do historically or have done well historically is we help fields grow. We help do things that operate on very long timescales. So an example of something Open Philanthropy does a lot of right now is we fund people who do research on AI alignment and we fund people who are thinking about what it would look like to get through the most important century successfully. And a lot of these people right now are like, very early in their career and just figuring stuff out and so a lot of the world I picture is like it's 10 years from now, it's 20 years from now, it's 50 years from now, and there's this whole field of expertise that got support when traditional institutions wouldn't support it, and that was because of us. And then you come to me and you say, "We've got one week left. What do we do?" And I'm like, "I don't know. We did what we could do." (laughs)

    28. DP

      (laughs)

    29. HK

      Um, you know, like, go back in time and like try to prepare for this better. So that would be a lot of my answer. I mean, I could say more specific things about what I'd say in the one to three month timeframe but a lot of it would be like flailing around and freaking out, frankly.

    30. DP

      Gotcha. Okay. So maybe we can reverse the question.

  6. 52:36 – 1:00:14

    Competition, Innovation, & AGI Bottlenecks

    1. HK

    2. DP

      You've expressed skepticism towards the competition frame around AI, where, you know, you try to make capabilities go faster for the countries or companies you favor most.

    3. HK

      Yeah.

    4. DP

      Um, but you know, elsewhere you've used the innovation as mining metaphor and maybe you can explain that when you're giving the answer, but it seems like this frame should imply that actually the s- the second most powerful AI company is probably right on the heels of the first most powerful and then so just actually if you think the first most powerful...

    5. HK

      Mm-hmm.

    6. DP

      ... is going to take safety more seriously, you should try to boost them. How do you think about how these two different frames interact?

    7. HK

      I think it's, it's common for people who become convinced that AI could be really important to just jump straight to, "Well, I wanna make sure that people I trust build it first." And that could mean my country, that could mean my friends, uh, people I'm investing in, and I have generally, you know, called that the competition frame. Want to win a competition to develop AI and I've contrasted it with a frame that, that I also think is important which is the caution frame which is that we need to all work together to be careful to not build something that spins out of control and has all these properties and behaves in all these ways we didn't intend. I do think we're likely, if we do develop these very powerful AI systems, I do think we're likely to end up in a world where there's multiple players trying to develop it and they're all hot on each other's heels. And I am very interested in ways to find ways for us all to work together to avoid disaster as we're doing that. And I am maybe less excited than the average person who first learns about this is about like picking the one I like best and helping them race ahead. Uh, although I am somewhat interested in both.

    8. DP

      But if you take the innovation is mining metaphor seriously...

    9. HK

      Yeah.

    10. DP

      ... doesn't that imply that actually the competition is really a big factor here because...

    11. HK

      Oh. So the, the innovation mining metaphor is, is from another, another bit of Cold Takes.

    12. DP

      Yeah.

    13. HK

      It's, it's an argument I make that, that you should think of ideas as being somewhat like natural resources in the sense of once someone discovers a scientific hypothesis or even once, you know, once someone writes a certain great symphony, that's something that can only be done once. That's an innovation that can only be done once.

    14. DP

      Yeah.

    15. HK

      And so it gets harder and harder over time to have revolutionary ideas because the most revolutionary easiest to find ideas have already been found. So there's an analogy to mining. Um, I don't think it applies super importantly to AI thing because all I'm saying is that success by person one makes success by person two harder. I'm not saying that it has no impact or that it doesn't speed things up. So just to use a literal mining metaphor, let's say there's like a bunch of, a bunch of gold in the ground. It is true that if you rush and go get all that gold, it'll be harder for me to now come in and find a bunch of gold. That is true. What's not true is that it doesn't matter if you do it. I mean, you might do it a lot faster than me. You might do it a lot ahead of me. That could...

    16. DP

      Yeah, yeah.

    17. HK

      ... totally be a thing that happens.
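A toy rendering of the mining point just made (my construction, with made-up numbers): two prospectors draw from one finite pool of ideas. The faster one doesn't change the total that gets found, but captures most of it and gets there sooner, so depletion and a meaningful race can both be true.

```python
def race(rate_fast=3.0, rate_slow=1.0, pool=100.0, rounds=50):
    """Two prospectors mine a shared, finite pool of 'ideas'."""
    found_fast = found_slow = 0.0
    for _ in range(rounds):
        take = min(rate_fast, pool)   # the fast prospector digs first
        pool -= take
        found_fast += take
        take = min(rate_slow, pool)   # the slow one gets what remains
        pool -= take
        found_slow += take
    return found_fast, found_slow

print(race())  # (75.0, 25.0): the same 100 ideas get found either way,
               # but the fast mover captures three quarters of them
```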

    18. DP

      Fair enough, fair enough.

    19. HK

      Yeah.

    20. DP

      So maybe one piece of skepticism that somebody could have about transformative AI is that all, all this is going to be bottlenecked by the non-automatable, um, uh, steps in the innovation sequence, so there won't be these feedback loops that...

    21. HK

      Yeah.

    22. DP

      ... speed up. Well, what is your reaction?

    23. HK

      Yeah. I think the single best criticism and my biggest point of skepticism on this most important century stuff is the idea that you could build an AI system that's very impressive, that could do pretty much everything humans can do, but there might be like one step that you still have to have humans do and that could bottleneck everything. And then you could have the, the world not speed up that much and science and technology not advance that fast because AIs are doing almost everything but humans are still like slowing down this one step, or the real world is slowing down one step so you have to do, you know, let's say real world experiments to invent new technologies and they just take how long they take. Um, I think this is the best, the best objection to this whole thing and the one that I'd most like to look into more. But I ultimately think that like there's enough, there's enough reason to think that if you had AI systems that had kind of like human-like reasoning and analysis capabilities, I think you shouldn't count on this kind of bottleneck causing everything to go really slow. And a lot of that, I mean, I write about that in this, in this piece called Weak Point in the Most Important Century: Full Automation. Part of this is just, like, you don't need to automate the entire economy to get this crazy growth loop. You can automate just a part of it that... uh, specifically parts that have to do with, like, very important tech like energy and AI itself. And those actually seem, in many ways, just like, less bottlenecked than a lot of other parts of the economy, so you could have, you could be, like, developing better AI algorithms and AI chips, manufacturing them mostly using robots, using those to come up with even better designs. And then you could also be designing, like, more and more efficient solar panels, using those to collect more and more energy to power your AIs. So you... Like, a lot of the crucial pieces here just actually don't seem all that likely to be bottlenecked. And then also, I mean, at the point where you have something that has the ability to have kind of like creative new scientific hypotheses the way a human does, which is a debate over whether we should ever expect that and when, but once you have that, I think you should figure there's just a lot of ways to get aro- around all your other bottlenecks, because you have this potentially massive population of thinkers looking for them. And so an example is, you know, you could with, with enough firepower, enough energy, enough AI, uh, enough analysis, you could probably find a way to simulate a lot of the experiments you need to run, for example.
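A minimal sketch of the partial-automation argument above (my toy model; the growth coefficients and the sector split are assumptions): if AI and energy feed each other while a human-bottlenecked sector stays flat, the automated loop still self-amplifies, so the explosive dynamic arrives without automating the whole economy.

```python
import math

def two_sector(years=50):
    ai, energy, human_sector = 1.0, 1.0, 1.0
    for year in range(years):
        ai *= 1 + 0.10 * energy      # more energy powers more AI R&D
        energy *= 1 + 0.10 * ai      # better AI designs better chips/solar
        # human_sector (trust, regulation, physical care work) stays at 1.0:
        # it is bottlenecked and never grows in this toy.
        if math.isinf(ai) or math.isinf(energy):
            return f"automated loop exploded by year {year}; human sector still {human_sector}x"
    return ai, energy, human_sector

print(two_sector())  # the AI/energy loop blows up; the flat sector doesn't stop it
```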

    24. DP

      Gotcha. Now it seems like the specific examples you used of, uh, energy and AI innovations, it seems like those are probably, like, the hardest things to automate, given the fact that those are the ones that humanity's only gotten around to advancing most recently. I... Can you, can you talk through the intuition that those might be easier?

    25. HK

      I think some of the stuff that might be hardest to automate would just be stuff that, in some sense, doesn't have anything to do with software or capabilities. So, an example of something that might just be, like, extremely hard to automate might be, like, trust, like, like, you know, making, making a business deal or providing care for someone who's sick. It might just be that, like, even if an AI system has all the same intellectual capabilities as a human, you know, can write poetry just as well, can have just as many ideas, can have just as good a conversation, it just... It doesn't look like a human, so people don't want that. Maybe it can create a perfect representation of a human on a screen, but it's still on a screen. And in general, I see the progress in AI as being mostly on the software front, not the hardware front. So AIs are able to do a lot of incredible things with language, things with math, things with board games. Uh, I wouldn't be surprised if they could write hit music in the next decade or two. But they... People really are not making the same kind of progress on, like, robotics. Like, so, so weirdly, a task that might be among the hardest, harder ones to automate... Or especially if things go fast, it might be hard to automate the task of, like, taking this bottle of water and taking off the cap because I have this, like, you know... I have this hand that is just, like, well-designed for that. Um, well, it's clearly not designed for that, but it's... I, I have these, like, these hands that can do a lot of stuff and we aren't seeing the same kind of progress there. So I think, I think there are a lot of places where, like, AI systems might have kind of... Their brains can do roughly everything human brains can, but there's some other reason they can't do some key economic tasks, and I think these are not the tasks I see likely to bottleneck the R&D as much.

    26. DP

      Gotcha.

    27. HK

      This is, this is an argument I make in, in one of my more obscure Cold Takes posts, but I say, like, you know, AI that could actually take everyone's job, like, like, every human's job, might be, like, a lot harder than AI that could radically transform the galaxy via new technology, because it might be... In some ways, like, it might be easier to take a scientist's job than, like, a teacher's job or a doctor's job because the teachers and the doctors are regulated, and people might just say, "I want human teachers, I don't want an AI teacher." Whereas you can sit there in your lab with your AI scientists and find new theories that change the world. So, some of this stuff, I think, is very counterintuitive, but, but I could... I can imagine worlds where you get, you know, you get really wacky stuff before you get self-driving cars out on the road just because of the way the regulation's working.

    28. DP

      Gotcha. Okay. Let's talk about another weak point, uh, or the one

  7. 1:00:14 – 1:06:04

    Lock-in & Weak Points

    1. DP

      that you identify as a weak point, lock-in. What do you think are the odds of lock-in, given transformative AI?

    2. HK

      So lock-in is my term, or I don't know if it's my term, but it's a term I use, um, to talk about the possibility that we could end up with a very stable civilization, um, and so I talk about that. It's another post. It's called Weak Point in the Most Important Century: Lock-In. I wrote posts about the weakest points in the series. And the idea is basically like, throughout history so far, let's say someone becomes in charge of a government and they're very powerful and they're very bad. Usually this is, like, generally considered to be temporary in at least some sense, like the kind of thing that's not going to go on forever. Um, there's a lot of reasons the world is just dynamic and the thi-... The ways the world is ten- tend to just, like, not stay that way, uh, completely. The world just has changed a lot throughout history. It's kind of a dumb thing to say, but I'll get to why, why this might be important. Um, so, you know, if someone is running a country in a really cruel, corrupt way, I mean, for one thing, at some point they're going to get old and die and someone else is going to take over and that person will probably be different from them. Uh, for another thing, the world is changing all the time. There's new technologies, new things are possible, there's new ideas. And so, you know, the most powerful country today might not be the most powerful tomorrow. The people in power today might bo- not be the ones in power tomorrow. And I think this gets us used to the idea that everything is temporary, everything changes. And a point I make in The Most Important Century series is that you can imagine a level of technological development where there just aren't new things to find and there isn't a lot of new growth to have and people aren't dying because we've... You know, for whatever reason we've, we... You know, that seems like it should be medically possible for people not to age or die. And so you can imagine a lot of the sources of dynamism in the world actually going away if we had enough technology. You could imagine a government that was able to actually surveil everyone, which is not something you can do now, with a dictator who actually doesn't age or die, um, who knows everything going on, who's able to respond to everything. And then you can imagine that world just being completely stable. Um, I think this is a very scary thought and it's something we have to be mindful of, that if the rate of technological progress speeds up a lot, we could quickly get to a world that doesn't have a lot more dynamism and is a lot more stable. What do I think are the odds of this? I don't know. It's very hard to put a probability on it. I think when you think about... If you imagine that we're going to get this explosion in scientific and technological advancement, I think you have to take pretty seriously the idea that that could end by hitting a wall, that there could not be a lot of room for more dynamism, and that we could have these kind of very stable societies. What does "seriously" mean? I don't know, a quarter, a third, a- a half, something like that. I don't know. I'm making up numbers. I think it's serious enough to, to think about it and think about it as something that affects the stakes of what we're talking about here.

    3. DP

      Gotcha. Um, so I- I... Are... I'm curious if you're concerned about lock-in, just from the perspective of locking in a negative future-

    4. HK

      Yeah.

    5. DP

      ... or if you think that might intrinsically be bad to lock in any kind of future. If you could j-

    6. HK

      Yeah.

    7. DP

      ... right now press a button and lock in a reasonably positive future-

    8. HK

      Yeah.

    9. DP

      ... but that won't have any dynamism, or one where dynamism is guaranteed but-

    10. HK

      Yeah.

    11. DP

      ... uh, net expected positive is not, h- how would you make that determination?

Episode duration: 1:56:10
