Dwarkesh Podcast
Steve Hsu - Intelligence, Embryo Selection, & The Future of Humanity
EVERY SPOKEN WORD
150 min read · 30,070 words
- 0:00 – 0:49
Intro
- SHSteve Hsu
(instrumental music plays) The, the big future AI in the, in the singularity looks back and says, "Hey, who gets the most credit for this genomics revolution that happened in the early 21st century?" That AI's gonna find these papers on the arXiv in which we prove-
- DPDwarkesh Patel
(laughs)
- SHSteve Hsu
... this was possible.
- DPDwarkesh Patel
What advice did Richard Feynman give you about picking up girls?
- SHSteve Hsu
(laughs) It, it's very funny because most wokeist people today hate this stuff, but most progressives, like Margaret Sanger or you know (laughs) well, in some sense, forebears of today's wokeists, in the early 20th century, they were all what we would call today eugenicists.
- DPDwarkesh Patel
Today, I have the pleasure of speaking with Steve Hsu. Steve, thanks for coming on the podcast. I'm excited about this.
- SHSteve Hsu
Hey, it's my pleasure. I'm excited too. And I, I just wanna say, I've, I've listened to some of your earlier interviews and thought you were very insightful, which is why I was really excited to have a conversation with you.
- 0:49 – 12:21
Feynman’s advice on picking up women
- DPDwarkesh Patel
That means a lot for me to hear, uh, hear you say because I'm a big fan of your podcast. My first question is, what advice did Richard Feynman give you about picking up girls?
- SHSteve Hsu
(laughs) Wow. Um, so one day in the spring of my senior year, I was walking across campus, and I see Feynman coming toward me. And we knew each other from various things. And it's a small campus, (clears throat) and I was a physics major, and he was my hero, so I guess I had known him since probably freshman year. Um, so he sees me, and, uh, you know, he's got this... I don't know if it's Long... I guess it's a Long Island or it's, it's some kind of New York borough accent, and he says, uh, "Hey, Hsu." This is how he says my name. "Hey, Hsu." And I'm like, "Hi, Professor Feynman." And, uh, so we start talking. And he says to me, um, "Wow, you're kind of a big guy." And I was a lot bigger then 'cause I played on the... I was a linebacker on the Caltech football team, so I was about almost 200 pounds. Uh, I'm a little... just over six feet tall. And, um, so I was pretty, like, a gym rat at that time. And so he's like... I was much bigger than him obviously. He's like, "Wow, you're a big guy, Steve. Uh, I gotta ask you something." And Feynman was born in, like, 1918, so he, he's not really, like, from the modern era. Like, he was, he was, uh... I guess he was going through graduate school when, uh, the Second World War started. And so, to him, the whole concept of a health club, a gym, was like totally... you know, he couldn't understand it. And, um, that was the era... This was the '80s, so that was the era when Gold's Gym was, like, becoming a world w- a national franchise, and so there were gyms all over the place, 24 Hour Fitness and stuff like this. So he didn't know what it was, and he's a very interesting guy. So he, he... his suspicion... He says to me, "What do you guys do there? Is that... Is it just a thing to meet chicks, to meet girls, or do you guys actually tr-... is it really for training? Do you guys really go there to get buff, uh, to get big?" You know, and, and so I started explaining to him. 
I said, "Yes, uh, you know, people are there to get big, but people are all- also checking out the girls, and there is a lot of stuff happening, (laughs) you know, at the, at the health club or in the weight room." And so, you know, he grills me on this for a long time. And one of the famous things about Feynman is that he, he has this laser-like focus so if there's something he really doesn't understand and he wants to get to the bottom of it, he will just focus in on you and just start questioning you and get to the bottom of it. That's the way his brain works. So he did that to me (laughs) for, like, I don't know how long. We were talking about lifting weights and, uh, everything 'cause he didn't know anything about it. And, um, at the end, he says to me, "Wow, Steve, I really appreciate that, you know. Uh, let me, you know, l- let me, let me give you some good advice." And, um, so then he starts telling me, uh, about how to pick up girls and... which I guess he... you know, he's a (laughs) kind of an expert on.
- DPDwarkesh Patel
(laughs) Yeah.
- SHSteve Hsu
And he says to me, he goes, he goes, um... one of the things he says to me, he says, like, "I don't know how much girls really like guys that are as big as you." Like, "I don't... I'm not..." He thought, like, it might be a turn-off actually. And, um, he said, "But you know what? You have a nice smile." So that was, that was the one compliment Fey- you know, he gives me. "You have a nice smile." And then he starts telling me, he says, "You know, the main thing is you... it's a numbers game, okay? You have to divorce your... you have to be totally rational about it. You're never gonna see that girl again, right? You're, you're in an airport lounge or you're, you're, you're at a bar. It's Saturday night in Pasadena or Westwood, and you're talking to some girl," and he says, "You're never gonna see her again. This is your one interaction with her, five-minute interaction. Do what you have to do, and if she for some reason doesn't like you, just go to the next one." And that's what he says. So, uh, you know, and he, he gives a little more... (laughs) some colorful details and stuff. Uh, but the point is, he's like, "You should not care what they think of you. You're, you're trying to do your thing." And, you know, he's a pretty... he had a kind of a reputation at Caltech as a womanizer. I could go into that too, but, uh, I heard this from the secretaries and stuff. (laughs) But, um...
- DPDwarkesh Patel
With the students or with like others-
- SHSteve Hsu
No, no, with sec- secretaries.
- DPDwarkesh Patel
... with the staff? Okay.
- SHSteve Hsu
Mostly secretaries-
- DPDwarkesh Patel
Right, right, right.
- SHSteve Hsu
... who were almost all female at that time. He, he, he had thought about this a lot, and he was just like, "Look, it's a numbers game." Just, I guess, the, the PUA type... Are you familiar with PUA culture, like a-
- DPDwarkesh Patel
Yeah, yeah. Yeah, I know what that is.
- SHSteve Hsu
So the PUA guys would say, like, "Yeah, don't..." You know, it's like an operation. Like, you're, you're just doing something, you follow the algorithm, and you... whatever happens, it's not a reflection on your self-esteem or your internal self-image. It's just that's what happened, and you just go onto the next one. And that was basically the advice (laughs) he was giving me. Um, you know, and he said other things which were pretty standard like, "You know, be funny. You're a funny guy. You know, girls like that. Be confident." You know, just basic stuff. But the main thing I remember was the oper- operationalization of it as an algorithm and that you should just not internalize whatever happens if you get rejected 'cause that's what really hurts. You know, you're a guy, right? When you go across... the bar to talk to that girl. Maybe that doesn't happen in your generation, (laughs) maybe you just, like, swipe in.
- DPDwarkesh Patel
(laughs)
- SHSteve Hsu
But, but we had to go, it was terrifying. Y- We had to go across the bar and talk to some lady and it's loud, and you, you got, like, a few minutes to make your case basically. (laughs)
- DPDwarkesh Patel
(laughs)
- SHSteve Hsu
And, um, nothing hurts more and nothing is more scary than walking across up to the girl, maybe she and her friends or something, right? So, he was just saying, like, "You gotta train yourself out of that. Like, you're never gonna see them again. The phase space of humanity is so big you'll never encounter them again and it just doesn't matter, so just do your best."
- DPDwarkesh Patel
Yeah, that's interesting because I wonder when he w- when... I mean, in the 40s when he was at that age, was he doing this? I don't know what the cultural conventions were at the time, but I don't know, uh, during the... were there bars in the 40s where you could just go hit o- hit on girls or...
- SHSteve Hsu
Oh yeah, absolutely. Absolutely.
- DPDwarkesh Patel
Okay.
- SHSteve Hsu
I mean, if you read literature from that time or even a little bit earlier, like Hemingway or John O'Hara or, you know, they talk about, you know, how men and women interacted in bars and stuff like this in, in New York City. And you know, um, yeah, so that was a thing. That was much more of a thing than I think for your generation. (laughs)
- DPDwarkesh Patel
Yeah, yeah.
- SHSteve Hsu
That, that's what I can't figure out with my kids.
- DPDwarkesh Patel
(laughs)
- SHSteve Hsu
Like, what is going on? (laughs) Like, how do, how do boys and girls meet these days?
- DPDwarkesh Patel
Right.
- SHSteve Hsu
But, uh, back in the day, it was like the guy had to do all the work, and it was like your... the most terrifying thing you could do and, um, you know, and you just have to train yourself out of that.
- DPDwarkesh Patel
Right. Uh, by, by the way when, uh, for, for context for the audience, when, well, Feynman says you were a big guy. Like, you, you were a football player at Caltech, right? And then-
- SHSteve Hsu
Yeah.
- DPDwarkesh Patel
... there's a picture of you actually on, uh, your website where... Uh, maybe this was after, after college or something but yeah-
- 12:21 – 24:54
Embryo selection
- DPDwarkesh Patel
yeah. Okay, so, um, let's talk about your, uh, company, Genomic Prediction. Uh, yeah, do you wanna talk ab- about what this company does? Do you wanna give an intro into what this is?
- SHSteve Hsu
Yeah. So if you don't mind, what I should say, uh, there are two ways to introduce it. One is the scientific view and then the other is the IVF view. And I can kind of do a little of both. So scientifically, the issue is we have more and more genomic data. If you give me a bunch, the genomes of a bunch of people and then you give me some information about each person, like do they or do they not have in- uh, diabetes, or how tall are they, or you know, what's their IQ score or something, then any... All of your listeners will be familiar with AI and machine learning. It's a natural AI machine learning problem to figure out which features in the DNA variation between people are predictive of whatever variable you're trying to predict, whatever pheno- the biological term is phenotype. So this is an ancient scientific question of how do you relate the genotype of the organism, (laughs) the specific DNA pattern, to the phenotype, the actual expressed characteristics of the organism. And if you think about it, this is what biology is. Like once we had the molecular revolution and people figured out that DNA is wh- the thing which stores the information, which is passed along, and evolution selects on the variation in the DNA as it's expressed as phenotype, and as that phenotype affects fitness, okay, or reproductive success, all... That's the whole ballgame for biology. And I'm lucky that as a physicist who's trained in kind of mathematics and computation, I arrived on the scene at a time when we're gonna solve this basic fundamental problem of biology through brute force AI (laughs) and machine learning. So that, that's how I kind of got into this, right? Now you ask as an entrepreneur, like, okay, fine, you, Steve, you're doing this in your office with your post-docs and collaborators on your computers and stuff, but what use is it, right? What, what, what use is all this stuff? The most direct application of this is in the following setting. 
Every year around the world, there are millions of families that go through IVF, typically because they're having some fertility issues and also mainly typically because the mother is older, like, uh, typically in her 30s or maybe 40s. And in the process of IVF, because they use hormone stimulation, they generally produce more eggs. Instead of one per cycle, they might produce, depending on the age of the woman, uh, anywhere between five or 10 or 20 or even, I recently learned, for young women who are hormonally stimulated, if they're egg donors, they could produce 60 or 100 eggs in one retrieval cycle. And then it's trivial. As you know, men produce sperm all the time. We're just producing it. Uh, you can fertilize those eggs pretty easily in a little dish, and you get a bunch of embryos which they grow. Uh, they just start growing once they're fertilized. Now the problem is if you're a family and you produce more embryos than you're going to use, you have what we call the embryo choice problem. You have to figure out like, okay, I have these 20 viable embryos, which one am I gonna use? And so the most direct application of this science that I described is, well, we can now genotype those embryos from a small biopsy, and I can tell you things about the embryos. I could tell you, hey, number four is an outlier for breast cancer risk. I would think carefully about using number four. Number 10 is an outlier for cardiovascular disease risk. You might want to think about not using that one. The other ones are okay. And so that is what Genomic Prediction does, and I think we work with two or three hundred different IVF clinics on six continents now.
- DPDwarkesh Patel
Yeah, yeah. So the super fascinating thing about this is that the, the diseases you talked about, or at least their risk profiles, they're, um, they're polygenic. So you can have thousands of, um, uh, SNPs, single nuclei- nucleotide polymorphisms that determine whether you're gonna get this disease or not. And the, the... So I, I'm really curious to learn, um, like how you were able to transition into this space and like how your knowledge of mathematics and physics was able to help you figure out how to make sense of all this data?
- SHSteve Hsu
Yeah, that's a great question. So you know, first of all, again, like I was kind of stressing like the fundamental scientific importance of all this stuff.
- DPDwarkesh Patel
Yeah.
- SHSteve Hsu
If you go into a slightly higher level of detail, which you were getting at with the individual SNPs or polymorphisms, those are individual locations in the genome where I might differ from you and you might differ from another person. And typically, if you just take pairs of individuals, uh, each human, each pair of individuals will differ at a few million places in the genome, okay? And that's what's controlling. That's why I look a little different than you and, you know... So um-
- DPDwarkesh Patel
Just a little. (laughs)
- SHSteve Hsu
Just, yeah, a little bit. But I mean, yeah, you look better than me, but you know.
- DPDwarkesh Patel
(laughs)
- SHSteve Hsu
Um, the, uh, the question is the following. So what, a lot of times what theoretical physicists do is they, they have a little spare energy, they have some spare cycles, and they, they get tired of thinking about quarks or something, and they wanna like maybe dabble in biology or they want to dabble in computer science or some other field. And the thing that we always have to do as theoretical physicists, we always feel like, oh, I, I have a lot of horsepower, I can figure a lot out, a lot of stuff out. Like for example, Feynman helped de- design the first parallel processors at Thinking Machines. I gotta figure out which problems I can actually make an impact on 'cause other... I can waste a lot of time. Some people spend their whole lives studying one problem, like one molecule or something... or one, you know, biological system, and I don't have time for that. I'm just gonna jump in and jump out. I'm a physicist, right? That, that's a typical attitude among theoretical physicists. So the, the thing that I had to confront about 10 years ago was, I knew the rate at which sequencing costs were going down. So I could anticipate we would get to the day, today, when there are millions of genomes with pheno- good phenotype data available for analysis, so that a typical run for us, a training run, might involve almost a million genomes or half a million genomes or something. So the que- the mathematical question is, what is the most effective algorithm given a set of genomes and phenotype information to build the best predictor? Right? So it's a, it's a- it can be boiled down to a very well-defined machine learning problem, and it turns out for some subset of algorithms that- algorithms, there are theorems. There are, there are actually performance guarantees that tell you, that give you a bound on how much data you need to capture almost all of the variation in the, in the, in the, uh, in the features. 
And so I spent actually a fair amount of time, like probably a year or two studying, uh, these results, very famous results. Some of them were proved by a guy called Terence Tao, who's a Fields medalist, and these are results on something called compressed sensing, which is a penalized form of high-dimensional regression which tries to build sparse predictors. Uh, uh, machine learning people may- might know it as L1 penalized, uh, optimization. And anyway, so the, the point is that we, in the early ... The very first paper we wrote on this was to prove that, uh, using real genomic da- genomic data that these theorems that were very abstract could be applied in order to predict how much data you would need to, quote, "solve individual human traits." So we, we showed that you would need at least a few hu- around a few hundred thousand individuals and their heights, their genomes and their heights, to solve height as a phenotype. And we proved that in a paper using all this fancy math in 2012, I want to say the paper came out, around 2012. And then around 2017 when we got a hold of half a million genomes, we were able to implement it in practical terms and show that our mathematical result from some years ago was correct. And we, the, the transition from low, uh, performance of the predictor to high performance, there's a kind of what we call a phase transition boundary between those two domains occurred just where we said it was gonna occur. So some of these technical details are really just not understood even by practitioners in computational genomics who are not quite that mathematical. They don't understand actually these results that in our earlier papers, they don't really know why we can do stuff that other people can't do or why we can predict how much data we're gonna need-
- DPDwarkesh Patel
Mm-hmm.
- SHSteve Hsu
... to do stuff. It's not well appreciated even in the field. But, um, if you look carefully when the, the, the big future AI in the, in, in, in our future, in the singularity, looks back and says, "Hey, who gets the most credit for this genomics revolution that happened in the early 21st century?" They're gonna find, that AI is gonna find these papers on the arXiv in which we proved (laughs) this is possible and then five years later we did it and et cetera, et cetera. Right now it's underappreciated, but the, the future AI that Roko's basilisk AI when he looks back is gonna give me a little bit of credit for it.
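The L1-penalized regression Hsu mentions can be illustrated with a small sketch. This is not the Genomic Prediction pipeline: it is a toy on synthetic data, using plain proximal gradient descent (ISTA) as one standard way to solve an L1-penalized least-squares problem. The sample size, number of SNPs, effect sizes, noise level, and penalty `lam` are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "genotypes": n people, p SNPs, each coded 0/1/2 (all values assumed).
n, p, k = 200, 300, 5
G = rng.binomial(2, 0.3, size=(n, p)).astype(float)
X = (G - G.mean(axis=0)) / G.std(axis=0)  # standardize each SNP column

# Only k SNPs truly affect the phenotype (a sparse, additive model).
true_idx = rng.choice(p, size=k, replace=False)
beta_true = np.zeros(p)
beta_true[true_idx] = np.array([1.0, -1.0, 1.0, -1.0, 1.0])
y = X @ beta_true + 0.1 * rng.standard_normal(n)
y = y - y.mean()


def soft_threshold(v, t):
    """Shrink each entry toward zero by t; entries smaller than t become 0."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def lasso_ista(X, y, lam, n_iter=2000):
    """Minimize (1/2n)||y - X b||^2 + lam * ||b||_1 by proximal gradient."""
    n_samples, n_features = X.shape
    lipschitz = np.linalg.norm(X, 2) ** 2 / n_samples  # largest eigenvalue of X^T X / n
    step = 1.0 / lipschitz
    beta = np.zeros(n_features)
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n_samples
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta


beta_hat = lasso_ista(X, y, lam=0.2)
selected = np.flatnonzero(beta_hat)
print("true causal SNPs:    ", sorted(true_idx.tolist()))
print("SNPs with nonzero wt:", selected.tolist())
```

The penalty is what makes the predictor "grudging": a SNP gets a nonzero weight only when its signal exceeds the threshold, and most weights stay exactly zero.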
- DPDwarkesh Patel
(laughs) Yeah, yeah. So I, um, I was kind of a little interested in this, uh, a few years ago. And then at, at that time I looked into like how these polygenic risk scores are calculated and it was basically you just find the correlation between the phenotype and the, the, the, the alleles that correlate with it and you just add up how many copies of these alleles you have, what is the correlation? So it seemed like, uh, and you just do a weighted sum of that. So that seemed like a very, um, uh, that just seems super simple, es- especially in an era where we have all this, uh, machine learning and ... But it seemed like they were getting good predictive results out of that. So what is the delta between how good you can get with all this fancy mathematics versus just like a very simple, like sum of correlations?
- SHSteve Hsu
Yeah. So y- yeah, you're absolutely right that the ultimate models that are used when you've done all the training and your- the dust settles, the models are very simple. They have an additive, uh, structure. So it's basically like I either assign a nonzero weight to this particular region in the genome or I don't. And then I need to know what is the weighting, but then the, the function is a linear function of, it's an additive function of the state of your genome at some subset of positions. So the ultimate model that you get is very simple. Now, if you go back 10 years when we were doing this, there were lots of claims that it was going to be super non-linear. That it wasn't gonna be additive the way I just described. It was gonna, there were gonna be lots of interaction terms between regions. People, some biologists are still convinced that's true even though like we already (laughs) know, like we have predictors that don't have interactions. Okay. The other question which is more technical is that there, in any small region of your genome, the state of the individual variants is highly correlated because you inherit them in, in, in chunks. And so you need to figure out which one of those you want to use. You don't want to activate all of them because you might be over counting. So that's where this L1 penalization, uh, sparse methods, they force the predictor to be, uh, you know, sparse and that is a, a key step. Otherwise, you might over count. You might have 10, 10 different variants close by that have roughly the same statistical significance if you just do some simple regression math. But then you don't know which one of those 10 to use, and you might be over counting effects or under counting effects. 
So, so this, what you end up doing is a super high dimensional optimization where you, you, you only activate, you grudgingly activate a SNP when the signal is strong enough and you d- once you activate that one, the algorithm has to be smart enough to, to penalize the other ones nearby and not activate them because you're over counting effects if you do that. So there's a little... little bit of subtlety into it- in it, but the m- the main point which you made, which is that the ultimate predictors, which are very simple and additive, just sums over effect sizes times states, actually work really well. And that is related to a deep statement about the additive structure of the genetic architecture of individual differences. So in other words, it's kinda weird that the ways that I differ from you are merely just 'cause I have more of something and you have less of something, and it's not like, "Oh, these things are interacting in some super un-" you know, incredibly un-understandable way. And so that's a very deep thing, which again, is not appreciated that much by biologists yet, but over time, I think they're gonna figure out that there's something interesting here.
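The "sums over effect sizes times states" predictor described above can be sketched in a few lines. The SNP names, weights, and genotypes below are hypothetical values made up for illustration; a real predictor would have thousands of trained weights.

```python
# Hypothetical trained weights: effect size per copy of the counted allele.
weights = {"snp_a": 0.5, "snp_b": -0.25, "snp_c": 0.125}


def polygenic_score(genotype, weights):
    """Additive score: sum of (effect size) x (allele count 0, 1, or 2).

    SNPs missing from the genotype are treated as 0 copies (an assumption).
    """
    return sum(w * genotype.get(snp, 0) for snp, w in weights.items())


# Two hypothetical individuals, each with allele counts at the same SNPs.
person_1 = {"snp_a": 2, "snp_b": 1, "snp_c": 0}
person_2 = {"snp_a": 0, "snp_b": 2, "snp_c": 1}

score_1 = polygenic_score(person_1, weights)
score_2 = polygenic_score(person_2, weights)
print(score_1, score_2)
```

There are no interaction terms: each SNP contributes independently, which is the additive structure Hsu is pointing at.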
- 24:54 – 34:48
Why hasn't natural selection already optimized humans?
- DPDwarkesh Patel
Right. No. I thought that was p- that was super fascinating, and I- I- I commented on that- about that on, uh, Twitter. Uh, e- what- what is really interesting about that is, um, I guess two things. One is you have this really interesting evolutionary argument about why that would be the case. Uh, y- y- you might wanna explain. And the second is, it makes you wonder, um, if it's just- if just becoming more intelligent is just a matter of, like, turning on certain SNPs, it's not a matter of, like, all this, uh, incredible optimization that, you know, y- y- it's- it's like solving a Sudoku puzzle or anything. If- i- if that's the case, then why aren't we already, um... Why hasn't the human population already been selected to be maxed out on all these traits if it's just a matter of a bit flip?
- SHSteve Hsu
Yeah. So okay, so the first issue, which is how- why- you know, why is this a- why is this genetic architecture so simple, surprisingly simple? And again, 10 years ago, we didn't know it was gonna be simple, so when we were- when I was checking to see whether this was a field th- that I should go into because either we are capable or not capable of making progress, we had to study the more general problem of the non-linear possibilities as well. But eventually, we realized that probably most of the variance was gonna be captured in an additive way so, you know, we could narrow down the problem quite a bit. There are evolutionary reasons for this. There's a famous theorem by Fisher, who's the father of population genetics and also of really f- what you call frequentist statistics. And so Fisher proved something called the fundamental- Fisher's fundamental theorem of natural selection (laughs) , which says that if you impose some selection pressure on a population, the rate at which that population responds to the selection pressure, like say, like it's the bigger, uh, rats that out-compete the smaller rats. At what rate does the rat population then start getting bigger? He showed that it's dominated by the additive variance, that that dominates the rate of evolution. And it's easy to understand why. If it's a- if it's a non-linear mechanism that you need to make the rat bigger, when you sexually reproduce and that gets chopped apart, you might break the mechanism. Whereas if each little, uh, allele has its own independent effect, you can just- you can just, uh, inherit them without worrying about breaking the mechanisms. So it was well-known for (laughs) , you know, at least among a tiny population of theoretical population bio- biologists that- that additive variance was the dominant way that, uh, populations would respond to selection. So that was already known. 
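For reference, the standard textbook forms of the results alluded to here can be written out. The notation is an assumption, not something used in the conversation: $\bar{w}$ is mean fitness, $\sigma_A^2$ is additive genetic variance, $\sigma_P^2$ is total phenotypic variance, $S$ is the selection differential, and $R$ is the per-generation response. The second line is the breeder's equation, which is a separate standard result but is closer in spirit to the rat-size example.

```latex
% Fisher's fundamental theorem (discrete-generation form):
% the change in mean fitness is set by the additive variance in fitness.
\Delta \bar{w} = \frac{\sigma_A^2(w)}{\bar{w}}

% Breeder's equation: response of a trait to selection depends on
% narrow-sense heritability, i.e. the additive share of the variance.
R = h^2 S, \qquad h^2 = \frac{\sigma_A^2}{\sigma_P^2}
```

In both, only the additive part of the genetic variance appears, which matches the claim that additive variance dominates the rate of response.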
And the other thing is that humans have been through a pretty tight bottleneck, and we're not that different from each other. So it's very plausible to me that if I wanted to edit an embryo, a human embryo and make it into a frog, then there's all kinds of non-linear subtle things I have to do. But all those very non-linear complicated subsystems are fixed in humans. You have the same system as I do. You have the non- the human not frog or ape not frog (laughs) version of that region of DNA, and so do I. But the small ways in which we differ are just these little additive switches, mostly little additive switches. And so that's the- this deep scientific discovery over the- from the last, say, 5, 10 years of work in this area. Now, um, you were asking about why evolution hasn't, like, completely, quote, "optimized" all traits in humans already. Now, I don't know if you ever do deep learning or very high-dimensional optimization, but you realize, like, in that high-dimensional space, you're often moving on a surface which is slightly tilted, so you're getting gains, but it's also kind of flat. So even though you, like, scale up your compute or data size by an order of magnitude, you don't move that much farther. You get some gains, but you're never really at the global max of anything in this- in these high-dimensional spaces. I- I don't know if that makes sense to you, but- but it's quite plausible to me that two things are important here. One is o- evolution has not had that much time to optimize humans. And what do you mean by optimization? Because the environment that humans live in has changed radically in the last 10,000 years. Like, for a while, we didn't have agriculture. Now, we have agriculture (laughs) . Now, we have swipe left if you wanna have sex tonight. You know, we... The- the- the environment didn't stay fixed. And so when you say, like, fully optimized for the environment, what- what do you mean? 
Um, the ability to diagonalize matrices might not have been very adaptive, uh, 10,000 years ago.
- DPDwarkesh Patel
(laughs)
- SHSteve Hsu
Uh, and might not even be adaptive now.
- DPDwarkesh Patel
(laughs)
- SHSteve Hsu
But- but anyway, so- so it's a complicated question. I- One can't reason that naively about, "Oh, well, if- if God wanted us to be 10 feet tall, we'd be 10 feet tall," or, "If- if it's better to be smart, my brain would be, like, this- this big or something." So you can't reason that naively about, uh, stuff like that.
- DPDwarkesh Patel
I see. Yeah. Okay. So I guess it could make sense, for example, with certain health risks, like the thing that makes you more likely to get di- diabetes or heart disease today might be... I- I don't know what the, uh, the pleiotropic effect of that could be, but may- maybe that's not that important when you're not that obese.
- SHSteve Hsu
Let me just point out that- that- that most of the diseases that we care about now, most of them, not the rare ones, but the common ones, they manifest when you're, like, 50, 60, 70 years old. And there was never any evolutionary, uh, big advantage, I think, of being super long-lived, right? So there- there's even a debate about whether, like, okay, if the grandparents are around to help raise the kids, that- that raises the fitness a little bit of the family unit. But- but- most of the time in the pa- and most of our evolutionary past, humans just died (laughs) , you know, fairly early, and so a lot of these diseases would never have been optimized against evolutionarily. But we see them now because we live under such good conditions, uh, we can, uh, you know, people regularly approach 80 or 90 years.
- DPDwarkesh Patel
Regarding the linearity and additivity point, I was go- uh, I was gonna make the analogy, and I'm curious if this is valid, but when you're programming, one thing that's good practice is to have all the implementation details in separate function calls or separate programs or something, and then have your main l- uh, have your main, uh, uh, you know, uh, loop of operation just be, call different functions, like do this, do that. So that you can easily comment stuff away or, uh, change arguments, and this seemed a l- very similar to that where you have, uh, wh- where just by turning these things on and off you can, uh, change what the, the next offspring is gonna be, uh, and you don't have to worry about, like, actually implementing the, uh, what- whatever the, uh, underlying mechanism is.
- SHSteve Hsu
Well, what- what you said is related to what Fisher proved in his theorems, which is that, you know, if- if suddenly it becomes advantageous to have X, like white fur instead of black fur or something, um, it would be best if there were little levers that you could, you could move somebody from black fur to white fur continuously by just modifying those switches in- in an additive way. It just turns out with, for sexually reproducing species where the DNA gets scrambled up in every generation, it's better to have switches of that kind. Um, and so th- the other point related to your software, uh, analogy is that there are, there seem to be modular, fairly modular things going on in, uh, the genome. So when we looked at, we were the first group to, I think we had, for like initially like say 20 major disease conditions we had decent predictors for, and we just started looking carefully at the, just something as trivial as the overlap of, you know, my sparsely trained predictor turns o- uses these features for diabetes, but it uses these features for schizophrenia, and how much overlap? Just- just the stupidest metric is like how much overlap or variance accounted for overlap is there between pairs of disease conditions, and it's very modest. It's- it's actually the opposite of what naive biologists would say when they talk about pleiotropy or, they're- they're just disjoint. They're just disjoint regions of your (laughs) genome that are governing certain things. And so why not? You have three billion base pairs, there's a lot you can do in there, there's a lot of information in there. (laughs) So you can have, if you need a thousand to control diabetes risk, I can have, I think I estimated you can easily have a thousand roughly independent traits that are just disjoint in their genetic dependencies, and so if you think about, like, D&D, like your strength and your dex and your wisdom and your intelligence and charisma, those are all disjoint. 
They're- they're all just independent variables, so it's like a seven-dimensional space that your character lives in. Well, there's enough information, uh, in the few million differences between me and you, there's enough for a thousand-dimensional space of variation. Like, oh, how big is your spleen? My big- my spleen's a little bit smaller, yours is a little bit bigger, that can vary independently of your IQ. Oh, it's big surprise. The size of your spleen can vary independently of the size of your big toe. Oh, yeah, yeah. There's about a thousand, if you just do information theory, there's about a thousand (laughs) different parameters I can vary independently with the number of variants that I have between me and you. So, and- and this thing, because you understand some information theory, uh, is obvi- is like kind of trivial to explain, but try explaining it to a biologist. Y- you won't get very far.
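A minimal sketch of the overlap check described in this answer, in Python. The SNP identifiers and both feature sets are invented for illustration, and Jaccard overlap is just one simple choice of metric; the group's actual predictors and its variance-weighted measure are not reproduced here.

```python
def jaccard_overlap(features_a, features_b):
    """Fraction of SNPs shared between two sparse predictors (0.0 means disjoint)."""
    a, b = set(features_a), set(features_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical SNP IDs picked out by two sparsely trained predictors.
diabetes_snps = {"snp_001", "snp_014", "snp_230", "snp_777"}
schizophrenia_snps = {"snp_014", "snp_305", "snp_412", "snp_900"}

overlap = jaccard_overlap(diabetes_snps, schizophrenia_snps)
print(f"shared fraction: {overlap:.3f}")
```

A value near zero for most pairs of conditions is what "mostly disjoint" would look like under this metric.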
- DPDwarkesh Patel
Yeah, yeah. The, do the log2 of the- the number of, uh, uh-
- SHSteve Hsu
That's all. (laughs)
- DPDwarkesh Patel
Is that basically how you do it? Yeah. Okay.
- SHSteve Hsu
That's all it is. I mean-
- DPDwarkesh Patel
Okay.
- SHSteve Hsu
Well- or- I mean, well, we, it's in our, it's in our paper, like we- we basically look at, okay, how many, how many variants are typically accounting for most of the variation for any of these major traits? And then imagine that they're mostly disjoint, well, just how much length of DN- how many variants do you need then to independently vary, uh, a thousand traits? Well, it's- it's a few million differences between me and you are enough, right? So...
- DPDwarkesh Patel
Right.
- SHSteve Hsu
Um, it's trivi- it's very trivial math. Once you understand the base, how- how to reason about information theory, then it- it's very trivial, but, uh, it ain't trivial for theoretical biologists (laughs) as far as I can tell.
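The counting argument in this exchange can be written out as simple arithmetic. The numbers below are placeholders chosen to match the rough figures in the conversation (a few million differences between two people, on the order of thousands of variants per trait); they are assumptions, not values taken from the paper.

```python
def max_disjoint_traits(n_differing_variants, variants_per_trait):
    """How many traits fit if each trait uses its own disjoint set of variants."""
    return n_differing_variants // variants_per_trait


n_differences = 3_000_000  # assumed: "a few million differences between me and you"
per_trait = 3_000          # assumed: a few thousand variants per trait

print(max_disjoint_traits(n_differences, per_trait))  # 1000
```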
- DPDwarkesh Patel
But-
- 34:48 – 43:53
Aging
- DPDwarkesh Patel
but the result is so interesting because I remember reading in The Selfish Gene that, like, he hypothesizes, um, the reason we have aging or one of the possible reasons we have aging is that, um, there's antagonistic pleiotropy. What, there's, um, there- there's something that makes you healthier when you're young and fertile that makes you unhealthy when you're old, and evolution would have selected for such a trade-off because tha- tha- that when you're young and fertile is when evolution, uh, and your genes care about you. And so it, but if there's enough space in the genome for you- for- where these trade-offs are not nece- necessary, then this may be like a bad explanation for aging, or do you think I'm straining the analogy?
- SHSteve Hsu
No, no, you're- you're, it's- it's, uh, I love your interviews because, uh, the point you're making here is ex- is really good. So Dawkins, who is a kind of evolutionary theorist but from the old school when they had almost no data, okay, to deal, you know, you can imagine how much data they had compared to today. He would like to tell you a story about a particular gene that maybe it has this positive effect when you're young, but it makes you age faster, so there's a trade-off. And, you know, we know about things like sickle cell anemia, and, you know, we- we know stories like that, and no doubt there are stories like that which are true about specific vari- variants in your genome. But that's not the general story. (laughs) The general story, which we only discovered in the last five years, is that almost every trait is controlled by thousands of variants, and those variants tend to be disjoint from the ones that control the other trait. So they weren't wrong, but they didn't have the big picture.
- DPDwarkesh Patel
Yeah, I see. Um, and, uh, and then, yeah. So you had this paper, I think, um, w- it was called Polygenic Health Index for General Health and Disease Risk, and then you showed that with 10 embryos, you could increase, uh, disability-adjusted life years by, uh, four, which i- which is, like, a huge increase of, like, if you think about, like, if- if you could just live four years longer in a healthy state.
- SHSteve Hsu
Yeah. What's the value of that? What would you pay to buy that for your kid?
- DPDwarkesh Patel
Right. Yeah. But, uh, I don't know. This seems like, um ... This, uh, uh, going back to that earlier question about the trade-offs, or, like, about why this hasn't already been selected for, the i- if you're right and there's no, like, trade-off to do this, um, just living four years older, even if that's b- beyond your fertility, just, like, being a gr- grandpa or something, that seems like, uh, an unmitigated good. So why ... Uh, w- it's kind of mysterious that that hasn't already been, uh, you know, selected for.
- SHSteve Hsu
So, no. I'm glad you're really asking about these questions 'cause these are things that people are very confused about, you know, even in the field. So, first of all, let me say, if you have a trait that's controlled by 10,000 variants, okay, like height is controlled by over 10,000 variants and probably cognitive ability a little bit more, the square root of 10,000 is 100. Okay? So if I could come to this little embryo and I said, "I wanna give it one extra standard deviation of height, plus one standard deviation of height," I only need to edit 100. I only need to flip 100 minus variants to plus variants. These are very rough numbers, but, you know, one standard deviation is like the square root of n, right? If I flip a coin n times and I want a better outcome in terms of the number of heads, and I wanna increase it by one standard deviation, I only need to flip about square root of n more heads, because if you flip a lot, you're gonna get a very narrow distribution peaked around a half, and the width of that distribution is like the square root of n. Okay. So once I tell you, "Hey, your height is controlled by 10,000 variants and I only need to flip 100 genetic variants to make you one standard deviation taller," which for a male would be two and a half or three inches taller, suddenly you realize, wait a minute, there's a lot of variance up for grabs there. I mean, if I could flip 500 variants in your genome, I would make you five standard deviations taller. You'd be, like, seven feet tall, and I didn't have to do that much work, and there's a lot more variation where that came from. Okay? Because I only flipped 500 out of 10,000. I could have flipped even more, right? So there's this kind of quasi-infinite well of variation which evolution or genetic engineers could act on. 
And again, the early population geneticists who bred, who breed corn, who breed animals, they know this. This is actually something they explicitly know (laughs) about because they've done calculations. Now interestingly, the human geneticists, who are mainly concerned with diseases and stuff, are often not familiar with the math that the animal breeders already know. And you might be interested to know that the milk you drink comes from heavily genetically optimized cows, who are actually bred artificially, and they're using almost exactly the same technologies that we use at Genomic Prediction, but they're doing it to optimize, like, milk production and stuff like this. So, there is a big well of variance. It's a consequence of this super polygenicity of the trait. Coming back to your (clears throat) question about longevity, it does look like people could, quote, "be engineered" to live much longer than they currently do, by just, say, flipping the variants that reduce risk for individual diseases that tend to shorten your life. And then the question is back to, "Well, why didn't evolution (laughs) give us lifespans of a thousand years like back in the Bible? People in the Bible used to live for a thousand years." I mean, that probably didn't really happen. But the question is, you have this very high dimensional space and you have a fitness function, and how big is the slope in a particular direction of that fitness function? Like, how much more successful reproductively would Joe Caveman have been if he lived to be 150 instead of only, you know, 100 or something? And there just hasn't been enough time to, you know, explore this super high dimensional space. That's the actual answer. But now we have the technology. 
We're gonna fucking explore it fast now. That's the, that's the point that people, you know, the big light bulb is, should go off. Like, no, we, we're mapping this space out now. Pretty confident in 10 years or so that CRISPR gene editing technologies will be ready for massively multiplexed edits, and we're gonna start navigating in this high dimensional space as we like. So that's the, that's the more long-term consequence of these scientific, uh, insights.
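The square-root-of-n argument in this answer can be checked with a toy model, sketched here in Python. The model is an assumption made for illustration: every variant is independently "plus" with probability 1/2 and all effects are equal, which real traits do not satisfy exactly.

```python
import math


def plus_count_sd(n_variants, p_plus=0.5):
    """SD of the number of 'plus' variants under independent, equal-effect loci."""
    return math.sqrt(n_variants * p_plus * (1 - p_plus))


def flips_for_shift(n_variants, n_sd, p_plus=0.5):
    """Minus-to-plus flips needed to raise the plus count by n_sd standard deviations."""
    return math.ceil(n_sd * plus_count_sd(n_variants, p_plus))


n = 10_000
print(plus_count_sd(n))       # 50.0
print(flips_for_shift(n, 1))  # 50
print(flips_for_shift(n, 5))  # 250
```

In this toy model the exact figure is sqrt(n)/2, so 50 flips per standard deviation rather than the rough 100 quoted in the conversation. The order of magnitude and the main point are the same: the number of edits scales with the square root of n, a small fraction of the 10,000 variants.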
- DPDwarkesh Patel
Yeah. That's super interesting. And what do you think will be the plateau for a trait like, uh, you know, how long you live? What- what ... is it ... 'cause before is, I guess, with the current data and techniques, you think it could be significantly greater than that or?
- SHSteve Hsu
Well, we did a very simple calculation which, which amazing that it gives the, kinda the right result. We said, like, given this polygenic predictor, uh, that we built, which isn't perfect. I mean, we're gonna, it's gonna improve a lot as we get more data. Given this polygenic predictor for overall health, uh, which is used in selecting embryos today, if you just say, like, "Well, out of a billion people, what's the best person? Typically, what would their score be on this index? And then how long would they be predicted to live?" It's about 120 years. So it's actually spot on. It's- it's (laughs) basically that's- that's one in a billion type person lives to be, like, 120 (laughs) years old or so, roughly. Um-... the, how much better can you do? Probably a lot better. Pro- probably, I mean, I, I don't wanna speculate, but, 'cause, uh, other effects, non-linear effects, things that we're not taking into account will start to play a role at some point. So it's a little bit hard to estimate what the limiting, the, what the true limiting factors will be. But the one statement which is super robust, and I'll stand by it, I'll debate any Nobel Laureate in biology or whatever (laughs) wants to talk about it, um, there's clearly a lot of variance available to be selected on or edited. There's c- that's just, there's no question about that. And that's been, that's been established in animal breeding and plant breeding for, you know, a long time now. So we can ... You, if you want a, a chicken that grows to be this big instead of this big, you can do it. If you want a cow that produces literally 10 times or 100 times (laughs) more milk, uh, than a regular cow, you can do it. The egg you ate for breakfast this morning? Those bioengineered chickens, they lay almost an egg a day. A chicken in the wild lays, like, an egg a month. How the hell did we do that?
- DPDwarkesh Patel
Right, right.
- SHSteve Hsu
By genetic engineering, that's how we did it. So ...
- DPDwarkesh Patel
Yeah, yeah. And that was just brute, uh, brute r- artificial selection. There, there, no fancy, uh, (laughs) machine learning there. Uh-
- SHSteve Hsu
Uh, last ten years, last ten years it's gotten sophisticated-
- DPDwarkesh Patel
Oh, really?
- SHSteve Hsu
... machine learning, genotyping of chickens, um, artificial insemination, um, modeling of the traits using ML. Last ten years has-
- DPDwarkesh Patel
Oh, okay.
- SHSteve Hsu
It's basically, for cow breeding now it's totally, totally done by ML now.
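The "best person out of a billion" estimate from earlier in this section can be sketched as a tail quantile. This assumes the health index is normally distributed across the population, which is an idealization; the step that converts a score into a predicted lifespan depends on the published predictor and is not reproduced here.

```python
from statistics import NormalDist


def top_one_in(n_people):
    """z-score exceeded by roughly one person in n_people under a normal model."""
    return NormalDist().inv_cdf(1 - 1 / n_people)


z = top_one_in(1_000_000_000)
print(f"one-in-a-billion score: about {z:.1f} SD above the mean")
```

Under this assumption the one-in-a-billion individual sits roughly six standard deviations above the mean on the index.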
- 43:53 – 53:50
First Mover Advantage
- SHSteve Hsu
- DPDwarkesh Patel
I, I had no idea. That's super interesting. Um, what is, uh ... So you mentioned that you're accumulating, um, data and improving your techniques over time. Is there a first mover advantage to, uh, a, you know, a genomic prediction company like this? Or is it, is it just whoever has the newest, best, uh, algorithm for going through the biobank data?
- SHSteve Hsu
Yeah, that's another (laughs) super question. So for the entrepreneurs in your audience, I would say in the short run, if you ask, like, "What should the valuation of GP be?" (laughs) You know, 'cause that's how the venture guys would want me to answer the question. There is a huge first mover advantage, because what's important is the channel relationships between us and the clinics. And nobody's gonna be able to get in there very easily when they come later, because we're developing trust and a big track record with clinics all over the world, and we're well known. Could 23andMe or some company that has a huge amount of data, if they were to actually get better AI/ML people working on this, could they kind of blow us away a little bit and build better predictors 'cause they have just much more data than we do? Possibly, yes. Now, there's a core expertise that we have from doing this kind of work for years and years and years, and we're just really good at it. And so even though we don't have as much data as 23andMe, we might still... our predictors are better than theirs right now. And I'm out there all the time working with biobanks all around the world, in countries like, I don't wanna say all the names, but other countries, trying to get my hands on as much data as I can. But there may not be a lasting, (smacks lips) you know, advantage beyond the actual business channel connections to that particular market. It may not be a defensible, purely scientific moat around the company. We do have patents on specific technologies about how to do the genotyping or how to do error correction on embryo DNA and stuff like this. But this general idea of, like, who's gonna be the best at predicting human traits from DNA, it's unclear who's gonna be the winner in that race. 
Maybe it'll be the Chinese government in 50 years. I, who knows?
- DPDwarkesh Patel
Yeah, that's interesting. I mean, y- if you think about, like, a company like Google, like, it's, it theoretically it's possible you could come up with a better algorithm than PageRank and then beat them. But it, it just seems like probably the engineer at Google is gonna be the one that comes up with whatever edge case or whatever improvement is possible.
- SHSteve Hsu
That's exactly ... See, it's exactly what I would say. I would say, like, yeah, maybe, I mean PageRank actually by now is totally deprecated, but-
- DPDwarkesh Patel
Right.
- SHSteve Hsu
But, um, even if somebody else comes up with a somewhat better algorithm or somewhat better, maybe they have a little bit more data, if you have a team that's been doing this for a long time and you're really focused and good, it's still tough to beat you, especially if you have a lead in the market.
- DPDwarkesh Patel
Yeah, yeah. And then so what layer of the stack, uh, do you ... So you guys, are, are you guys doing the actual biopsy or is it just that they upload the genome and you're the one processing and just giving recommendations? Is it y- is it just, like, an API v- call basically or ...
- SHSteve Hsu
So, uh, it's great. I love your question. So, um, it is totally standard. Every good IVF clinic in the world regularly takes embryo biopsies. So that's totally standard. It's, like, a lab tech doing that. Okay? And then what happens is they take the little sample and they put it on ice and they just ship it. And the DNA as a molecule is extremely robust and stable. In fact, my other startup solves crimes that are 100 years old, uh, from DNA that we get from some semen stain on some-
- DPDwarkesh Patel
Mm.
- SHSteve Hsu
... rape victim, vic- you know, victim, serial killer victim's bra strap. We've done stuff like that. Uh-
- DPDwarkesh Patel
Jack the Ripper, when, whenever are we gonna solve that mystery?
- SHSteve Hsu
If they can give me samples, uh, we can get into that. Uh, that's-
- DPDwarkesh Patel
(laughs)
- SHSteve Hsu
For example, we just learned that you can recover DNA pretty well from s- the, like, if someone licks a stamp and puts it on-
- DPDwarkesh Patel
Mm.
- SHSteve Hsu
... on their correspondence. So the ... I mean, if you can do Neanderthals, you can do a lot for solving crimes
- NANarrator
(laughs) Yeah.
- SHSteve Hsu
... like per- how we're ... So in the IVF, uh, workflow, our lab, which is in New Jersey, can service every clinic in the world because they just, they just take the biopsy, they put it in a standard shipping container and they just send it to us. And then we, we actually genotype the DNA in our lab, but we've actually trained a few of the bigger clinics-... to actually do the genotyping on their site. And at that point, it's just like they upload some data into the cloud, and then they get back some stuff from our platform. So at that point, we're... it's gonna be the whole world, man. Every human, every human (laughs) who wants their kid to, you know, be healthy and, you know, get the best they can, it's gonna... that data's gonna come up to us and it's gonna... the report is gonna come back down to their IVF physician.
- DPDwarkesh Patel
Right, yeah. Which is great, because let's say you think there's a potential that this technology might get regulated in some way. You could just have them go to Mexico or something and upload the genome from there. You know, you don't care where they upload it from. And then you can just get the recommendations there.
- SHSteve Hsu
Yeah, I think we're gonna evolve to a point where because the, this, the genotyping technology's getting better and better, eventually we're gonna be out of the wet part of this business, and only in the bit and cloud part of this business. Because eventually, the clinic, no matter where it is, they're gonna have a little sequencer which is this big. And their tech is gonna do it, and then they're just gonna hit upload, and, um, then they get the report back like three seconds later from us, uh, the, for the physician to look at, and the parents can look at it on their phone or whatever. Actually we're, we're basically there actually with some clinics. So, uh, yeah, it's gonna be tough to regulate because it's just bits, right? So you have the bits, and you're in some repressive, terrible country, you know, that doesn't allow you to select for some special traits that people are nervous about, but you just upload it to some vendor who's in Singapore or in, uh, you know, some free country and (laughs) and they-
- DPDwarkesh Patel
(laughs)
- SHSteve Hsu
... they give you the report back. Doesn't have to be us. I mean, we're not... we don't do those, we don't do the edgy stuff. We only do the health-related stuff right now.
- DPDwarkesh Patel
Yeah, yeah.
- SHSteve Hsu
But if you wanna know how tall i- this embryo's gonna be, I'll tell you a mind-blower. When you do face recognition in AI, you're basically mapping someone's face into a parameter space of like, I think it's on the order of hundreds of parameters, right? Each of those parameters is super heritable. So in other words, if I take two twins and I, I measure their... I photograph them and the, the algorithm gives me the value of that parameter for twin one and twin two, they're very close, obviously. That's why I can't tell the two twins apart. The, the, the face recognition can ultimately tell the twins apart, the really good face recognition, but you can, you can just conclude that almost all these parameters are the same for those twins. So it's highly heritable. So we're gonna get to a point soon where I can do the inverse problem where I have your DNA and I predict each of those parameters in the face recognition algorithm, and then from that, I reconstruct the face. So I say, like, "This embryo, when she's 16, this is what she's gonna look like. When she's 32, this is what she's gonna look like." And I- I'll be able to do that for sure. It's just a data... you know, it's just a, a, AI/ML problem right now, but, but the basic biology is clearly gonna work.
- DPDwarkesh Patel
Right, right, right.
- SHSteve Hsu
So, um, so then you're gonna be able to say like, "Oh, look, here's a report. Look, let's... embryo four is so cute."
- DPDwarkesh Patel
(laughs)
- SHSteve Hsu
"Why don't we..." You know.
- DPDwarkesh Patel
B- before you get married-
- 53:50 – 1:00:32
Genomics in dating
- DPDwarkesh Patel
if- if this like reaches... uh, if e- if like let's say in 50 years or 100 years all... the majority of the population is doing this and if that means that the diseases that are highly heritable get, um, pruned out of the population, does that mean we'll only be left with the diseases that are like lifestyle diseases? So you won't get like breast cancer anymore, but you will still get, um... you know, you'll still get like fat or I don't know, it's whatever, lung cancer from smoking or something.
- SHSteve Hsu
I think it's hard to discuss the- the asymptotic limit of what's gonna happen here. Um, I'm not, I'm not very confident about making predictions like that. You know, it- it could be that-... uh, we'll get to the point where everybody is, well, everybody who's rich, (laughs) or has been through this stuff for a while, uh, especially if we get the editing working, um, is super low risk for all the, like the top 20 killers, the- the diseases which, you know, have most, uh, life expectancy impact. And yeah, maybe those people live to be 300 years old naturally. I- I don't think that's excluded at all. So, um, I think that's within the realm of possibility. But, you know, it's gonna happen for a few lucky, you know, Elon Musk like people before it happens for schlubs like you and me, so... (laughs)
- DPDwarkesh Patel
(laughs)
- SHSteve Hsu
You know, there's gonna be very angry inequality protestors about, you know, the Trump grandchildren who models predict will live to be 200 years old.
- DPDwarkesh Patel
Right.
- SHSteve Hsu
You know?
- DPDwarkesh Patel
Yeah, yeah, yeah. That's-
- SHSteve Hsu
People are not gonna be happy about that.
- DPDwarkesh Patel
Yeah. That's so interesting. Um, and, okay, so one way to think about these different embryos is, uh, like if- if you can mal, uh, produce multiple embryos, you get to select from one of them, each of them is like, um, each of them is like a call option, right? And therefore, you probably wanna optimize for volatility as much or if not more than just the expected value of the trait. And so I'm wondering if there's mechanisms where you can, I- I don't know, like increase the volatility in meiosis or in some other process so you- you just get a- a higher variance and you can just select from the tail better.
- SHSteve Hsu
Well, I'll- I'll tell you, uh, something related to that, which is quite amusing. So I had conversations with some pretty senior people at the company that owns all the dating apps. So you can look up, you can figure out what company this is.
- DPDwarkesh Patel
(laughs)
- SHSteve Hsu
But they own Tinder and Match and stuff like this. And they're kind of interesting, interested in, wow, what if we have a special feature where instead of Tinder Gold or Platinum, you upload your genome and you match, we- we talk about how well you match the other person based on your genome. Actually, one person told me something which was really shocking is that apparently guys lie about their height on these apps. And if you could have a DNA verified-
- DPDwarkesh Patel
I'm shocked. Truly shocked. (laughs) Truly shocked.
- SHSteve Hsu
And if you could have a, if you could have a DNA verified height on there, (laughs) 'cause our- our accuracy's like an inch or something.
- DPDwarkesh Patel
Right.
- SHSteve Hsu
So, um-
- DPDwarkesh Patel
(laughs)
- SHSteve Hsu
... it would prevent, like, really gross distortions, like someone claims they're 6'2" and they're actually 5'9". Probably the DNA could say that's unlikely actually. But- but no, the- the- the application to what you were discussing is more like let's suppose that we're selecting on intelligence or something, and let's suppose that the regions where your girlfriend has all the plus stuff-
- DPDwarkesh Patel
Ooh.
- SHSteve Hsu
... is complementary-
- DPDwarkesh Patel
Yes, yes.
- SHSteve Hsu
... to the regions where you have your plus stuff.
- DPDwarkesh Patel
Ah.
- SHSteve Hsu
So we- we could model that and say, like, your kids, just because of that, you know, the- the- the complementarity of the structure of your genome in the regions that affect intelligence, you're very likely to have some super smart kids way above your, the mean of your, you- you and your girlfriend's values. So you could actually say things like, "Yeah, it's better for you to marry that girl than that girl." You know, if- if you're gonna go, as long as you're gonna go through embryo selection we can throw out the outlier, the bad outliers.
- DPDwarkesh Patel
That is so fascinating.
- SHSteve Hsu
So that, all that's technically feasible. And I- I think actually it's true that one of the earliest patent applications, they'll- they'll all deny it now, uh, what's her name? Uh, gosh, I can't remember her name. (laughs) The- the CEO of 23andMe, um, Wojcicki.
- DPDwarkesh Patel
Yeah.
- SHSteve Hsu
Uh, she'll deny it now, but I think if you look in the patent database, the very ear- one of the very earliest patents that 23andMe filed when they were, like, still a small startup was about exactly this, is like advising, uh, parents about mating and how their kids would turn out and stuff like this. So, you know, we don't even go that far in- in GP. We don't even talk about stuff like that, but they were thinking about it when they founded 23andMe.
- DPDwarkesh Patel
Th- that is unbelievably interesting. Um, and, but by the way, speaking of height, there's- there's just, uh, this, again, this just occurred to me, but, you know, it's like supposed to be highly heritable, but especially people in like Asian countries who, uh, w- we have the experience of like having grandparents who are much shorter than us and then parents that are shorter than us, which suggests that like the environment has like a big part to play in it, just like malnutrition or something. Um, yeah, so how- how do you square that, uh, the fact that like often our parents are shorter than us with the idea that like height is supposed to be super heritable?
- SHSteve Hsu
Another great observation. So the correct, the real correct scientific statement is we can predict height, uh, for people who are born, who will be born and raised in a favorable environment.
- 1:00:32 – 1:07:59
Ancestral populations
- DPDwarkesh Patel
that actually raises the next question I was about to ask, which was, uh, how applicable are these scores across, uh, d- you know, different a-ancestral populations?
- SHSteve Hsu
Huge, huge problem right now, because most of the data (clears throat) is from Europeans. And what happens is that if you train a predictor in this ancestry group and you go to a more distant ancestry group, there's a falloff in the quality of prediction. And again, this is, like, frontier questions, so we don't know the answer for sure. But most people believe, or many people believe, that what happens is that there's a certain correlational structure in each population, where if I know the state of this SNP, I can predict the state of these neighboring SNPs. And that is a product of the mating patterns and the ancestry, you know, of that group. And sometimes the predictor, which is just using statistical power to figure things out, will grab one of these SNPs as a tag for the truly causal SNP that's in there. It doesn't really know which one is truly causal. It's just grabbing a tag. But the tagging quality falls off if you then go to another population. Like, this was a very good tag for the truly causal SNP in the British population, but it's not so good a tag in the South Asian population for the truly causal SNP, which we hypothesize is the same. It's the same underlying genetic architecture in these different ancestry groups. We don't know. That's a hypothesis. But even so, the tagging quality falls off. So my group, you know, we've spent a lot of our time looking at the performance of a predictor trained in population A on distant population B, and doing all this stuff, modeling it, trying to test hypotheses as to whether it's just the tagging decay which is responsible for most of the falloff. So all of this is an area of very active investigation. I think, you know, it'll probably be solved in five years. The first really big biobanks that are non-European are coming online, and so I think we're gonna solve it, you know, in some number of years.
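The tagging-decay hypothesis can be illustrated with one line of arithmetic. If a tag SNP is associated with the trait only through its correlation with the causal SNP, the variance the tag explains is the causal SNP's share multiplied by the squared tag-causal correlation. The sketch below uses made-up correlation values for a training population and a distant population; none of the numbers come from the group's analyses.

```python
def tagged_variance_explained(causal_r2, ld_r):
    """Variance explained by a tag whose correlation with the causal SNP is ld_r."""
    return causal_r2 * ld_r ** 2


causal_r2 = 0.010    # hypothetical share of trait variance due to the causal SNP
ld_training = 0.95   # hypothetical tag-causal correlation where the predictor was trained
ld_distant = 0.60    # hypothetical tag-causal correlation in a more distant population

in_training = tagged_variance_explained(causal_r2, ld_training)
in_distant = tagged_variance_explained(causal_r2, ld_distant)
print(f"retained in distant population: {in_distant / in_training:.0%}")
```

The causal effect is identical in both populations in this sketch, yet the tag recovers well under half of its original signal, which is the falloff the hypothesis describes.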
- DPDwarkesh Patel
Oh, uh, what, what does the solution look like? I guess 'cause if you don't know, um... unless you can identify like the causal mechanism by which each SNP is having an effect, how can you know that something is a tag or whether it's the actual underlying, uh, you know, switch?
- SHSteve Hsu
The resolution will be... again, the nature of reality determines (laughs) how this is gonna go, and we don't know the underlying biology. And this is the amazing thing, like people argue about human biodiversity and all this stuff, and we don't even know whether the specific mechanisms that, say, predispose you to being tall or to having heart disease are the same in these different ancestry groups. We assume that they are, but we don't know that. And, you know, as we get further away, like to Neanderthals or Homo erectus, you might be like, "Yeah, they have a slightly different architecture there than we do." But let's assume that the causal structure is the same for South Asians and for British people, okay? And then it's a matter of improving the tags. And you might say, "Wait a minute, Steve. If I don't know which one is causal, what do you mean by improving the tags?" This is a machine learning problem, (laughs) so the question is, if there's a SNP which, when I use it across multiple ancestry groups, is always coming up as very significant, maybe that one's truly causal. As I vary the tagging correlations in the neighborhood of that SNP, I always find that that one is in the intersection of all these different sets. That makes me think that one's gonna actually be causal. So that's a process we're engaged in now, to try to basically do that. It's basically just a machine learning problem, but we need data. That's the main issue.
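A minimal sketch of the intersection idea from this answer, in Python. The population labels and SNP identifiers are invented, and a real analysis would use statistical fine-mapping rather than a bare set intersection; this only shows the logic of keeping what survives in every ancestry group.

```python
def consistent_hits(hits_by_population):
    """SNPs flagged as significant in every population's analysis."""
    sets = [set(hits) for hits in hits_by_population.values()]
    if not sets:
        return set()
    return set.intersection(*sets)


# Hypothetical significant SNPs in one genomic neighborhood, per ancestry group.
# Tags differ between groups because the local correlation structure differs.
hits = {
    "population_1": {"snp_A", "snp_B", "snp_C"},
    "population_2": {"snp_B", "snp_D"},
    "population_3": {"snp_B", "snp_C", "snp_E"},
}

candidates = consistent_hits(hits)
print(sorted(candidates))
```

Only the SNP that stays significant while its neighbors come and go remains as a candidate for the truly causal variant.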
- DPDwarkesh Patel
Uh, yeah. I was k- I was kind of hoping that wouldn't be possible, because one worry you might have about this research is that, you know, like it, it'll... it'd itself become taboo or cause other sorts of bad social consequences if you can like definitively show that on certain traits there's differences between ancestral populations, right? And since I was kind of hoping (laughs) that maybe there's like just a evasion button where like, yeah, w- we can't say because they're, they're just tags and the tags might be different between different ancestral populations. But I guess with better machine learning, we'll know.
- SHSteve Hsu
That's the situation we're in now, where you have to do some fancy analysis if you want to claim, like, Italians literally have lower height potential than Nordics, which is possible, and there's been a ton of research about this because there are signals of selection. It looks like the alleles which are activated in height predictors have been under some selection between North and South Europe over the last 5,000 years, for whatever reason. We don't know the reason, but this is a thing which is debated by people who study molecular evolution. But suppose it's true, okay? Then what that would mean is that when we finally get to the bottom of it and we find all the causal loci for height, literally the average value for the Italians is lower than the average value for the people living in Stockholm, and that might be true. People don't get that excited, or they get a little bit excited about height, but they would get really excited if this were true for some other traits, right? Suppose like your extraversion, you know, the causal variants affecting your level of extraversion are systematically different, that the weighted average of those states is different in, you know, Japan versus Sicily.
- DPDwarkesh Patel
Mm-hmm.
- SHSteve Hsu
Right?
- DPDwarkesh Patel
Yeah, yeah.
- SHSteve Hsu
People might freak out over that.
- DPDwarkesh Patel
Right.
- SHSteve Hsu
Um, I'm supposed to just say that's obviously not true. It's obviously not true. It can't be tru- how could it possibly be true?... because there hasn't been enough evolutionary time for those differences to arise. After all, it's not possible that despite what we- what looks to be the case for height over the last 5,000 years in Europe, no other traits could possibly have been differentially selected for over the last 5,000 years. That's the really dangerous thing. So the- th- there are a few people who understand this field well enough to understand what you and I just discussed, and who are so alarmed by it that they're just trying to suppress everything. There are people like that. But most of them actually don't really follow it at this- the technical level that you and I are disc- uh, are discussing it. So they're just, like, uh, kind of instinctively negative about it, but they don't- they don't really understand it very well.
- DPDwarkesh Patel
Uh, that- that's good to hear, 'cause I- I- that's, um... Yeah, in a lot of other spaces you see this pattern, that by the time that somebody might want to regulate or, um, uh, is- uh, in some way interfere with, uh, uh, some technology or some information, it already has achieved wide adoption. Uh, you could argue that that's w- the case with crypto today. But if it's true that, like, uh, a bunch of IVF clinics across the world are using this- uh, using, you know, this- these scores to, uh, do selection and other things, um, yeah, by the time the people realize the implications of this data for other kinds of, um, social questions, uh, the- y- by that time, this will already be, like, an actual consumer technology, hopefully.
- SHSteve Hsu
I think that- I think that's
- 1:07:59 – 1:16:00
Is this eugenics?
- SHSteve Hsu
true. I- I think the main outcry will be if it turns out that there are really big gains to be had-
- DPDwarkesh Patel
Hmm.
- SHSteve Hsu
... and only the billionaires are getting them. But- but that might have the consequence of causing countries to make this free, part of their national healthcare system. So Denmark, Israel, th- they- they pay for IVF, uh, for infertile, uh, couples. So it's- it's- it's part of their national healthcare system, and they're pretty aggressive about genetic testing. In Denmark, one in 10 babies born now is born through IVF.
- DPDwarkesh Patel
Right.
- SHSteve Hsu
So, um, yeah. So i- i- i- it's not clear how it's gonna go. But, um, yeah. I mean, th- i- we're in for some fun times. There's no doubt (laughs) about it.
- DPDwarkesh Patel
Yeah. W- I guess one way it could go is th- some countries decide to ban it altogether, and another way g- it could go is countries decide to give everybody free access to it.
- SHSteve Hsu
Nationalize it.
- DPDwarkesh Patel
Exactly. Y- yeah, if you had to choose between the two, I guess you would wanna go for the second one, um, which I- which I guess would be the hope. And maybe w- only those two are compatible with people's, um, I- I don't know, th- their, um, moral, uh, intuitions about this kinda stuff.
- SHSteve Hsu
It- it's very funny, because most wokeist people today hate this stuff. But most progressives, like Margaret Sanger, or you know, (laughs) anybody who was progressive, uh, intellectual, well, in some sense, the forebears of today's wokeists, in the early 20th century, they were all-
- DPDwarkesh Patel
Yeah.
- SHSteve Hsu
... what we would call today eugenicists, because they were like, "Oh, shoot. We've- thanks to Darwin, we now know how this all works, and we should take steps to keep society healthy, and not in a negative way where we kill people we don't like, but we should just help society do healthy things when they reproduce and, you know, have healthy kids."
- DPDwarkesh Patel
Right.
- SHSteve Hsu
And so now, it- there's- this whole thing has just been flipped over, uh, among progressives, so-
- DPDwarkesh Patel
Yeah.
- SHSteve Hsu
Uh, it's sad.
- DPDwarkesh Patel
Yeah. E- even in India, like, that w- th- that was, like, very recently, uh, less than 50 years ago, when- when Indira Gandhi, uh, she's like the left side of, uh, India's political spectrum, and yeah, she- obviously she was infamous for putting on these, like, forced sterilization programs, and, um, yeah, so, you know, b- b- uh, I don't wanna credit the person, but somebody made an interesting comment, um, they- they wouldn't want their name associated with this maybe, but somebody made an interesting comment about this where they said, uh, they were asked, like, "Oh, is it true that, uh, progressives in history, th- uh, uh, the history always tilts towards progressives, and if so aren't- uh, aren't ev- isn't everybody else doomed? Aren't their views doomed?" And the person made a really interesting point, which is that, yes, whatever we considered left at the time tends to be winning, but what is left changes a lot over time, right? So if- in the earlier 20th century, uh, prohibition was a left cause, right? It- it was a progressive cause. And you know, that changed, and now that's no longer- I mean, the opposite is a left cause, but-
- SHSteve Hsu
Yeah, now- now legalizing pot is progressive. (laughs)
- DPDwarkesh Patel
Exactly. So the way- if the- you know, if Conquest's Second Law is true and everything just tilts left over time, just change what left is, right? That's a solution.
- SHSteve Hsu
Yeah. Absolutely. I mean, the- I- I- I- Of course, one can't demand that any of these woke guys be, like, intellectually self-consistent or- or even, like, say the same things from one year to another. But if one could, you know, you wonder what they'd think about these literally communist Chinese.
- DPDwarkesh Patel
(laughs)
- SHSteve Hsu
These are literally communists. They're recycling huge parts of their GDP to help the poor-
- DPDwarkesh Patel
Yeah.
- SHSteve Hsu
... and do all this other stuff. Me- you know, medicine is free. Everything- you know, education is free, right? They're literally socialists. They're literally communists. But in Chinese, the Chinese characters for eugenics is a totally positive thing. It's just like-
- DPDwarkesh Patel
(laughs)
- SHSteve Hsu
... healthy reproduction. It means healthy re- well, that's actually what it means in Greek too, but- more or less, but- but the- the whole viewpoint on all this stuff is like 180 degrees off in- in East Asia compared to here, and even- even in the- among the literal communists, you know? So...
- DPDwarkesh Patel
Um, so let's talk about one of the traits that people might be interested in potentially selecting for, which is, um, intelligence. Um, do- uh, do we- uh, what is the potential that we'll be able to actually acquire the data to be able to, um, correlate the genotype with intelligence?
- SHSteve Hsu
Well, that- that's the most personally frustrating aspect of all this stuff. Like, if you ask me, like, 10 years ago, when I started doing this stuff, what did I think we were gonna get, I think everything has gone kind of on the optimistic side of what I would have predicted, so everything's good (laughs) , you know? Didn't turn out to be intractably nonlinear or it didn't turn out to be intractably poly- pleiotropic. You know, all these good things which nobody could have known a priori how they would work turned out to be good for gene engineers of the 21st century. Um- The one thing that's frustrating is because of crazy wokeism and fear of crazy wokeists, the most interesting, what I consider the most interesting phenotype of all is lagging, because everybody's afraid. Uh, even though there are very good reasons for medical researchers to want to know the cognitive ability of people in their studies. Uh, for example, when you wanna study aging or decline of cognitive function, memory in older people, it's you wanna have baseline measurements of how good their cognitive function was when they were younger, right? So there are very good reasons for why you wanna have all this data. But researchers are afraid because it's also linked to all these controversial social issues. And so the amount, there's just a ginormous amount of genomic data where there's actually no cognitive measurement attached as a field to that data, uh, which would have been very cheap to measure. Pe- again, wokeists hate this, but I can measure your IQ on like a 12-minute test no problem, right? (laughs) I mean, not with perfect accuracy, but I can get a pretty, I can get a very useful measurement if I just take, like the NFL has this thing called the Wonderlic, which every player that's being considered for the draft is asked to take this Wonderlic. You can go back (laughs) and look at the Wonderlic scores of every NFL player. It's a short test. It's like 12 minutes long or something, and it's, it's pretty highly correlated. It's like probably correlates 0.8 or 0.9, 0.8 maybe with, uh, with a more fulsome IQ measurement. So, um, it would be trivial and inexpensive to gather this data, and then once we have, my prediction from this earlier math that I was talking about is that when you get to of the order a million, it could be one million, it could be two million, uh, well-phenotyped people, uh, and genomes, we would be able to build a pretty decent IQ predictor that might have a standard error of maybe 10 points or something.
- DPDwarkesh Patel
Hmm.
- SHSteve Hsu
So (clears throat) that would be incredibly, for science, just, you know, unlimited interesting stuff in there, but, uh, not getting done. (laughs)
- DPDwarkesh Patel
Yeah. And, if there are differences between, um, I mean, uh, n- n- differences in how things are tagged between different ancestral groups, uh, I'm not talking about like average differences or anything, just how the genotype is tagged. Uh, and if the Chinese do this first, then that's like an adv- they have an advantage that can't be transferred over, I guess, right? Um, 'cause it, the, it's only applicable or, uh, uh, uh, advantageously applicable to, um, their population.
- 1:16:00 – 1:25:02
Tradeoffs to intelligence
- DPDwarkesh Patel
higher with intelligence? I mean, uh, with certain, uh, populations, oh, of course, by the way, disclaimer, 5,000 years not enough, blah, blah, blah. Um, but given that-
- SHSteve Hsu
O- obviously.
- DPDwarkesh Patel
Obviously.
- SHSteve Hsu
Obviously. Yeah.
- DPDwarkesh Patel
Um, but given that you see it with, uh, certain populations like Ashkenazi Jews, you have a lo- a high- higher incidence of, um, uh, is it nervous system disorders, uh, you know, like Tay-Sachs and other things, um, and that seems potentially to be the trade-off of, you know, the, uh, the higher average intelligence. Do you think that maybe the pleiotropy has a higher, uh, chance of occurring with intelli- intelligence?
Episode duration: 2:21:22
Transcript of episode 80BhjRh-Q-s