Manolis Kellis: Biology of Disease | Lex Fridman Podcast #133
- 0:00 – 2:49
Introduction
- LFLex Fridman
The following is a conversation with Manolis Kellis, his third time on the podcast. He's a professor at MIT, and head of the MIT Computational Biology Group. This time, we went deep on the science, biology, and genetics. So this is a bit of an experiment. Manolis went back and forth between the basics of biology to the latest state-of-the-art in the research. He's a master at this. So I just sat back and enjoyed the ride. This conversation happened at 7:00 AM (laughs) , so it's yet another podcast episode after an all-nighter for me. And once again, since the universe has a sense of humor, this one was a tough one for my brain (laughs) to keep up, but I did my best, and I never shy away from a good challenge. Quick mention of each sponsor, followed by some thoughts related to the episode. First is SEMrush, the most advanced SEO optimization tool I've ever come across. I don't like looking at numbers, but someone probably should. It helps you make good decisions. Second is Pessimist Archive. They're back. One of my favorite history podcasts on why people resist new things, from recorded music to umbrellas, to cars, chess, coffee, and the elevator. Third is Eight Sleep, a mattress that cools itself, measures heart rate variability, has an app, and has given me yet another reason to look forward to sleep, including the all-important power nap. And finally, BetterHelp, online therapy when you want to face your demons with a licensed professional, not just by doing the, uh, David Goggins-like physical challenges like I seem to do on occasion. Please check out these sponsors in the description to get a discount and to support this podcast. As a side note, let me say that biology in the brain and in the various systems of the body fill me with awe. 
Every time I think about how such a chaotic mess coming from its humble origins in the ocean was able to achieve such incredibly complex and robust mechanisms of life that survived despite all the forces of nature that want to destroy it. It is so unlike the computing systems we humans have engineered that it makes me feel that in order to create artificial general intelligence and artificial consciousness, we may have to completely rethink how we engineer computational systems. If you enjoy this thing, subscribe on YouTube, review it with five stars on Apple Podcasts, follow on Spotify, support on Patreon, or connect with me on Twitter @lexfridman. And now, here's my conversation with Manolis Kellis.
- 2:49 – 26:48
Molecular basis for human disease
- LFLex Fridman
So your group at MIT is trying to understand the molecular basis of human disease. What are some of the biggest challenges in your view?
- MKManolis Kellis
Don't get me started.
- LFLex Fridman
(laughs)
- MKManolis Kellis
I mean-
- LFLex Fridman
Here we go again.
- MKManolis Kellis
... understanding human disease is the most complex, uh, challenge in modern science. So because human disease is as complex as the human genome, it is as complex as the human brain, and it is in many ways even more complex, because the more we understand disease complexity, the more we start understanding genome complexity, and epigenome complexity, and brain circuitry complexity, and immune system complexity, and cancer complexity, and so on and so forth. So traditionally, human disease was following basic biology. You would basically understand basic biology in model organisms, like, you know, mouse and fly and yeast. You would understand sort of mammalian biology and animal biology and eukaryotic biology in sort of progressive layers of complexity, getting closer to human, phylogenetically. And you would do perturbation experiments in those species to see, "If I knock out a gene, what happens?" And based on the knocking out of these genes, you would basically then have a way to drive human biology, because you would, you would sort of understand the functions of these genes. And then if you find that a human gene locus, something that you've mapped from human genetics to that gene, is related to a particular human disease, you'd say, "Aha, now I know the function of the gene from the model organisms. I can now go and understand the function of that gene in human." But this is all changing. This is dramatically changed. So that, that was the old way of doing basic biology. You would start with the animal models, the eukaryotic models, the mammalian models, and then you would go to human. Human genetics has been so transformed in the last decade or two that human genetics is now actually driving the basic biology. There is more genetic mutation information in the human gene- genome than there will ever be in any other species.
- LFLex Fridman
What do you mean by mutation information?
- MKManolis Kellis
So, so perturbations is how you understand systems. So an engineer builds systems, and then they know how they work from the inside out. A scientist studies systems through perturbations, that you basically say, "If I poke that balloon, what's gonna happen?" And I'm gonna film it in super high resolution, understand, I don't know, air dynamics or fluid dynamics if it's filled with water, etcetera. So you can then make experimentation by perturbation, and then the scientific process is sort of building models that best fit the data, designing new experiments that best test your models and challenge your models and so on and so forth. That's the same thing with science, basically. If you're trying to understand biological science, you basically want to do perturbations that then drive the models.
- LFLex Fridman
So how do these perturbations allow you to understand disease?
- MKManolis Kellis
So if, if you know that a gene is related to disease, you don't wanna just know that it's related to the disease. You wanna know, what is the disease mechanism? Because you wanna go and intervene. So, the way that I like to describe it is that traditionally, uh, epidemiology, which is basically the study of disease, you know, sort of the observational study of disease, has been about correlating one thing with another thing. So, if you, if you have a lot of people with liver disease who are also alcoholics, you might say, "Well, maybe the alcoholism is driving the liver disease or maybe those who have liver disease self-medicate with alcohol." So the, the connection could be either way. With genetic epidemiology, it's about correlating changes in genome with phenotypic differences, and then you know the direction of causality. So, if you know that a particular gene is related to the disease, you can basically say, "Okay, perturbing that gene in mouse causes the mice to have X phenotype. So perturbing that gene in human causes the humans to have the disease, so I can now figure out what are the detailed molecular phenotypes in the human that are related to that organismal phenotype in the disease." So, it's all about understanding disease mechanism, understanding what are the pathways, what are the tissues, what are the processes that are associated with the disease so that we know how to intervene. You can then prescribe particular medications that also alter these processes. You can prescribe lifestyle changes that also affect these processes and so on and so forth.
- LFLex Fridman
That's such a beautiful puzzle to try to solve, like what kind of perturbations eventually have this ripple effect that leads to disease across a population, and then you study that for animals or mice first, and then see how that might possibly connect to humans. How hard is that puzzle of trying to figure out how little perturbations might lead to, in a stable way, to a disease?
- MKManolis Kellis
In animals, we make the puzzle h- simpler because we perturb one gene at a time. That's the beauty and it's the power of animal models. You can basically decouple the perturbations. You only do one perturbation, and you only do strong perturbations at a time. In human, the puzzle is incredibly complex because, I mean, obviously you don't do human experimentation. You wait for natural selection and natural genetic variation to basically do its own experiments, which it has been doing for hundreds and thousands of years in the human population and for hundreds of thousands of years across, you know, the, the history leading to the human population. So, you basically take this natural genetic variation that we all carry within us. Every one of us carries six million perturbations. So I've done six million experiments on you, six million experiments on me, six million experiments on every one of seven billion people on the planet.
- LFLex Fridman
What's the six million correspond to?
- MKManolis Kellis
Six million unique genetic variants that are segregating the human population. Every one of us carries millions of polymorphic sites. Poly, many, morph, forms. Polymorphic means many forms. Variants. That basically means that every one of us has single nucleotide alterations that we have inherited from mom and from dad that basically can be thought of as tiny little perturbations. Most of them don't do anything, but some of them lead to all of the phenotypic differences that we see b- between us. The reason why two twins are identical is because these variants completely determine the way that I'm gonna look at exactly 93 years of age.
- LFLex Fridman
How happy are you with this kind of data set? Is it, uh, large enough of-
- MKManolis Kellis
It's-
- LFLex Fridman
... the human population of Earth? Is that too big, too small?
- MKManolis Kellis
Yeah, so, so the h- the, is it, is it large enough is a power analysis question, and in every one of our grants, we do a power analysis based on what is the effect size that I would like to detect and what is the natural variation in the two forms? So, every time you do a perturbation, you're asking I'm changing form A into form B. Form A has some natural genetic varia- some natural phenotypic variation around it, and form B has some natural phenotypic variation around it. If those variances are large and the differences between the mean of A and the mean of B are small, then you have very little power. The further the means go apart, that's the effect size, the more power you have, and the smaller the standard deviation, the more power you have. So basically when you're asking, is that sufficiently large? Certainly not for everything, but we already have enough power for many of the stronger effects in the more tight distributions.
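The relationship described here (power rises as the means of form A and form B move apart, and as the spread around each shrinks) can be sketched numerically. The following is a minimal illustration, not the calculation used in any actual grant: it assumes a two-sided two-sample z-test with equal standard deviations and equal group sizes, and the function name `power_two_sample` and all numbers are hypothetical.

```python
from statistics import NormalDist


def power_two_sample(mean_a, mean_b, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test.

    Assumes both forms share the same standard deviation `sd`
    and that each group has `n_per_group` individuals.
    """
    z = NormalDist()  # standard normal distribution
    # Standard error of the difference between the two group means.
    se = sd * (2.0 / n_per_group) ** 0.5
    # Effect size expressed in standard-error units.
    shift = abs(mean_b - mean_a) / se
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)
    # Probability that the test statistic lands in either rejection tail.
    return (1.0 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


# Hypothetical numbers: same sample size, different effect sizes and spreads.
small = power_two_sample(0.0, 0.1, sd=1.0, n_per_group=100)  # small effect, wide spread
large = power_two_sample(0.0, 0.5, sd=1.0, n_per_group=100)  # larger effect, same spread
tight = power_two_sample(0.0, 0.1, sd=0.2, n_per_group=100)  # small effect, tight spread
print(small, large, tight)
```

In this sketch, a small effect with a tight distribution is as detectable as a larger effect with a wide one, because only the ratio of effect size to standard deviation enters the calculation.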
- LFLex Fridman
So that's a hopeful message that there exists parts of the genome that ha- ha- that have a strong effect, that has a small, uh, variance.
- MKManolis Kellis
That's exactly right. Unfortunately, those perturbations are the basis of disease in many cases. So it's not a, you know, hopeful message. Sometimes it's a terrible message. It's basically, well, some people are sick, but if when, if we can figure out what are these contributors to sickness, we can then help make them better and help many other people better who don't carry that exact mutation, but who carry mutations on the same pathways, and that's what we like to call the allelic series of a gene. You basically have many perturbations of the same gene in different people, each with a different frequency in the human population and each with a different effect on the individual who carries them.
- LFLex Fridman
So you said, uh, in the past there would be these small experiments on perturbations and animal models. What does this puzzle-solving process look like today?
- MKManolis Kellis
So we basically have, you know, something like seven billion people in the planet, and every one of them carries something like six million mutations... You basically have an enormous matrix of genotype by phenotype, by systematically measuring the phenotype of these individuals. And the traditional way of measuring this phenotype has been to look at one trait at a time. You would gather families, and you would sort of paint the pedigrees of a strong effect, what we like to call Mendelian mutation. So a mutation that gets transmitted in a dominant or a recessive, but strong effect form, where basically one locus plays a very big role in that disease. And you can then look at carriers versus non-carriers in one family, carriers versus non-carriers in another family, and do that for hundreds, sometimes thousands of families. And then trace these inheritance patterns, and then figure out what is the gene that plays that role.
- LFLex Fridman
Is this the matrix that you show in, in, uh, in talks or lectures?
- MKManolis Kellis
So that matrix is the input to the stuff that I show in talks. So basically, that matrix has traditionally been strong effect genes. What the matrix looks like now is instead of pedigrees, instead of families, you basically have thousands and sometimes hundreds of thousands of unrelated individuals, each with all of their genetic variants and each with their phenotype. For example, height or lipids, or, you know, whether they're sick or not for a particular trait. That has been the modern view. Instead of going to families, going to unrelated individuals with one phenotype at a time. And what we're doing now as we're maturing in all of these sciences is that we're doing this in the context of large medical systems or enormous cohorts that are very well phenotyped across hundreds of phenotypes, sometimes with a complete electronic health record. So you can now start relating not just one gene segregating in one family, not just thousands of variants segregating with one phenotype, but now you can do millions of variants versus hundreds of phenotypes. And as a computer scientist, I mean, deconvolving that matrix, partitioning it into the layers of biology that are associated with every one of these elements is a dream come true. It's, it's like the world's greatest puzzle. And you can now solve that puzzle by throwing in more and more knowledge about the function of different genomic regions and how these functions are changed across tissues and in the context of disease. And that's what my group and many other groups are doing, are trying to systematically relate this genetic variation with molecular variation at the expression level of the genes, at the epigenomic level of the gene regulatory circuitry, and at the cellular level of what are the functions that are happening in those cells at the single-cell level, using single-cell profiling.
And then relate all that vast amount of knowledge computationally with the thousands of traits that each of these thousands of variants are perturbing.
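To make the "variants versus phenotypes" matrix concrete, here is a toy sketch of the simplest possible scan: correlate every variant with every trait across individuals. This is an illustration under loud assumptions, not the method used by any group mentioned here. The data are made up, genotypes are coded as 0/1/2 allele counts, and real analyses would use regression with covariates and multiple-testing correction rather than raw Pearson correlation.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def association_matrix(genotypes, phenotypes):
    """Return a variants-by-traits matrix of correlations across individuals.

    `genotypes[i][v]` is individual i's allele count at variant v;
    `phenotypes[i][t]` is individual i's value for trait t.
    """
    n_variants = len(genotypes[0])
    n_traits = len(phenotypes[0])
    result = []
    for v in range(n_variants):
        g = [row[v] for row in genotypes]
        result.append(
            [pearson(g, [row[t] for row in phenotypes]) for t in range(n_traits)]
        )
    return result


# Toy data (hypothetical): 6 individuals, 3 variants, 2 traits.
genotypes = [
    [0, 1, 2],
    [1, 0, 2],
    [2, 1, 0],
    [0, 2, 1],
    [1, 2, 0],
    [2, 0, 1],
]
# Trait 0 is built to track variant 0 exactly; trait 1 is uncorrelated with variant 0.
phenotypes = [
    [10.0, 5.0],
    [12.0, 7.0],
    [14.0, 5.0],
    [10.0, 7.0],
    [12.0, 5.0],
    [14.0, 7.0],
]

assoc = association_matrix(genotypes, phenotypes)
for v, row in enumerate(assoc):
    print(f"variant {v}:", [round(r, 3) for r in row])
```

Scaling this picture to millions of variants and hundreds of phenotypes is what turns the matrix into the deconvolution puzzle described above.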
- LFLex Fridman
I mean, this is something we talked about, I think, last time. So there's these effects at different levels that happen. You said at, at a single-cell level, you're trying to see things that happen due to certain perturbations, and then... S- it's not just like a puzzle of, um, perturbation and disease. It's perturbation, then effect at a cellular level, then at an organ level, the body... Like, how do you disassemble this into, like, what your group is working on? Y- you're basically taking a bunch of the hard problems in space. How do you break apart a difficult disease, uh, and break it apart into problems that you... Into puzzles that you can now start solving?
- MKManolis Kellis
Yeah. So there's a struggle here. Computer scientists love hard puzzles, and they're like, "Oh, I want to, you know, build a method that just deconvolves the whole thing computationally." And, you know, that's very tempting, and it's very appealing, but biologists just like to decouple that complexity experimentally, to just like peel off layers of complexity experimentally. And that's what many of these modern tools that, you know, my group and others have both developed and used. The fact that we can now figure out tricks for peeling off these layers of complexity by testing one cell type at a time. Or by testing one cell at a time. And you could basically say, "What is the effect of this genetic variant associated with Alzheimer's on human brain?" Human brain sounds like, oh, it's an organ, of course, just go one organ at a time. But human brain has, of course, dozens of different brain regions, and within each of these brain regions, dozens of different cell types. And every single type of neuron, every single type of glial cell between astrocytes, oligodendrocytes, microglia, between, you know, all of the neural cells and the vascular cells and the immune cells that are co-inhabiting the, the brain, between the different types of excitatory and inhibitory neurons that are sort of interacting with each other, between different layers of neurons in the cortical layers. Every single one of these has a different type of function to play in cognition, in interaction with the environment, in maintenance of the brain, in energetic needs, in feeding the brain with blood, with oxygen, in clearing out the debris that are resulting from the super high energy production of cognition in, in humans. So all of these things are basically, um, potentially deconvolvable computationally, but experimentally... you can just do single-cell profiling of dozens of regions of the brain across hundreds of individuals, across millions of cells, and then now you have pieces of the puzzle that you can then put back together to understand that complexity.
- LFLex Fridman
I mean, first of all, I mean, the human brain, the cells in the human brain are the most, okay, maybe I'm romanticizing it, but cognition seems to be very complicated. So, uh, separating into the f- the function, breaking Alzheimer's down to s- the cellular level seems very challenging. S- is that basically you're trying to find a way that some perturbation in genome results in some obvious major dysfunction in the cell? You- you're trying to find something like that?
- MKManolis Kellis
Uh, e- exactly. So, so what does human genetics do? Human genetics basically looks at the whole path, from genetic variation all the way to disease. So human genetics has basically taken thousands of Alzheimer's cases and thousands of controls matched for age, for sex, for, you know, uh, environmental backgrounds, and so on and so forth, and then looked at that map where you're asking, "What are the individual genetic perturbations? And how are they related to all the way to Alzheimer's disease?" And that has actually been quite successful. So we now have, you know, more than 27 different loci, these are genomic regions that are associated with Alzheimer's, at this end-to-end level. But the moment you sort of break up that very long path into smaller levels, you can basically say, "From genetics, what are the epigenomic alterations at the level of gene regulatory elements, where that genetic variant perturbs the control region nearby?" That effect is much larger.
- LFLex Fridman
You mean much larger in terms of its down-the-line impact? Or...
- MKManolis Kellis
It's much larger in terms of the measurable effect, this-
- 26:48 – 32:31
Deadliest diseases
- LFLex Fridman
To zoom back out, we've been talking about the genetic origins of diseases, but I think it's fascinating to, um, talk about what are the most important diseases to understand? And especially as it connects to the things that you're working on.
- MKManolis Kellis
So, it's very difficult to think about important diseases to understand. There's many metrics of importance. One is lifestyle impact. I mean, if you look at COVID, the impact on lifestyle has been enormous. So understanding COVID is important, because it has impacted the well-being in terms of ability to have a job, ability to have an apartment, ability to go to work, ability to have a mental circle of support, and, uh, all of that for, you know, millions of Americans, like, huge, huge impact. So that's one aspect of importance. So basically, mental disorders. Alzheimer's has a huge importance in the well-being of Americans, whether or not it kills someone; for many, many years, it has a huge impact. So the first measure of importance is just well-being. Like-
- LFLex Fridman
The impact on the quality of life.
- MKManolis Kellis
Impact on the quality of life, absolutely. The second metric, which is much easier to quantify, is deaths.
- LFLex Fridman
What is the number one killer?
- MKManolis Kellis
The number one killer is actually heart disease. It is actually killing 650,000 Americans per year. Number two is cancer, with 600,000 Americans. Number three, far, far down the list, is accidents. Every single accident combined. So basically, you, you know, you read the news, accidents like, you know, there was a huge car crash all over the news. But the number of deaths? Number three by far, 167,000. Lower respiratory disease, so that's asthma, not being able to breathe and so on and so forth, 160,000. Alzheimer's, number fi- number five, with 120,000. And then stroke, brain aneurysms and so on and so forth, that's 147,000. Diabetes and metabolic disorders, etc., that's 85,000. The flu is 60,000. Suicide, 50,000. And then overdose, etc., you know, goes further down the list. So of course, COVID has creeped up to be the number three killer this year with, you know, more than 100,000 Americans and counting. Um, and, you know, but, but if you think about sort of what do we use, what are the most important diseases? You have to understand both the quality of life and the just sheer number of deaths, and just numbers of years lost, if you wish.
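Since the figures above are quoted from memory and not quite in order, a short sketch can rank them as stated. The numbers below are copied from the conversation and are not checked against any official statistics; the function name `rank_causes` is hypothetical.

```python
# Annual US deaths by cause, exactly as quoted in the conversation (not verified).
quoted_deaths = {
    "heart disease": 650_000,
    "cancer": 600_000,
    "accidents": 167_000,
    "lower respiratory disease": 160_000,
    "Alzheimer's": 120_000,
    "stroke": 147_000,
    "diabetes and metabolic disorders": 85_000,
    "flu": 60_000,
    "suicide": 50_000,
}


def rank_causes(deaths):
    """Return (rank, cause, count) tuples sorted from most to fewest deaths."""
    ordered = sorted(deaths.items(), key=lambda item: item[1], reverse=True)
    return [
        (rank, cause, count)
        for rank, (cause, count) in enumerate(ordered, start=1)
    ]


ranking = rank_causes(quoted_deaths)
for rank, cause, count in ranking:
    print(f"{rank}. {cause}: {count:,}")
```

Sorted this way, the quoted stroke figure (147,000) lands ahead of the quoted Alzheimer's figure (120,000), so by these numbers stroke would be fifth and Alzheimer's sixth.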
- LFLex Fridman
And, and, uh, each of these diseases you can think of as, uh, and also including terrorist attacks, and school shootings, for example, things which lead to fatalities, you can look at as problems that could be solved. And some problems are harder to solve than others. I mean, that's part of the equation. So maybe if you look at these diseases, if you look at heart disease or cancer or Alzheimer's or just, uh, like, schizophrenia and obesity, diabe- like, not necessarily things that kill you, but affect the quality of life, which problems are solvable? Which aren't? Which are harder to solve? Which aren't?
- MKManolis Kellis
I lo- I love your question, because it puts it in the context of a global, um, e- effort, rather than just a local effort. So basically, if you look at the global aspect, exercise and nutrition are two interventions that we can, as a society, make a much better job at. So if you think about sort of the availability of cheap food, it's extremely high in calories, it's extremely detrimental for you, like a lot of processed food, etc. So if we change that equation, and, as a society, we made availability of healthy food much, much easier, and charged a burger at McDonald's the price that it costs on the health system, then people would actually start buying more healthy, uh, foods. So basically that's sort of a societal intervention, if you wish. In the same way, increasing empathy, increasing education, increasing the social framework and support would basically lead to fewer suicides. It would lead to fewer murders. It would lead to fewer, you know, deaths overall. Um, so, you know, that's something that we as a society can do. You can th- you can also think about external factors versus internal factors. So the external factors are basically communicable diseases like COVID, like the flu, etc., and the internal factors are basically things like, you know, cancer and Alzheimer's, where basically your, your genetics will eventually, you know, drive you there. Um, and then, of course, with all of these factors, every single disease has both a genetic component and an environmental component. So heart disease, you know, huge genetic contribution. Alzheimer's, it's like, you know, 60%, uh, plus genetic. So I think it's like 79% heritability. So that basically means that genetics alone explains 79% of Alzheimer's incidence.
- LFLex Fridman
Uh-huh.
- MKManolis Kellis
And, yes, there's a 21% environmental component where you could basically enrich your cognitive environment, enrich your social interactions, read more books, learn a foreign language, go running, you know, sort of have a more fulfilling life. All of that will actually decrease Alzheimer's, but there's a limit to how much that, that can impact because of the huge genetic footprint.
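The 79% / 21% split quoted here can be read as a variance decomposition. As a minimal sketch, assuming the textbook definition of heritability as the share of trait variance in a population attributable to genetic variance, and assuming no gene-environment covariance or interaction (both simplifications that are not stated in the conversation), the arithmetic looks like this. The function name and the variance units are hypothetical.

```python
def broad_sense_heritability(var_genetic, var_environment):
    """Share of total trait variance attributable to genetic variance.

    Assumes total variance = genetic variance + environmental variance,
    with no covariance or interaction between the two components.
    """
    total = var_genetic + var_environment
    if total <= 0:
        raise ValueError("total variance must be positive")
    return var_genetic / total


# Arbitrary variance units chosen to reproduce the quoted 79% / 21% split.
h2 = broad_sense_heritability(var_genetic=79.0, var_environment=21.0)
environmental_share = 1.0 - h2
print(f"heritability: {h2:.2f}, environmental share: {environmental_share:.2f}")
```

Note that this is a statement about variation across a population, not about how much of any one person's risk is fixed, which is consistent with the point that environmental changes can still help within limits.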
- 32:31 – 41:22
Genetic component of diseases
- LFLex Fridman
So this is fascinating. So s- uh, uh, each one of these problems have a genetic component and an environment component. And so, like, when there's a genetic component, what can we do about some of these diseases? What, what have you worked on? What can you say that's, uh, in terms of problems that are solvable here, or understandable?
- MKManolis Kellis
So my group works on the genetic component, but I would argue that understanding the genetic component can have a huge impact even on the environmental component. Why is that? Because genetics gives us access to mechanism, and if we can alter the mechanism, if we can impact the mechanism, we can perhaps counteract some of the environmental components.
- LFLex Fridman
Oh, interesting.
- MKManolis Kellis
So understanding the biological mechanisms leading to disease is extremely important in being able to intervene. But when you can intervene, and what... Uh, you know, the analogy that I like to give is, for example, for obesity. You know, think of it as a giant bathtub of fat. There's basically fat coming in from your diet, and there's fat coming out from your exercise, okay? That's an in-out equation, and that's the equation that everybody's focusing on. But your metabolism impacts that, you know, bathtub. Basically, your metabolism controls the rate at which you're burning energy, it controls the way, the rate at which you're storing energy, and it also teaches you about the various valves that control the input and the output equation. So if we can learn from the genetics, the valves, we can then manipulate those valves, and even if the environment is feeding you a lot of fat and getting a little of that out, you can just poke another hole at the bathtub-
- LFLex Fridman
(laughs) .
- MKManolis Kellis
... and just get a lot of the fat out.
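The bathtub analogy is a simple in-out balance, and it can be sketched as a toy simulation. Everything here is an illustrative assumption: the units are arbitrary, the flows are constant per day, and `extra_drain` stands in for the hypothetical "extra hole" opened by manipulating a metabolic valve. It is not a physiological model.

```python
def simulate_bathtub(level, inflow, outflow, extra_drain=0.0, days=30):
    """Track the stored level over time under constant daily flows.

    level       -- starting amount in the tub (arbitrary units)
    inflow      -- amount added per day (diet)
    outflow     -- amount removed per day (exercise)
    extra_drain -- additional amount removed per day (the "extra hole")
    """
    history = [level]
    for _ in range(days):
        # The tub cannot hold a negative amount.
        level = max(0.0, level + inflow - outflow - extra_drain)
        history.append(level)
    return history


# Same diet and exercise in both runs; only the extra drain differs.
baseline = simulate_bathtub(100.0, inflow=3.0, outflow=2.0)
with_hole = simulate_bathtub(100.0, inflow=3.0, outflow=2.0, extra_drain=1.5)
print(baseline[-1], with_hole[-1])
```

With these made-up numbers the baseline tub keeps filling, while the run with the extra drain empties slowly even though intake and exercise are unchanged.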
- LFLex Fridman
Yeah. That's fascinating. Yeah, so the... We're not just passive observers of our genetics. Um, the more we understand, the more we can come up with actual treatments.
- MKManolis Kellis
And I think that's an important, uh, aspect to realize when people are thinking about, uh, strong effect versus weak effect variants. So some variants have strong effects. We talked about these Mendelian disorders where a single gene has a sufficiently large effect, penetrance and expressivity, and so on and so forth, that basically you can, um, trace it in families with cases and not cases, cases, not cases, and so on and so forth. But even the... But, you know, but... So, so these are the genes that everybody says, "Oh, that's the genes we should go after, because that's a strong effect gene." I like to think about it slightly differently. These are the genes where genetic impacts that have a strong effect were tolerated, because every single time we have a genetic association with disease, it depends on two things. Number one, the obvious one, whether the gene has an impact on the disease. Number two, the more subtle one, is whether there is genetic variation standing and circulating and segregating in the human population that impacts that gene. Some genes are so darn important that if you mess with them, even a tiny little amount, that person's dead. So those genes don't have variation. You're not gonna find a genetic association if you don't have variation. That doesn't mean that the gene has no role. It's simply that the gene... It simply means that the gene tolerates no mutations.
- LFLex Fridman
So that's actually a strong signal when there's no variation. That's so fascinating (laughs) .
- MKManolis Kellis
Exactly. Genes that have very little variation are hugely important. You can actually rank the importance of genes based on how little variation they have.
- LFLex Fridman
Yeah.
- MKManolis Kellis
And those genes that have very little variation but no association with disease, that's a very good metric to say, "Oh, that's probably a developmental gene, because we're not good at measuring those phenotypes." So it's genes that you can tell evolution has excluded mutations from, but yet we can't see them associated with anything that we can measure nowadays, it's probably early embryonic lethal.
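The idea of ranking genes by how little variation they carry, and then flagging the ones with no known disease association, can be sketched as follows. This is a hedged illustration only: the gene names, counts, and the 0.2 cutoff are made up, and using an observed-to-expected variant ratio as the measure of "how little variation" is an assumption of this sketch, not something specified in the conversation.

```python
from dataclasses import dataclass


@dataclass
class GeneRecord:
    name: str                  # hypothetical gene label
    observed_variants: int     # variants actually seen in the population
    expected_variants: float   # variants expected under no selection (assumed given)
    disease_associated: bool   # whether any measured trait is linked to the gene


def constraint_ratio(gene):
    """Observed / expected variants; lower means less tolerated variation."""
    return gene.observed_variants / gene.expected_variants


def rank_by_constraint(genes):
    """Most constrained (least variation) first."""
    return sorted(genes, key=constraint_ratio)


def candidate_developmental(genes, max_ratio=0.2):
    """Highly constrained genes with no known disease association."""
    return [
        g.name
        for g in rank_by_constraint(genes)
        if constraint_ratio(g) <= max_ratio and not g.disease_associated
    ]


genes = [
    GeneRecord("GENE_A", 2, 40.0, False),
    GeneRecord("GENE_B", 35, 38.0, True),
    GeneRecord("GENE_C", 5, 50.0, True),
    GeneRecord("GENE_D", 30, 30.0, False),
]

ranked_names = [g.name for g in rank_by_constraint(genes)]
candidates = candidate_developmental(genes)
print(ranked_names, candidates)
```

In this toy data, the one gene that is both depleted of variation and unlinked to any measured trait is the one the reasoning above would flag as a possible early embryonic lethal.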
- LFLex Fridman
What are all the words you just said? Early embryonic what?
- MKManolis Kellis
Lethal.
- LFLex Fridman
Meaning?
- MKManolis Kellis
Meaning that-
- LFLex Fridman
If you don't have that, then-
- MKManolis Kellis
... that embryo will die.
- LFLex Fridman
Yeah, okay. There's a bunch of stuff that, um, is required for a stable, functional organism-
- MKManolis Kellis
Exactly.
- LFLex Fridman
... across the board.
- MKManolis Kellis
Exactly.
- LFLex Fridman
For our entire, uh, for- for an entire species, I guess.
- MKManolis Kellis
If you look at sperm, it expresses thousands of proteins. Does sperm actually need thousands of proteins? No. But it- it's probably just testing them.
- LFLex Fridman
(laughs) Early on.
- MKManolis Kellis
So my speculation is that misfolding of these proteins is an early test for failure, so that out of the, you know, millions of sperm that are possible, you select the subset-
- LFLex Fridman
Right.
- MKManolis Kellis
... that are just not grossly misfolding thousands of proteins.
- LFLex Fridman
So it's kind of a- an assert, uh, that this is folded-
- 41:22 – 57:09
Genetic understanding of disease
- LFLex Fridman
Interesting. So you're... Yeah, okay, so that's what we're looking at. Then what have we been able to find in terms of which disease could be helped?
- MKManolis Kellis
Again, (laughs) don't- don't get me started.
- LFLex Fridman
(laughs)
- MKManolis Kellis
This is, um... We- we- we have found so much. Our understanding of disease has changed so dramatically with genetics. I mean, places that we had no idea would be involved. So, one of the worst things about my genome is that I have a genetic predisposition to age-related macular degeneration, AMD. So it's a form of blindness that causes you to ch- to lose the central part of your vision progressively as you grow older. My increased risk is fairly small. I have an 8% chance. You only have a 6% chance.
- LFLex Fridman
You- Um, I'm an average-
- MKManolis Kellis
Yeah.
- LFLex Fridman
By the way, when you say my, you mean literally yours? You know this about you?
- MKManolis Kellis
I know this about me. Yeah.
- LFLex Fridman
Which is kind of a... I mean, uh, philosophically speaking, is a pretty powerful thing-
- MKManolis Kellis
So-
- LFLex Fridman
... to live with. I mean, m- maybe that's, uh... So we agreed to talk again, by the way, for the-
- MKManolis Kellis
(laughs)
- LFLex Fridman
... listeners, to where we're gonna try to focus on science today and, uh, a little bit of, uh, philosophy next time. But it's, uh, interesting to think about, the more you're able to know about yourself from the genetic information, in terms of the diseases, how that changes your own view of life.
- MKManolis Kellis
Yeah. So there's, there's a lot of impact there, and there's, uh, something called genetic exceptionalism, which basically thinks of genetics as something very, very different than everything else, as a type of determinism. And, um, you know, let's talk about that next time.
- LFLex Fridman
(laughs)
- MKManolis Kellis
So basically-
- LFLex Fridman
That's a good preview.
- MKManolis Kellis
Yeah (laughs) . So let's go back to AMD. So basically-
- LFLex Fridman
Okay.
- MKManolis Kellis
... with AMD, we had no idea what causes AMD, you know? It was, it was a mystery u- un- until the genetics were worked out. And now, the fact that I know that I have a predisposition allows me to sort of make some life choices, number one. But number two, the genes that lead to that predisposition give us insight as to how does it actually work, and that's a place where genetics gave us something totally unexpected. So there's a complement pathway, which is an immune function pathway, that was in, you know, most of the loci associated with AMD. And that basically told us that, wow, there's an immune basis to this eye disorder that people had just not expected before. If you look at complement, it was recently also implicated in schizophrenia. And there's a type of microglia that is involved in synaptic pruning. So synapses are the connections between neurons, and in this whole use it or lose it view of mental cognition and other capabilities, you basically have, uh, microglia, which are immune cells that are sort of constantly traversing your brain and then pruning neuronal connections, pruning synaptic connections that are not utilized. So in schizophrenia, there's thought to be a change in the pruning, that basically if you don't prune your synapses the right way, you will actually have an increased risk of schizophrenia. This is something that was completely unexpected for schizophrenia. Of course, we knew it has to do with neurons, but the role of the complement complex, which is also implicated in AMD, which is now also implicated in schizophrenia, was a huge surprise.
- LFLex Fridman
What's the complement complex?
- MKManolis Kellis
So it's basically a set of genes, the complement genes-
- LFLex Fridman
Mm-hmm.
- MKManolis Kellis
... that are basically having various immune roles. And as I was saying earlier, our immune system has been co-opted for many different roles across the body, so they-
- LFLex Fridman
So the-
- MKManolis Kellis
... actually play many diverse roles.
- LFLex Fridman
And somehow, the immune system is connected to the synaptic pruning process.
- MKManolis Kellis
Exactly.
- 57:09 – 1:03:10
Unified theory of human disease
- LFLex Fridman
Again, maybe it's a romanticized question, but you know, there's, in physics, there's a theory of everything. Do you think it's possible to move towards an almost, uh, theory of everything of disease from a genetic perspective? So if this unification continues, is it possible that... Like do you think in those terms? Like trying to arrive at a fundamental understanding of how disease emerges, period?
- MKManolis Kellis
That unification is not just foreseeable, it's inevitable.
- LFLex Fridman
Mm.
- MKManolis Kellis
I see it as inevitable. We have to go there. You cannot be a specialist anymore if you're a genomicist. You have to be a specialist in every single disorder. And the reason for that is that the fundamental understanding of the circuitry of the human genome that you need to solve schizophrenia, that fundamental circuitry is hugely important to solve Alzheimer's. And that same circuitry is hugely important to solve metabolic disorders. And that same exact circuitry is, uh, hugely important for solving immune disorders and cancer, and you know, every single disease. So all of them have the same sub-task. And I teach dynamic programming in my class. Dynamic programming is all about sort of not redoing the work. It's reusing the work that you do once. So basically for us to say, "Oh great, you know, you guys in the immune building go solve the fundamental circuitry of everything, and then you guys in the schizophrenia building go solve the fundamental circuitry of everything separately," is crazy. So what we need to do is come together and sort of have a circuitry group, the circuitry building that sort of tries to solve the circuitry of everything, and then the immune folks who will apply this knowledge to all of the disorders that are associated with immune dysfunction. And the schizophrenia folks who are basically interacting with both the immune folks and with the neuronal folks, and all of them will be interacting with the circuitry folks, and so on and so forth. So that's sort of the current structure of my group if you wish. So basically what we're doing is focusing on the fundamental circuitry, but at the same time, we're the users of our own tools by collaborating with many other labs in every one of these disorders that we mentioned. We basically have a heart focus on cardiovascular disease, coronary artery disease, heart failure and so on and so forth. We have an immune focus on, uh, several immune disorders. 
We have a cancer focus on metastatic melanoma and immunotherapy response. We have a psychiatric disease focus on schizophrenia, autism, PTSD, and other psychiatric disorders. We have an Alzheimer's and neurodegeneration focus on Huntington's disease, ALS, and you know, AD-related disorders like frontotemporal dementia and Lewy body dementia, and of course, a huge focus on Alzheimer's. We have a metabolic focus on the role of exercise and diet, and sort of how they are impacting metabolic, uh, you know, organs across the body, and across many different tissues. And all of them are interfacing with the circuitry, and the reason for that is another computer science principle of eat your own dog food.
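The dynamic programming principle mentioned in this answer, solving each shared sub-task once and reusing the result, can be illustrated with a minimal Python sketch. The choice of problem (longest common subsequence of two short DNA-like strings) and the function name `lcs_with_reuse` are illustrative assumptions, not something taken from the conversation or from the class mentioned.

```python
from functools import lru_cache


def lcs_with_reuse(a: str, b: str) -> tuple:
    """Return (LCS length, number of reused sub-results) for strings a and b.

    Hypothetical helper for illustration only.
    """

    @lru_cache(maxsize=None)
    def solve(i: int, j: int) -> int:
        # solve(i, j) is the sub-task for the suffixes a[i:] and b[j:].
        if i == len(a) or j == len(b):
            return 0
        if a[i] == b[j]:
            return 1 + solve(i + 1, j + 1)
        # Both branches may need the same smaller sub-tasks;
        # the cache makes sure each one is solved only once.
        return max(solve(i + 1, j), solve(i, j + 1))

    length = solve(0, 0)
    # cache_info().hits counts how often a stored answer was reused.
    return length, solve.cache_info().hits


if __name__ == "__main__":
    length, reused = lcs_with_reuse("ACGT", "AGT")
    print(length, reused)
```

The cache plays the role of the shared "circuitry building" in the analogy: every caller that needs a sub-result looks it up rather than recomputing it.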
- LFLex Fridman
Mm-hmm.
- MKManolis Kellis
If everybody ate their own dog food, dog food would taste a lot better.
- LFLex Fridman
Mm-hmm.
- MKManolis Kellis
The reason why Microsoft Excel and Word and PowerPoint was so important and so successful is because the employees that were working on them were using them for their day-to-day tasks. You can't just simply build a circuitry and say, "Here it is, guys. Take the circuitry. We're done," without being the users of that circuitry because you then go back, and because we span the whole spectrum from profiling the epigenome, using comparative genomics, finding the important nucleotides in the genome, building the basic functional map of what are the genes in the human genome, what are the gene regulatory elements of the human genome. I mean, over the years, we've written a series of papers on, how do you find human genes in the first place using comparative genomics? How do you find the motifs that are the building blocks of gene regulation using comparative genomics? How do you then find how these motifs come together and act in specific tissues using epigenomics? How do you link regulators to enhancers, and enhancers to their target genes using epigenomics and regulatory genomics? So, through the years, we've basically built all this infrastructure for understanding, what I like to say, every single nucleotide of the human genome and how it acts in every one of the major cell types and tissues of the human body. I mean, this is no small task. This is an enormous task that takes the entire field, and that's something that my group has taken on along with many other groups. And we have also, and that sort of, I think, sets my group perhaps apart, we have also worked with specialists in every one of these disorders to basically further our understanding all the way down to disease, and in some cases, collaborating with pharma to go all the way down to therapeutics because of our deep, deep understanding of that basic circuitry, and how it allows us to now improve the circuitry, not just treat it as a black box.
They basically go and say, "Okay, we need a better cell type-specific wiring that we now have tiss- at this tissue-specific level." So, we're focusing on that because we're understanding, you know, the needs from the disease front.
- LFLex Fridman
So, you have a sense of the entire pipeline. I mean, o- one... M- maybe you can indulge me. One nice question to ask would be,
- 1:03:10 – 1:28:13
Genome circuitry
- LFLex Fridman
how do you, from the scientific perspective, go from knowing nothing about the disease to going, you said, uh, to, to going through the entire pipeline and actually have a drug or, or a treatment that cures that disease?
- MKManolis Kellis
So, that's an enormously long path and an enormously great challenge. And what I'm trying to argue is that it progresses in stages of understanding, rather than one gene at a time. The traditional view of biology was you have one post-doc working on this gene and another post-doc working on that gene, and they'll just figure out everything about that gene, and that's their job. What we've realized is how polygenic the diseases are, so we can't have one post-doc per gene anymore. We now have to have these cross-cutting needs, and I'm gonna describe the path to circuitry along those needs, and every single one of these paths, we are now doing in parallel across thousands of genes. So, the first step is you have a genetic association, and we talked a little bit about sort of the Mendelian path and the polygenic path to that association. So, the Mendelian path was looking through families to basically find gene regions, and ultimately genes that are underlying particular disorders. The polygenic path is basically looking at unrelated individuals in this giant matrix of genotype by phenotype and then finding hits where a particular variant impacts disease all the way to the end, and then we now have a connection, not between a gene and a disease, but between a genetic region and a disease. And that distinction is not understood by most people, so I'm gonna e- explain it a little bit more. Why do, do we not have a connection between a gene and a disease, but we have a connection between a genetic region and a disease? The reason for that is that 93% of genetic variants that are associated with disease don't impact the protein at all. So, if you look at the human genome, there's 20,000 genes. There's 3.2 billion nucleotides. Only 1.5% of the genome codes for proteins. The other 98.5% does not code for proteins. If you now look at where are the disease variants located, 93% of them fall in that outside the genes portion. 
Of course, genes are enriched, but on- they're only enriched by a factor of three. That means that still 93% of genetic variants fall outside the proteins. Why is that difficult? Why is that a problem? The problem is that, when a variant falls outside the gene, you don't know what gene is impacted by that variant. You can't just say, "Oh, it's near this gene. Let's just connect that variant to the gene." And the reason for that is that the genome circuitry is very often long range. So, you basically have that genetic variant that could sit in the intron of one gene, and an- an intron is sort of the, the place between the exons that code for proteins. So, proteins are split up into exons and introns, and every exon codes for a particular subset of amino acids, and together, they're spliced together, and then make the final protein. So, that genetic variant might be sitting in an intron of a gene. It's transcribed with a gene. It's processed and then excised, but it might not impact this gene at all. It might actually impact another gene that's a million nucleotides away.
- LFLex Fridman
So, it's just riding along, even though it has nothing to do with, uh, with its, uh, nearby neighborhood.
- MKManolis Kellis
That's exactly right. Let me give you an example.
- LFLex Fridman
Oh, man.
- MKManolis Kellis
The strongest genetic association with obesity was discovered in this FTO gene, fat and obesity associated gene. So this FTO gene was studied ad nauseam. People did tons of experiments in, on it. They figured out that FTO is in fact, uh, RNA methyltransferase. It basically cre- it, it sort of impacts something that we know, that we call the epitranscriptome. Just like the genome can be modified, the transcriptome, the transcripts of the genes can be modified. And we basically said, "Oh, great, that means that, that epitranscriptomics is hugely involved in obesity, because that, that gene, FTO, is, you know, uh, clearly where the genetic locus is at." My group studied FTO in collaboration with, you know, a wonderful team, uh, led by Melina Claussnitzer, and what we found is that this FTO locus, even though it is associated with obesity, does not implicate the FTO gene. The genetic variant sits in the first intron of the FTO gene, but it controls two genes, IRX3 and IRX5, that are sitting 1.2 million nucleotides away, several genes away.
- LFLex Fridman
Oh, boy.
- MKManolis Kellis
(laughs)
- LFLex Fridman
Uh, so what am I supposed to feel about that? 'Cause isn't that, like, super complicated then?
- MKManolis Kellis
Uh, so, so the way that I was introduced at a conference a few years ago was, uh, "And here's Manolis Kellis who wrote the most depressing paper of 2015."
- LFLex Fridman
(laughs)
- MKManolis Kellis
(laughs) And the reason for that is that the entire pharmaceutical industry was so comfortable that there was a single gene in that locus, because in some loci, you basically have three dozen genes that are all sitting in the same region of association. And you're like, "Oh, gosh, which ones of those is it?" But even that question of which ones of those is it, is making the assumption that it is one of those, as opposed to some random gene just far, far away, which is what our paper showed. So basically what our paper showed is that you can't ignore the circuitry. You have to first figure out the circuitry, all of those long-range interactions, how every genetic variant impacts the expression of every gene in every tissue imaginable across hundreds of individuals. And then, you now have one of the building blocks, not even all of the building blocks, for then going and understanding disease.
- LFLex Fridman
So, okay. So embrace the, the wholeness of the circuitry.
- MKManolis Kellis
Correct.
- LFLex Fridman
But what ... So ba- back to the question of, of starting knowing nothing to the disease and, and going to the treatment-
- MKManolis Kellis
So, so you-
- LFLex Fridman
... what, what are the next steps?
- MKManolis Kellis
So you basically have to first figure out the tissue, and I'll describe how you figure out the tissue. You figure out the tissue by taking all of these noncoding variants that are sitting outside proteins, and then figuring out what are the epigenomic enrichments. And the reason for that, you know, thankfully, is that there is convergence, that the same processes are impacted in different ways by different loci. And that's a saving grace for our field, the fact that if I look at hundreds of genetic variants associated with Alzheimer's, they localize in a small number of processes.
- LFLex Fridman
Can you clarify why that's hopeful? So, like, they show up in the same exact way in the, in the specific set of processes that we-
- MKManolis Kellis
Yeah, so-
- LFLex Fridman
... were saying?
- MKManolis Kellis
... so basically there's a small number of biological processes that underlie, or at least that play the, the biggest role in every disorder. So in Alzheimer's, you basically have, you know, maybe 10 different types of processes. One of them is lipid metabolism. One of them is immune cell function. One of them is neuronal energetics. So these are just a small number of processes, but you have multiple lesions, multiple genetic perturbations that are associated with those processes. So if you look at schizophrenia, it's excitatory neuron function, it's inhibitory neuron function, it's synaptic pruning, it's calcium signaling, and so on and so forth. So when you look at disease genetics, you have one hit here and one hit there, and one hit there, and one hit there, completely different parts of the genome, but it turns out all of those hi- hits are calcium signaling proteins.
- LFLex Fridman
Oh, cool.
- MKManolis Kellis
You're like, "Aha, that means that calcium signaling is important." So those people who are focusing on one locus at a time cannot possibly see that picture. You have to become a genomicist. You have to look at the omics, the om-, the holistic picture, to understand these enrichments.
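The kind of enrichment described here, where many scattered hits turn out to fall in the same process, is commonly quantified with a hypergeometric test. The following is a minimal Python sketch of that idea, not the analysis used by the speaker's group. The 20,000-gene total is the figure quoted earlier in the conversation; the pathway size, number of hits, and overlap are made-up numbers for illustration, and the function name `enrichment_p_value` is hypothetical.

```python
from math import comb


def enrichment_p_value(genome_genes: int, pathway_genes: int,
                       hit_genes: int, hits_in_pathway: int) -> float:
    """Probability of seeing at least `hits_in_pathway` pathway genes among
    `hit_genes` genes drawn at random, without replacement, from the genome
    (upper tail of the hypergeometric distribution)."""
    total_draws = comb(genome_genes, hit_genes)
    max_overlap = min(pathway_genes, hit_genes)
    tail = sum(
        comb(pathway_genes, k) * comb(genome_genes - pathway_genes, hit_genes - k)
        for k in range(hits_in_pathway, max_overlap + 1)
    )
    return tail / total_draws


if __name__ == "__main__":
    # Hypothetical numbers: 100 disease-associated genes, of which 10 belong
    # to a 200-gene calcium-signaling pathway, in a 20,000-gene genome.
    # By chance alone one would expect about 100 * 200 / 20000 = 1 such gene.
    p = enrichment_p_value(20000, 200, 100, 10)
    print(f"p-value: {p:.3g}")
```

Looking at one locus at a time gives no way to compute such a number, which is the point being made: the signal only appears when all hits are considered together.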
- LFLex Fridman
But you, you mentioned the convergence thing. So the-
- MKManolis Kellis
So-
- LFLex Fridman
... what-, whatever the thing associated with the disease shows up-
- MKManolis Kellis
So let me explain convergence.
- LFLex Fridman
Yeah.
- MKManolis Kellis
Convergence is such a beautiful concept.
- 1:28:13 – 1:39:50
CRISPR
- MKManolis Kellis
So you know CRISPR, right? CRISPR is this genome guidance and cutting mechanism. It's what George Church like to, likes to call genome vandalism. So you basically are able to... (laughs)
- LFLex Fridman
Good line.
- MKManolis Kellis
You can basically take a guide RNA that you put into the CRISPR, uh, system, and the CRISPR system will basically use this guide RNA, scan the genome, find wherever there's a match, and then cut the genome. So, um, you know, it's- I digress, but it's a bacterial immune defense system. So basically, bacteria are constantly attacked by viruses. But sometimes they win against the viruses, and they chop up these viruses, and remember as a trophy. Inside their genome, they have these loci, these CRISPR loci. That basically stands for clustered repeats interspersed, et cetera. So basically, it's, it's an intersperse repeats, uh, structure, where basically you have a set of repetitive regions, and then interspersed were these variable segments that were basically matching viruses. So when this was first discovered, it was basically hypothesized that this is probably a bacterial immune system that remembers the trophies of the viruses that it managed to kill, and then the bacteria pass on, you know, they, they sort of do lateral transfer of DNA, and they pass on these memories so that the next bacterium says, "Ooh, you killed that guy. When that guy shows up again, I will recognize him." And the CRISPR system was basically evolved as a bacterial adaptive immune response to sense foreigners that should not belong, and to just go and cut their genome. So it's an RNA-guided RNA-cutting enzyme, or an RNA-guided DNA-cutting enzyme. So there's different systems. Some of them cut DNA, some of them cut RNA, but all of them remember this, uh, sort of viral attack. So what we have done now as a field is, you know, through the work of, you know, uh, Jennifer Doudna, Emmanuelle Charpentier, Feng Zhang, and many others, is co-opted that system of bacterial immune defense as a way to cut genomes. You basically have this guiding system that allows you to use an RNA guide to bring enzymes to cut DNA at a particular locus.
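The "scan the genome, find a match, then cut" behavior described above can be pictured with a deliberately simplified Python sketch. It is a toy string model under several assumptions: the guide is written as the DNA sequence it targets rather than as RNA, only exact matches on one strand are considered, the cut is placed at the end of the match, and real-world requirements of specific CRISPR enzymes are ignored. The function names are hypothetical.

```python
def find_guide_matches(genome: str, guide: str) -> list:
    """Return every start position where `guide` occurs exactly in `genome`."""
    positions = []
    start = genome.find(guide)
    while start != -1:
        positions.append(start)
        # Continue one base further so overlapping matches are also found.
        start = genome.find(guide, start + 1)
    return positions


def cut_at_first_match(genome: str, guide: str) -> tuple:
    """Split `genome` into two fragments at the first guide match.

    Toy assumption: the cut falls right after the matched sequence.
    If there is no match, the genome is returned uncut as a 1-tuple.
    """
    matches = find_guide_matches(genome, guide)
    if not matches:
        return (genome,)
    cut_site = matches[0] + len(guide)
    return genome[:cut_site], genome[cut_site:]


if __name__ == "__main__":
    toy_genome = "TTACGGATCCAAGT"  # made-up sequence
    toy_guide = "GGATCC"           # made-up guide
    print(find_guide_matches(toy_genome, toy_guide))
    print(cut_at_first_match(toy_genome, toy_guide))
```

The sketch only captures the targeting logic: changing the guide string changes where the cut lands, which is what makes the system programmable.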
- LFLex Fridman
That's so fascinating. Just, uh, so this is like already a natural mechanism-
- MKManolis Kellis
Mm-hmm.
- LFLex Fridman
... a natural tool for cutting that was useful-
- MKManolis Kellis
Mm-hmm.
- LFLex Fridman
... in this particular context.
- MKManolis Kellis
Yeah.
- LFLex Fridman
And we're like, "Well, we can use that thing to actually..." it's a nice tool that's already in the body.
- MKManolis Kellis
Yeah, yeah.
- LFLex Fridman
And then we-
- MKManolis Kellis
It's not in our body. It's in the bacterial body. It was discovered by the, by the yogurt industry.
- LFLex Fridman
(laughs)
- MKManolis Kellis
They were trying to make better yogurts, and they were trying to make their bacteria in their yogurt cultures more resilient to viruses. And they were studying bacteria, and they found that, wow, this CRISPR system is awesome. It allows you to defend against that. And then it was co-opted in mammalian systems that don't use anything like that as a ba- as a, as a targeting way to basically bring these DNA-cutting enzymes to any locus in the genome. Why would you want to cut DNA to do anything? The reason is that our DNA has a DNA repair mechanism, where if a region of the genome gets randomly cut, you'll basically scan the genome for anything that matches and sort of use it by homology. So, the reason why we're diploid is because we now have a spare copy. As soon as my mom's copy is deactivated, I can use my dad's copy. And somewhere else, if my dad's copy is deactivated, I can use my mom's copy to repair it. So, this is called homologous based repair.
- LFLex Fridman
So, all you have to do is the- the cutting and-
- MKManolis Kellis
That's exactly right.
- LFLex Fridman
... you don't have to do the fixing? (laughs)
- MKManolis Kellis
That's exactly right. You don't have to do the fixing.
- LFLex Fridman
(laughs) 'Cause it's already built in.
- MKManolis Kellis
That's exactly right. But the fixing can be co-opted-
- LFLex Fridman
That's awesome.
- MKManolis Kellis
... by throwing in a bunch of homologous segments-
- LFLex Fridman
Oh.
- MKManolis Kellis
... that instead of having your dad's version-
- LFLex Fridman
Interesting.
- MKManolis Kellis
... have whatever other version you'd like to use.
- LFLex Fridman
Aha. So the fi- so you c- you then control the fixing by throwing in a bunch of other stuff.
- MKManolis Kellis
That's exactly right.
- LFLex Fridman
Doesn't work.
Episode duration: 2:34:57
Transcript of episode Aq9UPIXbtKI