No Priors Ep. 6 | With Daphne Koller from Insitro
- 0:00 – 1:49
Introduction
- SGSarah Guo
(digital intro music) Daphne, welcome to the podcast.
- DKDaphne Koller
Thank you, Sarah. It's a pleasure to be here.
- SGSarah Guo
So, as we were saying, we won't ask you to walk through every part of your amazing life story, but you came to biology as a computer science application years into your career. What sparked you going down that route?
- DKDaphne Koller
My initial interest in biology came from the technical side in the sense that the data sets, this is way back when in the mid, uh, 90s, the data sets that were available to machine learning research at the time were kind of boring and not very inspiring, so things like classifying text, um, into 20 different news groups. And I found that there were more interesting data sets technically to be had on the biology side back then as we were starting, for example, to measure the activity of, uh, genes across the entire genome, um, in multiple samples. So initially, it was really more f- more from a technological perspective, but then I ended up actually having, um, an interest in biology in its own right and ultimately ended up having a bifurcated lab at Stanford where half my lab did core machine learning work published in traditional computer science venues and the other half did core biology work that was published in Nature and Cell and Science. And what was really interesting is that most of my computer science colleagues had no idea that I did biology. Most of my life science colleagues had no idea I was in a computer science department, so it was a bit of a bifurcated existence, but it was a lot of fun.
- SGSarah Guo
Amazing. And we'll come back to, you know, full circle working on this problem now. One more historical question for you. Uh, you wrote the book on probabilistic graphical models and I spent about half an hour looking in my house for this book. I have it somewhere, but I wanted to have it for the podcast. When I asked a mutual friend, Andrew, what I should ask you, he suggested, you know, what motivated that work and, um, and how that field has changed.
- DKDaphne Koller
So I think that, um, just like, like in most fields,
- 1:49 – 4:34
How Daphne combined her biology and tech interests and ran a bifurcated lab at Stanford
- DKDaphne Koller
there is a swing of the pendulum. A lot of, uh, the early work in probabilistic graphical models was hugely influential in bringing, um, artificial intelligence more into the world of machine learning and, uh, and working with numerical data rather than just symbolic AI. Um, and then I think the advent of, um, deep learning, uh, pushed that to the side a little bit because there was so much power that could be gained from basically the kind of pattern recognition, um, from raw inputs, um, raw images, text and so on, without having to worry very much about interpretable representations. What I think we're starting to see right now is a, uh, pendulum starting to swing back in the sense that there is a greater understanding that you really need a bit of both. You need that, uh, hugely powerful pattern recognition that we get from deep learning, but you also need the ability to reason about things like causality and you also need some interpretability of your deep learning model so that you can potentially convey to a clinician why you made the decision that you did. And so what we're ending up with as, uh, a really powerful paradigm is some kind of synthesis of the ideas from both of these disciplines coming together.
- SGSarah Guo
Let's focus on the problem you're working on, uh, now and for the last, um, h- how long's it been? Four, four years or...
- DKDaphne Koller
Four and a half, yeah.
- EGElad Gil
Be- before we even dive into that, I had a really quick question just on the career side still because, you know, you went from, um, Stanford, I believe and then going and co-founding Coursera with Andrew Ng, and then, uh, you went to Calico right after that, right? And I, I... or, uh, you know, a few years after that. I'm sort of curious, what made you decide to go into Calico because you mentioned your career was split between life sciences and computer sciences, and so you went down the computer science online learning route and then you went back into biology, so I'm a little bit curious what drove you back in.
- DKDaphne Koller
So actually I'm gonna go back and answer the earlier part of that which is what took me to Coursera in the first place because I think it feeds into what took me away. Um, so the, uh, throughout m- much of my career at Stanford, I had an increasing sense of urgency that I needed to make an impact in the world, a real impact on real people, not something that was at one step or two steps removed by training great students and having them go and do amazing things, but by something that I get to experience myself. And so when the work that I was doing at Stanford on technology-assisted education gave rise to the launch of those first, um, Stanford massive open online courses and we saw h- just how much impact those were having, I felt like it was too amazing of an opportunity to pass up and just assume that if I didn't do this then somehow other people would take on the flag and carry it forward. I felt like there was an incredible need to go and actually have that impact myself and make sure that it was done right. And so that led to my departure from Stanford on what was supposed to be a two-year leave of absence to go and found Coursera and I had the full
- 4:34 – 14:14
Why Daphne resigned an endowed chair at Stanford to build Coursera
- DKDaphne Koller
intention to go back to Stanford at some later point and resume my faculty life. Um, that didn't happen. Stanford has a very strict leave of absence policy and when they came two years later and said, "So are you coming back?" and I responded that it wasn't really the right time, I needed to see the project through for another year or so, and they said that that was not an option, I ended up, uh, doing this completely crazy thing which is resigning an endowed chair from Stanford and, um, staying in industry. My mother thought I was nuts. I think she still thinks I'm nuts.
- SGSarah Guo
(laughs)
- DKDaphne Koller
But, um, I ended up staying at Coursera for a total of about five years. And so five years was kind of, um, a reasonable point to take a step back and, and reflect, and when I did that, this was in, um, early 2016, I realized that while I'd been deep in the trenches, um, building Coursera, the machine learning world had totally transformed because as a reminder I left, um, Stanford for Coursera in late 2011 just before the machine learning revolution really took off in 2012. And so I suddenly lifted my head, looked around me and said, "Wow, machine learning really is transforming the world, uh, but not really having much of an impact on the life sciences." Um... And so, um, I left Coursera in, in good hands. Coursera is a wonderful company but it's not really a deep technology company and certainly not a science company, and decided that where I could have a really disproportionate impact was in bringing these, these two disciplines together because there's just not a lot of people who had the benefit, as I did, of spending basically, you know, 20 years doing machine learning and maybe, um, a decade doing, uh, doing biology and could really speak both languages and figure out how to synthesize them. But since I'd been in industry for five years and away from science and even away from machine learning, I didn't quite know where I wanted to go and what I wanted to do, and so I turned, um, for advice, actually, more than anything else, to Art Levinson, who is the former CEO of Genentech, the former, um, chairman of, of Google and Apple. And I figured that if there was anyone who would know how to bring those two fields together, he was probably uniquely qualified to do that, um, and so I reached out to him for advice 'cause we'd run into each other a num- in a number of different places. I'd been on the thesis committee of his son Jesse, for example. Um, and so I asked him for advice and he was very, um, I think admittedly self-serving in his advice.
He, he said, "You should come to Calico."
- EGElad Gil
(laughs)
- DKDaphne Koller
Uh, and, y- and honestly I didn't know much about what Calico did other than it worked on aging which seemed like a really important problem to think about, but I did know that it's not many times that one has the opportunity to work with a luminary like Art Levinson and I'd also, by that point, met Hal Barron who was another person I have tremendous respect for. And I figured this was, you know, a really interesting way to spend some time and, and learn from these wonderful people, and so I did that. Um, and so ... And I learned a ton during my time at Calico. It was only 18 months because ultimately I realized that I didn't want to be at a company that focused on a particular biology but really f- uh, build a platform for doing drug discovery differently, addressing some of the points that you, Sarah, made in your introduction about how drug discovery is this incredibly fraught, um, largely unsuccessful and very expensive endeavor, and so how could I, uh, how could I make that happen differently? And it didn't seem like Calico was necessarily the right place to take on what was a platform company build and so that's why I left and founded insitro.
- EGElad Gil
Were there any specific insights from Calico that drove the founding of insitro or was it just more the exposure to biopharmaceuticals and how things are developed that really drove your thinking that maybe ML and AI would have a real application area there?
- DKDaphne Koller
Uh, I think that it was, um, really the exposure for the first time to how biopharmaceuticals were developed as you said. At, uh, Stanford I'd worked a lot at the intersection of machine learning, data science and biology, and, um, realized just, uh, you know, how much power, um, these, uh, machine learning technologies can have when applied even to small datasets and certainly as the technology had evolved tremendously since then, datasets were becoming considerably larger and richer. There was an even larger opportunity to make a huge difference, uh, and so that's what led to my, uh, move back into that intersection, and then therefore to Calico, but, um, but I think it was really the realization that I- I guess two-fold. One is that, uh, the- the- the way in which you turned insights into, um, therapeutic interventions was so old-fashioned and so, um, un- um, unaccommodating of the use of data that I felt there had to be a better way to do this which I think that the industry has since started to demonstrate across the board, um, in many different companies. Uh, and I think the other thing that made me make that shift is that whereas data is, uh, in life science is growing tremendously, data in aging and specifically human aging is really hard to get because human aging is a very long process, and, um, and in order to get data on, um, the longitudinal trajectory of human aging today, we would have needed to start collecting data, you know, 20, 30 years ago and the cohorts are rather small. And so I felt like there was a huge opportunity in this intersection that maybe aging wasn't the first place where one could, uh, most beneficially apply it from at least my perspective.
- EGElad Gil
Yeah. When you look across, um, drug development, because I guess right now it costs a billion, a billion and a half dollars to desi- to develop a drug successfully. It takes a decade plus to actually get there, and then a lot of the, um ... When I look at the potential, uh, areas that are challenging in the industry, the sort of the initial small molecule selection and design or alternatively the pathway or cell type they are using, um, separate from that there's the clinical trial itself and how do you figure out who to enroll and how to deal with the data and the patients and everything else. There's all the calibration around diagnostics and endpoints and clinical endpoints and how you think- and all those places seem like there could be real uses of AI. How did you choose what insitro is actually gonna do given how much room there actually is to innovate in this area relative to data, to your point. I mean, it's just- it's shocking how little is done, right? It's like awful. (laughs)
- DKDaphne Koller
I completely agree and, um, and yeah, in some sense the wealth of opportunities here is one of the biggest challenges because everywhere you look there is a big opportunity for machine learning to be deployed in a potentially quite significant way. Um, sometimes I have these discussions with, uh, the increasingly fewer number of people within biopharma who think that, uh, yeah, this machine learning thing is a fad that will go away or maybe that machine learning is gonna be this thing that helps you in a particular point, um, area like, you know, X-ray crystallography. It can improve this narrow little vertical but that's pretty much what it's going to do, and my analogy is that it's not like X-ray crystallography. It's like computers. You're gonna use it everywhere and it's going to be transformative everywhere. It's not gonna be the silver bullet unless you figure out how to use it most effectively, um, but- but the opportunities are pretty much endless, um, across the entire process from beginning to end. So with that, how did we pick what, um, what we end up working on? You know, I thought about this and you could divide the, the process, as many do, into three large chunks. One is the original biology discovery, which is what targets do we employ in what indications and maybe in what patient population is kind of the first chunk. Then there's turning those targets into therapeutic matter, um, which is a molecular design process. And then at the end, there is the enablement of the clinical trials in terms of actually actualizing patient selection or, um, biomarkers for efficacy and things like that. And all of those are important and all of those are valuable, but if you look at the actual numbers of what makes drug discovery so expensive, it is the fact that 95% of drug programs fail. They just do not succeed.
And the biggest reason why they don't succeed is not because the clinical trial was poorly designed, that still happens, but it's not the biggest reason, nor is it because the molecule doesn't hit its target and modulate it in the right way. That too happens, but again, it's- it's an increa- increasingly smaller number of situations because pharma companies have gotten better and better at making, uh, therapeutic matter. The place where most programs fail is because, um, is because we're just not modulating the right thing. It's the wrong target in the wrong indication or the wrong patient population. So if you really wanna bring down that $2.5 billion number, what you have to do is to bring down this completely mind-blowing s- uh, statistic of 95% of drug programs fail into something that is much more manageable so that a successful program doesn't have to carry on its back all of the many failures, um, expensive failures of all the things that didn't quite make it. And so I figured that it was maybe the hardest thing to do, but also the thing that was gonna be the most impactful.
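A rough sketch of the arithmetic behind this point, in Python. It assumes, for simplicity, that every program costs the same whether it succeeds or fails (in practice failures often stop early and cost less). The per-program figure of $125 million is a hypothetical value chosen only so that a 5% success rate reproduces the $2.5 billion number mentioned above, and the function name is illustrative.

```python
def cost_per_approval(cost_per_program: float, success_rate: float) -> float:
    """Expected spend per approved drug when each success also pays for the failures."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    # On average 1 / success_rate programs are run for every one that succeeds.
    return cost_per_program / success_rate


# Hypothetical average spend per program (an assumption, not a figure from the interview).
COST_PER_PROGRAM = 125e6

for rate in (0.05, 0.10, 0.20):
    total = cost_per_approval(COST_PER_PROGRAM, rate)
    print(f"success rate {rate:.0%}: ${total / 1e9:.2f}B per approved drug")
```

Under these assumptions, doubling the success rate halves the cost carried by each approved drug, which is why target selection is singled out here as the highest-leverage place to intervene.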
- SGSarah Guo
So how do you approach that problem as a- a computer science and now computer science and biology person, the target identification problem?
- DKDaphne Koller
Yeah, you know, it's really hard, right? Because when you think about it, it's the one area where you really, uh, don't have the right type of training data, at least not obviously, because the question you're asking yourself is,
- 14:14 – 18:33
How insitro approaches target identification problems and training data
- DKDaphne Koller
"If I make this therapeutic intervention in this patient, what is it gonna do clinically?" And that is the thing about which you don't have data until the very end of the process, which is called the clinical trial. And so how do you train a machine learning model that doesn't have training data to train it, right? And so the, uh, the direction that we've chosen to take is actually a two-pronged approach and it's the synthesis of the two that we think is particularly powerful. Uh, we bring in t- We bring in data from two quite different sources. One is data from human individuals where we don't get to do experiments, but we have experiments of nature. Um, each of us is an experiment of nature, where nature has modulated our genetics into, um, you know, different types of, um, activity levels or- or, um, of- of individual genes, where some of them behave this way and others behave that way. And we can look at that mapping from genotype to phenotype as a surrogate of what a therapeutic intervention would do in those humans. So that's great, but it limits you to those experiments of nature, and the experiments of nature are not necessarily, um, the same as what a therapeutic intervention would do. And so what we've, uh, done in parallel is to create our own data in our own wet lab where we make interventions in cellular systems and measure the phenotypic consequences there, again, um, using very, uh, large-scale data, uh, with very high-content modalities. And so the machine learning is actually used, I would say, in three different ways. One is to interrogate, um, the phenotypic consequences of genetic variation in a human, looking at very high-content data like imaging, where we know machine learning works really well, like different types of omic modalities, transcriptomics, proteomics, and so on, to really understand that mapping between genetics and phenotype. 
We similarly look at the mapping between genetic interventions, which, in this case, we get to actually a- um, direct ourselves by doing genome editing of cells and say, "What is the phenotypic consequences of modulating this gene in this cell background and reading out a large, high-content amount of, uh, uh, high-content data to really understand how cell state, um, responds to these interventions?" And so the machine learning is used on each of those two separately and then also to bring them together so that you can kind of think about building cellular models that are predictive of human clinical outcomes, which is ultimately what we're looking to do, is to replace the sort of, um, untranslatable animal models with something that is much more driven from human biology.
- SGSarah Guo
When you think about, again, just like the focusing of Insitro, what domains do you decide to work in first? Because this approach should be quite horizontal. Of- of course, then you have, you know, complexity of what that cellular model can be.
- DKDaphne Koller
It for sure is, and again, uh, focusing has always been a challenge in the sense that there's so many opportunities and how do we say no to some of them? So we've, uh... T- What we've done is tried to go in areas where we think there is both a large unmet need in the sense that the current tools that we're deploying are just not very effective and at the same time, where we think that the technologies that we are developing internally, uh, provide us with a unique differentiated advantage. So one of those areas has been, um, in neuroscience because as we know, the unmet need there is humongous. There are so very few effective therapeutic interventions in neuroscience and that's partly because the, uh, model systems that we've been using, specifically animal models, while one can quibble about in which other therapeutic areas they are more or less, um, relevant, in neuroscience, it is very clear that they are probably not. And that's one of the reasons why things work so well in-... whatever, curing mice of schizophrenia, whatever the heck that means, and then not having much of an impact in human schizophrenia because it's not really even the same disease, right? Um, the other... So that's on the unmet need side. And on the opportunity side, we know that induced pluripotent stem cells are actually, um, relatively easily, um, uh, differentiated into neurons. You can actually see cellular phenotypes that
- 18:33 – 24:08
What are pluripotent stem cells and how insitro identifies individual neurons
- DKDaphne Koller
are quite disease relevant in those neurons.
- SGSarah Guo
Can I stop you for a second? Just 'cause we have mostly a, like, computer science, not biology audience. And can you just explain like how you get a Daphne or an ****** neuron at all?
- DKDaphne Koller
Okay. So, um, in order to get a Daphne neuron in the lab, um, you take either a white blood cell from me or a skin cell from me, and you go through a process of what's called reprogramming, um, which received... A technology which received the Nobel Prize a number of years ago, which allows you to turn it into what is basically a stem cell, um, which means a cell that can then be... uh, that can then take any lineage. It doesn't have to form a skin cell, which is where it came from. It can form a liver cell or a heart cell or, or a brain cell. And so, and then with that, um, stem cell, which is why it's called an induced, because you force it to be pluripotent, which means it can go on any different direction, stem cell or it's called iPSC. Um, and you can, depending on what you do to it, it can now be transformed, as I said, into a, a neuron or a, or a cardiomyocyte, which is a heart cell, and so on and so forth. And so you can effectively get the effect of our genetics in these cellular systems. And similarly, uh, you can make an even more pointed change by editing those cells and say, if there is a, uh, genetic variant that we know causes a particular disease or significantly increases the chances of such a disease, we can introduce that into different genetic backgrounds and then do a, a sort of almost like an, uh, in vitro case control, which is same cell with and without the genetic variant, what are the differences? And that very carefully positioned, for tech people, it's like an A/B test. Um, uh, this, this in vitro A/B test is something that allows us to really get at those differences that are specifically associated with this disease-causing variant. So th- so that is one aspect of the capability that drove us towards our, um, therapeutic areas. The other is, as I said, we have a two-pronged strategy. One is the data that we produce in the lab, and one is data that we collect from humans.
So we also looked for areas in which, um, d- data from humans is relatively readily available. And in neuroscience, we have an increasing number of brain MRIs. I think there will be even more now with the approval of some of the ear- earliest Alzheimer's drugs because it's gonna be part of the process by which, uh, people are, um, either selected to receive the drug or not, depending on whether their brain MRI shows certain f- um, certain aspects of disease. Um, the other areas that we've gone into are metabolism and oncology because again, those are areas where relevant, disease-relevant data that is high content, that is unbiased and truly informative about the disease state is collected quite abundantly as part of the standard of care. And so those are, again, we tried to look for areas where there's large unmet need and where the two types of capabilities that we bring to bear can be deployed.
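The in vitro case-control comparison described in this answer (the same cell line with and without a variant, repeated across genetic backgrounds) can be sketched as a paired analysis. This is a minimal illustration only: the readout values, the cell-line names, and the `paired_effect` helper are all hypothetical, and a simple paired t-statistic stands in for whatever analysis insitro actually uses.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical phenotype readouts per genetic background:
# (unedited cells, cells edited to carry the variant).
readouts = {
    "line_A": (1.0, 1.4),
    "line_B": (2.0, 2.5),
    "line_C": (1.5, 1.8),
    "line_D": (0.8, 1.3),
}


def paired_effect(pairs):
    """Return the mean within-line difference and its paired t-statistic."""
    diffs = [edited - control for control, edited in pairs]
    avg = mean(diffs)
    std_err = stdev(diffs) / sqrt(len(diffs))
    return avg, avg / std_err


effect, t_stat = paired_effect(readouts.values())
print(f"mean variant effect: {effect:.3f}, paired t-statistic: {t_stat:.2f}")
```

Pairing each edited line with its own unedited control removes the baseline differences between genetic backgrounds, which is the same reason an A/B test compares otherwise identical conditions.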
- SGSarah Guo
I was gonna ask, uh, if, if you think about, um, something like, uh, you know, neurodegenerative diseases, Alzheimer's, et cetera, like, you know, is it single cell? Who can, uh, who can say, but feels, feels unlikely? Uh, h- what, what's beyond single cell? And do you guys do organoid research? Like what... Is that within the scope of insitro?
- DKDaphne Koller
Yeah. No, that's a great question. So, um, a lot of complex diseases don't, um, are not encompassed within a single cell lineage. However, I think even there, one can, um, study in many cases, not always, the disease state by looking at a cell type that is clearly relevant to the disease and perhaps pushing it out of its comfort zone. So, for example, in some of the work that we've done in metabolic disease, I mean, it's clear that hepatocytes are not the be-all and end-all of, uh, of what it takes to make a diseased liver, but you can push the hepatocyte out of its comfort zone by putting in the right combination of, you know, fatty acids and maybe various, um, immune system factors or whatever to create a disease state that is much more, um, similar to what you see in its, uh, in its natural environment. Um, that having been said, it's clearly the case that we're not going to be able to recapitulate the entire complexity of, um, a disease state for a lot of those diseases. And so one of the things that we do, and this is in the spirit of being pragmatic and, and prioritizing, there's plenty of things that, uh, we can do today where the, um, where the disease does manifest sufficiently in a single cell lineage. And so we go after those first and we defer some of the other ones to a later stage because technologies such as organoids, for example, that encompass multiple cell types in a single, you know, little micro brain or micro liver, whatever, or sometimes these things called organs on chips, which allow you to actually create things that are more than just even a single organ. They start to create sort of the flow between different organ systems. Those are technologies that other people are currently developing. They're getting better by the day. And so we feel like there's a lot of value that we can bring with the capabilities that are out there, even if we know they're reductionist, even if we know they don't fully capture the disease. 
Um, but they capture enough diseases so that we can bring medicines to patients. And maybe in three years, we'll have another tranche of diseases that are unlocked by the technological tidal wave that we're all riding.
- EGElad Gil
You mentioned there were sort of two areas of, um...
- 24:08 – 26:48
How insitro operates as an engine for drug discovery and partners to create the drugs themselves
- EGElad Gil
a, exploration, uh, for insitro right now. One was, um, metabolic disease and, and cancer, I guess that's really three areas, and then the second is, um, neurological areas. I was just sort of curious how far you wanna take these in terms of the actual development of drugs in-house versus partnering out, and then I noticed you had things like relationships with BMS and others for ALS and dementia and a few other areas. So a little bit curious about how far you w- actually wanna take the development of drugs yourself versus partnering with others, and how you think about that in the context of building a company and culture.
- DKDaphne Koller
That's a, a great question and the answer is that we are going to be relatively pragmatic about this as well and, and, um, do what makes sense in terms of maximizing the impact that we have on patients. So one of the things that we have going for us, I think, over a lot of other companies is that what we've built is a... is an engine for generating novel insights, novel targets. So it's not, um, the situation that a lot of companies are in which is you have one program, two programs, and if you kind of sell those off, then you're left with an empty cupboard and then what do you do? You're, you're not a company anymore. So what we think is, is because we have this engine, we have the opportunity to have some of those programs be done in partnership with others, some of those perhaps, um, even be entirely out-licensed to others while the engine continues to give us additional insights, maybe even better insights as we expand, for example, into new indications using new technologies. On the other hand, to think about it from the complementary perspective, some of the targets that we, uh, that we find, uh, ourselves, uh, having emerged from our platform are ones around which there's already a drug available because, you know, there's only 20,000 genes and so sometimes someone may have developed a drug, just didn't deploy it in the right indication or didn't deploy it in the right patient population, and we don't believe that the only thing that, uh, makes, uh, our existence worthwhile is if we come up with a new chemical matter towards those targets. So we might go to the asset owner and say, "Hey, let's work together to bring that asset to patients faster," and, and that can usually, um, sa- sh- shave off, you know, two, three, maybe even five years from the development of a program because you've already made the drug, sometimes you've already put it in people, you've shown that it's safe, you have a good biomarker for when it's working and when it's not.
All those things that can really slow down a program if you're starting from absolutely, um, square one and a brand new target. And so we, we hope to be very pragmatic in terms of what we develop in-house and what we develop with others, uh, uh, with the goal of really trying to maximize the impact that the platform can bring to as many
- 26:48 – 33:19
Role of regulations, clinical trials and disease progression in drug delivery
- DKDaphne Koller
patients as possible.
- EGElad Gil
How much work, if any, are you doing on the biomarker side? Because I think one of the points that you just raised is really interesting. When I look at a lot of clinical drug development, a lot of it is waiting for clinical endpoints that may take months or years to really substantiate, and so sometimes the FDA or others will be willing to accept certain clinical biomarkers as sort of intermediary steps or things that tend to vary relative to the trait or the outcome. Are you doing biomarker development as well? Because that seems like such a great area for the applications of ML and yet it seems like there's so little work in terms of actually translating ML into the real world for biomarkers in particular.
- DKDaphne Koller
I completely agree and I think there's, um, you know, there's research that shows that, uh, drugs that have a biomarker are about twice as likely to be successful in the clinic as ones that do not. By the way, there's also data that show that, um, drugs that have support in human genetics are twice as likely to succeed as ones that do not. And so we are deep believers in both of those and, um, I think that because our focus is so much on human data, a lot of the insights that come out of, um, analysis of human clinical data does actually give you a biomarker for which patients are likely to benefit from a parti- particular therapeutic intervention. And so in some ways you can think of biomarker, um, clinical biomarkers as coming out almost for free, if you will. Well, not for free but sort of as a consequence of the work that we're doing anyway, as long as we pay attention and don't just say, uh, as a lot of, um, a lot of companies do that, "Oh, we've found a target, we're just gonna go and apply it in all comers." Because honestly that is one of the big things that causes drugs to fail, is that you are, um, trying to apply it more broadly, if I'm being cynical, sometimes it's so as to maximize the revenues that you can get from a drug versus trying to figure out exactly in which patients it's gonna work. And one of the things you asked earlier, Elad, which, what did I learn at Calico and one of the things that I learned there, there were a lot of former Genentech people there as one would expect given, um, the pedigree of the company. Um, some... 
one of them told me that if, uh, one of the earliest precision oncology drugs was Herceptin and that goes after HER2-positive breast cancer patients, that if they had tried to run a Herceptin clinical trial in an all-comer breast cancer population, you would have needed a population of 10,000 in the clinical trial which is a very large clinical trial, and even then you might not have seen a sufficiently strong statistical signifi- statistically significant signal because the adverse side effects, and every drug has adverse side effects, in the non-responders may have outweighed the benefits in... with the very strong benefits in the responders. So the fact that they had the right patient population in the clinical development of Herceptin was absolutely critical to, um, creating a successful and, um, and w- you know, uh, reasonably sized clinical trial. And so I think that that is a, s- a, uh, a pattern that many more people in the drug development industry should be following and frankly a lot of them have started to see the benefits of this, so we're not the only ones going there. But I do think, to your point Elad, that we have a, um, differentiated technology stack that will hopefully allow us to get even better, more accurate biomarkers via machine learning and high content data.
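The dilution effect in the Herceptin example can be illustrated with a standard normal-approximation sample-size formula for comparing two arms. This is a sketch under stated assumptions and not the actual trial design: the 20% responder fraction and the effect size of 0.5 standard deviations are hypothetical, the function name is illustrative, and the calculation ignores the side effects in non-responders that are mentioned above, which would make the all-comer case worse still.

```python
from math import ceil
from statistics import NormalDist


def patients_per_arm(effect: float, sd: float = 1.0,
                     alpha: float = 0.05, power: float = 0.8) -> int:
    """Patients needed per arm to detect a mean difference `effect` (two-sided test)."""
    z = NormalDist().inv_cdf
    return ceil(2 * ((z(1 - alpha / 2) + z(power)) * sd / effect) ** 2)


# Hypothetical numbers: responders improve by 0.5 SD, and 20% of all comers respond.
EFFECT_IN_RESPONDERS = 0.5
RESPONDER_FRACTION = 0.2

enriched = patients_per_arm(EFFECT_IN_RESPONDERS)
# In an all-comer trial the average effect is diluted by the non-responders.
all_comers = patients_per_arm(EFFECT_IN_RESPONDERS * RESPONDER_FRACTION)

print(f"biomarker-selected trial: {enriched} patients per arm")
print(f"all-comer trial:          {all_comers} patients per arm")
```

Because the required sample size scales with the inverse square of the average effect, diluting the effect to one fifth multiplies the trial size by roughly twenty-five under these assumptions.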
- EGElad Gil
Yeah. You mentioned two really key points I feel to expediting drug delivery. There's a biomarker part and then there's finding the right patients relative to the drug and I think that, uh, that actually also is very famous for the HRD drugs where, um, there's a specific set of pathways that if you didn't actually select out the patients with specific mutations, the drugs didn't work, and the second you focused on that population, it worked extremely well. Right? And so there's lots of examples of that where you just have to figure out who you're actually targeting. There's a really great interview from a couple years ago with Janssen who started Janssen Pharmaceuticals where he talked about how he felt that a lot of drug regulation and the length of time it takes to develop drugs was driven by almost an overly safety-ist view of the world. Like, there wasn't a strong series of cost-benefit trade-offs or willingness to sub-segment patient populations or really look at data in a rich way. And we've seen recently with things like COVID that we can really expedite the drug development, vaccine development, everything, right? We, we did things in six months that normally would take 10 years during COVID because we decided we could do it. How much time do you think an ML-first company or a ML-first approach can really cut out of drug development? Or do you think it's purely a regulatory issue in terms of those timelines?
- DKDaphne Koller
Uh, well, I think is... That's a complicated question and I think has elements of both. I think, first, there does need to be a discussion with the regulators around, uh, what, uh, what might be feasible, um, from, uh, from a regulatory approval perspective about different kinds of biomarkers. There is also elements that I think are very legitimate questions, like how do you collect the relevant biomarker in a robust, reproducible way from different patients, what kind of lab protocols one would need in order to have that be collected robustly. That's not always trivial. You can have the most beautiful, sophisticated biomarker that works in a very carefully designed research environment and it's not gonna work in the wild as part of the standard of care. So I think the regulator does have legitimate questions that need to be answered there. Um, so, uh, it's... And the... So... But I do think that with that, um, with that discussion, and especially if you can front load that and have the discussion with the regulators not at the very end when you show up with your whatever, NDA package, but in an earlier stage saying, "Okay, what would it take in order to make this, uh, reasonable from your perspective? What questions would you like to see answered?" I think there is a, a legitimate opportunity to actually accelerate things. Having said that, I think one needs to be realistic about what is and is not feasible. In COVID, we were in the fortunate or unfortunate position, um, that there's, there were a lot of patients with COVID. Um, it was rampant, and so you were able to get a lot of... uh, you're, you were able to fill your clinical trials relatively quickly, and the disease progression was relatively fast. If you're doing an Alzheimer's trial, the disease progression is what it is, and you need to wait long enough to see a delta in the cognition, um, curve in order to convince yourself that there is, in fact, a difference, that your drug is making a difference. 
Now, I think there is an opportunity to try and create
- 33:19 – 39:50
Building a team and workplace culture that can bridge both bio and computer sciences
- DKDaphne Koller
proxy biomarkers. Amyloid beta is a, um, is an example of that. There's been questions about, is it the right proxy, um, for cognition or not? My guess would be that it is for some patients and probably not others. But, um... So it's a mixed bag to our earlier point about heterogeneity and finding the right patient population. Um, but I think that is a thing that we need to gain conviction around over time, and so ultimately there's only so much that you can speed up biology in certain cases because biology takes as long as it ta- as long as it takes.
- EGElad Gil
Yeah, it's, it's interesting because, um, I feel like that's a mindset that those of us who have worked in both computer science and biology have to learn, right? You are so used to just being able to manipulate some data in the cloud and then you get an answer versus waiting for years for a readout or to make progress. When you think about how you built out the team at insitro and how you built out the culture, how did you think about having each side learn about the different aspects that each side provides? And, in general, how did you think about the culture of a company that can bridge both things?
- DKDaphne Koller
You know, it's really hard, and I think building the right culture is one of the most challenging things that we had to do at insitro and, at the same time, I think a big competitive advantage because doing it is really not very easy. You have to bring in people who truly have both a learning mindset on their own in terms of being interested enough to learn about something that for many is a totally different, um, set of, um, concepts and, and even ways of thinking about the world. So you need computer scientists who are willing to learn about this fuzzy, ill-behaved field of biology where things don't do what they're supposed to do, you know? When you program a computer, yeah, you can have bugs, but ultimately, assuming you did the right things, the same thing will happen. And that's not true in biology.
- SGSarah Guo
We just don't know that much. (laughs)
- DKDaphne Koller
And these things are living beings, so they don't respond in the same way even day after day. And so, there's just... It's really hard, and then conversely, you have the scientist mindset th- that sometimes they get frustrated with the, "Okay, we can take those building blocks and put them together, and this is what will happen," and, and science is not like that, and so you have to create a bridge between the different cultures, the different jargons, the different mindsets, um, and really both get people who are willing to learn about the other discipline but also just engage in meaningful ways with people who are different to themselves.
- SGSarah Guo
What did that mean when you said science is not, just not like that in terms of, uh, manipulating building blocks?
- DKDaphne Koller
Uh, so it comes back to what I said earlier.
- SGSarah Guo
It's just a function call. (laughs)
- DKDaphne Koller
Th- th- there are so many variables that have a huge effect on, uh, on the system that sometimes we only are, uh, only vaguely appreciate. Sometimes we don't appreciate at all. A colleague told me an anecdote about, um, a, uh, an experiment where some days it went perfectly well and then the other days, the cells just died, and the qua- and they tried to figure out what was going on. It turns out the day the cells died were the days when there was a particular technician who had... really had a fondness for onion sandwiches and, and so it turns out that the onion in, uh, on his breath actually, um, ended up, you know, making the cells be less happy. And so you just don't even think about these things if you're an engineer, right? Um, the, the other really interesting mindset difference between mi- between s- how scientists and how engineers approach the world is when you show an engineer or computer scientist a bunch of dots... usually, the natural inclination is to try and find the pattern, the thing that explains as many of the points as you can because that is the thing around which you will engineer your system. If you're a scientist, oftentimes what you look for are the outliers, the exceptions, because those exceptions are often the, um, beginnings of scientific discovery, because they're the beginning of a thread. It's like, "Why did this one behave differently from everybody else?" And that g-gives rise to a new discovery. So again, it's just, the mindsets are just so different.
- EGElad Gil
Was there anything you did from a, a process perspective to help bridge these things? So, for example, I remember at Color, we tried to often, um, embed a bioinformatician with a team of, uh, systems engineers and they'd learn off of each other. But then everybody on the team, you know, it could be, uh, varying scientists, it could be somebody else who would participate in a scrum, which was a concept that they weren't used to, right? On the biology side, for example, it was more of a way to set that everybody does things on weekly cadences and it's, you don't just do long-term planning, you also do way more short-term planning than you normally would in a lab or, you know, there's different approaches to almost try and bridge those divides. Were there any things that you specifically did along those lines or were there other approaches that you took from a tangible perspective?
- DKDaphne Koller
Well, so first of all, we do bring in people with their different mindsets and, and we try and create, um, sort of bridges between them, so we have product managers who do scrums and do, uh, you know, these, these agile planning processes, these- and we apply that also to our platform development even on the biology side. But at the same time, you know, drug discovery projects which are years long, you don't do scrums (laughs) . You, you know, there is a timeline and when you have a, whatever, uh, a 45-day differentiation for your IPS cells, it takes 45 days and y- y- there's no point to doing an agile scrum in the middle. You just need to wait for the cells to do their thing. And so, we have project managers and we have product managers and we make sure they communicate with each other, but they have- but they each deploy their discipline in their own way. But to your question about one of the things that we did, um, a lot of it comes down to really being deliberate about culture and values. And so, one of the things that we did at the very beginning of the company is, we laid out a set of behavioral norms which, you know, you can think of as values and the one that, um, is I think among my favorites, maybe my favorite, is actually the last one. They're ordered not in order of importance, but from what we do to how we do it, which is that we engage with each other openly, constructively, and with respect. Each of the words matter. Um, engagement means we don't silo ourselves and just sit with our tribe, we really have an engagement with others. Openly being open to asking naive questions,
- 39:50 – 43:12
What Daphne is paying attention to in the so-called golden age of machine learning
- DKDaphne Koller
um, and at the same time, being open to naive suggestions from someone from a discipline other than yourself, because sometimes the question of, "Why don't we do things this way?" is actually a really good idea when you don't come in with a preconceived notion of, "Oh, because that's how we've always done it." Um, constructively means that when you make these suggestions, it has to be with the goal of making the outcome better rather than being the smartest person in the room, which is a big problem in companies where you have a lot of smart people. Um, and the respect is really the respect for what everyone brings to the table, and I think that's really important because there's a lot of, um, and please forgive me a lot, but a lot of tech people who come in to, um, life sciences and they're like, "We have the silver bullet. We are the smartest. We're machine learning. We're gonna solve everything." And they don't respect the challenges of the other discipline, they- sometimes they don't even take the time to learn what the challenges of the other discipline are, and that creates immediate, um, hackle-raising on the other side and, you know, from there the conversation can only get worse. So, I think it's really important to have that respect for both si- for all sides.
- SGSarah Guo
We have a lot of, um, tech people, uh, engineers, founders, researchers as listeners. Um, what would you be working on if you weren't working on insitro? Like, what else are you paying attention to in digital, bio, or AI, assuming people are, uh, attuned to having that culture of openness and respect and constructive thinking?
- DKDaphne Koller
So, um, I think that's a great question and this really is the golden age of, um, AI and machine learning and there's just so many different ways in which that can be deployed in useful ways. I mean, my personal compass has always been that we should be deploying this towards areas where we make life better for people, so I've tried to veer towards applications that are really about improving life, improving health versus, you know, selling more ads or whatever. Not that, you know, I mean, I guess selling ads is good too, but, um, but for me, it's really about, how do we make l- life better? Um, so I think there is a lot of really exciting opportunities right now. I think that intersection or that interface, if you will, between biology and technology is one of the richest areas that exist today because each of these fields has been making a huge amount of progress in its own right, but we all hear about, y- you know, AI much more in the news because of ChatGPT and so on, and it's something that everyone can really relate to and understand. But the toolkit that biologists have available to them with CRISPR and pluripotent stem cells and the huge advances in microscopy and such are maybe not quite as visible to the everyday person, but they are equally dramatic, I think, in terms of what they unlock. And so, bringing those two together creates so many opportunities for change in, uh, not just in drug discovery, which is where I happen to, uh, pick my own trajectory, but in agriculture technology; in, um, environmental technology; in energy; in biomaterials, maybe materials that are much less destructive to the environment and, and such with better properties; in food tech. Um- I think there is, uh, just a tremendous wealth of directions that one
- 43:12 – 46:57
Advice for leading a startup in edtech and healthtech
- DKDaphne Koller
can take those, um, those f-fields and bring them together in interesting ways. Having said that, I think there's other really beneficial societal, um, uh, directions that one can deploy this. I think we're only starting to see the applications of machine learning and AI to, say, energy other than things like biofuels because the data just haven't been as readily available, but I'm sure that will change. Similarly, I think, uh, going back to my Coursera days and even my Stanford days, the, um, benefits of, um, machine learning in education and really personalizing learning experiences to individual learners maybe having a more beneficial experience than just letting ChatGPT write their essays for them. Um-
- SGSarah Guo
(laughs)
- DKDaphne Koller
... I think there is a lot of, um, opportunities to really en- deepen and enhance learning experiences for-for students. So, I think there's almost unlimited things that one could do, j- one just needs to be committed to finding them versus falling into the sort of uncomfortable place of going to one of the tech giants and just doing something that earns you a lot of money, which is, I guess, nice for you but maybe not as good in terms of making the world better.
- SGSarah Guo
You've worked with great success in areas that are, uh, perhaps traditionally harder to make money in as a startup, ed tech, health tech, so I guess you don't have anything else to compare it to, but what advice would you give to founders who wanna work in these areas in particular?
- DKDaphne Koller
Uh, I think that... Are you talking about l- financial aspects like raising money or just, um-
- SGSarah Guo
No, just i- if- if there's a- there's a way to look at problem spaces where, you know, there's not traditionally a ton of budget or there's, um, an impedance mismatch, you know, you have regulatory controls or whatever it is that makes it more challenging traditionally than many other areas of software.
- DKDaphne Koller
So, I think that, um, there is, I'm hoping, a realization among investors that there are entire untapped ecosystems where technology can make a difference and hasn't. And, um, so I think that, uh, as you look at what we did at Coursera, for example, ed tech had always been a backwater of investment, um, and yet we were very fortunate to have been able to attract fairly significant funding even at the very early stages because we had an idea that our investors found compelling and differentiated from what others had done. So, I guess I'm a believer and maybe I'm an optimist that, um, if you have a really good idea that is differentiated from what others have done and where the impact is something you can make clear, as we were able to do with those first early moves, people will, um, have confidence that you can turn that into something that is revenue-bearing and will be willing to, you know, go with it for-for a while. So, um, uh, that having been said, I would say that ultimately, and this is, I guess, uh, how I feel about maybe the other half of the question, which is, is this gonna be the place where you make the most money with the greatest amount of certainty? Maybe not, but I believe that we only have one life to live and that ultimately, uh, what you want to be able to do is to look back on your life at some point and say, "I've done something that's really worthwhile and important," and I think that's something that, um, is important for people to keep in mind as they decide where to spend their time.
- SGSarah Guo
Daphne, thanks for an incredible conversation, and thank you for joining us on the podcast.
- DKDaphne Koller
Thank you very much. (instrumental music)
Episode duration: 46:57
Transcript of episode k5FvyrJdEcI