
No Priors Ep. 6 | With Daphne Koller from Insitro
Sarah Guo (host), Daphne Koller (guest), Elad Gil (host)
Daphne Koller on Reinventing Drug Discovery With Machine Learning and Biology
Daphne Koller traces her path from core machine learning and probabilistic graphical models to co-founding Coursera and ultimately founding insitro, an ML-first drug discovery company.
She explains why traditional drug development is so slow, expensive, and failure-prone, and argues that the biggest leverage point is choosing the right biological target and patient population, not just better molecules or trial design.
Insitro’s strategy combines large-scale human data (“experiments of nature”) with high-content cellular experiments (e.g., CRISPR-edited iPSC-derived cells) and modern ML to build more predictive, human-relevant disease models.
Koller also discusses building a hybrid tech–bio culture, the importance of biomarkers and genetics, and broader opportunities at the AI–biology interface in areas like agriculture, materials, and education.
Key Takeaways
The greatest leverage in drug discovery is choosing the right target and patients.
Most drug programs fail not because of bad molecules or trial design, but because they modulate the wrong biological target in the wrong indication or population. ...
Combine deep learning’s pattern recognition with structured, causal reasoning.
Koller argues the field is swinging back from purely deep learning toward a synthesis with probabilistic/causal modeling and interpretability, especially in high-stakes domains like medicine where you must explain and reason about decisions.
Use “experiments of nature” plus controlled cellular experiments to compensate for missing training data.
Because we rarely have direct data mapping interventions to clinical outcomes, insitro leverages human genetics (genotype–phenotype maps) and lab-based perturbation of cells, then uses ML to connect these layers and predict human-relevant effects.
High-content, unbiased data (imaging, omics, MRIs) is a key enabler for ML in biology.
Insitro intentionally chooses therapeutic areas like neuroscience, metabolism, and oncology where rich human data (e. ...
Biomarkers and human genetics roughly double the odds of clinical success.
Drugs backed by human genetic evidence and robust biomarkers are about twice as likely to succeed in trials. ...
Bridging tech and bio requires explicit cultural design, not just hiring both skill sets.
Insitro codified behavioral norms—engaging openly, constructively, and with respect—to avoid “ML will solve everything” arrogance, encourage naïve questions, and reconcile engineering’s desire for clean abstractions with biology’s messy reality.
AI–biology is a broad platform opportunity far beyond drug discovery.
Koller highlights parallel opportunities in agtech, environmental tech, energy, biomaterials, food tech, and education, arguing that the simultaneous tool revolutions in AI and experimental biology make this one of the richest frontiers for impact.
Notable Quotes
“It’s not like X‑ray crystallography. It’s like computers. You’re going to use machine learning everywhere across the drug discovery process.”
— Daphne Koller
“If you really want to bring down that $2.5 billion number, what you have to do is bring down this completely mind‑blowing statistic of 95% of drug programs fail.”
— Daphne Koller
“Each of us is an experiment of nature, where nature has modulated our genetics… and we can look at that mapping from genotype to phenotype as a surrogate of what a therapeutic intervention would do.”
— Daphne Koller
“There are so many tech people who come into life sciences and they’re like, ‘We are machine learning, we’re going to solve everything,’ and they don’t respect the challenges of the other discipline.”
— Daphne Koller
“We only have one life to live, and ultimately you want to be able to look back and say, ‘I’ve done something that’s really worthwhile and important.’”
— Daphne Koller
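The link between the 95% failure rate and the $2.5 billion figure in the quote above can be illustrated with back-of-envelope arithmetic. The sketch below is not from the episode: it assumes a deliberately simple model in which every program costs the same on average and succeeds independently with a fixed probability, so the expected cost per approved drug is the per-program cost divided by the success rate. The function name and the implied per-program cost are hypothetical, introduced only for illustration.

```python
def cost_per_approval(cost_per_program: float, success_rate: float) -> float:
    """Expected spend per approved drug under a toy model.

    Assumption (not from the episode): each program costs the same on
    average and succeeds independently with probability `success_rate`,
    so on average 1 / success_rate programs are needed per approval.
    """
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_program / success_rate


# Figures quoted in the episode: about $2.5 billion per approved drug
# and a 95% program failure rate (so a 5% success rate).
BASELINE_COST_PER_APPROVAL = 2.5e9
BASELINE_SUCCESS_RATE = 0.05

# Per-program cost implied by the toy model (a hypothetical number).
implied_cost_per_program = BASELINE_COST_PER_APPROVAL * BASELINE_SUCCESS_RATE

# Compare the baseline with a doubled and a quadrupled success rate.
for rate in (0.05, 0.10, 0.20):
    cost = cost_per_approval(implied_cost_per_program, rate)
    print(f"success rate {rate:.0%}: about ${cost / 1e9:.2f}B per approval")
```

Under these assumptions, doubling the success rate halves the expected cost per approval, which is why the takeaway about genetics and biomarkers roughly doubling the odds of success matters so much. Real programs vary widely in cost and fail at different stages, so the numbers are illustrative only.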
Questions Answered in This Episode
How can probabilistic graphical models and causal inference be most effectively integrated with deep learning in real-world biomedical applications?
What technical approaches does insitro use to ensure that in vitro cellular models truly predict human clinical outcomes, rather than overfitting to lab artifacts?
How should regulators evolve their frameworks to better incorporate ML-derived biomarkers and patient stratification without compromising safety?
What specific cultural or organizational failures most often doom AI–bio collaborations, and how can new startups preempt them from day one?
Beyond drug discovery, which AI–biology application (e.g., agtech, materials, food) does Koller believe is most underexplored yet poised for outsized impact?
Transcript Preview
(digital intro music) Daphne, welcome to the podcast.
Thank you, Sarah. It's a pleasure to be here.
So, as we were saying, we won't ask you to walk through every part of your amazing life story, but you came to biology as a computer science application years into your career. What sparked you going down that route?
My initial interest in biology came from the technical side in the sense that the data sets, this is way back when in the mid, uh, 90s, the data sets that were available to machine learning research at the time were kind of boring and not very inspiring, so things like classifying text, um, into 20 different news groups. And I found that there were more interesting data sets technically to be had on the biology side back then as we were starting, for example, to measure the activity of, uh, genes across the entire genome, um, in multiple samples. So initially, it was really more f- more from a technological perspective, but then I ended up actually having, um, an interest in biology in its own right and ultimately ended up having a bifurcated lab at Stanford where half my lab did core machine learning work published in traditional computer science venues and the other half did core biology work that was published in Nature and Cell and Science. And what was really interesting is that most of my computer science colleagues had no idea that I did biology. Most of my life science colleagues had no idea I was in a computer science department, so it was a bit of a bifurcated existence, but it was a lot of fun.
Amazing. And we'll come back to, you know, full circle working on this problem now. One more historical question for you. Uh, you wrote the book on probabilistic graphical models and I spent about half an hour looking in my house for this book. I have it somewhere, but I wanted to have it for the podcast. When I asked a mutual friend, Andrew, what I should ask you, he suggested, you know, what motivated that work and, um, and how that field has changed.
So I think that, um, just like, like in most fields, there is a swing of the pendulum. A lot of, uh, the early work in probabilistic graphical models was hugely influential in bringing, um, artificial intelligence more into the world of machine learning and, uh, and working with numerical data rather than just symbolic AI. Um, and then I think the advent of, um, deep learning, uh, pushed that to the side a little bit because there was so much power that could be gained from basically the kind of pattern recognition, um, from raw inputs, um, raw images, text and so on, without having to worry very much about interpretable representations. What I think we're starting to see right now is a, uh, pendulum starting to swing back in the sense that there is a greater understanding that you really need a bit of both. You need that, uh, hugely powerful pattern recognition that we get from deep learning, but you also need the ability to reason about things like causality and you also need some interpretability of your deep learning model so that you can potentially convey to a clinician why you made the decision that you did. And so what we're ending up with as, uh, a really powerful paradigm is some kind of synthesis of the ideas from both of these disciplines coming together.