At a glance
WHAT IT’S REALLY ABOUT
Building virtual cells with AI to accelerate drug discovery
- Arc Institute’s core bet is that modeling the cell, the fundamental unit of biology, with foundation models can make simulation of biology scalable and practical and shorten experimental cycles.
- Scientific progress is slow due to incentives, fragmented multidisciplinary work, and the real-world time required for making and testing biological hypotheses, not just computing predictions.
- The “virtual cell” is operationalized as perturbation prediction: given a starting cell state and desired target state, a model proposes interventions (genes, drugs, combinations over time) to move cells across a learned state manifold.
- AI’s biggest near-term constraint in biotech is evaluation and ground-truth feedback: biology is harder to “see,” requires lab-in-the-loop validation, and suffers from missing measurements, even if transcriptomics can serve as a scalable mirror for other layers.
- Even if AI dramatically improves design and target selection, drug development is still bottlenecked by physical manufacturing, safety/efficacy testing, clinical trials, and regulatory timelines, which is what keeps the industry so capital-intensive.
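The perturbation-prediction framing above can be illustrated with a toy sketch. This is not Arc's model: it assumes cell states are short expression vectors and that each perturbation applies a fixed additive shift, and the names `predict_state`, `rank_perturbations`, and the example perturbations are all hypothetical. It only shows the shape of the problem, which is scoring candidate interventions by how close the predicted state lands to the target.

```python
import math

def predict_state(start, effect):
    """Toy stand-in for a learned model: apply an additive shift to the state."""
    return [s + e for s, e in zip(start, effect)]

def rank_perturbations(start, target, effects, k=3):
    """Return the k perturbation names whose predicted state is closest to target."""
    scored = []
    for name, effect in effects.items():
        predicted = predict_state(start, effect)
        # Euclidean distance between predicted and desired cell state.
        scored.append((math.dist(predicted, target), name))
    scored.sort()
    return [name for _, name in scored[:k]]

# Hypothetical three-gene expression vectors and perturbation effects.
start = [1.0, 0.0, 2.0]
target = [0.0, 1.0, 2.0]
effects = {
    "knockdown_A": [-1.0, 0.0, 0.0],
    "drug_X": [-1.0, 1.0, 0.0],
    "knockdown_B": [0.0, 0.0, -2.0],
}
top = rank_perturbations(start, target, effects, k=2)
print(top)
```

In a real setting the additive shift would be replaced by a learned model's prediction, and the ranked list would be a proposal for what to test in the lab rather than an answer.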
IDEAS WORTH REMEMBERING
Start with cells before attempting whole-body digital twins.
They argue it’s premature to model entire bodies over time if we can’t reliably predict outcomes in a single cell; accurate cell-level prediction is a more scoped, rigorous stepping-stone toward higher-level biological simulation.
The practical “virtual cell” product is a wet-lab co-pilot, not a benchmark winner.
Arc frames success as generating actionable experimental plans (e.g., which 12 perturbations to run next) rather than marginal improvements on abstract ML metrics like gene-expression error.
Biology AI is slowed by evaluation, not just modeling.
Unlike text or images, we don’t natively interpret DNA/cell outputs; progress requires lab-in-the-loop ground truth and better ways to interpret “fuzzy” model outputs against experimental reality.
Transcriptomics can be a scalable proxy even if it’s incomplete.
They acknowledge missing modalities (metabolites, spatial and temporal dynamics) but claim RNA at scale can reflect protein/metabolic signaling indirectly, enabling early capability while richer measurements mature.
Drug failures cluster into two buckets: wrong target or wrong molecule.
With clinical trial failure rates around 90%, they emphasize that virtual cells could improve target identification and perturbation planning, but new “drug matter” is still needed for tissue-specific or pleiotropic targets.
WORDS WORTH SAVING
I want to make science faster. Our moonshot is really to make virtual cells at Arc and simulate human biology with foundation models. Why are we so worried about modeling entire bodies over time when we can't do it for an individual cell?
— Patrick Hsu
It's this weird Gordian knot that ultimately comes down to incentives, right?
— Patrick Hsu
We don't speak the language of biology, right? You know, you know, at, at very best with an incredibly thick accent, right?
— Patrick Hsu
If we have ninety percent of drugs failing clinical trials, right, that kind of means two things, and you're not sure what percent of which, right? One is we're targeting the wrong target in the first place. The second is the composition, the drug matter that we're using doesn't do the job, right?
— Patrick Hsu
The, the, the crazy thing is the progress in just the short time that I've been doing this is insane... at Arc, in the next, you know, kind of n, like, I don't know, relatively short amount of time, we're gonna generate a billion perturbed single cells, right? That's like, I mean, how's, how's that for a Moore's Law?
— Patrick Hsu
High quality AI-generated summary created from speaker-labeled transcript.