Biohub: The Future of Biology is Open-Source with Mark Zuckerberg, Priscilla Chan, and Alex Rives

Biohub started with an ambitious goal of curing, preventing, and managing all disease by the end of the century. A decade later, thanks to the convergence of frontier AI and biological data, that goal may have been too conservative. In this episode, Elad Gil and Sarah Guo sit down with Biohub co-founders Mark Zuckerberg and Priscilla Chan, alongside Biohub Head of Science Alex Rives. Together, they discuss Biohub’s $500 million virtual biology initiative, which integrates frontier AI with wet-lab work to build predictive world models of cells, proteins, and systems. They also talk about their newly announced open-source engine for digital protein and antibody design, ESMFold2; why Biohub is a nonprofit rather than a venture-backed startup; and how hierarchical simulations will soon allow doctors to treat patients at an individual, mechanistic level.

Sign up for new podcasts every week. Email feedback to show@no-priors.com.

Follow us on Twitter: @NoPriorsPod | @Saranormous | @EladGil | @Biohub | @finkd | @alexrives | @ChanZuckerberg

Chapters:
00:00 – Cold Open
01:02 – Mark Zuckerberg, Priscilla Chan, and Alex Rives Introduction
01:26 – Why Biohub and Their Mission
08:27 – Integrating Frontier AI and Frontier Biology
09:45 – Micro to Macro Biological Modeling
14:22 – Mechanistic Interpretability
16:58 – Why Biohub is a Non-Profit
21:41 – Understanding How Biology Works
24:23 – Timeline for Curing All Diseases
26:25 – Translating Research to Patient Impact
28:04 – Launch of ESMFold2
32:13 – Tackling Off-Target Effects and Edge Cases
38:39 – Putting the Tech in Individual Hands
41:06 – Talent at Biohub
44:25 – What’s Next After ESMFold2
46:10 – Connecting ESMFold2 to Agentic Systems
46:51 – The Virtual Cell
49:33 – Defining Success for Biohub
51:52 – Biohub Strategy Update
56:20 – Conclusion

Mark Zuckerberg (guest), Alex Rives (guest), Sarah Guo (host), Elad Gil (host)

Jun 10, 2026, 56m

CHAPTERS

Open tools for biology: the core thesis (cold open)
The episode opens with the central idea behind Biohub: build powerful biology tools and share them broadly, accelerating the entire scientific ecosystem. The guests frame success not as personally “curing diseases,” but as enabling many scientists to move faster with open, generalizable models.
How “cure all disease” became a serious plan: origins of Biohub
Mark and Priscilla recount early conversations with leading scientists who laughed at the ambition, forcing them to ask what structurally slows biology. The key blockers they heard were silos, poor sharing, and missing durable tooling—prompting a tool-centric philanthropic strategy.
Biohub’s model: long-term tool development across hubs and universities
They describe the original Biohub approach: engineers + scientists working across institutions to build tools with long time horizons. Over time, the effort expanded geographically and became the primary philanthropic focus, unified by a “virtual biology” initiative.
Frontier AI + frontier biology: why biology needs new data generation
Unlike language models, biology lacks abundant “internet-scale” training data; much of the most valuable data doesn’t exist yet. The team explains why progress requires inventing new experimental methods—imaging, sensors, and cellular engineering—to create datasets that models can learn from.
From single-cell sequencing to community corpora: the tooling flywheel
Priscilla traces a through-line from early single-cell sequencing funding to the Human Cell Atlas and annotation tools like CELLxGENE. The broader point: shared datasets and software can seed communities that contribute far beyond the initial funder’s efforts, turning “stamp collecting” into a foundation for modeling.
Micro-to-macro modeling: hierarchical world models from proteins to cells to systems
The group debates whether biology can be modeled “end-to-end” at higher levels or must be built up layer by layer. Their strategy is hierarchical: protein interactions enable cellular models, which enable tissue/immune-system-level understanding, with targeted experiments creating connective tissue across levels.
Mechanistic interpretability for protein language models: opening the black box
Alex explains applying mechanistic interpretability—popular in LLMs—to protein language models trained on billions of sequences. The aim is to extract biological insight from representations that capture latent structure/function “grammar,” connecting unknown proteins to known biology.
Why a nonprofit and why open-source: scale, time horizon, and ecosystem leverage
They argue a nonprofit structure better fits the ambition: generating new biological methods and datasets is not a simple pay-to-produce pipeline, and the work benefits from long horizons and open dissemination. Open-source also mobilizes broader talent and enables work on rare and niche diseases that markets may ignore.
What “understanding biology” means in practice: individualized mechanistic chains
Priscilla reframes disease impact as building mechanistic chains from genetic variants to proteins to phenotypes, enabling bespoke interventions. Today’s medicine often relies on coarse cohort analogies; the goal is to treat people as individuals with causal, testable understanding.
Timelines and early leverage points: systems like inflammation and immunity
On timelines for “curing all disease,” they emphasize dynamic complexity and uncertainty, but express increased optimism due to AI acceleration. Rather than picking single diseases, they highlight system-level targets—like inflammation and immune function—as high-leverage foundations others can translate into therapies.
From bench to bedside: what must change in clinical research and deployment
Priscilla notes that translation to the clinic is less clear than research acceleration; clinical research and safe deployment methods must evolve. They reference work like CRISPR Cures and examples such as rapid, targeted interventions in carefully chosen contexts (e.g., liver-delivered therapies) as early pathways.
ESMFold2 launch: a fast open “world model” of protein biology and design
Alex details ESMFold2 as a general protein model trained on billions of sequences that predicts atomic-resolution structures quickly and accurately. Beyond folding, its generality enables protein and antibody design digitally, followed by small-batch lab validation—illustrating the open discovery engine concept.
Validation, edge cases, and de-risking drug development: off-target effects and rare disease cohorts
They discuss pairing model-driven design with wet-lab confirmation (cell assays, cryo-EM) and using broader biological models to predict off-target effects (e.g., receptor expression in unexpected cell types). They also highlight patient-led rare disease registries and opt-in trial participation as accelerators for “edge case” learning and faster iteration.
What’s next: agentic pipelines, the “virtual cell,” and defining success in 5 years
Looking forward, they describe early integrations of ESMFold2 with agentic systems to automate design loops. The major research agenda is the virtual cell—models spanning genetic/transcriptomic/proteomic layers to phenotype with generalization to unseen interventions—while success is defined by world-class, uniquely better models that catalyze ecosystem-wide idea generation.
