a16z
Mark Zuckerberg & Priscilla Chan: How AI Will Cure All Disease
CHAPTERS
Big ambition, real gap: biology needs a “periodic table” moment
Mark Zuckerberg frames AI as massive leverage for biology, but argues the field lacks foundational shared references—an equivalent of a periodic table for biological systems. Priscilla Chan recalls early skepticism about their “cure and prevent disease” mission, setting up the core thesis: tools, not just grants, are the unlock.
- AI creates outsized leverage for scientific discovery, but only when paired with the right biological tooling
- Biology still lacks standardized, universally useful reference frameworks (the “periodic table” analogy)
- Their long-term goal is enabling the scientific community, not single-handedly curing disease
- Early reactions ranged from “crazy ambitious” (biologists) to “inevitable” (AI folks)
Why CZI focuses on tool-building to accelerate basic science
Zuckerberg explains that major scientific breakthroughs often follow new instruments that let researchers observe and measure differently (microscope/telescope analogy). CZI’s strategy is to build enabling tools and infrastructure that raise the overall velocity of basic science, filling a gap left by typical grant funding.
- Scientific progress is often gated by new measurement/observation tools
- Analogy: doing biology without tools is like coding without debugging
- NIH-style funding favors smaller, near-term investigator projects
- CZI targets longer-term, more expensive tool development to unlock broad downstream impact
A credible pathway: from “fund more labs” to “build shared platforms”
Chan describes how the initial disease-curing mission forced a practical question: what blocks a credible path to progress? The answer they heard repeatedly was missing shared tools, datasets, and large coordinated efforts—leading CZI toward building platforms others can use and extend.
- Simply funding more individual lab grants won’t create a path to “cure disease”
- Key bottlenecks: lack of shared tools, standardized data, and big coordinated projects
- Philanthropy can fund work that doesn’t naturally earn traditional academic ‘credit’
- Success depends on making many scientists, startups, and institutions more effective
Biohub’s positioning: frontier biology + frontier AI (closing the gap)
They articulate the Biohub as a rare organization trying to do cutting-edge AI and cutting-edge biology together, rather than in separate silos. AlphaFold is cited as a proof point for AI impact—yet also evidence that better purpose-built biological datasets are needed to train next-generation models.
- Most organizations specialize in either AI or biology; Biohub aims to do both together
- AlphaFold’s success depended on public datasets created decades earlier
- Opportunity: generate datasets specifically designed to train biological AI models
- CZI doubled down on science after seeing the highest returns from tool-building work
Working on 10–15 year horizons: choosing problems with a visible path
The conversation turns to why Biohub emphasizes “grand challenges” in a 10–15 year window—long enough to build deep platforms, but concrete enough to maintain momentum and credibility. Chan highlights their criteria: a credible path, strong leadership, and enough ambiguity to justify risk-taking.
- 10–15 years is long-horizon but still operationally tractable (like venture timeframes)
- Selection criteria: visible path forward, not all problems solved, meaningful upside
- They deliberately take on uncertainty where risk appetite can pay off
- Biohub sites are designed around specific technical/biological frontiers
Biohub network strategy: three sites, each building different biological capabilities
Chan outlines the three Biohubs and how each is oriented to a different layer of biology and measurement. The locations tie to partner universities and are designed for interdisciplinary collaboration, leveraging local academic strengths while operating outside traditional lab constraints.
- New York: cell engineering (cells as sensors/actors that can record signals and act)
- Chicago: tissues and cell–cell communication; inflammation and systems context
- San Francisco: deep imaging and transcriptomics; spatial understanding of cell states
- Sites are chosen with partner universities to enable collaboration and talent flow
Making sense of science with LLMs + defining “success” in therapeutics
They describe how LLMs changed what’s possible: once you can generate large datasets, the next challenge is interpreting them. In therapeutics, Chan defines success as enabling an ecosystem that can connect variants to downstream effects and then to precise targets—powering the next wave of precision medicine.
- LLMs help interpret complex datasets that were previously hard to operationalize
- Therapeutic impact goal: enable a community explosion in precision medicine
- Connect variant → cell-state changes → protein expression → actionable targets
- Use models to predict off-target effects and improve safety earlier
“Most diseases are rare diseases”: reframing personalization and diagnosis
Chan argues that common disease labels hide many distinct biological subtypes—making “rare disease thinking” broadly applicable. Current care often relies on trial-and-error; their vision is rapid, individualized treatment guided by a person’s unique biology.
- Disease categories lump together heterogeneous biological realities
- Variants of unknown significance (VUS) highlight today’s interpretability gap
- Single-cell methods begin linking mutations to downstream cellular consequences
- Goal: move beyond demographic proxies toward true individualized biology
CELLxGENE and the Cell Atlas: a network-effect platform for biology data
They explain how CELLxGENE (pronounced “cell by gene”) began as an annotation workflow fix for single-cell data, then became a de facto standard. Standardized tooling led to standardized formats and community contribution, creating a shared Cell Atlas at scale, funded mostly by the broader ecosystem rather than by CZI alone.
- Single-cell science hit a bottleneck: annotation couldn’t keep up with data generation
- CELLxGENE was built first as an annotation tool, not an atlas project
- Standard tools created standard formats/metadata, enabling community aggregation
- Community contributed ~75% of the atlas; CZI funded ~25%, catalyzing network effects
Why virtual cells: building a hierarchical “world model” of biology
Zuckerberg lays out the “virtual cell” as the next major tool: models spanning proteins, cell structures, and higher-order systems like the immune system. The aim is to generate hypotheses, simulate perturbations, and progressively combine specialized models into a more general biological simulator useful to researchers and (downstream) drug developers.
- Virtual cell models aim to simulate biology across scales, from proteins upward
- Value begins with hypothesis generation and approximation before full simulation
- Strategy: train specialized models and combine them into increasingly general systems
- Designed to be broadly useful to scientists; therapeutics is downstream adoption
De-risking science: virtual biology enables bolder experimentation
Chan emphasizes that wet-lab work is slow and expensive, and that grant and tenure incentives discourage high-risk ideas. Virtual models could shift exploration in silico, letting researchers test riskier hypotheses cheaply and quickly before committing lab resources.
- Today’s incentives push researchers toward safer, publishable bets
- In silico testing can reduce time/cost barriers and expand risk appetite
- Models don’t need perfect accuracy to be useful; a directional signal can guide work
- Virtual cell as a “model organism” with higher relevance to humans
What the models look like: variant prediction, diffusion-generated cells, and reasoning
They describe early model types: predicting outcomes of CRISPR edits, generating synthetic cell states via diffusion models, and an emerging “reasoning model over biology.” The long-run plan is hierarchical: best-in-class protein models feed into cellular models, which then extend to complex systems like the immune system.
- Variant/perturbation models: predict cellular outcomes from CRISPR edits (e.g., “VariantFormer”)
- Diffusion models: generate synthetic cells matching described properties, useful for rare configurations
- Spatial models (e.g., imaging/cryo-EM-informed) add structural context
- A nascent “reasoning” layer aims to move from correlation to mechanistic explanation
The Biohub master plan: unify CZI + Biohub into one operating flywheel
They announce an organizational consolidation: bringing datasets, instrumentation, and AI modeling into a single operating philanthropy under unified leadership (Alex Rives). The goal is a tight feedback loop where model gaps inform new experiments and datasets, and new datasets rapidly improve models, closing the “data ↔ model” loop.
- Decentralized efforts are being unified into one team to pursue a singular goal
- Operating flywheel: model identifies blind spots → labs generate new targeted data → model improves
- Doing frontier biology and frontier AI shoulder-to-shoulder is the differentiator
- Biohub becomes the main thrust of CZI’s philanthropy, with education/community work continuing
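The flywheel described in this chapter resembles what machine-learning practitioners call active learning. The sketch below is a toy illustration only, not Biohub's actual system: every name in it (`wet_lab_measure`, `blind_spot_score`, `run_flywheel`) is hypothetical, the "model" is a simple nearest-neighbor lookup, and the "experiment" is a made-up function standing in for real lab work.

```python
import math

def wet_lab_measure(x: float) -> float:
    """Stand-in for an expensive experiment (hypothetical ground truth)."""
    return math.sin(3 * x)

def predict(data: dict[float, float], x: float) -> float:
    """Toy model: reuse the result from the nearest measured condition."""
    nearest = min(data, key=lambda measured: abs(measured - x))
    return data[nearest]

def blind_spot_score(data: dict[float, float], x: float) -> float:
    """Toy uncertainty: distance from x to the nearest measured condition."""
    return min(abs(measured - x) for measured in data)

def mean_abs_error(data: dict[float, float], candidates: list[float]) -> float:
    """Average prediction error over all candidate conditions."""
    total = sum(abs(predict(data, x) - wet_lab_measure(x)) for x in candidates)
    return total / len(candidates)

def run_flywheel(candidates: list[float], seed_points: list[float], rounds: int):
    """Repeat: find the model's biggest blind spot, measure it, update the data."""
    data = {x: wet_lab_measure(x) for x in seed_points}
    errors = [mean_abs_error(data, candidates)]
    for _ in range(rounds):
        # Step 1: the model identifies its blind spot.
        target = max(candidates, key=lambda x: blind_spot_score(data, x))
        # Step 2: the lab generates new targeted data.
        data[target] = wet_lab_measure(target)
        # Step 3: the model improves (here, simply by having more data).
        errors.append(mean_abs_error(data, candidates))
    return data, errors

candidates = [i / 20 for i in range(21)]
data, errors = run_flywheel(candidates, seed_points=[0.0], rounds=6)
print(f"error before: {errors[0]:.3f}, after: {errors[-1]:.3f}")
```

In a real setting the uncertainty estimate would come from the model itself (for example, disagreement across an ensemble) rather than from distance to the nearest measurement; the distance heuristic is used here only to keep the loop short and self-contained.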
Democratizing discovery: interface design, collaboration, compute, and what’s next for AI in biotech
They argue that usability and interfaces are critical: tools should be accessible to non-experts and to scientists from other disciplines so that more people can participate. They highlight Biohub’s collaboration model (co-locating disciplines), the shift toward compute as the “new lab space,” shared GPU programs, and a forward-looking view that AI will accelerate biotech by enabling better tools and faster translation of basic science.
- Interfaces matter: lower the barrier so more scientists/founders can explore and contribute
- Cross-functional collaboration is engineered through proximity: biologists and engineers sit together
- Compute becomes “lab space”; Biohub scales GPU clusters and offers access to external scientists
- Long-term impact: tool-driven acceleration from basic science → biotech → pharma → global health