CHAPTERS
Why OpenAI is launching a Life Sciences model series
Andrew Mayne introduces OpenAI’s push into life sciences with Research Lead Joy Jiao and Product Lead Yunyun Wang. They frame the central aim: apply frontier AI to biology and medicine while deploying responsibly given dual-use risks.
- Life sciences as a new frontier for advanced model capabilities
- Focus on what new models enable in biology/medicine
- Commitment to responsible deployment alongside capability gains
What the Life Sciences model series is built for: workflows, tools, and repeatability
Yunyun explains the Life Sciences model series as biochemistry-focused models grounded in real research workflows. The team emphasizes mechanistic understanding (genomics/proteins), early discovery bottlenecks, and practical orchestration via products and plugins.
- Biochemistry-focused models anchored in complex research workflows
- Mechanistic understanding starting with genomics and proteins
- Early discovery as the biggest leverage point for AI acceleration
- Orchestration across ChatGPT (literature synthesis) and Codex (agentic workflows)
- Life sciences research plugin: 50+ templatized, repeatable skills for enterprise reproducibility
From tool-using assistant to ‘biochemistry expert’ model
Joy outlines how models can already behave like computational biologists by calling external tools (e.g., protein structure prediction) and iterating on inputs/outputs. The next step is deeper biochemical intuition so the model can use tools more intelligently and converge faster.
- Models already run scientific tools and interpret outputs iteratively
- Protein structure prediction as an example of tool integration
- Goal: move from general tool-use to domain intuition/expertise
- Better intuition enables faster, more reliable experimental choices
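The propose, run, interpret, and refine loop described in this chapter can be sketched in a few lines of Python. This is only an illustration under loud assumptions: `mock_structure_confidence` is a hypothetical stand-in for an external tool (it is not a real structure predictor), `propose_variant` is a hypothetical stand-in for the model's next suggestion, and the scoring rule is invented for the example. Nothing here reflects how OpenAI's models or any real tool actually work.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def mock_structure_confidence(sequence: str) -> float:
    """Hypothetical tool call: returns a toy score in [0, 1].

    Assumption for illustration only: the score is the fraction of
    residues drawn from an arbitrary 'favored' set. A real workflow
    would call an actual structure-prediction tool here.
    """
    favored = set("AELM")
    return sum(res in favored for res in sequence) / len(sequence)

def propose_variant(sequence: str, rng: random.Random) -> str:
    """Hypothetical proposal step: apply one random point mutation."""
    pos = rng.randrange(len(sequence))
    return sequence[:pos] + rng.choice(AMINO_ACIDS) + sequence[pos + 1:]

def iterate_design(seed: str, rounds: int = 50, rng_seed: int = 0):
    """Run the loop: propose a candidate, call the tool, keep improvements."""
    rng = random.Random(rng_seed)
    best, best_score = seed, mock_structure_confidence(seed)
    history = [best_score]  # best score after each round
    for _ in range(rounds):
        candidate = propose_variant(best, rng)
        score = mock_structure_confidence(candidate)
        if score > best_score:  # interpret the tool output, keep only gains
            best, best_score = candidate, score
        history.append(best_score)
    return best, best_score, history

if __name__ == "__main__":
    best, score, history = iterate_design("GSPGSPGSPG")
    print(best, round(score, 2), len(history))
```

In this toy version the proposals are random, so convergence is slow. The "better intuition" point in the chapter corresponds to replacing `propose_variant` with something that makes informed suggestions and therefore needs fewer tool calls.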
Joy Jiao’s journey: systems biology to AI to speed up science
Joy recounts her path from a Harvard systems biology PhD to software and then OpenAI, motivated by the slow pace and manual nature of lab work. The ‘full circle’ is using AI to accelerate the kind of biology work she once did—without returning to repetitive bench tasks.
- Background in systems biology; desire for faster iteration cycles
- Friction of manual wet-lab work as a core motivation
- Transition to software/AI as a way to increase research velocity
- Vision of robots/automation removing pipetting from the loop
Autonomous lab collaboration with Ginkgo Bioworks: proving models can do biology
Joy describes experiments where GPT-5 was integrated with Ginkgo’s robotic lab to design and run protein-related experiments. Early results were unexpectedly positive—producing non-zero protein—helping shift internal beliefs from uncertainty to confidence that AI can accelerate lab science.
- Model-in-the-loop autonomous wet lab setup with robotics
- Initial skepticism due to limited biology training data vs. math/CS
- Surprise: first designs yielded measurable protein output
- Rapid shift in expectations over ~6 months toward ‘obvious’ acceleration potential
Yunyun Wang’s path: biodefense roots and a dual perspective on life sciences
Yunyun shares how she started in biorisk mitigation and biodefense initiatives before moving into life sciences product work. This background shapes a ‘tackle both sides’ approach: unlock beneficial research while rigorously managing misuse risks.
- Entry via biorisk mitigations and biodefense initiatives
- Wet-lab exposure through infectious disease/virology work
- Life sciences work informed by security and risk experience
- OpenAI’s life sciences focus growing over ~2 years via evals and lab experiments
Scope of AI in the life sciences pipeline: from chemistry to drugs to regulation
The discussion expands to potential applications across chemical/protein/enzyme design, pathway understanding, and drug discovery. They also highlight longer-term opportunities in accelerating clinical and regulatory stages, not just early discovery.
- AI-native areas: chemical design, protein/enzyme design
- Modeling pathways, cellular interactions, and mechanisms of disease
- Drug discovery: target to molecule design and optimization
- Potential acceleration in FDA approval/safety review and clinical processes
Biorisk and safeguards: why intent is hard to infer from prompts
They explain why biosecurity is uniquely challenging: benign and harmful workflows can look identical at early steps. As a result, OpenAI leans risk-averse in general access, using refusals/high-level responses and layered mitigations, while exploring more nuanced approaches.
- Bio is a severe dual-use risk tightly coupled to capability improvements
- Information hazards can be indistinguishable until late in a workflow
- Prompt ambiguity example: ‘help me clone a gene’ could be benign or harmful
- General access mitigations: refusal patterns and high-level guidance
- Need to balance scientific utility with preventing harmful enablement
Differentiated access: enterprise controls to unlock advanced capabilities responsibly
Joy and Yunyun argue that stronger capabilities require knowing who the user is and operating within controlled institutional environments. Verified researchers at regulated organizations enable higher-trust deployment because reagents, cell lines, and processes are tracked and audited.
- Core concept: different user segments get different access levels
- Institutional context provides accountability (tracked reagents/cell lines)
- Enterprise-grade security and controls as enablers of capability
- Goal: maximize beneficial science without broadening misuse pathways
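The idea that different user segments get different access levels can be pictured as a simple tier check. The sketch below is purely illustrative: the tier names, the capability names, and the mapping between them are all hypothetical and are not taken from the episode or from any OpenAI product. The one behavior it borrows from the discussion is the conservative default, where anything not explicitly unlocked falls back to a high-level response.

```python
from enum import IntEnum

class AccessTier(IntEnum):
    """Hypothetical user segments, ordered from least to most trusted."""
    GENERAL = 0
    VERIFIED_RESEARCHER = 1
    ENTERPRISE_CONTROLLED = 2

# Hypothetical mapping from capability to the minimum tier that unlocks it.
REQUIRED_TIER = {
    "literature_summary": AccessTier.GENERAL,
    "protocol_troubleshooting": AccessTier.VERIFIED_RESEARCHER,
    "advanced_design_workflow": AccessTier.ENTERPRISE_CONTROLLED,
}

def response_mode(user_tier: AccessTier, capability: str) -> str:
    """Return 'full' if the tier unlocks the capability, else 'high_level_only'.

    Unknown capabilities are treated conservatively (an assumption made
    for this sketch).
    """
    required = REQUIRED_TIER.get(capability)
    if required is None:
        return "high_level_only"
    return "full" if user_tier >= required else "high_level_only"

if __name__ == "__main__":
    print(response_mode(AccessTier.GENERAL, "advanced_design_workflow"))
    print(response_mode(AccessTier.ENTERPRISE_CONTROLLED, "advanced_design_workflow"))
```

A real deployment would layer identity verification, auditing, and other mitigations on top of any such check. The sketch only shows the shape of the tiering idea.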
What models can do in labs today: from pipetting optimization to idea triage
They describe current practical uses ranging from operational automation (spreadsheets, minimizing pipetting steps) to higher-level scientific assistance. Yunyun emphasizes models as ‘discriminators’ that stress-test hypotheses, narrow target lists, and prioritize feasible experiments.
- Low-level lab productivity: spreadsheets and protocol optimization
- Higher complexity: tool-assisted enzyme/protein design workflows
- Models as skeptical reviewers to assess novelty and feasibility
- Scaling target screening and hypothesis evaluation beyond human bandwidth
Building scientific infrastructure on Codex: agentic workflows and collaboration artifacts
Joy envisions Codex as the backbone for computational scientific workflows: running code across machines, monitoring logs, and building bespoke analysis/visualization software. A notable shift is sharing interactive outputs (HTML/UIs) instead of raw data, changing collaboration patterns.
- Codex as a ‘do everything on the computer’ scientific runtime
- Remote orchestration across dev boxes; monitoring and automation
- Rapid creation of fit-for-purpose analysis and visualization tools
- Collaboration changing via shareable UIs (e.g., interactive protein views)
- Long-term extension: connect software workflows to robotics
Why compute matters for science: bigger models and test-time ‘thinking’
Joy distinguishes two compute axes: scaling model size and scaling test-time compute for deeper reasoning. Test-time compute enables extended deliberation—potentially days—allowing models to tackle higher-difficulty scientific problems and discovery-style tasks.
- Axis 1: larger models produce emergent capabilities
- Axis 2: test-time compute scaling for variable-depth reasoning
- Reasoning models can allocate more ‘thinking time’ to harder problems
- Compute framed as enabling long-horizon scientific exploration, not just chat
- Team motto: ‘scale test time compute to cure all disease’
Near-term (6–12 months): drug repurposing, personalized medicine, and lab automation uplift
They discuss realistic paths to impact: drug repurposing suggestions, scaling RNA-based personalized therapies, and smoothing bottlenecks in translating protocols to automated platforms. The emphasis is on augmenting scientists—improving analysis rigor and throughput—rather than replacing them.
- Shorter path to impact via drug repurposing and mechanism-based suggestions
- Personalized medicine examples: ASOs and RNA-based treatments
- Lab automation bottleneck: translating protocols into runnable robotic methods
- Data analysis support: hypothesis probing, statistical test selection, bias detection
- AI as accelerator with scientists remaining essential in the loop
Scientific evals and adoption: proving value, embracing skepticism, and advice for learners
They describe evaluation strategies using real experimental datasets, synthetic ‘messy data’ tests, and ultimately wet-lab validation. Adoption varies culturally; they advocate ‘show by doing’ through products and publications, and encourage students/researchers to learn via exploration, collaboration, and low-lift entry points.
- Evals via known experiments: predict outcomes of held-out perturbations (e.g., single-cell data)
- Synthetic data evals to test QC, bias detection, and statistical rigor
- Wet lab as the ultimate validation: ‘nothing is real until proven in the real world’
- Adoption differences (West Coast enthusiasm vs. East Coast skepticism); skepticism welcomed
- Advice: use AI for paper understanding and data analysis; start small, collaborate, adopt early
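The held-out perturbation eval mentioned in this chapter can be sketched as a scoring function: hide some perturbations, ask the model to predict their effects, then compare predictions with the measured outcomes. The example below is an assumption-laden illustration, not the team's actual eval. Using Pearson correlation as the metric is a choice made for this sketch, and the perturbation names and numbers are made-up toy values rather than real data.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)

def score_held_out(predicted, observed):
    """Score each held-out perturbation that has both a prediction and a measurement."""
    return {
        name: pearson(predicted[name], observed[name])
        for name in observed
        if name in predicted
    }

if __name__ == "__main__":
    # Toy values: per-gene expression changes for two hypothetical perturbations.
    observed = {"KO_geneA": [1.0, -2.0, 0.5], "KO_geneB": [0.0, 1.0, -1.0]}
    predicted = {"KO_geneA": [2.0, -4.0, 1.0], "KO_geneB": [0.0, -1.0, 1.0]}
    for name, r in score_held_out(predicted, observed).items():
        print(name, round(r, 3))
```

In the toy data the first prediction tracks the observed direction of change and the second is inverted, so the two scores land at opposite ends of the correlation range. As the chapter notes, a score like this is only a proxy; wet-lab validation remains the final check.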
10-year vision: autonomous robot labs, democratized expertise, and stronger biosecurity defenses
Joy and Yunyun project a future of AI-connected autonomous labs that continuously design, run, and interpret experiments with humans providing high-level direction. They also emphasize societal benefits: democratizing expert knowledge, accelerating countermeasures, and improving environmental surveillance for emerging threats.
- Robot-first autonomous labs executing experiments at scale
- Human role: direction-setting, interpretation, and iterative decision-making
- Democratization: expert-level capability accessible to more people
- Medical countermeasure acceleration for flu/variants and broader preparedness
- Continuous monitoring (wastewater/air) and faster response to biological threats
