
Building AI for better healthcare — the OpenAI Podcast Ep. 14

Healthcare systems around the world are under strain, and both patients and clinicians are feeling the impact. OpenAI's Head of Health Dr. Nate Gross and Karan Singhal, who leads Health AI Research, discuss how AI can help address the biggest challenges. They cover how OpenAI is training models to handle sensitive health questions in collaboration with physicians, and how that foundation is unlocking a new generation of tools for patients, clinicians, and healthcare systems.

Chapters:
00:00:38 – Origins of Nate and Karan's interest in AI and healthcare
00:05:01 – Strategy for building AI tools for clinicians
00:06:57 – How AI models are trained for health use cases
00:10:15 – How OpenAI is able to score well on health evals
00:14:21 – Key challenges deploying AI in healthcare
00:21:05 – Collaboration with hospitals and healthcare systems
00:23:05 – Practical everyday uses of AI health assistants
00:26:43 – Biggest "wow" moment during development
00:28:46 – Feedback from clinicians and early users

Andrew Mayne (host) · Karan Singhal (guest) · Dr. Nate Gross (guest)
Mar 15, 2026 · 30 min · Watch on YouTube

At a glance

WHAT IT’S REALLY ABOUT

OpenAI’s approach to safe, context-aware AI in healthcare workflows

  1. OpenAI frames healthcare as a high-impact, high-stakes domain where AI can expand access, reduce gaps in care, and support both patients and clinicians at scale.
  2. Their health strategy centers on privacy-protecting consumer experiences (ChatGPT Health) that preserve user trust while enabling users to bring in personal context like records and wearable data.
  3. Model quality is driven by physician-led evaluation and training (e.g., HealthBench), emphasizing multi-turn realism, appropriate escalation, uncertainty handling, and adaptive communication for different audiences.
  4. They highlight deployment challenges—siloed systems, institutional variation, and the need for grounding in up-to-date guidelines—and describe connectors/standards to integrate clinical and consumer data responsibly.
  5. Real-world pilots (e.g., a Nairobi clinical copilot study) suggest AI can reduce diagnostic and treatment errors when embedded carefully into clinician workflows and monitored post-deployment.

IDEAS WORTH REMEMBERING

5 ideas

Healthcare value depends on context, not just correct facts.

The speakers argue that one-size-fits-all health info (like search) fails because care depends on patient history, setting, and constraints; ChatGPT Health is positioned to incorporate user-consented context across visits.

Trust hinges on privacy boundaries and “no training on health data.”

They emphasize encryption and a “one-way valve” concept, plus commitments not to train on users’ healthcare conversations, as prerequisites for broad adoption in sensitive use cases.

Better health AI requires realistic, clinician-designed evaluations.

HealthBench, created with a cohort of ~250 physicians and tens of thousands of rubric criteria, focuses on multi-turn conversations and nuanced evaluation dimensions (e.g., asking clarifying questions, tailoring answers to the type of user).

Health performance gains come from integrating health throughout the training lifecycle.

OpenAI claims its models incorporate health considerations from pre-training through post-training and production monitoring, rather than treating healthcare as a late-stage fine-tune.

Knowing when to escalate and expressing uncertainty are core safety features.

They cite past “overconfident hallucinations” and describe training models to say when they don’t know, ask follow-ups, and guide users toward clinicians, tests, or appropriate next steps.

WORDS WORTH SAVING

5 quotes

We actually worked really closely with a…cohort of around two hundred and fifty physicians across every stage…of generation of this data.

Karan Singhal

Healthcare is not multiple choice.

Dr. Nate Gross

One of the most important things is that the model can be trained to better know when it doesn't know and say that.

Dr. Nate Gross

When I'm biking next to a Waymo, I actually feel safer than if I was biking next to a human driver… I want everybody to have this protective effect with health AI.

Karan Singhal

They actually felt that it was dangerous to have a group of clinicians not using the AI.

Karan Singhal

TOPICS COVERED

Origins in health policy, safety research, and clinical practice
ChatGPT Health privacy protections and user context
HealthBench evaluation methodology and physician cohort
Adaptive literacy (audience-tailored medical communication)
Grounding in guidelines, literature, and local institutional practices
Workflow deployment and post-deployment monitoring
Clinical copilot pilot with Penda Health in Nairobi

High quality AI-generated summary created from speaker-labeled transcript.
