OpenAI: Building AI for better healthcare — the OpenAI Podcast Ep. 14
CHAPTERS
Why OpenAI is investing in healthcare AI: access, time, and better outcomes
Andrew Mayne sets the stage with OpenAI’s healthcare focus: training models that can handle sensitive, high-stakes questions and support clinicians, patients, and systems. Nate and Karan frame healthcare as a domain where safety, privacy, and practical usefulness must be built in from the start.
Nate Gross’s path: from health policy to frustration with clinical IT
Nate describes being drawn to healthcare through health policy and value-based care, then training in a public-hospital environment. He contrasts modern consumer tech with the outdated tools clinicians relied on, motivating his drive to modernize access and workflows.
Karan Singhal’s motivation: AGI, safety, and the healthcare opportunity gap
Karan explains his early interest in intelligence and the philosophy of mind, leading to AI research and a conviction that advanced AI would arrive within our lifetimes. He connects his safety/privacy background to healthcare, arguing the clinical world underestimated how impactful large language models could be.
Product strategy: ChatGPT Health as a secure, context-aware health companion
Nate outlines how massive consumer demand is already driving health use: a large share of ChatGPT queries are health-related. The strategy emphasizes both security (strong privacy protections) and empowerment (bringing the user's own context into conversations, rather than the "amnesiac" one-off experience of traditional search).
How health models are trained: start with evaluation, not hype
Karan explains that healthcare work began with safety/alignment motivations and an evaluation-first approach. OpenAI built HealthBench around realistic multi-turn conversations, co-designed with hundreds of physicians, to measure both usefulness and safety in situations that resemble real use.
Inside HealthBench: context-seeking, audience adaptation, and multifaceted scoring
HealthBench measures whether models ask for missing context, tailor responses to different users, and behave safely under uncertainty. Karan gives the example of ambiguous symptoms (“it burns”) where the safest move is to ask clarifying questions rather than overconfidently guess.
Why OpenAI models score strongly on health evals: health integrated across the training stack
Karan attributes strong benchmark performance to a cross-functional effort that spans pre-training through post-training, plus pre-deployment evaluation and production monitoring. Nate adds that clinically meaningful training focuses on escalation, literacy adaptation, and uncertainty—not multiple-choice test performance.
Deployment challenges in real healthcare: trust, grounding, and siloed systems
The discussion shifts to blockers for real-world adoption: clinicians need trustworthy, up-to-date, locally appropriate answers. Nate highlights the need to ground outputs in guidelines, literature, and institutional practices, while also connecting fragmented systems and data formats across organizations.
Collaboration with hospitals, government, and device ecosystems: making patient context portable
Nate explains that partnerships and standards are essential so patients can consent to bring records into ChatGPT in just a few taps. He points to EHR interoperability efforts (including government standards) and integrations with consumer devices, wearables, and biosensors to create richer, more actionable context.
Everyday AI health assistant use: from wearables to dinner plans and daily pacing
Andrew and Nate discuss practical examples where health context improves daily decisions—menu planning, activity goals, scheduling, and stress/sleep-informed planning. Nate emphasizes ChatGPT as a unifying layer that complements partner technologies, extending health insights into many daily workflows.
Clinician workflow support: “raise the floor, sweep the floor, raise the ceiling”
Nate introduces a three-part framework: broaden access to AI benefits, reduce administrative burden so clinicians regain time with patients, and enable new frontiers of medical capability. The emphasis is on augmenting care teams and improving continuity rather than replacing clinicians.
Wow moments: explosive adoption and new frontiers like repurposed medications
Karan’s “wow” is the rapid growth of health and wellness usage even before a dedicated product launch. Nate highlights the shift from interesting demos to useful, potentially transformative work—such as scaling experiments and identifying new value for shelved medications.
Clinician feedback from the field: Nairobi co-pilot study and the safety-net effect
Karan describes deploying an AI clinical co-pilot with Penda Health clinics in Nairobi, designed to monitor EHR entries and interrupt only when something looks concerning. The study found a statistically significant reduction in diagnostic and treatment errors, and clinicians valued the tool enough that they became reluctant to run future trials that would deny AI to a control group.
Early-user stories: caregiver support, clinician relief, and rare “miracle” cases
Nate shares the most meaningful feedback: caregivers under strain, overwhelmed clinicians, and occasional cases where AI helps accelerate an elusive diagnosis or critical decision by surfacing missing context. The episode closes on AI as an amplifier that helps clinicians do more for patients.