
Building AI for better healthcare — the OpenAI Podcast Ep. 14
Andrew Mayne (host), Karan Singhal (guest), Dr. Nate Gross (guest)
In this episode of the OpenAI Podcast, host Andrew Mayne talks with Karan Singhal and Dr. Nate Gross about OpenAI's approach to safe, context-aware AI in healthcare workflows.
OpenAI’s approach to safe, context-aware AI in healthcare workflows
OpenAI frames healthcare as a high-impact, high-stakes domain where AI can expand access, reduce gaps in care, and support both patients and clinicians at scale.
Their health strategy centers on privacy-protecting consumer experiences (ChatGPT Health) that preserve user trust while enabling users to bring in personal context like records and wearable data.
Model quality is driven by physician-led evaluation and training (e.g., HealthBench), emphasizing multi-turn realism, appropriate escalation, uncertainty handling, and adaptive communication for different audiences.
They highlight deployment challenges—siloed systems, institutional variation, and the need for grounding in up-to-date guidelines—and describe connectors/standards to integrate clinical and consumer data responsibly.
Real-world pilots (e.g., a Nairobi clinical copilot study) suggest AI can reduce diagnostic and treatment errors when embedded carefully into clinician workflows and monitored post-deployment.
Key Takeaways
Healthcare value depends on context, not just correct facts.
The speakers argue that one-size-fits-all health info (like search) fails because care depends on patient history, setting, and constraints; ChatGPT Health is positioned to incorporate user-consented context across visits.
Trust hinges on privacy boundaries and “no training on health data.”
They emphasize encryption and a “one-way valve” concept, plus commitments not to train on users’ healthcare conversations, as prerequisites for broad adoption in sensitive use cases.
Better health AI requires realistic, clinician-designed evaluations.
HealthBench focuses on multi-turn conversations and nuanced, clinician-designed criteria (e.g., appropriate escalation, uncertainty handling, and communication adapted to the audience) rather than multiple-choice accuracy.
Health performance gains come from integrating health considerations throughout the training lifecycle.
OpenAI claims its models incorporate health considerations from pre-training through post-training and production monitoring, rather than treating healthcare as a late-stage fine-tune.
Knowing when to escalate and expressing uncertainty are core safety features.
They cite past “overconfident hallucinations” and describe training models to say when they don’t know, ask follow-ups, and guide users toward clinicians, tests, or appropriate next steps.
Workflow-native AI can reduce clinical errors if it interrupts sparingly.
In the Penda Health pilot, the copilot monitored EHR entries and intervened only on potential issues; they report a statistically significant reduction in diagnostic and treatment errors versus a control group.
Interoperability and connectors are major blockers—and major leverage points.
They describe healthcare as siloed across tools and organizations and point to standards and connectors as the way to integrate clinical and consumer data responsibly.
Notable Quotes
“We actually worked really closely with a…cohort of around two hundred and fifty physicians across every stage…of generation of this data.”
— Karan Singhal
“Healthcare is not multiple choice.”
— Dr. Nate Gross
“One of the most important things is that the model can be trained to better know when it doesn't know and say that.”
— Dr. Nate Gross
“When I'm biking next to a Waymo, I actually feel safer than if I was biking next to a human driver… I want everybody to have this protective effect with health AI.”
— Karan Singhal
“They actually felt that it was dangerous to have a group of clinicians not using the AI.”
— Karan Singhal
Questions Answered in This Episode
HealthBench measures “49,000 dimensions” of performance—what are the most decision-critical dimensions that correlate best with safer real-world outcomes?
How does ChatGPT Health handle conflicts between “latest guidelines,” a hospital’s local protocols, and region-specific resource constraints when generating recommendations?
What does the “one-way valve” privacy model technically mean in practice (storage, access controls, retention), and how is it audited?
In the Penda Health study, what types of errors were reduced most (diagnostic vs treatment vs medication safety), and what were the intervention thresholds that minimized alert fatigue?
What are the failure modes you still see most often in production health queries: missing context, hallucinated citations, inappropriate reassurance, or mis-triage?
Transcript Preview
Hello, I'm Andrew Mayne, and this is the OpenAI Podcast. Today, we're talking to Dr. Nate Gross, Head of Health, and Karan Singhal, who leads Health AI Research at OpenAI. We'll cover what went into training models to handle sensitive questions and how AI is helping clinicians, patients, and healthcare systems.
We actually worked really closely with a, with a group, uh, a cohort of around two hundred and fifty physicians across every stage of, of generation of this data.
And we're starting to see medications that have been sitting on a shelf that all of a sudden AI has found ways for them to have direct value in, in, in patient lives.
How did you find your way into healthcare?
So, uh, what, what drew me to healthcare initially was, uh, health policy.
Mm-hmm.
Um, I was, was very interested. This was before the, the first Obama election. Uh, value-based care was first becoming a thing. Um, I, I started studying, uh, different ways to make healthcare more accessible to more people and then, um, e-eventually went to, uh, Emory-
Mm.
for medical school, and, and what, what drew me to that was, uh, a large, uh, public hospital, Grady Hospital, you know, to make sure that you're, you're taking advantage of every clinical hour you have.
So what kind of things were you doing?
So I was mostly pissing off the IT department.
[laughs]
Um, uh, when, when I was in medical school, uh, the newsfeed came out, the iPhone came out, uh, Twitter came out, uh, the App Store came out, and, and so comparing the technology that we had as doctors, which was fax machine, clipboard, paper binder, the beginnings of electronic health records, to like what my friends had or what the patients had in the waiting room was pretty profound.
So you come at it from the point of view as an AI researcher. Where did your interest in applying this to healthcare come from?
So I, I nerded out a lot when I was younger about things like philosophy of mind, and I thought a lot about, you know, intelligence and how far could intelligence go and could machines be intelligent. Um, and, um, a lot of those explorations took me towards, as I was learning about AI and starting to work on my first AI projects, thinking a lot about the ways in which AI could have a lot of impact on humanity in the future. And I thought something like-- I didn't, I didn't predict the future or how fast it would happen, but I thought something like AGI would happen within our lifetimes. So then once, once I had that conviction, I thought a lot about, you know, what are the ways in which I can have either positive impact and, and hopefully make that a really large upside for humanity, or think about the ways in which we could avoid downside. So since then, uh, in my career, I've been thinking a lot about both sides of that coin, thinking about that from the perspective as a safety researcher, which is part of my background, and then really some of that work on safety and privacy that I was working on previously, I started applying it in healthcare, and then I started being like, "Whoa, there is a really massive opportunity to think about the application of this technology, especially large language models in healthcare." And that's what took me to transitioning to it full-time, was just the size of that opportunity and the fact that I felt like the healthcare and clinical AI world was kind of not fully aware of that, of that gap. Um, and so I just thought it was kind of a really amazing opportunity and responsibility to, to bring us there.