
Building AI for better healthcare — the OpenAI Podcast Ep. 14
Andrew Mayne (host), Karan Singhal (guest), Dr. Nate Gross (guest)
In this episode of the OpenAI Podcast, host Andrew Mayne talks with Karan Singhal and Dr. Nate Gross about OpenAI's approach to safe, context-aware AI in healthcare workflows.
OpenAI’s approach to safe, context-aware AI in healthcare workflows
OpenAI frames healthcare as a high-impact, high-stakes domain where AI can expand access, reduce gaps in care, and support both patients and clinicians at scale.
Their health strategy centers on privacy-protecting consumer experiences (ChatGPT Health) that preserve user trust while enabling users to bring in personal context like records and wearable data.
Model quality is driven by physician-led evaluation and training (e.g., HealthBench), emphasizing multi-turn realism, appropriate escalation, uncertainty handling, and adaptive communication for different audiences.
They highlight deployment challenges—siloed systems, institutional variation, and the need for grounding in up-to-date guidelines—and describe connectors/standards to integrate clinical and consumer data responsibly.
Real-world pilots (e.g., a Nairobi clinical copilot study) suggest AI can reduce diagnostic and treatment errors when embedded carefully into clinician workflows and monitored post-deployment.
Key Takeaways
Healthcare value depends on context, not just correct facts.
The speakers argue that one-size-fits-all health info (like search) fails because care depends on patient history, setting, and constraints; ChatGPT Health is positioned to incorporate user-consented context across visits.
Trust hinges on privacy boundaries and “no training on health data.”
They emphasize encryption and a “one-way valve” concept, plus commitments not to train on users’ healthcare conversations, as prerequisites for broad adoption in sensitive use cases.
Better health AI requires realistic, clinician-designed evaluations.
HealthBench focuses on multi-turn conversations and nuanced, clinician-designed criteria (e.g., appropriate escalation, uncertainty handling, and communication adapted to the audience) rather than multiple-choice accuracy.
Health performance gains come from integrating health considerations throughout the training lifecycle.
OpenAI claims its models incorporate health considerations from pre-training through post-training and production monitoring, rather than treating healthcare as a late-stage fine-tune.
Knowing when to escalate and expressing uncertainty are core safety features.
They cite past “overconfident hallucinations” and describe training models to say when they don’t know, ask follow-ups, and guide users toward clinicians, tests, or appropriate next steps.
Workflow-native AI can reduce clinical errors if it interrupts sparingly.
In the Penda Health pilot, the copilot monitored EHR entries and intervened only on potential issues; they report a statistically significant reduction in diagnostic and treatment errors versus a control group.
Interoperability and connectors are major blockers—and major leverage points.
They describe healthcare as siloed across tools and organizations and point to standards and connectors as the way to integrate clinical and consumer data responsibly.
Notable Quotes
“We actually worked really closely with a…cohort of around two hundred and fifty physicians across every stage…of generation of this data.”
— Karan Singhal
“Healthcare is not multiple choice.”
— Dr. Nate Gross
“One of the most important things is that the model can be trained to better know when it doesn't know and say that.”
— Dr. Nate Gross
“When I'm biking next to a Waymo, I actually feel safer than if I was biking next to a human driver… I want everybody to have this protective effect with health AI.”
— Karan Singhal
“They actually felt that it was dangerous to have a group of clinicians not using the AI.”
— Karan Singhal
Questions Answered in This Episode
HealthBench measures “49,000 dimensions” of performance—what are the most decision-critical dimensions that correlate best with safer real-world outcomes?
How does ChatGPT Health handle conflicts between “latest guidelines,” a hospital’s local protocols, and region-specific resource constraints when generating recommendations?
What does the “one-way valve” privacy model technically mean in practice (storage, access controls, retention), and how is it audited?
In the Penda Health study, what types of errors were reduced most (diagnostic vs treatment vs medication safety), and what were the intervention thresholds that minimized alert fatigue?
What are the failure modes you still see most often in production health queries: missing context, hallucinated citations, inappropriate reassurance, or mis-triage?
Transcript Preview
Hello, I'm Andrew Mayne, and this is the OpenAI Podcast. Today, we're talking to Dr. Nate Gross, Head of Health, and Karan Singhal, who leads Health AI Research at OpenAI. We'll cover what went into training models to handle sensitive questions and how AI is helping clinicians, patients, and healthcare systems.
We actually worked really closely with a, with a group, uh, a cohort of around two hundred and fifty physicians across every stage of, of generation of this data.
And we're starting to see medications that have been sitting on a shelf that all of a sudden AI has found ways for them to have direct value in, in, in patient lives.
How did you find your way into healthcare?
So, uh, what, what drew me to healthcare initially was, uh, health policy.
Mm-hmm.
Um, I was, was very interested. This was before the, the first Obama election. Uh, value-based care was first becoming a thing. Um, I, I started studying, uh, different ways to make healthcare more accessible to more people and then, um, e-eventually went to, uh, Emory-
Mm.
for medical school, and, and what, what drew me to that was, uh, a large, uh, public hospital, Grady Hospital, you know, to make sure that you're, you're taking advantage of every clinical hour you have.
So what kind of things were you doing?
So I was mostly pissing off the IT department.
[laughs]
Um, uh, when, when I was in medical school, uh, the newsfeed came out, the iPhone came out, uh, Twitter came out, uh, the App Store came out, and, and so comparing the technology that we had as doctors, which was fax machine, clipboard, paper binder, the beginnings of electronic health records, to like what my friends had or what the patients had in the waiting room was pretty profound.
So you come at it from the point of view as an AI researcher. Where did your interest in applying this to healthcare come from?
So I, I nerded out a lot when I was younger about things like philosophy of mind, and I thought a lot about, you know, intelligence and how far could intelligence go and could machines be intelligent. Um, and, um, a lot of those explorations took me towards, as I was learning about AI and starting to work on my first AI projects, thinking a lot about the ways in which AI could have a lot of impact on humanity in the future. And I thought something like-- I didn't, I didn't predict the future or how fast it would happen, but I thought something like AGI would happen within our lifetimes. So then once, once I had that conviction, I thought a lot about, you know, what are the ways in which I can have either positive impact and, and hopefully make that a really large upside for humanity, or think about the ways in which we could avoid downside. So since then, uh, in my career, I've been thinking a lot about both sides of that coin, thinking about that from the perspective as a safety researcher, which is part of my background, and then really some of that work on safety and privacy that I was working on previously, I started applying it in healthcare, and then I started being like, "Whoa, there is a really massive opportunity to think about the application of this technology, especially large language models in healthcare." And that's what took me to transitioning to it full-time, was just the size of that opportunity and the fact that I felt like the healthcare and clinical AI world was kind of not fully aware of that, of that gap. Um, and so I just thought it was kind of a really amazing opportunity and responsibility to, to bring us there.