Live from DevDay — the OpenAI Podcast Ep. 7

OpenAI | Oct 6, 2025 | 1h 1m

Andrew Mayne (host)

Safe, managed AI in classrooms
Teacher workflows: prompts, tools, dashboards
Agent Builder / Agents SDK and permissions
MCP servers and partner integrations
Evals, monitoring, and quality at scale
Non-engineers shipping changes via PRs
Healthcare documentation burden and ambient listening
IDE/agent workflows and context engineering
Dogfooding and internal product-market fit
Disposable one-off software vs long-lived apps

DevDay highlights: agents, evals, and domain tools transforming work

Host Andrew Mayne interviews leaders from SchoolAI, Jam.dev, Abridge, and Cursor about what DevDay announcements unlock for their products and customers.

Across domains, the conversation centers on moving beyond “models” to reliable products: orchestration/agents, guardrails and permissions, evaluations, and faster iteration loops.

Education and healthcare emphasize trust, safety, and domain-specific definitions of failure (e.g., “hallucination”), while Jam.dev and Cursor focus on empowering non-engineers and speeding software creation.

A recurring theme is that better tooling (Agent/Agent Builder, apps, MCP integrations, evals/optimizers) shifts advantage from raw engineering capacity toward domain expertise, product design, and customer understanding.

Key Takeaways

AI adoption in schools is moving from bans to literacy to embedded tutoring.

Caleb Hicks describes a common progression: initial prohibition, then teacher productivity, and now recognition that students must learn AI to stay competitive—culminating in classroom-connected tutors that reflect actual lessons and goals.

Teachers shouldn’t need to become prompt engineers.

SchoolAI “enriches” teacher prompts and packages common tasks as forms and tools (lesson plans, adapted readings), reserving complexity (orchestration/meta-prompting) for the platform rather than the educator.

The real product is outcomes and workflows—not the AI tools themselves.

Mayne and Hicks stress that DevDay primitives are a platform layer; winning products come from deep understanding of educator needs (dashboards, exit tickets, actionable student support), not just wiring APIs.

Guardrailed, observable tutoring enables a “GPS for impact.”

SchoolAI’s managed, one-time tutors produce real-time insight into student understanding and social-emotional status, helping teachers prioritize attention—especially when they may support hundreds of students.

Evals become non-optional when usage scales; small error rates explode.

A 2–3% issue rate becomes massive with millions of users; speakers highlight integrated eval tooling as crucial because teams often defer building eval suites despite their importance.
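
To make the scale concrete, here is a rough back-of-the-envelope sketch using the figures Hicks quotes (five million students, a 2 to 3% issue rate). It assumes, purely for illustration, that every student is active each day and has exactly one AI interaction; the function name and the `interactions_per_user` parameter are hypothetical and not from the episode.

```python
def daily_issues(active_users: int, issue_rate: float, interactions_per_user: int = 1) -> int:
    """Expected number of problematic interactions per day.

    Assumption (not from the episode): each active user has
    `interactions_per_user` interactions per day, and each one
    fails independently at `issue_rate`.
    """
    return round(active_users * interactions_per_user * issue_rate)


# Figures quoted in the episode: five million students, 2 to 3% issue rate.
low = daily_issues(5_000_000, 0.02)
high = daily_issues(5_000_000, 0.03)
print(f"{low:,} to {high:,} issues per day")  # 100,000 to 150,000 issues per day
```

Under these assumptions, even the low end is about 100,000 problematic interactions a day, which is why the speakers treat eval suites as non-optional at this scale.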

Editing production UI should be as easy as editing a doc—then shipping a PR.

Dani Grant’s “Please Fix” vision removes engineering bottlenecks for small changes: PMs/designers/marketers edit in-browser and generate clean, design-system-compliant pull requests for engineers to review.

We’re entering “read, write, think” computing with apps inside ChatGPT.

Jam. ...

Two software futures: polished long-term products and disposable agent-built tools.

Grant predicts agents will dynamically generate one-off dashboards/tools on demand, alongside enduring apps like Canva/Zillow—changing how organizations think about building versus generating software.

Healthcare requires bespoke definitions of hallucination and rigorous remediation.

Abridge’s Zach Lipton argues “hallucination” depends on context and authorization: even plausible facts are errors if not supported by the conversation; their pipeline detects and remediates sentence-level issues, reporting ~97% recall.

Trust in high-stakes domains is earned continuously, not by launch day.

In education and medicine, adoption depends on ongoing delivery: security, reliability, service, and incremental scope expansion (beyond scribing to full visit support, billing/approval prep, decision support).

Coding tools are shifting from autocomplete to autonomous agents and RL loops.

Cursor’s Lee Robinson describes progress from simple completion to agents that refactor and self-correct, plus rapid online reinforcement learning updates for autocomplete driven by accept/reject signals.

Internal dogfooding and “internal PMF” can drive faster, better releases.

Cursor ships features internally first, watches adoption/churn, iterates, and only then broadens rollout—using weekly demos and even incentives to pressure-test value before public release.

Notable Quotes

Teachers should never have to become prompt engineers.

Caleb Hicks (SchoolAI)

When you have five million students using your platform, 2 to 3% means a whole ton of issues every day.

Caleb Hicks (SchoolAI)

It lets any PM, designer, marketer… fix what's broken instantly without writing code.

Dani Grant (Jam.dev)

I think we just saw a new way to browse the web… read, write, think.

Dani Grant (Jam.dev)

‘I actually got to have dinner with my family every night this week for the first time in like 10 years’… ‘Abridge is saving my marriage.’

Zach Lipton (Abridge)

Questions Answered in This Episode

SchoolAI: What specific safeguards and permissioning do you require for student-facing “one-time tutors,” and how do you prevent prompt injection or off-task behavior in class?

SchoolAI: How do you design the teacher dashboard so it flags the “middle 80%” effectively without overwhelming teachers with yet another data feed?

DevDay tooling: Which parts of your in-house orchestration stack do you expect to delete or replace with the Agent Builder/Agents SDK—and what will you keep because it’s domain-specific?

Evals: What does an eval suite look like for classroom tutoring (helpfulness vs. giving away answers), and how do you measure learning impact rather than just response quality?

Jam.dev: How does “Please Fix” ensure the PRs are safe (no unintended side effects) and aligned with the design system—especially for complex component libraries?

Transcript Preview

Andrew Mayne

[upbeat music] Welcome to the OpenAI Podcast, where we're live from OpenAI DevDay. Here sitting with me from SchoolAI is Caleb Hicks. Caleb, hello.

Caleb Hicks

Hi, thanks for having me. This will be fun.

Andrew Mayne

So, Caleb, you're working on tools for helping educators and helping people, uh, basically in the classroom understand progress of students?

Caleb Hicks

That's right, yeah.

Andrew Mayne

So first off, what was your reaction so far to DevDay?

Caleb Hicks

Uh, a ton of fun, I think makes it... Uh, a, a, a lot of things to be excited about that help us build, but also help students and teachers be more creative as well, so that'll be fun.

Andrew Mayne

So what have you been working on over the last year? What has changed with AI that's accelerated what you've been doing?

Caleb Hicks

Ooh, um, I think probably the biggest advancement over the last year for us- so we, we, uh, put AI in students' hands. That's the main, uh, the main thing that we focus on, is safe, managed, uh, AI that can act as kind of one-time personal tutors for students.

Andrew Mayne

Mm-hmm.

Caleb Hicks

And so probably the biggest change from OpenAI has been model progression.

Andrew Mayne

Hmm.

Caleb Hicks

I think we get two advantages from that. One is, uh, significant leaps in intelligence.

Andrew Mayne

Mm-hmm.

Caleb Hicks

Uh, and the other one is, uh, you know, improvements in cost. Because we are working with, uh, an industry that isn't known for paying big dollars for software, uh, it's been important for us to be able to manage students using this in a cost-effective way. So those have been the two areas that AI progression has helped. From our work, it has been a lot of, uh, orchestration, which I'm sure we'll talk about a little bit.

Andrew Mayne

Mm-hmm.

Caleb Hicks

Just, just getting different AI agents and models to work together, uh, for the best outputs for students in particular.

Andrew Mayne

So a couple of the releases we saw today, one was the Agents SDK.

Caleb Hicks

Yeah.

Andrew Mayne

Now, and you've talked about that. How much have, one, tools changed the ability of, one, to work faster, and, two, the scope of what you find is capable now?

Caleb Hicks

Yeah, I think we're seeing teams across industries work way faster-

Andrew Mayne

Mm

Caleb Hicks

... and building better software because they've got kind of this always on expert, hyper-senior, uh, engineer next to them that they're pair programming with, right? And, uh, so we see that with our teams as well, and that just allows us to, to build better software faster, uh, and get it in the hands of teachers and students, which is what we're here to do.

Andrew Mayne

What has been the biggest shift you've seen in talking to educators or people like that in regards to AI in general?

Caleb Hicks

Yeah, great question. So every teacher, school, and district is on a very similar journey. It starts with permission, right?

Andrew Mayne

Mm-hmm.

Caleb Hicks

Two-and-a-half years ago, it was everyone under the sun was just banning AI altogether.

Andrew Mayne

Right.

Caleb Hicks

Uh, we've, we've moved past that into productivity-

Andrew Mayne

Mm

Caleb Hicks

... as teachers and, and school leaders realizing, "Hey, this helps me in my job."
