No Priors Ep. 118 | With Anthropic Co-Founder Ben Mann

No Priors | Jun 12, 2025 | 41m

Sarah Guo (host), Ben Mann (guest), Elad Gil (host)

- Claude 4 capabilities, benchmarks, and release philosophy
- Agentic behavior, long-horizon tasks, and Claude Code for software development
- Model self-improvement and AI's role in accelerating AI research and infrastructure
- Human feedback vs AI feedback, Constitutional AI, and preference modeling
- AI safety, Responsible Scaling Policy, and high-stakes domains like biology
- Provider competition, enterprise focus, and vertical integration into key applications
- Model Context Protocol (MCP) as an open standard for tool and context integration

Anthropic’s Ben Mann on Claude 4, agents, safety, and MCP’s future

Anthropic co-founder Ben Mann discusses the Claude 4 release, emphasizing major improvements in coding reliability, long-horizon autonomy, and agentic workflows, particularly through Claude Code. He outlines how Anthropic balances model capability with safety, including reinforcement learning from AI feedback (RLAIF), Constitutional AI, and their Responsible Scaling Policy focused on high-risk domains like biology. Mann also explores how models will increasingly help build and improve future models via coding, research assistance, and synthetic environments. The conversation closes with Anthropic’s ecosystem strategy, including Model Context Protocol (MCP) as an open standard for tools and integrations across providers.

Key Takeaways

Claude 4 significantly improves coding reliability and reduces unwanted code changes.

Compared to previous Claude versions, Claude 4 (especially Sonnet and Opus) is much better at doing exactly what was requested in code, avoiding reward-hacking behaviors like deleting code to pass tests or making over-eager refactors.

Agentic, long-horizon workflows are now practical for real-world tasks.

Customers are using Claude for hours-long unattended tasks—such as large-scale code refactors or transforming videos into slide decks via tools and APIs—showing that multi-step, multi-tool orchestration is becoming production-ready.

Models will increasingly accelerate their own development pipelines.

Claude is already valuable for systems coding, experiment analysis, and research assistance, and Anthropic expects models to further accelerate the pipeline that produces future models, including through synthetic training environments.

Human expert feedback is becoming a bottleneck; AI feedback fills the gap.

As models surpass typical human expertise in domains like coding, Anthropic leans on RLAIF and Constitutional AI, using models to critique and refine their own outputs under human-written principles and small amounts of high-quality expert preferences.
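The critique-and-revision loop behind Constitutional AI can be sketched as follows. This is a minimal, hypothetical illustration, not Anthropic's implementation: `model()` is a stand-in for a real LLM call, and the single principle shown is an invented example rather than the actual constitution.

```python
# Sketch of the Constitutional AI critique-and-revision loop: the model
# critiques its own draft against written principles, then rewrites it.
# The resulting (prompt, revised) pairs can serve as AI-generated
# preference data for RLAIF, reducing reliance on human labelers.

CONSTITUTION = [
    # Invented example principle, not Anthropic's actual constitution.
    "Choose the response that is most helpful while avoiding harmful content.",
]

def model(prompt: str) -> str:
    """Placeholder LLM call; a real implementation would query an actual model."""
    return f"[model output for: {prompt[:40]}...]"

def constitutional_revision(prompt: str) -> str:
    """Draft a response, then critique and revise it against each principle."""
    revised = model(prompt)
    for principle in CONSTITUTION:
        critique = model(
            f"Critique this response against the principle '{principle}':\n{revised}"
        )
        revised = model(
            f"Rewrite the response to address this critique:\n{critique}\nOriginal:\n{revised}"
        )
    return revised

better = constitutional_revision("Explain how database locks work.")
```

In the real method, the revised outputs train a preference model, so the human-written principles (plus a small amount of expert preference data) substitute for large-scale human feedback.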

Safety work is shifting toward empiricism and domain-specific, real-world feedback.

For areas where correctness is hard to judge (medicine, law, biology), Anthropic envisions empirical feedback loops with partners—such as pharma and healthcare companies—feeding observed outcomes back into models instead of relying solely on abstract judgments.

Biological misuse is a focal risk, and uplift over search engines is the key metric.

Anthropic classifies Claude 4 Opus as ASL-3 partly because it significantly improves novice capabilities relative to Google Search in wet-lab biology, and they deploy classifiers and safeguards specifically to detect and block harmful bio-assistance.

MCP aims to standardize how models use tools, democratizing integrations across providers.

Model Context Protocol lets any service—internal or external—expose standardized capabilities to any compliant client or model, and has been adopted or endorsed by major AI companies, opening the door to self-written, on-the-fly integrations by agents themselves.
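The core idea can be sketched concretely. The snippet below is a hypothetical, self-contained illustration of the MCP model, not the official SDK: a server keeps a catalog of tools and answers JSON-RPC 2.0 requests from any compliant client. The method names `tools/list` and `tools/call` follow the MCP specification; the `get_weather` tool and its handler are invented for illustration.

```python
import json

# Hypothetical sketch of the MCP idea: a server exposes a tool catalog that
# any compliant client or model can discover and invoke over JSON-RPC 2.0.
TOOLS = {
    "get_weather": {  # invented example tool
        "description": "Return a canned weather report for a city.",
        "handler": lambda args: f"Sunny in {args['city']}",
    }
}

def handle(request: dict) -> dict:
    """Dispatch a JSON-RPC 2.0 request to the tool registry."""
    if request["method"] == "tools/list":
        result = {"tools": [{"name": name, "description": tool["description"]}
                            for name, tool in TOOLS.items()]}
    elif request["method"] == "tools/call":
        params = request["params"]
        result = {"content": TOOLS[params["name"]]["handler"](params["arguments"])}
    else:
        return {"jsonrpc": "2.0", "id": request["id"],
                "error": {"code": -32601, "message": "Method not found"}}
    return {"jsonrpc": "2.0", "id": request["id"], "result": result}

# A client-side tool invocation, as it might look on the wire:
req = {"jsonrpc": "2.0", "id": 1, "method": "tools/call",
       "params": {"name": "get_weather", "arguments": {"city": "Paris"}}}
print(json.dumps(handle(req)))
```

Because both sides speak the same discovery and invocation protocol, any service can plug into any client, which is what makes agent-written, on-the-fly integrations plausible.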

Notable Quotes

More agentic, longer horizon tasks are newly unlocked with Claude 4.

Ben Mann

The new models, they just do the thing… and that’s really useful for professional software engineering where you need it to be maintainable and reliable.

Ben Mann

We pioneered RLAIF, which is reinforcement learning from AI feedback… and the method that we used was called Constitutional AI.

Ben Mann

It has to boil down to empiricism… at some point, we’re gonna need to work with companies that have actual bio labs.

Ben Mann

MCP is sort of a democratizing force in letting anybody… integrate against a fully fledged client, regardless of what model provider or long tail service provider you have.

Ben Mann

Questions Answered in This Episode

How far can AI self-improvement go before human oversight becomes ineffective or purely supervisory rather than substantive?

Where should labs draw a hard line on AI safety research that’s too risky to conduct, even under strong containment?

How will enterprises practically decide when to delegate long-horizon, high-stakes workflows to agentic systems like Claude versus keeping humans in the loop?

What governance structures should exist around open standards like MCP to prevent concentrated control by a few dominant providers?

As AI begins outperforming domain experts in medicine, law, and biology, who should own responsibility and liability for decisions made with AI assistance?

Transcript Preview

Sarah Guo

Hi listeners, and welcome back to No Priors. Today we have Ben Mann, previously an early engineer at OpenAI, where he was one of the first authors on the GPT-3 paper. Ben was then one of the original eight who abandoned ship in 2021 to co-found Anthropic, with a commitment to long-term safety. He has since led multiple parts of the Anthropic organization, including product engineering and now labs, home to popular efforts such as Model Context Protocol and Claude Code. Welcome, Ben. Thank you so much for doing this.

Ben Mann

Of course. Thanks for having me.

Sarah Guo

So congratulations on the Claude 4 release. Maybe we can even start with, like, how do you decide what qualifies as a release these days?

Ben Mann

It's definitely more of an art than a science. We have a lot of spirited internal debate of what the number should be, and before we even have a potential model, we, we have a road map where we try to say, based on the amount of chips that we get in, uh, when will we theoretically be able to train a model out to the Pareto-efficient compute frontier? So it's all based on scaling laws. And then once we get the chips, then we try to train it and inevitably things are less than the best that we could possibly imagine because that's just the nature of, of the business. It's, it's pretty hard to train these big models, so dates might change a little bit. And then at some point it's, like, mostly baked and we're sort of, like, slicing off little pieces close to the end to try to say, like, how is this cake gonna taste when it comes out of the oven? But, uh, as Dario has said, until it's really done you, you don't really know. You can get sort of a directional indication. And then if it feels like a major change, then we give it a major version bump, but we're definitely still learning and iterating on this process, so yeah.

Sarah Guo

Well, the good thing is that you guys are, uh, you know, no less tortured than anybody else in your (laughs) naming scheme here.

Ben Mann

Yes.

Elad Gil

The naming schemes in AI are s- are something else, so you, you folks have a, a simplified version (laughs) in some sense. Do you wanna, um, mention any of the highlights from 4 that you think are especially interesting, or, you know, those things around coding and other areas, we'd just love to hear your perspective on that.

Ben Mann

By the benchmarks, 4 is just dramatically better than any other models that we've had. Even Sonnet is dramatically better than 3.7 Sonnet, which was our prior best model. Some of the things that are dramatically better are, for example, in coding it is able to, uh, not do its, uh, sort of off-target mutations or over-eagerness, or reward hacking. Those are things that people were really unhappy with in, in the last model, where they were like, "Wow, it's so good at coding, but it also makes all these changes that I definitely didn't ask for." It's like, "Do you want fries and a milkshake with that change?" And you're like, "No, just do the thing I asked for," and then you have to spend a bunch of time cleaning up after it. The new models, they just do the thing, and, uh, and, and so that's really useful for professional software engineering where you need it to be maintainable and reliable.
