
What is Claude Managed Agents?

Claude Managed Agents is a suite of APIs for building production-ready agents. You define the tools, environments, and success criteria. Claude works until the job is done. Outcomes, multi-agent orchestration, and memory are available as a limited research preview. Apply here for early access: http://claude.com/form/claude-managed-agents

Apr 9, 2026 · 3m

CHAPTERS

  1. Claude Managed Agents overview: APIs for scalable, sandboxed agent execution

    The video opens by defining Claude Managed Agents as a suite of APIs for building and deploying agents at scale. It emphasizes configuring agent personas/tools and running work inside isolated containers with filesystem access, bash, and web search.

    • Managed Agents as an API suite for building/deploying agents at scale
    • Agents defined by tools, personas, and capabilities
    • Configurable sandbox environments (packages + network controls)
    • Sessions launched from your app; Claude executes in isolated containers
    • Built-in access patterns: filesystem, bash execution, web search
  2. Kanban-triggered agent sessions: turning a ticket into an automated workflow

    A Kanban board UI is used to demonstrate how moving a ticket to “In progress” automatically starts an agent session. The backend creates the session and attaches it to a preconfigured environment with the needed tooling and repo access.

    • Drag-and-drop ticket movement triggers an agent session
    • Backend creates and configures the session automatically
    • Environment includes pre-installed dependencies (e.g., Lighthouse, Puppeteer)
    • Mounting a GitHub repo into the container for direct code access
    • Task is framed with an explicit rubric for success
  3. Website performance optimization loop: audits, changes, and streaming tool output

    The agent runs performance audits and makes concrete optimizations (images, CSS, scripts) while streaming tool calls back to the Kanban board in real time. This section highlights tight feedback via an event stream during execution.

    • Agent runs a Lighthouse-style audit within the sandbox
    • Performs optimizations: compress images, inline CSS, defer scripts
    • Targets a defined rubric (e.g., Lighthouse score > 90, no render-blocking resources)
    • Tool calls/events stream back live to the UI via an event stream
    • Demonstrates end-to-end automation from ticket to changes
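The live streaming described in this chapter can be pictured as a producer of events and a consumer that renders each one as it arrives. The sketch below is only an illustration under assumptions: `agent_events`, the event fields (`type`, `tool`, `summary`), and `stream_to_board` are hypothetical names, not the Managed Agents event schema.

```python
from typing import Callable, Iterable, Iterator


def agent_events() -> Iterator[dict]:
    """Hypothetical stand-in for a session's event stream (not the real schema)."""
    yield {"type": "tool_call", "tool": "bash", "summary": "run performance audit"}
    yield {"type": "tool_call", "tool": "bash", "summary": "compress images"}
    yield {"type": "status", "summary": "idle"}


def render(event: dict) -> str:
    """Turn one event into a single line for the board's activity feed."""
    if event["type"] == "tool_call":
        return f"[{event['tool']}] {event['summary']}"
    return f"({event['type']}) {event.get('summary', '')}".rstrip()


def stream_to_board(events: Iterable[dict], push: Callable[[str], None]) -> int:
    """Forward each event to the UI as soon as it arrives; return how many were sent."""
    count = 0
    for event in events:
        push(render(event))
        count += 1
    return count
```

In a real integration `push` would write to a websocket or similar channel; here any callable that accepts a string works, such as `list.append`.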
  4. Separate grading context: rubric-based evaluation and iterative resubmission

    A dedicated grader (running separately) evaluates the agent’s output against predefined criteria. The agent reads feedback, corrects misses, and resubmits until it meets the bar.

    • Rubric-based evaluation drives quality control
    • A separate grader runs in its own context window
    • Agent consumes grader feedback to improve output
    • Iterative loop: fix issues and resubmit
    • Example outcome: performance score reaches 96
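The grade-and-resubmit loop in this chapter can be sketched in a few lines. This is a minimal illustration, not the Managed Agents API: `Rubric`, `grade`, `run_until_done`, and `toy_agent` are hypothetical names, and the toy agent simply returns canned results in place of real work. The thresholds mirror the rubric mentioned in the video (score above 90, no render-blocking resources).

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rubric:
    min_score: int = 90  # the score must exceed this value


def grade(result: dict, rubric: Rubric) -> list[str]:
    """Separate grader: return the unmet criteria (an empty list means pass)."""
    feedback = []
    if result["score"] <= rubric.min_score:
        feedback.append(f"score {result['score']} must exceed {rubric.min_score}")
    if result["render_blocking"] > 0:
        feedback.append(f"{result['render_blocking']} render-blocking resources remain")
    return feedback


def run_until_done(attempt: Callable[[list[str]], dict], rubric: Rubric,
                   max_rounds: int = 5) -> tuple[int, dict]:
    """Call the agent, grade its output, and feed the misses back until it passes."""
    feedback: list[str] = []
    for round_no in range(1, max_rounds + 1):
        result = attempt(feedback)
        feedback = grade(result, rubric)
        if not feedback:
            return round_no, result
    raise RuntimeError("rubric not met within max_rounds")


def toy_agent(feedback: list[str]) -> dict:
    """Stand-in for the agent: misses on the first try, passes once it has feedback."""
    if not feedback:
        return {"score": 82, "render_blocking": 2}
    return {"score": 96, "render_blocking": 0}
```

Calling `run_until_done(toy_agent, Rubric())` takes two rounds here. The `max_rounds` cap is a design choice of this sketch so that a rubric that can never be met does not loop forever.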
  5. Parallelism: multiple sessions and containers running concurrently

    The workflow demonstrates that multiple tickets can be run at the same time. Each session gets its own isolated container, enabling true parallel task execution without interference.

    • Second ticket can be started while the first is still running
    • Two sessions map to two separate containers
    • Isolation across tasks prevents cross-contamination
    • Parallel execution supports teams running several workstreams at once
    • Illustrates how Managed Agents enables concurrency by design
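As a rough local analogy for one container per session, the sketch below gives each task its own temporary working directory and runs the tasks concurrently. It assumes nothing about the real service: `run_session` and `run_tickets` are hypothetical names, threads stand in for sessions, and a temporary directory stands in for an isolated container.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def run_session(ticket: str) -> dict:
    """Hypothetical session: works inside its own private directory."""
    with tempfile.TemporaryDirectory(prefix="session-") as workdir:
        notes = Path(workdir) / "notes.txt"
        notes.write_text(f"working on {ticket}\n")
        files = sorted(p.name for p in Path(workdir).iterdir())
        return {"ticket": ticket, "workdir": workdir, "files": files}


def run_tickets(tickets: list[str]) -> list[dict]:
    """Run one session per ticket in parallel; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max(1, len(tickets))) as pool:
        return list(pool.map(run_session, tickets))
```

Because each session writes only inside its own directory, starting a second ticket cannot overwrite files belonging to the first, which is the property the chapter attributes to separate containers.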
  6. SaaS pricing intelligence agent: web search, change tracking, and cost analysis

    A second agent is showcased that monitors pricing and plan changes across SaaS vendors. It gathers current data via web search and performs cost analysis in Python inside the sandbox.

    • Automated tracking of pricing and plan tier changes across SaaS tools
    • Uses web search to find current pricing pages and updates
    • Flags new features that may impact contracts
    • Runs quantitative cost analysis in Python within the sandbox
    • Produces a report suitable for team review before standup
  7. Operational reporting integrations: spreadsheets, Slack, Asana via MCP

    The pricing agent generates outputs in familiar business formats and pushes results into existing workflows. It uses spreadsheet capabilities and posts updates to collaboration/project tools through MCP servers.

    • Uses an Excel/spreadsheet skill to structure results
    • Writes an executive summary for stakeholders
    • Posts a report link to Slack when complete
    • Creates a review task in Asana
    • Integrations are performed through MCP servers
  8. Memory-backed weekly deltas: avoiding repeat work and highlighting changes

    Memory is used to compare current findings to prior weeks, so the report focuses on what changed rather than repeating static information. The agent reads previous state before running and writes back updated state after finishing.

    • Agent reads from a memory store before starting
    • Stores new observations after completing the run
    • Weekly reports emphasize deltas (what changed) vs. static listings
    • Example delta: “Cloud compute 15% lower since last week”
    • Memory enables continuity across recurring workflows
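The read-before, write-after pattern and the delta-focused report can be sketched with a JSON file standing in for the memory store. Everything here is an assumption for illustration: the file-based store, the function names, and the flat name-to-price mapping are hypothetical, not how Managed Agents memory is implemented.

```python
import json
from pathlib import Path


def load_memory(path: Path) -> dict[str, float]:
    """Read last run's observations; an absent file means this is the first run."""
    return json.loads(path.read_text()) if path.exists() else {}


def weekly_deltas(previous: dict[str, float], current: dict[str, float]) -> list[str]:
    """Report only what changed; unchanged items are left out."""
    lines = []
    for name, price in sorted(current.items()):
        old = previous.get(name)
        if old is None:
            lines.append(f"{name}: new entry at {price:.2f}")
        elif price != old and old != 0:
            change = (price - old) / old * 100
            direction = "lower" if change < 0 else "higher"
            lines.append(f"{name}: {abs(change):.0f}% {direction} since last week")
    return lines


def run_weekly(path: Path, current: dict[str, float]) -> list[str]:
    """Read memory, build the delta report, then write the new state back."""
    previous = load_memory(path)
    report = weekly_deltas(previous, current)
    path.write_text(json.dumps(current))
    return report
```

With illustrative prices of 100.0 one week and 85.0 the next, the second run reports a line in the same form as the video's example, a 15% drop since last week, and says nothing about items whose price did not move.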
  9. Monitoring alert to agent session: ingesting tool results from your backend

    An incident workflow begins with an alert from a monitoring stack. A custom backend tool forwards the alert payload into a new session as a tool result, kicking off automated triage.

    • Alert originates in the monitoring stack
    • Custom backend tool receives the alert payload
    • Payload is injected into a new agent session as a tool result
    • Shows how Managed Agents plugs into existing ops pipelines
    • Sets up automated incident response execution
  10. Multi-agent coordination: coordinator + specialists sharing a filesystem

    The triage session uses multi-agent coordination: a coordinator delegates to multiple specialists. Each specialist runs in its own context window while sharing a common filesystem, then reports back for synthesis.

    • Coordinator agent delegates work to three specialists
    • Specialists run in separate context windows
    • Shared filesystem enables collaboration on the same artifacts
    • Specialists report findings back to the coordinator
    • Coordinator synthesizes results into a unified incident summary
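The coordinator-and-specialists shape can be approximated locally: each specialist receives only its own inputs, writes its findings to a shared directory, and the coordinator reads those files to build the summary. This is a sketch under assumptions; the specialist names (`logs`, `metrics`, `deploys`), the alert fields, and the canned findings are hypothetical, since the video only says that there are three specialists.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

SPECIALISTS = ["logs", "metrics", "deploys"]  # hypothetical specialist roles


def specialist(name: str, alert: dict, shared: Path) -> Path:
    """One specialist: works from its own inputs, writes a finding to the shared dir."""
    finding = shared / f"{name}.txt"
    finding.write_text(f"{name}: checked {alert['service']}, nothing conclusive\n")
    return finding


def coordinate(alert: dict, shared: Path) -> str:
    """Delegate to every specialist in parallel, then synthesize their findings."""
    with ThreadPoolExecutor(max_workers=len(SPECIALISTS)) as pool:
        list(pool.map(lambda n: specialist(n, alert, shared), SPECIALISTS))
    findings = [p.read_text().strip() for p in sorted(shared.glob("*.txt"))]
    bullets = "\n".join(f"- {f}" for f in findings)
    return f"Incident summary for {alert['service']}:\n{bullets}"
```

The shared directory is the only channel between specialists in this sketch, which mirrors the idea of separate context windows collaborating through common files.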
  11. Human-in-the-loop permissions: drafting an update and requiring approval

    Before sending an incident update externally, a permissions policy triggers an approval step. The user reviews a draft message on screen and approves it before it posts to Slack.

    • Permissions policy enforces gated actions
    • Agent produces a draft incident update
    • User reviews/approves before external posting
    • Slack posting occurs only after approval
    • Balances automation with operational safety controls
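An approval gate of this kind reduces to a check that runs before any gated action. The sketch below is a generic illustration, assuming a hypothetical `PermissionPolicy` with a set of gated action names and caller-supplied `approve` and `execute` callbacks; it does not reflect how Managed Agents expresses permission policies.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class PermissionPolicy:
    # Hypothetical action names that need a human decision before running.
    gated_actions: set[str] = field(default_factory=lambda: {"post_to_slack"})

    def requires_approval(self, action: str) -> bool:
        return action in self.gated_actions


def perform(action: str, payload: str, policy: PermissionPolicy,
            approve: Callable[[str, str], bool],
            execute: Callable[[str, str], None]) -> str:
    """Run the action only if it is ungated or a human approved the draft."""
    if policy.requires_approval(action) and not approve(action, payload):
        return "rejected"
    execute(action, payload)
    return "executed"
```

In the demo's terms, `payload` is the draft incident update, `approve` is the on-screen review step, and `execute` is the Slack post that happens only after approval.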
  12. Incident memory and pattern recognition: leveraging past context for faster diagnosis

    Memory connects incident workflows by letting the coordinator reference past incidents and detect recurring patterns. This enables the agent to start future responses with relevant context rather than re-diagnosing from scratch.

    • Coordinator checks past incidents in the memory store
    • Flags patterns matching historical issues
    • Example: DNS resolution issue linked to misconfigured TTL
    • Future alerts can start with prior context and hypotheses
    • Reduces time-to-diagnosis through continuity
  13. Wrap-up: building a managed, stateful agent experience with clear outcomes

    The closing summarizes the Managed Agents building blocks and the developer’s role in defining success criteria. The core promise is outcome-driven execution: you define what “done” means, and Claude iterates until it gets there.

    • Managed components: agents, sessions, environments, tools, MCP, memory, outcomes
    • Support for stateful experiences and multi-agent coordination
    • Developers define success criteria (“what done looks like”)
    • Claude works iteratively toward the defined outcome
    • Positioning: fully managed agent execution for production-scale workflows
