Ship your first Managed Agent

Build and ship a working incident-investigator agent on Anthropic's Managed Agents platform: define an Agent, Environment, and Session, stream events, and wire up custom tools, all in six functions. You'll leave with a running agent, the mental model for the server-side loop, and a roadmap to production features like subagents, vaults, and memory.

May 26, 2026 · 37 min

CHAPTERS

  1. Session goal and roadmap: from managed-agent concepts to a working IR agent

    Isabella Hee sets expectations for a hands-on build: understand Claude Managed Agents at a conceptual level, then implement an incident-response (SRE) agent in a workshop repo. She previews follow-on topics like “dreaming” for memory/self-improvement.

  2. Why Managed Agents: evolution from Messages API → Agent SDK → fully managed harness

    The talk traces how developers previously had to build core agent primitives themselves, then progressed to the Agent SDK, and finally to Managed Agents for production-grade scaling and reliability. The shift is driven by increasing model capability and the rising complexity of context/tool orchestration.

  3. Harnesses must evolve with models: mitigating behaviors like “context anxiety”

    Isabella explains that harness logic must change as model behaviors change. Managed Agents aims to absorb this maintenance burden so developers can focus on domain tools and task configuration.

  4. Core building blocks: Agent (brain), Environment (hands), Session (binding + streaming)

    She introduces the three main resources used to build with Managed Agents and how they map to a mental model of agent cognition vs action. Sessions connect everything and enable durability even if the client disconnects.
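
The relationship between the three resources can be sketched with plain Python data classes. This is only a mental-model illustration: the class names mirror the talk's vocabulary, but every field, the placeholder model name, and the `record`/`replay` helpers are assumptions made for this sketch and are not the Managed Agents API.

```python
from dataclasses import dataclass, field


# Hypothetical local stand-ins for illustration; NOT the platform's real types.
@dataclass(frozen=True)
class Agent:
    """The 'brain': which model to run, its instructions, and its tool names."""
    model: str
    system_prompt: str
    tools: tuple[str, ...] = ()


@dataclass(frozen=True)
class Environment:
    """The 'hands': where actions execute and which hosts they may reach."""
    allowed_hosts: tuple[str, ...] = ()


@dataclass
class Session:
    """Binds one agent to one environment and keeps an append-only event log."""
    agent: Agent
    environment: Environment
    events: list[dict] = field(default_factory=list)

    def record(self, event_type: str, **payload) -> None:
        # Events are stored with the session, not with the client.
        self.events.append({"type": event_type, **payload})

    def replay(self, after: int = 0) -> list[dict]:
        # A reconnecting client can ask for everything past the index it last saw.
        return self.events[after:]
```

Keeping the log on the session rather than on the client is what lets a run survive a disconnect in this sketch: the client reconnects and calls `replay` with the index of the last event it saw.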

  5. Key architecture choice: decoupling agent loop from tool execution

    Managed Agents separates “brains” (agent loop) from “hands” (tool execution), improving security and latency. This enables stronger sandboxing/credential separation and dramatically reduces time-to-first-token by avoiding per-session container spin-up for the loop.

  6. Workshop setup: clone repo, configure environment, run Streamlit app

    Attendees are guided through cloning a prepared repository, installing dependencies, setting API keys, and launching a Streamlit UI. The README mirrors the steps for later self-paced completion.

  7. Incident response scenario: why an SRE agent matters

    The app simulates a painful on-call incident workflow and frames an agent as the first responder that can inspect metrics/logs/deployments. The goal is to reduce human toil (and 3 a.m. wake-ups).

  8. Defining the Agent: model choice, system prompt, and tool access

    Isabella builds the agent definition by copying from a completed reference file. The system prompt is intentionally simple: define the SRE role and grant access to incident-relevant tools.

  9. Defining the Environment: networking policy and deployment flexibility (BYOC)

    She configures the environment where actions run, highlighting recent support for bring-your-own containers/compute. Networking is controlled via an allowlist, and private connectivity options (MCP tunnels) are mentioned for secure deployments.
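
As an illustration of what an allowlist-based networking policy means, here is a minimal check in Python. The function name `is_allowed` and the rule that an entry also covers its subdomains are assumptions made for this sketch; the summary does not describe how the platform actually matches hosts.

```python
from urllib.parse import urlsplit


def is_allowed(url: str, allowlist: set[str]) -> bool:
    """Return True if the URL's host is an allowlisted domain or a subdomain of one.

    Hypothetical helper: the matching rule here is an assumption, not the
    platform's documented behavior.
    """
    host = (urlsplit(url).hostname or "").lower()
    # Compare whole labels so "evil-example.com" does not match "example.com".
    return any(host == entry or host.endswith("." + entry) for entry in allowlist)
```

Matching on the parsed hostname instead of a substring of the URL avoids a common mistake, where a host such as `example.com.evil.io` slips past a naive `in` check.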

  10. Adding data/context: uploading logs via the Files API for analysis

    The agent is given incident artifacts (metrics/logs) as files so it can process them during the run. Isabella emphasizes context engineering as a major lever for agent performance and customization.

  11. Creating Sessions and streaming event-based execution (not request/response)

    Sessions bind agent + environment + mounted resources, and output is streamed as a sequence of events. This supports better UX (see tool calls as they happen) and better observability (everything logged).
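
The difference from request/response can be sketched as a client that consumes events one at a time. The event types, field names, tool name, and sample payloads below are invented for this example, and `fake_event_stream` is a local generator standing in for a real session stream.

```python
from typing import Iterable, Iterator


def fake_event_stream() -> Iterator[dict]:
    """Stand-in for a session's event stream; every payload here is made up."""
    yield {"type": "message", "text": "Checking recent deployments."}
    yield {"type": "tool_call", "name": "get_deployments", "input": {"service": "checkout"}}
    yield {"type": "tool_result", "name": "get_deployments", "output": "1 deploy in the last hour"}
    yield {"type": "message", "text": "A recent deploy lines up with the spike."}
    yield {"type": "done"}


def render(events: Iterable[dict]) -> list[str]:
    """Turn events into display lines as they arrive, stopping at 'done'."""
    lines = []
    for event in events:
        kind = event["type"]
        if kind == "message":
            lines.append(f"agent: {event['text']}")
        elif kind == "tool_call":
            lines.append(f"tool call: {event['name']}({event['input']})")
        elif kind == "tool_result":
            lines.append(f"tool result: {event['name']} -> {event['output']}")
        elif kind == "done":
            break
    return lines
```

Because each tool call arrives as its own event, a UI can show it the moment it happens, and the same sequence doubles as an audit log.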

  12. Wiring local tools and session deletion: making the agent actually act

    The agent can respond conversationally before tools are connected, but cannot perform incident investigation until tool handlers are wired. The workshop adds local tool implementations and a session-delete capability for control and security hygiene.
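
Wiring tool handlers amounts to mapping a tool name from the agent to a local function and returning its result. The sketch below assumes a hypothetical tool called `get_recent_deployments` with made-up return data, and a result shape (`is_error`, `content`) chosen for this example; none of these names come from the workshop repo.

```python
from typing import Any, Callable


def get_recent_deployments(service: str) -> list[dict]:
    """Hypothetical local tool; returns canned data for illustration only."""
    return [{"service": service, "change": "example refactor"}]


# Registry mapping tool names the agent may request to local handlers.
TOOL_HANDLERS: dict[str, Callable[..., Any]] = {
    "get_recent_deployments": get_recent_deployments,
}


def handle_tool_call(name: str, arguments: dict, handlers: dict[str, Callable[..., Any]]) -> dict:
    """Run the handler for a tool call and wrap the outcome as a result dict."""
    handler = handlers.get(name)
    if handler is None:
        # Unknown tools are reported back rather than raising, so the run can continue.
        return {"is_error": True, "content": f"unknown tool: {name}"}
    try:
        return {"is_error": False, "content": handler(**arguments)}
    except Exception as exc:
        # Bad arguments or handler failures also become error results.
        return {"is_error": True, "content": f"{type(exc).__name__}: {exc}"}
```

With an empty registry every call comes back as an error result, which matches the behavior described above: the agent can still talk, but it cannot investigate.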

  13. Live run: debugging the latency incident and returning a root cause + actions

    The completed agent runs the investigation: inspects logs, checks recent deployments, and correlates signals to identify the issue. It concludes the P99 spike is due to DB pool exhaustion introduced by a recent refactor and suggests remediation steps.

  14. Persistence, session states, and under-the-hood recap + next features

    Isabella demonstrates session durability across refreshes and explains session state transitions (idle/running/rescheduling/terminated). She closes by mapping the workshop pieces to the managed-agent architecture and previewing advanced features like subagents, memory/dreaming, outcomes, and vaults.
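
The four session states named above can be modeled as a small state machine. Only the state names come from the talk summary; the set of allowed transitions below is an assumption made for illustration and may not match the platform's actual lifecycle.

```python
from enum import Enum


class SessionState(Enum):
    IDLE = "idle"
    RUNNING = "running"
    RESCHEDULING = "rescheduling"
    TERMINATED = "terminated"


# Assumed transitions for this sketch; the real lifecycle may differ.
ALLOWED_TRANSITIONS: dict[SessionState, set[SessionState]] = {
    SessionState.IDLE: {SessionState.RUNNING, SessionState.TERMINATED},
    SessionState.RUNNING: {SessionState.IDLE, SessionState.RESCHEDULING, SessionState.TERMINATED},
    SessionState.RESCHEDULING: {SessionState.RUNNING, SessionState.TERMINATED},
    SessionState.TERMINATED: set(),  # terminal: nothing follows
}


def transition(current: SessionState, target: SessionState) -> SessionState:
    """Return the new state, or raise ValueError if the move is not allowed."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"cannot go from {current.value} to {target.value}")
    return target
```

A client that tracks state this way can decide, for example, whether to show a spinner (running), wait and retry (rescheduling), or stop polling (terminated).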
