At a glance
WHAT IT’S REALLY ABOUT
Hands-on guide to building a production-ready incident response agent fast
- The speaker contrasts the Messages API, Agent SDK, and Claude Managed Agents to explain why managed infrastructure matters for scaling, safety, and production reliability.
- Claude Managed Agents center on three primitives—agent (persona/capabilities), environment (execution “hands”), and session (binds them together)—with the agent loop running server-side for durability.
- A key architectural choice is decoupling the agent loop (“brain”) from tool execution (“hands”), which tightens security boundaries around credentials and cuts time-to-first-token latency (a claimed reduction of over 90% at P95).
- The hands-on portion walks through cloning a repo and incrementally wiring an SRE incident-response agent: defining the agent prompt/tools, configuring an environment, mounting logs/files, creating sessions, and streaming event-based outputs.
- The demo incident shows the agent using tools (metrics, deploys, diffs/logs) to diagnose a P99 latency regression and propose remediation, then highlights persistence, session state management, and observability features.
IDEAS WORTH REMEMBERING
Managed Agents remove most “agent harness” operational burden.
Instead of building compaction, caching, agent loops, scaling, and durability yourself (as with raw model access or SDK-only approaches), the managed harness provides production-ready infrastructure so you focus on domain logic, tools, and configuration.
Think in three building blocks: agent, environment, and session.
The agent defines persona/model/tools (“brain”), the environment is where actions run (“hands”), and a session binds them—plus resources like mounted logs—so the agent can act and you can stream progress back to users.
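One way to picture the three building blocks is as plain data types. The sketch below is illustrative only: the class names, fields, model identifier, image label, tool names, and mount paths are all hypothetical and do not represent the Claude Managed Agents API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Agent:
    """The "brain": persona, model, and the tools it may call."""
    name: str
    model: str  # placeholder identifier, not a real model name
    system_prompt: str
    tools: tuple[str, ...] = ()


@dataclass(frozen=True)
class Environment:
    """The "hands": where tool calls actually execute."""
    name: str
    image: str  # hypothetical container image label


@dataclass
class Session:
    """Binds one agent to one environment, plus mounted resources."""
    agent: Agent
    environment: Environment
    mounts: dict[str, str] = field(default_factory=dict)  # mount path -> source

    def mount(self, path: str, source: str) -> None:
        """Make a resource (for example a log file) visible to the agent."""
        self.mounts[path] = source


# Hypothetical SRE incident-response setup, for illustration only.
sre_agent = Agent(
    name="sre-incident-responder",
    model="example-model",
    system_prompt="Diagnose production incidents and propose remediation.",
    tools=("get_metrics", "list_deploys", "read_logs"),
)
sandbox = Environment(name="sre-sandbox", image="example-image")
session = Session(agent=sre_agent, environment=sandbox)
session.mount("/mnt/logs/app.log", "incident-logs/app.log")
```

Keeping the agent and environment immutable while the session carries the mutable state mirrors the idea that one agent definition can be reused across many sessions.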
Decoupling brain and hands improves both security and latency.
Separating the agent loop from tool execution reduces credential exposure risk (enables stronger sandboxing/vault patterns) and avoids spinning up heavy tool containers for every interaction, yielding large TTFT improvements (claimed >90% P95 reduction).
Event streams are the core UX and reliability primitive, not request/response tokens.
Sessions append structured events (user messages, tool calls, agent outputs) enabling real-time streaming to the UI, robust observability logs, and resumability even if execution containers restart.
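The append-only event idea can be sketched in a few lines. This is a minimal in-memory model under stated assumptions, not the product's implementation: `Event`, `SessionLog`, the event kinds, and the sample payloads are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Iterator


@dataclass(frozen=True)
class Event:
    seq: int  # position in the log, assigned on append
    kind: str  # e.g. "user_message", "tool_call", "agent_message"
    payload: dict[str, Any]


@dataclass
class SessionLog:
    """Append-only log: events are only ever added, never rewritten."""
    events: list[Event] = field(default_factory=list)

    def append(self, kind: str, payload: dict[str, Any]) -> Event:
        event = Event(seq=len(self.events), kind=kind, payload=payload)
        self.events.append(event)
        return event

    def stream(self, after_seq: int = -1) -> Iterator[Event]:
        """Yield events with seq greater than after_seq, so a client can resume."""
        yield from self.events[after_seq + 1:]


# Illustrative payloads only; the service name and messages are made up.
log = SessionLog()
log.append("user_message", {"text": "P99 latency is up on checkout"})
log.append("tool_call", {"tool": "get_metrics", "args": {"service": "checkout"}})
log.append("agent_message", {"text": "Latency rose after the latest deploy."})

# A client that saw event 0 before disconnecting resumes from there.
resumed = [event.kind for event in log.stream(after_seq=0)]
```

Because every event carries a sequence number, the same log serves live streaming, after-the-fact observability, and resumption after a restart.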
Context engineering is where most agent quality comes from.
The demo emphasizes uploading/mounting logs and metrics files so the agent can analyze real artifacts; in production, those same tools can be swapped from local JSON to services like Datadog using the same wire protocol.
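Swapping a local JSON source for a hosted service behind the same tool contract might look like the sketch below. All names here (`MetricsSource`, `LocalJsonMetrics`, `RemoteMetrics`, `get_metrics_tool`) and the sample data are hypothetical; the remote class is a stub and does not call any real Datadog client.

```python
import json
from typing import Protocol


class MetricsSource(Protocol):
    """Contract every metrics backend must satisfy."""
    def p99_latency_ms(self, service: str) -> list[float]: ...


class LocalJsonMetrics:
    """Reads metrics from a JSON document, as a local demo might."""
    def __init__(self, raw_json: str) -> None:
        self._data = json.loads(raw_json)

    def p99_latency_ms(self, service: str) -> list[float]:
        return self._data[service]["p99_ms"]


class RemoteMetrics:
    """Placeholder for a hosted backend; no real client is called here."""
    def p99_latency_ms(self, service: str) -> list[float]:
        raise NotImplementedError("wire up a real metrics client here")


def get_metrics_tool(source: MetricsSource, service: str) -> dict:
    """Tool handler: the returned shape is the same for any backend."""
    series = source.p99_latency_ms(service)
    return {"service": service, "p99_ms": series, "latest": series[-1]}


# Made-up sample data standing in for a mounted metrics file.
sample = '{"checkout": {"p99_ms": [120.0, 125.0, 480.0]}}'
result = get_metrics_tool(LocalJsonMetrics(sample), "checkout")
```

The agent only ever sees the dictionary returned by the tool handler, so replacing the backend does not require changing the agent's prompt or tool definition.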
WORDS WORTH SAVING
Part of the reason why we built Claude Managed Agents is because harnesses should evolve alongside your agents.
— Isabella Hee
So the takeaway there is that it's a lot of work to maintain harnesses and make sure that they actually evolve alongside your agents, which is why with Claude Managed Agents, we want to make it really easy for Claude and Anthropic to handle all the complexities that come with compaction, caching, things like context anxiety, all these various primitives that come with actually making an agent production ready and getting the most out of Claude.
— Isabella Hee
Previously, with the agent loop and tool execution in the same box, you had to spin up containers for every single session that you're spinning up an agent, which contributed to additional latency from a time to first token perspective.
— Isabella Hee
But with this now decoupled, our teams actually saw reductions in time to first token along the lines of over ninety percent reduction in TTFT for our P95 metrics on latency.
— Isabella Hee
With Claude Managed Agents, instead of just having a request response, we actually have events appended to logs.
— Isabella Hee
High quality AI-generated summary created from speaker-labeled transcript.
