CHAPTERS
Why “memory” is the next key primitive for long-running agents
Mahesh frames the talk around a missing capability in today’s agents: continuous self-learning and long-horizon context management. He positions memory as the primitive that enables agents to improve based on experience across hours- or days-long tasks.
What agents should learn: tasks, environments, and other agents’ discoveries
The talk breaks down the kinds of information agents need to retain to become more effective over time. Memory isn’t just personal notes—it’s a shared substrate for learning across a multi-agent environment.
Memory in Claude Managed Agents: public beta and early customer impact
Anthropic launched Memory in Claude Managed Agents (public beta) to provide a “frontier memory system” that works out of the box while still fitting enterprise needs. Early adopters report major quality gains and efficiency improvements.
Requirement 1 — Maximize intelligence by default (and avoid over-constraining agents)
Mahesh explains how earlier memory approaches were more constrained (notes files, strict tool schemas). The newer direction is to give models more autonomy to decide what to store and how to organize it.
File-system-based memory: memory as editable files Claude can manage
Managed Agents models memory as a file system with hierarchy and formats that Claude can update using familiar tools (bash/grep). This matches modern model strengths and keeps memory flexible and evolvable.
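As a rough illustration of the idea (not the Managed Agents implementation), memory can be pictured as a directory tree of text files that an agent writes and later searches. In the sketch below, the helper names `write_note` and `grep`, the Markdown format, and the example paths are all assumptions made for illustration; only the Python standard library is used.

```python
import tempfile
from pathlib import Path


def write_note(root: Path, rel_path: str, text: str) -> Path:
    """Create or overwrite a memory file under root (hypothetical helper)."""
    path = root / rel_path
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text, encoding="utf-8")
    return path


def grep(root: Path, needle: str) -> list[tuple[str, int, str]]:
    """Return (relative path, line number, line) for every line containing needle."""
    hits = []
    # Assumption: memory files are Markdown; the talk does not fix a format.
    for path in sorted(root.rglob("*.md")):
        lines = path.read_text(encoding="utf-8").splitlines()
        for lineno, line in enumerate(lines, start=1):
            if needle in line:
                hits.append((path.relative_to(root).as_posix(), lineno, line))
    return hits


def demo() -> list[tuple[str, int, str]]:
    """Write two notes into a temporary memory tree and search across them."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        write_note(root, "incidents/db-latency.md",
                   "# DB latency\nCheck connection pool first.\n")
        write_note(root, "runbooks/deploy.md",
                   "# Deploy\nRoll back if the connection pool saturates.\n")
        return grep(root, "connection pool")
```

Because the store is just files and directories, reorganizing memory amounts to moving or rewriting files, which is one way to read the "flexible and evolvable" point above.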
Requirement 2 — Scaling memory for multi-agent systems (permissions + concurrency)
As enterprises run hundreds or thousands of agents concurrently, memory must support safe collaboration. The design centers on scoped access and safe concurrent updates.
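The summary does not say how scoped access or safe concurrent updates are implemented. One common pattern, shown here purely as an assumption, is a per-store writer list combined with optimistic version checks, so that a stale write is rejected instead of silently overwriting another agent's update. `MemoryStore`, `PermissionDenied`, and `VersionConflict` are hypothetical names, and this single-process toy is not itself thread-safe.

```python
from dataclasses import dataclass, field


class PermissionDenied(Exception):
    """Raised when an agent without write scope tries to write."""


class VersionConflict(Exception):
    """Raised when a write is based on an outdated version of a file."""


@dataclass
class MemoryStore:
    """Hypothetical in-memory store: path -> (version, text)."""
    name: str
    writers: set[str] = field(default_factory=set)
    files: dict[str, tuple[int, str]] = field(default_factory=dict)

    def read(self, path: str) -> tuple[int, str]:
        # Any agent may read; a missing file reads as version 0 with empty text.
        return self.files.get(path, (0, ""))

    def write(self, agent: str, path: str, text: str, expected_version: int) -> int:
        if agent not in self.writers:
            raise PermissionDenied(f"{agent} has read-only access to {self.name}")
        current_version, _ = self.read(path)
        if current_version != expected_version:
            # Someone else updated the file since this agent last read it.
            raise VersionConflict(
                f"{path}: expected v{expected_version}, found v{current_version}"
            )
        self.files[path] = (current_version + 1, text)
        return current_version + 1
```

An agent that hits `VersionConflict` would re-read the file, merge its change, and retry, which keeps concurrent writers from clobbering each other without holding long-lived locks.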
Requirement 3 — Enterprise controls: auditability, attribution, and portability
Production deployments need strong governance: visibility into changes, who made them, and when. Customers also want to manage memory outside the managed environment for compliance and operations workflows.
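To make the governance requirement concrete, here is a minimal sketch of an append-only audit trail that records what changed, which agent changed it, and when, and that can be exported as JSON for use outside the managed environment. The `AuditEntry` and `AuditLog` names and the field layout are assumptions for illustration, not the product's actual schema.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class AuditEntry:
    """One recorded change to a memory file (hypothetical schema)."""
    path: str
    agent: str
    timestamp: str  # ISO 8601, UTC
    summary: str


class AuditLog:
    """Append-only list of changes with per-file history and JSON export."""

    def __init__(self) -> None:
        self._entries: list[AuditEntry] = []

    def record(self, path: str, agent: str, summary: str,
               when: Optional[datetime] = None) -> AuditEntry:
        # Default to the current UTC time when no timestamp is supplied.
        when = when or datetime.now(timezone.utc)
        entry = AuditEntry(path, agent, when.isoformat(), summary)
        self._entries.append(entry)
        return entry

    def history(self, path: str) -> list[AuditEntry]:
        """Return every recorded change to one file, oldest first."""
        return [e for e in self._entries if e.path == path]

    def export_json(self) -> str:
        """Serialize the whole log so it can be archived or reviewed elsewhere."""
        return json.dumps([asdict(e) for e in self._entries], indent=2)
```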
A layered view of “frontier memory”: storage, structure/content, and process
Mahesh outlines a framework for thinking about memory systems. Beyond storing data and structuring it, the hardest part becomes the process layer—how memory gets updated and curated over time.
Observed limitations in real deployments: silos, missed patterns, and curation overhead
As multi-agent usage grows, per-session memory updates can miss cross-session learnings and shared failure patterns. Memory stores can also become large and messy without an efficient holistic maintenance process.
Dreaming (research preview): batch synthesis of learnings into organized memory
Anthropic introduces “dreaming,” an asynchronous batch process that scans recent session transcripts to detect patterns, mistakes, and effective strategies—then proposes structured memory updates. Early testing shows substantial improvements in task outcomes.
How dreaming works operationally: triggers, review flow, and applying diffs
Dreaming can be run on a schedule, invoked via API, or triggered after tasks complete. It outputs its proposed changes as a diff against the current memory state, which can be applied immediately or reviewed with additional checks first.
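The review-then-apply flow can be sketched with plain text diffs. The example below is an assumption about the general shape of such a flow, not the dreaming API: `propose_diff` and `apply_if_approved` are hypothetical helpers, memory is modeled as a dict of path to text, and the diff is produced with Python's standard `difflib`.

```python
import difflib
from typing import Callable


def propose_diff(current: dict[str, str], proposed: dict[str, str]) -> str:
    """Render the change from current to proposed memory as a unified diff."""
    chunks: list[str] = []
    for path in sorted(set(current) | set(proposed)):
        before = current.get(path, "").splitlines(keepends=True)
        after = proposed.get(path, "").splitlines(keepends=True)
        # unified_diff yields nothing when the two versions are identical.
        chunks.extend(difflib.unified_diff(
            before, after, fromfile=f"a/{path}", tofile=f"b/{path}"))
    return "".join(chunks)


def apply_if_approved(current: dict[str, str], proposed: dict[str, str],
                      approve: Callable[[str], bool]) -> dict[str, str]:
    """Return the proposed memory only if there is a change and the reviewer accepts it."""
    diff = propose_diff(current, proposed)
    if diff and approve(diff):
        return dict(proposed)
    return dict(current)
```

Passing `approve=lambda diff: True` corresponds to applying the update immediately; a stricter callback could route the diff to a human reviewer or to automated checks before anything is written.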
Why dreaming matters: out-of-band, multi-agent perspective, and no hot-path latency
Dreaming is designed to keep task execution focused while separately optimizing memory quality. Its cross-agent vantage point makes it better at finding shared patterns than any single agent working in isolation.
Scaling to enterprise knowledge bases: using extra compute to curate memory
As memory stores become enterprise-scale, dreaming acts like an indexing/curation layer that amortizes effort across many downstream agents. The analogy is to test-time compute and search indexing: spend compute upfront for faster, better retrieval later.
Demo walkthrough: SRE agents, scoped memory stores, auditing, and dreaming-generated improvements
A demo shows an SRE alert workflow where agents share read-only org runbooks and a read-write SRE memory store. The system demonstrates faster incident response via shared notes, enterprise audit controls, and dreaming’s ability to deduplicate, remove stale info, and add validated insights from cross-session analysis.