Memory and dreaming for self-learning agents

How memory and dreaming turn Claude Managed Agents into self-learning systems. This session walks through design considerations for memory architectures and how dreaming verifies and enriches memory between sessions.

May 21, 2026 · 21m

WHAT IT’S REALLY ABOUT

Anthropic’s memory and dreaming enable self-improving, multi-agent systems at scale

Memory is introduced as a way for agents to improve from task to task by retaining and reusing learnings, rather than starting each session from a blank slate.
Anthropic’s Memory is designed as a file-system-like store because models are already strong at navigating, editing, and organizing files with familiar tools such as Bash and Grep.
Multi-agent requirements drive features like shared stores, read-only vs read-write scopes, optimistic concurrency control to prevent clobbered writes, and versioned audit trails with attribution.
A new research-preview feature called Dreaming runs out-of-band batch jobs over multiple session transcripts to detect recurring mistakes/inefficiencies and propose curated, better-organized memory updates.
A demo SRE alert-triage system shows cross-session coordination (e.g., knowing a fix is in flight) and Dreaming-derived insights (e.g., recurring alert timing after CPU spikes) that improve future agent responses.
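
The Dreaming idea above, finding patterns that recur across several session transcripts and proposing them as memory updates, can be illustrated with a toy batch job. This is a minimal sketch and not Anthropic's implementation: the function name `propose_memory_updates`, the transcript format (a dict of session ID to short notes), and the sample notes are all hypothetical. The real feature presumably uses a model to read transcripts rather than exact string matching.

```python
from collections import Counter

def propose_memory_updates(transcripts, min_sessions=2):
    """Return notes that appear in at least `min_sessions` distinct sessions.

    `transcripts` maps a session ID to a list of short notes (hypothetical format).
    """
    counts = Counter()
    for notes in transcripts.values():
        # Count each note once per session so one noisy session cannot dominate.
        counts.update(set(notes))
    return sorted(note for note, n in counts.items() if n >= min_sessions)

# Made-up sample data, for illustration only.
transcripts = {
    "session-1": ["alert follows CPU spike", "restarted worker"],
    "session-2": ["alert follows CPU spike"],
    "session-3": ["checked disk usage"],
}
proposals = propose_memory_updates(transcripts)
```

Here only the note seen in two separate sessions is proposed, which mirrors the point that a pattern invisible within any single session can surface when sessions are reviewed together.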

IDEAS WORTH REMEMBERING

5 ideas

Memory shifts agent performance from “reset each task” to “learn each task.”

Persisting structured knowledge lets agents carry forward strategies, tool usage patterns, and prior mistakes, improving outcomes over successive sessions and environments.

Model-friendly formats matter; file-system memory leverages existing strengths.

Because Claude is strong at browsing and editing files, representing memory as files reduces bespoke tooling and lets the model autonomously decide what to save and how to structure it.
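
To make the file-system framing concrete, here is a minimal sketch of memory stored as plain files and searched with a grep-like scan. It uses only Python's standard library; the helper names `save_note` and `grep`, the directory layout, and the note contents are assumptions for illustration, not the actual Memory interface.

```python
import tempfile
from pathlib import Path

def save_note(root: Path, relative_path: str, text: str) -> Path:
    """Write a memory note, creating parent directories as needed."""
    path = root / relative_path
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text, encoding="utf-8")
    return path

def grep(root: Path, needle: str) -> list[str]:
    """Case-insensitive search over every .md file under root."""
    hits = []
    for path in sorted(root.rglob("*.md")):
        for line in path.read_text(encoding="utf-8").splitlines():
            if needle.lower() in line.lower():
                hits.append(f"{path.relative_to(root).as_posix()}: {line}")
    return hits

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Hypothetical notes an agent might choose to keep.
    save_note(root, "runbooks/cpu-alerts.md", "Check recent deploys before paging.\n")
    save_note(root, "lessons/tools.md", "Prefer grep over reading whole logs.\n")
    results = grep(root, "grep")
```

Because the store is just directories and text files, the agent can reorganize it with the same tools it already uses for code, which is the design point the talk makes.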

Scoping and sharing are foundational for organizational multi-agent systems.

Read-only org-wide memory can disseminate stable guidance (runbooks, policies), while smaller read/write stores enable task-specific learning without polluting shared knowledge.
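
The split between read-only and read/write scopes can be sketched as a thin wrapper that enforces permissions per mount. Everything here (`MemoryStore`, `ScopedMount`, the store names and file paths) is hypothetical and only illustrates the idea; the talk does not describe the real configuration surface in this summary.

```python
class MemoryStore:
    """A named collection of memory files (path -> text)."""
    def __init__(self, name):
        self.name = name
        self.files = {}

class ScopedMount:
    """One agent's view of a store, either read-only or read/write."""
    def __init__(self, store, writable):
        self.store = store
        self.writable = writable

    def read(self, path):
        return self.store.files[path]

    def write(self, path, text):
        if not self.writable:
            raise PermissionError(f"{self.store.name} is mounted read-only")
        self.store.files[path] = text

# Org-wide guidance is shared read-only; the task store accepts new learnings.
org = MemoryStore("org-runbooks")
org.files["runbooks/paging.md"] = "Page the on-call only for customer impact."
task = MemoryStore("alert-triage")

org_mount = ScopedMount(org, writable=False)
task_mount = ScopedMount(task, writable=True)

task_mount.write("lessons.md", "Fix for the CPU alert is already in flight.")
try:
    org_mount.write("runbooks/paging.md", "overwritten")
    blocked = False
except PermissionError:
    blocked = True
```

The agent can still read the org runbook, but its task-specific notes land only in the smaller store, so shared guidance stays stable.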

Concurrency control is essential once multiple agents write to shared memory.

Optimistic concurrency prevents one session from overwriting another’s updates, enabling simultaneous collaboration without corrupting or losing important learnings.
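
Optimistic concurrency control is commonly implemented as a version check on write: a writer states which version it read, and the write is rejected if the file has changed since. The sketch below shows that general technique with hypothetical names (`VersionedFile`, `VersionConflict`); it is single-threaded for clarity, and a real store would need the check and the write to happen atomically.

```python
class VersionConflict(Exception):
    """Raised when a write is based on a stale version."""

class VersionedFile:
    def __init__(self, text=""):
        self.text = text
        self.version = 0

    def read(self):
        return self.text, self.version

    def write(self, new_text, expected_version):
        # Reject the write if someone else updated the file after our read.
        if expected_version != self.version:
            raise VersionConflict(
                f"expected v{expected_version}, store is at v{self.version}"
            )
        self.text = new_text
        self.version += 1
        return self.version

doc = VersionedFile("Known issue: CPU alert.")
text_a, ver_a = doc.read()  # session A reads v0
text_b, ver_b = doc.read()  # session B reads v0 as well

doc.write(text_a + " Fix in flight.", ver_a)  # A wins, store moves to v1
try:
    doc.write(text_b + " Seen after deploys.", ver_b)  # B is stale
    conflict = False
except VersionConflict:
    conflict = True
    # B re-reads the latest text and reapplies its change on top of A's.
    latest, ver = doc.read()
    doc.write(latest + " Seen after deploys.", ver)
```

Session B's first write fails instead of silently erasing A's update; after re-reading, both learnings survive.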

Enterprise observability turns memory into a governable system, not a black box.

Version history, diffs, and attribution create an audit trail of how memory changes over time and which agent/session made each change, supporting debugging and compliance.
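
A versioned audit trail with attribution can be sketched as an append-only list of revisions, each recording who wrote it, plus a diff between any two versions. The class names, session IDs, and note text below are hypothetical; `difflib.unified_diff` is from Python's standard library.

```python
import difflib
from dataclasses import dataclass

@dataclass(frozen=True)
class Revision:
    version: int
    author: str  # which agent or session made the change
    text: str

class AuditedFile:
    """Append-only history: earlier revisions are never modified."""
    def __init__(self):
        self.history = []

    def write(self, author, text):
        rev = Revision(len(self.history) + 1, author, text)
        self.history.append(rev)
        return rev

    def diff(self, old, new):
        a, b = self.history[old - 1], self.history[new - 1]
        return list(difflib.unified_diff(
            a.text.splitlines(), b.text.splitlines(),
            fromfile=f"v{old} ({a.author})",
            tofile=f"v{new} ({b.author})",
            lineterm="",
        ))

note = AuditedFile()
note.write("session-a", "Check recent deploys.")
note.write("session-b", "Check recent deploys.\nRollback already in flight.")
changes = note.diff(1, 2)
```

Printing `changes` line by line gives a standard unified diff whose headers name the session behind each version, which is the kind of record that supports debugging and compliance review.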

WORDS WORTH SAVING

5 quotes

We now have the building blocks for agents to learn over time and improve from one task to the next.

— Ravi

Memory lets agents learn. It lets agents carry forward learnings from their previous tasks.

— Ravi

So with Memory, we've modeled it as a file system to Claude. Again, the key principle is getting out of Claude's way and letting it use the capabilities it already has that it's very strong at. Or as we like to say, let it cook.

— Ravi

And the general theme was memory was being updated in a locally optimal way, but it wasn't globally optimal.

— Ravi

Dreaming truly enables continuous self-learning. It closes the loop on memory.

— Ravi

Long-horizon agent tasks and context management
File-system-based memory representation
Shared memory stores and permission scopes
Optimistic concurrency control for multi-writer memory
Versioning, diffs, attribution, and audit logs
Standalone Memory API (CRUD, export, redaction)
Dreaming as out-of-band memory curation across sessions

High-quality AI-generated summary created from a speaker-labeled transcript.
