Using LongMemEval to Improve Agent Memory

EPISODE INFO

Released: August 25, 2025
Duration: 13m
Channel: YC Root Access

EPISODE DESCRIPTION

Sam Bhagwat, co-founder of Mastra and author of Principles of Building AI Agents, shares how the Mastra team has been pushing the limits of agent memory. He explains the LongMemEval benchmark, breaks down why memory matters for reasoning across long conversations, and shows how simple changes (tailored templates, targeted updates, and better data structures) led to state-of-the-art results.

Chapters:
00:12 - Overview of the LongMemEval Benchmark
01:15 - Understanding Memory in AI Agents
01:59 - Information Extraction in Memory
02:30 - Multi-Session Reasoning
03:27 - Temporal Reasoning
04:10 - Knowledge Updates in Memory
05:13 - Handling Missing Information
05:46 - Types of Memory in Mastra Agents
05:58 - Semantic Recall Explained
06:51 - Working Memory and Templates
07:12 - Initial Benchmark Results
07:43 - Improving Memory Implementation
11:08 - Configuration Matters
12:14 - Future Steps and Conclusion

SPEAKERS

Sam Bhagwat
host
Co-founder and CEO of Mastra (a TypeScript agent framework) and author of “Principles of Building AI Agents.”

EPISODE SUMMARY

In this episode of YC Root Access, Sam Bhagwat explores benchmark-driven improvements to AI agent memory using the LongMemEval benchmark. LongMemEval evaluates agent memory across five subskills: information extraction, multi-session reasoning, temporal reasoning, knowledge updates, and handling missing information.
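
To make the five-subskill breakdown concrete, here is a minimal TypeScript sketch of how graded benchmark answers could be rolled up into one accuracy score per subskill. Everything in it is an assumption for illustration: the subskill labels, the GradedQuestion shape, and the accuracyBySubskill helper are hypothetical names, not identifiers from LongMemEval or Mastra, and the sample data is invented rather than taken from the episode.

```typescript
// Hypothetical labels for the five subskills named in the summary;
// the benchmark's own dataset may use different identifiers.
type Subskill =
  | "information-extraction"
  | "multi-session-reasoning"
  | "temporal-reasoning"
  | "knowledge-updates"
  | "missing-information";

// One graded question: the subskill it probes and whether the agent got it right.
interface GradedQuestion {
  subskill: Subskill;
  correct: boolean;
}

// Returns accuracy in the range 0..1 for each subskill that has at least one question.
function accuracyBySubskill(results: GradedQuestion[]): Map<Subskill, number> {
  const totals = new Map<Subskill, { correct: number; total: number }>();
  for (const r of results) {
    const t = totals.get(r.subskill) ?? { correct: 0, total: 0 };
    t.total += 1;
    if (r.correct) t.correct += 1;
    totals.set(r.subskill, t);
  }
  const accuracy = new Map<Subskill, number>();
  totals.forEach((t, skill) => accuracy.set(skill, t.correct / t.total));
  return accuracy;
}

// Illustrative data only, not results from the episode.
const sample: GradedQuestion[] = [
  { subskill: "temporal-reasoning", correct: true },
  { subskill: "temporal-reasoning", correct: false },
  { subskill: "knowledge-updates", correct: true },
];

const scores = accuracyBySubskill(sample);
console.log(scores.get("temporal-reasoning")); // 0.5
console.log(scores.get("knowledge-updates")); // 1
```

Reporting a score per subskill rather than a single aggregate makes it easier to see which kind of memory failure a given configuration change actually addresses.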
