YC Root Access

This Startup Is Trying To Solve The AI Memory Problem

While LLMs continue to evolve, they still struggle with memory. The startup Mem0 is working to change that by building the memory layer for AI agents. In this episode of Founder Firesides, YC’s Nicolas Dessaigne sat down with co-founders Taranjeet Singh and Deshraj Yadav to discuss why agents need persistent memory to improve over time, how Mem0 reduces cost and latency compared to native context stuffing, and why memory must remain neutral across models as AI becomes more agent-driven.

Chapters:
00:05 What Is Mem0?
00:49 Traction & Open Source Adoption
01:24 Why Memory Improves AI Agents
02:01 Saving Cost and Latency
02:31 Founder Origins & YC Pivot
05:13 How Mem0 Works Under the Hood
06:04 Hybrid Memory Architecture
07:10 Custom Memory Rules & Expectations
08:00 Real-World Use Cases
10:05 Competing With Model-Native Memory
11:48 Fundraising & What’s Next

Nicolas Dessaigne (host), Deshraj Yadav (guest), Taranjeet Singh (guest)
Jan 22, 2026 · 17m · Watch on YouTube ↗

At a glance

WHAT IT’S REALLY ABOUT

Mem0 builds a neutral, portable memory layer for AI agents and apps

  1. Mem0 addresses the core limitation that LLMs are stateless by adding a dedicated memory layer for AI agents and applications.
  2. The company has strong open-source traction, citing 14M+ Python downloads, 41k GitHub stars, and adoption across major agent frameworks.
  3. Mem0 claims it reduces cost and latency by storing and retrieving only the most relevant memories rather than stuffing full history into the context window.
  4. Under the hood, Mem0 uses a hybrid architecture (key-value, semantic chunks, and graph relationships) to retrieve accurate memories in real time.
  5. The founders argue memory should be decoupled from any single model provider, positioning Mem0 as neutral infrastructure even as labs add model-native memory.

IDEAS WORTH REMEMBERING

5 ideas

Memory is becoming a default primitive for AI agents.

The founders frame memory as essential for agents to improve over time, much as apps require storage and identity; without it, agents start from scratch on every interaction.

“Stuff everything into the context window” is the expensive anti-pattern Mem0 replaces.

Instead of sending full conversation histories as tokens (higher cost and latency), Mem0 retrieves the most relevant facts/preferences so prompts stay small and targeted.
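The retrieval pattern described above can be sketched in a few lines. This is a toy illustration, not Mem0's actual API: the crude word-overlap score stands in for the embedding-based relevance a real system would use, and `build_prompt` is a hypothetical helper.

```python
# Toy sketch of relevant-memory retrieval (not Mem0's real API): instead of
# sending the entire conversation history, score stored memories against the
# new query and include only the top-k most relevant ones in the prompt.

def score(memory: str, query: str) -> float:
    """Crude relevance score via word overlap; a real system would use embeddings."""
    m, q = set(memory.lower().split()), set(query.lower().split())
    return len(m & q) / (len(q) or 1)

def build_prompt(memories: list[str], query: str, k: int = 2) -> str:
    """Keep the prompt small: only the k most relevant memories go in."""
    relevant = sorted(memories, key=lambda m: score(m, query), reverse=True)[:k]
    context = "\n".join(f"- {m}" for m in relevant)
    return f"Relevant memories:\n{context}\n\nUser: {query}"

memories = [
    "User prefers window seats on flights",
    "User is allergic to peanuts",
    "User's favorite editor is Vim",
]
print(build_prompt(memories, "Book me a flight to Tokyo, window seats please"))
```

Because only a handful of short memories reach the model instead of the full transcript, the token count per request stays roughly constant as history grows, which is where the cost and latency savings come from.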

Accuracy and speed require more than vector search alone.

Mem0’s hybrid approach classifies incoming information into key-value facts, semantic chunks, and graph relations, then pulls from all three to balance precision, recall, and low-latency retrieval.
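That three-store design can be sketched as follows. This is an assumed structure inferred from the description, not Mem0's internals; the substring matching merely stands in for the real lookup, embedding, and graph-traversal machinery.

```python
# Illustrative sketch of a hybrid memory store (assumed design, not Mem0's
# internals): route each incoming item into one of three stores, then answer
# queries by consulting all three and merging the hits.

from dataclasses import dataclass, field

@dataclass
class HybridMemory:
    kv: dict[str, str] = field(default_factory=dict)                  # exact key-value facts
    chunks: list[str] = field(default_factory=list)                   # semantic text chunks
    graph: list[tuple[str, str, str]] = field(default_factory=list)   # (subject, relation, object)

    def add_fact(self, key: str, value: str) -> None:
        self.kv[key] = value

    def add_chunk(self, text: str) -> None:
        self.chunks.append(text)

    def add_relation(self, subj: str, rel: str, obj: str) -> None:
        self.graph.append((subj, rel, obj))

    def retrieve(self, query: str) -> list[str]:
        """Merge hits from all three stores; real retrieval would rank them."""
        q = query.lower()
        hits = [f"{k} = {v}" for k, v in self.kv.items() if k.lower() in q]
        hits += [c for c in self.chunks if any(w in c.lower() for w in q.split())]
        hits += [f"{s} -[{r}]-> {o}" for s, r, o in self.graph if s.lower() in q]
        return hits

mem = HybridMemory()
mem.add_fact("home_city", "Paris")
mem.add_chunk("The user mentioned enjoying long train journeys.")
mem.add_relation("user", "works_at", "Acme Corp")
print(mem.retrieve("home_city of the user"))
```

The point of combining stores is that each covers a different failure mode: key-value lookup is precise but brittle, semantic search has high recall but fuzzy precision, and graph relations capture multi-entity facts neither of the others represents well.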

Memory is an “expectation problem,” so customization is critical.

Different apps and users define “important” differently; Mem0 supports plain-language rules to include/exclude categories of memories and align behavior to the product’s intent.
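A minimal sketch of what such include/exclude rules might look like at the filtering step. The interface is hypothetical; in practice plain-language rules would be interpreted by an LLM classifying each candidate memory, not by the hard-coded categories shown here.

```python
# Hypothetical sketch of per-app memory rules: each candidate memory arrives
# pre-classified, and the app's include/exclude lists decide what gets stored.
# (A real system would have an LLM interpret plain-language rules instead.)

INCLUDE = {"preference", "allergy"}   # categories this app considers important
EXCLUDE = {"small_talk"}              # categories it never wants to keep

def should_store(candidate: dict) -> bool:
    """Apply the app's include/exclude rules to a candidate memory."""
    cat = candidate["category"]
    if cat in EXCLUDE:
        return False
    return cat in INCLUDE

candidates = [
    {"text": "User is vegetarian", "category": "preference"},
    {"text": "User said 'nice weather today'", "category": "small_talk"},
]
stored = [c["text"] for c in candidates if should_store(c)]
print(stored)  # only the preference survives the filter
```

Making the rules explicit per app is what resolves the "expectation problem": a health app and a travel app can disagree about what counts as important without either being wrong.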

Forgetting is a feature, not a bug.

They implement multiple decay modes—hard cutoff, exponential weighting toward recency, and domain-specific retention (e.g., travel preferences may stay relevant indefinitely).
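The three decay modes can be written as simple weighting functions. The formulas and parameters below are illustrative assumptions, not Mem0's actual values: each returns a relevance weight in [0, 1] for a memory of a given age.

```python
# Sketch of the three decay modes described (illustrative formulas, assumed
# parameters): hard cutoff, exponential recency weighting, and per-domain
# retention overrides.

def hard_cutoff(age_days: float, max_age: float = 30) -> float:
    """Memory counts fully until max_age, then is dropped entirely."""
    return 1.0 if age_days <= max_age else 0.0

def exponential(age_days: float, half_life: float = 14) -> float:
    """Weight halves every half_life days, favoring recent memories."""
    return 0.5 ** (age_days / half_life)

def domain_weight(age_days: float, domain: str) -> float:
    """Domain-specific retention: some categories never go stale."""
    if domain == "travel_preference":   # e.g. seat preferences stay relevant
        return 1.0
    return exponential(age_days)

print(hard_cutoff(45))                          # past the cutoff, weight drops to 0
print(exponential(14))                          # exactly one half-life old
print(domain_weight(365, "travel_preference"))  # pinned domain never decays
```

These weights would typically multiply the retrieval relevance score, so stale memories fade out of prompts gradually (or abruptly, under the hard cutoff) rather than needing explicit deletion.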

WORDS WORTH SAVING

5 quotes

LLMs are stateless. They don't remember things like human remembers.

Taranjeet Singh

The most naive way… is by passing everything into the context window.

Taranjeet Singh

Memory is an expectation problem.

Taranjeet Singh

You would not want to tie your memory to any… model provider out there.

Taranjeet Singh

Make it work, make it neutral, and make it portable.

Taranjeet Singh

LLM statelessness and agent memory · Open-source distribution and adoption metrics · Prompt/context optimization for cost and latency · Hybrid memory store: key-value, semantic, graph · Customization via natural-language rules · Memory decay strategies (hard, exponential, domain-specific) · Neutral, portable memory vs model-provider lock-in

High-quality AI-generated summary created from a speaker-labeled transcript.
