Zep and Mem0 are the two leading frameworks for giving stateless LLMs persistent memory, but they take fundamentally different architectural approaches: Mem0 is vector-first with optional graph enhancement, while Zep is built on Graphiti, a temporal knowledge graph that models when facts were true, not just that they were true. On LongMemEval, Zep scores 63.8% vs Mem0’s 49.0% on GPT-4o; on LOCOMO, the two vendors dispute each other’s methodology, and the benchmark war remains unresolved. This guide breaks down what each tool does, how their architectures differ, what the benchmark numbers actually mean, when each fits, and where both hit an architectural ceiling for enterprise data agents.
| Dimension | Zep | Mem0 |
|---|---|---|
| What it is | Context engineering platform powered by Graphiti temporal knowledge graph | Memory layer with vector-first storage and optional graph enhancement (Mem0g) |
| Core storage | Bi-temporal knowledge graph (valid time + transaction time) | Vector embeddings (base) + directed labeled graph (Mem0g) |
| Who owns it | Zep AI (startup, active commercial SaaS + Graphiti OSS) | Mem0 (YC + Basis Set; Apache 2.0 open source) |
| Key strength | Temporal reasoning: tracks when facts changed, not just what they are | Breadth + ecosystem: AWS Strands, CrewAI, Flowise, 41K+ GitHub stars (at Series A) |
| Best for | Agents requiring causal/temporal reasoning; enterprise graph relationships | Developers who need a functional memory layer quickly; consumer + B2B copilots |
| LongMemEval score | 63.8% (GPT-4o) | 49.0% (GPT-4o) |
| Self-hosting | Community Edition deprecated April 2025; requires Graphiti + graph DB | Full stack self-hostable; Apache 2.0; Docker-ready |
| Pricing | Credit-based; Graphiti graph at all tiers (~$25/mo Flex) | Free to $19/mo Starter; graph locked to Pro ($249/mo, 13x jump) |
What’s the difference between Zep and Mem0?
Zep and Mem0 both solve the stateless LLM problem: they give agents a way to remember past interactions. But their architectures diverge at the storage layer. Mem0 extracts salient facts into vector embeddings and optionally a directed graph. Zep stores everything in Graphiti, a temporal knowledge graph that tracks when facts were true, enabling queries like “what did the user prefer last month?”
The core architectural distinction
Both tools extract facts from conversations and return relevant context at query time. That’s the shared job. Where they diverge:
- Mem0: extraction-first pipeline; an LLM identifies salient facts, stores them in a vector DB, and optionally promotes them to Mem0g’s directed graph; retrieval is by semantic similarity
- Zep: graph-first pipeline; every conversation episode becomes a graph update; Graphiti models entities, relationships, and validity windows; retrieval combines semantic embeddings, BM25 keyword search, and graph traversal
A concrete example: a user changes their shipping address. With Mem0’s base configuration (absent an explicit contradiction signal in the update phase), the old address may be retrieved if it is semantically closer to the query than the updated fact. With Zep/Graphiti, the old address is marked invalid with a timestamp, and only the current address surfaces on subsequent queries. (Sources: arXiv 2501.13956; arXiv 2504.19413)
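To make the contrast concrete, here is a minimal Python sketch of the two retrieval behaviors. It is a toy illustration, not either vendor’s API: the `Fact` shape, the similarity scores, and the addresses are all invented for the example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class Fact:
    text: str
    score: float                           # stand-in for semantic similarity to the query
    invalid_at: Optional[datetime] = None  # set when a newer fact supersedes this one

facts = [
    Fact("ships to 12 Oak St", score=0.91),  # old address; happens to match the query better
    Fact("ships to 9 Elm Ave", score=0.87),  # current address
]

# Vector-first retrieval: highest similarity wins; staleness is invisible
def retrieve_by_similarity(facts):
    return max(facts, key=lambda f: f.score)

# Temporal-graph retrieval: a supersession first closes the old fact's validity window
def supersede(old_fact: Fact, now: datetime) -> None:
    old_fact.invalid_at = now

def retrieve_current(facts, now: datetime):
    live = [f for f in facts if f.invalid_at is None or f.invalid_at > now]
    return max(live, key=lambda f: f.score)

now = datetime(2026, 1, 15)
print(retrieve_by_similarity(facts).text)  # -> ships to 12 Oak St (stale fact surfaces)
supersede(facts[0], now)
print(retrieve_current(facts, now).text)   # -> ships to 9 Elm Ave (only current state)
```

The point is structural: with similarity-only retrieval, nothing in the store records that the old address stopped being true, so a higher score is enough to surface it.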
Why the distinction matters now
Gartner predicts that 40% of enterprise applications will feature task-specific AI agents by 2026, up from less than 5% in 2025. More agents means a growing requirement for memory that reasons about change over time, not just static facts. Zep’s v3 rebrand to “context engineering platform”, citing Andrej Karpathy and Shopify’s Tobias Lutke as endorsers, signals that the market is maturing past simple vector retrieval.
Why confusion persists
Both tools call themselves memory layers, and both now support graph-based retrieval in 2026. The benchmark dispute makes objective comparison harder: Zep originally claimed 84% on LOCOMO; Mem0 corrected this to 58.44% (alleging adversarial category inclusion errors); Zep counter-claimed 75.14%. Neither vendor’s benchmarks measure enterprise data context (governed definitions, lineage, access policy), which is where the real gap for enterprise requirements emerges. (Source: GitHub: getzep/zep-papers/issues/5)
What is Zep?
Zep is a memory and context engineering platform for AI agents, built on Graphiti, an open-source temporal knowledge graph engine. Unlike vector-first memory systems, Zep models facts with validity windows: when a fact was true, and when it was recorded. This enables agents to reason accurately about change over time. Zep’s DMR benchmark score is 94.8%; the Graphiti engine has 24K+ GitHub stars.
Core definition and purpose
Zep started as a stateless LLM memory service. With v3, it repositioned as a “context engineering platform”. Its core engine is Graphiti (Apache 2.0), a real-time temporal knowledge graph for AI agents built on three subgraph layers:
- Episodic layer: raw conversation sessions stored as episodes; every episode drives graph updates without overwriting history
- Semantic layer: extracted entities (people, organizations, preferences, facts) stored with typed edges and validity windows (9 node types, 8 relationship types in v3)
- Community layer: higher-order clusters aggregating related entities; reduces retrieval noise for long-running agent sessions
Bi-temporal modeling gives every fact two timestamps: valid time (when it was true in the world) and transaction time (when Graphiti ingested it). This supports queries like “What was the contract status in March?”, a capability that pure vector retrieval cannot provide. (Source: arXiv 2501.13956)
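A bi-temporal lookup can be sketched in a few lines of Python. This is an illustrative model, not Graphiti’s implementation; the `BiTemporalFact` shape and the contract history are invented for the example.

```python
from datetime import date
from typing import Optional

class BiTemporalFact:
    """Toy bi-temporal fact: valid time (true in the world) vs transaction time (when ingested)."""
    def __init__(self, subject: str, value: str, valid_from: date,
                 recorded_at: date, valid_to: Optional[date] = None):
        self.subject = subject
        self.value = value
        self.valid_from = valid_from    # valid time start
        self.valid_to = valid_to        # valid time end (None = still true)
        self.recorded_at = recorded_at  # transaction time

history = [
    BiTemporalFact("contract", "under_review", date(2026, 2, 1), date(2026, 2, 1),
                   valid_to=date(2026, 4, 10)),
    BiTemporalFact("contract", "signed", date(2026, 4, 10), date(2026, 4, 11)),
]

def as_of(history, subject: str, when: date) -> Optional[str]:
    """Answer 'what was true about <subject> on <when>?' using valid time."""
    for f in history:
        if (f.subject == subject and f.valid_from <= when
                and (f.valid_to is None or when < f.valid_to)):
            return f.value
    return None

print(as_of(history, "contract", date(2026, 3, 15)))  # -> under_review
print(as_of(history, "contract", date(2026, 5, 1)))   # -> signed
```

Because validity windows are first-class, “what was the contract status in March?” is an ordinary query rather than a prompt-engineering workaround.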
Why Graphiti matters now
Graphiti MCP Server v1.0 shipped November 2025, compatible with Claude Desktop, Cursor, and any MCP client, reaching hundreds of thousands of weekly MCP users. Zep scaled 30x in two weeks in late 2025; infrastructure optimizations brought P95 graph search from 600ms to 150ms. The vector database vs knowledge graph distinction is central to Zep’s competitive advantage here.
Maturity and evolution
Zep Community Edition was deprecated in April 2025, with additional feature retirements in February 2026. Self-hosting now requires Graphiti plus a compatible graph database (Neo4j, FalkorDB, or Kuzu), meaning at minimum three systems to provision. A Docker image staleness issue (v0.10 vs current v0.22, reported six months post-CE deprecation) signals an operational gap for smaller teams. Zep’s positioning shift from “memory” to “context engineering” is genuine, but important to parse: Graphiti is still a conversation graph, not a governed enterprise context layer.
Core components of Zep
- Graphiti temporal knowledge graph: Open-source engine (Apache 2.0) modeling entities, relationships, and validity windows with bi-temporal timestamps
- Episodic memory layer: Stores raw conversation sessions as episodes; every episode drives graph updates without overwriting history
- Semantic entity layer: Extracted entities with typed edges and validity windows (9 node types, 8 relationship types in v3)
- Community summaries: Higher-order clusters aggregating related entities; reduces retrieval noise for long sessions
- Hybrid retrieval: Combines semantic embedding search, BM25 keyword search, and graph traversal rather than vector-only retrieval
- MCP Server: Native integration with Claude Desktop, Cursor, and any MCP-compatible client (v1.0, November 2025)
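The hybrid retrieval idea, fusing several ranked result lists into one, can be sketched with reciprocal rank fusion (RRF), a common technique for this kind of combination. Zep’s actual scoring internals are not public; the ranked lists and the use of RRF here are illustrative assumptions.

```python
# Fuse semantic, BM25, and graph-traversal rankings with reciprocal rank fusion.
# A result appearing near the top of several lists accumulates the highest score.
def rrf(rankings, k=60):
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

semantic = ["fact_a", "fact_b", "fact_c"]   # embedding similarity order (invented)
bm25     = ["fact_b", "fact_a", "fact_d"]   # keyword match order (invented)
graph    = ["fact_b", "fact_c", "fact_a"]   # graph-traversal proximity order (invented)

print(rrf([semantic, bm25, graph]))  # fact_b ranks first: strong in all three lists
```

The design point: fusion rewards agreement across retrieval modes, so a fact that is merely semantically close (but absent from keyword and graph results) is demoted.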
What is Mem0?
Mem0 is an open-source memory layer for AI agents that extracts salient facts from conversations, stores them as vector embeddings, and retrieves relevant context at inference time. Its graph variant (Mem0g) adds a directed labeled graph for relationship modeling. On LOCOMO, Mem0 scores 66.9% vs OpenAI memory’s 52.9%, a 26% relative improvement. It has 41,000+ GitHub stars (at Series A in October 2025) and a $24M Series A.
Core definition and purpose
Mem0 addresses the fixed context window problem: instead of replaying full conversation history, it extracts key facts (preferences, entities, decisions) and retrieves the relevant subset at inference time. Two variants exist:
- Base Mem0: vector + LLM extraction pipeline
- Mem0g: directed labeled graph (entities as nodes, typed relationships as edges) for relationship-aware retrieval
The incremental processing pipeline runs an extraction phase (an LLM identifies salient facts from new conversation turns) followed by an update phase (consolidates with existing memory, resolves contradictions). The result: 90%+ token savings vs a full-context baseline and a 91% reduction in p95 latency; practical production benefits that are real and documented. This makes it a strong choice alongside the broader ecosystem of AI agent memory frameworks.
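A crude sketch of that two-phase shape, with the LLM extractor stubbed out and contradiction resolution reduced to key-based replacement (both are stand-ins; Mem0’s real pipeline is considerably more sophisticated):

```python
# Two-phase memory pipeline sketch. In Mem0 an LLM performs extraction and the
# update logic lives inside the library; this stub only illustrates the shape.
def extract_facts(message: str) -> list[str]:
    # Stand-in for the LLM extraction phase: split on sentences
    return [part.strip() for part in message.split(".") if part.strip()]

def update_memory(memory: dict, facts: list[str]) -> dict:
    # Update phase: a new fact about the same subject replaces (rather than
    # duplicates) the existing one -- a crude stand-in for contradiction resolution
    for fact in facts:
        key = fact.split()[0]  # naive subject key, e.g. "address"
        memory[key] = fact
    return memory

memory = {}
update_memory(memory, extract_facts("address is 12 Oak St. plan is Pro"))
update_memory(memory, extract_facts("address is 9 Elm Ave"))
print(memory["address"])  # -> address is 9 Elm Ave (old value replaced, not duplicated)
```

The token savings come from this shape: only the consolidated fact set, not the full transcript, is replayed into the context window.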
Why Mem0 matters now: ecosystem momentum
AWS selected Mem0 as the exclusive memory provider for the Strands Agent SDK in May 2025, the most significant enterprise validation signal in the memory layer market to date. A $24M Series A followed in October 2025, led by Basis Set Ventures with strategic investors including Dharmesh Shah and the CEOs of Datadog, Supabase, and PostHog. API call growth from 35M (Q1 2025) to 186M (Q3 2025), approximately 30% month-over-month, confirms real production adoption.
Mem0 integrates natively with CrewAI, Flowise, Langflow, and AWS Strands, giving it the broadest ecosystem coverage in the category. For teams evaluating options, the Mem0 alternatives landscape is also worth reviewing.
Maturity and evolution
Mem0g adds approximately 2% overall score improvement over base Mem0, meaningful but not transformational; the graph is supplementary, not architectural. Graph memory is locked to the Pro tier at $249/mo, a 13x jump from Starter at $19/mo, which is the top developer complaint in community forums. GitHub Issue #2066 documents extreme graph token costs in production: saving 62 photo descriptions took over an hour at costs 15x higher than generation. The full stack is self-hostable under Apache 2.0, a clear advantage over Zep post-Community Edition deprecation.
Core components of Mem0
- Vector memory store: Fact extraction pipeline that converts conversations into semantic embeddings; retrieval by cosine similarity
- LLM extraction layer: Configurable LLM identifies salient facts, user preferences, and entities from raw conversation text
- Mem0g graph variant: Directed labeled graph (entities as nodes, typed relationships as edges) for relationship-aware retrieval; Pro tier only on the managed API
- Conflict resolution: Update phase resolves contradictions between new facts and existing memory (e.g., old and new address for the same user)
- Multi-agent memory: Shared memory namespaces across agent sessions; supports cross-agent context passing
- SDK integrations: Native SDKs for Python and JavaScript; integrations with CrewAI, Flowise, Langflow, and AWS Strands Agent SDK
Not sure which architecture fits your agent stack? See our decision framework: How to choose an AI agent memory architecture
Zep vs Mem0: Head-to-head comparison
The sharpest differences between Zep and Mem0 appear in three dimensions: storage architecture (temporal knowledge graph vs vector-first), pricing model (graph available at all Zep tiers vs Mem0’s $249/mo Pro paywall), and self-hosting posture (Mem0 maintains full Apache 2.0 stack; Zep deprecated Community Edition). On benchmarks, both claim state-of-the-art performance, and both contest the other’s methodology.
| Dimension | Zep | Mem0 |
|---|---|---|
| Primary storage | Bi-temporal knowledge graph (Graphiti) | Vector embeddings + optional Mem0g directed graph |
| Temporal reasoning | Native: fact validity windows, temporal queries | Improving: Mem0g adds ~2% on temporal benchmarks over base |
| LongMemEval (GPT-4o) | 63.8% (independent test) | 49.0% (same benchmark) |
| LOCOMO benchmark | Disputed: 84% (original) → 58.44% (Mem0 corrected) → 75.14% (Zep counter) | 66.9% LLM-as-Judge; 26% relative improvement vs OpenAI memory |
| Self-hosting | Graphiti (Apache 2.0) only; full Zep SaaS stack not self-hostable | Full stack (Apache 2.0); Docker images maintained |
| Graph access pricing | All tiers (credit model; ~$25/mo Flex) | Pro tier only ($249/mo, 13x jump from $19/mo Starter) |
| MCP ecosystem | Graphiti MCP Server v1.0 (November 2025) | No MCP server in current release |
| Ecosystem integrations | MCP clients (Claude Desktop, Cursor) | AWS Strands, CrewAI, Flowise, Langflow |
| GitHub traction | 24K+ stars (Graphiti) | 41K+ stars at Series A (mem0) |
| Failure mode | Self-hosting complexity post-CE deprecation; SaaS gaps for smaller users | Graph paywall blocks production evaluation; graph token costs at scale |
The benchmark dispute: what it actually means
Zep’s original 84% LOCOMO claim was challenged by Mem0, which published a correction asserting that Zep had included adversarial question categories that the benchmark specification explicitly excludes, reducing Zep’s score to 58.44%. Zep rebutted this, claiming Mem0 misconfigured Zep in their evaluation and that the corrected Zep score is 75.14%, approximately 10% above Mem0’s best configuration. Both methodologies have been questioned by independent observers.
The honest read: treat any single benchmark number from either vendor with caution. The dispute itself reveals that benchmark construction for conversational memory is still immature.
Real-world example: a B2B SaaS customer success agent
Your agent assists account managers by remembering customer preferences, open issues, and renewal history. A customer calls to change their billing contact and mentions that the Q4 upsell discussion from November is no longer relevant.
- With Mem0 (base): The old billing contact remains in vector memory unless explicitly contradicted; the Q4 upsell context may still surface by semantic similarity in future queries.
- With Zep/Graphiti: The billing contact update marks the old entity invalid with a timestamp; the Q4 opportunity is flagged as superseded. A future query for “active opportunities for Acme Corp” returns only current state.
Both approaches work. The difference matters when your agent’s reasoning depends on what is true now vs what was said at some point in the past.
Build Your AI Context Stack
A practical guide to layering memory, context, and governance in production agent systems.
When to use Zep, when to use Mem0, and when you’ll need something more
Zep and Mem0 solve the same core problem (stateless LLMs forgetting what happened in previous sessions) with different trade-offs. In most production agent stacks, you choose one, not both. Where both hit their limit is the same place: neither was designed to serve as the governed context layer for agents that reason about enterprise data assets, business definitions, or access-controlled data pipelines.
When to prioritize Zep
- Your agent needs to reason about how facts change over time (customer status, preferences, entity relationships)
- You are building on MCP-compatible tooling (Claude Desktop, Cursor) and want native Graphiti integration
- Your use case involves complex entity graphs where relationship history matters
- You have engineering capacity to manage Graphiti plus a compatible graph database
When to prioritize Mem0
- You need a functional memory layer quickly on a managed SaaS or self-hosted Apache 2.0 stack
- Your primary integrations are CrewAI, Flowise, Langflow, or AWS Strands
- You are building for consumer or B2B copilot use cases where semantic similarity is sufficient
- You want graph memory later but need to evaluate before committing to the Pro pricing tier
The benchmark dispute as a signal
Zep’s “Stop Using RAG for Agent Memory” post and the LOCOMO dispute both reveal something instructive: the industry is arguing over recall accuracy on conversation benchmarks. LOCOMO and LongMemEval measure how well an agent remembers what a user said, not whether an agent knows what net_revenue means in your finance data warehouse.
For teams building enterprise AI agent memory systems, recall accuracy matters, but context quality and governance matter more. Understanding the full landscape of types of AI agent memory helps clarify where conversational memory fits relative to other memory types.
When you’ll need something more
McKinsey finds that 8 in 10 companies cite data limitations, not recall accuracy, as the primary roadblock to scaling agentic AI. Gartner predicts that more than 40% of agentic AI projects will be canceled by end of 2027, citing governance gaps and unclear business value.
The failure mode isn’t that Zep or Mem0 retrieves the wrong conversation fact. It’s that neither tool can tell your agent what revenue_recognized means in your data warehouse, who certified the orders table, or whether this agent is authorized to query that data at all. This is the memory layer vs context layer distinction, and it is architectural, not a feature gap.
Inside Atlan AI Labs & The 5x Accuracy Factor
How governed context produces materially better results than memory alone.
How Atlan approaches the context problem that Zep and Mem0 don’t solve
Atlan operates at a different layer from Zep and Mem0. Where they provide conversation memory (what the user said before), Atlan provides governed enterprise context: what net_revenue means in finance vs product, who certified the orders table, and whether this agent is authorized to query that column at runtime. Gartner named Atlan a Leader in its 2026 Data & Analytics Governance Platforms Magic Quadrant, citing its metadata control plane as central to agentic solutions.
The challenge
Most enterprise AI teams discover the architectural gap late: the agent recalls user preferences correctly (Mem0/Zep doing their job) but pulls revenue_recognized instead of net_revenue because no governed definition was available at inference time. Neither Zep nor Mem0 was designed to surface semantic layer definitions, column-level lineage, or runtime access policies; these are architectural gaps, not feature gaps. Gartner’s projection that 40%+ of agentic AI projects will be canceled by 2027 cites governance gaps, not recall failures, as the primary cause.
Atlan’s unified approach
Atlan’s context layer addresses five governed memory types that neither Zep nor Mem0 provides:
- Semantic definitions: governed business metrics like net_revenue, distinct from revenue_recognized
- Ontology: cross-system identity resolution; the same “customer” in Salesforce, Snowflake, and SAP
- Operational playbooks: routing and disambiguation logic for agent decisions
- Column-level data lineage: provenance across cloud systems, tracked and certified
- Runtime access enforcement: governance at inference time, not just at retrieval; agents cannot access unauthorized data even when memory retrieval succeeds
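As a generic pattern (not Atlan’s API; the policy table and function names here are invented), runtime enforcement means the authorization check sits in front of every fetch, so an unauthorized asset never reaches the agent even if memory retrieval would have returned it:

```python
# Generic runtime-enforcement sketch: check policy before any retrieval result
# reaches the agent. The policy shape and agent/asset names are illustrative.
POLICY = {
    "support_agent": {"orders"},
    "finance_agent": {"orders", "net_revenue"},
}

def authorized_fetch(agent: str, asset: str, fetch):
    """Gate every data access through the policy, at inference time."""
    if asset not in POLICY.get(agent, set()):
        raise PermissionError(f"{agent} is not authorized to read {asset}")
    return fetch(asset)

print(authorized_fetch("finance_agent", "net_revenue", lambda a: f"<{a} rows>"))
# authorized_fetch("support_agent", "net_revenue", ...) would raise PermissionError
```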
The Context Engineering Studio provides a systematic workflow for building, testing, and deploying governed enterprise context through specialist AI agents and human-in-the-loop workflows. The Context Repo lets every agent read governed definitions via MCP (the same protocol Graphiti’s MCP Server uses), serving governed enterprise context rather than conversation history. Snowflake named Atlan its 2025 Data Governance Partner of the Year and selected Atlan as the launch partner for Snowflake Intelligence.
See how Atlan’s context layer works as enterprise memory for AI agents: atlan.com/know/atlan-context-layer-enterprise-memory/
AI Context Maturity Assessment
Find out whether your team is ready to move from memory layers to governed context.
Real stories: enterprise context layers in production
"AI initiatives require more context than ever. Atlan's metadata lakehouse is configurable, intuitive, and able to scale to hundreds of millions of assets."
— Andrew Reiskind, Chief Data Officer, Mastercard
"Context is the differentiator. Atlan gave our teams the shared vocabulary and lineage to move from reactive data management to proactive AI enablement across CME Group."
— Kiran Panja, Managing Director, Data & Analytics, CME Group
Memory architecture is a genuine architectural decision
Zep and Mem0 are both serious, production-ready frameworks, and the choice between them is a genuine architectural decision, not a “pick the better one” verdict. Zep’s temporal knowledge graph excels when your agent needs to reason about change over time. Mem0’s vector-first approach with broad ecosystem coverage is the faster path to a functional memory layer for most teams.
The benchmark dispute between the two vendors is real, but it signals immaturity in the evaluation methodology, not a clear winner. Both tools are optimizing for recall accuracy on benchmarks that measure conversation memory, not enterprise data context. Zep’s pivot to “context engineering” is the clearest market signal that the field is evolving past simple vector recall.
For enterprise data agents (systems that need to reason about governed business definitions, certified data lineage, and access-controlled pipelines), both Zep and Mem0 are necessary layers, but neither is sufficient as the enterprise context layer. That requires a different architectural component underneath them.
Explore how a governed context layer works alongside agent memory frameworks: Memory layer vs context layer: which do you actually need?
Ready to see how Atlan's context layer works with your agent stack?
Book a Demo
FAQs about Zep vs Mem0
1. What is the difference between Zep and Mem0?
Zep stores agent memory in Graphiti, a temporal knowledge graph that tracks when facts were true, not just what they were. Mem0 uses vector embeddings to extract and retrieve salient facts, with an optional graph variant (Mem0g). The core difference: Zep is graph-first with native temporal reasoning; Mem0 is vector-first with graph as an add-on. Both solve the stateless LLM problem; they disagree on the best storage architecture.
2. Is Zep better than Mem0 for AI agents?
It depends on the use case. On LongMemEval with GPT-4o, Zep scores 63.8% vs Mem0’s 49.0%, a meaningful gap. For agents that need temporal reasoning (tracking how facts change over time), Zep’s Graphiti architecture is more capable. For developers who need broad ecosystem integrations, AWS Strands, CrewAI, Flowise, and a fully self-hostable open-source stack, Mem0 is the faster path to production. Neither is universally better.
3. What is Graphiti and how does it relate to Zep?
Graphiti is Zep’s open-source temporal knowledge graph engine (24K+ GitHub stars, Apache 2.0). It powers all of Zep’s memory capabilities. Graphiti stores conversation episodes as graph updates, models entities with validity windows, and supports hybrid retrieval combining semantic embeddings, BM25 keyword search, and graph traversal. As of April 2025, Graphiti is Zep’s only open-source component; Zep Community Edition was deprecated.
4. Does Mem0 support temporal memory?
Partially. Base Mem0 retrieves by semantic similarity; it can surface outdated facts if they are semantically closer to the query than the updated fact. Mem0g (graph-enhanced variant) improves temporal handling by about 2% on benchmarks. Zep’s Graphiti explicitly models fact validity windows, making temporal reasoning a first-class feature rather than an improvement layered on top of vector retrieval.
5. Can I self-host Zep in 2026?
You can self-host Graphiti (Apache 2.0), but not the full Zep application stack; Zep Community Edition was deprecated in April 2025, with additional feature retirements in February 2026. Self-hosting Graphiti requires provisioning the Graphiti service plus a compatible graph database (Neo4j, FalkorDB, or Kuzu). Mem0 maintains a fully self-hostable Apache 2.0 stack with Docker support.
6. What is the LongMemEval benchmark and how do Zep and Mem0 compare?
LongMemEval is an evaluation framework for AI agent memory that tests recall accuracy over long conversation histories. On GPT-4o, Zep scores 63.8% and Mem0 scores 49.0%, a 14.8 percentage point gap in Zep’s favor according to an independent test. There is also a separate LOCOMO benchmark dispute: Zep originally claimed 84%, Mem0 corrected this to 58.44%, and Zep counter-claimed 75.14%. Both vendors contest the other’s methodology; treat any single benchmark figure with caution.
7. When should I use Zep instead of Mem0?
Use Zep when your agent needs to reason about how facts change over time: changing customer status, evolving relationships, temporal queries like “what was the account state in Q3?” Use Zep if you are building on MCP-compatible tooling (Claude Desktop, Cursor) and want Graphiti’s native MCP server integration. Use Mem0 if you need the fastest path to a production memory layer with broad ecosystem coverage and full open-source self-hosting.
Sources
- Zep: A Temporal Knowledge Graph Architecture for Agent Memory, arXiv, January 2025
- Mem0: Building Production-Ready AI Agents with Scalable Long-Term Memory, arXiv, April 2025
- Mem0 vs Zep (Graphiti): AI Agent Memory Compared (2026), Vectorize.io, 2026
- Mem0 raises $24M Series A, TechCrunch, October 2025
- Graphiti: Build Real-Time Knowledge Graphs for AI Agents, GitHub, getzep/graphiti
- Revisiting Zep’s 84% LoCoMo Claim, GitHub Issue #5, getzep/zep-papers
- Is Mem0 Really SOTA in Agent Memory?, Zep Blog
- Graphiti hits 20K Stars + MCP Server 1.0, Zep Blog, November 2025
- Zep v3: Context Engineering Takes Center Stage, Zep Blog
- AWS and Mem0 partner for Strands Agent SDK, Mem0, May 2025
- Gartner: 40% of Enterprise Apps Will Feature Task-Specific AI Agents by 2026, Gartner, August 2025
- Gartner: 40%+ Agentic AI Projects Canceled by 2027, Gartner, June 2025
- Building the Foundations for Agentic AI at Scale, McKinsey
- Shopify CEO and ex-OpenAI researcher agree: context engineering beats prompt engineering, The Decoder
- Announcing a New Direction for Zep’s Open Source Strategy, Zep Blog, April 2025
- Mem0 graph token cost issue, GitHub Issue #2066, mem0ai/mem0
