Semantic memory is what an AI agent knows: facts, definitions, entity relationships. Procedural memory is how it behaves: rules, skills, decision logic. Most enterprise teams treat both as interchangeable “memory” and store them in the same places. That architectural conflation produces specific, traceable failures: agents that answer confidently with outdated business rules, policy drift across agent fleets, and governed metrics retrieved from unverified vector stores. This page gives you a diagnostic framework to separate them and build agents that get both right.
| Dimension | Semantic Memory | Procedural Memory |
|---|---|---|
| What it is | The agent’s world knowledge: facts and concepts it can state and reason over | The agent’s behavioral programming: skills and rules it follows automatically |
| What it stores | Definitions, business terms, entity properties, domain facts, certified metrics | Decision rules, workflows, agent persona, tool-selection logic, constraint policies |
| Where it lives | Vector DB, knowledge graph, structured DB, enterprise data catalog | LLM weights (in-weights), system prompt (explicit instructions), agent executor code (code-embedded) |
| How it’s updated | Automated extraction, episodic consolidation, human curation, continuous metadata ingestion | Fine-tuning / RLHF, prompt engineering, LangMem prompt optimization, code deployment |
| Failure mode | Authority vacuum: conflicting facts with no resolution mechanism; silent staleness | Governance gap: unversioned per-agent prompts; procedural drift via self-modification |
| Best for | “What is X?” for knowledge retrieval, entity resolution, fact grounding | “How should I act?” for workflow enforcement, behavioral constraints, compliance rules |
Semantic vs procedural memory: what’s the difference?
Semantic memory covers what the agent knows: facts, definitions, relationships it can retrieve and state. Procedural memory covers how the agent behaves: rules, skills, and decision patterns it follows without consciously retrieving them.
The distinction originates in cognitive science. Endel Tulving (1972) described semantic memory as a “mental thesaurus”: structured, context-independent world knowledge that underlies language use. Larry Squire (1987) formalized procedural memory as “knowing how” vs. semantic memory’s “knowing that,” categorizing it as non-declarative, implicit memory that operates automatically. For AI agents, the distinction is architectural: the two types require different storage backends, different update mechanisms, and different governance regimes.
A concrete example illustrates the difference: “What does net_revenue_q4 mean?” draws on semantic memory; the agent retrieves a fact. “Always use certified tables for revenue queries” is procedural memory; the agent follows a rule automatically. The CoALA framework (arXiv:2309.02427, TMLR 2024) formalizes both as distinct substrates in cognitive architectures for language agents.
The reason teams conflate them is straightforward: both live in what practitioners call the “memory layer”, so teams reach for the same tool (vector DB) for both. The early RAG pattern (“embed everything, retrieve everything”) trained engineers to think of vector stores as universal memory. But the CoALA framework is explicit that procedural memory “must be initialized by the designer” and is the riskiest learning target because code modifications can introduce bugs. Conflation causes business rules stored in vector DBs to have no authority mechanism, while facts in system prompts have no update trigger; both fail silently in production.
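The split can be made concrete in code. In this minimal sketch (all names hypothetical, not from any framework), the semantic fact is retrieved explicitly at inference time, while the procedural rule is injected into every prompt and never retrieved:

```python
# Hypothetical sketch: the same topic split across the two memory types.
# SEMANTIC_STORE and SYSTEM_PROMPT_RULES are illustrative names only.

SEMANTIC_STORE = {
    # "What is X?" content: retrieved at inference time, with provenance.
    "net_revenue_q4": {
        "definition": "Q4 revenue net of refunds and discounts",
        "version": "3.2",
        "approved_by": "Finance",
    },
}

# "How should I act?" content: applied on every call, never retrieved.
SYSTEM_PROMPT_RULES = [
    "Always use certified tables for revenue queries.",
]

def build_prompt(user_question: str) -> str:
    fact = SEMANTIC_STORE["net_revenue_q4"]   # explicit retrieval step
    rules = "\n".join(SYSTEM_PROMPT_RULES)    # implicit, always present
    return f"{rules}\n\nContext: {fact}\n\nQ: {user_question}"
```

The asymmetry is the point: the fact can go stale and be refreshed in the store without touching the agent, while the rule changes only when someone edits the prompt, which is exactly where the two governance problems diverge.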
What is semantic memory in AI agents?
Semantic memory in AI agents is the store of facts, definitions, and world knowledge the agent retrieves at inference time to ground its answers. In enterprise settings, it holds business terms, metric definitions, entity relationships, and certified domain knowledge. Unlike episodic memory (what happened) or procedural memory (how to act), semantic memory answers the question “what is X?”; its reliability depends entirely on whether the underlying store has authority mechanisms, not just similarity search.
Semantic memory takes two forms: explicit (retrievable from an external store) and implicit (baked into model weights at training, the in-weights semantic memory that cannot be updated without retraining). The in-weights form explains the knowledge cutoff problem: facts encoded during pretraining cannot be corrected by retrieval alone. For enterprise data agents, this is why external, governed semantic stores are not optional; the agent’s in-weights world knowledge is both frozen and generalized, lacking the certified business definitions a production data agent requires.
In enterprise contexts, semantic memory stores: asset definitions and business terms (“What is ARR?”), certified metric definitions with approval metadata (net_revenue_q4, version 3.2, approved by Finance 2026-01-15), cross-system entity resolution (“customer” in CRM = “org” in support = “account” in billing), column-level lineage and ownership, and user profile data consolidated from episodic interactions. CME Group cataloged 18 million assets and over 1,300 certified glossary terms in year one: production evidence that governed semantic memory at enterprise scale is achievable, not aspirational.
Semantic memory evolves via four mechanisms: automated extraction (LLMs pull facts from conversation turns, Mem0’s approach), episodic consolidation (patterns distilled from episodic memories become durable semantic facts), human curation (domain experts write and certify definitions, the enterprise governance path), and continuous ingestion (metadata crawlers extract facts from connected sources). In-weights semantic memory cannot be updated without fine-tuning; it carries the highest update cost of any memory type.
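The episodic consolidation mechanism can be sketched in a few lines. This is a deliberately crude illustration, assuming a simple repetition threshold (real systems weigh recency, confidence, and source quality); the function name and episode format are hypothetical:

```python
# Hypothetical episodic-to-semantic consolidation: an observation that
# recurs across episodes is promoted to a durable semantic fact once it
# repeats often enough. The threshold of 2 is illustrative only.
from collections import Counter

def consolidate(episodes: list[str], threshold: int = 2) -> list[str]:
    counts = Counter(episodes)
    # Keep only observations seen at least `threshold` times.
    return [obs for obs, n in counts.items() if n >= threshold]

episodes = [
    "reports revenue in EUR",
    "reports revenue in EUR",
    "asked about churn once",
]
durable_facts = consolidate(episodes)
```

A one-off remark stays episodic; a repeated pattern graduates into the semantic store, where it then needs the same provenance and freshness treatment as any other fact.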
Core components of semantic memory
- Knowledge base / fact store: Structured repository of definitions, concepts, and entity properties; the queryable layer of what the agent “knows”
- Ontology and schema: The relational structure connecting facts, showing how “revenue” relates to “ARR” relates to “net revenue,” enabling multi-hop reasoning beyond single-fact retrieval
- Factual assertions with provenance: Each fact linked to its source, update timestamp, and (in governed systems) approval status and version history
- Temporal validity tracking: Mechanism to flag stale or deprecated content. Native in governed catalogs; absent by default in vector DBs
- Entity resolution layer: Cross-system deduplication ensuring “customer,” “org,” and “account” resolve to the same canonical entity across sources
- In-weights world knowledge: The implicit semantic substrate baked into LLM parameters. Broad but frozen; cannot be corrected at inference without RAG or fine-tuning
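The components above suggest a record shape for a governed semantic fact. This sketch is hypothetical (field and class names are illustrative, not any product’s schema); the point is that authority and freshness live in metadata that can be checked directly, independently of similarity search:

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical governed-fact record: provenance, approval, and temporal
# validity are first-class fields, not properties of an embedding.

@dataclass
class GovernedFact:
    term: str
    definition: str
    version: str
    approved_by: str
    approved_on: date
    deprecated: bool = False   # temporal validity flag

    def is_authoritative(self) -> bool:
        # Similarity search cannot answer this question; metadata can.
        return not self.deprecated

fact = GovernedFact(
    term="net_revenue_q4",
    definition="Q4 revenue net of refunds and discounts",
    version="3.2",
    approved_by="Finance",
    approved_on=date(2026, 1, 15),
)
```

A plain vector store keeps only `term` and `definition`; everything else in this record is the governance layer the authority-vacuum failure mode is missing.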
What is procedural memory in AI agents?
Procedural memory in AI agents encodes how the agent behaves: its skills, decision rules, workflow patterns, and behavioral constraints. Unlike semantic memory, it is not retrieved at inference time; it operates automatically, shaping every action the agent takes. Squire’s cognitive science framing maps directly to AI agents: the agent “knows how” to perform without conscious access to how that knowledge is encoded.
Critically, procedural memory is not the same as semantic memory about procedures. A description of “what our invoice approval process is” is semantic content. The rule that makes the agent follow that process is procedural content. The distinction matters because the two require different storage and different governance; the most common conflation failure begins here.
The CoALA framework (arXiv:2309.02427) identifies three distinct storage substrates for procedural memory. The in-model vs. external axis helps contextualize them: in-weights memory lives inside the model’s parameters, while the code-embedded and system-prompt substrates live outside it.
- In-weights (LLM parameters): Skills baked in through pretraining and fine-tuning: reading, coding, reasoning. Deepest and most stable; highest update cost. Risk: catastrophic forgetting during fine-tuning.
- Code-embedded (agent executor): Routing logic, tool definitions, workflow graphs (LangChain tool schemas, LangGraph edge conditions). Auditable via version control; requires deployment to change.
- Explicit instruction sets (system prompt / managed rule libraries): The most flexible substrate. LangMem (LangChain) focuses here specifically; agents update their own system prompt instructions via meta-prompt optimization. Updatable at runtime without model or code changes, but ungoverned by default.
Update mechanisms follow the substrate: fine-tuning for in-weights (slow, expensive, risk of catastrophic forgetting), prompt engineering for explicit instructions (fast, cheap, brittle at fleet scale), LangMem prompt optimization for automated instruction refinement, and code deployment for executor-embedded logic. The MACLA framework (arXiv:2512.18950) represents an emerging research approach: a frozen LLM plus external procedural store, updated via Bayesian selection and contrastive refinement.
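A toy example of the code-embedded substrate, assuming a simple keyword router (tool names and matching logic are hypothetical): the routing rule lives in the executor, sits under version control, and changes only through deployment, which is what makes it auditable:

```python
# Hypothetical code-embedded procedural memory: tool-selection logic as
# executor code. Tool names are illustrative; the substring matching is a
# deliberately crude stand-in for real routing conditions.

def select_tool(query: str) -> str:
    q = query.lower()
    if "revenue" in q or "arr" in q:
        # Compliance rule enforced in code, not retrieved from a store.
        return "certified_sql_tool"
    if "lineage" in q:
        return "catalog_lookup_tool"
    return "general_search_tool"
```

Changing this rule means a code review and a deploy: slower than editing a prompt, but every change is versioned and attributable, which the explicit-instruction substrate lacks by default.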
Core components of procedural memory
- Action policies: Step-by-step rules governing how the agent handles specific task types (invoice approval, PR review, data access request)
- Tool schemas and selection heuristics: Which tools to call, in what order, under what conditions; the routing logic of the agent
- Reasoning patterns: Chain-of-thought scaffolding, decomposition strategies, fallback logic when primary approaches fail
- Constraint and compliance rules: Data access policies, GDPR/SOX/HIPAA enforcement rules, source-of-truth routing (always use certified tables for revenue)
- Agent persona and format rules: Tone, response structure, and escalation triggers; the behavioral style layer
- Learned behavioral updates: User-specific interaction patterns refined over time via LangMem or similar prompt optimization
Semantic vs procedural: head-to-head comparison
The sharpest way to distinguish semantic from procedural memory is to ask what breaks when each is misused. Semantic memory stored in ungoverned vector DBs produces authority vacuums: contradictory answers with no resolution mechanism. Procedural memory stored in per-agent system prompts produces governance gaps: policy drift across agent fleets with no audit trail. The failure modes are different, the storage requirements are different, and the governance regimes are different.
| Dimension | Semantic Memory | Procedural Memory |
|---|---|---|
| Primary focus | What the agent knows (“knowing that”) | How the agent behaves (“knowing how”) |
| Storage substrate | Vector DB, knowledge graph, structured DB, enterprise data catalog | LLM weights, system prompt, agent executor code |
| Retrieval mechanism | Explicit retrieval at inference: similarity search or graph traversal | Implicit; automatically applied with no retrieval call for in-weights or system prompt |
| Update frequency | Continuous (automated extraction) to periodic (human curation) | Infrequent for in-weights; on-demand for prompts |
| Update cost | Low for vector writes; high for certification and lineage | Low for prompt edits; very high for fine-tuning |
| Who owns it | Data teams, domain experts, knowledge management | ML engineers (in-weights), platform engineers (code), agent designers (prompts) |
| Failure mode | Authority vacuum: similarity retrieval returns conflicting or stale facts without flagging | Governance gap: prompt changes unversioned; fleet-wide policy updates require manual per-agent edits |
| Enterprise risk | Stale metric definitions used for production decisions; conflicting “revenue” definitions across teams with no resolution mechanism | Procedural drift (SSGM, arXiv:2603.11768): gradual reinforcement of suboptimal workflows; no rollback path |
| Tool examples | Mem0, Zep, LangMem entity profiles, Atlan Context Layer | LangMem prompt optimizer, Letta instruction blocks, MACLA (arXiv:2512.18950) |
| What breaks when misused | Business rules stored here → retrieval inconsistency, version confusion, outdated procedures applied confidently | Domain facts stored here → facts don’t update when world changes; silent staleness across agent fleet |
The certified table scenario. An enterprise data agent receives the rule: “Always use certified tables for revenue calculations.” The team stores this rule in a vector DB alongside factual content: metric definitions, schema documentation.
The result: the rule is retrieved by cosine similarity, sometimes. When a semantically similar but outdated version exists in the same store (“prefer certified tables where available”), retrieval is non-deterministic. On some queries, the agent applies the current rule; on others, it retrieves the older version. The agent never flags the inconsistency; both answers arrive with equal confidence.
The fix: the rule belongs in the system prompt (explicit procedural memory) or, at enterprise scale, in a centrally managed instruction library with version control and deployment governance. The fact about what “certified” means belongs in the semantic store: certified by whom, when, and with what lineage. Same topic, different memory types, different storage, different update path.
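What a centrally managed instruction library buys over similarity retrieval can be shown in a minimal sketch (class and method names are hypothetical): one current rule, an append-only history, and a rollback path.

```python
# Hypothetical versioned instruction library: deterministic "current rule",
# append-only audit trail, and rollback. Names are illustrative only.

class InstructionLibrary:
    def __init__(self) -> None:
        self._versions: list[str] = []

    def publish(self, rule: str) -> None:
        self._versions.append(rule)   # nothing is overwritten: audit trail

    def current(self) -> str:
        return self._versions[-1]     # deterministic: always the latest

    def rollback(self) -> None:
        self._versions.pop()

lib = InstructionLibrary()
lib.publish("Prefer certified tables where available.")
lib.publish("Always use certified tables for revenue calculations.")
# Every agent reads the same current rule; the deprecated wording can no
# longer be surfaced by a similarity lottery.
```

Contrast this with the vector-store version of the same rule: there, both wordings coexist as embeddings, and which one the agent sees depends on the query.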
The conflation problem: what goes wrong when teams confuse them
The most common enterprise agent failure is not a capability gap; it is an architectural misclassification. Teams store business rules (procedural) in vector databases (semantic infrastructure) and hard-code facts (semantic) into system prompts (procedural infrastructure). Both patterns fail silently, at scale, in production.
Problem 1: authority vacuum (procedural content in semantic stores). Vector databases resolve queries by cosine similarity, not correctness or recency. When a business rule is stored alongside similar-but-outdated rules, retrieval is non-deterministic; the agent may retrieve the current rule on one query and the deprecated version on the next. No conflict resolution mechanism, no staleness flag, no “current version” concept. As practitioners have noted, “Memory Is Not a Vector Database”; agents need beliefs, not just storage. Teams see agents following inconsistent procedures with no error signal. An enterprise data catalog with certification and lineage, like Atlan’s context layer, eliminates the authority vacuum by providing a single, governed semantic source that similarity search cannot override.
Problem 2: silent staleness (semantic content in procedural stores). Facts hard-coded into system prompts do not update when the world changes. A metric is redefined; a product is renamed; a data source is deprecated. The system prompt does not know. At hundreds-of-agents scale, the update surface becomes unmanageable: every agent requires an independent edit with no centralized trigger. EY’s survey (March 2026) found 78% of leaders admit AI adoption already outpaces their ability to manage the risks it creates. Silent staleness is one concrete mechanism behind that figure.
Problem 3: procedural drift via self-modification. LangMem and similar frameworks enable agents to rewrite their own system prompt instructions based on conversation feedback. Without governance guardrails, agents gradually reinforce suboptimal workflows. The SSGM framework (arXiv:2603.11768, 2026) formalizes this as “procedural drift,” a documented production failure mode, and recommends consistency verification and temporal decay modeling: the governance scaffolding LangMem’s prompt optimization currently lacks. The stability-plasticity dilemma is real: agents that learn too readily lose the stable behavioral foundation that makes them trustworthy.
The diagnostic rule:
- If it answers “what is X?” → it is semantic content → it belongs in a governed semantic store with authority and freshness mechanisms.
- If it answers “how should I act?” → it is procedural content → it belongs in a versioned, centrally managed instruction layer with rollback.
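The diagnostic rule can be applied as a triage step when populating memory stores. This helper is a deliberately crude sketch (the function name and the keyword heuristic are hypothetical; in practice a human makes this classification):

```python
# Hypothetical triage helper for the diagnostic rule above: imperative
# phrasing suggests procedural content; declarative phrasing suggests
# semantic content. A keyword check is a stand-in for human judgment.

def classify_memory(content: str) -> str:
    imperative_markers = ("always ", "never ", "must ", "do not ")
    text = content.lower()
    if any(m in f"{text} " for m in imperative_markers):
        return "procedural: versioned instruction layer"
    return "semantic: governed semantic store"
```

Even a crude gate like this, run at ingestion time, would have kept the certified-table rule out of the vector store in the scenario above.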
How semantic and procedural memory work together
Semantic and procedural memory are not alternatives. Well-architected agents need both, working in concert. Semantic memory provides the facts the agent reasons over; procedural memory determines how it applies those facts. The failure modes compound when either is absent: an agent with rich semantic memory but weak procedural memory will know what “certified table” means but not consistently enforce the rule to use only certified tables.
Semantic provides the facts; procedural applies the rules
Semantic memory answers: “What does net_revenue_q4 mean? Who certified it? What is its lineage?” Procedural memory answers: “When this agent runs a revenue query, it must use only certified, Finance-approved tables, no exceptions.” The two work in tandem at inference: the agent retrieves the semantic fact and applies the procedural rule governing how to use it. If the procedural rule is absent or inconsistent, having a certified semantic definition is not enough; agents may still use uncertified sources.
When to invest in each
- Early-stage agents: Invest in semantic memory first. Build the knowledge base and entity resolution layer before tuning behavioral rules. Agents without facts cannot benefit from rules about how to use them.
- Scaling agents: Shift investment to procedural governance. Centralized, versioned instruction management becomes critical once agents operate across dozens of workflows. Unversioned per-agent prompts become unmanageable at this stage.
- Enterprise-regulated agents: Both must be governed simultaneously. Data access policies (procedural) and certified metric definitions (semantic) carry equal compliance weight. A well-defined metric in an ungoverned query rule is just as dangerous as an ungoverned metric in a well-enforced rule.
- Signal: If your agent answers “what” questions poorly → semantic gap. If it behaves inconsistently → procedural gap.
LangMem as a framework handling both
LangMem (LangChain) is currently the only major framework with first-class support for both types. It implements semantic memory via entity profiles stored in a key-value plus vector store, backed by LangGraph. It implements procedural memory via prompt optimization; agents update their own system instructions based on feedback, using meta-prompt, gradient, or single-step algorithms. The important caveat: LangMem’s procedural memory requires governance guardrails to prevent drift (the SSGM warning applies). First-class support for procedural memory is not the same as governed procedural memory. For enterprise agent memory frameworks more broadly, consult the full comparison.
How Atlan approaches semantic and procedural memory
Enterprise data agents face a specific version of the semantic memory governance problem that chatbot-centric frameworks do not address: the need for certified, versioned, lineage-aware definitions at inference time. Standard semantic memory tools (Mem0, Zep, LangMem entity profiles) return facts by similarity. They cannot distinguish between a certified definition and a deprecated one. The authority vacuum is most acute in data agents: “revenue” commonly carries conflicting definitions across finance, marketing, and operations; agents need a single authoritative answer.
Atlan’s context layer was built specifically for this problem. It is not a generic vector store; it is a governed semantic memory substrate where every fact carries approval metadata, version history, and cross-system entity resolution.
What this means in practice:
- Certified canonical definitions: net_revenue_q4, approved by Finance, 2026-01-15, version 3.2. A governed fact with approval metadata, not just an embedding.
- Cross-system entity resolution: “customer” in CRM = “org” in support = “account” in billing, resolved to a single canonical entity, enabling consistent agent reasoning across systems.
- Column-level lineage: Agents know not just what a metric means but where it comes from, who owns it, and what transformations produced it.
- Active metadata: Continuously updated from 100+ connected sources, not a static snapshot subject to silent staleness.
- Inference-time policy enforcement: Governance enforced at reasoning time, not only at ingestion; agents cannot use uncertified definitions for production decisions.
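Inference-time enforcement can be illustrated with a small guard (store shape, status values, and function name are all hypothetical, not Atlan’s API): the lookup itself refuses to hand an uncertified definition to the agent.

```python
# Hypothetical inference-time guard: an uncertified definition cannot be
# used for a production answer. The dict schema is illustrative only.

STORE = {
    "net_revenue_q4": {
        "definition": "Q4 revenue net of refunds and discounts",
        "status": "certified",
    },
    "net_revenue_q4_draft": {
        "definition": "working draft, pending Finance review",
        "status": "draft",
    },
}

def resolve_metric(name: str) -> dict:
    fact = STORE[name]
    if fact["status"] != "certified":
        # Enforced at reasoning time, not only at ingestion.
        raise PermissionError(f"{name} is not certified for production use")
    return fact
```

The design choice is enforcement placement: a check at ingestion time cannot stop an agent from using a definition that was later deprecated, while a check at resolution time can.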
For procedural memory, Atlan’s governance layer serves as the authoritative source for data access policies, enabling centralized rule updates with audit trails, solving the governance gap that unversioned per-agent prompts create.
CME Group cataloged 18 million assets and 1,300+ certified glossary terms in year one using Atlan. The data-catalog-as-LLM-knowledge-base pattern that CME Group deployed demonstrates that governed semantic memory at enterprise scale is production-ready. For the full architectural picture, see how Atlan’s context layer functions as enterprise memory. Similarly, Workday’s revenue agent “couldn’t answer one question” until it gained access to shared semantic definitions via Atlan’s MCP server, a direct demonstration of the authority vacuum problem and its resolution.
Real stories from real customers: governed semantic memory in production
"We're excited to build the future of AI governance with Atlan. All of the work that we did to get to a shared language at Workday can be leveraged by AI via Atlan's MCP server…as part of Atlan's AI Labs, we're co-building the semantic layer that AI needs with new constructs, like context products."
— Joe DosSantos, VP of Enterprise Data & Analytics, Workday
"Atlan is much more than a catalog of catalogs. It's more of a context operating system…Atlan enabled us to easily activate metadata for everything from discovery in the marketplace to AI governance to data quality to an MCP server delivering context to AI models."
— Sridher Arumugham, Chief Data & Analytics Officer, DigiKey
Architectural clarity is the competitive advantage
Most enterprise teams building AI agents are not failing on capability; they are failing on memory architecture. The distinction between semantic and procedural memory is not academic; it maps directly to specific production failure modes that no LLM capability improvement will fix. An agent that stores business rules in a vector database will fail non-deterministically regardless of the model powering it. An agent with facts hard-coded into system prompts will serve stale answers regardless of how accurate its retrieval is.
The resolution is straightforward in principle: semantic memory belongs in a store with authority, freshness, and governance, not just similarity. Procedural memory belongs in a versioned, centrally managed instruction layer, not a per-agent system prompt edited by hand. Gartner research projects that by 2030, 50% of enterprise AI agent deployment failures will be due to insufficient runtime enforcement by AI governance platforms, not capability gaps. The architectural clarity this page provides is the starting point for avoiding that outcome.
FAQs
1. What is the difference between semantic memory and procedural memory in AI agents?
Semantic memory is what the agent knows: facts, definitions, entity relationships it can retrieve and state. Procedural memory is how the agent behaves: rules, skills, and decision patterns it follows automatically without an explicit retrieval step. Semantic answers “what is X?”; procedural answers “how should I act?” The cognitive science origin: Tulving (1972) for semantic (“knowing that”), Squire (1987) for procedural (“knowing how”).
2. Where is procedural memory stored in AI agents?
The CoALA framework (arXiv:2309.02427) identifies three substrates: in-weights (baked into LLM parameters via pretraining/fine-tuning), code-embedded (agent executor logic, tool definitions, workflow graphs), and explicit instruction sets (system prompts, managed rule libraries). LangMem is currently the only major framework with first-class support for updating the explicit instruction set substrate at runtime.
3. How do AI agents update their semantic memory?
Via four mechanisms: automated extraction (LLMs pull facts from conversation turns, Mem0’s approach), episodic consolidation (patterns from episodic memories become durable facts), human curation (domain experts write and certify definitions), and continuous ingestion (metadata crawlers extract from connected sources). In-weights semantic memory, world knowledge baked into LLM parameters, cannot be updated without retraining.
4. Can an AI agent rewrite its own procedural memory?
Yes, with frameworks like LangMem. Agents can update their own system prompt instructions via prompt optimization algorithms. This is powerful but carries stability risk: ungoverned self-modification leads to procedural drift, gradual reinforcement of suboptimal workflows. The SSGM framework (arXiv:2603.11768, 2026) documents this as a production failure mode and recommends consistency verification and temporal decay modeling.
5. What is the best storage backend for semantic memory in AI agents?
It depends on the use case. Vector databases (Pinecone, Weaviate, pgvector) are fast and scalable but have no concept of authority or staleness. Knowledge graphs (Neo4j) enable multi-hop reasoning and entity deduplication. Hybrid approaches (Mem0’s graph-enhanced memory: 68.4% accuracy vs. 66.9% vector-only) offer the best of both. For enterprise data agents specifically, governed data catalogs add the certification and lineage layer that standard memory tools lack.
6. What happens when you store business rules in a vector database?
Retrieval becomes non-deterministic. Vector databases resolve queries by cosine similarity, not by recency, authority, or “currently in effect” status. When similar but conflicting versions of a rule exist in the store, agents may retrieve the current rule on one query and a deprecated version on the next. There is no conflict resolution, no staleness flag, and no error signal. Teams see agents following inconsistent procedures with no apparent cause. Business rules are procedural memory and belong in a versioned, governed instruction layer, not a similarity-retrieval store.
7. Is the system prompt an example of procedural memory?
Yes. The system prompt is the most accessible substrate for procedural memory in AI agents. It encodes behavioral rules, constraints, personas, and decision logic that the agent follows automatically on every inference. It corresponds to CoALA’s “explicit instruction sets” substrate. The key limitation: system prompts are typically unversioned, per-agent, and manually maintained, which is why enterprises with large agent fleets face governance gaps when policies change.
8. How do LangMem and Mem0 implement different memory types?
LangMem has first-class support for both: semantic memory via entity profiles (key-value + vector store) and procedural memory via prompt optimization (agents rewrite their own instructions). Mem0 focuses primarily on semantic memory, using a hybrid vector + graph architecture optimized for fact extraction, entity deduplication, and retrieval accuracy (91.6 on LoCoMo, 93.4 on LongMemEval). Neither addresses enterprise-grade governance for semantic memory (certification, lineage, approval chains); that gap is Atlan’s specific angle.
Sources
- Cognitive Architectures for Language Agents (CoALA), Sumers et al., TMLR 2024
- Governing Evolving Memory in LLM Agents: SSGM Framework, arXiv
- Learning Hierarchical Procedural Memory for LLM Agents through Bayesian Selection and Contrastive Refinement (MACLA), arXiv
- Mem0: Building Production-Ready AI Agents with Scalable Long-Term Memory, arXiv
- State of AI Agent Memory 2026, Mem0
- LangMem SDK for Agent Long-Term Memory, LangChain Blog
- Memory Is Not a Vector Database: Why AI Agents Need Beliefs Not Storage, DEV Community
- A Benchmark for Procedural Memory Retrieval in Language Agents, arXiv
- Episodic and Semantic Memory, Tulving 1972, APA PsycNet
- Memory and Brain, Squire 1987, Oxford University Press