| Aspect | Summary |
|---|---|
| What it is | Org-specific definitions, data ownership records, access policies, and institutional history: the semantic layer that tells AI what data means in this org |
| Why it’s missing | Not present in foundation models, not auto-extractable from schemas, and rarely governed at the metadata layer |
| Failure mode | Systematic wrong decisions from shared vocabulary mismatches, not random hallucination |
| Where it lives | Business glossaries, data catalogs, certification records, ownership metadata, data lineage systems |
| The fix | A governed metadata layer that encodes business context and exposes it to AI pipelines |
What is business context for AI?
Business context for AI is the four-part body of org-specific knowledge that determines what data means inside a given organization: definitions (what “revenue” means here), ownership (who governs each data asset), policies (what agents can and cannot access), and institutional history (why certain data exists or was deprecated). None of it is derivable from the data itself.
The four components of business context
Think of business context as four distinct layers that sit on top of raw data. First, org-specific definitions: the certified meanings of terms like “revenue,” “active user,” and “customer” within your specific data stack. Second, data ownership: who governs each asset, domain by domain. Third, access policies: what agents are permitted to see based on role and sensitivity classification. Fourth, institutional history: lineage decisions, why metrics were defined a certain way, what changed, and when.
Together, these four components form what contextual intelligence in AI actually requires at the enterprise level. A model may know what “revenue” means globally, but it cannot know that your org’s revenue is ARR-only and excludes professional services booked in a specific fiscal quarter.
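The four components above can be sketched as a single machine-readable record. This is an illustrative data structure, not a real catalog schema; every field name here is an assumption.

```python
from dataclasses import dataclass, field

# A minimal sketch of the four components of business context as one
# record. Field names and values are illustrative, not a real schema.
@dataclass
class BusinessContext:
    definition: str                # org-certified meaning of the term
    owner: str                     # accountable human or team
    access_roles: list             # roles permitted to use this term's assets
    history: list = field(default_factory=list)  # dated decision notes

revenue = BusinessContext(
    definition="ARR only; excludes professional services",
    owner="finance-data@yourco.example",
    access_roles=["finance_analyst", "executive"],
    history=["2023-04: excluded services revenue after FY restatement"],
)
```

Note that only the first field is about the data itself; the other three encode accountability, permission, and memory, which is why schema extraction alone cannot produce this record.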
As Martin Nevicky observed in November 2025, the missing context problem is not about model intelligence. It is about the absence of a governed vocabulary layer the model can query. Naming the problem is the first step. Encoding its solution is the work.
Why AI agents need business context
Without business context, agents operate on the wrong shared vocabulary. They don’t malfunction visibly. They produce confident, plausible, systematically wrong outputs. The problem compounds at scale: every agent call that resolves an undefined term adds another wrong decision on top of the last.
This is the core argument that context infrastructure for AI agents must address at its foundation. You can wire every tool via MCP, write governance policy documents, and stand up sophisticated retrieval pipelines. But if the data layer underneath is ungoverned, agents resolve ambiguous terms using model-inferred defaults that have nothing to do with your org.
The three-definition problem: why missing business context fails AI
The three-definition problem occurs when a single business term carries multiple valid definitions across different teams or systems. When an AI agent queries “revenue,” it may retrieve ARR from Finance, run rate from the CRO dashboard, or gross bookings from the data warehouse. All are legitimate. All are wrong for the original question. The result is systematic failure, not random hallucination.
Gartner’s February 2025 research found that 57% of organizations lack AI-ready data, with 60% of AI projects predicted to be abandoned for this reason. McKinsey’s 2025 State of AI global survey puts the consequence in revenue terms: 88% of organizations report regular AI use, but only 39% achieve measurable EBIT impact. The gap between adoption and outcome is the vocabulary gap.
What the three-definition problem looks like in practice
Consider three terms that appear in almost every enterprise data stack:
- Revenue: ARR in Finance, run rate in the CRO deck, gross bookings in the data warehouse
- Active user: Daily login to the product team, any session in the past 30 days to the growth team
- Customer: Paid accounts in Salesforce, includes trials in the CRM, excludes churned in the billing system
Each definition is defensible. None is wrong in isolation. But an agent picks one without knowing which context applies, producing outputs that are confident and materially incorrect. In financial services, this pattern has caused agents to report materially different revenue figures to the same executive question from different systems. Publicis Sapient captured it directly: “2026 tools are working with 1990s context.”
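The failure mode can be made concrete with a toy resolver. This is a sketch under stated assumptions: the definitions and the "certified source" mapping are stand-ins for a governed metadata layer, not any real system.

```python
# Illustrative only: the same term resolves to three different meanings
# depending on which system an agent happens to query first.
DEFINITIONS = {
    "revenue": {
        "finance": "ARR",
        "cro_dashboard": "run rate",
        "warehouse": "gross bookings",
    }
}

# In a governed stack, the authoritative source per term lives in the
# metadata layer; this dict is a stand-in for that record.
CERTIFIED_SOURCE = {"revenue": "finance"}

def resolve_ungoverned(term: str, system: str) -> str:
    # Whichever system the agent hits first wins: the error is
    # systematic and repeatable, not random.
    return DEFINITIONS[term][system]

def resolve_governed(term: str) -> str:
    # Always route through the certified source for the term.
    return DEFINITIONS[term][CERTIFIED_SOURCE[term]]
```

An ungoverned agent querying the warehouse returns "gross bookings" for every revenue question, consistently and confidently; the governed path returns the certified "ARR" definition regardless of entry point.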
Why this produces systematic failure, not random hallucination
Random hallucination is an error of knowledge. The three-definition problem is an error of vocabulary. Agents using the wrong definition don’t fail once. They fail consistently, in the same direction, every time a term appears.
This is the distinction that makes context engineering different from prompt engineering. Writing clearer prompts doesn’t resolve an ambiguous definition. It just makes the wrong definition slightly more legible. As Publicis Sapient argues in their enterprise context graph analysis, context rot compounds: when agents receive repeated corrections because they lack a governed vocabulary source, those corrections pile up in the context window, and performance degrades. The root cause is always the same: no upstream certified definition, so every session must re-teach the agent what the org means.
Why the model cannot solve this
This is not a model knowledge problem. No foundation model can know what “active user” means inside a specific org’s data stack. The definition exists in a Confluence doc, a disputed Slack thread, or an undocumented assumption in a dbt model. It is not in any training corpus.
a16z, citing MIT’s State of AI in Business 2025, found that most AI deployments fail due to lack of contextual learning and misalignment with day-to-day operations. The a16z framing is precise: data agents need context, not a better model, not a larger context window, but org-specific semantic meaning made machine-readable. The fix requires org-specific context governance.
You can read more about how AI agent governance fails when it operates only at the agent orchestration layer, without addressing the ungoverned data layer underneath.
Where business context lives, and why it’s hard to access
Business context lives across four governed metadata layers: business glossaries (certified definitions), ownership records (who governs what), access policies (what agents can see), and institutional history (lineage and decision records). These layers exist in governed data infrastructure, not in the data itself, not in foundation models, and not in any MCP tool call.
Business glossaries
The primary home of org-specific definitions. A well-maintained business glossary holds certified, version-controlled definitions for every term that crosses team boundaries: “revenue,” “churn,” “active user,” “customer.” When agents query a glossary-backed metadata layer, they retrieve the org-certified meaning, not a model-inferred one.
Snowflake’s agent context layer research identifies the business glossary as a prerequisite for trustworthy data agents. Without it, agents are retrieving schema metadata, which tells them what data exists but not what it means.
Data ownership metadata
Every data asset has a human owner, someone accountable for its accuracy, governance, and updates. Ownership metadata tells agents not just what data means, but who governs it and who to escalate to when a definition is disputed. This is accountability infrastructure for AI. It turns the business glossary from a static document into a living governance artifact with clear chain of custody.
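The escalation path can be sketched as a simple check: if a term's definition is not certified, the agent routes the question to the accountable owner instead of answering. The ownership records here are illustrative assumptions.

```python
# A sketch of ownership metadata as an escalation path. Owners, terms,
# and statuses are illustrative, not from any real catalog.
OWNERS = {
    "revenue": {"owner": "Finance Data Team", "status": "certified"},
    "active_user": {"owner": "Growth Analytics", "status": "disputed"},
}

def needs_escalation(term: str) -> bool:
    # Escalate when the term is unknown or its definition is not certified.
    record = OWNERS.get(term)
    return record is None or record["status"] != "certified"

def owner_of(term: str):
    # Who the agent should name when escalating; None if unmapped.
    record = OWNERS.get(term)
    return record["owner"] if record else None
```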
Access policies
Agents must inherit the same access controls that govern human analysts. Access policies encoded in the metadata layer, organized by domain, role, and sensitivity classification, ensure agents cannot surface restricted data in unrestricted contexts. Without this, an agent acting on behalf of a junior analyst could inadvertently return executive compensation data or customer PII.
Institutional history
Why was this metric defined this way? When did the fiscal year definition change? Why was this data source deprecated? Lineage records and historical decision logs are the institutional memory that prevents agents from making historically uninformed decisions.
Teams govern current definitions but rarely capture the decisions that led to them. This is what Atlan’s MCP server exposes to AI agents: not just what data means now, but why it means that, and when it changed.
| Dimension | AI without business context | AI with business context |
|---|---|---|
| Term resolution | Model-inferred (generic) | Org-certified (glossary-backed) |
| Ownership visibility | None | Data owner mapped per asset |
| Access enforcement | Not enforced | Policy-inherited from metadata layer |
| Historical decisions | Unknown | Lineage and decision log available |
| Failure mode | Systematic wrong answers | Governed, auditable outputs |
Why business context is the hardest context to encode
Business context is harder to encode than technical metadata or conversational context because it is simultaneously org-specific (no external source has it), historically accumulated (it cannot be derived from current data alone), constantly evolving (definitions drift, ownership changes), and dependent on human curation (no crawler or LLM can extract it automatically).
Gartner predicts 40% of agentic AI projects will fail by 2027 because legacy systems cannot support modern AI execution demands, and the data layer is the primary legacy constraint. More than 70% of AI project failures trace back to data problems rather than algorithmic shortcomings, a finding consistent across Gartner, Deloitte, and McKinsey research.
It’s organization-specific: no foundation model has it
Foundation models are trained on publicly available text. Org-specific definitions live in internal governance documents, Slack threads, and tribal knowledge. There is no external source that contains what “active user” means in your product analytics schema.
This is the moat. Organizations that encode business context into a governed metadata layer create an AI accuracy advantage that cannot be replicated by switching models. A competitor with the same foundation model but a governed context layer will consistently outperform a team running a more sophisticated model against ungoverned data.
It accumulates historically: not derivable from current data alone
Business context is the residue of past decisions. A metric defined five years ago to serve a different business model still shapes how data is structured today. Agents operating without that history repeat the mistakes the org already learned from, including deprecated data sources, corrected definitions, and schema changes that left artifacts behind.
It requires ongoing governance: definitions drift
Definitions change as the business changes. A “customer” definition that excluded trials last year may include them now. Without active governance, business context decays, and agent outputs decay with it. This is why business context cannot be treated as a one-time setup project. It is operational infrastructure that requires the same maintenance cadence as any other data pipeline.
It requires human curation: not auto-extractable
No crawler, embedder, or LLM can reliably extract certified business definitions from raw data. Someone has to decide: this is what “revenue” means here, this is who governs it, this is when it changed. That curation is irreplaceable. It is also, in most organizations, the work that has never been systematically done. This is why the enterprise context layer is the missing piece in almost every AI readiness conversation.
How to give AI systems business context
Giving AI systems business context requires building and governing a semantic metadata layer that agents can query before acting. This means inventorying your business glossary, assigning domain ownership, encoding access policies, connecting the metadata layer to the AI pipeline, and maintaining currency as definitions evolve. Without active governance, business context decays faster than agents can learn.
Prerequisites checklist
Before starting:
- [ ] Identify which business terms cross team boundaries and create definitional ambiguity
- [ ] Confirm whether a business glossary exists and who owns it
- [ ] Map data domain ownership across your org (Finance, Marketing, Product, etc.)
- [ ] Determine how your AI pipeline will consume metadata (MCP, RAG, API)
Step 1: Inventory your business glossary
Audit which definitions exist, which are certified versus disputed, and which terms agents are most likely to query. Prioritize terms that appear across multiple systems with conflicting meanings: revenue, active user, customer, churn. For each term, document who owns the definition, when it was last certified, and which system is authoritative.
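The inventory pass in Step 1 can be automated in a first approximation: collect (term, system, definition) observations and flag any term whose definitions disagree across systems, so those get certified first. The sample data below is illustrative.

```python
from collections import defaultdict

# Illustrative observations gathered during a glossary inventory:
# (term, system, definition-as-documented-in-that-system).
observed = [
    ("revenue", "finance", "ARR"),
    ("revenue", "warehouse", "gross bookings"),
    ("churn", "finance", "logo churn"),
    ("churn", "crm", "logo churn"),
]

def conflicting_terms(rows):
    # A term conflicts when different systems document different meanings.
    defs = defaultdict(set)
    for term, _system, definition in rows:
        defs[term].add(definition)
    return sorted(term for term, meanings in defs.items() if len(meanings) > 1)
```

On this sample, "revenue" is flagged (two meanings) while "churn" is not (both systems agree), which matches the prioritization rule above.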
Step 2: Assign data ownership across domains
For every major data asset, identify a human owner accountable for its definition and accuracy. This is the governance layer agents will inherit. Without it, definitions are technically present but practically orphaned. No one can certify them as current, and agents have no way to escalate when a definition is disputed.
Step 3: Encode access policies that agents will inherit
Map access rules to the metadata layer. Agents should receive the same access restrictions as the roles they are acting on behalf of, not unrestricted access to the full data warehouse. Role-based context inheritance is not optional. It is the mechanism that prevents agents from producing outputs that violate data governance policies.
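Role-based inheritance reduces to a check the agent runs before touching an asset: it carries the role of the human it acts for, and the metadata layer enforces that role's access. The policy records and role names below are illustrative assumptions; a default-deny posture for unclassified assets is one reasonable design choice.

```python
# A sketch of role-inherited access checks against policies stored in
# the metadata layer. Assets, roles, and rules are illustrative.
POLICIES = {
    "exec_compensation": {"allowed_roles": {"hr_admin", "executive"}},
    "product_events": {"allowed_roles": {"junior_analyst", "product_analyst"}},
}

def agent_can_read(asset: str, acting_role: str) -> bool:
    # The agent inherits the role of the human it acts on behalf of.
    policy = POLICIES.get(asset)
    if policy is None:
        return False  # default-deny for unclassified assets
    return acting_role in policy["allowed_roles"]
```

With this check in place, an agent acting for a junior analyst can read product events but is refused executive compensation data, exactly the failure the section above warns about.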
Step 4: Connect your metadata layer to the AI pipeline
Expose business context via MCP, RAG retrieval, or API. The goal: when an agent resolves a term, it queries the governed metadata layer first, before falling back to model-inferred meaning. The MCP-connected data catalog pattern is the most direct implementation: agents issue a tool call to the catalog, retrieve the certified definition, and use it to ground their response.
Snowflake’s agent context layer research confirms that retrieval architecture alone does not solve the business context problem. What agents retrieve matters as much as how they retrieve it. RAG on raw data returns data without meaning. RAG on a governed metadata layer returns meaning without ambiguity.
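The "metadata layer first" resolution order from Step 4 can be sketched as a two-stage lookup. Both functions here are stubs: `catalog_lookup` stands in for an MCP tool call or catalog API, and `model_inferred` stands in for the LLM's default meaning; neither is a real integration.

```python
# A sketch of catalog-first term resolution with model fallback.
# catalog_lookup is a stub for an MCP/API call; model_inferred is a
# stub for the LLM's generic meaning. All names are hypothetical.
def catalog_lookup(term: str):
    certified = {"revenue": "ARR only; excludes professional services"}
    return certified.get(term)  # None when no certified definition exists

def model_inferred(term: str) -> str:
    return f"generic meaning of '{term}'"

def resolve(term: str):
    # Governed layer first; model default only as a last resort, with
    # the provenance of the answer recorded alongside it.
    definition = catalog_lookup(term)
    if definition is not None:
        return definition, "catalog"
    return model_inferred(term), "model"
```

Returning the provenance ("catalog" vs "model") alongside the definition is the piece that makes outputs auditable: downstream consumers can see whether an answer was grounded in a certified definition or a guess.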
Step 5: Maintain currency, because business context decays
Build a review cadence for definitions. Assign expiry dates to certifications. Flag assets whose ownership has changed. Business context that is not actively maintained becomes a liability rather than an asset. The failure mode is the same as no context at all: agents resolving terms against stale definitions produce confidently wrong outputs at scale.
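One minimal way to operationalize the review cadence is to attach an expiry date to each certification and flag anything past it, so stale definitions are surfaced for review instead of served as certified. The terms and dates are illustrative.

```python
from datetime import date

# Illustrative certification expiry dates per glossary term.
CERTIFICATIONS = {
    "revenue": date(2026, 1, 1),
    "active_user": date(2024, 6, 1),
}

def stale_terms(today: date):
    # Terms whose certification has lapsed and needs re-review.
    return sorted(term for term, expiry in CERTIFICATIONS.items() if expiry < today)
```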
Common pitfalls
- Using RAG on raw data without a business context layer: retrieval returns data without meaning
- Treating business context setup as a one-time project: definitions drift, ownership changes
- Encoding context at the prompt level (hardcoded definitions): this does not scale and breaks when terms change
- Conflating technical metadata (schemas, data types) with business context (what the data means)
The CIO’s guide to context graphs covers the full implementation architecture in detail, including how business context connects to the broader context stack teams are building for production AI.
How Atlan encodes business context for AI
Atlan is the governed business context layer for AI agents. Its business glossary, certifications, ownership metadata, lineage graph, and active metadata, all exposed via MCP, give agents access to org-specific semantic meaning that no foundation model contains. When an agent asks “what does ‘customer’ mean here?”, the answer lives in Atlan.
Most organizations have data pipelines, data warehouses, and AI tooling, but no governed business context layer that AI can actually consume. MCP wires tools to agents. RAG retrieves documents. Neither resolves the fundamental question: what does this term mean in this org, certified by the person accountable for it?
| Capability | What it provides |
|---|---|
| Business glossary | Certified, org-specific term definitions, version-controlled and domain-linked |
| Certifications | Trust signals on which definition or asset is authoritative for agent queries |
| Ownership metadata | Accountability chain per data asset; agents know who governs what |
| Domain classifications | Context organized by business domain (Finance, Marketing, Product), not just schema |
| Data lineage | Why a metric exists, where it came from, what transformed it; institutional history made machine-readable |
| Active metadata | Context that updates live as business conditions and definitions change |
| MCP exposure | All of the above accessible to AI agents via Atlan’s MCP server |
Agents that query Atlan’s metadata layer resolve terms against org-certified definitions, not model-inferred guesses. The result: agents that operate on the same shared vocabulary as your analysts, producing outputs that are auditable, governed, and trustworthy.
This is the most direct expression of the cohort conviction: teams building AI agents are solving the context problem at the wrong layer. Wiring MCP to an ungoverned data catalog does not create context-aware agents. Atlan is the governed context layer that makes the rest of the stack actually work in production.
Real stories from real customers: shared vocabulary as AI infrastructure
"We're excited to build the future of AI governance with Atlan. All of the work that we did to get to a shared language at Workday can be leveraged by AI via Atlan's MCP server...as part of Atlan's AI Labs, we're co-building the semantic layer that AI needs with new constructs, like context products."
-- Joe DosSantos, VP of Enterprise Data & Analytics, Workday
"Context is the differentiator. Atlan gave our teams the shared vocabulary and lineage to move from reactive data management to proactive AI enablement across CME Group."
-- Kiran Panja, Managing Director, Data & Analytics, CME Group
Agents that lack business context don’t just hallucinate. They fail systematically.
The distinction matters: systematic failure is worse than random hallucination. Random hallucination is detectable; outputs look wrong, sound wrong, or contradict verifiable facts. Systematic failure from missing business context looks right. The agent returns a number, a definition, a recommendation, and it is consistently, confidently wrong in the same direction, because it is using the wrong vocabulary.
This is the core problem that governed metadata layers exist to solve. Organizations that encode business context (certified definitions, ownership records, access policies, institutional history) into a layer that agents can query before acting are not just improving AI accuracy. They are making AI trustworthy at the enterprise scale where accuracy and auditability are both required.
The enterprise context layer is that infrastructure. Business context is the most org-specific, irreplaceable component of it. And Atlan is the governed layer that makes business context machine-readable for the first time.
FAQs about business context for AI
1. What is business context in AI?
Business context in AI is the layer of org-specific knowledge that determines what data means inside a given organization, including certified term definitions, data ownership, access policies, and institutional history. It is distinct from technical metadata (schemas, data types) and general AI knowledge (what foundation models know). Without it, agents operate on generic vocabulary that does not match the org’s data stack.
2. Why do AI agents give wrong answers in enterprise settings?
Enterprise AI agents most often give wrong answers not because the model is unintelligent, but because they are using the wrong definition of a business term. When “revenue” means three different things across three different systems, an agent that picks the wrong one produces confident, systematically wrong outputs. This is the three-definition problem: a vocabulary issue, not a model capability issue.
3. What is the difference between business context and data?
Data is the raw content: rows, values, schemas, tables. Business context is the meaning layered on top, including what a field represents in this org, who certified that definition, what policies govern access, and what decisions shaped how the data was collected. Data without business context is factually present but semantically ambiguous. AI agents need both.
4. How does a business glossary help AI agents?
A business glossary gives AI agents a certified, org-specific vocabulary to query before acting. When an agent resolves a term like “active user,” it can retrieve the org’s certified definition (daily login versus 30-day session) rather than inferring from the data or defaulting to a model-learned approximation. This prevents systematic definition errors from propagating through downstream outputs.
5. Why does enterprise AI hallucinate on business questions?
Enterprise AI hallucination on business questions is usually misdiagnosed. The agent is not fabricating information. It is using a plausible but incorrect definition of an org-specific term. Because the correct definition is not encoded in any accessible, governed layer, the agent defaults to model-inferred meaning. The fix is not a better model; it is a governed business context layer the agent can query.
6. How do you encode institutional knowledge for AI?
Encoding institutional knowledge for AI requires building a governed metadata layer that captures it explicitly: a business glossary for term definitions, ownership records for accountability, lineage graphs for historical decisions, and certification records for trust signals. This context must then be exposed to the AI pipeline via MCP, RAG retrieval, or API so agents can query it before acting on ambiguous terms.
7. What is a context layer for AI agents?
A context layer for AI agents is the infrastructure that provides the information an agent needs to act correctly, beyond the raw data it is querying. Business context layers specifically encode org-level semantic meaning: what terms mean, who governs data, what agents can access, and why certain decisions were made. Without a context layer, agents resolve ambiguity using model defaults, which are rarely org-specific.
8. What role does a data catalog play in AI decision-making?
A governed data catalog is the primary home of business context for AI. It stores certified definitions, ownership metadata, lineage records, and access policies in a machine-readable format. When connected to AI pipelines via MCP or API, the catalog becomes the context layer agents query to resolve terms before acting, replacing model-inferred guesses with org-governed answers.
Sources
- Lack of AI-Ready Data Puts AI Projects at Risk, Gartner (February 2025)
- Gartner Predicts 40% of Enterprise Apps Will Feature Task-Specific AI Agents by 2026 (August 2025)
- Your Data Agents Need Context, a16z (March 2026)
- Enterprise Context Graph: Why AI Fails Without Business Context, Publicis Sapient
- The State of AI in 2025: Agents, Innovation, and Transformation, McKinsey Global Survey
- Why AI Agents Keep Failing: The Missing Context Problem, Martin Nevicky (November 2025)
- The Agent Context Layer for Trustworthy Data Agents, Snowflake
- Why Enterprise AI Can’t Understand Your Data, Data Science Collective
- Why 95% of AI Projects Fail, SR Analytics (citing Gartner, Deloitte, McKinsey)