How to Build an AI-Ready Semantic Layer

by Heather Devane, Lead Content Strategist, Atlan | Last Updated on: November 27th, 2025 | 10 min read

You’ve finally built an AI agent to help with quarterly reporting, fed with your highest-quality, most carefully curated data. It’ll save you massive amounts of time pulling reports and, best of all, you’ll be able to drill down on the data and get faster, more in-depth analysis. It’s the holy grail.

But then you ask the agent a simple question – “What was our ARR this quarter?” – and it immediately breaks.

This is what happened to the team at Workday. During his keynote at Atlan Re:Govern, VP of Enterprise Data and Analytics Joe DosSantos described how Workday’s financial reporting agent couldn’t answer a single question it was asked, no matter how straightforward.

“We started to realize that we were missing this translation layer,” he said. “We had no way to interpret the human language against the structure of the data.”

Workday isn’t alone. Research from MIT found that only 5% of AI pilots make it to production. That’s because AI has a translation problem: AI systems and applications understand language, but not what it means for your business. They’re missing critical context. And worst of all, AI doesn’t know how to say “I don’t know.”

This isn’t a new realization – we know that AI misunderstands, hallucinates, and confidently gives wrong answers. But it’s becoming a more glaring issue as organizations try to scale AI and quickly run into its shortcomings.

The solution? Context and clarity, baked into platforms in the form of an AI-ready semantic layer. This isn’t your traditional semantic layer – it’s unified, machine-readable, and at the intersection of all your tools. Here are our pressure-tested steps for building it.


From BI to AI: The New Semantic Layer #

For decades, we’ve been building semantic layers to help humans understand data. Now that we’re asking AI to understand data, it’s time for a new approach to the semantic layer.

Think of it this way:

[Figure: From BI to AI: The New Semantic Layer. Source: Atlan.]

Ultimately, AI exists to provide answers, not ask for clarification. That’s the shift we need to make.


The Anatomy of an AI-Ready Semantic Layer #

The gap between BI-era and AI-era semantic layers is architectural, not incremental. To translate language into business meaning, an AI-ready semantic layer needs:

Machine-Readable Context #


Static documentation is out. Machine-readable formats are in. In the AI era, context needs to cater to machines, not just to humans.

“It’s no longer about documenting glossary terms in a tool. It’s about building a machine-readable layer of meaning,” said Joe DosSantos at Re:Govern. “We’re talking about YAML files, complex mappings, [and] connecting different AI agents and platforms.”

Why it matters: Without machine-readable context, you’re stuck manually updating context for every new AI use case. That eats up time and resources that should be spent training, refining, and deploying models.
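As a rough illustration of what "machine-readable" can mean in practice, the sketch below stores one definition as structured data and serializes it so an agent can consume it. The class `MetricDefinition`, its field names, and the example values are assumptions made for illustration, not a standard schema. JSON is used only because it ships with Python; the same structure could just as well live in a YAML file.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class MetricDefinition:
    """One business term in a form a program can parse (hypothetical schema)."""
    name: str
    description: str
    source_table: str
    expression: str
    owner: str


# Example values are illustrative only.
mau = MetricDefinition(
    name="monthly_active_users",
    description="Users who logged in at least once in the past 30 days.",
    source_table="user_events.logins",
    expression="COUNT(DISTINCT user_id)",
    owner="product-analytics",
)


def to_context(metric: MetricDefinition) -> str:
    """Serialize a definition so it can be handed to an agent as context."""
    return json.dumps(asdict(metric), indent=2, sort_keys=True)


def from_context(payload: str) -> MetricDefinition:
    """Parse the serialized form back into a typed definition."""
    return MetricDefinition(**json.loads(payload))


if __name__ == "__main__":
    print(to_context(mau))
```

Because the definition round-trips through a plain text format, the same file can feed several agents without anyone re-typing the context for each use case.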

Universal Semantics #


While BI semantic layers live inside specific tools like Tableau and Looker, AI semantic layers should not only sit in the middle of all tools, but also be universal. This allows the entire enterprise to speak the same language with understanding and consistency.

“Semantics shouldn’t be for a particular use case. It just exists for everyone,” said Joe DosSantos. “[Definitions] have to sit at the crossroads of all of this in the semantic layer so that contextual meaning can be understood by everyone who’s calling it from different tools.”

Why it matters: Without universal semantics, you end up with 900 agents that each have their own bespoke semantic layer. Unifying their definitions and understanding improves their consistency and business value.

Clarity and Context by Design #


When context is ambiguous or treated as an afterthought, AI is set up to fail. If humans can’t understand what a term means or how systems work, how is AI supposed to? Clarifying and embedding context from the start is the only way to future-proof your semantic layer.

“At Mastercard, we’ve learned you can’t just bolt things on at the end. We’ve learned that from privacy by design and now we’re going to make sure we do that with context by design,” said Andrew Reiskind, Chief Data Officer at Mastercard, during his Re:Govern keynote. “You have to build it in from the get-go because when you do that, you can not just keep up with AI, but you can build trust in your journey.”

Why it matters: Without being crystal clear about what you’re asking of AI and embedding the context it needs to answer correctly, you’ll be constantly fixing hallucinations instead of preventing them systemically.

Human-in-the-Loop Feedback #


Context isn’t a one-time setup; it’s constantly evolving. But it needs human input in order to improve and self-heal. Every interaction between a human and an AI agent is a chance to guide context, resolve ambiguity and errors, and increase shared understanding.

“You always have to have people to understand the business, the human in the loop… who is going to be the one who will be coming up with the business process and validating which technology will be there,” said Sridher Arumugham, Chief Data and Analytics Officer at DigiKey, during Re:Govern.

Why it matters: While BI-era semantic layers operated with humans in the loop, that feedback is even more necessary when AI is acting on its own. Without it, your semantic layer may fail to keep up with the pace of business, leaving AI behind.


The AI-Ready Semantic Layer Build Process #

The next generation of semantic layers needs to be reimagined for AI. Playbooks that worked for BI will be increasingly irrelevant as organizations move definitions close to the data, instead of leaving them locked in a BI tool. Here’s what the new process looks like.

1. Audit Your Business Definitions #


Before you can build an AI-ready semantic layer, you need to understand what you already have – and what you don’t.

Start by asking: how many semantic layers do you actually have? The answer is probably more than you think:

  • Tableau’s calculated fields define metrics one way, Looker defines them another, and Power BI has its own DAX formulas.
  • Glossaries may exist in data catalogs, but business logic is embedded in dbt transformations and rules are encoded in SQL views.
  • Tribal knowledge lives in Slack threads, emails, and people’s heads.

Mapping all of it can help uncover inconsistencies and avoid the situation DigiKey’s data team found itself in during the Shanghai supply chain shutdown, when they discovered that “container,” “order,” and “port call” all had different definitions in different systems.

“Each team had part of the picture, but no one had the whole view,” said Sridher Arumugham during his Re:Govern keynote. “The reason was not lack of data. It was a lack of a shared meaning.”

To start, create a spreadsheet with three columns – “term,” “where it lives,” and “how it’s defined” – and fill it in with your top 20 business-critical terms. You’ll quickly see where conflicts exist.
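Once the spreadsheet exists, spotting conflicts can be automated. The sketch below is one possible approach, assuming the three columns have been exported as rows of (term, where it lives, how it's defined). The sample rows and the function name `find_conflicts` are made up for illustration.

```python
from collections import defaultdict

# Each row mirrors the three audit columns: term, where it lives, how it's defined.
# The rows themselves are made-up examples.
audit_rows = [
    ("ARR", "Tableau calculated field", "sum of active subscription value x 12"),
    ("ARR", "dbt model", "sum of contracted annual value, excluding trials"),
    ("Customer", "Data catalog glossary", "account with a signed contract"),
    ("Customer", "SQL view", "Account with a signed  contract"),
]


def find_conflicts(rows):
    """Return terms with more than one distinct definition, with their locations."""
    by_term = defaultdict(lambda: defaultdict(list))
    for term, location, definition in rows:
        # Normalize case and whitespace so trivial differences are not flagged.
        key = " ".join(definition.lower().split())
        by_term[term][key].append(location)
    return {
        term: dict(definitions)
        for term, definitions in by_term.items()
        if len(definitions) > 1
    }


if __name__ == "__main__":
    for term, definitions in find_conflicts(audit_rows).items():
        print(f"{term}: {len(definitions)} competing definitions")
        for definition, locations in definitions.items():
            print(f"  - {definition!r} in {', '.join(locations)}")
```

Exact-match comparison only catches definitions that differ in wording; two differently worded definitions that mean the same thing still need a human to reconcile them.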

2. Align Definitions and Pressure Test with Risk-Averse Teams #


With the lay of the land in hand, you can begin getting alignment on definitions. This part isn’t technical – it’s coordination across business stakeholders.

Your definitions need to capture not just “what,” but “when,” “how,” and “except when.” For instance:

  • When does a lead become a customer?
  • How do you calculate tenure: from hire date, start date, or first contribution?
  • What are the edge cases where we need to consider an exception?

Approaching it this way will help you answer the next question: how do I train my AI on this knowledge? Explicit, unambiguous, context-rich definitions are what let you train AI with confidence that its answers will be accurate.

It’s a good idea to pressure test those definitions with risk-averse teams, like Finance or Compliance, that can’t afford ambiguity.

“We are trying to get the data and analytics to the people in a way that is easy to consume…. What you’re trying to do is to find those people that need trustworthiness,” said Joe DosSantos of Workday during Re:Govern. “Generally speaking, those people can be found in places like finance…. So that’s where we went first.”

To start, host a 90-minute working session with Finance and one other risk-averse team. Pick five critical terms and ask the teams to answer:

  • What does it mean?
  • When does it apply?
  • What are the exceptions?
  • Who owns the definition?

This way, you get everyone on the same page so that definitions are airtight. The semantic layer goes from being siloed and disjointed to cohesive and reliable.
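To keep track of which of the four questions are still open after the session, the answers can be captured in a small structure and checked for gaps. This is a minimal sketch; `TermDefinition`, `open_questions`, and the sample answers are hypothetical and not part of any particular tool.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TermDefinition:
    """Answers to the four working-session questions for one term."""
    term: str
    meaning: str = ""
    applies_when: str = ""
    # None means "not discussed yet"; an empty list means "agreed: no exceptions".
    exceptions: Optional[List[str]] = None
    owner: str = ""


def open_questions(definition: TermDefinition) -> List[str]:
    """Return the questions that still have no answer for this term."""
    gaps = []
    if not definition.meaning.strip():
        gaps.append("What does it mean?")
    if not definition.applies_when.strip():
        gaps.append("When does it apply?")
    if definition.exceptions is None:
        gaps.append("What are the exceptions?")
    if not definition.owner.strip():
        gaps.append("Who owns the definition?")
    return gaps


# Made-up draft: two questions answered, two still open.
draft = TermDefinition(
    term="customer",
    meaning="An account with a signed contract.",
    applies_when="From the contract signature date.",
)

if __name__ == "__main__":
    print(open_questions(draft))
```

Distinguishing "no exceptions" from "exceptions not yet discussed" is a deliberate choice here, since the second case is exactly the ambiguity the working session is meant to remove.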

3. Build the Translation Layer #


The next step is where the rubber meets the road – where AI starts to make sense of business context. It’s the difference between spitting out the first answer it can find and having a context-based conversation.

Encode Business Terms into Schemas

Start by encoding business terms into executable schemas. If your definition of “Monthly Active User” is “a user who has logged into the platform at least once in the past 30 days,” then your AI needs to know:

  • Which tables to reference for login events (e.g. user_events.logins)
  • What constitutes a valid login (e.g. event_type = 'successful_login' AND session_duration > 0)
  • What is considered “past 30 days” (e.g. CURRENT_DATE - INTERVAL '30 days')
  • Whether you’re counting distinct users or total logins (e.g. COUNT(DISTINCT user_id))
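Put together, the four points above amount to a single query. The sketch below runs it against an in-memory SQLite database so it can be tried anywhere. Several details are assumptions: SQLite has no `user_events` schema, so the table is simply called `logins`; the column `event_date` is invented; the date arithmetic uses SQLite's `date()` function rather than the `INTERVAL` syntax shown above; and whether the 30-day boundary is inclusive is a choice your definition would need to state.

```python
import sqlite3

# SQLite stand-in for the definition above (table and column names are assumptions).
MAU_QUERY = """
SELECT COUNT(DISTINCT user_id)
FROM logins
WHERE event_type = 'successful_login'
  AND session_duration > 0
  AND event_date > date(:as_of, '-30 days')
  AND event_date <= :as_of
"""


def monthly_active_users(conn: sqlite3.Connection, as_of: str) -> int:
    """Count distinct users with a valid login in the 30 days up to `as_of`."""
    return conn.execute(MAU_QUERY, {"as_of": as_of}).fetchone()[0]


def demo_connection() -> sqlite3.Connection:
    """Build a tiny in-memory dataset that exercises each rule."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE logins (user_id INTEGER, event_type TEXT, "
        "session_duration INTEGER, event_date TEXT)"
    )
    conn.executemany(
        "INSERT INTO logins VALUES (?, ?, ?, ?)",
        [
            (1, "successful_login", 120, "2025-03-20"),
            (1, "successful_login", 45, "2025-03-25"),   # same user, counted once
            (2, "failed_login", 0, "2025-03-22"),        # not a valid login
            (3, "successful_login", 0, "2025-03-28"),    # zero-length session
            (4, "successful_login", 300, "2025-01-15"),  # outside the window
            (5, "successful_login", 60, "2025-03-05"),
        ],
    )
    return conn


if __name__ == "__main__":
    print(monthly_active_users(demo_connection(), "2025-03-31"))
```

Passing the reference date as a parameter instead of relying on the current date keeps the result reproducible, which helps when you test a definition against known data.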

Mapping the relationships across different entities and creating rules for conditional logic safeguards against AI guesswork. For instance, to restrict a set of rules to enterprise customers only, you might encode this decision rule:

IF customer_type = 'enterprise' AND contract_value > 100000 THEN apply enterprise_revenue_recognition_rules ELSE apply standard_rules.
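The rule above translates almost line for line into code. The following is a minimal sketch: the `Customer` class and the function name are hypothetical, and the rule-set names are returned as plain strings rather than wired to any real revenue logic.

```python
from dataclasses import dataclass

ENTERPRISE_CONTRACT_THRESHOLD = 100_000  # threshold taken from the rule above


@dataclass(frozen=True)
class Customer:
    customer_type: str
    contract_value: float


def revenue_rule_set(customer: Customer) -> str:
    """Return the name of the rule set that applies to this customer."""
    if (
        customer.customer_type == "enterprise"
        and customer.contract_value > ENTERPRISE_CONTRACT_THRESHOLD
    ):
        return "enterprise_revenue_recognition_rules"
    return "standard_rules"


if __name__ == "__main__":
    print(revenue_rule_set(Customer("enterprise", 250_000)))
    print(revenue_rule_set(Customer("enterprise", 100_000)))
```

Note that a contract worth exactly 100000 falls through to the standard rules because the comparison is strict. Boundary cases like this are the kind of "except when" detail worth confirming with the definition's owner.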

Implement Disambiguation Workflows #


Next, implement disambiguation workflows meant to train AI to say “I don’t know” instead of confidently hallucinating. When you’re starting out, this may mean building guided prompts that align with your semantic layer, rather than letting users type freeform queries.

Putting guardrails on prompts may look like embedding:

  • An autocomplete feature that suggests valid entity names as users type
  • Dropdown menus for approved filters
  • Template-based query builders where users fill in blanks (e.g. “Show me [METRIC] for [DIMENSION] where [FILTER] between [DATE_RANGE]”)
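The three guardrails above can share one approved vocabulary. The sketch below assumes that vocabulary is a small dictionary (in practice it would presumably be loaded from the semantic layer). The names `APPROVED`, `suggest`, and `build_request`, along with the sample terms, are invented for illustration.

```python
from datetime import date

# Hypothetical approved vocabulary for each slot in the template.
APPROVED = {
    "metric": ["arr", "monthly_active_users"],
    "dimension": ["customer_type", "region"],
    "filter": ["all_customers", "enterprise_only"],
}


def suggest(slot: str, prefix: str) -> list:
    """Autocomplete: approved names for a slot that start with what was typed."""
    return [name for name in APPROVED[slot] if name.startswith(prefix.lower())]


def build_request(metric: str, dimension: str, filter_name: str,
                  start: str, end: str) -> dict:
    """Fill the template, rejecting anything outside the approved vocabulary."""
    for slot, value in (("metric", metric), ("dimension", dimension),
                        ("filter", filter_name)):
        if value not in APPROVED[slot]:
            raise ValueError(
                f"Unknown {slot} {value!r}; choose one of {APPROVED[slot]}"
            )
    start_date, end_date = date.fromisoformat(start), date.fromisoformat(end)
    if start_date > end_date:
        raise ValueError("DATE_RANGE start must not be after its end")
    return {
        "metric": metric,
        "dimension": dimension,
        "filter": filter_name,
        "date_range": (start_date, end_date),
    }


if __name__ == "__main__":
    print(suggest("metric", "mo"))
    print(build_request("arr", "region", "enterprise_only",
                        "2025-01-01", "2025-03-31"))
```

Rejecting an unknown term with a list of valid choices turns a would-be hallucination into a prompt the user can correct.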

The key to moving from a single-shot, potentially inaccurate answer to a back-and-forth like the example below isn’t just better AI training – it’s better guidance on how to ask the right questions.

[Figure: Implement Disambiguation Workflows. Source: Atlan.]

This is where AI goes from being a tool to being a partner.
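One way to picture the disambiguation step described in this section: before answering, the agent looks the user's phrase up in the governed glossary and either proceeds, asks which meaning is intended, or admits it doesn't know. The glossary contents and the names `GLOSSARY`, `Resolution`, and `resolve` below are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical glossary: one user-facing phrase may map to several governed terms.
GLOSSARY = {
    "arr": ["arr_contracted", "arr_recognized"],
    "mau": ["monthly_active_users"],
}


@dataclass(frozen=True)
class Resolution:
    status: str            # "resolved", "ambiguous", or "unknown"
    term: Optional[str]    # set only when status == "resolved"
    message: str           # what the agent says back to the user


def resolve(phrase: str) -> Resolution:
    """Map a user phrase to a governed term, or ask instead of guessing."""
    candidates = GLOSSARY.get(phrase.strip().lower(), [])
    if not candidates:
        return Resolution(
            "unknown", None,
            f"I don't know what {phrase!r} means here. Can you point me to a definition?",
        )
    if len(candidates) > 1:
        options = " or ".join(candidates)
        return Resolution(
            "ambiguous", None,
            f"{phrase!r} could mean {options}. Which one do you want?",
        )
    return Resolution("resolved", candidates[0], f"Using {candidates[0]}.")


if __name__ == "__main__":
    for phrase in ("ARR", "MAU", "churn"):
        print(resolve(phrase).message)
```

Only the "resolved" branch should go on to query data. The other two branches return a question to the user, which is what makes the exchange a conversation rather than a guess.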

Incorporate Feedback Loops #


To further improve upon the AI-ready semantic layer you’ve built, incorporate the human-in-the-loop feedback we laid out in the previous section. Every time AI answers a question, give users the ability to flag incorrect results, suggest better phrasing for definitions, or report missing context. These signals will feed directly back into your translation layer.
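A feedback loop needs somewhere to collect those signals and a way to surface which definitions are drawing the most complaints. The sketch below is one simple shape this could take. The three feedback kinds mirror the options listed above, while the class names and the review threshold are assumptions.

```python
from collections import Counter
from dataclasses import dataclass

# The three kinds of signal mentioned above.
FEEDBACK_KINDS = ("incorrect_result", "better_phrasing", "missing_context")


@dataclass(frozen=True)
class Feedback:
    term: str
    kind: str
    note: str = ""


class FeedbackLog:
    """Collects user flags and reports which terms need a human review."""

    def __init__(self) -> None:
        self._items = []

    def record(self, feedback: Feedback) -> None:
        if feedback.kind not in FEEDBACK_KINDS:
            raise ValueError(f"Unknown feedback kind: {feedback.kind!r}")
        self._items.append(feedback)

    def terms_needing_review(self, threshold: int = 2) -> list:
        """Terms flagged at least `threshold` times, most-flagged first."""
        counts = Counter(item.term for item in self._items)
        return [term for term, n in counts.most_common() if n >= threshold]


if __name__ == "__main__":
    log = FeedbackLog()
    log.record(Feedback("arr", "incorrect_result", "excluded renewals"))
    log.record(Feedback("arr", "missing_context"))
    log.record(Feedback("mau", "better_phrasing"))
    print(log.terms_needing_review())
```

The output is a review queue for the definition owners, not an automatic rewrite: a person still decides how the definition changes.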


The New Era of AI-Ready Semantic Layers #

The semantic layers we spent years perfecting for BI aren’t broken; they’re just built for a different era. But AI now dictates how we work, and it doesn’t work the way BI-era semantic layers assume.

AI needs unambiguous definitions, machine-readable structures, and the ability to say “I don’t know.” It needs semantics that work universally across all systems rather than staying locked inside individual BI tools. And it needs context that evolves continuously, not documentation that updates whenever teams get around to it.

That means building the foundational context layer that teaches AI how your business actually thinks and works. Start with risk-averse domains, encode your most critical definitions first, and create a translation layer that turns ambiguity into clarity.

These elements working together will be the difference between organizations that stay stuck in the AI pilot phase and those that successfully make it to production.

