Atlan + AWS: Bringing Data, Governance, and Semantic Context into Amazon SageMaker Unified Studio
Last updated: December 3, 2025
While most enterprises are accelerating their GenAI initiatives, they are encountering a consistent bottleneck on the path from prototype to production. The challenge isn’t model quality, compute capacity, or foundation models — it’s the lack of complete, consistent metadata and governance context around the data powering those systems.
For AI to be trustworthy and usable at scale, builders need clear visibility into where data came from, how it was transformed, what it means, who owns it, and which governance rules apply. Without unified metadata and governance, this critical context is scattered across tools and teams — creating the gap that prevents GenAI systems from performing reliably in real-world environments.
This fragmentation creates what many organizations call the context gap. Closing it is an opportunity to strengthen the reliability and governance of AI systems: when AI builders inside AWS environments have access to complete, consistent enterprise context, teams can accelerate development and confidently deploy AI into production.
Today, we’re excited to announce a deep expansion of the Atlan + AWS partnership, including a new integration with Amazon SageMaker Unified Studio. This integration builds a bi-directional, real-time metadata sync between Atlan and Amazon SageMaker Unified Studio, ensuring that metadata remains consistent and standardized no matter which platform teams use.
With a single, authoritative source for definitions, lineage, ownership, and policy metadata, data stewards, data engineers, and AI builders can all work on top of the same trusted context — reducing fragmentation and enabling them to build and scale AI systems with greater confidence.
A Foundation Built Across the AWS Ecosystem
Atlan already provides native integrations with core AWS services such as S3, Glue, Athena, Redshift, EMR, MSK, and QuickSight. The new Amazon SageMaker Unified Studio integration extends this foundation into the ML workflow, enabling a single, standardized metadata layer.
Atlan + AWS Integrations: Context Delivered Across the AI Lifecycle
| Integration | Type of Context | What Syncs | Why It Matters for AI |
|---|---|---|---|
| Amazon S3 | Raw Data Context | Object metadata, schema patterns, partitions, tags, lifecycle details, usage patterns | Enables provenance tracking from raw files → features → models, ensuring training data is reliable and traceable. |
| AWS Glue | Transformation Context | Table definitions, crawlers, schema evolution, Glue jobs (via OpenLineage) | AI builders understand how data was created, transformed, and validated before entering ML workflows. |
| Amazon Athena | Query & Usage Context | Query logs, table usage, analytical flows | Identifies trusted datasets and analytical logic, helping ML teams select the right training inputs. |
| Amazon Redshift | Analytical Warehouse Context | Tables, columns, performance metadata, query lineage | Surfaces analytical assumptions baked into features and datasets used to train and evaluate models. |
| Amazon EMR | Processing & Pipeline Context | Job runs, Spark lineage, transformations | Allows traceability into distributed processing jobs that shape model-training datasets. |
| Amazon MWAA (Airflow) | Orchestration & Operational Context | DAGs, task runs, upstream/downstream pipelines (via OpenLineage) | Debugging and model monitoring become easier with full visibility into upstream pipeline health and dependencies. |
| Amazon MSK / Kafka | Streaming Context | Stream topics, producers/consumers, schema changes | Critical for real-time ML features and event-driven AI applications that rely on accurate, timely data. |
| Amazon DynamoDB | NoSQL Context | Table schemas, indexes, access vectors | Ensures that key-value and document store datasets used in AI workloads are well-understood and governed. |
| Amazon QuickSight | Consumption & Impact Context | Dashboards, linked datasets, usage patterns | Helps AI builders understand where model outputs influence business decisions. |
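The AWS Glue and Amazon MWAA rows above mention lineage arriving via OpenLineage. As a rough illustration of what such a lineage event carries, here is a minimal sketch that builds an OpenLineage-style run event as a plain Python dictionary. The job name, namespaces, dataset names, and producer URL are hypothetical placeholders, the `schemaURL` version is illustrative only, and this is not Atlan's or AWS's actual emitter code.

```python
import json
import uuid
from datetime import datetime, timezone


def build_run_event(job_name, inputs, outputs, event_type="COMPLETE"):
    """Build a minimal OpenLineage-style run event as a plain dict.

    `inputs` and `outputs` are lists of (namespace, name) tuples.
    """
    return {
        "eventType": event_type,
        "eventTime": datetime.now(timezone.utc).isoformat(),
        # Hypothetical producer URI identifying the emitting job.
        "producer": "https://example.com/hypothetical-glue-job",
        # Illustrative spec version; check the OpenLineage spec for the current one.
        "schemaURL": "https://openlineage.io/spec/1-0-5/OpenLineage.json#/definitions/RunEvent",
        "run": {"runId": str(uuid.uuid4())},
        "job": {"namespace": "example-namespace", "name": job_name},
        "inputs": [{"namespace": ns, "name": name} for ns, name in inputs],
        "outputs": [{"namespace": ns, "name": name} for ns, name in outputs],
    }


# Hypothetical job that reads raw S3 files and writes a cleaned table.
event = build_run_event(
    "daily_orders_etl",
    inputs=[("s3://raw-bucket", "orders/2025/")],
    outputs=[("example-catalog", "analytics.orders_clean")],
)
print(json.dumps(event, indent=2))
```

Each event names a job, a run, and the datasets read and written, which is the information a catalog needs to draw an edge from the raw files to the cleaned table.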
Creating Value Together for Nasdaq: Giving Builders Context at the Scale of 140 Billion Daily Events
Few organizations operate at the scale of Nasdaq. With 30 exchanges worldwide and 140 billion events processed every day, they run one of the most data-intensive environments on the planet — powered by AWS services like Redshift, S3, Glue, and QuickSight.
But even with a decade of AWS experience and a modern stack (dbt, Spark, Monte Carlo, QuickSight), Nasdaq faced the same challenge shared across every large enterprise: teams were drowning in data but starved for context. Power users spent a third of their time trying to understand lineage, definitions, and ownership. Business units often needed to ask four different teams just to find the right source or meaning behind a dataset.
With Atlan + AWS, Nasdaq unified the context behind their Redshift and QuickSight assets (meaning, lineage, ownership, usage, and quality) into a single, searchable experience for their users.
The result was a measurable shift in productivity and trust: discovery time dropped by one-third, producers and consumers finally aligned on definitions, business units became genuinely self-serve, and central teams regained capacity to focus on higher-value initiatives instead of answering repeated context questions.
Announcing Atlan’s Integration with Amazon SageMaker Unified Studio
Amazon SageMaker Unified Studio is AWS’s consolidated environment for analytics and AI, enabling teams to document and discover datasets, build pipelines, manage features, experiment, and deploy models. As organizations scale these workflows, they need metadata — business, technical, and governance — to remain consistent across every platform their teams use.
This integration builds a bi-directional, real-time metadata sync between Atlan and Amazon SageMaker Unified Studio, ensuring a single, standardized representation of business and governance context across both environments. Definitions, lineage, ownership, classifications, and policy metadata stay aligned, so business users, data stewards, engineers, and AI practitioners can all work with the same trusted context — no matter which tool they’re in.
What the Integration Enables: Context Delivered to AI Builders
1. Consistent, Shared Context Across Every Tool
Amazon SageMaker Unified Studio already supports rich metadata like definitions, ownership, classifications, quality, and lineage. The Atlan integration ensures this context is consistent and trusted everywhere.
By synchronizing metadata bi-directionally and stitching Unified Studio’s context together with the rest of an organization’s data and AI landscape, Atlan creates a context layer that keeps meaning, lineage, and governance aligned across every platform teams use.
This gives data stewards, data engineers, and AI builders a shared, standardized understanding of their assets — whether they are cataloging data, building features, training models, or powering AI agents via MCP. When every workflow operates on the same context, teams make better decisions, avoid duplication, and ship AI systems that behave reliably in production.

Published assets from Amazon SageMaker Unified Studio in Atlan. Source: Atlan.
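To make the idea of bi-directional synchronization concrete, the sketch below reconciles two copies of an asset's metadata, one from each platform, by keeping the most recently updated value for each field. The article does not describe how the integration resolves conflicts, so the last-writer-wins rule, the `FieldValue` type, and the field names are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class FieldValue:
    """A metadata value plus the time it was last edited (hypothetical model)."""
    value: str
    updated_at: datetime


def reconcile(atlan: dict, studio: dict) -> dict:
    """Merge two field -> FieldValue maps, keeping the newer value per field.

    Assumed rule: last writer wins; ties go to the first argument.
    """
    merged = {}
    for field in atlan.keys() | studio.keys():
        a, s = atlan.get(field), studio.get(field)
        if a is None:
            merged[field] = s
        elif s is None:
            merged[field] = a
        else:
            merged[field] = a if a.updated_at >= s.updated_at else s
    return merged


# Hypothetical copies of the same asset's metadata on each side.
atlan_side = {
    "description": FieldValue("Cleaned orders table", datetime(2025, 11, 1)),
    "owner": FieldValue("data-platform", datetime(2025, 10, 1)),
}
studio_side = {
    "description": FieldValue("Orders (draft)", datetime(2025, 9, 1)),
    "owner": FieldValue("ml-team", datetime(2025, 11, 15)),
    "classification": FieldValue("Confidential", datetime(2025, 8, 1)),
}

merged = reconcile(atlan_side, studio_side)
```

After reconciliation both sides can be written with the same merged record, which is what keeps definitions and ownership from drifting between tools.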
2. AI-Ready Data Products With Shared Business Meaning
With standardized metadata synchronized across Atlan and Amazon SageMaker Unified Studio, teams can package datasets, features, and models into verified, business- and AI-ready Data Products.
Data Products are created and managed in Atlan, organized by domains, and enriched with consistent definitions, ownership, quality expectations, and lineage. They can then be published to Atlan’s Data Product Marketplace, giving business users an easy way to discover trusted, governed assets.
This shared, context-rich foundation ensures that both business and technical teams start with reliable inputs when building analytics dashboards or AI models.

Amazon SageMaker Unified Studio data products and domains get published in Atlan’s Data Product Marketplace. Source: Atlan.

Amazon SageMaker Unified Studio data products in Atlan. Source: Atlan.
3. Cross-System Lineage Connecting Amazon SageMaker Unified Studio to the Entire Data & AI Estate
Atlan’s integration with Amazon SageMaker Unified Studio supports Published ↔ Subscribed lineage relationships, giving teams a clear view of how assets are shared and reused within Unified Studio. Atlan extends this further by stitching these relationships together with lineage from the rest of the data and AI landscape.
This unified lineage helps teams validate model inputs, understand dependencies, and diagnose issues faster. It also strengthens auditability and governance, ensuring that production AI systems are transparent, explainable, and built on trustworthy foundations.

Atlan stitches lineage across the data & AI landscape. Source: Atlan.
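One way to picture lineage stitching is as merging edge lists from different systems into a single graph and then walking upstream from a model input. The sketch below does this with plain Python; the asset names and the two edge lists are hypothetical, and it illustrates the general idea rather than how Atlan implements lineage.

```python
from collections import defaultdict, deque


def stitch(*edge_lists):
    """Combine (upstream, downstream) edges from several systems.

    Returns a mapping from each downstream asset to its direct upstream assets.
    """
    parents = defaultdict(set)
    for edges in edge_lists:
        for upstream, downstream in edges:
            parents[downstream].add(upstream)
    return parents


def upstream_of(parents, asset):
    """Return every asset reachable by walking upstream from `asset` (breadth-first)."""
    seen, queue = set(), deque([asset])
    while queue:
        node = queue.popleft()
        for parent in parents.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


# Hypothetical lineage captured outside Unified Studio.
warehouse_edges = [
    ("s3://raw/orders", "glue.orders_clean"),
    ("glue.orders_clean", "redshift.orders_features"),
]
# Hypothetical Published -> Subscribed relationship inside Unified Studio.
studio_edges = [
    ("redshift.orders_features", "studio.published.orders_features"),
    ("studio.published.orders_features", "studio.subscribed.churn_training_set"),
]

graph = stitch(warehouse_edges, studio_edges)
sources = upstream_of(graph, "studio.subscribed.churn_training_set")
```

Because the two edge lists share the asset `redshift.orders_features`, the traversal crosses the system boundary and reaches the raw S3 source, which neither list could show on its own.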
What’s Coming Next
The next phase of the Atlan + Amazon SageMaker Unified Studio integration expands how governance and context flow across the AI lifecycle.
1. Governance Workflows Integrated Across Platforms
Users will be able to discover Amazon SageMaker Unified Studio assets directly in Atlan, request access, and track approval status, ensuring consistent, compliant access governance across teams and tools.
2. Extended End-to-End Lineage
Lineage will extend from Published ↔ Subscribed relationships into full source-to-consumer coverage, giving teams deeper visibility across data, features, models, and downstream applications.
3. Publish Assets From Atlan
Users will gain the ability to publish assets directly from Atlan, enabling a smoother producer workflow for creating, sharing, and governing AI-ready assets across the organization.
These upcoming capabilities further strengthen the unified metadata foundation across Atlan and Amazon SageMaker Unified Studio, enabling teams to collaborate, govern, and scale AI with greater clarity and confidence.
Laying the Foundation for Context-Aware AI
As enterprises bring more of their analytics and AI development into Amazon SageMaker Unified Studio, having consistent and trusted metadata becomes essential. AI systems scale safely only when every team — from governance to data engineering to ML — works from the same shared business and policy context.
The expanded Atlan + AWS integration creates exactly that foundation. With a standardized metadata layer spanning both platforms, organizations can reduce fragmentation, improve collaboration, and accelerate the path from experimentation to production. Data stewards, engineers, and AI builders gain a unified understanding of the assets they rely on, enabling them to build and deploy AI systems with greater clarity, confidence, and control.
Atlan is the next-generation platform for data and AI governance. It is a control plane that stitches together a business's disparate data infrastructure, cataloging and enriching data with business context and security.

