8 Snowflake Data Lineage Best Practices for 2026
How does Snowflake capture data lineage?
Snowflake offers automatic data lineage capture and management as part of its core services. Lineage gets captured when you move data from one asset to another or when you define dependencies via object references in views and queries.
Snowflake captures lineage at three levels:
- Asset level: Tables, views, materialized views, external tables, Iceberg tables
- Column level: Table-like assets
- Process level: Stored procedures and tasks
When a CREATE TABLE AS SELECT, INSERT INTO, or view definition executes, Snowflake parses the SQL and records the upstream-to-downstream relationships in its metadata layer.
Lineage data is retained for up to 365 days and is accessible through the SNOWFLAKE.ACCOUNT_USAGE schema, the GET_LINEAGE function, and Snowflake’s visual lineage graph in Snowsight. The graph is updated in near real-time as queries run, without requiring any manual refresh.
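As a minimal sketch (all table, schema, and column names here are hypothetical), these are the kinds of statements whose lineage Snowflake records automatically as they execute:

```sql
-- A CTAS is parsed by Snowflake, which records the
-- raw_orders -> clean_orders relationship automatically.
CREATE TABLE analytics.public.clean_orders AS
SELECT order_id, customer_id, amount
FROM raw.public.raw_orders
WHERE amount > 0;

-- A view definition creates an object dependency that
-- also appears in the lineage graph.
CREATE VIEW analytics.public.order_summary AS
SELECT customer_id, SUM(amount) AS total_amount
FROM analytics.public.clean_orders
GROUP BY customer_id;
```

No extra configuration is needed for either statement; the lineage appears in Snowsight and the metadata views as the queries run.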
It’s worth noting what Snowflake doesn’t capture automatically:
- Lineage originating outside Snowflake, such as from dbt, Airflow, or Spark jobs, requires explicit ingestion via the OpenLineage standard or the Snowflake REST API.
- External lineage currently resolves only at the table level, not the column level.
How to access Snowflake lineage metadata
Before diving into best practices, it’s worth understanding the different interfaces available for accessing lineage metadata in Snowflake.
In Snowflake, you can access the lineage metadata in multiple ways:
- Snowsight UI (Enterprise Edition or higher): The most common interface is the Lineage tab in the Snowsight UI, as it is easy to use, interactive, and supports tags. Best suited for ad hoc exploration and stakeholder-facing lineage reviews.
- GET_LINEAGE function: You can also access lineage programmatically using the GET_LINEAGE function. Use this for automated lineage jobs, custom dashboards, or integration with external tools. Note that the function traverses a maximum depth of 5 levels in the lineage graph.
- OBJECT_DEPENDENCIES view: When you only want table-level lineage, you can simply use the OBJECT_DEPENDENCIES view. It’s lightweight and straightforward, but doesn’t capture column-level relationships.
- ACCESS_HISTORY view: When you need column-level lineage, use the ACCESS_HISTORY view. Note that lineage for materialized views and dynamic tables isn’t accessible via this view.
- Snowflake ML API / Snowpark Python SDK: For ML and Snowpark workloads, use the Snowflake ML API or the Snowpark Python SDK to access data lineage.
When using views in the ACCOUNT_USAGE schema of the SNOWFLAKE database, note that they refresh with a latency of 45 minutes to 3 hours (most views have a latency of about 2 hours).
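To make the programmatic option concrete, here is a hedged sketch of a GET_LINEAGE call (the object name is hypothetical, and the function requires Enterprise Edition or higher):

```sql
-- Traverse the lineage graph downstream from a table.
SELECT *
FROM TABLE(SNOWFLAKE.CORE.GET_LINEAGE(
  'analytics.public.clean_orders',  -- fully qualified object name
  'TABLE',                          -- object domain
  'DOWNSTREAM',                     -- direction: 'UPSTREAM' or 'DOWNSTREAM'
  3                                 -- distance (capped at 5 levels)
));
```

Swapping the direction argument to 'UPSTREAM' walks the graph toward sources instead of consumers.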
Next, let’s look at the 8 essential best practices for Snowflake data lineage in a bit more detail and with greater nuance, especially around Snowflake Editions.
1. Configure proper access control for end-to-end lineage visibility
While some features, like OBJECT_DEPENDENCIES and QUERY_HISTORY, are available across all Snowflake editions, more advanced features like the Snowsight lineage visualization UI, ACCESS_HISTORY (for column-level lineage), GET_LINEAGE, and object tagging with automated tag propagation require Enterprise Edition or higher.
When using Enterprise Edition or higher, make sure you grant the right level of privileges both to roles that will view lineage and to roles that will manage it.
Follow the least privilege principle and align lineage privileges with your existing role hierarchy rather than granting them ad hoc. For example, granting broad privileges like RESOLVE ALL to a widely-used role can inadvertently expose metadata about objects a user has no business seeing.
The key privileges to configure are:
- VIEW LINEAGE: Granted to the PUBLIC role by default. Revoke this privilege and reassign it to a custom role created for this purpose.
- RESOLVE ALL: Grant this privilege to allow resolution of all the objects referenced in the lineage graph.
- INGEST LINEAGE: Required for writing external lineage into Snowflake via the API. Grant this exclusively to trusted pipeline orchestrators or service accounts.
- GOVERNANCE_VIEWER or USAGE_VIEWER: Grant these database roles to users who need access to the ACCOUNT_USAGE views that capture lineage metadata, such as OBJECT_DEPENDENCIES and ACCESS_HISTORY.
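A sketch of what these grants might look like. The role names (lineage_viewer, governance_team, pipeline_svc) are hypothetical, and you should verify the exact privilege syntax against your Snowflake edition before applying it:

```sql
-- Move VIEW LINEAGE off the default PUBLIC grant
REVOKE VIEW LINEAGE ON ACCOUNT FROM ROLE PUBLIC;
GRANT VIEW LINEAGE ON ACCOUNT TO ROLE lineage_viewer;

-- Let a governance team read the ACCOUNT_USAGE lineage views
GRANT DATABASE ROLE SNOWFLAKE.GOVERNANCE_VIEWER TO ROLE governance_team;

-- Restrict external lineage ingestion to a service-account role
GRANT INGEST LINEAGE ON ACCOUNT TO ROLE pipeline_svc;
```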
The setup required to enable lineage in Snowflake is minimal; you just need to manage permissions correctly.
Finally, audit your lineage privilege assignments regularly, especially after role changes or onboarding new integrations. Lineage data is sensitive and could expose proprietary business logic or security vulnerabilities if accessed by the wrong people. So, treat lineage access with the same rigor you apply to table-level privileges.
2. Implement object tagging on tables and columns
Raw lineage tells you what connects to what, but it doesn’t tell you why it matters. Object tagging bridges this gap by attaching business and governance metadata directly to your Snowflake objects, making lineage graphs actionable rather than just informational.
Apply tags to tables and columns to capture information such as data sensitivity (e.g., pii = true), data domain (e.g., domain = finance), regulatory scope (e.g., gdpr_relevant = true), and ownership (e.g., owner = revenue_team).
Make sure that you:
- Tag consistently across your object inventory: Define centralized terminology for tags your entire organization agrees on before deployment, and enforce it as part of your object creation workflow.
- Enable automatic tag propagation: Once a tag is applied to a source column, Snowflake automatically propagates it to downstream objects through the lineage graph.
- Drive governance policies from your tags: Snowflake supports tag-based masking policies and row access policies. By attaching these policies to tags rather than to individual objects, enforcement automatically extends to any column carrying that tag, without requiring manual intervention at every step.
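The pattern above can be sketched as follows. The tag, table, column, and policy names are all hypothetical, and the masking policy itself is assumed to exist already:

```sql
-- Create a PII tag in a governance schema
CREATE TAG IF NOT EXISTS governance.tags.pii;

-- Tag a sensitive column at the source; with propagation
-- enabled, the tag follows the column downstream
ALTER TABLE raw.public.customers
  MODIFY COLUMN email SET TAG governance.tags.pii = 'true';

-- Attach a masking policy to the tag so enforcement follows
-- the tag rather than individual columns
ALTER TAG governance.tags.pii
  SET MASKING POLICY governance.policies.mask_pii;
```

Because the policy hangs off the tag, any column that inherits the tag through lineage is masked without a per-column grant.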
3. Bring external lineage into Snowflake
Most enterprise data pipelines don’t live entirely inside Snowflake. dbt transformations, Airflow DAGs, and Spark jobs all produce lineage that exists outside Snowflake’s native tracking scope. Without capturing this external lineage, your lineage graph has blind spots exactly where pipeline complexity tends to concentrate.
Snowflake addresses this via native support for OpenLineage, an open standard for lineage metadata interchange. As of January 2026, this feature is in Preview — functional and usable, but worth monitoring for changes before building critical governance workflows on top of it.
Here’s what to keep in mind when working with external lineage in Snowflake:
- Check connector support before committing: Verify your tooling is compatible before assuming lineage will flow correctly. A few official connectors, such as dbt and Airflow, let you bring lineage into Snowflake based on the OpenLineage v1 specification (v2 isn’t yet supported).
- Review permissions carefully: Follow the spec, send all the required properties, and grant the required permissions for the lineage to be ingested; otherwise it will be ignored.
- No column-level support: External lineage resolves at the table level only; column-level relationships from external pipelines aren’t captured. External lineage also doesn’t show up when you use the GET_LINEAGE function in Snowflake.
- At least one object must be a Snowflake object: Lineage between two fully external systems cannot be represented in Snowflake’s lineage graph. For organizations with complex multi-system pipelines that don’t always touch Snowflake, this is where a dedicated external lineage tool becomes necessary.
4. Query lineage programmatically
Snowflake provides powerful native mechanisms for extracting detailed lineage data programmatically. Understanding how to use these effectively lets you build custom lineage dashboards, automate governance checks, and integrate lineage into your data catalog.
Combine the following to build complete lineage pipelines:
- GET_LINEAGE function: The primary SQL function for querying the lineage graph. It allows you to traverse lineage upstream or downstream from any object.
- OBJECT_DEPENDENCIES view: Available in SNOWFLAKE.ACCOUNT_USAGE, this view surfaces direct dependencies between objects. Useful for impact analysis.
- ACCESS_HISTORY view: Captures which users accessed which columns at what times, providing a read-access lineage layer on top of structural lineage. Helps with compliance audits and understanding actual data consumption patterns.
- QUERY_HISTORY view: Provides the raw SQL query text and execution metadata behind Snowflake’s automatic lineage detection. Analysts can cross-reference query history with lineage results.
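For example, a simple impact-analysis query against OBJECT_DEPENDENCIES (available in all editions; the referenced table name is hypothetical) lists every object that directly depends on a given table:

```sql
-- Which objects reference CLEAN_ORDERS directly?
SELECT referencing_database,
       referencing_schema,
       referencing_object_name,
       referencing_object_domain
FROM SNOWFLAKE.ACCOUNT_USAGE.OBJECT_DEPENDENCIES
WHERE referenced_database    = 'ANALYTICS'
  AND referenced_object_name = 'CLEAN_ORDERS';
```

Remember the ACCOUNT_USAGE latency noted earlier: very recent dependencies may take a couple of hours to appear here.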
5. Add business context
While automated lineage captures structural relationships, it cannot capture business intent. Without that context, lineage doesn’t help much with navigation, discovery, or governance.
Use COMMENT ON for free-text documentation
Snowflake’s COMMENT ON statement lets you attach free-text documentation directly to tables and columns within the platform. This metadata travels with the object, surfaces in information schema queries, and integrates naturally with data catalog tools that pull from Snowflake.
Pair comments with object tagging for maximum impact: tags provide machine-readable governance metadata for automated policy enforcement, while comments provide human-readable context for investigation and onboarding.
To keep comments consistent and programmatically parseable, adopt a standard template across your organization.
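One possible template, using a "purpose | owner | source" convention (the convention, object names, and comment text are all illustrative):

```sql
-- Table-level documentation following the agreed template
COMMENT ON TABLE analytics.public.clean_orders IS
  'Validated orders for reporting | owner: revenue_team | source: raw.raw_orders';

-- Column-level documentation
COMMENT ON COLUMN analytics.public.clean_orders.amount IS
  'Order amount in USD, net of refunds | owner: revenue_team';
```

Because the delimiter is fixed, downstream tooling can split these comments back into structured fields.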
Use Cortex AI for description generation
Rather than writing descriptions from scratch, you can use Cortex AI to generate descriptions for objects and later refine them with further business context using the COMMENT ON or COMMENT ON COLUMN features.
Finally, this enrichment layer compounds in value when connected to an active metadata platform like Atlan with bi-directional tag sync for continuous, updated context.
6. Track ML lineage across the pipeline
ML lineage, an Enterprise Edition (or higher) feature, works slightly differently from lineage for core database objects. It is tightly embedded in the MLOps lifecycle: lineage is also tracked across the model lifecycle to show how a model was trained, how it was registered, and how it was deployed.
The best way to use Snowflake’s ML lineage is to do the following:
- Use Model Registry: To get the most out of Snowflake’s ML lineage, use the Model Registry for logging your models. Any model trained or deployed outside the registry is invisible to the lineage graph.
- Use Snowpark DataFrames: With Snowpark DataFrames, model lineage is automatically captured and visible via the Model Registry. Using Pandas DataFrames requires an extra manual step.
- Query ML lineage programmatically with .lineage(): Use the Snowpark ML Python API’s .lineage() method to get both upstream and downstream relationships for the model programmatically.
7. Embed data quality into lineage graph
Without quality signals attached, lineage can’t tell you whether the data flowing through that graph is actually trustworthy. Snowflake’s Data Metric Functions (DMFs) are the native mechanism for embedding data quality into lineage and closing that gap.
Here’s how to make quality and lineage work together effectively:
- Attach DMFs at the lineage node level: When a quality check fails, you can immediately traverse upstream to identify where bad data entered the pipeline and assess the downstream blast radius.
- Use GET_LINEAGE for impact analysis on quality failures: When a DMF flags an issue on a source table, use GET_LINEAGE to enumerate all downstream objects affected.
- Tag objects with quality status: Combine DMF results with object tagging to surface quality status as a machine-readable signal. For example, a tag like quality_status = failing can drive automated alerts or access policy changes while an issue is being resolved.
- Connect quality signals to your metadata control plane: Platforms like Atlan surface DMF results as trust signals within the lineage graph, making quality context visible to all consumers navigating lineage.
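A sketch of attaching a built-in DMF and reading its results (the table, column, and schedule are hypothetical; verify the syntax against your Snowflake edition):

```sql
-- Run attached metrics on a schedule
ALTER TABLE analytics.public.clean_orders
  SET DATA_METRIC_SCHEDULE = '60 MINUTE';

-- Attach a built-in null-count metric to a key column
ALTER TABLE analytics.public.clean_orders
  ADD DATA METRIC FUNCTION SNOWFLAKE.CORE.NULL_COUNT
  ON (customer_id);

-- Review measurement results centrally
SELECT measurement_time, metric_name, value
FROM SNOWFLAKE.LOCAL.DATA_QUALITY_MONITORING_RESULTS
WHERE table_name = 'CLEAN_ORDERS'
ORDER BY measurement_time DESC;
```

When a measurement here looks wrong, GET_LINEAGE on the same table gives you the downstream blast radius described above.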
8. Integrate with a metadata control plane like Atlan
Snowflake’s native lineage capabilities are powerful within the platform, but most enterprise stacks span dbt, Airflow, Spark, Monte Carlo, and BI tools like Tableau or Looker. Lineage breaks at every platform boundary, and without a unifying layer, you’re governing in silos.
A metadata control plane, like Atlan, sits across your entire stack, stitching together cross-system lineage, propagating context, and making governance actionable for both technical and business users.
Here’s what you get:
- Metadata lakehouse: The metadata lakehouse acts as a control plane for all metadata, supporting a range of use cases beyond lineage, such as data quality, governance, security, and observability.
- Automated cross-system column-level lineage: Atlan automatically extracts metadata from Snowflake’s ACCOUNT_USAGE and INFORMATION_SCHEMA, mines query history to build comprehensive lineage graphs, and tracks column-level relationships for impact analysis.
- Update lineage metadata: Edit and modify existing lineage metadata to fill gaps and add business context wherever required.
- Automatic downstream tag propagation: Automatically propagate tags, classifications, and governance metadata regardless of whether the source system supports it (including bi-directional metadata sync for many systems, including Snowflake).
- A shared semantic and context layer: As a launch partner for Snowflake’s Open Semantic Interchange, Atlan acts as the context layer for the AI-native enterprise, so business meaning and governance travel seamlessly across every tool, dashboard, and AI agent.
- Data quality as a trust signal: Atlan’s Data Quality Studio surfaces DMF quality results in business-friendly dashboards, giving data consumers a live quality signal alongside lineage without moving data out of the platform.
- AI governance out of the box: Atlan lets data teams catalog models, policies, and lineage side by side, enforce responsible AI controls, and deliver end-to-end transparency with AI governance.
To gain a detailed understanding, let’s look at how some of Atlan’s customers use the metadata control plane to get the most out of their lineage.
Real stories from real customers: Extend native Snowflake lineage with an active metadata control plane
Massive Asset Cleanup: Mistertemp's Lineage-Driven Optimization to Deprecate Two-Thirds of Their Data Assets
"Using Atlan's automated lineage, [we] started analyzing [data assets in] Snowflake and Fivetran. They could see every existing connection, what was actually used. We kept those, and for everything else, we would disconnect."
Data Team
Mistertemp
🎧 Listen to AI-generated podcast: Mistertemp's data lineage optimization success
Improved time-to-insight and reduced impact analysis time to under 30 minutes
"I've had at least two conversations where questions about downstream impact would have taken allocation of a lot of resources. Actually getting the work done would have taken at least four to six weeks, but I managed to sit alongside another architect and solve that within 30 minutes with Atlan."
Karthik Ramani, Global Head of Data Architecture
Dr. Martens
🎧 Listen to AI-generated podcast: Dr. Martens' data transparency transformation
From Hours to Minutes: How Aliaxis Reduced Effort on Root Cause Analysis by almost 95%
"A data product owner told me it used to take at least an hour to find the source of a column or a problem, then find a fix for it, each time there was a change. With Atlan, it's a matter of minutes. They can go there and quickly get a report."
Data Governance Team
Aliaxis
🎧 Listen to AI-generated podcast: How Aliaxis Reduced Root Cause Analysis Effort by 95%
Moving forward with Snowflake data lineage best practices for 2026
Snowflake offers a solid built-in solution and foundation for lineage, and it works well if you’re working solely in Snowflake. As of the start of 2026, some OpenLineage-based connectors are available, but they are currently in preview and don’t cover all popular data sources for lineage metadata.
The above best practices can help you get the most out of the native data lineage features in Snowflake. However, it is common to see medium to large organizations using many tools, which leads to broken lineage due to increased siloing and fragmentation. That’s exactly when the need for a unified metadata control plane arises.
A unified metadata control plane aggregates and activates metadata across your data stack. It also helps you get all sorts of metadata for quality, governance, and observability, including that for lineage.
FAQs about Snowflake data lineage best practices
Permalink to “FAQs about Snowflake data lineage best practices”1. Does Snowflake automatically capture lineage?
Yes, Snowflake captures lineage for data movement defined by SQL operations like CREATE TABLE AS ..., INSERT, and MERGE, and also for both explicitly-defined object dependencies (such as views referencing tables) and implicit object dependencies (via table references).
2. What Snowflake edition is required for lineage?
For most data lineage features, like access to the ACCESS_HISTORY view and the GET_LINEAGE function, you need Enterprise Edition or higher. Other features, such as access to the OBJECT_DEPENDENCIES view, are available across all editions.
3. Is external lineage supported in Snowflake?
Yes, as of January 2026, OpenLineage-based connectors for dbt (Data Build Tool) and Airflow are available in Preview. However, while dbt and Airflow individually support granular column-level lineage, Snowflake doesn’t yet support it via these connectors.
4. What are some of the best practices for lineage in Snowflake?
Some of the most important best practices include configuring appropriate access control, adding human and AI-generated technical and business context, integrating external lineage, and leveraging automatic tag propagation through object tagging.
5. For how long is lineage metadata available in Snowflake?
Most views in the ACCOUNT_USAGE schema of the SNOWFLAKE database have a retention of 1 year. This is also true for views pertaining to lineage metadata. You can always create archive tables from these views if you want to retain the lineage history for more than a year.
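A hedged sketch of such an archive job (the archive database and schema are hypothetical):

```sql
-- First run: snapshot lineage metadata before it ages out
CREATE TABLE IF NOT EXISTS archive.lineage.object_dependencies_history AS
SELECT d.*, CURRENT_TIMESTAMP() AS archived_at
FROM SNOWFLAKE.ACCOUNT_USAGE.OBJECT_DEPENDENCIES AS d;

-- Subsequent scheduled runs: append new snapshots
INSERT INTO archive.lineage.object_dependencies_history
SELECT d.*, CURRENT_TIMESTAMP()
FROM SNOWFLAKE.ACCOUNT_USAGE.OBJECT_DEPENDENCIES AS d;
```

Wrapping the INSERT in a Snowflake task would keep the archive current automatically; deduplication across snapshots is left to the consumer.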
Atlan is the next-generation platform for data and AI governance. It is a control plane that stitches together a business's disparate data infrastructure, cataloging and enriching data with business context and security.