How to Set Up Snowflake Data Lineage: A Step-by-Step Guide for 2025

Updated May 12th, 2025

Snowflake data lineage has become essential for data teams seeking to improve transparency and accountability across systems. A frequent question from data practitioners is, “How can I implement data lineage in Snowflake to better understand the flow of my data?”

Snowflake data lineage offers full visibility into your virtual warehouses and allows you to track data flow and usage across the system… Whether you’re a data engineer, business analyst, or platform architect, implementing Snowflake data lineage can significantly enhance your operational efficiency and ensure compliance.

This article will go through:

  • Essential prerequisites for Snowflake data lineage
  • A step-by-step breakdown of how to set up Snowflake data lineage, manage roles and permissions, and address key challenges
  • The role of a metadata-led control plane like Atlan in enhancing visualization and discoverability
  • How leading companies across industries are unlocking the full potential of their data using the Snowflake-Atlan integration

Table of contents #

  1. Does Snowflake enable data lineage?
  2. How can you set up Snowflake data lineage?
  3. Why should you consider setting up an external platform for data lineage tracking in Snowflake?
  4. Are there any prerequisites to setting up data lineage for Snowflake?
  5. How does a metadata-led control plane like Atlan enhance Snowflake data lineage?
  6. How can you set up data lineage for Snowflake in Atlan?
  7. How the Snowflake and Atlan integration benefits customers across industries
  8. Snowflake data lineage: Bottom line
  9. Snowflake data lineage: Frequently asked questions (FAQs)
  10. Snowflake data lineage: Related reads

Does Snowflake enable data lineage? #

Snowflake data lineage tracks how data flows from source to target objects and helps you understand the relationships between your Snowflake objects. It captures two types of relationships for your Snowflake assets:

  • Data movement: When data is copied or materialized from one object to another
  • Object dependencies: When an object references a base object but does not materialize or copy data, such as when a view references a table

[Figure: Snowflake data lineage UI. Source: Snowflake.]

What are the business benefits of Snowflake data lineage? #


Snowflake data lineage offers several benefits, such as:

  • Impact analysis by understanding object relationships
  • Monitoring and troubleshooting support by viewing data movement lineage and object dependencies
  • Compliance support by tracking sensitive data flow
  • Column-level protection for sensitive data with tags and masking policies
  • Trust in data by mapping data flow across source and target objects and columns

How can you set up Snowflake data lineage? #

Here’s how you can set up Snowflake data lineage:

  1. Create a database role for data lineage.
  2. Create a database user.
  3. Identify the tables and views for which you’ll track data lineage.
  4. Assign relevant read permissions to the database role.
  5. Assign the role to the database user.
  6. Configure the Snowflake connector and start crawling lineage metadata.

Let’s explore the specifics.

Step 1: Create a database role for data lineage #


Snowflake’s access control layer works with users and roles. You grant permissions to a role, and then you assign that role to a user. You can also grant roles to other roles to build role hierarchies.

In this case, you’ll create a new role called data_lineage_role, using the following command:

CREATE OR REPLACE ROLE data_lineage_role;
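If the role might already exist, keep in mind that CREATE OR REPLACE recreates it, so any grants it already holds would need to be reapplied. Here's a minimal sketch of a non-destructive variant that also places the role in a hierarchy. It assumes you want SYSADMIN to inherit the lineage role, which is a common convention rather than a requirement, and the comment text is only illustrative.

```sql
-- Create the role only if it is missing, leaving an existing role
-- and its grants untouched.
CREATE ROLE IF NOT EXISTS data_lineage_role
  COMMENT = 'Read-only role for extracting lineage metadata';

-- Role hierarchy (assumption): SYSADMIN inherits every privilege
-- that is later granted to data_lineage_role.
GRANT ROLE data_lineage_role TO ROLE SYSADMIN;
```

Run these statements with a role that is allowed to create and grant roles, such as USERADMIN or SECURITYADMIN.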

There are many ways to get lineage metadata from Snowflake. You’ll need to grant permissions to this role based on which method(s) you choose.

Before going into grants, let’s create a database user to which you’ll assign this role in the next step.

Step 2: Create a database user #


If you already have a data_catalog_user (as prescribed in this tutorial), use the same user; otherwise, create a new one.

In addition to the data_catalog_role, we’ll also assign the data_lineage_role to the same user.

Here are the commands you can use to create a database user in Snowflake:

-- Method 1: With password
CREATE USER data_lineage_user PASSWORD='<password>' DEFAULT_ROLE=data_lineage_role DEFAULT_WAREHOUSE='<warehouse_name>' DISPLAY_NAME='<display_name>';

-- Method 2: With public key
CREATE USER data_lineage_user RSA_PUBLIC_KEY='<rsa_public_key>' DEFAULT_ROLE=data_lineage_role DEFAULT_WAREHOUSE='<warehouse_name>' DISPLAY_NAME='<display_name>';

Alternatively, you can use SSO. There are two ways to authenticate with SSO on Snowflake:

  1. Using browser-based SSO
  2. Using your identity provider’s native SSO (only available for Okta)

Step 3: Identify tables and views you’ll be using for inferring data lineage from Snowflake #


To find out what permissions you need to grant to the data_lineage_role, you need to understand the different methods of fetching lineage metadata from Snowflake.

You’ll also need to consider the level of support your data catalog or lineage tool has for these methods. This is because some of the operations involved in fetching metadata include advanced SQL parsing, data flattening, and sophisticated querying to infer table-level and column-level lineage.

Here’s a basic comparison of the function and level of detail of three different data sources for lineage metadata in Snowflake:

SCHEMA.OBJECT | FUNCTION | LEVEL OF DETAIL
--- | --- | ---
ACCOUNT_USAGE.OBJECT_DEPENDENCIES | Captures how different Snowflake objects depend on one another. | Low
ACCOUNT_USAGE.ACCESS_HISTORY | Records the objects and columns that each query read or modified. Helps with column-level lineage. | High
ACCOUNT_USAGE.QUERY_HISTORY | Logs every query in the last 365 days. | High
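To get a feel for the lowest-detail source, here's a rough sketch of a query that lists which objects depend on which base objects. It assumes OBJECT_DEPENDENCIES is read from the ACCOUNT_USAGE schema of the shared SNOWFLAKE database, and ANALYTICS_DB is a hypothetical database name used only for illustration.

```sql
-- One row per dependency: the referencing object (e.g., a view)
-- and the base object it references (e.g., a table).
SELECT
    referencing_database || '.' || referencing_schema || '.' || referencing_object_name AS dependent_object,
    referencing_object_domain AS dependent_type,
    referenced_database || '.' || referenced_schema || '.' || referenced_object_name AS base_object,
    referenced_object_domain AS base_type,
    dependency_type
FROM snowflake.account_usage.object_dependencies
WHERE referenced_database = 'ANALYTICS_DB'  -- hypothetical database name
ORDER BY base_object, dependent_object;
```

This only tells you that one object references another. It says nothing about which queries actually moved data, which is why it sits at the low end of the detail scale.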

Please note that the QUERY_HISTORY and ACCESS_HISTORY objects are also available in the READER_ACCOUNT_USAGE schema.
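The flattening mentioned earlier is easiest to see with ACCESS_HISTORY, where the written objects are stored as nested JSON. The following is a sketch, not a complete lineage extractor: it unpacks the OBJECTS_MODIFIED column into one row per source-column-to-target-column pair, and the seven-day window is an arbitrary assumption to keep the scan small.

```sql
SELECT
    ah.query_id,
    ah.query_start_time,
    src.value:"objectName"::STRING     AS source_object,
    src.value:"columnName"::STRING     AS source_column,
    om.value:"objectName"::STRING      AS target_object,
    mod_col.value:"columnName"::STRING AS target_column
FROM snowflake.account_usage.access_history AS ah,
     -- one row per object written by the query
     LATERAL FLATTEN(input => ah.objects_modified) AS om,
     -- one row per column written in that object
     LATERAL FLATTEN(input => om.value:"columns") AS mod_col,
     -- one row per column that directly fed the written column
     LATERAL FLATTEN(input => mod_col.value:"directSources") AS src
WHERE ah.query_start_time >= DATEADD('day', -7, CURRENT_TIMESTAMP())  -- assumed window
ORDER BY ah.query_start_time DESC;
```

You can join query_id to ACCOUNT_USAGE.QUERY_HISTORY when you also need the SQL text behind each edge. ACCOUNT_USAGE views lag behind real time, so very recent queries may not show up yet.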

Step 4: Assign relevant read permissions to the database role #


If you want to grant access to all three objects mentioned in the previous section, use the following set of GRANT statements:

-- To access dependencies between Snowflake objects
GRANT USAGE ON WAREHOUSE <warehouse_name> TO ROLE data_lineage_role;

-- To get access logs for DML operations and how columns changed because of them, AND
-- to get every query run in the past 365 days
GRANT USAGE, MONITOR ON WAREHOUSE <warehouse_name> TO ROLE data_lineage_role;

In addition to this, the INFORMATION_SCHEMA has many objects that make up Snowflake’s internal data dictionary, such as TABLES and COLUMNS. You can also use the metadata from those objects to make more sense of data lineage.

Notice how the permissions are granted at the WAREHOUSE level. You will need to grant the USAGE or MONITOR privilege individually on each virtual warehouse in your Snowflake account.
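If you have many warehouses, writing each grant by hand gets tedious. One possible shortcut, sketched below, is to generate the statements from the output of SHOW WAREHOUSES. Both statements need to run back to back in the same session, because the second one reads the result of the first.

```sql
-- List every warehouse visible to the current role.
SHOW WAREHOUSES;

-- Build one GRANT statement per warehouse from the SHOW output.
-- SHOW output columns are lowercase, so "name" must be double-quoted.
SELECT 'GRANT USAGE, MONITOR ON WAREHOUSE "' || "name" || '" TO ROLE data_lineage_role;' AS grant_statement
FROM TABLE(RESULT_SCAN(LAST_QUERY_ID()));
```

The query only produces the text of the grants. Review the generated statements and then run the ones you actually want.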

The alternative is to use the ACCOUNTADMIN role to grant the data_lineage_role imported privileges on the shared SNOWFLAKE database. Snowflake highly recommends that you NOT do that, but if you still want to, here’s how you would do it:

USE ROLE ACCOUNTADMIN;
GRANT IMPORTED PRIVILEGES ON DATABASE snowflake TO ROLE data_lineage_role;
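A narrower option you may want to evaluate is granting the predefined database roles of the SNOWFLAKE database instead of all imported privileges. The sketch below assumes that OBJECT_VIEWER covers OBJECT_DEPENDENCIES and that GOVERNANCE_VIEWER covers ACCESS_HISTORY and QUERY_HISTORY. Confirm this mapping against the current Snowflake documentation before relying on it.

```sql
USE ROLE ACCOUNTADMIN;

-- Assumed to expose object metadata views such as OBJECT_DEPENDENCIES.
GRANT DATABASE ROLE SNOWFLAKE.OBJECT_VIEWER TO ROLE data_lineage_role;

-- Assumed to expose ACCESS_HISTORY and QUERY_HISTORY.
GRANT DATABASE ROLE SNOWFLAKE.GOVERNANCE_VIEWER TO ROLE data_lineage_role;
```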

Additionally, if you are dealing with cloned accounts in Snowflake, you’ll need to grant permissions for their access too.

Step 5: Assign the database role to a database user #


Once you’re done assigning all the relevant permissions to the role, you’ll need to assign the role to the data_lineage_user using the following GRANT statement:

GRANT ROLE data_lineage_role TO USER data_lineage_user;
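Before moving on, it can help to double-check that the grants landed where you expect. Here's a short sketch using the role and user names from the steps above:

```sql
-- Privileges held by the role (warehouses, databases, imported privileges, etc.).
SHOW GRANTS TO ROLE data_lineage_role;

-- Roles held by the user; data_lineage_role should be listed.
SHOW GRANTS TO USER data_lineage_user;

-- Optional: make the lineage role the default for new sessions
-- if the user was created without DEFAULT_ROLE.
ALTER USER data_lineage_user SET DEFAULT_ROLE = data_lineage_role;
```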

You should now be ready to connect to your Snowflake account from your data catalog or lineage tool.

Step 6: Configure the Snowflake connector and start crawling lineage metadata #


To configure the Snowflake connector, log into your data catalog or lineage tool and find the Snowflake connector. Enter the database user credentials into that connector, and you should be all set.

If you cannot connect to your Snowflake warehouse, you’ll probably need to check if you missed any networking or security steps. You can use the SnowCD (Snowflake Connectivity Diagnostic) tool to evaluate your network connectivity.
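SnowCD works from a list of the endpoints your client has to reach, and you can get that list from Snowflake itself. A minimal sketch:

```sql
-- Returns a JSON array of the hostnames and ports that clients
-- must be able to reach for this account.
SELECT SYSTEM$ALLOWLIST();

-- If you connect over private connectivity, this variant may apply instead:
-- SELECT SYSTEM$ALLOWLIST_PRIVATELINK();
```

Save the returned JSON to a file and pass that file to SnowCD on the machine where your data catalog or lineage tool runs, so the checks reflect the network path the tool actually uses.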

Once you resolve any connectivity issues, you can start crawling lineage metadata.

Most data catalog or lineage tools provide you with an option to run the crawler in three different ways:

  1. Ad-hoc crawl (manual crawl using a CLI command or the data catalog console)
  2. Scheduled crawl (e.g., based on a cron expression)
  3. Event-based crawl (e.g., crawl triggered from an event that the data catalog can listen to)

Why should you consider setting up an external platform for data lineage tracking in Snowflake? #

When you put your Snowflake lineage metadata into your data catalog or lineage tool, you can identify other data sources and connect them with your data catalog to get a fuller picture of the flow of data and its lineage in your data platform.

Such a setup will improve the data development and consumption experience significantly by ensuring that:

  • Data developers have more insight into the flow of data to understand the repercussions of data movement, transformation, archival, etc.
  • Data developers have more context when they’re writing a new data workload or fixing issues and bugs in an existing one.
  • Business users have the context of how data flows from the business applications and third-party integrations into the data platform for better, more meaningful reporting and analytics.

These are just a few examples. There are many more things that data lineage solves. Head over to this article to know more.


Are there any prerequisites to setting up data lineage for Snowflake? #

If you’re setting up a third-party data lineage tool for Snowflake, you’ll need to tick a few networking, infrastructure, and security checkboxes:

  1. Reachability: Make sure that the data lineage tool can connect to Snowflake, i.e., it has proper connectivity. You might need to consider handling PrivateLink, NACLs, VPNs, etc., to make this work.
  2. Encryption: Use Snowflake’s data dictionary to get metadata securely into your data lineage tool. Some detailed methods of fetching data lineage contain highly sensitive data (detailed schema information) that could expose your organization to cyber attacks.
  3. Infrastructure: Ensure that your data catalog has enough computing power and memory to address data crawling, previewing, and querying operations.

How does a metadata-led control plane like Atlan enhance Snowflake data lineage? #

Atlan is an active metadata platform that takes care of data lineage, in addition to data cataloging, search, and discovery for your Snowflake data platform. Its metadata-first approach enhances Snowflake data lineage by providing end-to-end visibility across your entire data stack.

Atlan automatically extracts metadata from Snowflake’s ACCOUNT_USAGE and INFORMATION_SCHEMA, mines query history to build comprehensive lineage graphs, and tracks column-level relationships for impact analysis.

In addition, it gives you a rich interface to preview and query data from your Snowflake warehouses, making it a one-stop shop for all your data needs. This unified view provides deeper context about data flow, usage patterns, and asset relationships, making it easier to manage and govern your data estate.

Want to see how this integration can work for you? [Book a demo today] and discover the impact of Atlan + Snowflake for your organization.


How can you set up data lineage for Snowflake in Atlan? #

To set up data lineage for Snowflake in Atlan, you can go through the following steps:

  1. Create a role in Snowflake
  2. Create a user
    1. With a password in Snowflake
    2. With a public key in Snowflake
    3. Managed through your identity provider (IdP)
  3. Grant role to the user
  4. Choose the metadata fetching method
    1. Information schema (recommended)
    2. Account usage (alternative)
  5. Grant permissions
    1. To crawl existing assets
    2. To crawl future assets
    3. To mine query history for lineage
    4. To preview and query existing assets
    5. To preview and query future assets
  6. Allowlist the Atlan IP
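To make the permission and allowlisting steps more concrete, here's an illustrative sketch of what they could look like in Snowflake SQL. It is not taken from Atlan's setup guide, which remains the reference for the exact statements. The database name analytics_db, the policy name lineage_tool_access, and the IP address 192.0.2.10 (a documentation-only address) are hypothetical, and the role and user names are reused from earlier in this article.

```sql
-- Crawl existing assets: expose metadata without granting data access.
GRANT USAGE ON DATABASE analytics_db TO ROLE data_lineage_role;
GRANT USAGE ON ALL SCHEMAS IN DATABASE analytics_db TO ROLE data_lineage_role;
GRANT REFERENCES ON ALL TABLES IN DATABASE analytics_db TO ROLE data_lineage_role;
GRANT REFERENCES ON ALL VIEWS IN DATABASE analytics_db TO ROLE data_lineage_role;

-- Crawl future assets: apply the same privileges to objects created later.
GRANT USAGE ON FUTURE SCHEMAS IN DATABASE analytics_db TO ROLE data_lineage_role;
GRANT REFERENCES ON FUTURE TABLES IN DATABASE analytics_db TO ROLE data_lineage_role;
GRANT REFERENCES ON FUTURE VIEWS IN DATABASE analytics_db TO ROLE data_lineage_role;

-- Preview and query (optional): SELECT exposes the data itself.
GRANT SELECT ON ALL TABLES IN DATABASE analytics_db TO ROLE data_lineage_role;
GRANT SELECT ON FUTURE TABLES IN DATABASE analytics_db TO ROLE data_lineage_role;

-- Allowlist the tool's IP (hypothetical address) for the service user only.
CREATE NETWORK POLICY lineage_tool_access ALLOWED_IP_LIST = ('192.0.2.10');
ALTER USER data_lineage_user SET NETWORK_POLICY = lineage_tool_access;
```

Keeping the REFERENCES grants separate from the SELECT grants lets you enable crawling and lineage first and decide later whether previewing and querying data should be allowed at all.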

How the Snowflake and Atlan integration benefits customers across industries #

Atlan, a Leader in The Forrester Wave™, helps organizations across banking, healthcare, fintech, and manufacturing integrate with Snowflake to modernize their data stack, streamline data governance, enable self-service access to data, and get the data AI-ready.

Here are some examples.

Banking #


Austin Capital Bank modernized their data stack with Snowflake and Atlan, gaining seamless data access management and governance. As Ian Bass, Head of Data & Analytics, explains: “Atlan gave us a simple way to see who has access to what.”

Healthcare #


Scripps Health uses the Snowflake-Atlan integration to manage sensitive healthcare data while meeting HIPAA requirements. “Since Atlan is virtualized on Snowflake, security is no longer a concern,” notes Victor Wilson, Data Architect.

FinTech #


Tala combines Snowflake with Atlan, dbt, and Looker in their data stack. Through automated dbt documentation sync with Snowflake via Atlan, they’ve streamlined data processes and created a unified data dictionary that business users can easily understand.

Manufacturing #


Aliaxis integrated Atlan with their Snowflake data warehouse to improve data visibility. “If there’s any question you have about data in Snowflake, go to Atlan,” says Nestor Jarquin, Global Data & Analytics Lead.

These implementations demonstrate how organizations can enhance their data capabilities through the Snowflake-Atlan integration.


Snowflake data lineage: Bottom line #

Snowflake data lineage provides essential visibility into data flows and relationships across your Snowflake ecosystem. Whether implemented through native features or enhanced by metadata platforms like Atlan, effective lineage tracking helps organizations maintain data quality, ensure compliance, and perform impact analysis.

By following the setup steps and best practices outlined above, data teams can implement comprehensive and accurate lineage tracking that scales with their data needs.


Snowflake data lineage: Frequently asked questions (FAQs) #

1. What is Snowflake data lineage? #


Snowflake data lineage refers to the process of tracking and visualizing how data moves and transforms within Snowflake. It helps teams understand the complete lifecycle of data—from its source to its final destination.

2. Why is data lineage important in Snowflake? #


Snowflake data lineage is crucial because it provides transparency into data flows, enabling teams to ensure data quality, meet compliance requirements, and troubleshoot issues faster by understanding how data is used and transformed across Snowflake.

3. What types of relationships does Snowflake data lineage track? #


Snowflake tracks two main types of relationships: data movement (when data is copied between objects) and object dependencies (when objects reference each other without data copying, like views referencing tables).

4. How can Atlan help set up Snowflake data lineage? #


Atlan integrates seamlessly with Snowflake to automatically capture, visualize, and track data lineage. With Atlan, users can get a holistic view of data pipelines, transformations, and usage patterns without manual setup or complex configurations.

5. What benefits does Snowflake data lineage offer to businesses using Atlan? #


By integrating Atlan with Snowflake, businesses can:

  • Gain a clear understanding of data transformations and workflows.
  • Ensure better data governance and compliance.
  • Improve reporting accuracy by ensuring data integrity.
  • Empower teams with contextual information for better decision-making and data utilization.

6. How easy is it to set up Snowflake data lineage with Atlan? #


Setting up data lineage with Atlan in Snowflake is simple. Atlan automatically captures and visualizes lineage across tables, columns, and queries in Snowflake, helping teams effortlessly manage their data environments without requiring extensive configuration.

7. Can Atlan display column-level lineage for Snowflake? #


Yes, Atlan provides column-level lineage for Snowflake, allowing users to track how data at the granular level is transformed and used across various datasets. This helps data teams ensure precision and accuracy in their data analysis.

8. How does Atlan handle changes in Snowflake’s data structure for lineage tracking? #


Atlan continuously syncs with Snowflake, ensuring that changes in data pipelines, tables, or transformations are reflected in the data lineage views. This real-time tracking helps maintain an up-to-date lineage view without manual interventions.


