How to Set Up Data Governance for Snowflake? (2024 Guide)

Updated September 27th, 2024


Snowflake is known for its robust security and governance features. You can connect Snowflake to a data governance tool to make the best of those features. The key idea is to enable analysts, engineers, and business users to centrally govern data for better security, risk awareness, and access control.


In this step-by-step guide, you’ll learn how to set up a data governance tool for Snowflake.

Snowflake has various out-of-the-box governance features. You’ll need to follow these steps to access the governance-related metadata from Snowflake:

  1. Create a database role for data governance
  2. Create a database user
  3. Identify objects and features you’ll need for managing data governance in Snowflake
  4. Assign relevant read permissions to the database role
  5. Assign the database role to the database user
  6. Configure the Snowflake connector and begin your data governance workflow

Data governance is a critical area in any data system. It deals with privacy, security, encryption, compliance, regulations, and consumer rights, among other things.

Over the years, the value of doing data governance right has become evident with several significant data breaches and fines. If you feel your current Snowflake-based data platform can’t provide the data governance controls and mechanisms you need: Talk to us.


Table of contents

  1. Prerequisites to setting up data governance for Snowflake
  2. Steps to set up data governance for Snowflake
  3. Business outcomes from Snowflake data governance
  4. How to set up data governance in Atlan for Snowflake
  5. Snowflake Data Governance Setup: Related reads

Prerequisites to setting up data governance for Snowflake

When setting up a data governance tool for Snowflake, first, you’ll need to ensure that you cover a few baseline things, such as networking, infrastructure, and security:

  1. Reachability: Make sure that your data governance tool can connect to Snowflake. Depending on your Snowflake setup, you might need to account for PrivateLink, NACLs, VPNs, etc., to make this work.
  2. Encryption: Ensure that all communication between Snowflake and your data governance tool is encrypted. This has to be configured on both the Snowflake side and the data governance tool’s side.
  3. Infrastructure: Ensure that your data governance tool has enough computing power and memory to process governance-related metadata.


Steps to set up data governance for Snowflake

The setup broadly involves creating a Snowflake role and a Snowflake user, granting the role the relevant data governance-related privileges, and assigning the role to the user. Once all that is done, you can connect your data governance tool to Snowflake via its Snowflake connector. Let’s dive right in!

Step 1. Create a database role for data governance


One of the core functions of effective data governance is to provide a robust access control layer. Snowflake provides two access control models: RBAC (role-based access control) and DAC (discretionary access control).

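For the data governance tool to fetch governance-related metadata from Snowflake, you need to create a role and grant it the relevant access. Note that CREATE ROLE creates an account-level role, which is distinct from the database roles that Snowflake scopes to a single database. Use the following command to create the data_governance_role: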


CREATE OR REPLACE ROLE data_governance_role;
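
One caution about the statement above: CREATE OR REPLACE drops and recreates a role of the same name, which discards any grants the role already holds. A minimal sketch of a more conservative variant follows. The comment text and the choice to attach the role to SYSADMIN are assumptions for illustration, not requirements of any particular governance tool.

```sql
-- Create the role only if it does not already exist, so existing grants survive.
CREATE ROLE IF NOT EXISTS data_governance_role
  COMMENT = 'Read access to governance metadata for the data governance tool';

-- Optional (assumption): attach the custom role to the role hierarchy
-- so that SYSADMIN inherits its privileges.
GRANT ROLE data_governance_role TO ROLE SYSADMIN;

-- Confirm that the role exists.
SHOW ROLES LIKE 'data_governance_role';
```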

If you cannot work with Snowflake Tags or other features, ensure that you have the right Snowflake edition in place. Some advanced governance features only work in the Enterprise edition and above.

Step 2. Create a database user


With a CREATE USER statement, you can create a Snowflake user that authenticates either with a password or with an RSA public key (key-pair authentication). While creating the user, you can also set user-specific defaults, such as DEFAULT_WAREHOUSE and DEFAULT_ROLE, as shown in the following commands:

-- Method 1: With password
CREATE USER data_governance_user PASSWORD='<password>' DEFAULT_ROLE=data_governance_role DEFAULT_WAREHOUSE='<warehouse_name>' DISPLAY_NAME='<display_name>';

-- Method 2: With public key
CREATE USER data_governance_user RSA_PUBLIC_KEY='<rsa_public_key>' DEFAULT_ROLE=data_governance_role DEFAULT_WAREHOUSE='<warehouse_name>' DISPLAY_NAME='<display_name>';
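
If the user already exists, the public key can also be attached or rotated afterwards. The sketch below assumes the key pair was generated outside Snowflake and that <rsa_public_key> holds the key body without the PEM header and footer lines.

```sql
-- Attach (or replace) the public key on an existing user.
ALTER USER data_governance_user SET RSA_PUBLIC_KEY='<rsa_public_key>';

-- Inspect the user; the RSA_PUBLIC_KEY_FP property shows the fingerprint
-- of the stored key, which you can compare with your local key.
DESCRIBE USER data_governance_user;
```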


Snowflake also supports two other authentication methods: browser-based SSO and your identity provider’s native SSO (only available for Okta).

Step 3. Identify the objects and features you’ll need for managing data governance in Snowflake


Apart from Snowflake’s internal schemas containing metadata related to Snowflake data assets, you’ll need access to the following:

  • Snowflake Object Tags: you’ll need permissions to create and alter tags, to apply tags to Snowflake objects, and to monitor tags.
  • Data Classification Functions: you’ll need permission to spin up Snowflake-managed serverless resources to run these functions on your data-containing objects.
  • Column-level security: you’ll need permission to create and alter the masking policies that enable dynamic data masking. Masking policies can also call external functions, for example one backed by AWS Lambda, in which case you’ll need additional permissions to invoke the external process.
  • Row-level security: you’ll need permissions to create and alter row access policies in Snowflake.

Remember that much of the data related to the above features is available in the Snowflake-managed database called SNOWFLAKE. You’ll need access to the ACCOUNT_USAGE and READER_ACCOUNT_USAGE schemas to work with these features.
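
To make this concrete, here is a hedged sketch of the kind of query a governance tool might run against ACCOUNT_USAGE once it has access. The view and column names follow Snowflake's documented definitions as best understood at the time of writing, so verify them in your account. These views can also lag behind real time.

```sql
-- Which tags are applied to which objects and columns?
SELECT tag_database, tag_schema, tag_name, tag_value,
       object_database, object_schema, object_name, column_name, domain
FROM SNOWFLAKE.ACCOUNT_USAGE.TAG_REFERENCES
LIMIT 100;

-- Which masking and row access policies are attached to which objects?
SELECT policy_db, policy_schema, policy_name, policy_kind,
       ref_database_name, ref_schema_name, ref_entity_name, ref_column_name
FROM SNOWFLAKE.ACCOUNT_USAGE.POLICY_REFERENCES
LIMIT 100;
```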

Step 4. Assign relevant read permissions to the database role


Based on the information above, you’ll need to either grant specific privileges to the data_governance_role or grant it imported privileges on the shared SNOWFLAKE database, which only ACCOUNTADMIN can access by default. The data governance tool will need access to the INFORMATION_SCHEMA of your databases along with Snowflake’s internal schemas, as mentioned in the previous section. To assign those privileges, you can use the following set of statements:


USE ROLE ACCOUNTADMIN;
GRANT IMPORTED PRIVILEGES ON DATABASE SNOWFLAKE TO ROLE data_governance_role;
GRANT USAGE ON DATABASE <database_name> TO ROLE data_governance_role;
GRANT USAGE ON ALL SCHEMAS IN DATABASE <database_name> TO ROLE data_governance_role;

-- The IMPORTED PRIVILEGES grant above already covers the views in SNOWFLAKE.ACCOUNT_USAGE.
-- Snowflake does not allow granting SELECT on individual objects in an imported database,
-- so no separate GRANT SELECT statements are needed (or accepted) for that schema.
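
INFORMATION_SCHEMA only returns objects on which the querying role holds some privilege, so USAGE on the database and its schemas may not be enough to surface tables and views. The sketch below assumes the tool only needs metadata and therefore grants REFERENCES rather than SELECT. Whether that is sufficient depends on your tool. It also adds future grants so that newly created objects are covered.

```sql
-- Metadata-only visibility on existing tables and views (no data access).
GRANT REFERENCES ON ALL TABLES IN DATABASE <database_name> TO ROLE data_governance_role;
GRANT REFERENCES ON ALL VIEWS IN DATABASE <database_name> TO ROLE data_governance_role;

-- Cover schemas, tables, and views created later.
GRANT USAGE ON FUTURE SCHEMAS IN DATABASE <database_name> TO ROLE data_governance_role;
GRANT REFERENCES ON FUTURE TABLES IN DATABASE <database_name> TO ROLE data_governance_role;
GRANT REFERENCES ON FUTURE VIEWS IN DATABASE <database_name> TO ROLE data_governance_role;

-- Review everything the role has been granted so far.
SHOW GRANTS TO ROLE data_governance_role;
```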

After granting these privileges, you can go one of two routes: granting OPERATE and USAGE privileges on a warehouse to the data_governance_role, or granting specific data governance-related privileges to the data_governance_role:

GRANT OPERATE, USAGE ON WAREHOUSE <warehouse_name> TO ROLE data_governance_role;


Snowflake recommends that you avoid overly permissive grants to custom roles. The above grant lets the data_governance_role run queries on the warehouse (USAGE) and change its state, including starting, suspending, and resuming it, as well as abort queries running on it (OPERATE).

You can alternatively take a more granular approach by applying specific permissions. You can use the following set of statements to do that:

-- Masking policies
GRANT CREATE MASKING POLICY ON SCHEMA <database_name>.<schema_name> TO ROLE data_governance_role;
GRANT APPLY MASKING POLICY ON ACCOUNT TO ROLE data_governance_role;
GRANT APPLY ON MASKING POLICY <masking_policy_name> TO ROLE data_governance_role;

-- Row access policies
GRANT CREATE ROW ACCESS POLICY ON SCHEMA <database_name>.<schema_name> TO ROLE data_governance_role;
GRANT APPLY ROW ACCESS POLICY ON ACCOUNT TO ROLE data_governance_role;

-- Object tagging
GRANT CREATE TAG ON SCHEMA <database_name>.<schema_name> TO ROLE data_governance_role;
GRANT APPLY TAG ON ACCOUNT TO ROLE data_governance_role;

-- Data classification
GRANT DATABASE ROLE snowflake.governance_admin TO ROLE data_governance_role;
GRANT APPLY TAG ON ACCOUNT TO ROLE data_governance_role;
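
As a rough illustration of what these privileges allow, the sketch below creates a tag, a masking policy, and a row access policy and attaches them to a table. The object names (pii_type, email_mask, region_filter), the email and region columns, and the 'EMEA' value are all hypothetical. It assumes the statements are run with a role that holds the grants above.

```sql
-- Tag for labelling sensitive columns.
CREATE TAG IF NOT EXISTS <database_name>.<schema_name>.pii_type
  COMMENT = 'Kind of personal data held in a column';

-- Masking policy: only the governance role sees the clear value.
CREATE MASKING POLICY IF NOT EXISTS <database_name>.<schema_name>.email_mask
  AS (val STRING) RETURNS STRING ->
  CASE
    WHEN CURRENT_ROLE() = 'DATA_GOVERNANCE_ROLE' THEN val
    ELSE '***MASKED***'
  END;

-- Attach the masking policy and the tag to a column.
ALTER TABLE <database_name>.<schema_name>.<table_name>
  MODIFY COLUMN email SET MASKING POLICY <database_name>.<schema_name>.email_mask;
ALTER TABLE <database_name>.<schema_name>.<table_name>
  MODIFY COLUMN email SET TAG <database_name>.<schema_name>.pii_type = 'email';

-- Row access policy: the governance role sees all rows, other roles only one region.
CREATE ROW ACCESS POLICY IF NOT EXISTS <database_name>.<schema_name>.region_filter
  AS (region STRING) RETURNS BOOLEAN ->
  CURRENT_ROLE() = 'DATA_GOVERNANCE_ROLE' OR region = 'EMEA';
ALTER TABLE <database_name>.<schema_name>.<table_name>
  ADD ROW ACCESS POLICY <database_name>.<schema_name>.region_filter ON (region);
```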


If you haven’t already, let’s assign the data_governance_role to a database user.

Step 5. Assign the database role to the database user


Once you’re done assigning all the relevant permissions to the role, you’ll need to assign the role to the data_governance_user using the following GRANT statement:

GRANT ROLE data_governance_role TO USER data_governance_user;
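
A quick way to confirm the assignment, sketched under the assumption that the user was created without a default role or with a different one:

```sql
-- List the roles granted to the user; data_governance_role should appear.
SHOW GRANTS TO USER data_governance_user;

-- Make sure new sessions for this user start with the governance role.
ALTER USER data_governance_user SET DEFAULT_ROLE = data_governance_role;
```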

You should now be ready to connect to your Snowflake account from your data governance tool.

Step 6. Configure the Snowflake connector and begin your data governance workflow


After sorting out all the relevant privileges and assigning them to a database user via the data_governance_role, you should be able to use your data governance tool’s Snowflake connector to start interacting with the data governance-related metadata in Snowflake. If you cannot connect to Snowflake, you may have missed one of the prerequisites, so go over them again. You can also use the SnowCD (Snowflake Connectivity Diagnostic) tool to evaluate your network connectivity.
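
SnowCD works from a list of the hosts and ports that your Snowflake account uses. As a sketch, that list can be produced from within Snowflake with the system functions below. The JSON output is then saved to a file and passed to SnowCD on the machine that hosts your governance tool.

```sql
-- Hosts and ports that clients must be able to reach for this account.
SELECT SYSTEM$ALLOWLIST();

-- Equivalent list for accounts that connect over private connectivity.
SELECT SYSTEM$ALLOWLIST_PRIVATELINK();
```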

After fixing connectivity issues, you can start governing your data using your data governance tool. Most data governance tools provide you with an option to fetch governance-related metadata from Snowflake in three different ways:

  1. Ad-hoc crawl (running a manual metadata fetch using the console, the CLI, or the SDK)
  2. Scheduled crawl (running based on a predefined schedule using a cron expression)
  3. Event-based crawl (triggering a crawl from an event that the data governance tool can listen to)

You’re now ready to start working with tagging, classification, masking, and other out-of-the-box data governance features in Snowflake.


Business outcomes from Snowflake data governance

Setting up a data governance tool for Snowflake will enable you to control your data’s access, security, and business context. A data governance tool helps you with the following:

  • Enabling data ownership and access control for data stewards, master data managers, and data product owners.
  • Tagging data for better data categorization, discovery, and visibility.
  • Classifying sensitive PII and PHI data using Snowflake’s data classification functions and the data governance tool’s features.
  • Ensuring data privacy and security by allowing you to set masking policies and row- and column-level access restrictions, among other things.

A data governance tool can help your business in many more ways than the ones mentioned above. Head over to this article to know more.


How to set up data governance in Atlan for Snowflake

Atlan is an active metadata platform that takes care of data governance, in addition to data cataloging, search, and discovery for your Snowflake data platform. In addition, it gives you a rich interface to preview and query data from your Snowflake warehouses, making it a one-stop shop for all your data needs. To set up data governance for Snowflake in Atlan, you can go through the following steps:

  1. Create a role in Snowflake
  2. Create a user
    1. With a password in Snowflake
    2. With a public key in Snowflake
    3. Managed through your identity provider (IdP)
  3. Grant role to the user
  4. Choose the metadata fetching method
    1. Information schema (recommended)
    2. Account usage (alternative)
  5. Grant permissions
    1. To crawl existing assets
    2. To crawl future assets
    3. To mine query history for lineage
    4. To preview and query existing assets
    5. To preview and query future assets
  6. Allowlist the Atlan IP
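
The allowlisting step only matters if your Snowflake account restricts access by IP address. A minimal sketch of one way to do it with a user-level network policy follows. The policy name and the <tool_ip_address> placeholder are hypothetical, and the user name is carried over from the earlier steps rather than taken from Atlan's documentation. Keep in mind that a network policy set on a user takes precedence over an account-level policy for that user.

```sql
-- Allow connections for the service user only from the governance tool's IP.
CREATE NETWORK POLICY governance_tool_policy
  ALLOWED_IP_LIST = ('<tool_ip_address>');

ALTER USER data_governance_user SET NETWORK_POLICY = governance_tool_policy;
```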

