How to Set Up Data Governance for Snowflake: A Step-by-Step Guide
Snowflake is known for its robust security and governance features. You can connect Snowflake to a data governance tool to make the best of those features. The key idea is to enable analysts, engineers, and business users to centrally govern data for better security, risk awareness, and access control.
In this step-by-step guide, you’ll learn how to set up a data governance tool for Snowflake.
Snowflake has various out-of-the-box governance features. You’ll need to follow these steps to access the governance-related metadata from Snowflake:
- Create a database role for data governance
- Create a database user
- Identify objects and features you’ll need for managing data governance in Snowflake
- Assign relevant read permissions to the database role
- Assign the database role to the database user
- Configure the Snowflake connector and begin your data governance workflow
Data governance is a critical area in any data system. It deals with privacy, security, encryption, compliance, regulations, and consumer rights, among other things.
Over the years, the value of doing data governance right has become evident with several significant data breaches and fines. If you feel your current Snowflake-based data platform can’t provide the data governance controls and mechanisms you need: Talk to us.
Table of contents
- Prerequisites to setting up data governance for Snowflake
- Steps to set up data governance for Snowflake
- Business outcomes from Snowflake data governance
- How to set up data governance in Atlan for Snowflake
Prerequisites to setting up data governance for Snowflake
When setting up a data governance tool for Snowflake, first, you’ll need to ensure that you cover a few baseline things, such as networking, infrastructure, and security:
- Reachability — Make sure that your data governance tool can connect to Snowflake. Depending on your Snowflake setup, you might need to consider handling PrivateLink, NACLs, VPNs, etc., to make this work.
- Encryption — Ensure that any communication with your data governance tool is secure. You must ensure this from Snowflake’s and the data governance tool’s end.
- Infrastructure — Ensure that your data governance tool has enough computing power and memory to address data governance-related metadata processing.
Steps to set up data governance for Snowflake
The setup broadly involves creating a Snowflake role and a Snowflake user, granting the role the relevant data governance-related privileges, and assigning the role to the user. Once all that is done, you can connect your data governance tool to Snowflake via a Snowflake connector. Let’s dive right in!
Step 1. Create a database role for data governance
One of the core functions of effective data governance is to provide a robust access control layer. Snowflake provides two access control models: RBAC (role-based access control) and DAC (discretionary access control).
For the data governance tool to be able to fetch governance-related metadata from Snowflake, you need to create a role and grant it relevant access. Use the following command to create the role:
CREATE OR REPLACE ROLE data_governance_role;
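Keep in mind that CREATE OR REPLACE recreates the role if it already exists, which typically discards any grants made to it earlier. The following is a minimal sketch of a more conservative variant, assuming you want the script to be safe to re-run. Granting the role to SYSADMIN is a common Snowflake convention for custom roles and is optional here; the comment text is only illustrative.

```sql
-- Create the role only if it is missing, so re-running the script keeps existing grants
CREATE ROLE IF NOT EXISTS data_governance_role
  COMMENT = 'Role used by the data governance tool';

-- Optional: attach the custom role to the standard role hierarchy
GRANT ROLE data_governance_role TO ROLE SYSADMIN;

-- Confirm that the role exists
SHOW ROLES LIKE 'DATA_GOVERNANCE_ROLE';
```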
Step 2. Create a database user
Using the CREATE USER statement, you can create a Snowflake database user with either a password or an RSA public key for key-pair authentication. While creating the user, you can also set user-specific defaults, such as DEFAULT_ROLE and DEFAULT_WAREHOUSE, as shown in the following commands:
-- Method 1: With a password
CREATE USER data_governance_user
  PASSWORD = '<password>'
  DEFAULT_ROLE = data_governance_role
  DEFAULT_WAREHOUSE = '<warehouse_name>'
  DISPLAY_NAME = '<display_name>';

-- Method 2: With an RSA public key
CREATE USER data_governance_user
  RSA_PUBLIC_KEY = '<rsa_public_key>'
  DEFAULT_ROLE = data_governance_role
  DEFAULT_WAREHOUSE = '<warehouse_name>'
  DISPLAY_NAME = '<display_name>';
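If you go with the public key method, it can help to confirm that the key was registered and to plan for rotation. The sketch below assumes the user created above; `<new_rsa_public_key>` is a placeholder introduced here for illustration.

```sql
-- Inspect the user; the RSA_PUBLIC_KEY_FP property shows the fingerprint of the registered key
DESCRIBE USER data_governance_user;

-- Rotate the key without downtime: register a second key first,
-- switch the governance tool to it, then remove the old one
ALTER USER data_governance_user SET RSA_PUBLIC_KEY_2 = '<new_rsa_public_key>';
ALTER USER data_governance_user UNSET RSA_PUBLIC_KEY;
```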
Step 3. Identify the objects and features you’ll need for managing data governance in Snowflake
Apart from Snowflake’s internal schemas containing metadata related to Snowflake data assets, you’ll need access to the following:
- Snowflake Object Tags: you’ll need permissions to create and alter tags, to apply tags to Snowflake objects, and to monitor tags.
- Data Classification Functions: you’ll need permission to spin up Snowflake-managed serverless resources to run these functions on your data-containing objects.
- Column-level security: you’ll need permission to create and alter data masking policies that enable dynamic data masking. External functions, such as an AWS Lambda function, can also be used for masking, in which case you’ll need additional permissions to invoke the external process.
- Row-level security: you’ll also need permissions to create and alter row access policies in Snowflake.
Remember that much of the metadata related to the above features is available in the Snowflake-managed database called SNOWFLAKE. You’ll need access to its ACCOUNT_USAGE and READER_ACCOUNT_USAGE schemas to work with these features.
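To give a sense of what this metadata looks like, here is a small sketch of two queries against views in SNOWFLAKE.ACCOUNT_USAGE. It assumes the querying role already has read access to the SNOWFLAKE database; note that these views can lag behind the live state of the account by up to a couple of hours.

```sql
-- Tags currently applied to objects and columns
SELECT tag_name, tag_value, domain,
       object_database, object_schema, object_name, column_name
FROM SNOWFLAKE.ACCOUNT_USAGE.TAG_REFERENCES;

-- Masking and row access policies, and the objects they are attached to
SELECT policy_name, policy_kind,
       ref_database_name, ref_schema_name, ref_entity_name, ref_column_name
FROM SNOWFLAKE.ACCOUNT_USAGE.POLICY_REFERENCES;
```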
Step 4. Assign relevant read permissions to the database role
Based on the information above, you’ll need to either assign specific privileges to the data_governance_role or have a system-defined role, such as ACCOUNTADMIN, grant it imported privileges on the SNOWFLAKE database. The data governance tool will need access to the INFORMATION_SCHEMA along with Snowflake’s internal schemas, as mentioned in the previous section. To assign those privileges, you can use the following set of statements:
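USE ROLE ACCOUNTADMIN;

-- Read access to Snowflake's internal metadata (the shared SNOWFLAKE database)
GRANT IMPORTED PRIVILEGES ON DATABASE SNOWFLAKE TO ROLE data_governance_role;

-- Access to the database you want to govern
GRANT USAGE ON DATABASE <database_name> TO ROLE data_governance_role;
GRANT USAGE ON ALL SCHEMAS IN DATABASE <database_name> TO ROLE data_governance_role;

-- Explicit read access to ACCOUNT_USAGE. IMPORTED PRIVILEGES may already cover this,
-- and Snowflake may reject per-object grants on a shared database; skip these if so.
GRANT SELECT ON ALL TABLES IN SCHEMA SNOWFLAKE.ACCOUNT_USAGE TO ROLE data_governance_role;
GRANT SELECT ON ALL VIEWS IN SCHEMA SNOWFLAKE.ACCOUNT_USAGE TO ROLE data_governance_role;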
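One gap worth being aware of: a grant ON ALL SCHEMAS only applies to schemas that exist when the statement runs. As a hedged sketch, assuming you also want the role to see schemas created later, a future grant can be added, followed by a quick review of what the role holds:

```sql
USE ROLE ACCOUNTADMIN;

-- ON ALL SCHEMAS covers only existing schemas; a future grant covers new ones
GRANT USAGE ON FUTURE SCHEMAS IN DATABASE <database_name> TO ROLE data_governance_role;

-- Review everything the role has been granted so far
SHOW GRANTS TO ROLE data_governance_role;
```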
After granting these privileges, you can go one of two routes: granting OPERATE and USAGE privileges on a warehouse to the data_governance_role, or granting specific data governance-related permissions to the role. The first route uses the following statement:
GRANT OPERATE, USAGE ON WAREHOUSE <warehouse_name> TO ROLE data_governance_role;
Snowflake recommends that you don’t give custom roles overly permissive grants. The above grant lets the data_governance_role run queries on the warehouse (USAGE) and change its state, including starting, suspending, and resuming it (OPERATE).
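If the governance tool only needs to run metadata queries, a narrower grant may be enough. The sketch below assumes the warehouse has auto-resume enabled, so the role never has to start it explicitly.

```sql
-- USAGE alone lets the role run queries on the warehouse
GRANT USAGE ON WAREHOUSE <warehouse_name> TO ROLE data_governance_role;

-- Review which roles hold which privileges on the warehouse
SHOW GRANTS ON WAREHOUSE <warehouse_name>;
```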
You can alternatively take a more granular approach by applying specific permissions. You can use the following set of statements to do that:
-- Masking policies
GRANT CREATE MASKING POLICY ON SCHEMA <database_name>.<schema_name> TO ROLE data_governance_role;
GRANT APPLY MASKING POLICY ON ACCOUNT TO ROLE data_governance_role;
GRANT APPLY ON MASKING POLICY <masking_policy_name> TO ROLE data_governance_role;

-- Row access policies
GRANT CREATE ROW ACCESS POLICY ON SCHEMA <database_name>.<schema_name> TO ROLE data_governance_role;
GRANT APPLY ROW ACCESS POLICY ON ACCOUNT TO ROLE data_governance_role;

-- Object tagging
GRANT CREATE TAG ON SCHEMA <database_name>.<schema_name> TO ROLE data_governance_role;
GRANT APPLY TAG ON ACCOUNT TO ROLE data_governance_role;

-- Data classification (also relies on APPLY TAG ON ACCOUNT, granted above)
GRANT DATABASE ROLE snowflake.governance_admin TO ROLE data_governance_role;
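To illustrate what these granular privileges make possible, here is a minimal sketch of a tag, a masking policy, and a row access policy. All object names (`pii_level`, `email_mask`, `region_filter`, the `customers` table with its `email` and `region` columns, and the `PII_READER` and `GLOBAL_ANALYST` roles) are hypothetical and only serve the example. Masking and row access policies generally require Snowflake Enterprise Edition or higher.

```sql
USE ROLE data_governance_role;
USE SCHEMA <database_name>.<schema_name>;

-- Object tagging: define a tag and apply it to a column
CREATE TAG IF NOT EXISTS pii_level ALLOWED_VALUES 'none', 'low', 'high';
ALTER TABLE customers MODIFY COLUMN email SET TAG pii_level = 'high';

-- Dynamic data masking: only a privileged role sees the raw value
CREATE MASKING POLICY IF NOT EXISTS email_mask AS (val STRING) RETURNS STRING ->
  CASE WHEN CURRENT_ROLE() IN ('PII_READER') THEN val ELSE '*** MASKED ***' END;
ALTER TABLE customers MODIFY COLUMN email SET MASKING POLICY email_mask;

-- Row-level security: other roles only see rows for one region
CREATE ROW ACCESS POLICY IF NOT EXISTS region_filter AS (region STRING) RETURNS BOOLEAN ->
  CURRENT_ROLE() = 'GLOBAL_ANALYST' OR region = 'EMEA';
ALTER TABLE customers ADD ROW ACCESS POLICY region_filter ON (region);
```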
If you haven’t already, let’s assign the data_governance_role to a database user.
Step 5. Assign the database role to the database user
Once you’re done assigning all the relevant permissions to the role, you’ll need to assign the role to the data_governance_user using the following statement:
GRANT ROLE data_governance_role TO USER data_governance_user;
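Before moving on, you may want to verify the result. A short sketch, assuming the role and user names used in this guide:

```sql
-- Roles assigned to the user; data_governance_role should be listed
SHOW GRANTS TO USER data_governance_user;

-- Privileges held by the role itself
SHOW GRANTS TO ROLE data_governance_role;
```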
You should now be ready to connect to your Snowflake account from your data governance tool.
Step 6. Configure the Snowflake connector and begin your data governance workflow
After sorting out all the relevant privileges and assigning them to a database user via the data_governance_role, you should be able to use your data governance tool’s Snowflake connector to start interacting with the data governance-related metadata in Snowflake. If you cannot connect to Snowflake, you may have missed one of the prerequisites, so go over them again. You can also use the SnowCD (Snowflake Connectivity Diagnostic) tool to evaluate your network connectivity.
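SnowCD works from a list of the endpoints your account depends on, which Snowflake can produce for you. As a rough sketch, the queries below return that list as JSON; you would save the output to a file and pass it to SnowCD. The PrivateLink variant only applies if your account uses private connectivity.

```sql
-- Hostnames and ports that clients of this account must be able to reach
SELECT SYSTEM$ALLOWLIST();

-- Equivalent list for accounts reached over PrivateLink
SELECT SYSTEM$ALLOWLIST_PRIVATELINK();
```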
After fixing connectivity issues, you can start governing your data using your data governance tool. Most data governance tools provide you with an option to fetch governance-related metadata from Snowflake in three different ways:
- Ad-hoc crawl (running a manual metadata fetch using the console, the CLI, or the SDK)
- Scheduled crawl (running based on a predefined schedule using a cron expression)
- Event-based crawl (triggering a crawl from an event that the data catalog can listen to)
You’re now ready to start working with tagging, classification, masking, and other out-of-the-box data governance features in Snowflake.
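As one possible first exercise, the sketch below runs Snowflake’s classification on a single table and then lists the tags on its columns. It assumes the role holds the classification-related grants from Step 4; `<table_name>` is a placeholder introduced here, and the exact options accepted by SYSTEM$CLASSIFY may differ by Snowflake version, so treat this as a starting point and not a definitive recipe.

```sql
USE ROLE data_governance_role;

-- Classify a table and let Snowflake apply the suggested system tags
CALL SYSTEM$CLASSIFY('<database_name>.<schema_name>.<table_name>', {'auto_tag': true});

-- List the tags now set on the table's columns
SELECT *
FROM TABLE(
  <database_name>.INFORMATION_SCHEMA.TAG_REFERENCES_ALL_COLUMNS(
    '<database_name>.<schema_name>.<table_name>', 'table'
  )
);
```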
Business outcomes from Snowflake data governance
Setting up a data governance tool for Snowflake will enable you to control your data’s access, security, and business context. A data governance tool helps you with the following:
- Enabling data ownership and access control for data stewards, master data managers, and data product owners.
- Tagging data for better data categorization, discovery, and visibility.
- Classifying sensitive PII and PHI data based on Snowflake’s external functions and the data governance tool’s features.
- Ensuring data privacy and security by allowing you to set masking policies and row- and column-based access restrictions, among other things.
A data governance tool can help your business in many more ways than the ones mentioned above. Head over to this article to know more.
How to set up data governance in Atlan for Snowflake
Atlan is an active metadata platform that takes care of data governance, in addition to data cataloging, search, and discovery for your Snowflake data platform. In addition, it gives you a rich interface to preview and query data from your Snowflake warehouses, making it a one-stop shop for all your data needs. To set up data governance for Snowflake in Atlan, you can go through the following steps:
- Create a role in Snowflake
- Create a user
- Grant role to the user
- Choose the metadata fetching method
- Grant permissions
- Allowlist the Atlan IP