The ever-increasing volume of data created every day, the growing number of disparate data sources, the need to comply with a growing body of privacy regulations, and the diversity of data users all call for data access control as a vital component of data governance.
The challenge lies in balancing the need to protect the security and privacy of data with the business need for easier and faster access to data for analysis.
This article discusses how Snowflake and Atlan help achieve this balance through data access frameworks, models, and best practices.
What is access control?
Access control is a set of frameworks and rules for allowing or restricting the ability to access, read, update, create, and delete data objects in a database or data warehouse. The data objects include databases, schemas, tables, columns, and queries.
Examples of data that need access control:
Personally identifiable information (PII): Name, phone number, address, social security number, email, passwords, and IP addresses.
Multimedia: Photos, audio, video, and documents.
Protected health information (PHI): Medical record number, test results, health vitals, and insurance information.
Financial information: Credit card and bank account numbers, and credit scores.
Data access control in Snowflake
Snowflake provides out-of-the-box granular access control features that enable you to manage who can access what data objects and what actions/operations can be done on those data objects.
Access control framework
Snowflake provides a foundational model of concepts and components over which one can build and manage access controls.
Snowflake supports two of the most common data access models:
Role-based access control (RBAC): Assign permissions and privileges to a set of users grouped under a specific business role/function/domain.
Discretionary access control (DAC): Allows users/data owners to grant permissions and privileges to other users.
Key components of Snowflake access control
The key concepts in managing access control are:
- Securable objects
- Privileges
- Roles
- Role hierarchy and privilege inheritance
In Snowflake, all securable objects have privileges, privileges are granted to roles, and roles are then granted to users and to other roles.
Securable objects are the essential components that users interact with in a Snowflake account, and they are organized in a hierarchy. These include warehouses, databases, schemas, roles, tables, columns, views, tasks, and stored procedures.
Privileges are the operations or actions that can be performed on a particular Snowflake object. The privileges in Snowflake are grouped hierarchically into thirty scopes, or categories, such as:
- Global privileges: Permission to create users, roles, integrations, tasks, and monitor usage.
- Database privileges: Permission to create a schema, modify the database, and manage ownership.
- Schema privileges: Permission to modify the schema, create tables and views, and set row access and masking policies.
- Table privileges: Permission to execute SQL statements like SELECT, INSERT, DELETE, UPDATE.
Roles are used to grant and revoke access privileges on securable objects. Think of a role as a package of privileges rather than as a group of users. Roles are then assigned to users, which enables them to operate on data assets in a secure and compliant manner. A user can be assigned multiple roles and can switch between them as requirements change.
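The relationship described above (privileges on objects are granted to roles, and roles are granted to users) can be sketched as a small Python helper that renders the corresponding Snowflake `GRANT` statements. This is only an illustrative sketch: the `Role` class, the `grant_statements` function, and the `analyst`, `alice`, and `sales.public.orders` names are hypothetical and do not come from Snowflake or Atlan.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Role:
    """A named package of privileges on securable objects (hypothetical model)."""

    name: str
    # Each entry is (privilege, object_type, object_name),
    # e.g. ("SELECT", "TABLE", "sales.public.orders").
    privileges: list[tuple[str, str, str]] = field(default_factory=list)


def grant_statements(role: Role, users: list[str]) -> list[str]:
    """Render SQL that creates the role, grants its privileges, and assigns it to users."""
    statements = [f"CREATE ROLE IF NOT EXISTS {role.name};"]
    for privilege, object_type, object_name in role.privileges:
        statements.append(
            f"GRANT {privilege} ON {object_type} {object_name} TO ROLE {role.name};"
        )
    for user in users:
        statements.append(f"GRANT ROLE {role.name} TO USER {user};")
    return statements


# Example role: read-only access to one table (all names are made up).
analyst = Role(
    name="analyst",
    privileges=[
        ("USAGE", "DATABASE", "sales"),
        ("USAGE", "SCHEMA", "sales.public"),
        ("SELECT", "TABLE", "sales.public.orders"),
    ],
)

for statement in grant_statements(analyst, users=["alice"]):
    print(statement)
```

Keeping privileges on the role rather than on the user means that onboarding another analyst only needs one more `GRANT ROLE ... TO USER ...` statement.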
Role hierarchy and privilege inheritance
In Snowflake, a role (which has privileges) can inherit privileges from another role. This essentially allows users to combine privileges from multiple roles. A role hierarchy provides granular control, reduces duplication of roles, encourages reuse of roles, and helps minimize access control complexity.
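To make privilege inheritance concrete, here is a minimal Python sketch that computes the effective privileges of a role by walking the roles granted to it. The role names, the privilege strings, and the `effective_privileges` function are assumptions made for illustration; Snowflake resolves inheritance internally and exposes no such function.

```python
from __future__ import annotations

# Privileges granted directly to each role (hypothetical names).
DIRECT_PRIVILEGES = {
    "reader": {"SELECT on sales.public.orders"},
    "writer": {"INSERT on sales.public.orders"},
    "steward": {"UPDATE on sales.public.orders"},
}

# Maps a role to the roles granted to it. The key role inherits
# everything held by the roles in its list, transitively.
GRANTED_ROLES = {
    "writer": ["reader"],
    "steward": ["writer"],
}


def effective_privileges(role: str) -> set[str]:
    """Collect a role's own privileges plus those of every role granted to it."""
    privileges: set[str] = set()
    seen: set[str] = set()
    stack = [role]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guard against revisiting a role reachable by two paths
        seen.add(current)
        privileges |= DIRECT_PRIVILEGES.get(current, set())
        stack.extend(GRANTED_ROLES.get(current, []))
    return privileges


print(sorted(effective_privileges("steward")))
```

In this sketch `steward` ends up with all three privileges while `reader` keeps only its own, which is how a hierarchy avoids redefining the same grants on every role.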
Challenges in managing access control in Snowflake
- Access control management for non-technical data stewards
- Lack of user-interface layer for managing access
- Absence of data lineage
- Access control for data from multiple sources (non-Snowflake)
- Challenging access request management
Access control management for non-technical data stewards
Snowflake access control is managed largely by writing SQL scripts. This makes it harder for non-technical data stewards to manage access.
Lack of user-interface layer for managing access
Managing access control at scale for large teams across geographies and functions can be difficult without a user-interface layer. A UI layer simplifies access management tasks such as creating roles and groups and handling requests to grant or revoke access.
Absence of data lineage
Apart from tracking the journey of the data asset from the source to the BI tools, lineage can also be leveraged to automatically propagate access policies downstream.
Access control for data from multiple sources (non-Snowflake)
Data (and metadata), especially on the cloud, proliferates across multiple data sources, tools, and processes, ranging from ingestion and ETL to data quality and business intelligence (BI). A centralized access control platform is therefore a must-have for complete governance.
Challenging access request management
As your team scales, you need a centralized and easy-to-use system for managing access requests. This gives you a complete picture of who has access to what data, which is an essential requirement for compliance with data regulations.
Snowflake access controls with Atlan: Flexible and scalable
As organizations are trying to transition from a controlling, bureaucratic form of data governance to a collaborative one, access control management has become one of the most important components of an effective data governance program.
With user groups, custom access policies, Personas, Purposes, and data lineage, Atlan makes access control for Snowflake easy at scale.
Let’s now look at the key access control capabilities of Atlan:
Access to Snowflake metadata and data is restricted by default in Atlan. Access in Atlan is granted and controlled through several methods, starting with:
- User Roles
- Access Policies
Selecting a method depends on the scale and granularity at which one wants to control access. We’ll discuss each of these methods in brief:
Roles are the most general and broad permissions a user can have within Atlan. There are three default roles in Atlan:
- Admin who can set up and manage connections to Snowflake, manage users, classifications, and access policies.
- Member who can search, view, query, and update information about a data asset.
- Guest who can only search and view information about the data.
Access policies allow or restrict access to certain data and metadata in Snowflake. They enable you to go granular with access control. There are three kinds of access policies in Atlan:
Metadata policies control access and privileges to Snowflake metadata. The privileges include updating descriptions, certifications, owners, and classifications, and viewing an asset’s activity log, lineage, and SQL queries.
Data policies control access and privileges to Snowflake data. The privileges include querying and previewing the data, and data masking.
Data glossary policies control privileges to glossary metadata. The privileges include creating and updating glossary definitions, adding certifications and owners, associating related terms, and adding classifications.
How Atlan helps to scale Snowflake access control
One of Atlan’s strengths is that it helps handle access control automatically at scale as you bring in new users, teams, and data sources. This is done using:
Personas are a way to control access for users who belong to a group or domain, such as sales and marketing, data engineering, or data analysis. With Personas, you can handle scale by:
- Curating data assets relevant to the domain/team
- Controlling privileges/actions the users have on the asset (creating, updating, querying, etc.)
Purposes are a way to control access to data assets that are tagged with a particular classification. Purposes add granularity to Personas. For example, a Persona might have permission to query and preview the Snowflake data, but the users in the Persona must not be able to access sensitive columns. This is controlled using a Purpose built around the PII classification, which covers all sensitive data assets tagged as PII. So the next time you create a new Persona, you don’t have to create a new access policy for sensitive data.
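The interplay between a Persona and a Purpose in the example above can be modeled roughly as follows. This is a simplified sketch, not Atlan's API: the `Persona` and `Purpose` classes, the `is_allowed` function, the column names, and the rule that a Purpose restriction takes precedence over a Persona grant are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Persona:
    """Hypothetical model: a team's curated assets and the actions allowed on them."""

    name: str
    assets: set[str]
    actions: set[str]


@dataclass
class Purpose:
    """Hypothetical model: restrictions that apply to every asset with a classification."""

    classification: str
    denied_actions: set[str]


def is_allowed(
    persona: Persona,
    purposes: list[Purpose],
    asset: str,
    asset_tags: set[str],
    action: str,
) -> bool:
    """Allow an action only if the Persona grants it and no Purpose restricts it."""
    if asset not in persona.assets or action not in persona.actions:
        return False
    for purpose in purposes:
        # Assumption: a restriction tied to a classification overrides the Persona grant.
        if purpose.classification in asset_tags and action in purpose.denied_actions:
            return False
    return True


analysts = Persona(
    name="data analysts",
    assets={"orders.amount", "customers.email"},
    actions={"query", "preview"},
)
pii = Purpose(classification="PII", denied_actions={"query", "preview"})

print(is_allowed(analysts, [pii], "orders.amount", set(), "query"))
print(is_allowed(analysts, [pii], "customers.email", {"PII"}, "query"))
```

Because the restriction is attached to the classification rather than to the Persona, a newly created Persona would be covered by the same `pii` rule without any extra policy.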
Atlan protects sensitive data through data masking. Adding data masking to any access policy automatically propagates masking to any asset that has been tagged with that policy. Data masking methods supported in Atlan are: Show first/last, hash, nullify, and redact.
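As a rough illustration of what the four masking methods named above might do to a string value, consider the following Python sketch. The function names, the default of four visible characters, the `*` mask character, and the choice of SHA-256 for hashing are assumptions; the source does not describe how Atlan implements these methods or what their exact output looks like.

```python
import hashlib


def show_first(value: str, n: int = 4, mask_char: str = "*") -> str:
    """Keep the first n characters and mask the rest."""
    return value[:n] + mask_char * max(len(value) - n, 0)


def show_last(value: str, n: int = 4, mask_char: str = "*") -> str:
    """Keep the last n characters and mask the rest."""
    visible = value[-n:] if n > 0 else ""
    return mask_char * max(len(value) - n, 0) + visible


def hash_value(value: str) -> str:
    """Replace the value with a SHA-256 hex digest (algorithm chosen for illustration)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def nullify(value: str) -> None:
    """Replace the value with a null."""
    return None


def redact(value: str, mask_char: str = "*") -> str:
    """Mask every character of the value."""
    return mask_char * len(value)


card_number = "4111111111111111"  # made-up sample value
print(show_last(card_number))
print(redact(card_number))
```

Hashing keeps values joinable (equal inputs give equal digests) without revealing them, whereas nullifying and redacting discard the value entirely.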
Automatic classification propagation through data lineage
Atlan parses the SQL queries and creates lineage visualization for Snowflake data assets. Atlan leverages lineage to automatically propagate classifications to all derived assets downstream. Propagation combined with Purposes helps scale Snowflake data access management.
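Downstream propagation of a classification can be pictured as a graph traversal over lineage. The sketch below assumes lineage is available as a simple mapping from each asset to the assets derived from it; the asset names, the `LINEAGE` mapping, and the `propagate_classification` function are hypothetical and are not taken from Atlan.

```python
from __future__ import annotations

from collections import deque

# Hypothetical lineage: each asset maps to the assets derived from it.
LINEAGE = {
    "raw.customers": ["staging.customers"],
    "staging.customers": ["marts.customer_orders", "marts.customer_ltv"],
    "raw.orders": ["marts.customer_orders"],
}


def propagate_classification(
    lineage: dict[str, list[str]],
    source: str,
    classification: str,
) -> dict[str, set[str]]:
    """Tag the source asset and every asset downstream of it (breadth-first)."""
    tags: dict[str, set[str]] = {}
    visited = {source}
    queue = deque([source])
    while queue:
        asset = queue.popleft()
        tags.setdefault(asset, set()).add(classification)
        for child in lineage.get(asset, []):
            if child not in visited:
                visited.add(child)
                queue.append(child)
    return tags


for asset, asset_tags in propagate_classification(LINEAGE, "raw.customers", "PII").items():
    print(asset, sorted(asset_tags))
```

Here tagging `raw.customers` as PII also tags the staging table and both marts derived from it, while `raw.orders`, which is upstream of one mart but not downstream of the source, stays untagged.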
Atlan handles all change/update requests to Snowflake metadata in a centralized location. This helps streamline and track request approvals across multiple teams and data sources.
Atlan: A Snowflake validated data governance solution
If you are looking to deploy best-in-class data access governance for Snowflake without having to compromise on data democratization, try Atlan.
Atlan is the first data catalog and metadata management solution validated by Snowflake’s technology validation program.
In the words of Bob Muglia, former CEO of Snowflake:
"Atlan’s unique, collaboration-first approach for the modern data stack helps to break down organizational silos and empower cross-functional teams to work together to make better business decisions."
Getting started with Snowflake access control with Atlan:
- How to crawl Snowflake metadata
- How to mine Snowflake metadata
- What does Atlan crawl from Snowflake?
- How to attach a classification for Snowflake data assets?
- How do I control access to Snowflake metadata and data?
Snowflake data access control: Related reads
- Data catalog for Snowflake data assets
- Automated data lineage for Snowflake
- Personalized data discovery for Snowflake data assets
- Snowflake data dictionary — Documentation for your database
- Snowflake data governance — Data discovery, security & access policies