Snowflake X Azure: Practical Guide For Deployment
Share this article
Snowflake is a data cloud platform that enables you to deploy different data architecture patterns, such as data warehouses and data lakes, with the help of various storage, processing, and consumption features. You can deploy Snowflake in your preferred cloud platform.
With over 200 cloud services, Azure is one of the major cloud platforms. Being a Microsoft product, Azure has the advantage of seamless integration with the vast landscape of Microsoft products and services, Microsoft 365 being a prime example.
Azure is one of the three cloud platforms you can deploy Snowflake. This article will explain how cloud products build your data ecosystem and securely enable data engineering, data science, and AI/ML use cases. Throughout the article, we’ll take a look at the following themes:
- Deploying Snowflake on Azure
- Staging data in Azure Containers
- Loading and unloading data in Azure Containers
- Using Snowpipe with Azure Containers
- Using external Azure Functions
- Connecting Snowflake to PowerBI Desktop for visualization
- Using the Snowflake Business Critical Edition on Azure
Let’s get into it!
Modern data problems require modern solutions - Try Atlan, the data catalog of choice for forward-looking data teams! 👉 Book your demo today
Table of contents #
- Deploying Snowflake on Azure
- Staging data in Azure Containers
- Loading and unloading data in Azure Containers
- Using Snowpipe with Azure Containers
- Using Azure Functions with Snowflake
- Connecting Snowflake to Power BI Desktop for visualization
- Using the Snowflake Business Critical Edition on Azure
- Summary
- Related reads
Deploying Snowflake on Azure #
Snowflake has been generally available on Azure since mid-2018. As of October 2023, Snowflake is available in 19 Azure regions, including a few SnowGov regions in the US. Snowflake pricing depends on the Snowflake Edition and the region where you want to deploy it. Here’s what the Snowflake pricing page looks like in Azure’s Virginia region:
Before choosing a region, you must know that not all regions support all Snowflake features. This is especially important when your business requires specific security and compliance features, private networking and data residency being a couple of them. You need to consider all these factors when deploying Snowflake in Azure.
Staging data in Azure Containers #
Azure’s object-based storage comes in the form of Azure Blob Storage, which, in turn, is realized by Azure Containers. Containers are like buckets in Amazon S3 or Google Cloud Storage. If you already have existing data infrastructure on Azure, you might want to move data into Snowflake for more structured consumption use cases, such as enterprise reporting and data analytics.
You can load and unload data from and to other data systems into Snowflake stages, which can be Snowflake-managed internal stages to load data into a Snowflake table from a local file system, and external stages to load data from object-based storage too. The unloading process also involves the same steps but in reverse.
If your data is in Azure Containers, you can create an external stage for your container to load data into Snowflake. You must create a storage integration with your Azure credentials. You can create a storage integration using the following SQL statement in Snowflake:
CREATE STORAGE INTEGRATION <storage_integration_name>
TYPE = EXTERNAL_STAGE
STORAGE_PROVIDER = 'AZURE'
ENABLED = TRUE
AZURE_TENANT_ID = '<azure_tenant_id>'
STORAGE_ALLOWED_LOCATIONS = ('azure://<account>.blob.core.windows.net/<container>/<path>/', 'azure://<account>.blob.core.windows.net/<container>/<path>/')
[ STORAGE_BLOCKED_LOCATIONS = ('azure://<account>.blob.core.windows.net/<container>/<path>/', 'azure://<account>.blob.core.windows.net/<container>/<path>/') ]
Using the storage integration you just created, you can create a stage by specifying the Azure Container path and the format of the files you want to stage into Snowflake using the following SQL statement:
CREATE STAGE <stage_name>
STORAGE_INTEGRATION = <storage_integration_name>
URL = 'azure://myaccount.blob.core.windows.net/mycontainer/load/files/'
FILE_FORMAT = my_csv_format;
Once that’s done, you can load data from and unload data into Azure Containers. Let’s look at that in a bit more detail.
Loading and unloading data in Azure Containers #
Once the Azure Container external stage is created, you can use the COPY INTO
command for bulk-loading the data into a table, as shown in the SQL statement below:
COPY INTO <table_name>
FROM @<stage_name>
PATTERN='.*address.*.csv';
The pattern would get all the address.*.csv
files, such as address.part001.csv
, address.part002.csv
, and so on, and load them in parallel into the specified Snowflake table.
You don’t necessarily have to create a separate stage for unloading data. Still, it’s considered best practice to create a unload directory to unload
the data from a table, as shown in the SQL statement below:
COPY INTO @<stage_name>/unload/ from address;
After this, you can run the GET
statement to unload the data from the stage to your local file system:
GET @mystage/unload/address.csv.gz file:///address.csv.gz
The Snowflake + Azure storage integration also allows you to run queries on an external stage without getting the data into Snowflake. To do this, you’ll need to create external tables in Snowflake, which collect table-level metadata and store that in Snowflake’s data catalog.
When you create external tables that refer to an Azure Container external stage, you can set that external table to refresh automatically using Azure Event Grid events from your Azure Container.
Using Snowpipe with Azure Containers #
Snowpipe is a managed micro-batching data loading service offered by Snowflake. Like bulk data loading from Azure Containers, you can also load data in micro-batches using Snowpipe. The Snowpipe + Azure integration also rests upon the storage integration and external stage construct in Snowflake.
To trigger Snowpipe data loads, you can use Azure Event Grid messages for Blob Storage events. An additional step here is to create a notification integration using the following SQL statement:
CREATE NOTIFICATION INTEGRATION <notification_integration_name>
TYPE = QUEUE
NOTIFICATION_PROVIDER = AZURE_STORAGE_QUEUE
AZURE_STORAGE_QUEUE_PRIMARY_URI = '<queue_URL>'
AZURE_TENANT_ID = '<directory_ID>'
ENABLED = TRUE;
Using the same notification integration, you need to create a pipe and then use the following SQL statement:
CREATE PIPE <pipe_name>
AUTO_INGEST = TRUE
INTEGRATION = '<notification_integration_name>'
AS
<copy_statement>;
Using Azure Functions with Snowflake #
In addition to external tables where you don’t have to move your data into Snowflake, you can also use external functions where you don’t have to move your business logic into Snowflake. An external function or a remote service gives you the option to offload certain aspects of your data platform to certain services in Azure, the most prominent of which is Azure Functions, the serverless compute option in Azure.
To integrate Snowflake with Azure Functions, you have to create an API integration, as shown in the SQL statement below:
CREATE OR REPLACE API INTEGRATION <api_integration_name>
API_PROVIDER = azure_api_management
AZURE_TENANT_ID = '<azure_tenant_id>'
AZURE_AD_APPLICATION_ID = '<azure_application_id>'
API_ALLOWED_FIXES = ('https://')
ENABLED = TRUE;
Snowflake uses Azure API Managment service as a proxy service to send requests for Azure Function invocation, which is why you have the API_PROVIDER
listed as azure_api_management
.
Several use cases exist for external functions to enrich, clean, protect, and mask data. You can call other third-party APIs from these external functions or write your code within the external function. One typical example of external functions in Snowflake is the external tokenization of data masking, where you can use Azure Functions as the tokenizer.
Connecting Snowflake to Power BI Desktop for visualization #
Microsoft Power BI is one of the most widely used enterprise BI tools. Chances are, if you are already using Azure, you’ll be using Power BI too. The great thing about Snowflake’s integration with Power BI is that it provides native OAuth 2.0 SSO connectivity.
This means that you can log into Snowflake directly from Power BI without the need for an on-prem Power BI Gateway, which simplifies the connectivity a great deal without compromising on security.
The Snowflake + Power BI integration relies upon a security integration, which you can set up using the following SQL statement:
CREATE SECURITY INTEGRATION <security_integration_name>
TYPE = external_oauth
EXTERNAL_OAUTH_TYPE = azure
EXTERNAL_OAUTH_ISSUER = '<azure_ad_issuer>'
EXTERNAL_OAUTH_JWS_KEYS_URL = 'https://login.windows.net/common/discovery/keys'
EXTERNAL_OAUTH_AUDIENCE_LIST = ('https://analysis.windows.net/powerbi/connector/Snowflake')
EXTERNAL_OAUTH_TOKEN_USER_MAPPING_CLAIM = 'upn'
EXTERNAL_OAUTH_SNOWFLAKE_USER_MAPPING_ATTRIBUTE = 'login_name'
ENABLED = TRUE;
Read more about this security integration here.
Using the Snowflake Business Critical Edition on Azure #
If your business requires the highest level of data security and protection in the cloud, you should use Snowflake’s Business Critical Edition. This edition comes compliance-ready for regulations like HIPAA and HITRUST CSF.
Moreover, you can use the Azure Private Link service to enable private networking between Snowflake and the rest of your Azure infrastructure.
Summary #
This article took you through deploying Snowflake on Azure and using services, such as Azure Blob Storage, Azure Functions, Azure API Management, Azure Private Link, and Azure Event Grid, to enhance some of the Snowflake-native features. The article also walked you through the different ways of offloading part of your workload to Azure for integration, privacy, and security, especially with Azure Active Directory and Power BI SSO.
Your data infrastructure on Azure with Snowflake at the center is excellent because you can gather metadata using Snowflake’s data governance services, like Azure Data Catalog and Microsoft Purview. This data can then flow into Atlan, an active metadata platform, enabling your business to realize the true potential of using metadata.
You can learn more about the Snowflake + Azure partnership from Snowflake’s official website and the Snowflake + Atlan partnership here.
Snowflake on Azure: Related reads #
- Snowflake Data Mesh: Step-by-Step Setup Guide
- Snowflake Data Catalog: Importance, Benefits, Native Capabilities & Evaluation Guide
- Snowflake Data Governance: Features, Frameworks & Best practices
- Snowflake Metadata Management: Importance, Challenges, and Identifying The Right Platform
- Snowflake Data Lineage: A Step-by-Step How to Guide
- Snowflake Data Dictionary: Documentation for Your Database
Share this article