Polaris Catalog from Snowflake: Everything We Know So Far

Updated June 13th, 2024

Polaris Catalog is Snowflake’s open catalog for Apache Iceberg, which Snowflake has said it will open-source. Currently, it’s interoperable with Amazon Web Services (AWS), Confluent, Dremio, Google Cloud, Microsoft Azure, and Salesforce.

This article will look at the core capabilities of Polaris Catalog from Snowflake and address some of the most commonly asked questions.



Table of contents #

  1. What is Polaris Catalog?
  2. Polaris Catalog: Core capabilities
  3. What’s next?
  4. Polaris Catalog: Frequently Asked Questions
  5. Polaris Catalog: Related Reads

What is Polaris Catalog? #

On June 3, 2024, Snowflake announced Polaris Catalog, a vendor-neutral, open data catalog for Apache Iceberg, an open-source table format for large analytic workloads.

Polaris Catalog builds on the open standard of a REST protocol - Source: Christian Kleinerman, EVP of product for Snowflake.

Polaris Catalog builds on the open standard of a REST protocol created by the Iceberg community. The goal is to support interoperability across engines, without any vendor lock-in.

“Polaris Catalog provides an open standard for users to access and retrieve data using any engine of choice that supports the Iceberg Rest API, including Apache Flink, Apache Spark, Dremio, Python, Trino and others.”
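To make the REST-based access pattern concrete, here is a minimal sketch of how an engine might address a catalog that implements the Iceberg REST API. The `/v1/config` and `/v1/{prefix}/namespaces/{namespace}/tables/{table}` routes come from the Iceberg REST specification. The base URL, prefix, namespace, and table names are hypothetical, and the sketch treats the optional prefix as always present.

```python
from urllib.parse import quote

# In Iceberg REST paths, the levels of a multi-level namespace are joined
# with the ASCII unit separator (0x1F) and then percent-encoded.
NAMESPACE_SEPARATOR = "\x1f"


def config_url(base_uri: str) -> str:
    """URL an engine typically calls first to fetch catalog configuration."""
    return f"{base_uri.rstrip('/')}/v1/config"


def table_url(base_uri: str, prefix: str, namespace: list[str], table: str) -> str:
    """URL used to load one table's metadata through the catalog."""
    encoded_ns = quote(NAMESPACE_SEPARATOR.join(namespace), safe="")
    return (
        f"{base_uri.rstrip('/')}/v1/{quote(prefix, safe='')}"
        f"/namespaces/{encoded_ns}/tables/{quote(table, safe='')}"
    )


# Hypothetical endpoint and names, for illustration only.
BASE = "https://catalog.example.com/api/catalog"

print(config_url(BASE))
print(table_url(BASE, "analytics", ["sales", "emea"], "orders"))
```

Because every engine resolves tables through the same routes, the catalog becomes the single place where table locations are looked up, whichever engine issues the request.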

Polaris Catalog for Apache Iceberg — vendor-neutral and interoperable across engines - Source: Polaris Catalog.

You can host Polaris Catalog on Snowflake-managed infrastructure or on your infrastructure of choice. The catalog will also adopt Snowflake Horizon’s security and governance capabilities to provide enterprise-grade security for your data.

Snowflake states that Polaris Catalog will be “both open-sourced in the next 90 days and available to run in public preview in Snowflake infrastructure soon.”


Polaris Catalog: Core capabilities #

Because Snowflake announced the catalog only recently, its capabilities and features are likely to keep evolving.

An essential development to note is that Polaris intends to make the existing REST protocol for Apache Iceberg suitable for enterprise use cases. IDC’s research vice president, Stewart Bond, observes that:

Polaris Catalog adds enterprise-grade capabilities - Source: IDC’s research vice president Stewart Bond on the capabilities of Polaris Catalog.

As of now, Polaris Catalog is geared to offer the following:

  • Cross-engine read and write interoperability
  • Centralized access across engines
  • Vendor-agnostic flexibility (run anywhere, no lock-in)
  • Snowflake Horizon’s governance features, extended via Polaris Catalog integration

Let’s get into the specifics.

Cross-engine read and write interoperability #


Multi-engine interoperability is one of the key tenets of Polaris Catalog. You can read and write from any REST-compatible engine, including Apache Doris, Apache Flink, Apache Spark, PyIceberg, StarRocks, Trino, and more. This eliminates the need to move or copy data between engines and catalogs, a practice that leads to siloed data.
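As a rough illustration of what "any REST-compatible engine" means in practice, the sketch below derives connection settings for two engines from one shared catalog endpoint. The property names (`type`, `uri`, `warehouse`, `credential`, and the `spark.sql.catalog.<name>` keys) follow the Iceberg Spark and PyIceberg REST catalog configuration. The endpoint, catalog name, warehouse, and credential are placeholders, not values published for Polaris Catalog.

```python
def spark_conf(catalog_name: str, uri: str, warehouse: str, credential: str) -> dict[str, str]:
    """Spark session properties that register an Iceberg REST catalog."""
    key = f"spark.sql.catalog.{catalog_name}"
    return {
        key: "org.apache.iceberg.spark.SparkCatalog",
        f"{key}.type": "rest",
        f"{key}.uri": uri,
        f"{key}.warehouse": warehouse,
        f"{key}.credential": credential,
    }


def pyiceberg_conf(uri: str, warehouse: str, credential: str) -> dict[str, str]:
    """Properties that could be passed to pyiceberg.catalog.load_catalog(name, **props)."""
    return {"type": "rest", "uri": uri, "warehouse": warehouse, "credential": credential}


# Placeholder values, for illustration only.
URI = "https://catalog.example.com/api/catalog"
WAREHOUSE = "analytics"
CREDENTIAL = "client-id:client-secret"

for name, value in spark_conf("polaris", URI, WAREHOUSE, CREDENTIAL).items():
    print(f"{name}={value}")
print(pyiceberg_conf(URI, WAREHOUSE, CREDENTIAL))
```

Both engines point at the same `uri`, so they see the same tables without any data being copied between them.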

Iceberg community has developed open standards to enable storage interoperability - Source: Danny Mak, Partner Sales Engineering Leader at Snowflake, on interoperability.

Centralized access across engines #


Polaris Catalog enables you to manage Iceberg tables for all users and engines from one location. Regardless of the engine, all Iceberg read and write operations are routed through Polaris Catalog.

So, diverse data teams can modify tables concurrently, generate and run queries to analyze the data in those tables, and more. This streamlines data management for Apache Iceberg tables, while centralizing access.
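Iceberg catalogs generally coordinate concurrent writers with optimistic concurrency: a commit succeeds only if the table has not changed since the writer last read it, and otherwise the writer retries. The toy model below sketches that idea with an in-memory version counter. It is an assumption-laden illustration, not a description of Polaris Catalog's internals, and the `ToyCatalog`, `CommitConflict`, and `commit_with_retry` names are hypothetical.

```python
import threading


class CommitConflict(Exception):
    """Raised when a writer's base version is stale."""


class ToyCatalog:
    """In-memory stand-in for a catalog that tracks one version pointer per table."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._pointers: dict[str, int] = {}

    def current_version(self, table: str) -> int:
        with self._lock:
            return self._pointers.get(table, 0)

    def commit(self, table: str, base_version: int) -> int:
        """Advance the pointer only if the writer saw the latest version."""
        with self._lock:
            current = self._pointers.get(table, 0)
            if current != base_version:
                raise CommitConflict(f"{table}: expected v{base_version}, found v{current}")
            self._pointers[table] = current + 1
            return current + 1


def commit_with_retry(catalog: ToyCatalog, table: str, max_attempts: int = 3) -> int:
    """Re-read the latest version and try again when a commit is rejected."""
    for _ in range(max_attempts):
        base = catalog.current_version(table)
        try:
            return catalog.commit(table, base)
        except CommitConflict:
            continue
    raise CommitConflict(f"{table}: gave up after {max_attempts} attempts")


catalog = ToyCatalog()
first_base = catalog.current_version("orders")   # first engine reads version 0
second_base = catalog.current_version("orders")  # second engine also reads version 0
catalog.commit("orders", first_base)             # succeeds, pointer moves to version 1
try:
    catalog.commit("orders", second_base)        # stale base version, rejected
except CommitConflict as err:
    print("conflict:", err)
print(commit_with_retry(catalog, "orders"))      # retry from the latest version
```

Routing every commit through one catalog is what makes this check possible: there is a single pointer per table, so two engines cannot both believe they wrote the latest version.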

Vendor-agnostic flexibility #


Snowflake heavily emphasizes the ‘no vendor lock-in’ capability of Polaris Catalog. You can run Polaris Catalog in your infrastructure of choice: either Snowflake’s AI Data Cloud infrastructure (public preview soon), or self-hosted in containers using tools such as Docker or Kubernetes (coming soon).

Polaris Catalog’s backend implementation will be open source - Source: Snowflake.


No vendor lock-in with Snowflake’s Polaris Catalog - Source: Polaris Catalog.

Extend Snowflake Horizon’s governance features #


You can integrate Polaris Catalog with Snowflake Horizon. This lets you apply Horizon’s governance capabilities, such as column masking policies, object tagging, and sharing, to Polaris Catalog.

Whether an Iceberg table is created in Polaris Catalog by Snowflake or another engine - Source: Snowflake.


What’s next? #

Snowflake’s Polaris Catalog acts as a single location to access and retrieve Apache Iceberg data and metadata from numerous engines, thereby supporting storage interoperability. Its integration with Snowflake Horizon ensures enterprise-grade governance, data security, and privacy capabilities for your Iceberg data, regardless of the underlying hosting infrastructure.

As of now, Snowflake is still working on releasing Polaris Catalog to its enterprise customers in public preview. We’re excited to see how this solution shapes up and enables interoperability for open table formats.


Polaris Catalog: Frequently Asked Questions (FAQs) #

1. What is Polaris Catalog? #


As mentioned earlier, Polaris Catalog is an open catalog for Apache Iceberg from Snowflake. The Apache Iceberg community laid the groundwork with its open REST protocol, and Snowflake built Polaris Catalog on those standards, adding enterprise security, cross-engine interoperability, and vendor-neutral storage.

2. Is Polaris Catalog available for general use? #


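Not yet. According to Snowflake’s announcement, Polaris Catalog will be open-sourced within 90 days and will be available in public preview on Snowflake infrastructure soon.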

3. How much will Polaris Catalog cost? #


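As of June 3, 2024, Snowflake has said that Polaris Catalog will be open source. The announcement details covered in this article do not include pricing for the Snowflake-hosted option.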



