Databricks Data Contracts Explained: How to Build Trusted Data Products in 2025

by Emily Winks

Data governance expert

Last Updated on: October 8th, 2025 | 11 min read

Quick Answer: What are Databricks data contracts?

Databricks data contracts are technical agreements that define the structure, quality, and availability of data products, similar to API contracts in software. They formalize expectations between data producers and consumers, enabling reliable data exchange. Data contracts are crucial to the implementation of data products.

Databricks enables and supports data products and contracts in the following ways:

  • Create and publish data products backed by defined contract guarantees
  • Test for and handle data contract violations (e.g., quality, availability) using built-in tools like Expectations
  • Leverage a metadata repository (Unity Catalog) that powers governance features like granular security
  • Organize data assets into domains aligned with your organization’s hierarchy and business needs

Below, we cover support for data contracts in the Databricks ecosystem, the role of metadata, and how to implement contracts in practice.




How does Databricks support data contracts? #


Databricks allows you to specify enforceable rules for data types, quality, and security. These rules are enforced using built-in features like Expectations in Lakeflow Declarative Pipelines (previously known as Delta Live Tables or DLT).

For instance, you can have Expectations for:

  • Schema evolution checks
  • Record count validation
  • Handling of bad or invalid records

Expectations, by themselves, are not data contracts, but they play a role in enforcing the rules of the contract.
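For instance, here is a minimal sketch of a Lakeflow Declarative Pipeline using the Python Expectations decorators. The decorator names are the actual API; the raw.orders source table and the rule names are hypothetical:

```python
import dlt

@dlt.table(comment="Orders cleaned per the data contract's quality rules")
@dlt.expect("non_negative_amount", "amount >= 0")                       # warn: log violations, keep rows
@dlt.expect_or_drop("valid_customer", "customer_id IS NOT NULL")        # drop rows that violate the rule
@dlt.expect_or_fail("no_future_dates", "order_date <= current_date()")  # fail the update on violation
def orders_clean():
    # Hypothetical source table; replace with your own upstream dataset
    return spark.read.table("raw.orders")
```

Each decorator encodes one contract rule along with its handling policy: expect logs violations, expect_or_drop discards offending rows, and expect_or_fail stops the pipeline update.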

Databricks, at this point, does not natively support the end-to-end data contract lifecycle, which is why data contracts need to be created and maintained separately, often in a declarative style in a JSON or a YAML file.

Managing these files can get tricky when you are dealing with hundreds or even thousands of data products in large enterprises.

Databricks Asset Bundles are well suited to packaging the whole data product, including the data contract, into a self-contained entity. They can also be used to integrate with external data mesh managers or a data cataloging solution to ensure data contract discovery for data consumers.
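For illustration, a bundle's databricks.yml could version the pipeline definitions and ship the contract file alongside them; the bundle, file, and workspace names below are hypothetical:

```yaml
# databricks.yml (sketch; names are illustrative)
bundle:
  name: orders_data_product

include:
  - resources/*.yml   # pipeline and job definitions

# The contract itself travels with the bundle as a plain file,
# e.g., contracts/orders_contract.yml, deployed with everything else.

targets:
  prod:
    workspace:
      host: https://<your-workspace>.cloud.databricks.com
```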


Let’s now look at the structure of a typical data contract.


How can you define data contracts in Databricks? #

Until Databricks supports the end-to-end data product lifecycle natively, its recommendation is to use YAML files for defining and maintaining data contracts in a Unity Catalog-enabled workspace.

Databricks recommends adding the following to the data contract of a data product:

  • Data description: Information that helps the data consumer discover and understand what the data product is about and how to use it
  • Schema: Table structure, data types, row- and column-level security, and other critical information about the data product
  • Ownership and security: Who owns, manages, and approves access to the data product, and who is allowed to use it
  • Usage policies: The allowed uses of the data product and how to use it
  • Data quality metrics: The quality checks and validations in place to ensure the data is usable and useful
  • Service level agreements: Service levels around freshness, retention, and validity, among other things

The main goal of the data contract is to encapsulate all the key information and rules that help maintain the quality of a data product and make it discoverable and usable at the same time.
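To make this concrete, here is a minimal sketch of such a contract in YAML. The field names are illustrative, not a Databricks-mandated schema; open data contract standards define similar fields:

```yaml
# orders_contract.yml: an illustrative sketch, not a mandated schema
data_product: orders
description: Daily cleaned orders for finance and analytics consumers
owner: data-platform-team@example.com   # hypothetical owner
schema:
  table: main.sales.orders              # hypothetical table
  columns:
    - name: order_id
      type: string
      nullable: false
    - name: amount
      type: decimal(10,2)
      nullable: false
usage_policies:
  allowed_uses: [reporting, analytics]
quality_checks:
  - rule: amount >= 0
    on_violation: drop
sla:
  freshness: 24h
  retention: 2y
```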

In Databricks, the data product can be any data asset, such as a dashboard, a table, a materialized view, or even a notebook.


What is the role of metadata and Unity Catalog in data contracts? #

Metadata is essential to data contracts, as it captures all the information about a data product and the expectations for that data product, around which the contract is created and enforced.

Data products are discoverable, understandable, and usable only because of the metadata associated with them.

In Databricks, all the metadata is captured and maintained by Unity Catalog.

Here are some examples of what metadata Unity Catalog has about a data product:

  • Tags, descriptions, data types, and purpose, at both the table and column level
  • Relationships between data objects like tables and views to enforce integrity checks
  • Lineage for various data assets, at both the table and column level
  • Higher-order usage metrics for usage insights and flagging security issues
  • Low-level logs for developing, debugging, and optimizing jobs and workflows
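Much of this metadata can be queried directly with Spark SQL. As a sketch, assuming a hypothetical main.sales.orders table and access to Unity Catalog's system tables:

```python
# Column-level descriptions and data types from the information schema
columns = spark.sql("""
    SELECT column_name, data_type, comment
    FROM main.information_schema.columns
    WHERE table_schema = 'sales' AND table_name = 'orders'
""")
columns.show()

# Table-level lineage from Unity Catalog's system tables
lineage = spark.sql("""
    SELECT source_table_full_name, target_table_full_name, event_time
    FROM system.access.table_lineage
    WHERE target_table_full_name = 'main.sales.orders'
""")
lineage.show()
```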

While Unity Catalog captures all this metadata to enable the publishing and maintenance of data products, it doesn’t currently have native support for enforcing data contracts.

To ensure a consistent, trustworthy, and end-to-end data product lifecycle, you need a platform that can not only leverage all the metadata in Unity Catalog but also activate it for enforcing contracts.

That’s where a unified metadata control plane like Atlan comes into play.


How can you enforce Databricks data contracts with Atlan’s unified metadata control plane? #

Atlan’s unified metadata control plane is built on the premise of integrating all of your organization’s metadata in one place, from where it can be managed, activated, and put to use for automation.

The automation can take the form of events, webhooks, API calls, log entries, alerts, and notifications, among other things.
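As a rough sketch of consuming such automation, a small webhook receiver could route contract-violation events to an alerting channel. The endpoint, event type, and payload fields below are hypothetical assumptions, not Atlan's actual webhook schema:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class ContractEventHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read and parse the incoming event (payload shape is assumed)
        length = int(self.headers.get("Content-Length", 0))
        event = json.loads(self.rfile.read(length) or b"{}")
        if event.get("type") == "contract.violation":
            # Replace the print with a real alert (Slack, PagerDuty, etc.)
            print(f"Contract violation on asset: {event.get('asset', 'unknown')}")
        self.send_response(204)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), ContractEventHandler).serve_forever()
```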

Beyond being the single place for finding and activating all metadata, Atlan supports an end-to-end data product lifecycle by allowing you to create data domains, subdomains, and policies.

Atlan natively supports data contracts using its YAML contract template. These contracts are version-controlled, embeddable, extensible, and enforceable, and they can leverage the metadata that Atlan’s metadata control plane brings from Unity Catalog.

Atlan also allows you to understand the impact of any change to a data contract on end users. It lets you create a GitHub Action that can assess and report on that impact, although this feature is only useful for Databricks users who are also using dbt.


Real stories from real customers: Data democratization and governance at scale #


Modernized data stack and launched new products faster while safeguarding sensitive data

“Austin Capital Bank has embraced Atlan as their Active Metadata Management solution to modernize their data stack and enhance data governance. Ian Bass, Head of Data & Analytics, highlighted, ‘We needed a tool for data governance… an interface built on top of Snowflake to easily see who has access to what.’ With Atlan, they launched new products with unprecedented speed while ensuring sensitive data is protected through advanced masking policies.”


Ian Bass, Head of Data & Analytics

Austin Capital Bank

🎧 Listen to podcast: Austin Capital Bank From Data Chaos to Data Confidence


53% less engineering workload and 20% higher data-user satisfaction

“Kiwi.com has transformed its data governance by consolidating thousands of data assets into 58 discoverable data products using Atlan. ‘Atlan reduced our central engineering workload by 53% and improved data user satisfaction by 20%,’ Kiwi.com shared. Atlan’s intuitive interface streamlines access to essential information like ownership, contracts, and data quality issues, driving efficient governance across teams.”

Data Team

Kiwi.com

🎧 Listen to podcast: How Kiwi.com Unified Its Stack with Atlan

Let’s help you build a robust data governance framework

Book a Personalized Demo →

Ready to maximize the ROI of your Databricks ecosystem? #

Data contracts are essential to adopting a data product philosophy, whether in a data mesh architecture or otherwise. They ensure that the data in the data products is discoverable, understandable, reliable, and usable.

Databricks provides several core features to enable data products, such as the Databricks Marketplace, Unity Catalog, and Expectations. At this point, it doesn’t natively support the end-to-end data product lifecycle, of which data contracts are a big part.

That’s where Atlan’s unified metadata control plane comes into the picture. Atlan natively supports all aspects of data product development, including data contracts. It allows you to define, maintain, and enforce data contracts.

The data contract features in Atlan also work well with other data product features, such as domains, stakeholders, tags, and lineage.


FAQs about Databricks data contracts #

1. What is the Databricks data contract? #


A Databricks data contract is a formal, enforceable agreement between data producers and consumers that defines the structure, quality, and expectations for a dataset. It typically includes schema definitions, data freshness requirements, SLAs, and validation rules.

2. What is a data product? #


A data product is a packaged, self-contained entity of data designed to address a specific business use case. It covers every aspect of the data required to deliver value to the business, including quality, documentation, security, and ownership.

3. What is a data contract? #


A data contract is a formal agreement between the producers and consumers of data about the format, structure, quality, and behavior of that data. With this contract in place, consumers can design their systems and processes with confidence.

Data contracts are crucial for ensuring a data product-based data publishing and consumption practice within and outside an organization.

4. Does Databricks have native support for data products? #


Databricks has native support for data products via Databricks Marketplace, Delta Sharing, and, most importantly, Unity Catalog. However, Databricks doesn’t natively support data contracts at this point, though it does provide ways of incorporating them into your workflow.

5. How does Databricks support data contracts? #


Data contracts in Databricks are supported through Unity Catalog, Delta Lake schemas, and integration with tools like Atlan to automate enforcement, detect violations, and maintain trust across data pipelines.

6. Where are data products published in Databricks? #


Data products can be published on Databricks Marketplace (via Unity Catalog) or other marketplaces that can leverage the data product metadata stored in Unity Catalog. The publishing and use of data products, both in internal and external marketplaces, is facilitated by Delta Sharing.
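As a sketch of the Delta Sharing side, publishing a table to an external recipient comes down to a few SQL statements (shown here via spark.sql); the share, table, and recipient names are illustrative:

```python
# Create a share, add the data product's table to it, and grant a recipient access
spark.sql("CREATE SHARE IF NOT EXISTS orders_share")
spark.sql("ALTER SHARE orders_share ADD TABLE main.sales.orders")
spark.sql("CREATE RECIPIENT IF NOT EXISTS partner_co")
spark.sql("GRANT SELECT ON SHARE orders_share TO RECIPIENT partner_co")
```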

7. What types of data products are there in Databricks? #


Databricks data products can be of various types: datasets (tables, views, materialized views, etc.), dashboards, notebooks (especially Databricks Solution Accelerators), and AI models, among others.

8. How do Databricks data contracts work? #


Databricks data contracts mostly rely on Expectations for enforcing structural and quality rules. Databricks does not natively offer a way to store and manage data contracts, but it encourages you to use YAML files based on open standards. You can use external tools like Atlan for an end-to-end data contract implementation.




Atlan is the next-generation platform for data and AI governance. It is a control plane that stitches together a business's disparate data infrastructure, cataloging and enriching data with business context and security.

