What Is a Metadata Catalog? | Atlan

November 30th, 2020

What is a metadata catalog?

A metadata catalog is nothing but a collection of all the data about your data. Metadata can include the data source, origin, owner, and other attributes of a data set. These help you learn more about a data set and evaluate if it is well-suited for your use case.

A truly powerful metadata catalog will help you:

  • Create a central repository for all your data and metadata, including the structure, quality, definitions and usage of the data.
  • Access the metadata right alongside the data itself, with no asking around!
  • Ensure data consistency and accuracy by updating itself auto-magically, while allowing humans to remain in the loop.

The basics of metadata and data catalogs

Before we go any further, let’s go through some commonly asked questions about metadata.

  • What is metadata? Metadata is just data about other data. It gives basic information about a data asset to help users find the data they need.
  • What is an example of metadata? Examples of basic metadata are the author, date created, date last modified, source, and size of a data set. Some more complex examples of metadata are query logs, lineage, quality score, and related discussions.
  • What is the difference between data and metadata? Data is information that measures, describes, or reports on something. Metadata is relevant information that gives context about that data.
  • Who uses metadata? Anyone who uses data would also use metadata. After all, you can’t use a data set until you first know if that data is right for your use case.
  • Where is metadata stored? Metadata should be stored close to the data it is describing. This can be in a nearby table or field, a separate document like a data dictionary, or ideally in a metadata catalog.
  • What are the benefits of metadata? Metadata is important for giving context around data. With the sheer amount of data available nowadays, you need more information about a data set before you can know if it’s right for you. Metadata also helps document data so it can be shared and reused across multiple use cases.
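
To make the examples above concrete, here is a minimal sketch of what a single metadata record could look like. The class name, field names, and sample values are hypothetical and chosen for illustration; they are not taken from any particular catalog product.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class DatasetMetadata:
    """Hypothetical metadata record for one data set (not a real catalog schema)."""
    name: str
    author: str
    date_created: date
    date_last_modified: date
    source: str
    size_bytes: int
    # More complex metadata; optional because it is often missing.
    lineage: List[str] = field(default_factory=list)  # upstream data set names
    quality_score: Optional[float] = None  # assumed scale: 0.0 to 1.0


def describe(meta: DatasetMetadata) -> str:
    """Build a one-line summary a user could read before opening the data."""
    size_mb = meta.size_bytes / 1_000_000
    return (
        f"{meta.name}: by {meta.author}, from {meta.source}, "
        f"{size_mb:.1f} MB, last modified {meta.date_last_modified.isoformat()}"
    )


# Sample values are made up for the example.
orders = DatasetMetadata(
    name="orders_2020",
    author="data-eng",
    date_created=date(2020, 1, 15),
    date_last_modified=date(2020, 11, 1),
    source="postgres://sales",
    size_bytes=2_500_000,
    lineage=["raw_orders"],
    quality_score=0.92,
)
print(describe(orders))
```

The point of the sketch is that the record describes the data set without containing any of its rows, which is what lets a user judge fit before touching the data itself.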

What are metadata catalogs useful for?

A well-organized catalog with your metadata is useful for creating a single source of truth for all your company’s data. A metadata catalog can help your team discover, manage and understand all your data assets in one place.

This is important because the number of data consumers is growing quickly. Companies are increasingly investing in data lakes, big data initiatives, and self-service data analytics ecosystems. This leads to many versions of the truth: multiple data sets, multiple versions, and pockets of isolated knowledge.

Four ways to ace your metadata catalog needs

  1. Understand the fine print and quality of your data.
  2. Crowdsource your metadata catalog.
  3. Get critical business context on your data.
  4. Search through petabytes of data.

Understand the fine print and quality of your data

Understand what each column means via shareable data dictionaries. Access detailed data quality reports and understand the quality of a data table. Quickly onboard new users and help admins to monitor data quality.
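
As a rough illustration of how a data dictionary and a quality report fit together, the sketch below pairs column descriptions with a simple completeness check. The column names, sample rows, and the choice of completeness (share of non-missing values) as the quality measure are all assumptions made for the example; a real quality report would usually cover more than this.

```python
from typing import Any, Dict, List

# Hypothetical data dictionary: column name -> plain-language meaning.
DATA_DICTIONARY: Dict[str, str] = {
    "order_id": "Unique identifier of the order",
    "amount": "Order total in USD",
    "region": "Sales region code",
}


def completeness_report(
    rows: List[Dict[str, Any]], columns: List[str]
) -> Dict[str, float]:
    """Return the share of non-missing values per column (1.0 = fully populated)."""
    report = {}
    for col in columns:
        filled = sum(1 for row in rows if row.get(col) is not None)
        report[col] = filled / len(rows) if rows else 0.0
    return report


# Made-up sample rows with two missing values.
rows = [
    {"order_id": 1, "amount": 20.0, "region": "EU"},
    {"order_id": 2, "amount": None, "region": "US"},
    {"order_id": 3, "amount": 15.5, "region": None},
    {"order_id": 4, "amount": 9.0, "region": "EU"},
]

report = completeness_report(rows, list(DATA_DICTIONARY))
for col, share in report.items():
    print(f"{col}: {DATA_DICTIONARY[col]} ({share:.0%} complete)")
```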

Tools and techniques that can help:

  • Data dictionary
  • Quality reports
  • Metadata management

Crowdsource your metadata catalog

Convert human tribal knowledge into a living system by allowing your team to add notes, ratings, and tags to datasets. Easily evaluate the quality of your data and help your team access this information too.
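
One way to picture this is a small structure that collects notes, ratings, and tags for a data set. The sketch below is only an assumption about how such contributions might be stored; the class name, the 1 to 5 rating scale, and the tag normalization rule are invented for illustration.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import List, Optional, Set


@dataclass
class CrowdsourcedInfo:
    """Hypothetical container for what teammates contribute about one data set."""
    notes: List[str] = field(default_factory=list)
    ratings: List[int] = field(default_factory=list)  # assumed scale: 1 to 5
    tags: Set[str] = field(default_factory=set)

    def add_rating(self, stars: int) -> None:
        if not 1 <= stars <= 5:
            raise ValueError("rating must be between 1 and 5")
        self.ratings.append(stars)

    def add_tag(self, tag: str) -> None:
        # Normalize so "Finance" and "finance " count as the same tag.
        self.tags.add(tag.strip().lower())

    def average_rating(self) -> Optional[float]:
        """Mean rating, or None when nobody has rated the data set yet."""
        return mean(self.ratings) if self.ratings else None


info = CrowdsourcedInfo()
info.notes.append("Amounts before March 2020 exclude tax.")
info.add_rating(4)
info.add_rating(5)
info.add_tag("Finance")
info.add_tag("finance ")
print(info.average_rating(), sorted(info.tags))
```

Normalizing tags on the way in is a small design choice that keeps crowdsourced labels from fragmenting into near-duplicates.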

Tools and techniques that can help:

  • Data annotations
  • User-generated ratings
  • Data tags

Get critical business context on your data

Supplement your technical data with contextual business information. Easily understand how a data set can be used and what it contains. Add context to your data, alongside it.

Tools and techniques that can help:

  • READMEs
  • Metadata repository
  • Business glossary

Search through petabytes of data

A metadata catalog should enable you to find the exact data table that you need for your use case. Metadata tags such as owner, source, and timeframe should help you filter the results.
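
The idea of combining search with metadata filters can be sketched as follows. This is a deliberately simple linear scan over an in-memory list, assuming hypothetical entry fields and sample tables; a catalog that really searches petabytes of data would rely on a search index rather than a loop like this.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class TableEntry:
    """Hypothetical catalog entry; field names are illustrative only."""
    name: str
    description: str
    owner: str
    source: str
    updated: date


def search(
    entries: List[TableEntry],
    keyword: str,
    owner: Optional[str] = None,
    source: Optional[str] = None,
    updated_since: Optional[date] = None,
) -> List[str]:
    """Keyword match on name or description, narrowed by optional metadata filters."""
    needle = keyword.lower()
    hits = []
    for e in entries:
        if needle not in e.name.lower() and needle not in e.description.lower():
            continue
        if owner is not None and e.owner != owner:
            continue
        if source is not None and e.source != source:
            continue
        if updated_since is not None and e.updated < updated_since:
            continue
        hits.append(e.name)
    return hits


# Made-up catalog entries for the example.
catalog = [
    TableEntry("orders_2020", "All customer orders", "sales-team", "postgres", date(2020, 11, 1)),
    TableEntry("orders_archive", "Orders before 2019", "sales-team", "s3", date(2019, 1, 5)),
    TableEntry("web_sessions", "Clickstream sessions", "growth-team", "s3", date(2020, 10, 20)),
]

print(search(catalog, "orders"))
print(search(catalog, "orders", updated_since=date(2020, 1, 1)))
print(search(catalog, "sessions", owner="growth-team", source="s3"))
```

Each filter is optional, so a user can start with a broad keyword and narrow down by owner, source, or timeframe only when the result list is too long.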

Tools and techniques that can help:

  • Data filtering
  • Powerful search

These techniques will help you ace an essential part of your metadata management. Most importantly, they should help you create a single source of truth across your data ecosystem.
