What Is Metadata: Definition, Examples, and Types

August 10th, 2022


What is metadata?

In simple terms, metadata is “data about data”: information that describes other data. Metadata helps us understand the structure, nature, and context of the data.

Metadata facilitates easy search and retrieval of data. Metadata also helps keep a check on the quality and reliability of data. Metadata is the key to unlocking the value of your data.


Let's look at other definitions of metadata:

“A set of data that describes and gives information about other data.”

- Definition of metadata according to Lexico

One useful definition of metadata is “any data which conveys knowledge about an item without requiring examination of the item itself.” Because metadata derives its value from saving human time and attention, it must be effective at distinguishing relevant and irrelevant or redundant content.

— Kenneth Haase, 12th annual ACM conference: Context for semantic metadata

What meta itself means:

“Meta is a word which, like so many other things, we have the ancient Greeks to thank for. When they used it, meta meant “beyond,” “after,” or “behind.” The “beyond” sense of meta still lingers in words like metaphysics or meta-economy.”




Examples of metadata

Let’s take an example of an image. To the naked eye, a rose is just a rose.

But to the more discerning “meta” eye, a rose is so much more. It’s the sum total of its meta.

You might be surprised by the amount of metadata that goes into describing an image.

Some of the metadata stored in an image file:

  • The make of the camera
  • Lenses used
  • Time at which the picture was taken
  • Focal length
  • GPS coordinates of the location
  • Image resolution
  • Color profiles

Image metadata gives technical insights that are helpful during image processing. Metadata also facilitates easy search, retrieval, and backups and hence helps increase productivity.


Metadata information stored in an image file
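As a minimal sketch of how this kind of metadata supports search and retrieval, the snippet below filters a small photo catalog by camera make and capture year without opening any image. The PhotoMetadata record, its field names, and the sample files are hypothetical, chosen only to mirror the list above; a real tool would read these values from each file's embedded metadata.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class PhotoMetadata:
    # Hypothetical fields mirroring the kinds of image metadata listed above.
    filename: str
    camera_make: str
    taken_at: datetime
    focal_length_mm: float
    gps: tuple[float, float]      # (latitude, longitude)
    resolution: tuple[int, int]   # (width, height) in pixels


def find_photos(catalog, camera_make, year):
    """Return filenames whose metadata matches, without reading pixel data."""
    return [
        photo.filename
        for photo in catalog
        if photo.camera_make == camera_make and photo.taken_at.year == year
    ]


# Illustrative sample records, not real files.
catalog = [
    PhotoMetadata("rose.jpg", "Canon", datetime(2022, 5, 1, 9, 30), 50.0, (12.97, 77.59), (6000, 4000)),
    PhotoMetadata("tulip.jpg", "Nikon", datetime(2022, 4, 12, 17, 5), 35.0, (52.37, 4.90), (4000, 3000)),
    PhotoMetadata("lily.jpg", "Canon", datetime(2021, 8, 20, 11, 0), 85.0, (48.86, 2.35), (6000, 4000)),
]

matches = find_photos(catalog, camera_make="Canon", year=2022)
print(matches)
```

The search touches only the small metadata records, which is why it stays fast even when the images themselves are large.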

As another example, let’s look at the metadata of an MP3 audio file.

The key metadata fields are:

  • Audio format
  • Encoding
  • Channels
  • Bit rate
  • Size
  • Band
  • Album release date

Metadata information stored in an audio file
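To make this concrete, here is a hedged sketch that reads technical audio metadata using only Python's standard library. The standard library has no MP3 parser, so the example swaps in an uncompressed WAV file, generated in memory, as a stand-in; the idea of reading channels, bit rate, and size from the file header carries over, but the code is not an MP3 reader.

```python
import io
import wave

# Build a one-second, silent stereo WAV in memory so the example is self-contained.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as writer:
    writer.setnchannels(2)       # stereo
    writer.setsampwidth(2)       # 2 bytes per sample = 16-bit audio
    writer.setframerate(44100)   # frames (samples per channel) per second
    writer.writeframes(b"\x00" * (44100 * 2 * 2))

# Read back the header fields; the audio samples themselves are not inspected.
buffer.seek(0)
with wave.open(buffer, "rb") as reader:
    channels = reader.getnchannels()
    sample_width = reader.getsampwidth()
    frame_rate = reader.getframerate()
    frames = reader.getnframes()

audio_metadata = {
    "format": "WAV (PCM)",
    "channels": channels,
    "bit_rate_bps": frame_rate * channels * sample_width * 8,
    "duration_s": frames / frame_rate,
    "size_bytes": len(buffer.getvalue()),
}
print(audio_metadata)
```

Descriptive fields such as band and album release date are typically stored in tags alongside the audio and would need a tag-reading library, which this sketch leaves out.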

Metadata in a database:

Closer to home for the humans of data, here’s an example of something we use daily: the mighty Excel sheet.

While the data in an Excel sheet is the actual information (numbers or text) held in its rows and columns, the metadata is the description of that information:

  • Tables/column names, source, descriptions, and relationships
  • Validation rules for a data asset
  • Data types
  • Column statistics: missing values, min-max values, and histogram distribution
  • Data owner

This gives you better context on the data itself, much like an explainer.
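The column statistics mentioned above can be sketched in a few lines. The example below profiles a small, made-up sheet (the column names and values are hypothetical) and derives metadata for each column: the number of missing values, an inferred type, and the min-max range for numeric columns. It is an illustration under those assumptions, not how any particular catalog computes its profiles.

```python
import csv
import io

# Hypothetical sheet contents; an empty cell counts as a missing value.
sheet = io.StringIO(
    "order_id,region,amount\n"
    "1,North,120.50\n"
    "2,,75.00\n"
    "3,South,\n"
    "4,North,310.25\n"
)

rows = list(csv.DictReader(sheet))


def profile_column(rows, column):
    """Compute simple column metadata: missing count, inferred type, min and max."""
    values = [row[column] for row in rows]
    present = [value for value in values if value != ""]
    stats = {"missing": len(values) - len(present)}
    try:
        numbers = [float(value) for value in present]
    except ValueError:
        # At least one value is not numeric, so treat the column as text.
        stats["type"] = "text"
    else:
        stats["type"] = "number"
        if numbers:
            stats["min"], stats["max"] = min(numbers), max(numbers)
    return stats


profile = {column: profile_column(rows, column) for column in rows[0]}
for column, stats in profile.items():
    print(column, stats)
```

The resulting profile describes the sheet without reproducing it, which is exactly the role of metadata here.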

Exploring metadata via Atlan’s data dictionary


Learn more: Data vs metadata — what are the differences and examples




Why is metadata so important?

Data is nothing but the sum total of its metadata. Metadata is what helps us create a complete picture of our data and understand it in its entirety.

For instance, after the COVID-19 pandemic, medical/pharma research became increasingly collaborative. Researchers needed an effective system to search, share, understand, peer review, and replicate experiments.

The fundamental aspect of such a system is the availability of robust metadata.

Metadata for scientific research includes information about test design, test population details, the definition of terms, measurement methods, and data collection schedules.

Enterprises are increasingly investing in and betting on data to make better decisions, so the amount of data we use is only set to increase. To increase the shelf life and longevity of data, it’s important for companies to invest in managing their metadata as well.

The need of the hour is to remove data silos, let analytics flow at the speed of thought and create a single source of truth for your entire team, which brings us to an important point.


Types of metadata

Today, metadata is everywhere. Every component of the modern data stack and every user interaction on it generates metadata. Apart from traditional forms like technical metadata (e.g. schemas) and business metadata (e.g. taxonomy, glossary), our data systems now create entirely new forms of metadata.

Types of metadata. Image by: Atlan

Technical metadata:

Technical metadata is information about the data itself. It is the documentation for the database. This includes information about the design and structure of the schema, table and column information, column size, validation rules, and data quality profiles on data assets.

Technical metadata is the information about the data itself. Source: Atlan
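As one hedged illustration, most databases expose their technical metadata through a queryable catalog. The sketch below uses SQLite, via Python's built-in sqlite3 module, purely as a convenient stand-in; the customers table and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customers (
        id INTEGER PRIMARY KEY,
        email TEXT NOT NULL,
        signup_date TEXT
    )
    """
)

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [
    {"name": name, "type": col_type, "not_null": bool(notnull), "primary_key": bool(pk)}
    for _cid, name, col_type, notnull, _default, pk in conn.execute(
        "PRAGMA table_info(customers)"
    )
]
for column in columns:
    print(column)
conn.close()
```

Nothing here reads a single customer record: the column names, types, and constraints come entirely from the schema.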


Structural metadata:

Structural metadata provides information that helps establish object-to-object relationships and hierarchical structure between different data assets. This includes table names, data types, data sources, foreign key cardinality, and referential integrity.

Structural metadata: Information about data relationships and hierarchy. Source: Atlan
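A small sketch of structural metadata in practice: foreign keys declared in a schema can be read back as object-to-object relationships. SQLite (through Python's sqlite3 module) is used only as a convenient example, and the customers and orders tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        amount REAL
    );
    """
)

# PRAGMA foreign_key_list returns one row per foreign-key column:
# (id, seq, table, from, to, on_update, on_delete, match)
relationships = [
    f"orders.{from_col} -> {parent}.{to_col}"
    for _id, _seq, parent, from_col, to_col, *_rest in conn.execute(
        "PRAGMA foreign_key_list(orders)"
    )
]
print(relationships)
conn.close()
```

Collected across every table, relationships like this one form the map that lets a catalog show how data assets connect.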

Operational metadata:

Operational metadata tracks all the information related to the flow of data throughout its lifecycle. This includes information about the data source, data transformations, lineage, and logs of ETL and orchestration jobs.

Operational metadata: Information about the flow and transformation of data. Source: Atlan
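To illustrate how job logs turn into lineage, the sketch below records a few ETL runs and walks them backwards to find everything that feeds a given asset. The JobRun record, its fields, and the asset names are hypothetical and kept deliberately small; real orchestration tools capture far richer logs.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class JobRun:
    # Hypothetical record of one ETL job run; field names are illustrative.
    job: str
    inputs: list[str]
    output: str
    finished_at: datetime
    status: str


runs = [
    JobRun("load_orders", ["raw.orders_csv"], "staging.orders",
           datetime(2022, 8, 1, 2, 0), "success"),
    JobRun("load_customers", ["raw.customers_api"], "staging.customers",
           datetime(2022, 8, 1, 2, 5), "success"),
    JobRun("build_revenue", ["staging.orders", "staging.customers"], "marts.revenue",
           datetime(2022, 8, 1, 3, 0), "success"),
]


def upstream(asset, runs):
    """Walk the run log backwards to collect every asset that feeds `asset`."""
    producers = {run.output: run.inputs for run in runs}
    found, stack = set(), list(producers.get(asset, []))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(producers.get(current, []))
    return found


lineage = upstream("marts.revenue", runs)
print(sorted(lineage))
```

With lineage like this, a team can tell which sources to check when a downstream table looks wrong.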

Business metadata:

Business metadata is a glossary of terms/definitions that helps business users understand a particular data asset.

For instance, a question such as “Does the annual recurring revenue (ARR) metric on the dashboard include one-time discounts and initial setup costs?” could be answered in the glossary for reference.

Business metadata enables collaboration to validate, verify, and attach terms to the right data assets.

Business metadata: A glossary of terms and definitions. Source: Atlan


Administrative metadata:

Administrative metadata provides information related to governance, privacy, security, and access controls. This includes technical data on rights management, copyright information, license agreements, access control information, and user restrictions.

Administrative metadata: Information about governance, privacy, and access control. Source: Atlan


Social/collaborative metadata:

As more and more businesses embrace democratizing their data, a new set of valuable metadata is emerging from the collaborative efforts.

Examples of social metadata include ratings, chat transcripts, notes, tags, comments, glossary, and bookmarks.

Social metadata: Chat conversations, issue tickets, notes, tags, comments, and ratings. Source: Atlan

Provenance metadata:

Provenance metadata is information about the origin of a data asset. It describes data sources, ownership, transformations, freshness, usage, and archival.

Learn more: Types of metadata: How each helps with faster data discovery and better insights

Provenance metadata: Information about the journey of the data from its origins to archival. Source: Atlan





What are the biggest challenges in metadata management?

One of the biggest problems facing businesses is that though they are aware of the value of metadata and have invested in managing it, they are yet to see enough ROI.

Sadly, companies have traditionally relied on manual, ad-hoc processes to manage their metadata. Departments would share information, including metadata, either verbally or by maintaining Excel/doc files to document data. This approach creates several problems:

  • No one knows where the documents are located—missing information
  • No one bothers to update the documents, especially when people move on—outdated data
  • No one knows how data sets are related—and how to fix changing values across all of them—no data lineage or data quality checks
  • No way to maintain all revisions or versions of data
  • No way to keep metadata along with the data—leading to even more data silos and versions of the truth

That’s why simply plugging in an isolated metadata management tool or metadata catalog within your data lake may not be the answer to your data woes. Today’s business mandates that data be available for whoever needs it, wherever and whenever they need it—with all the context they need.

Learn more: What is Metadata Management? Benefits, tools, and best practices


Metadata: Conclusion

Robust metadata management is the key for data-driven teams to discover, understand, trust, and collaborate on data assets across their data universe. If you are looking to implement a metadata management tool for your organization, do take Atlan for a spin.

