Gartner on Data Lineage: Trends, Recommendations, and Selecting Tools

Updated August 23rd, 2023

Data lineage has become an indispensable part of data workflows. But how is data lineage evolving to serve the needs of modern business?

In this article, we’ll review how Gartner sees data lineage and its role in the modern data stack. We’ll also cover what you should look for in data lineage tools and how Gartner sees the concept of data lineage evolving over time.


Table of contents

  1. Data lineage: A fundamental feature of any data catalog
  2. Gartner on data lineage
  3. What to look for in data-lineage-aware tools
  4. Atlan’s approach to data lineage
  5. Related reads

Data lineage: A fundamental feature of any data catalog

Data lineage is metadata that documents the flow of a data object through a company over time. It enables you to ask critical questions about data, including:

  • Where did this data come from?
  • Who changed it last - and how?
  • What purpose does this data serve?
  • Who’s consuming this data elsewhere in the company?
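
To make these questions concrete, here is a minimal Python sketch of lineage metadata stored as one record per data object. The `LineageRecord` class, its field names, and the sample asset names are illustrative assumptions, not a format defined by Gartner or by any particular data catalog.

```python
from dataclasses import dataclass, field


@dataclass
class LineageRecord:
    """Hypothetical lineage metadata for a single data object."""
    name: str
    purpose: str                                          # What purpose does this data serve?
    sources: list[str] = field(default_factory=list)      # Where did this data come from?
    consumers: list[str] = field(default_factory=list)    # Who is consuming it elsewhere?
    changes: list[tuple[str, str]] = field(default_factory=list)  # (who, what), oldest first

    def last_change(self):
        """Return the most recent (who, what) change, or None if there is none."""
        return self.changes[-1] if self.changes else None


# Sample record; every name here is made up for illustration.
orders = LineageRecord(
    name="warehouse.orders",
    purpose="Order facts for revenue reporting",
    sources=["crm.raw_orders"],
    consumers=["bi.sales_dashboard"],
    changes=[("etl_job", "initial load"), ("alice", "renamed column amt to amount")],
)
```

Each of the four questions maps to one field or method: `sources`, `last_change()`, `purpose`, and `consumers`. A real catalog would typically populate such records automatically rather than by hand.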

Data lineage support is a core offering for any data catalog. With a data catalog, your users can discover and query data no matter where it lives throughout the company. With data lineage, they can go a step further and improve the quality and utility of that data. For example, they can:

  • Increase users’ confidence in data by identifying its sources and history over time
  • Improve data quality through validation and prevention of data loss
  • Perform impact analysis, identifying how potential data changes will affect downstream consumers
  • Enforce data governance through classification and auditing

Read more → Data Lineage 101


Gartner on data lineage

Gartner doesn’t assess data lineage tools in a Magic Quadrant of their own, as it does for other categories such as Data Quality or Data Catalogs.

However, in its Magic Quadrant for Data Quality Solutions report, Gartner identifies collecting metadata from third-party systems and building data lineage maps as a key component of any data quality solution.

To qualify for a ranking in Gartner’s Data Quality report, a solution must “deliver critical data quality functions.” Gartner counts data lineage and metadata as part of that feature set, along with data monitoring, roles management and data validation, and standardization and cleansing, among others.

Gartner’s analysts describe lineage as critical “to perform rapid root cause analysis of data quality issues and impact analysis of remediation.”

Gartner’s report characterizes lineage as an “emerging technology.” Even so, it marks down several Data Quality tools for their poor data lineage support.

A large component of data lineage support is interoperability. Gartner says that any data lineage solution must have “close integration with metadata solutions” to enable lineage.

How Gartner sees data lineage in the overall data landscape


In general, Gartner sees data lineage as part of a larger category it calls “active metadata management.”

Metadata management technology maturity - Source: Gartner.

In its six-level maturity scheme for metadata management, Gartner sees most companies incorporating data lineage at Level 2, the Catalog level. At this level, an organization has just invested in and deployed a data catalog and is taking the first steps to trace the flow of data through the company.

Data lineage work continues throughout the subsequent levels of metadata management maturity. At these higher levels, companies add the ability to perform critical asset resolution, analyze trends, and generate alerts and recommendations.

In other words, in Gartner’s view, data lineage is a foundational technology for any data-mature enterprise.


What to look for in data-lineage-aware tools

If you’re just starting on the road to data lineage, it’s important to select the right tools.

As Gartner’s report on Data Quality tools makes clear, data lineage is still a “new” and evolving technology. Unfortunately, many data quality and data catalog tools that tout lineage support are still immature or hard to use. It’s important to weigh your options before making such a critical investment.

That’s why we put together a guide to the 19 questions and gotchas to look out for when selecting a data lineage solution. Any data lineage solution should offer these key features:

  • Column-level lineage, so that you have granular visibility into how your data and its characteristics change over time
  • Connectivity to a broad array of third-party systems for collecting metadata and lineage information
  • Automated lineage support, so that you always have an up-to-date view of your data
  • Manual enrichment and editing of lineage metadata to augment your automated systems
  • Strong visualization support, so that you can see the flow of data clearly and use it to make decisions

How data lineage in a cataloging tool can help you track data flow - Source: Atlan.

With this array of features, you can not only see your data but take action on it.

For example, you can see clearly how a change to the type of a column in a SQL database table could impact a Power BI report used by the sales department.
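
As a rough illustration of that kind of impact analysis, the Python sketch below walks a small column-level lineage graph to find everything downstream of a changed column. The `LINEAGE` mapping, the `downstream` helper, and all asset names are hypothetical; this is not how Atlan or any specific tool stores or queries lineage.

```python
from collections import deque

# Hypothetical column-level lineage: each key feeds the assets listed as its value.
LINEAGE = {
    "sales_db.orders.amount": ["warehouse.fct_orders.amount"],
    "warehouse.fct_orders.amount": ["powerbi.sales_report.total_revenue"],
    "sales_db.orders.region": ["warehouse.fct_orders.region"],
}


def downstream(asset, lineage):
    """Return every asset reachable downstream of `asset`, in breadth-first order."""
    visited = {asset}          # also guards against cycles in the graph
    order = []
    queue = deque([asset])
    while queue:
        node = queue.popleft()
        for child in lineage.get(node, []):
            if child not in visited:
                visited.add(child)
                order.append(child)
                queue.append(child)
    return order


# Which assets could a type change on the SQL column affect?
impacted = downstream("sales_db.orders.amount", LINEAGE)
```

Here `impacted` contains the warehouse column and then the Power BI report field, so the owner of the sales report can be warned before the column type changes.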


Atlan’s approach to data lineage

Atlan agrees with Gartner that strong data lineage features are a cornerstone of data quality and good data governance. That’s why Atlan’s data catalog supports column-level lineage with rich UI visualization tools.

Atlan’s data lineage capabilities can help you integrate, curate, and manage metadata from all of the data stores and BI tools in your data stack. Moreover, they enable your team to collaborate to find, identify, and fix data quality issues across that stack.

Ready to start your data lineage journey? Try Atlan today and see the difference.
