Data lineage has become an indispensable part of data workflows. But how is data lineage evolving to serve the needs of modern business?
In this article, we’ll review how Gartner sees data lineage and its role in the modern data stack. We’ll also cover what you should look for in data lineage tools and how Gartner sees the concept of data lineage evolving over time.
Table of contents #
- Data lineage: A fundamental feature of any data catalog
- Gartner on data lineage
- What to look for in data-lineage-aware tools
- Atlan’s approach to data lineage
- Related reads
Data lineage: A fundamental feature of any data catalog #
Data lineage is metadata that documents the flow of a data object through a company over time. It enables you to ask critical questions about data, including:
- Where did this data come from?
- Who changed it last - and how?
- What purpose does this data serve?
- Who’s consuming this data elsewhere in the company?
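As a rough illustration, the questions above map naturally onto fields of a lineage metadata record. The sketch below is a minimal, hypothetical structure: the `LineageRecord` class, its field names, and the sample asset names are assumptions made for this example, not the schema of any particular catalog or vendor.

```python
from dataclasses import dataclass, field


@dataclass
class LineageRecord:
    """Hypothetical lineage metadata for one data asset (illustrative only)."""
    asset: str
    upstream: list[str] = field(default_factory=list)    # where the data came from
    downstream: list[str] = field(default_factory=list)  # who consumes it elsewhere
    last_changed_by: str = ""                             # who changed it last
    last_change: str = ""                                 # how it was changed
    purpose: str = ""                                     # what the data is for


# Sample values are invented for the example.
record = LineageRecord(
    asset="warehouse.orders_clean",
    upstream=["crm.raw_orders"],
    downstream=["bi.sales_dashboard"],
    last_changed_by="etl_service",
    last_change="deduplicated on order_id",
    purpose="Reporting-ready order facts",
)

print("Came from:", record.upstream)
print("Last changed by:", record.last_changed_by, "-", record.last_change)
print("Purpose:", record.purpose)
print("Consumed by:", record.downstream)
```

Real lineage tools typically collect this kind of metadata automatically and store it as a graph rather than as flat records, but the information captured is broadly similar.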
Data lineage support is a core offering for any data catalog. With a data catalog, your users can discover and query data no matter where it lives throughout the company. With data lineage, they can perform a host of operations that improve the quality and utility of data, including:
- Increase users’ confidence in data by identifying its sources and history over time
- Improve data quality through validation and prevention of data loss
- Perform impact analysis, identifying how potential data changes will affect downstream consumers
- Enforce data governance through classification and auditing
Read more → Data Lineage 101
Gartner on data lineage #
Gartner doesn’t track data lineage tools explicitly using its Magic Quadrant assessment as it does for other categories, such as Data Quality or Data Catalogs.
However, in its Magic Quadrant for Data Quality Solutions report, Gartner identifies collecting metadata from third-party systems and building data lineage maps as a key component of any data quality solution.
To qualify for a ranking in Gartner’s Data Quality report, a solution must “deliver critical data quality functions.” Gartner ranks data lineage and metadata as part of that feature set, along with data monitoring, roles management, data validation, and standardization and cleansing, among others.
Gartner’s analysts describe lineage as critical “to perform rapid root cause analysis of data quality issues and impact analysis of remediation.”
Gartner’s report describes lineage as an “emerging technology.” Even so, it marks down several Data Quality tools for their poor data lineage support.
A large component of data lineage support is interoperability. Gartner says that any data lineage solution must have “close integration with metadata solutions” to enable lineage.
How Gartner sees data lineage in the overall data landscape #
In general, Gartner sees data lineage as part of a larger category it calls “active metadata management”.
In its 6-tier scheme for metadata, Gartner sees most companies incorporating data lineage at Level 2, the Catalog level. At this level, an organization has just invested in and deployed a data catalog and is taking the first steps to trace the flow of data through the company.
Data lineage work continues throughout the subsequent levels of metadata management maturity. At these higher levels, companies add the ability to perform critical asset resolution, analyze trends, and generate alerts and recommendations.
In other words, in Gartner’s view, data lineage is a foundational technology for any data-mature enterprise.
What to look for in data-lineage-aware tools #
If you’re just starting on the road to data lineage, it’s important to select the right tools.
As Gartner’s report on Data Quality tools makes clear, data lineage is still a “new” and evolving technology. Unfortunately, many data quality and data catalog tools that tout lineage support are still immature or hard to use. It’s important to weigh your options before making such a critical investment.
That’s why we put together a guide to the 19 questions and gotchas to look out for when selecting a data lineage solution. Some of the key features any data lineage solution should have include:
- Column-level lineage, so that you have granular visibility into how your data and its characteristics change over time
- Connectivity to a broad array of third-party systems for collecting metadata and lineage information
- Automated lineage support, so that you always have an up-to-date view of your data
- Manual enrichment and editing of lineage metadata to augment your automated systems
- Strong visualization support, so that you can see the flow of data clearly and use it to make decisions
With this array of features, you can not only see your data but also act on it.
For example, you can see clearly how a change to the type of a column in a SQL database table could impact a Power BI report used by the sales department.
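This kind of impact analysis can be sketched as a walk over a column-level lineage graph. The example below is a minimal sketch under stated assumptions: the `LINEAGE` mapping, the column and report names, and the `downstream_impact` helper are all hypothetical and do not represent any tool's actual API. It shows how a change to one source column can be traced to the report fields that depend on it.

```python
from collections import deque

# Hypothetical column-level lineage edges:
# each key is a source column, each value lists the columns or fields derived from it.
LINEAGE = {
    "sales_db.orders.amount": ["warehouse.fct_orders.amount"],
    "warehouse.fct_orders.amount": ["powerbi.sales_report.total_revenue"],
    "sales_db.orders.region": ["warehouse.fct_orders.region"],
}


def downstream_impact(column, lineage):
    """Return every asset reachable downstream of `column`, in breadth-first order."""
    seen = set()
    impacted = []
    queue = deque([column])
    while queue:
        current = queue.popleft()
        for child in lineage.get(current, []):
            if child not in seen:  # guard against revisiting shared or cyclic nodes
                seen.add(child)
                impacted.append(child)
                queue.append(child)
    return impacted


# Which assets would a type change to orders.amount touch?
impacted = downstream_impact("sales_db.orders.amount", LINEAGE)
print(impacted)
```

With the sample edges above, the traversal reaches the warehouse column first and then the Power BI report field, which is the information a team would want before altering the source column's type.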
Atlan’s approach to data lineage #
Atlan agrees with Gartner that strong data lineage features are a cornerstone of data quality and good data governance. That’s why Atlan’s data catalog supports column-level lineage with rich UI visualization tools.
Atlan’s data lineage capabilities can help you integrate, curate, and manage metadata from all of the data stores and BI tools in your data stack. Moreover, they enable your team to collaborate to find, identify, and fix data quality issues across your data stack.
Ready to start your data lineage journey? Try Atlan today and see the difference.
Gartner data lineage: Related reads #
- Gartner Data Catalog Research Guide — How To Read Market Guide, Magic Quadrant, and Peer Reviews
- A Guide to Gartner Data Governance Research — Market Guides, Hype Cycles, and Peer Reviews
- Gartner Active Metadata Management: Concept, Market Guide, Peer Insights, Magic Quadrant, and Hype Cycle
- Active Metadata: Your 101 Guide From People Pioneering the Concept & It’s Understanding
- The G2 Grid® Report for Data Governance: How Can You Use It to Choose the Right Data Governance Platform for Your Organization?
- The G2 Grid® Report for Machine Learning Data Catalog: How Can You Use It to Choose the Right Data Catalog for Your Organization?
- Data Catalog: What It Is & How It Drives Business Value
- What Is a Metadata Catalog? - Basics & Use Cases
- Modern Data Catalog: What They Are, How They’ve Changed, Where They’re Going
- Open Source Data Catalog - List of 6 Popular Tools to Consider in 2024
- 5 Main Benefits of Data Catalog & Why Do You Need It?
- Enterprise Data Catalogs: Attributes, Capabilities, Use Cases & Business Value
- The Top 11 Data Catalog Use Cases with Examples
- 15 Essential Features of Data Catalogs To Look For in 2024
- Data Catalog vs. Data Warehouse: Differences, and How They Work Together?
- Snowflake Data Catalog: Importance, Benefits, Native Capabilities & Evaluation Guide
- Data Catalog vs. Data Lineage: Differences, Use Cases, and Evolution of Available Solutions
- Data Catalogs in 2024: Features, Business Value, Use Cases
- AI Data Catalog: Exploring the Possibilities That Artificial Intelligence Brings to Your Metadata Applications & Data Interactions
- Amundsen Data Catalog: Understanding Architecture, Features, Ways to Install & More
- Machine Learning Data Catalog: Evolution, Benefits, Business Impacts and Use Cases in 2024
- 7 Data Catalog Capabilities That Can Unlock Business Value for Modern Enterprises
- Data Catalog Architecture: Insights into Key Components, Integrations, and Open Source Examples
- Data Catalog Market: Current State and Top Trends in 2024
- Build vs. Buy Data Catalog: What Should Factor Into Your Decision Making?
- How to Set Up a Data Catalog for Snowflake? (2024 Guide)
- Data Catalog Pricing: Understanding What You’re Paying For
- Data Catalog Comparison: 6 Fundamental Factors to Consider
- Alation Data Catalog: Is it Right for Your Modern Business Needs?
- Collibra Data Catalog: Is It a Viable Option for Businesses Navigating the Evolving Data Landscape?
- Informatica Data Catalog Pricing: Estimate the Total Cost of Ownership
- Informatica Data Catalog Alternatives? 6 Reasons Why Top Data Teams Prefer Atlan
- Data Catalog Implementation Plan: 10 Steps to Follow, Common Roadblocks & Solutions
- Data Catalog Demo 101: What to Expect, Questions to Ask, and More
- Data Mesh Catalog: Manage Federated Domains, Curate Data Products, and Unlock Your Data Mesh
- Best Data Catalog: How to Find a Tool That Grows With Your Business
- How to Build a Data Catalog: An 8-Step Guide to Get You Started
- The Forrester Wave™: Enterprise Data Catalogs, Q3 2024 | Available Now
- How to Pick the Best Enterprise Data Catalog? Experts Recommend These 11 Key Criteria for Your Evaluation Checklist
- Collibra Pricing: Will It Deliver a Return on Investment?
- Data Lineage Tools: Critical Features, Use Cases & Innovations
- OpenMetadata vs. DataHub: Compare Architecture, Capabilities, Integrations & More
- Automated Data Catalog: What Is It and How Does It Simplify Metadata Management, Data Lineage, Governance, and More
- Data Mesh Setup and Implementation - An Ultimate Guide