The Future of the Modern Data Stack: Key Trends, How it Started, and What’s next?

Emily Winks
Data Governance Expert
Published: 01/13/2022 | Updated: 01/13/2022
6 min read

Key takeaways

  • Understanding the key trends shaping the modern data stack, and how they started, is essential for modern data teams.
  • A structured approach helps organizations scale their data governance efforts.

Quick Answer: What is the Future of the Modern Data Stack?

The modern data stack is evolving rapidly around six key trends that emerged from 2021 and continue to shape the data industry. From data mesh and metrics layers to reverse ETL and data observability, these ideas are transforming how organizations manage, govern, and extract value from data.

Key trends covered:

  • Data Mesh: decentralized data ownership and domain-oriented architecture
  • Metrics Layer: standardizing KPI definitions across tools like dbt and BI platforms
  • Reverse ETL: pushing warehouse data back into operational business tools
  • Active Metadata: third-generation catalogs that automate data management tasks
  • Data Observability: monitoring and preventing data downtime proactively
  • Data teams as product teams: shifting from a service model to a user-centered approach


Is it just us, or did data go through five years’ worth of change in 2021? With so much hype and rapid change, it’s hard to know what trends are here to stay and which will disappear just as quickly as they arose.

This guide breaks down the six ideas you should know about the modern data stack going into 2022 — the ones that exploded in the data world last year and don’t seem to be going away.


Data Mesh

You probably know this term by now, even if you don’t know exactly what it means. The idea of the “data mesh” came from two 2019 blogs by Zhamak Dehghani, Director of Emerging Technologies at Thoughtworks. It grew quietly from 2019 until, suddenly, it was everywhere in 2021.

The Data Mesh Learning Community launched, and their Slack group got over 1,500 signups in 45 days. Zalando started doing talks about how it moved to a data mesh.

Soon enough, hot takes were flying back and forth on Twitter, with data leaders arguing over whether the data mesh is revolutionary or ridiculous.

So what exactly is data mesh, and how will this trend shape up in 2022? Read here.

Metrics Layer

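In 2021, people finally started talking about how the modern data stack could fix a long-standing issue: the same metric being defined differently across tools, teams, and dashboards. The fix has been called the metrics layer, metrics store, headless BI, and even more names than we can list here.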

It started in January, when Base Case proposed “Headless Business Intelligence”, a new approach to solving metrics problems. A couple of months later, Benn Stancil from Mode talked about the “missing metrics layer” in today’s data stack.

Airbnb announced that it had been building a home-grown metrics platform called Minerva to solve this issue. Other prominent tech companies soon followed suit, including LinkedIn’s Unified Metrics Platform, Uber’s uMetric, and Spotify’s metrics catalog in their “new experimentation platform”.

Drew Banin (CPO and Co-Founder of dbt) opened a PR on dbt-core, hinting that dbt would be incorporating a metrics layer into its product. The PR even included links to those foundational blogs by Benn and Base Case.
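
To make the idea concrete, here is a minimal sketch of what "define a metric once, reuse it everywhere" could look like. It is not dbt's, Minerva's, or any vendor's actual API: the `Metric` class, its fields, and the `analytics.orders` table are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    """A single shared definition of a business metric (hypothetical schema)."""
    name: str
    table: str        # warehouse table the metric is computed from (assumed name)
    expression: str   # SQL aggregate that defines the metric
    time_column: str  # timestamp column used for time grouping

    def to_sql(self, grain: str = "month") -> str:
        """Compile the definition into a SQL query at the requested time grain."""
        if grain not in {"day", "week", "month", "year"}:
            raise ValueError(f"unsupported grain: {grain}")
        return (
            f"SELECT DATE_TRUNC('{grain}', {self.time_column}) AS period, "
            f"{self.expression} AS {self.name} "
            f"FROM {self.table} GROUP BY 1 ORDER BY 1"
        )


# Defined once; a dashboard, a notebook, and an experiment report would all
# ask this object for SQL instead of re-implementing "revenue" themselves.
revenue = Metric(
    name="revenue",
    table="analytics.orders",
    expression="SUM(amount)",
    time_column="ordered_at",
)

print(revenue.to_sql("week"))
```

The point of the sketch is the design choice rather than the SQL: the definition lives in one place, so changing how revenue is calculated changes it for every consumer at once.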

Are we going to see metrics become a first-class citizen in more transformation tools in 2022? Read here.

Reverse ETL

In 2021, we got another major evolution in this idea — reverse ETL. This concept first started getting attention in February, when Astasia Myers (Founding Enterprise Partner at Quiet Capital) wrote an article about the emergence of reverse ETL.

Hightouch and Census have dominated the reverse ETL discussion in 2021, but they’re not the only ones in the space. Other notable companies are Grouparoo, HeadsUp, Polytomic, RudderStack, and Workato (who closed a $200m Series E in November). SeekWell even got acquired by ThoughtSpot in March.
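
At its core, reverse ETL means taking rows that already live in the warehouse and pushing them into an operational tool such as a CRM. The sketch below shows one simple way to plan such a sync by diffing the two sides. It is an illustration under stated assumptions, not how any of the tools above work: `plan_sync`, the `email` key, and the sample rows are hypothetical, and a real sync would also handle deletes, rate limits, and API errors.

```python
from typing import Dict, List

Row = Dict[str, object]


def plan_sync(warehouse_rows: List[Row], destination_rows: List[Row], key: str) -> Dict[str, List[Row]]:
    """Compare warehouse rows with the destination's current state.

    Returns the rows that would need to be created or updated in the
    destination so that it matches the warehouse.
    """
    existing = {row[key]: row for row in destination_rows}
    plan: Dict[str, List[Row]] = {"create": [], "update": []}
    for row in warehouse_rows:
        current = existing.get(row[key])
        if current is None:
            plan["create"].append(row)   # not in the destination yet
        elif current != row:
            plan["update"].append(row)   # present, but out of date
    return plan


# Hypothetical data: customer lifetime value computed in the warehouse,
# and what the CRM currently holds.
warehouse = [
    {"email": "a@example.com", "ltv": 120},
    {"email": "b@example.com", "ltv": 75},
]
crm = [{"email": "a@example.com", "ltv": 100}]

plan = plan_sync(warehouse, crm, key="email")
print(plan)
```

Diffing before writing keeps the number of calls to the destination small, which matters because operational tools usually expose rate-limited APIs rather than bulk loaders.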

Will we finally solve the “last mile” problem in the modern data stack? Read here.

Active Metadata & Third-Gen Data Catalogs

This year, data catalogs got new life with the creation of two new concepts — third-generation data catalogs and active metadata.

At the beginning of 2021, we wrote an article on modern metadata for the modern data stack. We introduced the idea that we’re entering the third generation of data catalogs, a fundamental transformation from the prevalent old-school, on-premises data catalogs. These new data catalogs are built around diverse data assets, “big metadata”, end-to-end data visibility, and embedded collaboration.

This idea got amplified by a huge move Gartner made this year — scrapping its Magic Quadrant for Metadata Management Solutions and replacing it with the Market Guide for Active Metadata. In doing this, they introduced “active metadata” as a new category in the data space.

Since the first time we wrote about third-generation catalogs, they’ve become part of the discourse around what it means to be a modern data catalog. We even saw the terms pop up in RFPs!

How can we use and leverage metadata to create the modern data experience? Read here.

[Figure: Data catalog 3.0 requirements]

Data Teams as Product Teams

In 2021, Emilie Schario from Amplify Partners, Taylor Murphy from Meltano, and Eric Weber from Stitch Fix talked about a way to break data teams out of this trap: rethinking data teams as product teams. They first explained this idea with a blog on Locally Optimistic, followed by great talks at conferences like MDSCON, dbt Coalesce, and Future Data.

A product isn’t measured on how many features it has or how quickly engineers can quash bugs — it’s measured on how well it meets customers’ needs. Similarly, data product teams should be centered on the users (i.e. data consumers throughout the company), rather than questions answered or dashboards built. This allows data teams to focus on experience, adoption, and reusability, rather than ad-hoc questions or requests.

Data teams today are stuck in a service trap, and only 27% of their data projects are successful. So will data teams emerge as one of the most important teams in the organizational fabric? Read here.

Data Observability

This idea came out of “data downtime”, which Barr Moses from Monte Carlo first spoke about in 2019: “Data downtime refers to periods of time when your data is partial, erroneous, missing or otherwise inaccurate”. It’s those emails you get the morning after a big project, saying “Hey, the data doesn’t look right…”

Data downtime has been a part of normal life on a data team for years. But now, with many companies relying on data for literally every aspect of their operations, it’s a huge deal when data stops working.

Yet everyone was just reacting to issues as they cropped up, rather than proactively preventing them. This is where data observability — the idea of “monitoring, tracking, and triaging of incidents to prevent downtime” — came in.
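
As a rough illustration of what proactive monitoring can mean in practice, the sketch below checks a table for two common symptoms of data downtime: data that is stale (missing) and data with too many nulls (partial). Everything here is an assumption made for the example, including the `check_table` function, the thresholds, and the sample rows. Real observability tools track many more signals and learn thresholds rather than hard-coding them.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional

Row = Dict[str, Optional[float]]


def check_table(
    rows: List[Row],
    column: str,
    loaded_at: datetime,
    now: datetime,
    max_age: timedelta = timedelta(hours=24),  # assumed freshness threshold
    max_null_rate: float = 0.05,               # assumed completeness threshold
) -> List[str]:
    """Return a list of issue codes for a table; an empty list means healthy."""
    issues: List[str] = []
    if now - loaded_at > max_age:
        issues.append("stale")            # the last load is older than allowed
    if not rows:
        issues.append("empty")            # nothing was loaded at all
        return issues
    null_rate = sum(1 for r in rows if r.get(column) is None) / len(rows)
    if null_rate > max_null_rate:
        issues.append("high_null_rate")   # too many missing values in the column
    return issues


# Hypothetical snapshot: the table was last loaded 30 hours ago and half of
# the values in `amount` are missing.
now = datetime(2022, 1, 13, 9, 0, tzinfo=timezone.utc)
rows = [{"amount": 10.0}, {"amount": None}, {"amount": 7.0}, {"amount": None}]

issues = check_table(rows, "amount", loaded_at=now - timedelta(hours=30), now=now)
print(issues)
```

Run on a schedule, a check like this would raise the alarm before the "Hey, the data doesn’t look right…" email arrives, which is the shift from reacting to preventing that the paragraph above describes.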

The space went from being non-existent to hosting a bunch of companies, with a collective $200m of funding raised in 18 months. But is data observability here to stay and be a key part of the modern data stack in the future? Read here.

Conclusion

It may feel chaotic and crazy at times, but today is a golden age of data. In the last eighteen months, our data tooling has grown exponentially.

We’re on the cusp of getting the modern data stack right. Read about these trends and what 2022 holds for them in this report.
