The Future of the Modern Data Stack: Key Trends, How It Started, and What’s Next

Is it just us, or did data go through five years’ worth of change in 2021? With so much hype and rapid change, it’s hard to know which trends are here to stay and which will disappear just as quickly as they arose.

This guide breaks down the six ideas you should know about the modern data stack going into 2022 — the ones that exploded in the data world last year and don’t seem to be going away.

Data Mesh

You probably know this term by now, even if you don’t know exactly what it means. The idea of the “data mesh” came from two 2019 blog posts by Zhamak Dehghani, Director of Emerging Technologies at Thoughtworks. It grew quietly after that, until suddenly it was everywhere in 2021.

The Data Mesh Learning Community launched, and its Slack group got over 1,500 signups in 45 days. Zalando started giving talks about how it moved to a data mesh.

Soon enough, hot takes were flying back and forth on Twitter, with data leaders arguing over whether the data mesh is revolutionary or ridiculous.

So what exactly is data mesh, and how will this trend take shape in 2022? Read here.

Metrics Layer

In 2021, people finally started talking about how the modern data stack could fix this issue. It’s been called the metrics layer, metrics store, headless BI, and even more names than we can list here.

It started in January, when Base Case proposed “Headless Business Intelligence”, a new approach to solving metrics problems. A couple of months later, Benn Stancil from Mode talked about the “missing metrics layer” in today’s data stack.

Airbnb announced that it had been building a home-grown metrics platform called Minerva to solve this issue. Other prominent tech companies soon followed suit with their own platforms, including LinkedIn’s Unified Metrics Platform, Uber’s uMetric, and Spotify’s metrics catalog in its “new experimentation platform”.

Drew Banin (CPO and Co-Founder of dbt) opened a PR on dbt-core, hinting that dbt would be incorporating a metrics layer into its product. The PR even included links to those foundational blogs by Benn and Base Case.

Are we going to see metrics become a first-class citizen in more transformation tools in 2022? Read here.

Reverse ETL

In 2021, we got another major evolution in this idea — reverse ETL. This concept first started getting attention in February, when Astasia Myers (Founding Enterprise Partner at Quiet Capital) wrote an article about the emergence of reverse ETL.

Hightouch and Census have dominated the reverse ETL discussion in 2021, but they’re not the only ones in the space. Other notable companies are Grouparoo, HeadsUp, Polytomic, Rudderstack, and Workato (who closed a $200m Series E in November). Seekwell even got acquired by Thoughtspot in March.

Will we finally solve the “last mile” problem in the modern data stack? Read here.

Active Metadata & Third-Gen Data Catalogs

This year, data catalogs got new life with the creation of two new concepts — third-generation data catalogs and active metadata.

At the beginning of 2021, we wrote an article on modern metadata for the modern data stack. We introduced the idea that we’re entering the third generation of data catalogs, a fundamental transformation from the prevalent old-school, on-premise data catalogs. These new data catalogs are built around diverse data assets, “big metadata”, end-to-end data visibility, and embedded collaboration.

This idea got amplified by a huge move Gartner made this year: scrapping its Magic Quadrant for Metadata Management Solutions and replacing it with the Market Guide for Active Metadata Management. In doing this, it introduced “active metadata” as a new category in the data space.

Since the first time we wrote about third-generation catalogs, they’ve become part of the discourse around what it means to be a modern data catalog. We even saw the terms pop up in RFPs!

How can we use and leverage metadata to create the modern data experience? Read here.

Figure: Data catalog 3.0 requirements

Data Teams as Product Teams

In 2021, Emilie Schario from Amplify Partners, Taylor Murphy from Meltano, and Eric Weber from Stitch Fix talked about a way to break data teams out of the service trap: rethinking data teams as product teams. They first explained this idea in a blog post on Locally Optimistic, followed by great talks at conferences like MDSCON, dbt Coalesce, and Future Data.

A product isn’t measured on how many features it has or how quickly engineers can quash bugs — it’s measured on how well it meets customers’ needs. Similarly, data product teams should be centered on the users (i.e. data consumers throughout the company), rather than questions answered or dashboards built. This allows data teams to focus on experience, adoption, and reusability, rather than ad-hoc questions or requests.

Data teams today are stuck in a service trap, and only 27% of their data projects are successful. So will data teams emerge as one of the most important teams in the organizational fabric? Read here.

Data Observability

This idea came out of “data downtime”, which Barr Moses from Monte Carlo first spoke about in 2019, saying, “Data downtime refers to periods of time when your data is partial, erroneous, missing or otherwise inaccurate”. It’s those emails you get the morning after a big project, saying “Hey, the data doesn’t look right…”

Data downtime has been a part of normal life on a data team for years. But now, with many companies relying on data for literally every aspect of their operations, it’s a huge deal when data stops working.

Yet everyone was just reacting to issues as they cropped up, rather than proactively preventing them. This is where data observability — the idea of “monitoring, tracking, and triaging of incidents to prevent downtime” — came in.
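
To make the idea concrete, here is a minimal Python sketch of what proactive monitoring could look like. It is an illustration only, not how any particular observability tool works: the `TableSnapshot` fields, the `find_incidents` function, and every threshold are hypothetical names and values we picked to mirror the "partial, erroneous, missing" wording of data downtime.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class TableSnapshot:
    """Hypothetical summary of one table at check time (not from a real tool)."""
    name: str
    row_count: int
    null_count: int                      # nulls in a column that should be populated
    last_loaded_at: Optional[datetime]   # None means the table has never loaded


def find_incidents(
    snapshot: TableSnapshot,
    now: datetime,
    min_rows: int = 1,                               # assumed threshold
    max_null_ratio: float = 0.05,                    # assumed threshold
    max_staleness: timedelta = timedelta(hours=24),  # assumed threshold
) -> List[str]:
    """Return one message per symptom of data downtime found in the snapshot."""
    incidents = []

    # Missing or stale data: the load never happened or happened too long ago.
    if snapshot.last_loaded_at is None:
        incidents.append(f"missing: {snapshot.name} has never been loaded")
    elif now - snapshot.last_loaded_at > max_staleness:
        incidents.append(f"stale: {snapshot.name} last loaded {snapshot.last_loaded_at}")

    # Partial data: fewer rows than expected.
    if snapshot.row_count < min_rows:
        incidents.append(f"partial: {snapshot.name} has {snapshot.row_count} rows")
    # Erroneous data: too many nulls where values are required.
    elif snapshot.row_count > 0 and snapshot.null_count / snapshot.row_count > max_null_ratio:
        incidents.append(f"erroneous: {snapshot.name} null ratio above {max_null_ratio}")

    return incidents


if __name__ == "__main__":
    check_time = datetime(2022, 1, 2, 9, 0)
    orders = TableSnapshot("orders", 1000, 400, datetime(2021, 12, 30, 3, 0))
    for message in find_incidents(orders, check_time):
        print(message)
```

Running a check like this on a schedule, before stakeholders open their dashboards, is the difference between preventing an incident and hearing about it by email the next morning.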

The space went from being non-existent to hosting a bunch of companies, with a collective $200m of funding raised in 18 months. But is data observability here to stay as a key part of the modern data stack? Read here.

Conclusion

It may feel chaotic and crazy at times, but today is a golden age of data. In the last eighteen months, our data tooling has grown exponentially.

We’re on the cusp of getting the modern data stack right. Read about these trends and what 2022 holds for them in this report.
