Metrics Layer: A Single Source of Truth for All KPI Definitions
A metrics layer is a framework that centralizes how an organization defines and calculates its key performance indicators (KPIs), so that every dashboard, report, and application draws on the same definitions.
In this article, we’ll explore the significance of the metrics layer, its benefits, how it differs from the semantic layer, and the requirements for a successful implementation.
Table of contents
- What is the metrics layer?
- Why do you need it?
- How to implement?
- Wrapping up
- Related reads
What is the metrics layer?
A metrics layer (also known as a metrics store or headless BI) is a framework for standardizing metrics, that is, for centralizing how a company calculates them. It can be seen as the single source of truth for the definitions of the KPIs used within the organization.
Bonus trivia: In case you were wondering, the term “headless BI” derives from the fact that these solutions enable various BI tools to connect to an API for accessing metrics. Consequently, they provide the flexibility to swap out tools while maintaining the integrity of metric definitions.
In essence, the concept of a metrics layer is not entirely unfamiliar.
For instance, you already store a project’s codebase in a central repository, versioned with Git. Similarly, the organization’s data warehouse or data lake serves as the single source of truth for all data.
Analogously, the metrics layer functions as the single source of truth for the definitions of all KPIs used within the organization.
As illustrated in the schema below, the metrics layer should reside between the data warehouse (or the data source in a broader sense) and all the relevant applications (such as dashboards, reports, AI models, etc.) that consume these metrics.
Let’s further expand on this definition.
The metrics layer not only stores all the metric definitions but also translates the requests generated by applications into SQL. Then, the layer executes the requests against the data warehouse/lake to retrieve the desired metrics.
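To make the translation step concrete, here is a minimal sketch in Python. It assumes a hypothetical in-memory registry (METRICS) and a hypothetical compile_query function; the table, column, and metric names are invented for illustration and do not come from any particular product. It only builds the SQL string and leaves execution against the warehouse out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    """One centrally stored metric definition (all fields are illustrative)."""
    name: str
    table: str           # source table in the warehouse
    expression: str      # SQL aggregate that computes the metric
    filters: tuple = ()  # conditions that always apply to this metric


# Hypothetical registry: the single place where each metric is defined.
METRICS = {
    "active_users": Metric(
        name="active_users",
        table="events",
        expression="COUNT(DISTINCT user_id)",
        filters=("is_test_account = FALSE",),
    ),
}


def compile_query(metric_name: str, group_by: tuple = ()) -> str:
    """Translate a request for a metric into a SQL string."""
    metric = METRICS[metric_name]
    select_items = list(group_by) + [f"{metric.expression} AS {metric.name}"]
    sql = f"SELECT {', '.join(select_items)} FROM {metric.table}"
    if metric.filters:
        sql += " WHERE " + " AND ".join(metric.filters)
    if group_by:
        sql += " GROUP BY " + ", ".join(group_by)
    return sql


# A dashboard asking for active users per country gets the shared definition.
print(compile_query("active_users", group_by=("country",)))
```

Because every consumer goes through the same function, the filter that excludes test accounts is written once and applied to every request.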
Why do you need the metrics layer?
You have probably heard some variation of the following sentences in your organization:
- Why is the value of this metric different on dashboards X, Y, and Z?
- It appears that this dashboard is using a different definition of metric X. Can we quickly align all of our dashboards to convey the same story?
- Somebody from management asked about the definition of this metric. Could you investigate the custom queries in this dashboard and determine how we actually calculate it?
Unfortunately, these examples illustrate the types of questions data scientists or data analysts frequently encounter during their daily jobs.
Such questions signal that metrics (or KPIs; we use the two terms interchangeably) have become unmanageable, causing chaos for users, whether they are fellow data professionals or non-technical stakeholders.
What makes it even worse is that these users must often make critical business decisions based on these metrics.
The hidden complexity of simple metrics
As businesses grow and develop, the metrics they monitor also evolve. With the increased volume of data collected, its complexity grows as well.
What might initially seem counterintuitive is that even seemingly simple tasks like counting become challenging in analytics, as numerous complexities arise when aggregating raw data.
To illustrate that, let’s consider a relatable example for many organizations: counting the number of users for an app or service. It should be straightforward, right?
However, the following issues may arise when attempting to count users:
- Determining the time frame for counting users: Should it be done on a daily, weekly, monthly, yearly, or other basis?
- Segmenting users by geographic area: If segmentation is required, what level of detail should be used? Continent, country, state, city, etc.?
- Defining an active user: How do we identify an active user? Should a user be considered inactive if there have been no transactions after a specific period? If so, what is that specific period? Additionally, how should users who log in and use the service but make no purchases be handled? The definition of “active users” can vary significantly.
- Applying data filters or excluding specific users: Should certain users be excluded based on specific flags? For example, should test accounts used by company employees be excluded?
Even a seemingly simple task like counting users involves numerous complexities.
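The issues listed above can be sketched in a few lines of Python. The activity log, the as-of date, and the count_active_users function below are all made up for illustration; the point is only that three reasonable definitions of “active user” return three different numbers from the same data.

```python
from datetime import date, timedelta

# Hypothetical activity log:
# (user_id, activity_date, made_purchase, is_test_account)
EVENTS = [
    ("u1", date(2023, 6, 1), True, False),
    ("u2", date(2023, 6, 20), False, False),
    ("u3", date(2023, 4, 15), True, False),
    ("u4", date(2023, 6, 25), True, True),
]


def count_active_users(events, as_of, window_days, require_purchase, exclude_test):
    """Count distinct users with qualifying activity in the window ending at as_of."""
    window_start = as_of - timedelta(days=window_days)
    active = set()
    for user_id, day, purchased, is_test in events:
        if not (window_start < day <= as_of):
            continue  # activity falls outside the time frame
        if require_purchase and not purchased:
            continue  # this definition only counts paying users
        if exclude_test and is_test:
            continue  # internal test accounts are filtered out
        active.add(user_id)
    return len(active)


AS_OF = date(2023, 6, 30)
# Any activity in the last 30 days, test accounts included
print(count_active_users(EVENTS, AS_OF, 30, require_purchase=False, exclude_test=False))  # 3
# A purchase in the last 30 days, test accounts excluded
print(count_active_users(EVENTS, AS_OF, 30, require_purchase=True, exclude_test=True))    # 1
# A purchase in the last 90 days, test accounts excluded
print(count_active_users(EVENTS, AS_OF, 90, require_purchase=True, exclude_test=True))    # 2
```

If each dashboard hard-codes its own choice of these parameters, all three numbers can end up labeled “active users” at the same time.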
Ensuring accuracy in these metrics is crucial, as inconsistent KPIs across multiple outlets, such as dashboards or reports, can cause stakeholders to lose trust in the data. Moreover, it can be extremely challenging for the data team to identify all the different locations where varying and often conflicting metric definitions are used.
In such scenarios, the biggest problem is that there is no central repository for storing metric definitions. These definitions are scattered across various BI tools and custom SQL queries that populate views or dashboards. Consequently, they are often recreated and reused without proper oversight and consistency.
That’s where the metrics layer comes to your rescue. Next, let’s look at the benefits of setting up a single source of truth for your KPIs.
The advantages of the metrics layer
Implementing a metrics layer ensures that people across the organization receive the same answer when they ask about a given metric, regardless of which data or non-data professional they ask.
Let’s explore some additional advantages of implementing a metrics layer:
- Promotes consistency and builds trust
- Embraces the DRY (Don’t Repeat Yourself) principle
- Facilitates adherence to software engineering best practices
- Future-proofs data consumption outlets
- Supports a variety of tools
- Provides a single interface for metrics definitions
Let’s delve into the intricacies of each advantage further.
Promotes consistency and builds trust
By enabling clear and reusable business definitions, the metrics layer fosters consistency within the organization. This consistency strengthens stakeholders’ trust in the data.
Moreover, it allows for the inspection of metric lineage — an understanding of how metrics are constructed and which data sources are used.
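As a rough illustration of lineage inspection, the sketch below assumes that each metric definition records what it is built from. The DEPENDS_ON mapping, the metric names, and the table name are hypothetical; a real metrics layer would derive this information from its own definition format.

```python
# Hypothetical definitions: each metric lists the metrics or tables it is built from.
DEPENDS_ON = {
    "average_order_value": ["revenue", "order_count"],
    "revenue": ["analytics.fct_orders"],
    "order_count": ["analytics.fct_orders"],
}


def source_tables(metric_name, depends_on):
    """Walk the dependency graph and return the warehouse tables a metric relies on."""
    sources = set()
    visited = set()
    stack = [metric_name]
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        children = depends_on.get(node)
        if children is None:
            sources.add(node)  # not a defined metric, so treat it as a source table
        else:
            stack.extend(children)
    return sorted(sources)


print(source_tables("average_order_value", DEPENDS_ON))
```

Here the derived metric resolves to a single source table, which is the kind of answer a stakeholder needs when asking where a number comes from.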
Embraces the DRY (Don’t Repeat Yourself) principle
Using the metrics layer eliminates the need to define the business logic for each metric in multiple locations. This avoids unnecessary repetition and ensures efficiency in managing metric definitions.
Facilitates adherence to software engineering best practices
Since the metrics layer is defined using code, it becomes easier to follow established best practices. Additionally, industry-standard solutions can be employed to version control the metrics layer, thus ensuring proper tracking.
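One practical consequence of defining metrics as code is that they can be checked automatically, for example in a CI job that runs before a change is merged. The following is a minimal sketch of such a check; the required fields and the validate_definitions function are assumptions made for this example, not part of any specific tool.

```python
# Hypothetical rule set: every definition must carry these fields.
REQUIRED_FIELDS = {"name", "owner", "table", "expression"}


def validate_definitions(definitions):
    """Return a list of problems; an empty list means the definitions pass."""
    problems = []
    seen = set()
    for index, definition in enumerate(definitions):
        label = definition.get("name", f"<definition #{index}>")
        missing = REQUIRED_FIELDS - set(definition)
        if missing:
            problems.append(f"{label}: missing fields {sorted(missing)}")
        if label in seen:
            problems.append(f"{label}: defined more than once")
        seen.add(label)
    return problems


definitions = [
    {"name": "active_users", "owner": "growth", "table": "events",
     "expression": "COUNT(DISTINCT user_id)"},
    # A second, conflicting definition that also lacks an owner.
    {"name": "active_users", "table": "sessions", "expression": "COUNT(*)"},
]

for problem in validate_definitions(definitions):
    print(problem)
```

A check like this catches a duplicate or incomplete definition at review time, before it reaches a dashboard.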
Future-proofs data consumption outlets
With a metrics layer in place, every consumer reads from the same definitions, which reduces the risk that a dashboard or application keeps using an outdated one. This empowers developers to build analytics features and data-powered applications, all while maintaining consistent and up-to-date metric definitions.
Supports a variety of tools
The centralized architecture of the metrics layer allows it to seamlessly integrate with a range of tools such as CRMs, BI tools, and internally developed solutions.
Regardless of the tool being used or its internal logic, the end result is based on standardized metric logic.
Provides a single interface for metrics definitions
The centralized architecture of the metrics layer offers a unified interface where all data stakeholders throughout the organization can inspect how specific metrics are defined. This promotes transparency and ensures a shared understanding of metric definitions.
How to implement the metrics layer?
Having explored the what and why of a metrics layer, it’s now time to dive into the how.
Let’s look into the requirements for a successful implementation of the metrics layer. There are several off-the-shelf solutions available for implementation, each with its own strengths and weaknesses.
However, let’s take a step back and shift our focus to the key characteristics that any implementation of a metrics layer should possess in order to effectively fulfill its role within a modern data stack.
Setting the stage for a successful metrics layer implementation
For a successful metrics layer implementation, you need five core attributes:
- A powerful semantic layer
- Integration capabilities
- Performance optimization for low latency
- Deployment flexibility
- Enterprise capabilities
Let’s delve into the nitty-gritty of each attribute, beginning with one aspect that can often lead to confusion: the semantic layer, also known as the semantic model or logical model.
Metrics layer vs semantic layer
The semantic layer serves as a mapping between the tables and columns within the data warehouse and meaningful business entities. The semantic layer is where businesses can define dimensions, measures, and metrics using business-friendly language.
It’s important to note that the semantic layer is just one component of the metrics layer and should not be mistaken for the metrics layer itself.
Ideally, these definitions should be easily crafted through an intuitive user interface (UI) and stored in version-controlled text files, usually in formats like YAML or JSON.
Furthermore, to facilitate automation, the implementation should offer a declarative API.
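As an illustration, a semantic model stored as JSON could look like the sketch below. The file layout, the key names (dimensions, measures, metrics), and the expand_metric helper are assumptions made for this example rather than the format of any specific tool. The code loads the definitions with Python’s standard json module and expands a metric written in business terms into a SQL expression.

```python
import json
import re

# Hypothetical contents of a version-controlled file such as semantic_model.json
SEMANTIC_MODEL_JSON = """
{
  "entity": "orders",
  "table": "analytics.fct_orders",
  "dimensions": {
    "order_date": "created_at",
    "country": "shipping_country"
  },
  "measures": {
    "revenue": "SUM(amount_usd)",
    "order_count": "COUNT(*)"
  },
  "metrics": {
    "average_order_value": "revenue / order_count"
  }
}
"""

IDENTIFIER = re.compile(r"[A-Za-z_]\w*")


def expand_metric(model, metric_name):
    """Replace measure names in a metric formula with their SQL expressions."""
    formula = model["metrics"][metric_name]
    measures = model["measures"]
    unknown = [token for token in IDENTIFIER.findall(formula) if token not in measures]
    if unknown:
        raise ValueError(f"{metric_name} references undefined measures: {unknown}")
    return IDENTIFIER.sub(lambda match: f"({measures[match.group(0)]})", formula)


model = json.loads(SEMANTIC_MODEL_JSON)
print(expand_metric(model, "average_order_value"))
```

Keeping the business-friendly names in a plain text file means a change to a definition shows up as an ordinary diff in version control.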
In addition to the strong semantic layer, there are several other attributes that are crucial for a well-rounded metrics layer implementation, as mentioned above. Let’s explore these further.
Integration capabilities
To ensure consistent metric definitions, a headless BI solution should have the flexibility to integrate with popular BI tools, programming languages, ML frameworks, and other relevant technologies.
This requires broad support for standards-based data protocols, APIs, and SDKs.
Performance optimization for low latency
The metrics layer should be designed for high-performance querying, enabling real-time access to metrics at scale.
This is essential for powering automation features such as email triggers and personalized product experiences.
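Caching recently computed values is one common way to keep latency low, although it is only one of several possible techniques and the article does not prescribe it. The sketch below assumes a hypothetical CachedMetricStore class that wraps whatever function actually queries the warehouse and reuses a result until it is older than a configurable time-to-live.

```python
import time


class CachedMetricStore:
    """Serve metric values from memory while they are fresh enough (illustrative)."""

    def __init__(self, run_query, ttl_seconds=60.0, clock=time.monotonic):
        self._run_query = run_query  # function that queries the warehouse
        self._ttl = ttl_seconds      # how long a cached value stays valid
        self._clock = clock          # injectable clock, useful for testing
        self._cache = {}             # metric name -> (timestamp, value)

    def get(self, metric_name):
        now = self._clock()
        hit = self._cache.get(metric_name)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]            # fresh value: skip the warehouse
        value = self._run_query(metric_name)
        self._cache[metric_name] = (now, value)
        return value


def slow_warehouse_query(metric_name):
    """Stand-in for a real warehouse call; returns a fixed number."""
    return 1234


store = CachedMetricStore(slow_warehouse_query, ttl_seconds=60.0)
print(store.get("active_users"))  # first call runs the query
print(store.get("active_users"))  # second call is served from the cache
```

The time-to-live is the trade-off to tune: a longer value lowers latency and warehouse load, while a shorter one keeps metrics closer to real time.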
Deployment flexibility
The metrics layer should support a wide range of deployment options, including fully hosted services and cloud-native deployments across different providers.
This flexibility allows organizations to choose the deployment model that best suits their specific needs and infrastructure.
Enterprise capabilities
Considerations such as governance, security, access control, performance, scalability, and high availability are crucial for many organizations.
As the metrics layer becomes a mission-critical component for various applications, tools, and processes, it should possess enterprise-grade features to meet the organization’s requirements.
By considering these requirements, organizations can ensure a comprehensive and successful implementation of the metrics layer within their modern data stack.
Wrapping up
Many companies are still in the early stages of their data science and machine learning journeys. As such, improving business intelligence and reporting can address about 90% of the data-related challenges they face.
That’s why it’s crucial for these companies to establish consistent and centralized metric definitions.
The metrics layer serves as the authoritative source for all KPI definitions used within the organization. It acts as a bridge between the data source and the various applications (dashboards, reports, AI models, etc.) that rely on these metrics.
Implementing a metrics layer offers numerous benefits. It ensures consistency and trust in the data, enhances operational efficiency, promotes adherence to best practices, future-proofs data analyses, integrates with different tools, and provides stakeholders with unified access to metric definitions.
By leveraging a metrics layer, companies can improve the precision, reliability, and overall effectiveness of their data-driven decision-making processes.
To learn a bit more about the metrics layer, consider reading the insightful conversation between Prukalpa Sankar (Co-founder of Atlan), Drew Banin (Co-founder of dbt Labs), and Nick Handel (Co-founder of Transform).
Their discussion covers a wide range of topics, starting from the fundamentals of a metrics layer and common misconceptions surrounding it to real-life use cases and its significance within the modern data stack.
Metrics layer: Related reads
- What Is a Data Catalog? & Do You Need One?
- AI Data Catalog: Exploring the Possibilities That Artificial Intelligence Brings to Your Metadata Applications & Data Interactions
- 8 Ways AI-Powered Data Catalogs Save Time Spent on Documentation, Tagging, Querying & More
- Data Catalog Market: Current State and Top Trends in 2023
- 15 Essential Data Catalog Features to Look For in 2023
- What is Active Metadata? — Definition, Characteristics, Example & Use Cases
- Data catalog benefits: 5 key reasons why you need one
- Open Source Data Catalog Software: 5 Popular Tools to Consider in 2023
- Data Catalog Platform: The Key To Future-Proofing Your Data Stack
- Top Data Catalog Use Cases Intrinsic to Data-Led Enterprises
- Business Data Catalog: Users, Differentiating Features, Evolution & More