Business Data Catalog: Users, Differentiating Features, Evolution & More
A business data catalog democratizes an organization’s data, placing it in the hands of diverse data users. Business data catalogs are particularly beneficial for non-technical business users who have previously struggled to access data.
This article provides an overview of data catalogs that are particularly suited to business users and highlights the capabilities that distinguish them from traditional catalogs. Finally, it discusses how to choose the right data catalog software for your business.
Table of contents
- What is a business data catalog?
- Who are the users of business data catalogs?
- How does a business data catalog adapt to different data users?
- How did business data catalogs come into being?
- Features of a business data catalog
- The value of a business data catalog
- Business data catalog: How to evaluate one?
- Business data catalog: Related reads
What is a business data catalog?
Business data catalogs are data catalogs designed with a diverse range of data users in mind. Unlike traditional catalogs, they treat an organization’s business users as the default audience rather than as outliers, enabling them to discover, understand, trust, and use data on an equal footing with technical users.
For example, a business analyst researching the profitability of retail stores can discover which data the stores generate, who owns that data, and how it is used. This saves time and effort, reduces frustration, and produces new business insights from existing data, without complete dependence on a technical data team.
Without a business data catalog, the business analyst would have difficulty finding and accessing retail store data. They may eventually seek help from the central data team, who may or may not have the bandwidth to fulfill the request immediately.
Older-generation data catalogs hide data in isolated systems, wrapped in technical jargon and placed behind complex interfaces that weren’t built to give business users open access within their daily workflows.
Seeing a business data catalog from the perspective of those users is the best way to understand what it does.
Who are the users of business data catalogs?
A business data catalog democratizes data access for diverse roles by giving each group of users a personalized metadata experience. This is a significant improvement over the one-size-fits-all experience of previous generations of data catalogs.
In this section we’ll explore how four user profiles use business data catalogs in their daily work:
- Business user
- Data analyst
- Data engineer
- Data scientist
1. Business user
A data-driven business user is trying to solve business problems or create new opportunities by using data.
Business users are typically “vertical experts” in a particular business domain, such as government, retail, finance, or manufacturing.
Business users aren’t expected to be data or technical experts. They prefer a graphical user interface with features embedded into their current tools and workflow, rather than yet another specialized tool, portal, or programming language.
It is rare for a business user to find all the answers they need in a single dataset. An easy-to-use business data catalog should help them discover what other datasets are available and how they are relevant to the problems they are trying to solve.
In turn, business users should be able to make their results available to others via the business data catalog.
2. Data analyst
A data analyst reviews data to identify key insights into a business’s customers and ways the data can be used to solve problems. They share these insights, and the datasets they create, with leadership and other stakeholders.
For example, a business data catalog gives them confidence in the automatically generated lineage of the data they consume, and therefore in the quality of their workflow’s output.
3. Data engineer
Data engineers create the source of data in the system, so the quality of their work affects everyone downstream.
Because so much downstream work relies on data engineers’ output, one of their important tasks is setting the initial metadata, although other users and the business data catalog itself will evolve this metadata over time.
One example is setting the initial data governance rules in the metadata, which limit who downstream can consume each data source.
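To make the governance example concrete, here is a minimal sketch of a metadata record that carries an access rule. The `DatasetMetadata` class, the `allowed_roles` field, the `can_consume` helper, and the role names are all hypothetical, chosen for illustration only; they are not the API of any particular catalog product.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class DatasetMetadata:
    """Hypothetical metadata record a data engineer attaches to a new data source."""
    name: str
    owner: str
    description: str = ""
    # Assumed governance rule: the roles allowed to consume this source downstream.
    allowed_roles: Set[str] = field(default_factory=set)


def can_consume(metadata: DatasetMetadata, role: str) -> bool:
    """Return True if the given role is listed in the source's governance rule."""
    return role in metadata.allowed_roles


# The data engineer sets the initial metadata, including who may consume the source.
store_sales = DatasetMetadata(
    name="retail_store_sales",
    owner="data-engineering",
    description="Daily sales per retail store",
    allowed_roles={"data_analyst", "data_scientist"},
)

print(can_consume(store_sales, "data_analyst"))   # True
print(can_consume(store_sales, "business_user"))  # False
```

In a real catalog the rule would likely be richer (per-column policies, approval workflows), and other users could later extend `allowed_roles` as the metadata evolves.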
4. Data scientist
A data scientist turns raw data into valuable insights that an organization needs for innovation. Using machine learning and artificial intelligence models, they interpret and analyze data from multiple sources to come up with imaginative solutions to problems.
Like data analysts, data scientists benefit from a business data catalog because of its comprehensive inventory of all data sources and metadata. It simplifies and speeds up their work, which, for data scientists, means they can run many more experiments.
Data science methods such as the OSEMN framework (Obtain, Scrub, Explore, Model, iNterpret) gain a great deal from a business data catalog, because every step draws on active metadata, a single source of truth, and openness.
How does a business data catalog adapt to different data users?
Various users of the business data catalog can access distinct views of the same metadata based on their requirements. Here are some ways in which each user can have a personalized and curated data experience tailored to their needs:
- Personalized home page
- Metadata preferences
- Curated assets
1. Personalized home page
Every data persona gets a home page customized to their requirements.
2. Metadata preferences
Users see only the custom metadata that is most relevant to them.
3. Curated assets
Each user can curate the right set of assets for their day-to-day workflow.
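The idea of distinct views over the same metadata can be sketched as a simple field filter. Everything in this example is an assumption made for illustration: the asset record, the persona names, the `PERSONA_FIELDS` preferences, and the `view_for` helper do not come from any specific catalog.

```python
# Hypothetical metadata for one asset; every persona reads from this same record.
ASSET_METADATA = {
    "name": "retail_store_sales",
    "description": "Daily sales per retail store",
    "owner": "data-engineering",
    "glossary_terms": ["Net revenue", "Store"],
    "lineage": ["pos_raw", "sales_cleaned", "retail_store_sales"],
    "schema": {"store_id": "int", "sale_date": "date", "net_revenue": "float"},
}

# Assumed metadata preferences: the fields each persona sees by default.
PERSONA_FIELDS = {
    "business_user": ["name", "description", "owner", "glossary_terms"],
    "data_analyst": ["name", "description", "lineage", "glossary_terms"],
    "data_engineer": ["name", "owner", "schema", "lineage"],
}

# Fallback for personas without configured preferences.
DEFAULT_FIELDS = ["name", "description"]


def view_for(persona: str, metadata: dict) -> dict:
    """Return only the metadata fields configured for the persona."""
    fields = PERSONA_FIELDS.get(persona, DEFAULT_FIELDS)
    return {key: metadata[key] for key in fields if key in metadata}


print(view_for("business_user", ASSET_METADATA))
print(view_for("data_engineer", ASSET_METADATA))
```

The underlying record is never copied or forked per persona; only the projection changes, which is what keeps the catalog a single source of truth while still feeling personalized.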
How did business data catalogs come into being?
Data catalogs have a history spanning more than four decades, starting in the mainframe days of the 1980s, evolving through the internet and cloud years, and culminating in today’s third generation of active metadata platforms.
Let’s take a look at how they’ve evolved.
1. 1990-2010 | Data Catalog 1.0
“Metadata management by and for IT teams”
The relational database revolution took hold in the mainframe days of the 1980s, the same period in which Bill Inmon, widely regarded as the father of data warehousing, began popularizing the data warehouse concept.
In the 1990s, the first generation of data catalogs started to emerge from vendors such as Oracle, Talend, and Informatica.
These were data catalogs by IT, for IT. While they gave experts power over data, they created a barrier for non-technical business users.
Then the internet exploded. Data moved from tape and disk to the internet, democratizing access everywhere. It was clear that existing data solutions were not going to keep pace with the future of data.
Back in 1997, Michael Lesk estimated that the internet was growing tenfold each year, and there were already up to 12,000 petabytes (1 PB = 1,000 TB) of information on the internet.
Data warehouses started to emerge because it was clear that a better way was needed for non-technical business users to get access to and answers from the data.
2. 2010-2020 | Data Catalog 2.0
“Data inventories powered by data stewards”
2nd generation data solutions, such as Collibra and Alation, evolved during the internet and cloud eras. They provided governance and catalog features on top of solutions like data warehouses.
Cloud data services were the big step forward to democratizing access to sophisticated data solutions – such as Hadoop-as-a-service – but they had two problems:
- They were still incredibly technical systems. A new role, the Data Steward, emerged in recognition of the specialist data skills required to marshal access to these systems.
- This democratization also led to “data sprawl,” because it was cheap and easy to create new data solutions. Islands of disconnected data started to appear.
These disjointed islands of data created a new headache for business users who continued to struggle to get access to and answers from the data.
At the same time, the idea of metadata shifted.
As companies started setting up massive Hadoop implementations, they realized that a simple IT inventory of data wasn’t enough anymore. Instead, they needed to blend data inventory with business context.
Second-generation data catalogs were rooted in these new ideas of data stewardship and context-driven metadata. Rather than just inventorying data, they sought to create a single source of truth.
3. 2020-today | Data Catalog 3.0
Third-generation data catalogs resemble modern, collaborative, self-service tools rather than imitating their outdated predecessors.
These data catalogs are built on four core guiding principles:
- One size doesn’t fit all in augmented data management
- Context should be embedded into teams’ daily workflows
- Piecemeal solutions are passé. End users need end-to-end visibility
- “Open by default” will drive infinite metadata-driven use cases
Beyond the technology, the biggest differentiator between a modern business data catalog and previous generations is that it democratizes data access for more users, especially non-expert, non-technical users.
HumansofData explained the biggest problem that business data catalogs solve:
“…data teams are one of the most diverse teams ever created. They’re built from analysts, engineers, analytics engineers, scientists, business users, product managers, and more — all with their own tooling preferences, skillsets, and limitations. The result is a mess of collaboration overhead and data chaos.”
Is the modern data catalog the same as the business data catalog?
Yes, modern data catalogs are business data catalogs, because they are more inclusive of diverse data users.
Conversely, first- and second-generation data catalogs are not business data catalogs, because they do not democratize data access for diverse users. Instead, they create islands of data where access is marshaled by expertise.
Features of a business data catalog
A business data catalog is not just a passive repository of metadata. It has features that elevate it to a tool that is loved and adopted by all kinds of data users. Some of these features are listed below:
Natural language search and ability to browse and filter to customize data search
Finding data should be as easy as searching for information on Google or shopping for something specific on Amazon. A business data catalog enables this type of experience.
An active business glossary for better context
A business data catalog contains a glossary that links data assets to related definitions, metrics, and other assets. This connected context is powerful for business users working with data from multiple domains.
Querying data without SQL knowledge
A business data catalog enables non-SQL users to query and better understand their data.
Embedded collaboration to eliminate switching across apps
Each team has its own preferred tooling and established workflow. A business data catalog seamlessly integrates context into these tools, rather than forcing users to constantly switch between them.
Intelligent automation that can be customized to perform various aspects of data management
A business data catalog automates several aspects of finding, compiling, and inventorying data, regardless of its type, format, or source, to extract value from data quickly.
The value of a business data catalog
One straightforward way to understand how a data catalog brings value to your business is to track time-to-value as a metric.
A simple way to calculate time-to-value:
As the number of data users grows, a sustained rise in time-to-value signals the need to invest in a data catalog. More details on this can be found in this guide.
Without a user-friendly data catalog, central data teams would have to be significantly larger in an attempt to keep up with data demands. This is economically inefficient and not scalable.
Business data catalog: How to evaluate one?
To choose the right business data catalog, start with a comprehensive list of questions to ask.
This list should include questions about the catalog’s features, ease of use for both business and technical users, customization options, and integration capabilities.
Business data catalog: Related reads
- What Is a Data Catalog? & Do You Need One?
- 15 Essential Data Catalog Features to Look For in 2023
- Data catalog benefits: 5 key reasons why you need one
- Open Source Data Catalog Software: 5 Popular Tools to Consider in 2023
- Data Catalog Platform: The Key To Future-Proofing Your Data Stack
- Top Data Catalog Use Cases Intrinsic to Data-Led Enterprises