Data Dictionary vs. Business Glossary: Definitions, Examples & Why Do They Matter?
Data dictionary vs. business glossary
A data dictionary is technical in nature: it documents the metadata that describes the contents of a data set. A business glossary, by contrast, is a common knowledge bank that defines business concepts. Both add context and meaning to data, but they serve different purposes.
Before getting into their use cases, let’s quickly understand how they are typically defined.
What is a data dictionary?
A data dictionary describes the contents of a data set — think of it as a README file for data — and is technical in nature. It describes each table and field in databases, spreadsheets, etc., and can include some higher-level metadata, such as data types. Technical teams are responsible for creating and maintaining data dictionaries.
What is a business glossary?
A business glossary contains the definitions of key business terms used by various teams within an organization. Think of it as a centralized knowledge bank that defines business concepts or terms. The business teams are responsible for creating and maintaining the business glossary of an organization.
Going deeper into these concepts, we’ve covered the following information for you.
Table of contents
- Data dictionary vs. business glossary
- Data dictionaries vs. business glossaries: Differences in application
- Example of a data dictionary
- Example of a business glossary
- Data catalog vs. data dictionary
- Different types of data dictionaries and business glossaries
- Data dictionary and business glossary: Resources to get started
Data dictionaries vs. business glossaries: Differences in application
As mentioned earlier, data dictionaries are technical in nature, whereas glossaries foster enterprise-wide consistency in data understanding and use.
For instance, the data dictionary for the Pacific Northwest Forest Inventory and Analysis Database provides information on the contents of each table and how the tables relate to one another, so researchers can join tables and run queries to find the information they need.
However, the dictionary doesn’t offer context on the elements of each table. As a result, it’s tough to answer questions such as:
- How exactly is a forest defined?
- What is the definition of a seedling?
- What is a plot, or a subplot?
Without these answers, two researchers using the forest inventory might understand some of those terms in slightly different ways, leading them to interpret the data differently.
This is where a business glossary comes in. It would define all of these terms in a clear, unambiguous way so that users across an organization understand and interpret data similarly.
Example of a data dictionary
The data dictionary acts as a reference guide to using the dataset. Let’s look at the dictionary for The Pacific Northwest Forest Inventory and Analysis Database. It contains information such as:
- A summary of each data table
- The name of each field
- Longer form descriptions of the contents of the fields
- Data types (integer, text, and real numbers)
- The relationships between tables
When a field is categorical, the dictionary offers descriptions for each category. For instance, a STATECD (i.e., state code) value of 2 in the POP_ESTN_UNIT (i.e., population estimation unit) table illustrated below means Alaska, whereas state code 6 refers to California.
If researchers wanted to use the Forest Inventory, they would find everything they need to interpret the database in this data dictionary. As a result, they can make sense of the actual values of the data accessed by referring to the relevant tables and fields.
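To make this concrete, here is a minimal Python sketch of how one entry of such a dictionary could be represented and used to decode a categorical value. The class names (`FieldEntry`, `TableEntry`) and the description strings are illustrative assumptions, not the format of the Forest Inventory dictionary itself; only the table name POP_ESTN_UNIT, the field name STATECD, and the two codes come from the example above.

```python
from dataclasses import dataclass, field


@dataclass
class FieldEntry:
    """One field (column) as documented in a data dictionary."""
    name: str
    data_type: str  # e.g. "integer", "text", "real"
    description: str
    # For categorical fields: raw code -> documented meaning.
    categories: dict[int, str] = field(default_factory=dict)

    def decode(self, value: int) -> str:
        """Return the documented meaning of a code, or a clear fallback."""
        return self.categories.get(value, f"undocumented code {value}")


@dataclass
class TableEntry:
    """One table: a short summary plus its documented fields."""
    name: str
    summary: str
    fields: dict[str, FieldEntry] = field(default_factory=dict)


# Only the two codes mentioned in the text are listed here; a real
# dictionary would document every valid code.
statecd = FieldEntry(
    name="STATECD",
    data_type="integer",
    description="State code",
    categories={2: "Alaska", 6: "California"},
)

pop_estn_unit = TableEntry(
    name="POP_ESTN_UNIT",
    summary="Population estimation unit",
    fields={"STATECD": statecd},
)

print(pop_estn_unit.fields["STATECD"].decode(6))  # California
```

Keeping the code-to-meaning mapping next to the field definition is what lets a reader interpret raw values without guessing.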
Example of a business glossary
A business glossary standardizes terms across different business units. Here is an example of a business glossary used in the Boston University Human Capital Management reports. Each term is enriched with context, such as:
- A more descriptive name
- The type of term (characteristic, variable, key figure)
- A description of the term
- Possible aliases
In some cases, the glossary also includes specific business rules for defining a term.
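The attributes listed above map naturally onto a small record type. The following Python sketch is one possible shape, not the structure used by Boston University; the class names and the sample term are hypothetical and exist only to show how aliases can resolve to a single definition.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class GlossaryTerm:
    """A business term enriched with the context described above."""
    name: str
    term_type: str  # "characteristic", "variable", or "key figure"
    description: str
    aliases: list[str] = field(default_factory=list)
    business_rule: Optional[str] = None


class BusinessGlossary:
    """Resolves a term name or any of its aliases to one shared definition."""

    def __init__(self) -> None:
        self._index: dict[str, GlossaryTerm] = {}

    def add(self, term: GlossaryTerm) -> None:
        for label in [term.name, *term.aliases]:
            key = label.strip().lower()
            # Refuse ambiguous labels: one label, one meaning.
            if key in self._index and self._index[key] is not term:
                raise ValueError(f"{label!r} already refers to another term")
            self._index[key] = term

    def lookup(self, label: str) -> Optional[GlossaryTerm]:
        return self._index.get(label.strip().lower())


# Hypothetical entry, for illustration only.
glossary = BusinessGlossary()
glossary.add(
    GlossaryTerm(
        name="Full-Time Equivalent",
        term_type="key figure",
        description="Illustrative definition: workload expressed as a share of a full-time position.",
        aliases=["FTE"],
        business_rule="Illustrative rule: a value of 1.0 represents one full-time position.",
    )
)

print(glossary.lookup("fte").name)  # Full-Time Equivalent
```

Rejecting a label that already points at a different term is a deliberate choice here: it enforces the "clear, unambiguous" property a glossary is meant to provide.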
Data catalog vs. data dictionary
People often find themselves thinking about the differences in use cases of a data catalog and a data dictionary. Let’s try to address that here as well.
A data catalog is a tool that helps index, inventory and classify data assets across multiple data sources in your enterprise. It adds a much-needed context layer with a focus on discovery, search, metadata management, lineage, collaboration, and governance.
The best data catalog tools make it possible to create and maintain data dictionaries for your databases. They even go a step further to use metadata from dictionaries to power use cases of data discovery, trust, usage, and governance.
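As a rough illustration of the indexing and discovery idea, the Python sketch below flattens data dictionaries from several sources into one inventory and searches it by keyword. It is a toy model under stated assumptions, not how any particular catalog tool works; the source names, table names, and descriptions are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogAsset:
    """One column, tagged with the source and table it lives in."""
    source: str
    table: str
    column: str
    description: str


def build_catalog(dictionaries):
    """Flatten per-source data dictionaries into one inventory.

    `dictionaries` maps source name -> table name -> column name -> description.
    """
    return [
        CatalogAsset(source, table, column, description)
        for source, tables in dictionaries.items()
        for table, columns in tables.items()
        for column, description in columns.items()
    ]


def search(catalog, keyword):
    """Return assets whose column name or description mentions the keyword."""
    needle = keyword.lower()
    return [
        asset
        for asset in catalog
        if needle in asset.column.lower() or needle in asset.description.lower()
    ]


# Hypothetical sources and tables, for illustration only.
dictionaries = {
    "warehouse": {
        "policies": {
            "policy_id": "Unique identifier of a policy",
            "policy_expiration_date": "Date the policy expires",
        }
    },
    "crm": {
        "customers": {"customer_id": "Unique identifier of a customer"},
    },
}

catalog = build_catalog(dictionaries)
for asset in search(catalog, "identifier"):
    print(asset.source, asset.table, asset.column)
```

The point of the sketch is the direction of flow: dictionaries are the input, and the catalog adds a cross-source search layer on top of them.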
Different types of data dictionaries and business glossaries
Data dictionaries can be:
- Logical
- Physical
The focus of a logical data dictionary is the meaning of the data and the relationships within the data, from the perspective of business use. Logical dictionaries are platform-agnostic — even if a dataset is moved to a different data platform, the logical data dictionary would still be valid.
Physical data dictionaries describe specific tables and fields and align with storage platform-specific naming conventions. They can include low-level details like lengths of fields and data types. Since these dictionaries are tied to the technical details of the stored data, they aren’t platform-agnostic. So, if a dataset is moved to a different data platform, the physical data dictionary would no longer be applicable.
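The difference can be sketched in Python: the logical entry stays the same when a dataset moves, while the physical entry is derived again for each platform. The type mappings below use real PostgreSQL and BigQuery type names, but the choice of these two platforms, the snake_case naming convention, and all class and function names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalField:
    """Platform-agnostic: what the data means."""
    name: str
    meaning: str
    logical_type: str  # "integer", "text", or "real"


@dataclass(frozen=True)
class PhysicalField:
    """Platform-specific: how the data is stored."""
    platform: str
    column_name: str
    column_type: str


# Assumed mapping from logical types to platform-specific column types.
TYPE_MAP = {
    "postgresql": {"integer": "integer", "text": "text", "real": "double precision"},
    "bigquery": {"integer": "INT64", "text": "STRING", "real": "FLOAT64"},
}


def to_physical(logical: LogicalField, platform: str) -> PhysicalField:
    """Derive the physical entry for one platform from a logical entry."""
    column_type = TYPE_MAP[platform][logical.logical_type]
    # Assumed naming convention: lower-case words joined by underscores.
    column_name = logical.name.lower().replace(" ", "_")
    return PhysicalField(platform, column_name, column_type)


state_code = LogicalField(
    name="State code",
    meaning="Identifies the state a record belongs to",
    logical_type="integer",
)

# The logical entry is reused unchanged; only the physical entry differs.
print(to_physical(state_code, "postgresql"))
print(to_physical(state_code, "bigquery"))
```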
A business glossary can define aspects like:
- Business terminology
- Metrics
- Projects
- Technical data assets
1. Business terminology
A glossary for business terminology helps solve team collaboration problems by getting everyone on the same page. It can include definitions and rules of use for terms such as annual statements, appraisals, etc.
2. Metrics
A metrics glossary will contain business metrics such as ARR, MAU, MTR, NPS, etc. The glossary defines each metric and explains how it’s calculated. This ensures that metrics are standardized and team leads across the organization follow the same approach to calculating them.
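One way to keep a metric's definition and its calculation from drifting apart is to store them side by side. The Python sketch below does this for NPS, using the widely used convention that scores of 9 to 10 count as promoters and 0 to 6 as detractors on a 0 to 10 scale. The names `MetricDefinition` and `net_promoter_score` are hypothetical, and an organization's own glossary may define the metric differently.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


def net_promoter_score(scores: Sequence[int]) -> float:
    """Percentage of promoters minus percentage of detractors."""
    if not scores:
        raise ValueError("NPS is undefined for an empty set of responses")
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100.0 * (promoters - detractors) / len(scores)


@dataclass(frozen=True)
class MetricDefinition:
    """A glossary entry that pairs the wording with the calculation."""
    name: str
    definition: str
    calculate: Callable[[Sequence[int]], float]


NPS = MetricDefinition(
    name="NPS",
    definition=(
        "Percentage of promoters (scores 9-10) minus percentage of "
        "detractors (scores 0-6) on a 0-10 survey scale."
    ),
    calculate=net_promoter_score,
)

# 3 promoters, 1 detractor, 5 responses -> (3 - 1) / 5 = 40%
print(NPS.calculate([10, 10, 9, 7, 3]))  # 40.0
```

If every team calls the same `calculate` function instead of re-deriving the formula, the standardization the glossary promises is enforced rather than merely documented.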
3. Projects
A project glossary is a business glossary specific to a project. It contains relevant terminology for that project and will grow as it develops. A project-specific glossary is useful for onboarding new team members to a project. It also helps keep project handovers seamless.
For example, when a project is ready to be shipped, the project owner might bring the marketing and sales teams on board. A project glossary will help them accurately describe your project to potential customers with no miscommunication or confusion.
4. Technical data assets
A technical glossary combines a business glossary and a data dictionary. These glossaries describe the contents of technical data assets from a business point of view, such as policy_expiration_date, policy_id, etc. So, the data team can pick the right data sets to answer business questions, build dashboards, get quick insights, and more.
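A minimal Python sketch of that link between business terms and technical assets might look like the following. The column names `policy_id` and `policy_expiration_date` come from the text above; the table names `policies` and `claims`, the term labels, and the `TechnicalGlossary` class are hypothetical.

```python
from collections import defaultdict


class TechnicalGlossary:
    """Links business terms to the columns that hold the matching data."""

    def __init__(self) -> None:
        self._columns_by_term = defaultdict(set)

    def link(self, term: str, table: str, column: str) -> None:
        """Record that `table.column` stores the data behind `term`."""
        self._columns_by_term[term.lower()].add((table, column))

    def columns_for(self, term: str) -> list:
        """Return (table, column) pairs for a term, sorted for stable output."""
        return sorted(self._columns_by_term.get(term.lower(), set()))


# Hypothetical tables, for illustration only.
tech_glossary = TechnicalGlossary()
tech_glossary.link("Policy ID", "policies", "policy_id")
tech_glossary.link("Policy ID", "claims", "policy_id")
tech_glossary.link("Policy expiration date", "policies", "policy_expiration_date")

print(tech_glossary.columns_for("policy id"))
```

Starting from the business term and arriving at concrete columns is what lets the data team pick the right data sets for a given business question.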
Data dictionary and business glossary: Resources to get started
- What is the motivation for building a business glossary? Knowing your nouns.
- How should I go about building a data dictionary? Have a look at this implementation plan from the New York University Data Governance Initiative.
- What information should I put in my data dictionary? The Center for Open Science and the US Department of Agriculture give some recommendations and examples.
- What are best practices for creating data dictionaries? Smithsonian Libraries and the Northwest Environmental Data Network give some recommendations.
If you are evaluating a business glossary and data dictionary tool for your team, do take Atlan for a spin. Atlan is more than a glossary or dictionary solution: it is a collaborative metadata management and data catalog tool that enables a shared understanding of data.