7 Best Practices for Data Governance to Follow in 2023

Last Updated on: April 03rd, 2023, Published on: December 6th, 2022.

Data governance best practices are a set of guidelines known to be adopted by successful data teams to effectively scale their data governance efforts.

Implementing data governance best practices is essential to ensure that your data remains accurate, reliable, and secure.

7 Data governance best practices

The seven best practices that can help you improve data governance are:

  1. Lead with your “why”
  2. Adopt a “data product” mindset
  3. Embed collaboration in daily workflows
  4. Automate wherever possible
  5. Ensure data enablement with DataOps
  6. Invest in the right technology
  7. Keep changing and adapting your outlook on data governance

Here, we’ll explore and understand these best practices employed in data governance programs.

What are data governance best practices?

Data governance best practices are a set of guidelines known to be adopted by successful data teams to effectively scale their data governance efforts.

You can think of them as guardrails and policies that help you answer questions such as:

  • What data does your organization have?
  • Where does this data live?
  • Where and how does it flow through your organization?
  • What is it used for? What reports or metrics are getting generated using this data?
  • How can you get access to this data?
  • Who owns that data?
  • Who defines, modifies, and uses that data?
  • Can this data be shared?

Let’s take a closer look at each of the seven data governance best practices.

1. Lead with your “why”

The need for an overarching purpose

Most data governance frameworks start with a why — a goal, a corporate driver, or a strategic layer for governance strategy and vision. The “why” helps you define how your actions will deliver value and align with your organization’s business objectives.

Having an overarching purpose also helps the people in your organization develop a sense of purpose and engagement. According to Kathryn Minshew, co-founder and CEO of the popular career advice site The Muse:

“Younger employees want to believe in the value of their work. They expect to be heard and are less likely to follow orders without context.”

How does creating and communicating the “why” help your teams?

Another reason for starting with your “why” and involving your people in the process is the way data governance itself has evolved over time.

As we highlight in another data governance article, modern data governance cannot be a top-down mandate. Instead, it should be a decentralized, community-led initiative in which data governance becomes a collective, shared responsibility of everyone in your organization.

So, it’s crucial for everyone to understand the purpose behind your data governance program, policies, and standards. You can start by asking your teams how they envision your organization’s data culture over the next 12-18 months.

2. Adopt a “data product” mindset

What is a data product?

A data product is anything that extracts value from data and helps you generate meaningful insights. In the book Data Analytics with Hadoop, a data product is defined as follows:

A data application acquires its value from the data itself and creates more data as a result. It’s not just an application with data; it’s a data product.

So, a data product can be raw data, warehouses, KPI dashboards, domain data, algorithms, and more.

DJ Patil, who was formerly the Chief Data Scientist at the US Office of Science and Technology Policy, adds further context to the term here:

When you think about data products more broadly, you start to realize that even the dashboards inside your company count. Suddenly your horizons are open, and you can start creating processes that allow you to understand, make and sell things at scale.

Why should you apply product thinking to data?

Applying product thinking to data can help you generate meaning from data at scale.

Unlike a service, a product is built once and used by several customers to solve a problem. The product can be updated and improved to optimize the value your customers are getting, but the premise remains unchanged.

Here’s how Prukalpa Sankar, co-founder at Atlan, highlights the impact of product thinking on data teams:

A product isn’t measured on how many features it has or how quickly engineers can quash bugs — it’s measured on how well it meets customers’ needs. Similarly, data product teams should be centered on the users (i.e. data consumers throughout the company), rather than questions answered or dashboards built. This allows data teams to focus on experience, adoption, and reusability, rather than ad-hoc questions or requests.

Read more: How to apply product thinking to data

How can you apply the product thinking mindset to data governance?

In the case of data governance, you can treat each data domain as a data product and appoint domain data owners (i.e., data product owners) to govern the data they create. When you put the onus of managing data on those who create it, dealing with data accountability and trust issues becomes simpler.

The consumers of that data product — analysts, scientists, and business managers — should be treated as customers, and providing them with a delightful experience should be a fundamental objective of each data product owner.

The data product owners are thereby responsible for making sure that the “data product” is:

  • Reusable
  • Reproducible
  • Well-documented
  • Scalable
  • Accessible
  • Easy to understand and use, enabling self-service
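
As a rough illustration, part of the checklist above could be tracked as a small descriptor for each data product. This is only a minimal sketch: the `DataProduct` class, its fields, and the `readiness_gaps` helper are hypothetical names chosen for this example, and they cover just the ownership, documentation, and accessibility items.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """Hypothetical descriptor for one data domain treated as a product."""
    name: str
    owner: str = ""                 # the data product owner accountable for it
    description: str = ""           # documentation shown to consumers
    columns: list = field(default_factory=list)
    column_docs: dict = field(default_factory=dict)  # column name -> description
    access_request_url: str = ""    # assumed self-service access path


def readiness_gaps(product: DataProduct) -> list:
    """Return readable gaps that keep the product from being self-service."""
    gaps = []
    if not product.owner:
        gaps.append("no owner assigned")
    if not product.description:
        gaps.append("missing description")
    undocumented = [c for c in product.columns if c not in product.column_docs]
    if undocumented:
        gaps.append("undocumented columns: " + ", ".join(undocumented))
    if not product.access_request_url:
        gaps.append("no self-service access path")
    return gaps


orders = DataProduct(
    name="orders",
    owner="sales-data-team",
    columns=["id", "amount"],
    column_docs={"id": "Unique order identifier"},
)
for gap in readiness_gaps(orders):
    print(f"{orders.name}: {gap}")
```

A data product owner could run a check like this before publishing, so that gaps surface before consumers hit them.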

3. Embed collaboration in daily workflows

The role of metadata in data governance

A core outcome of data governance is making your organization’s data easy to access, understand, and consume. Metadata plays a central role in this outcome by offering relevant context that makes data discoverable and understandable to its consumers.

However, metadata cannot be housed in yet another tool that data teams must switch to for the full context. Josh Wills, a software engineer at Slack, captured the conundrum in a tweet: he has no desire to ever visit a third website just to “browse the metadata”.

The need for embedding metadata in our daily workflows. Source: Twitter

What is embedded collaboration?

Embedded collaboration is about work happening where you are, with the least amount of friction.

With embedded collaboration, you can answer several questions about the origins and traceability of data, which further simplifies data governance.

As Atlan’s co-founder Prukalpa Sankar says, “embedded collaboration can unify dozens of micro-workflows that waste time, cause frustration, and lead to tool fatigue for data teams, and instead make these tasks delightful.”

What does embedded collaboration for data governance look like?

By embedding metadata into your teams’ daily workflows, you help them collaborate on and discuss data in their tool of choice. For instance, they could search for data definitions in Slack or trace lineage without leaving Looker.

So, anyone trying to understand a dataset can do so using their BI tool, and get all the context on that asset — glossary definition, Slack discussions, queries, data lineage mapping, and more.

4. Automate wherever possible

The rise of automation

Automation is already here in the form of RPA (robotic process automation), CPA (cognitive process automation), and LPA (low-code automation). Programmable, intelligent bots are performing repeatable and redundant manual tasks, automating non-routine tasks, and even replicating decisions requiring human judgment.

Here’s how Cathy Tornbohm, VP analyst at Gartner, describes the growth of the RPA market:

“By achieving a growth rate of 31% in 2021, the RPA market grew well above the average worldwide software market growth rate of 16%.”

What would automation in data governance look like?

That’s why you should also leverage the potential of automation for data governance.

For instance, you can use programmable bots to automatically identify sensitive data, such as PII and data regulated under HIPAA or GDPR. You can also automatically propagate custom classifications to downstream and upstream assets.
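
To make this concrete, here is a minimal sketch of what such a bot might do: tag columns whose sample values look like PII, then copy those tags to every column derived from them. The regex patterns, the 0.8 threshold, the column names, and the function names are all illustrative assumptions, not the behavior of any specific governance tool. Production-grade detection needs far more robust rules.

```python
import re
from collections import deque

# Illustrative patterns only (assumption): real PII detection is much broader.
PII_PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "us_ssn": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
}


def classify_column(sample_values, threshold=0.8):
    """Return PII tags whose pattern matches at least `threshold` of the samples."""
    values = [v for v in sample_values if v]
    if not values:
        return set()
    tags = set()
    for tag, pattern in PII_PATTERNS.items():
        hits = sum(1 for v in values if pattern.match(v))
        if hits / len(values) >= threshold:
            tags.add(tag)
    return tags


def propagate_downstream(tags_by_column, lineage):
    """Copy each column's tags to every column derived from it (breadth-first).

    `lineage` maps a column to the columns built directly from it.
    """
    result = {col: set(tags) for col, tags in tags_by_column.items()}
    queue = deque(col for col, tags in result.items() if tags)
    while queue:
        col = queue.popleft()
        for child in lineage.get(col, []):
            before = len(result.setdefault(child, set()))
            result[child] |= result[col]
            if len(result[child]) > before:  # re-visit only when tags changed
                queue.append(child)
    return result


# Hypothetical column names and lineage edges.
tags = {"raw.users.email": classify_column(["ana@example.com", "li@example.org"])}
lineage = {
    "raw.users.email": ["staging.users.email"],
    "staging.users.email": ["mart.customers.contact"],
}
print(propagate_downstream(tags, lineage))
```

Upstream propagation would follow the same traversal over a reversed lineage mapping.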

5. Ensure data enablement with DataOps

DevOps and software development

DevOps rose to prominence because of its mission to deliver applications and services at scale by eliminating the silos in software development and operations.

It emphasizes developing a collaborative culture between the operations and development teams, and advocates using automation to make software delivery quicker with CI (continuous integration), CD (continuous delivery), and CD (continuous deployment).

SalesOps and sales productivity

Similarly, SalesOps came into the picture to reduce the friction between the various sales processes. According to HubSpot, SalesOps supports sales teams by offering insights on process bottlenecks, assisting with finding new leads and prospects, and using technology to make sales more efficient.

Both DevOps and SalesOps are collections of philosophies, practices, and tools that reduce friction and promote collaboration across teams.

Data products need a similar practice, one that brings together tools, processes, and culture to make the rest of the organization more data-driven and to support better data governance. That’s where DataOps can help.

Implementing DataOps to elevate from data governance to data enablement

According to Gartner,

DataOps is “a collaborative data management practice focused on improving the communication, integration, and automation of data flows between data managers and data consumers across an organization.”

It applies the principles of lean manufacturing, Agile methodology, and DevOps to data. So, DataOps ensures that you:

  • Develop your data product with the goal of delivering value to end users and the business
  • Ship “data products” just like “software products” using the Agile methodology and automation (i.e., CI/CD pipelines)
  • Weave data governance into the daily workflows of everyone in your organization
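
One way to picture shipping data products through a CI/CD pipeline is an automated check that must pass before a new batch of data is published. The sketch below is an assumption-laden example, not a prescribed DataOps implementation: `check_rows`, the column names, and the two rules (required fields and a unique key) are hypothetical.

```python
def check_rows(rows, required, unique_key):
    """Return a list of failure messages; an empty list means the batch passes."""
    failures = []
    seen = set()
    for i, row in enumerate(rows):
        # Rule 1: every required column must have a non-empty value.
        for col in required:
            if row.get(col) in (None, ""):
                failures.append(f"row {i}: missing {col}")
        # Rule 2: the key column must not repeat within the batch.
        key = row.get(unique_key)
        if key in seen:
            failures.append(f"row {i}: duplicate {unique_key}={key}")
        seen.add(key)
    return failures


batch = [
    {"id": 1, "email": "ana@example.com"},
    {"id": 1, "email": ""},
]
for message in check_rows(batch, required=["id", "email"], unique_key="id"):
    print(message)
```

In a pipeline, a non-empty result would fail the job (for example, by exiting with a non-zero status), so a bad batch never reaches consumers.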

6. Invest in the right technology

The consumerization of technology

Technology has undergone massive shifts in the last decade as production costs have decreased substantially and cloud computing has become the norm.

As a result, we live in an era where the “end users are also employees of enterprises, and their expectations of the digital technologies in the enterprise are conditioned by the technology they use in their everyday lives.”

This phenomenon is called the consumerization of technology, and that’s why investing in the right technology requires you to look for the following characteristics:

  • An intuitive, memorable experience
  • Hyper personalization
  • Quick and snappy
  • Alive and constantly adapting
  • Multiple modalities with rich interactions
  • Anytime, anywhere
  • Collaborative

What tools are essential for data governance?

The tool you use to promote data governance across your organization must embody these characteristics.

To ensure you have a solution that lets you embrace data governance best practices, your chosen tool/platform must have the following capabilities:

  • An easily searchable data catalog with 360-degree data asset profiles
  • A data workspace that can be customized according to user roles, projects, or data domains
  • A business glossary that offers rich context on each data asset
  • Programmable bots to automate data tagging, classification, etc.
  • Cross-system, column-level data lineage
  • Data quality profiling
  • Granular, role-based access controls
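
To illustrate the last capability, granular role-based access control can be thought of as a lookup from roles to the columns they may read. The following is a minimal sketch under stated assumptions: the `Grant` class, the role names, and the `"*"` wildcard convention are hypothetical and do not reflect how any particular platform models access.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    """Hypothetical grant: a role may read the listed columns of one table."""
    role: str
    table: str
    columns: frozenset  # frozenset({"*"}) is assumed to mean every column


def visible_columns(grants, roles, table, all_columns):
    """Return the columns of `table` readable by a user holding `roles`, in table order."""
    allowed = set()
    for grant in grants:
        if grant.role in roles and grant.table == table:
            allowed |= set(all_columns) if "*" in grant.columns else grant.columns
    return [col for col in all_columns if col in allowed]


GRANTS = [
    Grant("analyst", "customers", frozenset({"id", "country"})),
    Grant("support", "customers", frozenset({"id", "email"})),
    Grant("data_owner", "customers", frozenset({"*"})),
]
COLUMNS = ["id", "email", "country", "ssn"]

print(visible_columns(GRANTS, {"analyst"}, "customers", COLUMNS))
```

Because access is denied unless a grant allows it, a sensitive column such as `ssn` stays hidden from every role that has not been given it explicitly.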

7. Keep changing and adapting your outlook on data governance

The evolution of the data landscape and the modern data stack

The data landscape keeps evolving and the modern data stack keeps upgrading. Within two decades, we’ve gone from relational databases to cloud data lakehouses, and the ecosystem will continue to evolve as more data and analytics use cases emerge.

Here’s how Matt Turck, VC at FirstMark, describes this evolution:

data warehouses have unlocked an entire ecosystem of tools and companies that revolve around them: ETL, ELT, reverse ETL, warehouse-centric data quality tools, metrics stores, augmented analytics, etc. Many refer to this ecosystem as the “modern data stack”.

The Machine learning, Artificial intelligence and Data (MAD) landscape by Matt Turck and John Wu at FirstMark. Source: Matt Turck

Read more: Modern data stack 101 and The future of the modern data stack

Why is continuously reviewing your approach to data governance a best practice?

While capturing and ingesting large volumes of data has become easier and cheaper, keeping track of all that data, getting adequate context, and using it for decision-making continue to be painful.

So, there’s still a lot of room for the data tooling ecosystem to evolve. Matt Turck goes on to mention that data engineering tools and practices are still very much behind the level of sophistication and automation of their software engineering cousins.

That’s why it’s crucial to view data governance as a constantly evolving project, rather than a one-time exercise, just like the rest of the data stack.

Here’s how Snowflake emphasizes this need:

“As data volumes grow, new data streams emerge, and new access points emerge, you’ll need a policy for periodic reviews of your data governance structure — essentially governance of the data governance process.”

Why should you follow these data governance best practices?

Because they address the challenges that cause data governance programs to fail

Most organizations already have a data governance program in place. However, its effectiveness is far from guaranteed.

According to Gartner’s 2021 D&A governance survey, 61% of respondents said their governance objectives included optimization of data for business processes and productivity, but only 42% of that group believed that they were on track to meet that goal.

In the same survey, Gartner estimates that by 2025, 80% of organizations seeking to scale digital business will fail because they do not take a modern approach to data governance. Such an approach should be decentralized, community-led, and collaborative.

Data governance best practices: Next steps

Adopting a “data product” mindset, embedding collaboration in daily workflows, embracing DataOps, and leveraging highly customizable and programmable tools are critical.

You can start by identifying a high-ROI use case for data governance and applying the best practices above. Once that proof of concept has shown results, you can scale data governance to the remaining data and analytics use cases.

Ready to implement data governance best practices? Try Atlan. Say goodbye to the complex, bureaucratic version of data governance. Say hello to data enablement — a simpler, community-centered approach, with privacy at its core.
