Data Governance Software: Choosing the Right Fit for Your Business

Updated December 08th, 2023

Selecting the right data governance software isn’t an easy decision. How do you narrow down your options in a crowded market?

This article lists the basic capabilities and user stories a data governance software platform should fulfill. We’ll also give you a framework for identifying your unique needs and workflows.

Table of Contents #

  1. What to look for in data governance software
  2. Comprehensive data governance and metadata management
  3. Automated data quality enforcement and monitoring
  4. Automated data lineage tracking
  5. Policy management and management of sensitive data
  6. Collaboration and workflow management
  7. Drive innovation through data
  8. Conclusion
  9. Data governance software: Related reads

What to look for in data governance software #

Data governance is a broad term. It covers metadata management, data classification, data quality, security and privacy, and change management, among other areas.

Doing all of this at scale requires automation, so data governance software needs a broad feature set. The details will differ from one organization to the next, but most will need the following features as a baseline:

  • Comprehensive data governance and metadata management
  • Automated data quality enforcement and monitoring
  • Automated data lineage tracking
  • Policy management and management of sensitive data
  • Collaboration and workflow management
  • Drive innovation through data

To translate this into a guide to evaluating data governance software, we can break these areas down into two lists:

  • User stories - The functions and operations that users will perform. For example, what does a data steward on a team need to do in order to manage data classifications or data quality? How does a data engineer transform data? What will business analysts require in terms of finding and making use of data?
  • Capabilities - The features of a data governance software platform that enable the user stories. Capabilities should specify exactly how users will use a feature to fulfill the requirements stated in the story. (For example, what mechanisms are available for loading metadata into the system?)
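One way to put the two lists to work is a simple gap check. The sketch below is illustrative only: the story names, capability names, and the `vendor_a` set are hypothetical placeholders, not taken from any real product evaluation.

```python
# Hypothetical evaluation checklist: each user story lists the capabilities it depends on.
USER_STORIES = {
    "Steward tags sensitive fields": {"metadata editing", "classification tags"},
    "Engineer traces a broken report": {"lineage map", "column-level lineage"},
    "Analyst finds trusted data": {"catalog search", "quality scores"},
}


def unmet_stories(stories, vendor_capabilities):
    """Map each story the vendor cannot fully support to its missing capabilities."""
    gaps = {}
    for story, required in stories.items():
        missing = required - vendor_capabilities
        if missing:
            gaps[story] = sorted(missing)
    return gaps


# Hypothetical capability list gathered from a vendor demo or documentation.
vendor_a = {
    "metadata editing",
    "classification tags",
    "lineage map",
    "catalog search",
    "quality scores",
}
gaps = unmet_stories(USER_STORIES, vendor_a)
print(gaps)
```

Running the same check against each shortlisted platform shows which user stories would go unsupported, and why.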

Let’s look at each of these buckets in detail and which user stories and capabilities belong to each.

Comprehensive data governance and metadata management #

The most basic job of data governance software is collecting data and metadata wherever they live in an organization. That means not just collecting metadata but also enriching it - for example, by adding sensitivity classifications to identify and protect Personally Identifiable Information (PII).

Top-line user story: #

Can we add business context (e.g., field descriptions), manually or automatically, so everyone can better understand what data we have?

Other user stories: #

  • Can we manually identify and enrich metadata - e.g., by attaching classification tags?
  • Using a low-code or no-code approach, can we create automated tasks that identify sensitive data and deploy these classifications at scale across our data estate?
  • Can we create and automatically enforce data policies and rules around data handling?

Data governance software capabilities: #

  • Configuring data catalog connectors to automatically collect data and metadata from cloud-based and on-premises data storage systems
  • Dashboards to find, review, and classify data and edit metadata
  • Intelligent automation tooling to define processes that identify data based on certain criteria and apply transformations, such as automated tagging & classification
  • Data policy definition tools that enable authoring, previewing, and deploying data rules and policies across the organization
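To make automated tagging and classification concrete, here is a minimal sketch of a rule that tags columns by name. The rule table, tag names, and column names are assumptions made for illustration. Real platforms typically also inspect data values and may use more sophisticated detection.

```python
import re

# Hypothetical classification rules: tag name -> pattern matched against column names.
CLASSIFICATION_RULES = {
    "PII": re.compile(r"(email|phone|ssn|birth|first_name|last_name)", re.IGNORECASE),
    "Financial": re.compile(r"(iban|card_number|salary)", re.IGNORECASE),
}


def classify_columns(columns):
    """Return a mapping of column name -> sorted list of matching tags."""
    tags = {}
    for column in columns:
        matched = sorted(
            tag for tag, pattern in CLASSIFICATION_RULES.items() if pattern.search(column)
        )
        if matched:
            tags[column] = matched
    return tags


# Illustrative column list, not taken from a real catalog.
customer_columns = [
    "customer_id",
    "email_address",
    "date_of_birth",
    "card_number",
    "signup_channel",
]
column_tags = classify_columns(customer_columns)
print(column_tags)
```

Columns that match no rule are left untagged, so a steward can still review and classify them manually.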

Automated data quality enforcement and monitoring #

Data quality is a critical component within the framework of data governance. It ensures that data is correct, complete, and usable for business decision-making. Tasks within data quality include data cleansing, data enrichment, validation, and auditing.

Enforcing data quality at scale requires solid automation tooling and metrics.

Top-line user story: #

Can we define “data quality” and calculate an initial baseline quality score for all data across cloud-based and on-premises systems?

Other user stories: #

  • Can we create data quality KPIs and dashboards to track data quality and progress over time?

Data governance software capabilities: #

  • Out-of-the-box data quality profiling that hits most of the common criteria for data quality:
    • Column profile analysis
    • Table profile analysis
    • Value frequency profile analysis
    • Data quality insights (e.g., class/type violations, data duplication, formatting inconsistencies, missing values, suspicious values, potentially untagged sensitive information)
  • Customize data quality rules via WYSIWYG tooling, SQL statements, or other programmatic means (Python or JavaScript scripting)
  • Schedule data quality reports to run automatically
  • Create visualizations of data quality over time, along with actionable insights for communicating to the field and senior leadership
  • Automated alerting when the data governance software platform detects data quality issues
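As a rough illustration of column profiling and a baseline quality score, consider the sketch below. The scoring formula (the share of populated rows) is an assumption chosen for simplicity; each organization would define "data quality" in its own terms.

```python
from collections import Counter


def profile_column(values):
    """Compute a simple column profile: row, missing, distinct, and duplicate counts."""
    present = [v for v in values if v is not None and v != ""]
    frequency = Counter(present)
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(frequency),
        "duplicates": len(present) - len(frequency),
        "most_common": frequency.most_common(1)[0] if frequency else None,
    }


def quality_score(profile):
    """Assumed scoring rule: percentage of rows that are populated."""
    if profile["rows"] == 0:
        return 0.0
    return round(100 * (profile["rows"] - profile["missing"]) / profile["rows"], 1)


# Illustrative sample values for one column.
emails = ["a@example.com", "b@example.com", None, "a@example.com", ""]
email_profile = profile_column(emails)
email_score = quality_score(email_profile)
print(email_profile, email_score)
```

Storing a score like this per column on a schedule is what makes it possible to chart data quality over time and alert when it drops.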

Automated data lineage tracking #

Data lineage is an indispensable tool in modern data governance. Using lineage, you can tell where data comes from, who owns it, and when it was last updated. When issues with data arise, you can use data lineage visualizations to track down the root cause and fix the issue at its source.

Top-line user story: #

Can we perform root cause analysis to find the source of a breaking change (e.g., when a report fails to render correctly)?

Other user stories: #

  • Can we automatically gather lineage information across our entire data estate (data warehouses, data lakes, analytics tools, data transformation tools, and machine learning projects)?
  • Can we manually load additional lineage information to supplement the automatically gathered data?
  • Can we program regularly scheduled scans of lineage data and metadata to ensure they are up to date?
  • Can we assess the potential impact of a change in our data stack so that we can work with downstream consumers before introducing a breaking change?

Data governance software capabilities: #

  • Automated and manual data lineage gathering via data connectors as well as upload of manual information (e.g., Excel or CSV files)
  • A visual data lineage map showing all upstream producers and downstream consumers, as well as their relationships with one another. Ability to zoom in and out and also see data relationships in both graphical and textual representations
  • Column-level data lineage that shows how data flows through your system at a columnar level and how column structures and data change over time
  • Impact analysis integration with tools such as GitHub to detect when a change to a field in your data transformation layer may cause a negative downstream impact
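Both root cause analysis and impact analysis come down to walking a lineage graph, upstream in one case and downstream in the other. The sketch below assumes lineage is stored as a plain dictionary of edges. The asset names are hypothetical, and a real platform would gather these edges through its connectors.

```python
from collections import deque

# Hypothetical lineage edges: each asset maps to the assets that read from it.
LINEAGE = {
    "raw.orders": ["staging.orders"],
    "raw.customers": ["staging.customers"],
    "staging.orders": ["marts.revenue"],
    "staging.customers": ["marts.revenue", "marts.churn"],
    "marts.revenue": ["dashboard.weekly_sales"],
    "marts.churn": [],
    "dashboard.weekly_sales": [],
}


def downstream(asset, edges):
    """Breadth-first walk returning every asset that depends on `asset`."""
    seen = []
    queue = deque(edges.get(asset, []))
    while queue:
        node = queue.popleft()
        if node not in seen:
            seen.append(node)
            queue.extend(edges.get(node, []))
    return seen


def upstream(asset, edges):
    """Invert the edges, then reuse the same walk to list candidate root causes."""
    inverted = {}
    for source, targets in edges.items():
        for target in targets:
            inverted.setdefault(target, []).append(source)
    return downstream(asset, inverted)


# Impact analysis: what could break if staging.customers changes?
impacted = downstream("staging.customers", LINEAGE)
# Root cause analysis: which assets feed the failing dashboard?
suspects = upstream("dashboard.weekly_sales", LINEAGE)
print(impacted)
print(suspects)
```

The same traversal works at column level if the graph nodes are columns rather than tables.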

Policy management and management of sensitive data #

A key component of data governance is ensuring users can access only the data they have the right to access. Sensitive information such as customer PII should be hidden from anyone who lacks a business justification to see it.

These rules should be enforced automatically by data governance software. They should also be highly configurable so that an org can tailor them to its specific data governance policies.

Top-line user story: #

Can we implement our data governance policy via active policy management rules enforced automatically by the system?

Other user stories: #

  • Can we assign people to roles and manage access to information accordingly?
  • Can we mask sensitive information from unauthorized individuals?

Data governance software capabilities: #

  • Defining, previewing, rolling out, and monitoring data policies and data access rules, with the ability to choose a different enforcement action for each rule (restrict, redact, obfuscate, substitute)
  • Role-based access control (RBAC) to control general access to data based on job function
  • Persona-based rules, enforced via policies, that control access to data based on how people within the organization use it (hiding engineering-specific data from sales roles; automatically masking sensitive data for unauthorized stakeholders)
  • Define and apply data rules based on roles, personas, and classifications, with rules automatically propagated across the system using data lineage
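The combination of roles, classifications, and masking can be sketched in a few lines. Everything named here (the policy table, role names, and masking rule) is a hypothetical example, not any specific product's policy model.

```python
# Hypothetical policy table: classification tag -> roles allowed to see raw values.
POLICIES = {
    "PII": {"data_steward", "support_lead"},
}


def mask(value):
    """Redact all but the last two characters of a value."""
    text = str(value)
    return "*" * max(len(text) - 2, 0) + text[-2:]


def is_restricted(column, column_tags, role):
    """A column is restricted if any of its tags has a policy that excludes `role`."""
    for tag in column_tags.get(column, []):
        allowed_roles = POLICIES.get(tag)
        if allowed_roles is not None and role not in allowed_roles:
            return True
    return False


def apply_policies(row, column_tags, role):
    """Return a copy of `row` with restricted columns masked for `role`."""
    return {
        column: mask(value) if is_restricted(column, column_tags, role) else value
        for column, value in row.items()
    }


# Illustrative record and classification tags.
record = {"customer_id": 42, "email": "ana@example.com"}
tags = {"email": ["PII"]}
sales_view = apply_policies(record, tags, "sales_rep")
steward_view = apply_policies(record, tags, "data_steward")
print(sales_view)
print(steward_view)
```

Because the rule keys off classification tags rather than individual columns, a tag propagated through lineage would carry its masking behavior along with it.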

Collaboration and workflow management #

Many companies have historically approached data governance as a top-down exercise enforced from on high. That’s changing as data volumes rise and these processes fail to scale.

Modern data governance favors a decentralized model that emphasizes collaboration and sharing. A good data governance software system lets employees find data and work together in the collaboration tools they already use every day.

Modern data governance also enables data democratization. This means that teams can own their end-to-end data processes and workflows while the organization sets standards for compliance & quality across the business.

Top-line user story: #

Can users who find issues with data in a data governance software platform discuss issues or report bugs directly from the platform without switching to another tool?

Other user stories: #

  • Can the organization monitor and measure data workflows to ensure quality and build trust with users?
  • Can we configure approvals to sign off on new workflows and workflow changes?
  • Can data owners find data in their current business productivity tools (e.g., Slack or Teams)?
  • Can the organization see who’s using what data in the system?

Data governance software capabilities: #

  • Data workflow creation and monitoring tools that manage workflow change approvals and monitor pipelines for ongoing data quality
  • Embedded collaboration tools for discussing and reporting data-related issues directly via the data governance platform
  • Metrics dashboards that display data usage metrics as well as data governance software platform adoption rates

Drive innovation through data #

Data governance automation means businesses can create higher-quality data in less time. This becomes a competitive advantage, as higher-quality data leads to faster and better decision-making. It also means businesses can bring data-driven products to market faster, thanks to increased confidence and trust in the underlying data.

The introduction of AI to the data governance software space opens up even greater possibilities for speed and agility. Using AI features trained on their own data, organizations can create documentation, rules, and policies faster than manual authoring allows.

Top-line scenario: #

Can we use AI to learn from our existing data and suggest documentation for previously undocumented data assets?

Other scenarios: #

  • Can we leverage AI to suggest new business metadata and create new rules and policies that we can manually review?
  • Can we automatically track and update data lineage and system metadata as our data changes?
  • Can we access our data governance software platform via APIs in order to add our own processes, such as automated enrichment of metadata?

Data governance software capabilities: #

  • An AI recommendation engine that suggests new metadata, business glossary items, data asset documentation, data policies, and data rules based on current data, with the ability to manually review, edit, and apply them across the system
  • AI assistants that help users create SQL queries and mine their data governance software for business insights
  • Support for active metadata to automatically update data lineage, purge stale assets, raise security and data classification alerts, reduce the time required to perform root cause analysis, and optimize the data stack
  • An open API architecture that enables extending the data governance software platform via enhanced automation, connectivity to proprietary internal systems, or integration with third-party software

Conclusion #

Looking for a comprehensive data governance software platform? Whatever your data governance needs, Atlan can support them with features such as active metadata support, column-level data lineage, personalized data dashboards, AI, and an open API architecture.

Interested in seeing it for yourself? Talk to us today.
