Truedat: A Comprehensive Guide on This Open-Source Data Governance Tool

Updated August 24th, 2023

Truedat is an open-source tool from Bluetab designed to address data quality and governance. This guide will explore the essential aspects of Truedat, including its origins, core functionalities, structure, setup, and how it compares with other tools in the market.

Table of contents #

  1. What is Truedat?
  2. Truedat: Architecture overview
  3. A summary of Truedat features that support data governance
  4. Getting started with Truedat
  5. Alternatives to Truedat
  6. Wrapping up
  7. Related reads

What is Truedat? #

Truedat is an open-source tool for data governance developed by Bluetab Solutions Group (now an IBM company).

It is equipped with a suite of integrated features, including:

  • A global search engine
  • Glossary
  • Automatic data cataloging capabilities
  • Quality rule management
  • Data life cycle visualization (Data Lineage)
  • Data intake and delivery management (Data Requests)
  • A dashboard for metrics on data status and quality

Truedat origins: The problem that it aims to solve #

Bluetab launched Truedat in 2018, motivated by the growing demand for effective data governance solutions. Businesses were looking for tools that could:

  1. Catalog and oversee data from diverse sources
  2. Pinpoint and monitor data quality issues
  3. Implement and uphold data policies and protocols
  4. Conduct audits on data usage
  5. Track data governance activities

To understand what Truedat has to offer in terms of data governance, let’s get an overview of its architecture and its various modules.

Truedat: Architecture overview #

Truedat supports data governance by allowing you to organize and enrich information through configurable workflows and monitoring activity. Its principal components are:

  • Data catalog
  • Data quality
  • Data lineage
  • API integrations
  • Notifications
  • Administration of roles
  • Dashboards

Let’s explore each component further.

Data catalog #

Truedat provides a centralized repository for managing technical metadata captured from various data sources. This metadata can include an asset’s source, type, tags, relationships, and ownership.

The catalog offers a single, structured inventory of data assets and integrates with several tools in the modern data stack.

Truedat’s data catalog module - Source: Truedat.

You can use the catalog to find information on data assets, such as import date, version, quality, relationships, usage frequency, and popularity.

The catalog helps you understand data structure type, version, confidentiality, tags, changes made, etc. You can also share structures via Truedat and the relevant people will get notifications and emails.

Tables and files in Truedat are called Structures.

Data quality #

This module handles the definition, implementation, and governance of data quality rules, which helps keep data quality consistent across systems.

Data quality rules are based on your organization’s quality principles. According to Truedat, each quality rule must include the following details:

  • Why the rule is used
  • How it affects the business
  • A description of how the rule should be implemented, plus any other data that is relevant from a business or functional point of view

In Truedat, quality rules can be defined using business concepts (i.e., a business glossary entry), the data quality module itself, or a CSV file.

A workflow for implementing quality rules in Truedat - Source: Truedat.

Each quality rule should have a name, description, domain (in which the rule will be stored), concept (business glossary entry to which the rule will be applied), and status (active or inactive).
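
To make the list of fields concrete, here is a minimal Python sketch of a record that carries them. The class name, field types, and validation are assumptions made for illustration; this is not Truedat’s schema or API.

```python
from dataclasses import dataclass
from enum import Enum


class RuleStatus(Enum):
    ACTIVE = "active"
    INACTIVE = "inactive"


@dataclass(frozen=True)
class QualityRule:
    """Hypothetical record with the fields a quality rule should have.

    Illustrative only: this is not Truedat's data model.
    """

    name: str
    description: str
    domain: str   # domain in which the rule is stored
    concept: str  # business glossary entry the rule applies to
    status: RuleStatus = RuleStatus.ACTIVE

    def __post_init__(self) -> None:
        # Reject blank values for the required text fields.
        for field_name in ("name", "description", "domain", "concept"):
            if not getattr(self, field_name).strip():
                raise ValueError(f"{field_name} must not be empty")


# Example values are invented for the sketch.
rule = QualityRule(
    name="customer_email_not_null",
    description="Every customer record must have an email address.",
    domain="Customer",
    concept="Customer Email",
)
```

Keeping the status as an enumeration rather than free text means a rule can only ever be active or inactive, which matches the two states described above.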

The data quality rule engine can be used to implement data quality rules, track the results of these implementations, and generate reports.

The data quality module also provides a framework for data quality governance. You can define roles and responsibilities for data quality rule management. You can also track changes made to data quality rules.

Data lineage #

Truedat’s data lineage and impact analysis module visualizes the life cycle of data assets from origin to consumption and helps in monitoring data movement, transformations, etc.

Truedat provides traceability analysis - Source: Truedat.

You can select the resources that you wish to analyze, as well as the analysis type: traceability or impact analysis.

You can choose the level of depth you want to visualize to reduce the complexity of an asset’s lineage mapping.

You can define the level of depth for each lineage map - Source: Truedat.

Lastly, you can download all the lineage information in CSV format and use it with other tools in your data stack.
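
As a rough illustration of reusing exported lineage outside Truedat, the Python sketch below parses an edge list from CSV and walks it downstream to a chosen depth, similar in spirit to limiting the depth of a lineage view. The column names (`source`, `target`) and the sample rows are assumptions; the layout of Truedat’s actual CSV export may differ.

```python
import csv
import io
from collections import defaultdict, deque

# Hypothetical export: one lineage edge per row, with assumed column names.
SAMPLE_CSV = """source,target
raw.orders,staging.orders
staging.orders,mart.sales
mart.sales,dashboard.revenue
raw.customers,staging.customers
staging.customers,mart.sales
"""


def load_edges(csv_text):
    """Build an adjacency list {source: [targets]} from CSV text."""
    graph = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        graph[row["source"]].append(row["target"])
    return graph


def downstream(graph, start, max_depth):
    """Return {asset: depth} for assets reachable from start within max_depth hops."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_depth:
            continue  # do not expand beyond the requested depth
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    del seen[start]  # report only the downstream assets
    return seen


graph = load_edges(SAMPLE_CSV)
print(downstream(graph, "raw.orders", max_depth=2))
```

A breadth-first walk is used so that each asset is recorded at its shortest distance from the starting point, which keeps the depth cutoff predictable when an asset is reachable by more than one path.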

API integrations #

You can connect Truedat’s data catalog with other tools in your data stack via APIs, which lets you automate the exchange of asset metadata between them.

APIs also help in customizing your Truedat installation by letting you set up permission groups.

Define custom permissions and permission groups to customize your Truedat installation - Source: Truedat.

Notifications #

Truedat has a notification engine to let you keep an eye on the events that interest you. For example, you can subscribe to quality rules and get notified whenever a quality result gets generated.

Another example is subscribing to business concepts so that you get notified whenever someone adds a new entry or makes changes to an existing one.

You can also subscribe to get notifications on events that are tagged for a specific role.
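
The subscription model described above can be pictured with a small in-memory sketch in Python. Every name here (`NotificationHub`, `subscribe`, `publish`, the resource identifiers) is hypothetical and exists only to show the idea; it is not how Truedat’s notification engine is implemented or configured.

```python
from collections import defaultdict


class NotificationHub:
    """Hypothetical in-memory stand-in for a notification engine."""

    def __init__(self):
        # (resource_type, resource_id) -> set of subscribed users
        self._subscribers = defaultdict(set)
        # user -> list of messages received
        self.inbox = defaultdict(list)

    def subscribe(self, user, resource_type, resource_id):
        self._subscribers[(resource_type, resource_id)].add(user)

    def publish(self, resource_type, resource_id, message):
        # Deliver only to users subscribed to this exact resource.
        key = (resource_type, resource_id)
        for user in sorted(self._subscribers.get(key, ())):
            self.inbox[user].append(message)


hub = NotificationHub()
hub.subscribe("ana", "quality_rule", "customer_email_not_null")

# Ana is notified about the rule she follows, but not about other events.
hub.publish("quality_rule", "customer_email_not_null", "New quality result available")
hub.publish("concept", "Customer Email", "Concept updated")
```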

Administration of roles #

Roles in Truedat aren’t preconfigured. You can define and customize as many roles as needed for your organization.

User roles in Truedat - Source: Truedat.

You can configure specific rules for each user role. For instance, you can control which roles are allowed to draft, review, publish, or delete business concepts. You can also control who gets to:

  • View protected metadata in the data catalog
  • Create, modify, or delete data quality rules and quality workflow implementations
  • View lineage
  • Create, modify, or delete data domains
  • Request, review, allow, or remove data access requests
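
The controls listed above amount to a mapping from roles to permissions. The Python sketch below shows one way to model that mapping; the role names, permission names, and the `is_allowed` helper are all invented for illustration and do not reflect Truedat’s internal permission identifiers.

```python
from enum import Enum, auto


class Permission(Enum):
    # Hypothetical permission names loosely based on the controls listed above.
    DRAFT_CONCEPT = auto()
    REVIEW_CONCEPT = auto()
    PUBLISH_CONCEPT = auto()
    DELETE_CONCEPT = auto()
    VIEW_PROTECTED_METADATA = auto()
    MANAGE_QUALITY_RULES = auto()
    VIEW_LINEAGE = auto()
    MANAGE_DOMAINS = auto()


# Roles are not preconfigured, so each organization defines its own mapping.
ROLES = {
    "data_steward": frozenset({
        Permission.DRAFT_CONCEPT,
        Permission.REVIEW_CONCEPT,
        Permission.PUBLISH_CONCEPT,
        Permission.MANAGE_QUALITY_RULES,
        Permission.VIEW_LINEAGE,
    }),
    "business_user": frozenset({
        Permission.DRAFT_CONCEPT,
        Permission.VIEW_LINEAGE,
    }),
}


def is_allowed(role, permission):
    """Return True if the role grants the permission; unknown roles grant nothing."""
    return permission in ROLES.get(role, frozenset())


print(is_allowed("business_user", Permission.PUBLISH_CONCEPT))
```

Defaulting unknown roles to an empty permission set means access has to be granted explicitly, which is the safer choice when roles are defined from scratch.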

Dashboards #

Truedat integrates with Metabase to support data visualization and reporting. A business concepts dashboard can include metrics related to its status, completeness, quality rules, and more.

Meanwhile, a data quality dashboard can contain information about quality rules and implementations. Like other Truedat assets, you can manage permissions to access these dashboards.

A data quality dashboard in Truedat - Source: Truedat.

A summary of Truedat features that support data governance #

The architecture overview offered a glimpse of Truedat’s setup and capabilities. Here’s a summary of all the features in Truedat that help with data governance:

  • Centralized metadata repository
  • Granular and customizable role-based access control mechanisms
  • Data quality rules, metrics, and dashboards
  • Data lineage visualization and impact analysis
  • Notifications, annotations, and comments to enrich data and stay on top of everything
  • Data encryption and security measures in transit and at rest

Getting started with Truedat #

Here are the steps to take to set up Truedat:

  1. Prepare the environment by creating and activating a virtual environment
  2. Install Truedat and dependencies
  3. Create the database and set up the necessary migrations
  4. Launch Truedat

Alternatives to Truedat #

The open-source data governance landscape has several alternatives to Truedat, offering similar or complementary functionalities. These include Amundsen, DataHub, Apache Atlas, Magda, and OpenMetadata.

Read More → 7 open-source data governance tools to consider

Wrapping up #

Truedat is a customizable tool that offers several capabilities for data cataloging, quality management, and data governance.

While Truedat presents numerous advantages, it’s essential to acknowledge that open-source solutions like Truedat aren’t pre-configured, and can require significant time and resources for setup.

Moreover, the total cost of ownership, including the engineering and maintenance costs, can become substantial.

Alternatively, you can opt for an active data governance platform like Atlan. Atlan is open by default, extensible, user-friendly, and caters to a wide array of data sources and governance needs. It integrates with popular data warehouses, data lakes, ETL solutions, BI platforms, and tools in your data stack.
