Truedat: A Comprehensive Guide on This Open-Source Data Governance Tool
Truedat is an open-source tool from Bluetab designed to address data quality and governance. This guide will explore the essential aspects of Truedat, including its origins, core functionalities, structure, setup, and how it compares with other tools in the market.
Table of contents
- What is Truedat?
- Truedat: Architecture overview
- A summary of Truedat features that support data governance
- Getting started with Truedat
- Alternatives to Truedat
- Wrapping up
- Related reads
What is Truedat?
Truedat is equipped with a suite of integrated features, including:
- A global search engine
- Automatic data cataloging capabilities
- Quality rule management
- Data life cycle visualization (Data Lineage)
- Data intake and delivery management (Data Requests)
- A dashboard for metrics on data status and quality
Truedat origins: The problem that it aims to solve
In 2018, Bluetab (now an IBM Company) launched Truedat. The primary motivation behind Truedat’s development was the escalating demand for effective data governance solutions. Businesses were looking for tools that could:
- Catalog and oversee data from diverse sources
- Pinpoint and monitor data quality issues
- Implement and uphold data policies and protocols
- Conduct audits on data usage
- Track data governance activities
To understand what Truedat has to offer in terms of data governance, let’s get an overview of its architecture and its various modules.
Truedat: Architecture overview
Truedat supports data governance by allowing you to organize and enrich information through configurable workflows and monitoring activity. Its principal components are:
- Data catalog
- Data quality
- Data lineage
- API integrations
- Administration of roles
Let’s explore each component further.
Data catalog
Truedat provides a centralized repository for managing technical metadata captured from various data sources. This metadata can include each data asset's source, type, tags, relationships, and ownership.
The catalog offers a single, structured inventory of data assets and integrates with several tools in the modern data stack.
You can use the catalog to find information on data assets such as asset import date, version, quality, relationships, usage frequency, popularity, etc.
The catalog helps you understand data structure type, version, confidentiality, tags, changes made, etc. You can also share structures via Truedat and the relevant people will get notifications and emails.
Tables and files in Truedat are called Structures.
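To make the idea of a searchable inventory of Structures concrete, here is a minimal sketch in Python. It is not Truedat's data model: the `Structure` class, its field names, and the `search` helper are all hypothetical and chosen only to mirror the metadata mentioned above (source, type, owner, tags, confidentiality).

```python
from dataclasses import dataclass, field


@dataclass
class Structure:
    """Hypothetical catalog entry; field names are illustrative, not Truedat's schema."""
    name: str
    source: str
    type: str          # e.g. "table" or "file"
    owner: str
    tags: set[str] = field(default_factory=set)
    confidential: bool = False


def search(catalog: list[Structure], term: str) -> list[Structure]:
    """Return structures whose name or tags contain the term (case-insensitive)."""
    needle = term.lower()
    return [
        s for s in catalog
        if needle in s.name.lower() or any(needle in t.lower() for t in s.tags)
    ]


# Made-up sample entries for illustration only.
catalog = [
    Structure("customers", "crm_db", "table", "sales", {"pii", "core"}, confidential=True),
    Structure("orders.csv", "landing_bucket", "file", "ops", {"core"}),
    Structure("web_events", "warehouse", "table", "marketing", {"clickstream"}),
]

core_assets = search(catalog, "core")
print([s.name for s in core_assets])
```

A real catalog would index far more metadata (versions, relationships, usage), but the same pattern applies: structured entries plus a search over their attributes.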
Data quality
The data quality module handles data quality rules, their implementation, and their governance. It helps safeguard data and keep its quality consistent across systems by defining and implementing data quality rules.
Data quality rules are based on your organization’s quality principles. According to Truedat, each quality rule must include the following details:
- Why use this rule?
- How does it affect business?
- Description of how the rule should be implemented and any other data that is considered relevant from a business/functional point of view
In Truedat, quality rules can be defined using business concepts (i.e., a business glossary entry), the data quality module itself, or a CSV file.
Each quality rule should have a name, description, domain (in which the rule will be stored), concept (business glossary entry to which the rule will be applied), and status (active or inactive).
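As an illustration of the fields listed above, the following sketch validates quality rules supplied as CSV text. The column names, the `load_rules` function, and the sample rows are assumptions made for this example; Truedat's actual CSV layout may differ, so check its documentation before preparing a real file.

```python
import csv
import io

# Assumed column names, taken from the fields described in the text above.
REQUIRED_FIELDS = ("name", "description", "domain", "concept", "status")
VALID_STATUSES = {"active", "inactive"}


def load_rules(csv_text: str) -> tuple[list[dict], list[str]]:
    """Parse rule definitions and collect a message for every rejected row."""
    rules, errors = [], []
    reader = csv.DictReader(io.StringIO(csv_text))
    for line_no, row in enumerate(reader, start=2):  # the header is line 1
        missing = [f for f in REQUIRED_FIELDS if not (row.get(f) or "").strip()]
        if missing:
            errors.append(f"line {line_no}: missing {', '.join(missing)}")
            continue
        if row["status"].strip().lower() not in VALID_STATUSES:
            errors.append(f"line {line_no}: invalid status {row['status']!r}")
            continue
        rules.append({f: row[f].strip() for f in REQUIRED_FIELDS})
    return rules, errors


# Made-up sample data: one valid rule, one without a description, one with a bad status.
SAMPLE = """name,description,domain,concept,status
email_not_null,Customer email must be present,Sales,Customer,active
order_total_positive,,Finance,Order,active
age_in_range,Age between 0 and 120,Sales,Customer,paused
"""

rules, errors = load_rules(SAMPLE)
print(rules)
print(errors)
```

Validating a file like this before import keeps incomplete rules from reaching the governance workflow.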
The data quality rule engine can be used to implement data quality rules, track the results of these implementations, and generate reports.
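The cycle of implementing a rule, recording its result, and reporting on it can be sketched roughly as follows. This is a simplified stand-in, not Truedat's rule engine: the `Implementation` class, the `run` function, the thresholds, and the sample records are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Implementation:
    """Hypothetical rule implementation: a per-record check plus a pass-rate goal."""
    rule_name: str
    check: Callable[[dict], bool]  # returns True when a record passes
    threshold: float               # minimum acceptable pass rate, between 0 and 1


def run(impl: Implementation, records: list[dict]) -> dict:
    """Evaluate one implementation and return a result row for reporting."""
    total = len(records)
    passed = sum(1 for r in records if impl.check(r))
    rate = passed / total if total else 1.0
    return {
        "rule": impl.rule_name,
        "records": total,
        "passed": passed,
        "pass_rate": rate,
        "ok": rate >= impl.threshold,
    }


# Made-up sample records.
customers = [
    {"email": "a@example.com", "age": 34},
    {"email": "", "age": 51},
    {"email": "c@example.com", "age": 27},
    {"email": "d@example.com", "age": 45},
]

implementations = [
    Implementation("email_not_null", lambda r: bool(r["email"]), threshold=0.9),
    Implementation("age_in_range", lambda r: 0 <= r["age"] <= 120, threshold=0.95),
]

results = [run(impl, customers) for impl in implementations]
for row in results:
    print(row)
```

Keeping each result as a plain row makes it straightforward to store a history of runs and feed them into a report or dashboard.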
The data quality module also provides a framework for data quality governance. You can define roles and responsibilities for data quality rule management. You can also track changes made to data quality rules.
Data lineage
Truedat’s data lineage and impact analysis module visualizes the life cycle of data assets from origin to consumption and helps in monitoring data movement, transformations, etc.
You can select the resources that you wish to analyze, as well as the analysis type. These include traceability and impact analysis.
You can choose the level of depth you want to visualize to reduce the complexity of an asset’s lineage mapping.
Lastly, you can download all the lineage information in CSV format and use it with other tools in your data stack.
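The difference between traceability and impact analysis, and the effect of a depth limit, can be shown with a small graph traversal. The sketch below is an assumption-laden illustration: the asset names, the `walk` and `to_csv` functions, and the CSV columns are invented here and do not reflect Truedat's lineage format.

```python
import csv
import io
from collections import deque

# Made-up lineage edges: (upstream asset, downstream asset).
EDGES = [
    ("crm.customers", "staging.customers"),
    ("erp.orders", "staging.orders"),
    ("staging.customers", "marts.customer_orders"),
    ("staging.orders", "marts.customer_orders"),
    ("marts.customer_orders", "dashboard.revenue"),
]


def walk(edges, start, direction, max_depth=None):
    """Breadth-first walk returning {asset: depth}.

    "impact" follows edges downstream; "traceability" follows them upstream.
    """
    if direction not in ("impact", "traceability"):
        raise ValueError(f"unknown direction: {direction}")
    neighbours = {}
    for src, dst in edges:
        a, b = (src, dst) if direction == "impact" else (dst, src)
        neighbours.setdefault(a, []).append(b)

    depths = {}
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if max_depth is not None and depth >= max_depth:
            continue  # do not expand beyond the requested depth
        for nxt in neighbours.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                depths[nxt] = depth + 1
                queue.append((nxt, depth + 1))
    return depths


def to_csv(depths):
    """Serialize the walk result as CSV text, ordered by depth and then name."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["asset", "depth"])
    writer.writerows(sorted(depths.items(), key=lambda kv: (kv[1], kv[0])))
    return buf.getvalue()


impact = walk(EDGES, "staging.orders", "impact")
trace = walk(EDGES, "dashboard.revenue", "traceability", max_depth=2)
print(to_csv(impact))
print(to_csv(trace))
```

With `max_depth=2`, the upstream walk stops at the staging layer and omits the original source systems, which is the kind of simplification a depth setting provides.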
API integrations
You can connect Truedat’s data catalog with other tools in your data stack via APIs and automate the process. This helps in importing asset metadata from Truedat automatically.
APIs also help in customizing your Truedat installation by letting you set up permission groups.
Truedat has a notification engine to let you keep an eye on the events that interest you. For example, you can subscribe to quality rules and get notified whenever a quality result gets generated.
Another example is subscribing to business concepts so that you get notified whenever someone adds a new entry or makes changes to an existing one.
You can also subscribe to get notifications on events that are tagged for a specific role.
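The subscription model described above resembles a publish/subscribe pattern. The following sketch shows that pattern in miniature; the `NotificationHub` class and its methods are hypothetical and are not part of Truedat.

```python
from collections import defaultdict


class NotificationHub:
    """Hypothetical in-memory hub: users subscribe to a resource and receive its events."""

    def __init__(self):
        self._subscribers = defaultdict(set)  # (resource_type, resource_id) -> users
        self.inbox = defaultdict(list)        # user -> delivered messages

    def subscribe(self, user, resource_type, resource_id):
        self._subscribers[(resource_type, resource_id)].add(user)

    def publish(self, resource_type, resource_id, event):
        """Deliver the event to every subscriber of the resource."""
        key = (resource_type, resource_id)
        for user in sorted(self._subscribers.get(key, ())):
            self.inbox[user].append(f"{resource_type}:{resource_id} {event}")


hub = NotificationHub()
hub.subscribe("ana", "quality_rule", "email_not_null")
hub.subscribe("ana", "concept", "Customer")
hub.subscribe("luis", "concept", "Customer")

hub.publish("quality_rule", "email_not_null", "new result generated")
hub.publish("concept", "Customer", "definition updated")
hub.publish("concept", "Order", "created")  # nobody subscribed, nothing delivered

print(dict(hub.inbox))
```

A production system would add delivery channels such as email and role-based subscriptions, but the routing logic stays the same.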
Administration of roles
Roles in Truedat aren’t preconfigured. You can define and customize as many roles as needed for your organization.
You can configure specific permissions for each user role. For instance, you can control which roles are allowed to draft, review, publish, or delete business concepts. You can also control who gets to:
- View protected metadata in the data catalog
- Create, modify, or delete data quality rules and quality workflow implementations
- View lineage
- Create, modify, or delete data domains
- Request, review, allow, or remove data access requests
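Custom roles of this kind boil down to a mapping from role names to permitted actions. Here is a minimal sketch of that idea; the role names, the `resource:action` permission strings, and the `can` helper are assumptions for illustration and do not match Truedat's actual permission identifiers.

```python
# Hypothetical role definitions; permission strings follow a made-up "resource:action" form.
ROLES = {
    "data_steward": {
        "concept:draft",
        "concept:review",
        "quality_rule:create",
        "lineage:view",
    },
    "publisher": {"concept:publish", "concept:delete"},
    "analyst": {"lineage:view", "access_request:create"},
}


def can(user_roles, action):
    """Return True if any of the user's roles grants the action."""
    return any(action in ROLES.get(role, set()) for role in user_roles)


print(can(["analyst"], "lineage:view"))
print(can(["analyst"], "concept:publish"))
print(can(["data_steward", "publisher"], "concept:publish"))
```

Because roles are plain data, adding a new role or adjusting an existing one does not require changing the checking logic.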
Truedat integrates with Metabase to support data visualization and reporting. A business concepts dashboard can include metrics related to its status, completeness, quality rules, and more.
Meanwhile, a data quality dashboard can contain information about quality rules and implementations. Like other Truedat assets, you can manage permissions to access these dashboards.
A summary of Truedat features that support data governance
The architecture overview offered a glimpse of Truedat’s setup and capabilities. Here’s a summary of all the features in Truedat that help with data governance:
- Centralized metadata repository
- Granular and customizable role-based access control mechanisms
- Data quality rules, metrics, and dashboards
- Data lineage visualization and impact analysis
- Notifications, annotations, and comments to enrich data and stay on top of everything
- Data encryption and security measures in transit and at rest
Getting started with Truedat
Here are the steps to take to set up Truedat:
- Prepare the environment by creating and activating a virtual environment
- Install Truedat and dependencies
- Create the database and set up the necessary migrations
- Launch Truedat
Alternatives to Truedat
The open-source data governance landscape has several alternatives to Truedat, offering similar or complementary functionalities. These include Amundsen, DataHub, Apache Atlas, Magda, and OpenMetadata.
Read More → 7 open-source data governance tools to consider
Wrapping up
Truedat is a customizable tool that offers several capabilities for data cataloging, quality management, and data governance.
While Truedat presents numerous advantages, it’s essential to acknowledge that open-source solutions like Truedat aren’t pre-configured, and can require significant time and resources for setup.
Moreover, the total cost of ownership, including the engineering and maintenance costs, can become substantial.
Alternatively, you can opt for an active data governance platform like Atlan. Atlan is open by default, extensible, user-friendly, and caters to a wide array of data sources and governance needs. It integrates with popular data warehouses, data lakes, ETL solutions, BI platforms, and tools in your data stack.
Truedat Data Governance: Related reads
- What is Data Governance? Its Importance, Principles & How to Get Started?
- Open Source Data Governance Tools - 7 Best to Consider in 2023
- Data Governance Policy: Examples, Templates & How to Write One
- 7 Best Practices for Data Governance to Follow in 2023
- Benefits of Data Governance: 4 Ways It Helps Build Great Data Teams
- Data Governance Roles and Responsibilities: A Quick Round-Up
- dbt Data Catalog: Discussing Native Features Plus Potential to Level Up Collaboration and Governance with Atlan