Data Catalog Training: The What, Why, and How
Data catalog training is vital to a data catalog’s success. A data catalog succeeds when users of all backgrounds and job roles find it easy to use.
In this article, explore how to design a training program that meets the needs of all its users and overcomes the most common challenges of data catalog training programs.
Table of contents #
- What is data catalog training?
- Why do you need data catalog training?
- What data catalog training needs to cover
- Challenges with data catalog training
- How to create a data catalog training program
- Wrapping up
- Related reads
What is data catalog training? #
Data catalog training educates employees, partners, and other stakeholders on the basic functions of a data catalog relative to their job roles.
It ensures both the adoption and success of a data catalog implementation by demonstrating the value a data catalog adds to their daily workflows.
Why do you need data catalog training? #
Data catalog training aims to educate data practitioners on the business value of the data catalog.
Training will help them explore how the data catalog works and how existing workflows will change to incorporate it, thereby improving data literacy across the organization.
With data catalog training, your employees can:
- Understand what data exists, where it is located, and how it can be used
- Learn how to use the data catalog to navigate through your data estate
- Understand how to share data and insights and collaborate on data-related projects
- Be familiar with data privacy and security rules and ensure regulatory compliance
This is essential because data catalogs are still a relatively recent concept in the business world. Few users will have prior experience in using one.
Moreover, some employees may not know that the data catalog even exists. Even if they do, they may not understand how using it is an improvement over their current work processes.
Adapting data catalog training for various user roles #
It isn’t enough just to create an online course and be done with it. A successful data catalog training program needs to account for different end users, data teams, and data governance roles.
For example, a regular data catalog end user, such as a program manager, division manager, or field sales representative, may only need to know the basics of finding data with natural language searches, tagging data, and reporting.
However, a data owner or business analyst might require more detailed training on SQL queries, filtering, advanced reporting, tracking data lineage, and maintaining a business glossary.
Moreover, data scientists, engineers, and IT staff will need in-depth training on integrating their data pipelines, tools, and other services with the data catalog via open APIs.
What data catalog training needs to cover #
Your data catalog training program should be specific to your organization. It should reflect both existing processes and the details of your data stack.
That said, users at most companies will fall into one or more of the following roles. So, use the table below as a general roadmap for planning out your data catalog training program.
| Role | Training |
|---|---|
| All users | 1. Data catalog core concepts 2. Catalog usage (natural language queries, filtering) 3. Data lineage: viewing and using 4. Tagging and classification 5. Reporting 6. Collaboration features 7. Security and compliance regulations specific to your company’s industry |
| Data team (business roles, such as data product owners and data stewards) | 1. Everything included in the general training for all users 2. Onboarding new data sources 3. Implementing governance controls 4. Documentation standards and protocols |
| Data team (technical roles, such as data scientists, data engineers, and IT staff) | 1. Everything included in the general training for all users 2. Onboarding new data sources 3. Architecture and integration issues (for example, SSO integration and role-based access control) 4. Data integrations with native connectors 5. Open API and custom integrations |
| Data governance team | 1. Everything included in the general training for all users 2. Implementing governance controls 3. Compliance alerts, workflows, and reporting requirements |
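If you track this roadmap in a script or spreadsheet, the role-to-training mapping can be sketched as a simple lookup. The Python below is a minimal illustration, not a prescribed implementation: the role keys and the `training_plan` helper are hypothetical names, and the module titles are shortened from the table above. It builds one de-duplicated plan for a user who holds more than one role.

```python
# Modules every user takes (shortened from the "All users" row of the table).
GENERAL = [
    "Data catalog core concepts",
    "Catalog usage",
    "Data lineage",
    "Tagging and classification",
    "Reporting",
    "Collaboration features",
    "Security and compliance",
]

# Extra modules per role. The role keys are hypothetical labels.
ROLE_MODULES = {
    "all_users": [],
    "data_team_business": [
        "Onboarding new data sources",
        "Implementing governance controls",
        "Documentation standards and protocols",
    ],
    "data_team_technical": [
        "Onboarding new data sources",
        "Architecture and integration issues",
        "Data integrations with native connectors",
        "Open API and custom integrations",
    ],
    "data_governance": [
        "Implementing governance controls",
        "Compliance alerts, workflows, and reporting requirements",
    ],
}


def training_plan(roles):
    """Return an ordered, de-duplicated module list for one or more roles."""
    plan = list(GENERAL)  # general training always comes first
    for role in roles:
        for module in ROLE_MODULES[role]:
            if module not in plan:  # skip modules shared between roles
                plan.append(module)
    return plan


# Example: a data steward who also sits on the governance team.
for module in training_plan(["data_team_business", "data_governance"]):
    print(module)
```

Keeping the general modules in one list means a change to the core curriculum is made once and flows into every role's plan.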
Challenges with data catalog training #
You’re likely to run into a few challenges while rolling out your data catalog training program. Here’s what to look out for:
- Getting user buy-in
- Tackling tool complexity
- Grappling with poor data literacy
- Keeping the data catalog training program up-to-date
Let’s delve into the specifics of each challenge.
Getting user buy-in #
It’s hard to get people to change their habits — even when the change is an improvement.
So, data catalog training should lay out the value of the data catalog explicitly in terms of better data quality and faster decision-making.
Training exercises should be hands-on, outcome-centric, and reinforce this value proposition. For example, show how users can find specific data with the data catalog and incorporate it into reports faster than they could before.
Tackling tool complexity #
Depending on the data catalog you choose, the learning curve may be steep, especially for business users without a technical background.
Additionally, if the catalog interface is clunky and not designed for user experience, understanding the full functionality of the data catalog may be difficult.
To mitigate this challenge, choose a data catalog tool that fits the skills and needs of your users and has a clean, simple UI.
Grappling with poor data literacy #
Realizing the full potential of a data catalog often requires a degree of data literacy. Users must have a fundamental understanding of their data assets, governance, privacy, and how data can be used responsibly.
With tailored, role-based training and an emphasis on building a culture of data literacy within your organization, you can overcome this challenge.
Keeping the data catalog training program up-to-date #
Training has a short shelf life. It goes out of date as systems and processes change.
So, assign specific employees in your data governance initiative to review the training periodically and refresh it so that everyone stays up to date.
How to create a data catalog training program: A step-by-step guide for your organization #
Putting together a training program can be daunting. Follow these steps to ensure a successful launch and roll-out:
- Define the users of your data catalog
- Identify relevant vendor training
- Create training for organization-specific components
- Define training formats
- Create your training environments
- Set deadlines and track completion progress
- Offer a community support venue
Let’s explore each step further.
Define the users of your data catalog #
Using the data catalog training role outline above, define who will make use of the data catalog at your organization and determine:
- What features of the data catalog will they use?
- What level of access do they need?
- What level of experience do they have with data catalogs or similar systems?
Identify relevant vendor training #
Most data catalog vendors create and maintain training for their products. See what your vendor offers and identify which offerings will fit into your own training program.
Some vendors also offer training by certified instructors. Such courses come with pre-made materials and demonstration environments for hands-on practice.
If you’re still shopping for a data catalog, take a close look at the training and customer support potential vendors provide and ask these questions:
- Is it complete?
- Is it current?
- Is it easy to understand?
- Does it cost extra or does it come as part of the product?
Create training for organization-specific components #
What your vendor offers won’t provide everything your organization needs. You still need to train users in how the data catalog:
- Fits into existing data-related workflows
- Integrates with other elements of your data stack
- Replaces or obsoletes certain tools or processes
To identify what custom training you need, look at the user roles we discussed above and identify what tasks each user performs related to data. In training, this is called job task analysis.
Here’s how Christopher Pappas, the founder of eLearning Industry Inc, puts it:
“A Job Task Analysis (JTA) gives you the opportunity to evaluate employee work duties to personalize the online training experience. It delves into every step and skill involved so that you can pinpoint areas for improvement and provide just-in-time support.”
Once you’ve identified the tasks everyone performs, identify how the data catalog changes or impacts these tasks. These will constitute the core of the training you’ll need to create.
Define training formats #
Whether you create training internally or outsource it to a training vendor, you’ll need to decide what format the training takes. The standard options are:
- Instructor-led training
- Self-paced training
Instructor-led training #
A trained and certified instructor conducts a training class. The instructor helps employees with hands-on training and answers questions as they arise.
The training may be conducted in person or virtually.
Instructor-led training provides the opportunity for more personalized training and hands-on assistance.
However, it’s more expensive to create and offer. Moreover, it requires employees to block off anywhere from a few hours to several days in their schedules.
Self-paced training #
Training is published as courseware on a training portal and employees can take it at their own pace.
Self-paced training can be taken anytime, anywhere. Employees can do their training when it fits their schedules.
On the downside, they won’t get the opportunity to interact with data catalog experts.
Generally, instructor-led training is a good choice when you’re introducing employees to the data catalog for the first time. You can then use self-paced training to provide updates or dive into more specialized topics.
Create your training environments #
As part of their training, users should engage in hands-on learning with your new data catalog. But you don’t want them using your production data catalog for training.
So, work with your IT department and vendor in the early stages of deploying your data catalog to set up non-production environments for training.
These environments should mimic your production data catalog environment, using mock data or real-world data that’s been anonymized and cleansed of sensitive information.
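As a rough illustration of preparing real-world records for a training environment, the Python sketch below replaces sensitive values with salted hashes. The field names (`email`, `customer_name`), the salt, and both helper functions are assumptions made for this example. Note that salted hashing is pseudonymization rather than full anonymization, so treat this as a starting point and confirm the approach with your security and compliance teams.

```python
import hashlib

# Hypothetical list of sensitive columns; replace with your own classification.
SENSITIVE_FIELDS = {"email", "customer_name"}


def pseudonymize(value, salt="training-env"):
    """Replace a value with a short, stable token derived from a salted hash."""
    digest = hashlib.sha256((salt + str(value)).encode("utf-8")).hexdigest()
    return "anon_" + digest[:12]


def anonymize_record(record, sensitive_fields=SENSITIVE_FIELDS):
    """Return a copy of the record with sensitive, non-empty fields masked."""
    return {
        key: pseudonymize(value)
        if key in sensitive_fields and value is not None
        else value
        for key, value in record.items()
    }


# Example: non-sensitive fields pass through unchanged.
sample = {"email": "a@example.com", "customer_name": "Ada", "region": "EMEA"}
print(anonymize_record(sample))
```

Because the same input always produces the same token, values still match across tables in the training environment, which keeps exercises on joins and data lineage realistic.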
Set deadlines and track completion progress #
Use your company’s existing learning management system (LMS) — like Cornerstone or Canvas — to capture and track who’s taken what training.
Using an LMS allows you to:
- Host self-paced training
- Set deadlines for completion
- Provide roll-up reports to management on how many people in the organization have taken the training
- Generate compliance reports on subjects such as training on adherence to regulatory guidelines
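Most LMS products provide these reports out of the box. If you instead work from an exported list of enrollments, a roll-up can be sketched in a few lines of Python. The record layout (`user`, `deadline`, `completed_on`) and the `completion_report` function are hypothetical and do not reflect any particular LMS API.

```python
from datetime import date


def completion_report(records, today):
    """Summarize completion and list users who missed their deadline."""
    total = len(records)
    completed = sum(1 for r in records if r["completed_on"] is not None)
    # Overdue: not yet completed and the deadline has already passed.
    overdue = [
        r["user"]
        for r in records
        if r["completed_on"] is None and r["deadline"] < today
    ]
    rate = completed / total if total else 0.0
    return {
        "total": total,
        "completed": completed,
        "completion_rate": rate,
        "overdue": overdue,
    }


# Example with made-up enrollment records.
enrollments = [
    {"user": "ana", "deadline": date(2024, 1, 31), "completed_on": date(2024, 1, 10)},
    {"user": "ben", "deadline": date(2024, 1, 31), "completed_on": None},
]
print(completion_report(enrollments, today=date(2024, 2, 15)))
```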
Offer a community support venue #
People often learn best by asking questions.
No matter what format your training takes, consider offering an internal discussion forum where users can ask questions and others can share their wisdom.
This enables those who have already mastered a given aspect of the data catalog to share their acquired knowledge with their co-workers.
Wrapping up #
In this article, we’ve covered the basics of creating a successful data catalog training program from scratch. With well-structured, role-based training, you can ensure that your employees understand and explore the multiple functions of a data catalog.
By investing time and effort in providing thorough and effective data catalog training, you can ensure that your employees are fully leveraging their data assets and building an organization-wide data culture.
However, the success of a data catalog depends on more than training. It also helps to choose a data catalog with a modern, user-friendly interface that adapts to different data users.
Data catalog training: Related reads #
- Enterprise data catalog: Definition, Importance & benefits
- Data Catalog: The Must-Have Tool for Data Leaders in 2023
- Data catalog benefits: 5 key reasons why you need one
- Open Source Data Catalog Software: 5 Popular Tools to Consider in 2023
- Data Catalog Platform: The Key To Future-Proofing Your Data Stack
- Top Data Catalog Use Cases Intrinsic to Data-Led Enterprises
- AWS Glue Data Catalog: Architecture, Components, and Crawlers
- Airbnb Data Catalog — Democratizing Data With Dataportal
- Lexikon: Spotify’s Efficient Solution For Data Discovery And What You Can Learn From It
- Google Cloud Data Catalog Guide - Everything You Need to Know