Data stewardship is a vital (if often misunderstood) discipline that promotes effective data management across the enterprise. Here’s everything you need to know about what data stewardship is, why it’s important, and how it relates to data governance.
Data stewardship definition
Data stewardship is a collection of data management functions that ensure an organization’s business users have access to trustworthy, high-quality data. The DAMA Dictionary of Data Management states, “Data Stewardship is the most common label to describe accountability and responsibility for data and processes that ensure effective control and use of data assets.”
Put another way by the University of Michigan in their data stewardship policy, “effective data stewardship is the process for maximizing the value of data as an institutional resource.”
Data stewardship touches every phase of the data life cycle:
- Data creation
- Data processing
- Data storage
- Data usage
- Data archiving
- Data destruction
By combining business acumen with technical skills, data stewards make sure all departments have access to reliable data in their daily work. They oversee activities such as data collection, data cataloging, and data inventorying, which enable users to find and extract value from data sets.
At a higher level, data stewards contribute to building a modern data culture inside an organization. For example, they may help identify strategic use cases for data and educate the C-suite on new ways to leverage data in decision-making.
Why is data stewardship important?
Data stewardship is important because it facilitates the accuracy, usability, and accessibility of an organization’s data assets. When data reliability is high, employees are more likely to consistently use the data at their disposal to enhance their daily activities.
Without data stewards to implement and enforce data governance, businesses are left constantly fighting fires when data dashboards and products are inaccurate.
What are the benefits of data stewardship?
The benefits of data stewardship include:
- Improved data quality and reliability
- More effective implementation of data governance
- Stronger data documentation and awareness about data management best practices
- Faster and more valuable analytics programs
- Reduced risks around data-related security and privacy requirements
- Enhanced ability to meet compliance requirements and regulations
Consider this example of the power of effective data stewardship: The Earth Science Information Partners (ESIP) wanted to foster greater data collaboration between scientists. To that end, they identified data stewards to create citation guidelines and uniform metrics that could be used to find information across many different types of data repositories. Making shared data visible and easier to collaborate on can significantly advance work on issues like climate change among engineering geologists, geochemists, oceanographers, and other scientists.
From a corporate point of view, data stewards act as bridges. As Laura Madsen said in Disrupting Data Governance, “Data stewards were meant to help solidify the squishy… They speak the language of IT and translate that back to the business. The role requires the patience of a kindergarten teacher and the ability to successfully negotiate a hostage situation.”
Data stewardship and data governance
Data stewardship and data governance are closely interlinked, but the terms are not synonymous:
- Data governance is a set of policies, processes, and standards to collect, manage, and store data.
- Data stewardship is the implementation of those policies, processes, and standards.
In other words, data governance is the strategy that determines how data management decisions should be made, and data stewardship is the tactical implementation based on that strategy. To help connect the dots, here are a few terms related to data governance and their application to data stewardship:
- Data architecture refers to the models, policies, rules, and standards that govern which data is collected and how it is stored, arranged, integrated, and used in data systems. Data architecture is designed collaboratively by Chief Data Officers, data stewards, data admins, etc. to guide how data flows across the organization.
- Data curation is the end-to-end process of preparing and managing data (identifying, cleaning, and transforming it) so that it is ready for business use — a crucial skill of the data steward.
- Data glossaries are collections of terms and definitions that help data users understand data assets' key characteristics. Data stewards often oversee the creation and maintenance of an organization’s data glossary.
- Data governance frameworks build upon data architecture to encompass all the people, processes, technologies, and workflows needed to support governance and guide how data stewards make decisions.
- Metadata management entails collecting, categorizing, integrating, and maintaining high-quality metadata. This is another vital discipline for data stewards to master in order to effectively manage vast amounts of data.
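To make the glossary and metadata ideas above a little more concrete, here is a minimal sketch of how a data glossary with steward ownership might be modeled in Python. Everything in it is an assumption made for illustration: the `GlossaryTerm` and `DataGlossary` names, their fields, and the sample term are hypothetical and do not correspond to any particular catalog product.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class GlossaryTerm:
    """One business term in a data glossary (all fields are illustrative)."""
    name: str
    definition: str
    steward: str  # person accountable for keeping the definition current
    related_assets: list[str] = field(default_factory=list)  # e.g. table names


class DataGlossary:
    """Hypothetical in-memory glossary keyed by case-insensitive term name."""

    def __init__(self) -> None:
        self._terms: dict[str, GlossaryTerm] = {}

    def add(self, term: GlossaryTerm) -> None:
        # Adding a term with an existing name replaces the old definition.
        self._terms[term.name.lower()] = term

    def lookup(self, name: str) -> Optional[GlossaryTerm]:
        # Returns None when the term has not been defined yet.
        return self._terms.get(name.lower())

    def terms_for_steward(self, steward: str) -> list[str]:
        # Alphabetical list of the term names a given steward is accountable for.
        return sorted(t.name for t in self._terms.values() if t.steward == steward)


glossary = DataGlossary()
glossary.add(
    GlossaryTerm(
        name="Active Customer",
        definition="A customer with at least one purchase in the last 90 days.",
        steward="dana",
        related_assets=["crm.customers"],
    )
)
```

A data user could then call `glossary.lookup("active customer")` to find both the agreed definition and the steward to contact with questions. A real catalog would persist this metadata and track changes over time, which this sketch leaves out.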
Who are data stewards and how do they enforce data governance?
A data steward is a subject matter expert who is responsible for defining and maintaining the integrity of a specific type of data or data domain. They help the organization build data glossaries, create and maintain data quality rules, and determine who has access to data.
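As a rough illustration of what "data quality rules" can look like in practice, the sketch below defines two rules and counts how many rows violate each one. The `QualityRule` class, the `evaluate_rules` function, the rule names, the thresholds, and the sample rows are all hypothetical; real programs usually express such rules in a dedicated data quality or catalog tool rather than in hand-written code.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class QualityRule:
    """A named check applied to one column (illustrative structure)."""
    name: str
    column: str
    check: Callable[[Any], bool]  # returns True when the value is acceptable


def evaluate_rules(rows: list[dict], rules: list[QualityRule]) -> dict[str, int]:
    """Return a mapping of rule name to the number of rows that fail it."""
    failures = {rule.name: 0 for rule in rules}
    for row in rows:
        for rule in rules:
            # A missing column is passed to the check as None.
            if not rule.check(row.get(rule.column)):
                failures[rule.name] += 1
    return failures


# Hypothetical rules a steward of customer data might maintain.
rules = [
    QualityRule("email_present", "email", lambda v: bool(v)),
    QualityRule(
        "age_in_range", "age", lambda v: isinstance(v, int) and 0 <= v <= 120
    ),
]

# Made-up sample records, for illustration only.
rows = [
    {"email": "a@example.com", "age": 34},
    {"email": "", "age": 29},
    {"email": "c@example.com", "age": 150},
]

report = evaluate_rules(rows, rules)
```

Keeping each rule as a small named object means the steward can review, add, or retire rules without touching the evaluation logic, and the per-rule failure counts give a simple signal of which data domains need attention.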
Here’s how data stewards slot into other data governance roles and responsibilities. The most common data governance roles are:
- Data admin: Responsible for operationalizing the data governance program by processing and transforming data into best-fit data models
- Data custodian: Handles the movement, security, storage, and use of data (e.g., the technical aspects of setting up permission controls, versioning master data, and configuring system backups)
- Data steward: Enables data collaboration and democratization by serving as the bridge between the business users and the IT department; oversees the standardization of data definitions and optimization of data-related workflows and communications
- Data user: Anyone who extracts value from data: marketers, researchers, executives, business managers — in some organizations, virtually every employee may be a data user
The above definitions are not hard and fast rules. For many businesses, there may be significant overlap or consolidation of these roles. The number of data stewards and exact purview of each steward is highly dependent on the needs of the organization.
For example, the world-renowned SAS Institute notes that data stewardship may be broken down by:
- Subject area (e.g., customer or product)
- Function (e.g., finance or sales)
- Business process (e.g., procurement or enrollment)
- Systems (e.g., billing or inventorying)
- Project (e.g., launching or refining the program)
How are data stewards chosen?
So, how do you choose data stewards for the above data domains? Robert Seiner, founder of KIK Consulting and The Data Administration Newsletter, has a simple model for choosing data stewards. This model notes that data stewards may be appointed in one of three different ways:
- Employees may be assigned to be data stewards and own certain data domains (that is, told they have no choice but to be held accountable for the integrity of the data).
- Data steward roles may be identified and designed with the idea that the employees who are the best fit will naturally fill the role (e.g., a senior data team member with both engineering and analytics skills).
- Data stewards may be recognized based on the ad hoc work that employees are already doing to govern data.
The first two methods of “assigning” and “identifying” are very much part of the restrictive, top-down data governance models that are (hopefully) on their way out.
The third “recognizing” method is more aligned with the future of bottom-up, collaborative data governance. Seiner notes this non-invasive perspective on data governance is “built on the premise that people are already governing data, but they are governing data in an informal manner, leading to inefficiencies and ineffectiveness in the way data is managed.”
What’s next for data governance and data stewardship?
It’s no secret that data governance has a reputation for being restrictive and bureaucratic. The “dreaded G word” is often called boring, as is the technical-sounding discipline of “data stewardship.”
But what if governance and stewardship earned a reputation for aiding innovation instead of stifling it? Both need something more collaborative and community-led than the data governance programs of old.
Thankfully, there is a new vision of data governance and stewardship emerging, one that takes a decentralized, bottom-up approach where all data professionals are able to contribute instead of only a chosen few. Instead of completing a data project and then making changes based on data governance requirements, the future of data governance will be baked directly into the daily workflows of data professionals.
In practice, this means adopting a data platform that allows for data access flexibility, permission controls, etc. depending on evolving business needs — and empowering more data professionals to contribute to data stewardship.
How Atlan helps in data stewardship and governance
Atlan is a third-generation data catalog and metadata management tool built around embedded collaboration, borrowing principles from GitHub, Figma, Slack, Notion, Superhuman, and other tools that are commonplace in the modern workplace.
Atlan helps data stewards simplify access control and governance with automated PII classification, granular access control, and usage tracking.