Data stewardship is the process of ensuring that all organizational data adheres to policies and procedures established by data governance. The data governance program defines the policies.
In other words, it's all about accountability (implementing the data governance program), and data stewards are the individuals in charge of overseeing it.
In this article, we'll explore the concept of data stewardship to understand its role in data governance. We'll also delve into the roles and responsibilities of data stewards and how they ensure good governance.
Let's begin by understanding how governance and stewardship fit together.
What Is the Difference Between Data Governance and Data Stewardship?
According to Gartner, data governance is the specification of decision rights and an accountability framework to ensure the appropriate behavior in the valuation, creation, consumption, and control of data and analytics.
In other words, it helps organizations understand who has access to their data and decide how they should use it. That's where data stewardship comes into the picture.
Here's how the U.S. Department of Education's Privacy Technical Assistance Center (PTAC) puts it:
Data stewardship can be defined as a comprehensive approach to data management to ensure the quality, integrity, accessibility and security of the data.
Meanwhile, Dataversity puts it this way:
Data Stewardship means the formalization of accountability over the management of data, and the data-related resources.
So, while data governance programs set the rules, data stewardship oversees the smooth implementation of those rules.
Now let's look at the people responsible for stewardship.
Data Stewardship Management: The Role of the Data Steward
Let's go back to PTAC:
Data stewards are managers and administrators within an organization who are responsible for implementing data governance policies and standards and maintaining data quality and security.
Data stewards are considered to be the glue holding a data governance program together. They're responsible for governing data and ensuring compliance. That means overseeing:
- What data gets captured
- Where and how the data gets stored
- How data gets transformed and consumed
- Who has access and why
TechTarget sums it up best:
In summary, data stewards are accountable for the management of all data within and used by the enterprise and ensuring that the data-related rules as established by the data governance program are followed.
So, what would that involve?
What Does a Data Steward Do?
The role of a data steward varies depending on:
- The size of an organization
- The level of maturity of the data governance program
So, that means there is no "one size fits all" definition or approach.
Let's look at the level of maturity of data governance first. For instance, if the organization suffers from poor data quality, then the role of a data steward would be:
- Ensuring the quality, integrity and reliability of all data
- Enforcing rules that dictate its use across the organization
On the other hand, if discovering and accessing data are the main challenges, then the data steward would focus on:
- Consolidating data from diverse sources in a single repository
- Providing the right people with access to the right data
- Establishing data security and access controls to comply with regulations
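As a rough illustration of the access-control point, a steward's rules can be written down as an explicit mapping from roles to the data domains they may read. The role names, domain names, and the `can_access` helper below are hypothetical and only sketch the idea. A real deployment would rely on the access controls of its own data platform.

```python
# Hypothetical role-to-domain access rules maintained by a data steward.
ACCESS_RULES = {
    "marketing_analyst": {"customer", "product"},
    "finance_analyst": {"transactional"},
}


def can_access(role: str, domain: str) -> bool:
    """Return True only if the role is explicitly granted the domain."""
    # Unknown roles get an empty set, so access is denied by default.
    return domain in ACCESS_RULES.get(role, set())


print(can_access("marketing_analyst", "customer"))  # True
print(can_access("finance_analyst", "customer"))    # False
```

Denying access by default, and granting it only through an explicit rule, keeps the "who has access and why" question answerable from a single place.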
The organization's size also dictates the number of data stewards required and their responsibilities.
For instance, being responsible for all the marketing data of a large corporation is a big ask. In such cases, the corporation could split its marketing data into sub-domains such as customer, product, and transactional data. Each of these sub-domains could be "owned" by a data steward.
So, how many data stewards do you need?
As we've mentioned earlier, this depends on the size of your organization and the data governance model adopted. For instance, according to SAS, you can have a data steward for each:
- Subject area (customer or product)
- Function (finance or sales)
- Business process (procurement or enrollment)
- System (billing or inventory)
Summing it all up: The top three mandates of a data steward
However, if we were to generalize the role and put together the three top mandates of a data steward, that would involve:
- Taking complete ownership of their data domain — knowing what data falls under the domain, where it's located and how it's meant to be used
- Tracking the progress of the governance program with the right metrics and reporting practices
- Exploring and implementing new use cases for data-driven decision-making
Another thing to note — as the adoption of data-driven culture and governance initiatives grows, the role of a data steward will evolve.
As we've mentioned already, data stewards are all about accountability over data management and governance. This role is focused on the operational and administrative aspects of information management across an organization.
As such, it requires familiarity with both technology and business processes, along with compliance requirements across geographies.
Since data stewards oversee the rules, they should have experience working with huge datasets so that they know when things aren't right.
On the other hand, the roles of data scientists and analysts are largely analytical in nature. Let's have a look.
Data analysts use existing data (i.e., structured data sets) to create ad-hoc reports that inform business decision-making — dashboards, KPIs, and other metrics. They also make it easier for non-technical users to consume data with useful visualizations.
Here's how KDNuggets puts it:
Data analysts are descriptive statisticians; here is the current data, here is what it looks like, and here is what we know from it.
But what if the existing data — all cleaned and neatly organized — isn't enough to make a decision? Or, the data assets required for decision-making don't exist yet?
That's when you call in the big guns: data scientists. Data scientists are all about extracting valuable insights from data and using them to tell a story.
We'd like to quote from KDNuggets again.
"This role is the Jack of All Trades of the data world as it involves knowing how to:
- Extract data from a non-relational database
- Engineer features after some initial exploratory descriptive analysis
- Select an appropriate machine learning algorithm to perform some predictive analytics on the data
- Analyze the results of said predictive task statistically"
A fun — albeit crude — way to think of the difference between data stewards as opposed to data scientists and analysts is a soccer analogy:
- Stewards are like referees, as they must know the rules of the game and understand it inside-out. (P.S. Goalkeepers with a bird's eye view of the game and instructing their teammates on the correct positions could also fit into this role.)
- Scientists are like the creative midfielders who think outside the box to create innovative plays and work with the team (wingers and strikers) to make great passes.
- Analysts are like strikers with a single job: finishing well to score goals. They don't have to think of creative plays or engineer passes like the midfielders. Instead, they just collaborate with their teammates to make a precise shot.
If you're still confused between the roles of an analyst and a scientist, then here's a handy flowchart from KDNuggets to understand the difference between data analysts, data scientists and other humans of data.
However, please note that this is more like a "quick classification map" and doesn't explore the nuances of each role.
Why Is Data Stewardship Important?
Only one in five companies has any kind of real-time data-monitoring protocol in place. So, more than 80% of companies are highly vulnerable to data misuse, according to BCG.
That's why data stewardship is so important. It's crucial for the success of any governance initiative, and without data stewards steering the ship, data security and quality challenges are here to stay. Let's see why.
1. Data security and privacy
Let's begin by stating the obvious — the pandemic has fast-forwarded digitization initiatives. We're in a world where cloud computing isn't optional but is crucial for business continuity.
Even before the era of rapid cloud adoption, data-related laws such as GDPR and CCPA were already cracking down on how organizations collected, stored, and used consumer data. As digitization continues to grow, more laws expecting better scrutiny and transparency from organizations will pop up.
Effective data stewardship can help organizations have control over the data they collect and the measures to store, safeguard and use it without attracting hefty fines or sanctions.
It also goes beyond compliance and provides organizations an opportunity to drive the conversation — such transparency helps them share their data management practices as well as challenges.
According to BCG, when a company is open with regulators about its data practices, regulators are more likely to view new uses or consumer feedback positively. The goal is to give regulators a broad understanding of practices and issues on the horizon. This lets companies help drive the conversation instead of simply reacting to it.
2. Data quality and trustworthiness
Another element that has evolved is the perception of the role data plays in business. According to Rob Casper, CDO at JP Morgan Chase:
Data is the lifeblood of organizations. And it’s not just financial services. As more and more people digitize all that they do, it all comes down to having transparency and access to that data in a way that’s going to deliver value.
However, translating data into business value is easier said than done. Finding the right data and having access to it is one part of the problem. The other part? Data quality. One in every three business leaders doesn't trust the data they use.
A solid data governance program with standards for data quality overseen by capable data stewards is the key to solving the credibility problem.
3. Data discovery
Knowing where data resides is a top mandate of any data steward. It's the key to:
- Tracking the lineage of data assets
- Documenting the transformations they've undergone
- Mapping data available to the relevant users whenever they need it
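To make the lineage point concrete, here is a minimal sketch of how upstream sources of a data asset could be traced. The asset names and the `LINEAGE` map are hypothetical, and the traversal is a plain graph walk rather than the method of any particular governance tool.

```python
# Hypothetical lineage map: each asset lists the assets it was derived from.
LINEAGE = {
    "sales_dashboard": ["monthly_sales"],
    "monthly_sales": ["raw_orders", "raw_customers"],
    "raw_orders": [],
    "raw_customers": [],
}


def upstream_sources(asset: str) -> set[str]:
    """Collect every asset the given asset depends on, directly or indirectly."""
    seen: set[str] = set()
    stack = list(LINEAGE.get(asset, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            # Keep walking upstream from the asset just visited.
            stack.extend(LINEAGE.get(current, []))
    return seen


print(sorted(upstream_sources("sales_dashboard")))
# ['monthly_sales', 'raw_customers', 'raw_orders']
```

With a map like this, a steward can answer "where did this number come from?" by walking back from a report to its raw sources.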
Sure, modern data governance platforms like Atlan automate several aspects of governance such as classifying data, populating business glossaries to add context, mapping data lineage, and running quality checks.
However, without data stewards spearheading the efforts, technology can only do so much. Stewards dictate the rules and definitions governing data. Without their efforts, glossaries could include vague descriptions, data tags and classes could be incorrect, and quality check algorithms could be incomplete.
That's why effective data stewardship is a prerequisite for tackling the data discovery challenges.
Benefits of Data Stewardship
In summary, good data stewardship is essential for good governance. If done right, data stewardship helps organizations:
- Enforce governance programs with clear policies, processes, and standards for data
- Incorporate a culture of data documentation and raise awareness about data-related best practices
- Meet compliance requirements as dictated by regulatory bodies across geographies
- Monitor data quality, integrity, security, and privacy to reduce risks and improve the overall management and use of data
- Support all humans of data (i.e., analysts, scientists, business managers) in extracting value from data and using it for decision-making
More importantly, data stewardship is the key to data democratization — a world where the right people instantly get access to the right data. That's when organizations grow exponentially.
Ted Colbert, the CIO of Boeing, sums it up best:
You have to have a platform through which people can easily access data. That helps people believe in it and deliver solutions that don’t require an expensive data scientist. When people begin to believe in the data, it’s a game-changer: They begin to change their behaviors, based on a new understanding of all the richness trapped beneath the surface of our systems and processes.
If data stewardship holds the key to good governance and data management, then why aren't more organizations going all-in and reaping its rewards?
Well, like all the other aspects of data management, stewardship comes with its fair share of challenges.
The Challenges that Modern Data Stewards Face
1. Buy-in from the C-suite
Lack of senior management commitment has killed many governance programs and doomed many data stewards.
According to BCG, senior line executives in most companies don't engage in policy and procedure. Instead, they delegate that responsibility to the legal and IT teams.
The result? Data misuse and compliance issues. Here's an example.
When Google's Street View cars drove around collecting imagery for Google Maps, they also collected data from home Wi-Fi networks (including passwords and emails). All this happened without consent: people had no idea that data from their home networks was being collected.
When Google investigated the case, they discovered that senior executives weren't in the loop and had no clue about these activities. Had they known, they wouldn't have approved these activities and Google wouldn't have faced a major trust and compliance problem.
That's why, for data stewardship to work, the C-suite must play an active role in decision-making. Their involvement also makes transparency and open communication top mandates for the entire data ecosystem.
As Rob Casper, the CDO at JP Morgan Chase, puts it:
If everybody sees what everybody else is doing, then the great ideas tend to rise to the top and the bad ideas tend to fall away.
2. Vague job descriptions
Even though organizations understand the importance of data stewardship, they're not clear on the role of a data steward. In many cases, data stewards are just figureheads, without any real authority and with limited understanding of their responsibilities.
According to SAS, data stewards become roving linebackers, going from meeting to meeting with no real authority to resolve data quality problems or enhance metadata management capabilities.
For data stewards to do their jobs, they need a clear description of their roles, responsibilities, and powers. For instance, rather than saying the data steward is responsible for all marketing data, it's useful to be more specific. That means including:
- Details of the type of data to be governed (customer, product or financial data)
- Goals or KPIs with a reasonable timeline (reduce errors in data by 10% in the first 6 months and by 20% in the next 6 months)
- Reporting parameters (how the steward will track progress and who will be in charge of monitoring the steward)
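One way to picture such a specific mandate is as a small record that holds the domain, the KPI, and the reporting line. The sketch below is hypothetical: the `StewardMandate` class and its field names are illustrative, the baseline error rate of 8% is made up, and the "reduce errors by 10%" target is assumed to be a relative reduction against the starting error rate.

```python
from dataclasses import dataclass


@dataclass
class StewardMandate:
    """Hypothetical written mandate for one data steward."""
    domain: str                 # type of data governed
    baseline_error_rate: float  # error rate when the mandate starts
    target_reduction: float     # e.g. 0.10 for "reduce errors by 10%"
    reports_to: str             # who monitors the steward

    def target_error_rate(self) -> float:
        # Assumption: the reduction is relative to the baseline error rate.
        return self.baseline_error_rate * (1 - self.target_reduction)

    def on_track(self, measured_error_rate: float) -> bool:
        return measured_error_rate <= self.target_error_rate()


mandate = StewardMandate("customer", baseline_error_rate=0.08,
                         target_reduction=0.10, reports_to="CDO")
print(mandate.on_track(0.07))   # True
print(mandate.on_track(0.075))  # False
```

Writing the mandate down in this form forces the organization to state the baseline, the target, and the reporting line up front, which is exactly what a vague job description leaves out.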
According to Gartner:
To achieve the greatest benefit, enterprises must properly define the data steward role, select the right individuals to fill the role, and guide the stewards' behavior in accordance with best practices.
3. Inadequate tooling
As the volume of data ingested by applications and systems grows and the data becomes more complex, the task of data stewardship also becomes more challenging.
Relying on spreadsheets, meeting notes, or project management tools alone isn't enough. Remember how we mentioned technology can only do so much? Well, once the mandate of a data steward is clear (domain and role, KPIs, action plan), technology can be a game-changer.
Imagine a single platform that can handle everything from data discovery to governance, while facilitating collaboration and data sharing.
For instance, a platform that automates data profiling, metadata crawling and data classification can save 95% of the time spent looking for data.
Similarly, a centralized repository for all data with a single dashboard to take stock of all the assets makes data available and accessible to all non-tech users.
Automated quality checks and alerts help catch data problems before they become too big to handle.
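As a minimal sketch of what an automated quality check with alerts might look like, the example below flags required fields with too many missing values. The function names, the sample records, and the 5% threshold are assumptions for illustration, not the behavior of any specific platform.

```python
def missing_rate(records: list[dict], field: str) -> float:
    """Share of records where the field is absent, None, or empty."""
    if not records:
        return 0.0
    missing = sum(1 for r in records if r.get(field) in (None, ""))
    return missing / len(records)


def quality_alerts(records, required_fields, threshold=0.05):
    """Return one alert message per field whose missing rate exceeds the threshold."""
    alerts = []
    for field in required_fields:
        rate = missing_rate(records, field)
        if rate > threshold:
            alerts.append(f"{field}: {rate:.0%} of records missing a value")
    return alerts


customers = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": ""},
    {"id": 3, "email": None},
    {"id": 4, "email": "d@example.com"},
]
print(quality_alerts(customers, ["id", "email"]))
# ['email: 50% of records missing a value']
```

The threshold is the steward's call: the tool runs the check, but someone still has to decide what counts as acceptable for each field.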
In-built collaboration features (chats, comments, and notes) and easy data sharing (with just a click) foster a culture of collaboration, communication and transparency.
With access to tools that offer features like the ones mentioned above, data stewards can do their jobs well.
The data steward role is complex. It requires someone who understands both business and IT and who can communicate effectively with the C-suite to influence data-related policies.
As Laura Madsen said:
Data stewards were meant to help solidify the squishy… They speak the language of IT and translate that back to the business. The role requires the patience of a kindergarten teacher and the ability to successfully negotiate a hostage situation.
In many cases, data stewards might already be there — folks who oversee the accountability and management of data. However, they lack the authority, clarity, and motivation to play a crucial role in data governance.
That's why organizations should work on:
- Solving their data stewardship challenges (detailed above)
- Establishing a solid data governance framework
- Appointing the right people as data stewards to implement, run and monitor governance initiatives
- Educating them on governance-related best practices
- Equipping stewards with modern data governance and stewardship tools like Atlan to support their efforts