Multi-Cloud Data Governance: Five Rules for Success

Updated December 23rd, 2024


Multi-cloud is the new black. One study, from OVHcloud, found that 62% of UK enterprise companies are currently deploying on multiple cloud providers. While multi-cloud has numerous benefits, it also comes with some challenges — not the least of which is managing data governance in a technologically heterogeneous environment.

In this article, we’ll delve into what multi-cloud data governance is and cover five strategies for making it easier to implement and manage.


Table of Contents #

  1. What is multi-cloud data governance?
  2. The challenges with multi-cloud data governance
  3. Five best practices for multi-cloud data governance
  4. Conclusion
  5. Related reads

What is multi-cloud data governance? #

Multi-cloud data governance extends data governance (your collection of processes and tools for controlling decision rights and accountability around data) to cover data stored across multiple cloud service providers. It ensures that data is governed and compliant whether it lives as objects and metadata in an Amazon S3 bucket, as unstructured data in Microsoft Azure Data Lake, or as relational data in a distributed SQL database like CockroachDB.

Enterprises with critical data workloads that cannot tolerate downtime — like financial services — move to multi-cloud architecture to guarantee operational resilience, even if an entire cloud platform goes down. Other companies go multi-cloud to ensure compliance with data sovereignty regulations by distributing their data across different cloud service providers (CSPs).

Regardless of motive, any multi-cloud deployment encounters the very real challenge of governing its data so it’s always consistent and available across multiple platforms. This is why managing multi-cloud data demands specific tools and solutions that can handle the complexities of multi-cloud data governance.


The challenges with multi-cloud data governance #

There are four things that make multi-cloud data governance a challenge:

  • Tracking assets and costs
  • Monitoring data flows
  • Using vendor-specific tools
  • Ensuring security across cloud boundaries

Tracking assets and costs #


One of the benefits of the cloud is that engineers can easily spin up new resources. However, that’s also one of the cloud’s key business challenges.

Unless closely tracked, teams can create data stores that end up becoming data silos: ungoverned, hard-to-find repositories of data that may not conform to the company’s data quality and data governance policies. Left unmonitored, these silos degrade the overall quality of data governance across a multi-cloud deployment.

Teams may also leave old, unused data infrastructure up and running, incurring unnecessary cloud spend. Without appropriate controls in place, managers and executives may not notice these overruns until massive cloud bills start hammering their budgets.
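As a rough illustration of the tracking described above, the following Python sketch flags resources that are unowned or have sat idle past a threshold. It assumes you already export a cross-cloud inventory into simple records; the `DataResource` fields, the 90-day threshold, and the sample data are all hypothetical and not tied to any vendor API.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass(frozen=True)
class DataResource:
    """One row of a hypothetical cross-cloud inventory export."""
    cloud: str                  # e.g. "aws", "azure", "gcp"
    name: str
    owner_team: Optional[str]   # None means no team has claimed the resource
    last_accessed: date
    monthly_cost_usd: float


def find_stale(resources, today, max_idle_days=90):
    """Return resources that are unowned or idle longer than max_idle_days."""
    cutoff = today - timedelta(days=max_idle_days)
    return [
        r for r in resources
        if r.owner_team is None or r.last_accessed < cutoff
    ]


# Illustrative sample data only.
inventory = [
    DataResource("aws", "s3://orders-raw", "payments", date(2024, 12, 1), 120.0),
    DataResource("gcp", "bq.legacy_reports", "analytics", date(2024, 3, 14), 310.0),
    DataResource("azure", "adls-scratch", None, date(2024, 11, 20), 45.0),
]

stale = find_stale(inventory, today=date(2024, 12, 23))
potential_savings = sum(r.monthly_cost_usd for r in stale)

for r in stale:
    print(f"{r.cloud}: {r.name} (owner={r.owner_team}, last accessed {r.last_accessed})")
print(f"Potential monthly savings: ${potential_savings:.2f}")
```

A report like this could feed a regular review in which owners either claim a resource or approve its removal.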

Monitoring data flows #


A related risk is that teams may transmit data across clouds in violation of privacy regulations. That can result in expensive fines and a loss of customer trust.

Meta learned this the hard way when the Irish Data Protection Commission found that Meta Platforms Ireland Limited had violated the GDPR by transferring EU users’ personal data to the United States. The fine: 1.2 billion euros, the largest GDPR fine to date.
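One way to catch risky transfers before they happen is to check each planned data flow against a residency policy. The Python sketch below assumes a hypothetical policy table mapping data classifications to permitted destination regions. The classification names and the `DataFlow` structure are illustrative, and a real policy would come from your legal and compliance teams.

```python
from dataclasses import dataclass

# Hypothetical residency policy: classification -> regions where data may land.
# None means the classification carries no regional restriction.
ALLOWED_REGIONS = {
    "eu_personal": {"eu-west-1", "europe-west1", "westeurope"},
    "public": None,
}


@dataclass(frozen=True)
class DataFlow:
    dataset: str
    classification: str
    source_region: str
    dest_region: str


def find_violations(flows):
    """Return flows whose destination region is not permitted by the policy."""
    bad = []
    for flow in flows:
        if flow.classification not in ALLOWED_REGIONS:
            bad.append(flow)  # unknown classification: fail closed
            continue
        allowed = ALLOWED_REGIONS[flow.classification]
        if allowed is not None and flow.dest_region not in allowed:
            bad.append(flow)
    return bad


# Illustrative sample flows only.
flows = [
    DataFlow("crm.contacts", "eu_personal", "eu-west-1", "europe-west1"),
    DataFlow("crm.contacts", "eu_personal", "eu-west-1", "us-east-1"),
    DataFlow("docs.site_pages", "public", "westeurope", "us-east-1"),
    DataFlow("hr.payroll", "unclassified", "eu-west-1", "eu-west-1"),
]

for flow in find_violations(flows):
    print(f"BLOCK {flow.dataset}: {flow.source_region} -> {flow.dest_region}")
```

Treating unknown classifications as violations is a deliberate choice here: it forces new datasets to be classified before they move.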

Using vendor-specific tools #


Many cloud vendors have their own data governance tools, such as data catalogs, compliance monitoring tools, and the like. The cloud’s pay-as-you-go model makes it tempting to take advantage of these default solutions.

Because these tools are designed for a specific cloud platform, they may not work well, or even at all, with tools and data outside that platform. This is a particular pitfall when reconciling multiple data tools, given that most companies draw data from over 400 distinct data sources.

Most cloud-specific data catalogs don’t offer the diverse library of data connectors required to cover this breadth of data, which forces companies to build custom workarounds.

Ensuring security across cloud boundaries #


Another challenge is securing cross-cloud data boundaries. 31% of data leaders cited this as a major concern for their multi-cloud implementations.

Going multi-cloud means you can’t just use a single cloud-specific identity provider, such as AWS’s Identity and Access Management (IAM), across your entire data estate. Instead, you need to manage and integrate multiple security systems — which increases the risk of a data leak or security breach.


Five best practices for multi-cloud data governance #

Solving multi-cloud data governance isn’t easy. However, there are a few best practices that do help simplify it:

  • Add multi-cloud to your cloud data governance framework
  • Control and monitor cloud access
  • Provide cross-cloud management tools
  • Use Single Sign-On (SSO) and federation everywhere
  • Use cloud-agnostic data governance and monitoring tools

Let’s look at each of these in detail.

Add multi-cloud to your cloud data governance framework #


The first step is raising awareness among teams around the need for good governance of multi-cloud systems. This means updating your cloud data governance guidelines to address multi-cloud usage specifically.

Guidelines should discuss onboarding and approval procedures, preferred clouds for specific workloads, and the responsibilities that teams assume when launching multi-cloud solutions.

Control and monitor cloud access #


Good multi-cloud governance means you can’t just tell teams to create a new cloud account and have at it. Rather, you need to provide the tooling to approve, create, and monitor usage of new cloud accounts across all supported clouds.

Common multi-cloud management tools should support:

  • Tracking spend and billing it back against the appropriate internal cost center
  • Deploying data governance policies to enforce compliance
  • Tagging data resources to track compliance (e.g., adding sensitivity labels to identify and track Personally Identifiable Information, or PII)
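To make the tagging and chargeback ideas concrete, here is a minimal Python sketch. It assumes resource metadata from every cloud has been normalized into plain dictionaries. The required tag names (`cost_center`, `owner`, `sensitivity`), the resource identifiers, and the costs are hypothetical examples rather than a standard.

```python
from collections import defaultdict

# Hypothetical tagging standard that every data resource must follow.
REQUIRED_TAGS = {"cost_center", "owner", "sensitivity"}

# Illustrative, already-normalized inventory spanning three clouds.
resources = [
    {"cloud": "aws", "id": "s3://customer-exports", "monthly_cost_usd": 200.0,
     "tags": {"cost_center": "CC-101", "owner": "growth", "sensitivity": "pii"}},
    {"cloud": "azure", "id": "adls://clickstream", "monthly_cost_usd": 150.0,
     "tags": {"cost_center": "CC-101", "owner": "web"}},
    {"cloud": "gcp", "id": "bq://finance.ledger", "monthly_cost_usd": 80.0,
     "tags": {"cost_center": "CC-205", "owner": "finance", "sensitivity": "internal"}},
]


def missing_tags(resource):
    """Return the set of required tags that the resource does not carry."""
    return REQUIRED_TAGS - set(resource["tags"])


def spend_by_cost_center(items):
    """Sum monthly cost per cost center; untagged spend goes to UNALLOCATED."""
    totals = defaultdict(float)
    for item in items:
        totals[item["tags"].get("cost_center", "UNALLOCATED")] += item["monthly_cost_usd"]
    return dict(totals)


non_compliant = {r["id"]: missing_tags(r) for r in resources if missing_tags(r)}
pii_resources = [r["id"] for r in resources if r["tags"].get("sensitivity") == "pii"]

print("Missing tags:", non_compliant)
print("PII resources:", pii_resources)
print("Chargeback:", spend_by_cost_center(resources))
```

The same tag set then serves both purposes: compliance tracking through the sensitivity label and chargeback through the cost center.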

Some large companies may create their own internal toolsets to manage this. For most orgs, adopting Software as a Service (SaaS) multi-cloud management solutions for cross-platform deployment, cost management, monitoring, and security will be a faster and more cost-effective alternative.

Provide cross-cloud management tools #


Along with monitoring, provide cross-cloud templates for common application and data workload scenarios within your company, using infrastructure-as-code tools such as Terraform and Ansible. This helps ensure uniformity across deployments. It also makes developers more productive by reducing the platform-specific knowledge they need to deploy on a new cloud provider.
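One possible shape for such a template is a provider-neutral spec that gets rendered into provider-specific configuration. The Python sketch below emits Terraform's JSON syntax for an AWS or Google Cloud storage bucket. The `BucketSpec` schema and the `render` helper are hypothetical, and only a small subset of each resource's attributes is shown, so treat it as a starting point rather than a complete module.

```python
import json
from dataclasses import dataclass, field


@dataclass
class BucketSpec:
    """Provider-neutral description of a storage bucket (hypothetical schema)."""
    name: str
    locations: dict                      # provider -> region or location name
    labels: dict = field(default_factory=dict)


def render(spec, provider):
    """Translate a BucketSpec into a Terraform JSON configuration fragment."""
    if provider == "aws":
        # In this sketch the AWS region is set on the provider block.
        return {
            "provider": {"aws": {"region": spec.locations["aws"]}},
            "resource": {"aws_s3_bucket": {spec.name: {
                "bucket": spec.name,
                "tags": spec.labels,
            }}},
        }
    if provider == "gcp":
        return {
            "resource": {"google_storage_bucket": {spec.name: {
                "name": spec.name,
                "location": spec.locations["gcp"],
                "labels": spec.labels,
            }}},
        }
    raise ValueError(f"unsupported provider: {provider}")


# Illustrative spec; the bucket name and label values are made up.
spec = BucketSpec(
    name="analytics-landing",
    locations={"aws": "eu-west-1", "gcp": "europe-west1"},
    labels={"cost_center": "cc-101", "sensitivity": "internal"},
)

for provider in ("aws", "gcp"):
    print(json.dumps(render(spec, provider), indent=2))
```

Developers fill in one spec, and the platform team owns the mapping to each provider, which is where the platform-specific knowledge stays.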

Use Single Sign-On (SSO) and federation everywhere #


SSO integrates a centralized identity database across multiple cloud-based applications. Using a single identity provider reduces the risk involved in creating and securing multiple authentication credentials and permissions. It also simplifies decommissioning access (e.g., when someone leaves the company), as IT personnel only need to deactivate one set of credentials.

All internal multi-cloud and SaaS management tools should use SSO wherever possible.
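The decommissioning benefit can be illustrated with a small Python sketch. It assumes a central directory in which group membership drives the roles a user receives in each cloud. The group names, the user record, and the role mapping are hypothetical, and in practice the identity provider and each cloud's federation settings would enforce this rather than application code.

```python
# Hypothetical mapping from central directory groups to per-cloud roles.
GROUP_ROLES = {
    "data-engineers": {
        "aws": "DataEngineerRole",               # illustrative custom role name
        "azure": "Storage Blob Data Contributor",
        "gcp": "roles/bigquery.dataEditor",
    },
    "analysts": {
        "aws": "AnalystReadOnlyRole",            # illustrative custom role name
        "gcp": "roles/bigquery.dataViewer",
    },
}


def effective_access(user):
    """Derive per-cloud roles from group membership in the central directory."""
    if not user["active"]:
        return {}  # a deactivated identity has no access in any cloud
    access = {}
    for group in user["groups"]:
        for cloud, role in GROUP_ROLES.get(group, {}).items():
            access.setdefault(cloud, set()).add(role)
    return access


# Illustrative user record.
user = {"name": "jordan", "active": True, "groups": ["data-engineers", "analysts"]}
print(effective_access(user))

# Offboarding touches one record, and access disappears everywhere.
user["active"] = False
print(effective_access(user))
```

Because access is derived from a single record, there are no per-cloud credentials left behind to find and revoke.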

Use cloud-agnostic data governance and monitoring tools #


Cloud service providers provide their own platform-specific solutions for data cataloging, data governance, compliance management, application performance monitoring, etc. Defaulting to these may be tempting when you’re using a sole provider, but single-vendor solutions — particularly for data — make multi-cloud deployment even more complicated than it already is.

Using cloud-agnostic tools and solutions from third-party providers prevents getting locked in to a single cloud provider. These solutions are also more likely to support the various data sources in use at your company because they are built to integrate with all types of data sources across all kinds of clouds.

For application performance monitoring (APM), vendors like Datadog and Splunk provide tools to consolidate logs and metrics from all major cloud providers, as well as private and hybrid clouds.

For data cataloging and data governance, Atlan is a comprehensive third-party, cross-platform solution. It supports 80+ data sources with out-of-the-box connectors and can serve as the single source of truth in your organization — no matter where your data lives.

Atlan provides other best-in-class features that make it a solid choice for data governance:

  • An easy-to-use UX with a natural language interface that data analysts, data stewards, and business users can leverage to discover, activate, and govern data
  • Full column-level data lineage to build trust in data and resolve problems at their source
  • AI copilot support for everything from documentation to SQL generation to policy creation

Conclusion #

Supporting multi-cloud data governance requires new tools and processes to ensure data remains governed, secure, and compliant regardless of where it lives. Leveraging cloud-agnostic, cross-platform tools like Atlan reduces the time and effort required to remain compliant with applicable regulations and keep critical business data safe.

To learn more about how Atlan can help you govern data across clouds, book a demo today.


