MCP Server for Databricks: Deployment Types & Setup

Emily Winks, Data Governance Expert
Updated: 05/12/2026 | Published: 05/12/2026 | 10 min read

Key takeaways

  • Databricks offers three MCP server deployment options: fully-managed, custom via Databricks Apps, and external via AI Gateway.
  • Fully-managed MCP servers on Databricks have four sub-types: Unity Catalog functions, SQL, Genie Spaces, and Vector Search.
  • A Unity Catalog function becomes an MCP tool automatically at deployment, with no additional configuration required.
  • Databricks MCP context stops at the lakehouse boundary. Cross-stack context requires a separate enterprise context layer.

What is the MCP server for Databricks?

The MCP server for Databricks is an interface that exposes Databricks data, Unity Catalog metadata, and AI services to external agents and tools. Databricks released managed MCP servers (Public Preview) in January 2026, providing three deployment options that differ by how much control your team holds over infrastructure, authentication, and authorization. Each option serves a distinct use case, from fully-managed servers that handle all infrastructure automatically, to external connections that proxy third-party tools through AI Gateway.

Three deployment options:

  • Fully-managed MCP servers: Databricks owns and operates the MCP server, handling infrastructure, authentication, and authorization.
  • Custom MCP servers with Databricks Apps: You own the server but use Databricks Apps infrastructure to build and customize it.
  • External MCP servers with AI Gateway: Connect external third-party servers using Unity Catalog HTTP connections, with managed OAuth flows for select providers.


Why do you need the MCP server for Databricks?


Databricks supports agentic data engineering and analysis in many ways. It laid the foundation for building agents with Mosaic AI and kept building on it with services like the AI Gateway, Genie, and Agent Bricks:

  • Mosaic AI: The foundational layer for building, training, and serving ML models and agents inside Databricks

  • AI Gateway: Manages routing, rate limiting, and governance for LLM calls across the organization

  • Genie: Lets business users query data in natural language without writing SQL

  • Agent Bricks: Pre-built agent components that accelerate development for agents running within Databricks

The common thread across all four: they are built for agents operating inside Databricks, not for external agents. There was no standard way to discover and call lakehouse resources until MCP servers arrived.

The result is a Databricks lakehouse that any MCP-compatible agent can reach, without rebuilding the connection layer each time.


How can you set up Databricks-managed MCP servers for AI agents?


Managed MCP servers are deployed in a workspace and use Databricks serverless compute. These servers are tied to other Databricks resources.

To connect to a lakehouse, you might go via a Unity Catalog function or a SQL warehouse. For other use cases, you might want your MCP server to connect to a Genie Space or a Vector Search index.

Every MCP server is deployed with a URL that agents use to discover and call the tools they are permitted to access. There are four types of managed MCP servers, each scoped to one Databricks resource:

  • Unity Catalog functions: You can use out-of-the-box Unity Catalog functions or write your own in Python and SQL to run specific, predefined queries and scripts

  • Databricks SQL: This is where you give your AI agents access to your lakehouse using a SQL warehouse as the compute layer; Unity Catalog still governs all of this

  • Genie Spaces: Lets an AI agent use a Genie Space to ask natural language questions and offload the query translation and execution to Genie

  • Vector Search: Best when you need semantic search, especially over unstructured data in documentation, communication threads, support tickets, among other things

The important thing to note here: a Unity Catalog function or a Genie Space automatically becomes a tool available through the MCP server for that service, with no additional configuration.

For instance, a Unity Catalog function lookup_account becomes available at /api/2.0/mcp/functions/main/support/lookup_account as soon as it’s deployed.

Unity Catalog enforces permissions on every call, and AI Gateway is the central control plane for managing access and monitoring activity across all your MCP servers.
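
To make this concrete, here is a minimal client-side sketch using the official MCP Python SDK. The workspace host, token, and the account_id parameter are placeholders, and the server URL follows the pattern shown above:

import asyncio

from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client

# Placeholder URL, following the /api/2.0/mcp/functions/<catalog>/<schema> pattern
SERVER_URL = "https://<workspace-host>/api/2.0/mcp/functions/main/support"

async def main() -> None:
    headers = {"Authorization": "Bearer <databricks-token>"}  # placeholder credential
    async with streamablehttp_client(SERVER_URL, headers=headers) as (read, write, _):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()    # discover the tools this server exposes
            print([t.name for t in tools.tools])  # e.g., ["lookup_account"]
            result = await session.call_tool("lookup_account", {"account_id": "A-1001"})
            print(result.content)

asyncio.run(main())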


How can you use Databricks Apps to deploy custom MCP servers for AI agents?


A custom MCP server deployed on Databricks Apps still lives in Databricks, like a managed MCP server, but you own the build, deployment, and tool logic.

You can write an MCP server using popular tools like FastMCP (Python), the MCP Python SDK, rmcp (Rust), xmcp (TypeScript), among others.

Once you have a server like the one defined in the following code snippet, you can deploy it using the databricks apps deploy command:

from fastmcp import FastMCP

mcp = FastMCP("lakehouse")

@mcp.tool()
def get_lakehouse_table_status(table: str) -> dict:
    """Return some custom status metrics for a lakehouse table"""
    ...  # the logic goes here

if __name__ == "__main__":
    # FastMCP defaults to stdio; Databricks Apps need Streamable HTTP transport
    mcp.run(transport="http")
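
Before running databricks apps deploy, you can smoke-test the server in-process. This is a minimal sketch assuming FastMCP’s in-memory client, the server module above saved as server.py, and a hypothetical table name:

import asyncio

from fastmcp import Client

from server import mcp  # hypothetical module containing the FastMCP server above

async def smoke_test() -> None:
    # In-memory transport: the client talks to the server object directly,
    # with no HTTP round trip needed before deployment.
    async with Client(mcp) as client:
        tools = await client.list_tools()
        print([t.name for t in tools])  # ["get_lakehouse_table_status"]
        result = await client.call_tool(
            "get_lakehouse_table_status", {"table": "main.sales.orders"}
        )
        print(result)

asyncio.run(smoke_test())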

These custom MCP servers let you do the following things that fully-managed MCP servers don’t:

  • Deploy custom tool logic, wrapping existing internal APIs, ML models, and services behind a tool.

  • Use the language of your choice, as long as your custom server uses Streamable HTTP transport.

  • Choose from multiple authentication options, such as OAuth, service principal, on-behalf-of-user (OBO), and automatic authentication passthrough.
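
For example, with on-behalf-of-user authorization, a Databricks App receives the signed-in user’s token in a forwarded request header. A rough sketch with FastMCP follows; the whoami tool is hypothetical, and get_http_headers is FastMCP’s helper for reading the current request’s headers:

from fastmcp import FastMCP
from fastmcp.server.dependencies import get_http_headers

mcp = FastMCP("lakehouse")

@mcp.tool()
def whoami() -> str:
    """Report whether this call carries a per-user (OBO) token."""
    headers = get_http_headers()
    # Databricks Apps forward the signed-in user's token in this header
    # when on-behalf-of-user authorization is enabled for the app.
    user_token = headers.get("x-forwarded-access-token")
    return "per-user token present" if user_token else "running as shared principal"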

If you are working with multiple such servers, you can now (as of April 2026) register them within Databricks’ Supervisor Agent for multi-agent orchestration. These servers are billed on Databricks Apps pricing. Custom servers can also host third-party MCP servers, but you will need to handle the authentication and setup yourself.

For integrating with external tools such as GitHub or Glean, a better approach is to connect them all via Databricks’ AI Gateway. Let’s see how.


How can you use AI Gateway to connect to external MCP servers in Databricks?


The AI Gateway option is best when you want to connect third-party tools that ship their own MCP servers, while using Databricks’ infrastructure for routing. Common integrations that Databricks supports with managed OAuth flows include GitHub, Glean, Google Drive, and SharePoint.

You can also connect any other server via a custom Unity Catalog HTTP connection or, for servers that support OAuth 2.0 Dynamic Client Registration (DCR), via DCR. In all cases, the provider must use Streamable HTTP transport.

AI Gateway makes the external MCP server appear to be an internal, managed MCP server by exposing a proxy endpoint at /api/2.0/mcp/external/<connection_name>; the AI agent connects to the external server through this proxy URL.

Databricks supports both shared principal authentication and per-user authentication, allowing you to choose between a single service account and letting each user authenticate with their own credentials.
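
As an illustrative sketch of the custom Unity Catalog HTTP connection route described above, a connection could be created with the Databricks Python SDK, which exposes Unity Catalog connections through WorkspaceClient. The host, path, and token below are hypothetical, and bearer-token auth is shown for simplicity:

from databricks.sdk import WorkspaceClient
from databricks.sdk.service.catalog import ConnectionType

w = WorkspaceClient()

# Hypothetical values; providers with managed OAuth flows (GitHub, Glean,
# Google Drive, SharePoint) are configured through their own OAuth setup.
w.connections.create(
    name="glean_mcp",
    connection_type=ConnectionType.HTTP,
    options={
        "host": "https://<your-glean-host>",
        "port": "443",
        "base_path": "/mcp",
        "bearer_token": "<token>",
    },
)

# Agents would then connect via the proxy URL:
# https://<workspace-host>/api/2.0/mcp/external/glean_mcp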

While this gives you an easy way to connect MCP servers for other tools in your stack, the challenge is that context starts to get siloed in those tools, making it harder to govern and organize. So, first establish a balance among the three MCP deployment options and their usage within Databricks, while ensuring that context is well-organized and always available for AI agents to use.

With just the Databricks MCP servers, you can’t get there if you have a broad data stack. Let’s see why and how to fix it.


How can you fix the context crunch for AI agents and the lakehouse?


Large organizations seldom run on a single technology stack. Hedged tooling bets, spread-out teams, and business changes over time leave enterprises with many tools beyond a typical data warehouse or lakehouse: specialty databases, documentation, and visualization engines, among other things. Each serves a specific purpose, and each holds highly valuable business context fragmented into its own silo.

AI agents cannot (and should not) connect directly to those silos for context, for reasons of security, cost, risk, and maintainability. What’s needed instead is an enterprise context layer like Atlan that brings all the silos together for a full view of the context.

This layer spans all your organizational tools and talks not just to the lakehouse in Databricks but also to systems beyond it. It sits between your business systems and your AI agents, connecting lineage, business definitions, knowledge, quality scores, and access policies into a unified context store every agent can query.


How does Atlan’s enterprise context layer power AI agents across your data and AI stack?


Atlan has been architected as an AI-native context layer, powered by the Context Lakehouse and built on open standards and proven technologies like Iceberg. Atlan enriches this context layer and lets agents use it via the fully-managed Atlan MCP server, which provides context on governance, lineage, quality, and more.


Key capabilities include:

  • Enterprise Data Graph: All the key information from 80+ connectors lands into the enterprise data graph, which then becomes context for every other tool in your stack.

  • Context Agents: These are Atlan’s own AI agents that use the Enterprise Data Graph to enrich metadata and context automatically.

  • Context Engineering Studio: A visual interface that lets you bootstrap, simulate, evaluate, and deploy context for AI agents, and observe how it performs.

  • Atlan MCP server: The key piece of the context puzzle, moving context in and out of Atlan through official and custom connectors. The Atlan MCP server is available on Databricks Marketplace for one-click install.


Moving forward with MCP server for Databricks


Databricks provides three options for connecting your AI agents to your lakehouse. When working with Databricks alone, all these options make sense and work quite well.

But context seldom lives just inside the lakehouse. It’s spread across the organization in various tools: not just data tools, but also code repositories, documentation, and tickets, among other things.

That’s where Atlan comes in: it brings all of this together into a single context layer for the enterprise.

Atlan’s enterprise context layer runs on the Context Lakehouse, an Iceberg-native foundation that unifies business meaning (definitions, lineage, ownership, trust signals) across all tools and exposes it to agents via the Atlan MCP server. It is powered by the Enterprise Data Graph, Atlan’s own Context Agents, and the Context Engineering Studio.

That’s the missing half of the picture for any Databricks-based agent that has to reason about data shaped by tools outside the lakehouse.


FAQs about MCP server for Databricks


1. How does the Databricks-managed MCP server work?


Databricks operates a managed MCP server and exposes various resources, such as Genie Spaces, SQL warehouses, and Unity Catalog functions, as agent tools via an API. MCP clients can connect to this managed server, discover the available tools, and call them. Unity Catalog enforces permissions on every call, while AI Gateway provides centralized access management and monitoring.

2. Can the Databricks MCP servers work with the Atlan MCP server?


Yes, they are designed to work together. MCP clients and agents don’t just call one MCP server; they connect to multiple MCP servers at the same time. An agent can retrieve context from Atlan’s MCP server, then perform data engineering or analytics work in the Databricks lakehouse via the Databricks MCP server.
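
A rough sketch of that pattern with the MCP Python SDK (placeholder URLs; authentication headers omitted for brevity):

import asyncio

from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client

ATLAN_URL = "https://<your-atlan-mcp-host>/mcp"                    # placeholder
DBX_URL = "https://<workspace-host>/api/2.0/mcp/genie/<space-id>"  # placeholder

async def main() -> None:
    # One agent, two MCP servers: context from Atlan, execution in Databricks.
    async with streamablehttp_client(ATLAN_URL) as (a_read, a_write, _):
        async with streamablehttp_client(DBX_URL) as (d_read, d_write, _):
            async with ClientSession(a_read, a_write) as atlan, ClientSession(d_read, d_write) as dbx:
                await atlan.initialize()
                await dbx.initialize()
                # e.g., fetch definitions and lineage from Atlan,
                # then run the corresponding query through Genie.
                ...

asyncio.run(main())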

3. What MCP clients can connect to the Databricks and Atlan MCP servers?


Both Atlan and Databricks MCP servers work with any MCP-compliant client. The most popular options include Claude, Claude Code, ChatGPT, CrewAI, Bedrock AgentCore, and Cursor, among others.

4. When should I choose a custom MCP server over a managed one in Databricks?


You should use a custom MCP server only when your tool logic requires customization that isn’t covered by the out-of-the-box tools, such as Unity Catalog functions, Vector Search, Genie Spaces, and SQL queries. It also makes sense when you need more granular control over how and when the server can be used.

5. What transport protocol do Databricks MCP servers use and why?


Databricks MCP servers use Streamable HTTP. Databricks requires it for custom and external MCP servers because they need an HTTP-compatible transport to be reachable as Databricks Apps or through AI Gateway proxies. Streamable HTTP also lets servers handle multiple clients concurrently using standard POST and GET requests, without maintaining persistent connections per client.
