Databricks ships pre-built MCP endpoints for Vector Search, Genie, SQL, and Unity Catalog functions. Every endpoint speaks HTTPS with OAuth, while most MCP clients launch servers as subprocesses and speak stdio over pipes. Here's how a small proxy bridges the two transports, where the OAuth token comes from, and how the two proxy implementations in this repo are wired up in .mcp.json.example.
Three takeaways. The rest of this guide unpacks them.
Four ready-to-use endpoints under /api/2.0/mcp/… — Vector Search, Genie, DBSQL, and Unity Catalog functions. Unity Catalog permissions are always enforced server-side. Visible in your workspace under AI Gateway → MCPs.
Claude Code, Cursor, and Windsurf launch MCP servers as subprocesses and pipe JSON-RPC over stdin/stdout. The managed endpoints require workspace OAuth. A proxy reads stdio, gets a Bearer token through your ~/.databrickscfg profile, and POSTs to the HTTPS endpoint.
Option A: mcp_proxy.py in this repo, ~190 lines, PEP 723 inline deps, no install step. Option B: uvx uc-mcp-proxy, a published package fetched on first run. Same wire protocol, different distribution.
Databricks runs the MCP server for you. You point an MCP-compatible client at a URL on your workspace, authenticate with workspace OAuth, and get a set of tools that are governed by Unity Catalog.
The endpoints are exposed under your workspace host at /api/2.0/mcp/… — same hostname and auth surface as the rest of the Databricks REST API. You don't deploy, version, or scale anything. The tradeoff: you lose the per-request control that a custom MCP server gives you (warehouse pinning, custom result shaping, additional tools). For most agent-to-data-and-tools wiring, that's a fair trade.
In your Databricks workspace, navigate to AI Gateway → MCPs. Each enabled server shows its URL and current status. AI Gateway also surfaces usage telemetry alongside other AI resources, so you can monitor managed MCP traffic the same way you monitor Mosaic AI endpoints.
Four server types, each with its own URL pattern and OAuth scope. Pick whichever matches the data or tool surface your agent needs. You can wire several into the same MCP client; tool names get prefixed by server name, so they don't collide.
| Server | URL pattern | OAuth scope | Read/write |
|---|---|---|---|
| Vector Search: query Vector Search indexes with Databricks-managed embeddings. | /api/2.0/mcp/vector-search/{catalog}/{schema}, or scope to one index by appending /{index_name} | vector-search | Read |
| Genie Space: natural-language analytics over a curated Genie space. | /api/2.0/mcp/genie/{genie_space_id} | genie | Read |
| Databricks SQL: run AI-generated SQL via a warehouse. The endpoint picks the warehouse server-side. | /api/2.0/mcp/sql | sql | Read + write |
| UC Functions: invoke Unity Catalog SQL or Python functions as tools. | /api/2.0/mcp/functions/{catalog}/{schema}, or scope to one function by appending /{function_name} | unity-catalog | Read (function-dependent) |
For Vector Search and UC Functions, you can stop the URL at the schema level (exposes every index or function in that schema as a separate tool) or extend it to a specific resource. Schema-scoped is the right default when you want the agent to choose among related tools; resource-scoped is the better choice when you want a tight blast radius. Genie URLs always end at the space ID; SQL has no parameters.
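The URL patterns in the table can be sketched as a few path builders. This is a minimal illustration, not code from the repo: the function names are hypothetical, and only the path shapes come from the table above.

```python
from typing import Optional

# Prefix shared by the four managed MCP endpoints described above.
MCP_PREFIX = "/api/2.0/mcp"


def vector_search_path(catalog: str, schema: str, index_name: Optional[str] = None) -> str:
    """Schema-scoped path, or index-scoped when index_name is given."""
    parts = [MCP_PREFIX, "vector-search", catalog, schema]
    if index_name:
        parts.append(index_name)
    return "/".join(parts)


def functions_path(catalog: str, schema: str, function_name: Optional[str] = None) -> str:
    """Schema-scoped path, or function-scoped when function_name is given."""
    parts = [MCP_PREFIX, "functions", catalog, schema]
    if function_name:
        parts.append(function_name)
    return "/".join(parts)


def genie_path(genie_space_id: str) -> str:
    """Genie paths always end at the space ID."""
    return f"{MCP_PREFIX}/genie/{genie_space_id}"


def sql_path() -> str:
    """The SQL endpoint takes no path parameters."""
    return f"{MCP_PREFIX}/sql"
```

Leaving off the optional last argument gives the schema-scoped form; passing it narrows the endpoint to a single index or function.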
vector-search, genie, sql, or unity-catalog. The CLI proxy path sidesteps this by reusing your existing CLI token.
An MCP client and a managed Databricks endpoint speak incompatible protocols. The mismatch is not subtle: transport, framing, authentication, and session handling all differ.
| Concern | MCP client (Claude Code, Cursor, …) | Databricks managed MCP endpoint |
|---|---|---|
| Transport | stdio — launches the server as a subprocess and pipes data over stdin/stdout | HTTPS — streamable-http endpoint, no subprocess model |
| Framing | Newline-delimited JSON-RPC frames | HTTP POST with JSON body; responses can be JSON or Server-Sent Events |
| Authentication | No native concept — just runs a binary that's already trusted | Workspace OAuth (or PAT) on every request as Authorization: Bearer <token> |
| Session | One long-lived subprocess per server entry | Stateless HTTP; session ID returned in Mcp-Session-Id header and must be echoed back |
A proxy is the small piece of code that resolves these four mismatches. It is launched as a stdio subprocess by the MCP client, reads JSON-RPC frames from stdin, opens an HTTPS connection to the managed endpoint, attaches a Bearer token, deals with SSE response parsing if the server uses it, tracks the session ID, and writes responses back to stdout. From the client's point of view, the proxy looks like a normal local MCP server.
Some MCP clients do support streamable-http natively (the docs show examples for Cursor and ChatGPT). But the dominant clients today — Claude Code, Claude Desktop, Windsurf — expect stdio. A proxy is the universal answer: it works with every MCP client because every MCP client supports stdio.
Three boxes, two arrows, one auth round-trip. Walk through one request from the moment the agent calls a tool to the moment the result lands back in the client.
At launch, the MCP client reads .mcp.json, finds your server entry, and runs the configured command as a subprocess. The subprocess inherits stdin/stdout/stderr from the client. From this moment until shutdown, the proxy and client communicate over those pipes only.
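The stdio side of that conversation is newline-delimited JSON-RPC. A minimal sketch of reading and writing frames, assuming one JSON object per line; `read_frames` and `write_frame` are illustrative names, not functions taken from mcp_proxy.py.

```python
import json
from typing import Iterator, TextIO


def read_frames(stream: TextIO) -> Iterator[dict]:
    """Yield one JSON-RPC frame per non-empty line until the stream closes."""
    for line in stream:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between frames
        yield json.loads(line)


def write_frame(frame: dict, stream: TextIO) -> None:
    """Write a frame as a single line and flush so the client sees it at once."""
    stream.write(json.dumps(frame, separators=(",", ":")) + "\n")
    stream.flush()
```

In a real proxy the streams would be `sys.stdin` and `sys.stdout`. Anything that is not a frame (logs, tracebacks) belongs on stderr, because the client parses every stdout line as JSON-RPC.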
The client sends a JSON-RPC initialize request over stdin. Before forwarding it, the proxy builds a WorkspaceClient from the --profile argument. The Databricks SDK reads ~/.databrickscfg, finds the host and auth method, and either reuses a cached OAuth token from ~/.databricks/token-cache.json or kicks off a browser-based OAuth flow on first use.
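One way to get that token from the SDK is sketched below. It assumes databricks-sdk is installed and uses `WorkspaceClient(profile=...)` together with `config.authenticate()`, which returns the request headers the SDK would attach. The helper names are hypothetical, and mcp_proxy.py may do this differently.

```python
from typing import Dict


def auth_headers_for_profile(profile: str) -> Dict[str, str]:
    """Ask the Databricks SDK for fresh auth headers for a CLI profile."""
    # Imported lazily so bearer_token() below works without the SDK installed.
    from databricks.sdk import WorkspaceClient

    client = WorkspaceClient(profile=profile)
    # Returns a dict such as {"Authorization": "Bearer ..."}; the SDK
    # refreshes the token first when it needs to.
    return client.config.authenticate()


def bearer_token(headers: Dict[str, str]) -> str:
    """Extract the token from an 'Authorization: Bearer <token>' header."""
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme != "Bearer" or not token:
        raise ValueError("expected an 'Authorization: Bearer <token>' header")
    return token
```

Calling `auth_headers_for_profile` before every request, rather than caching the header yourself, leaves token refresh to the SDK.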
The proxy serializes the JSON-RPC frame as the body of an HTTP POST to the managed endpoint URL. It attaches an Authorization: Bearer <token> header from the SDK. If the server returns an Mcp-Session-Id header, the proxy caches it and echoes it back on every subsequent request — that's how the server tracks session state across stateless HTTP calls.
POST /api/2.0/mcp/sql HTTP/1.1
Host: <workspace-host>.cloud.databricks.com
Content-Type: application/json
Accept: application/json, text/event-stream
Authorization: Bearer <oauth-access-token>
Mcp-Session-Id: 7c1f…

{"jsonrpc":"2.0","method":"tools/call","params":{…},"id":1}
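The request above can be assembled with the standard library alone. A minimal sketch: `build_request` is a hypothetical helper, the URL and token are supplied by the caller, and nothing is sent until the request is passed to `urllib.request.urlopen`.

```python
import json
import urllib.request
from typing import Optional


def build_request(url: str, token: str, frame: dict,
                  session_id: Optional[str] = None) -> urllib.request.Request:
    """Wrap one JSON-RPC frame in an HTTP POST for a managed MCP endpoint."""
    headers = {
        "Content-Type": "application/json",
        # Accept both reply styles: a single JSON body or an SSE stream.
        "Accept": "application/json, text/event-stream",
        "Authorization": f"Bearer {token}",
    }
    if session_id:
        # Echo the session ID the server handed out on an earlier response.
        headers["Mcp-Session-Id"] = session_id
    body = json.dumps(frame).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

The session header is omitted on the first request because the server has not issued an ID yet.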
The endpoint replies with either application/json (a single response frame) or text/event-stream (one or more events split by blank lines, each data:-prefixed). The proxy demultiplexes SSE into individual JSON-RPC frames, then writes each frame to stdout newline-delimited. The client reads them just as if they had come from a local in-process server.
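The demultiplexing step can be sketched as follows. It assumes the whole response body is already in memory and that each SSE event carries one JSON-RPC message in its `data:` lines. A production proxy would more likely parse the stream incrementally, and `frames_from_response` is an illustrative name.

```python
import json
from typing import List


def frames_from_response(content_type: str, body: str) -> List[dict]:
    """Turn a JSON or SSE response body into a list of JSON-RPC frames."""
    media_type = content_type.split(";")[0].strip().lower()
    if media_type != "text/event-stream":
        return [json.loads(body)]  # plain JSON: exactly one frame

    frames = []
    text = body.replace("\r\n", "\n").replace("\r", "\n")
    for event in text.split("\n\n"):  # events are separated by blank lines
        data_lines = []
        for line in event.split("\n"):
            if line.startswith("data:"):
                value = line[len("data:"):]
                # The SSE format allows one optional space after the colon.
                data_lines.append(value[1:] if value.startswith(" ") else value)
        if data_lines:
            # Multiple data: lines in one event are joined with newlines.
            frames.append(json.loads("\n".join(data_lines)))
    return frames
```

Each returned frame is then written to stdout on its own line.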
If the server returns HTTP 404 or 410 on a request that carries a session ID, the proxy treats that as session expiry, replays the cached initialize frame to mint a new session, and retries the original request once. The client never sees the blip. This is the only piece of stateful logic the proxy carries.
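That recovery logic can be sketched with the HTTP call injected as a function, which keeps the example runnable without a workspace. `SessionForwarder` and the `post` callable's signature are assumptions made for illustration, not the repo's actual structure.

```python
from typing import Callable, Optional, Tuple

# post(frame, session_id) -> (status, session_id_from_response, body)
PostFn = Callable[[dict, Optional[str]], Tuple[int, Optional[str], str]]


class SessionForwarder:
    """Forward frames, replaying initialize once if the session has expired."""

    def __init__(self, post: PostFn) -> None:
        self._post = post
        self._session_id: Optional[str] = None
        self._initialize_frame: Optional[dict] = None

    def _send(self, frame: dict) -> Tuple[int, str]:
        status, new_session, body = self._post(frame, self._session_id)
        if new_session:
            self._session_id = new_session  # cache the ID for later requests
        return status, body

    def forward(self, frame: dict) -> Tuple[int, str]:
        if frame.get("method") == "initialize":
            self._initialize_frame = frame  # remember it for a later replay
        sent_session = self._session_id
        status, body = self._send(frame)
        expired = status in (404, 410) and sent_session is not None
        if expired and self._initialize_frame is not None:
            self._session_id = None
            self._send(self._initialize_frame)  # mint a new session
            status, body = self._send(frame)    # retry the original once
        return status, body
```

The retry happens at most once per request, so a server that keeps returning 404 surfaces the error to the client instead of looping.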
The proxy runs on your machine as you. The Bearer token never leaves your machine in a form the client can extract — the client only ever sees JSON-RPC frames on stdio. Token refresh is handled by the SDK using the same trust boundary as databricks CLI commands you've already authorized.
Both implementations follow the same architecture from the previous section. The difference is whether the proxy code lives in your repo or is fetched from PyPI on first run. The right choice depends on how much control you want over the code.
Option A: hand-rolled mcp_proxy.py. ~190 lines of Python in this repo. PEP 723 inline dependencies mean uv run resolves databricks-sdk on demand: no install step, no shared venv.
Requires copying mcp_proxy.py to every consuming repo (or referencing it by absolute path).
"command": "uv",
"args": [
"run", "./mcp_proxy.py",
"--path", "/api/2.0/mcp/sql",
"--profile", "<your-profile>"
]
Option B: uvx uc-mcp-proxy from PyPI. Published package fetched on demand by uvx. Documented in Databricks' own connect-clients guide as the recommended CLI-auth path for Cursor and Windsurf.
Nothing lives in your repo beyond the .mcp.json entry. Requires a uvx install on every machine that runs the client.
"command": "uvx",
"args": [
"uc-mcp-proxy",
"--url", "https://<host>/api/2.0/mcp/sql",
"--auth-type", "databricks-cli",
"--profile", "<your-profile>"
]
Default to Option B (uvx uc-mcp-proxy) when you're standing up a new wiring fast or sharing a recipe with teammates. Switch to Option A (hand-rolled) when you want to add behavior the package doesn't expose — structured request logging, custom retries, OBO token shaping, or anything that turns the proxy into more than a pipe.
The hand-rolled proxy resolves the workspace host from your CLI profile, so you only pass the API path. The PyPI proxy takes the full URL, host included, and uses the profile only for auth. Two style choices for the same destination.
| Concern | Hand-rolled mcp_proxy.py | uvx uc-mcp-proxy |
|---|---|---|
| How the workspace host is supplied | Derived from --profile via ~/.databrickscfg | Hardcoded into --url |
| How the endpoint path is supplied | --path /api/2.0/mcp/… | Trailing path on --url |
| Auth-type selector | Implicit: SDK default chain | Explicit: --auth-type databricks-cli |
| Install step | None: uv run resolves PEP 723 deps | None: uvx fetches on first run |
| Repo footprint | One file (~190 lines) | Zero files |
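To make the difference concrete, here is a small sketch that builds both kinds of server entry from the same inputs. The flags mirror the examples in this guide; the function names and the example host are placeholders chosen for illustration.

```python
import json


def hand_rolled_entry(path: str, profile: str) -> dict:
    """Entry for mcp_proxy.py: the host comes from the profile, so pass only the path."""
    return {
        "command": "uv",
        "args": ["run", "./mcp_proxy.py", "--path", path, "--profile", profile],
    }


def uvx_entry(host: str, path: str, profile: str) -> dict:
    """Entry for uc-mcp-proxy: the full URL is spelled out, host included."""
    return {
        "type": "stdio",
        "command": "uvx",
        "args": [
            "uc-mcp-proxy",
            "--url", f"https://{host}{path}",
            "--auth-type", "databricks-cli",
            "--profile", profile,
        ],
    }


if __name__ == "__main__":
    # Placeholder host and profile; substitute your own values.
    print(json.dumps(uvx_entry("example.cloud.databricks.com", "/api/2.0/mcp/sql", "dev"), indent=2))
```

Switching a server from one proxy to the other comes down to moving the host between the profile and the URL.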
Add server entries to .mcp.json (project-level) or ~/.claude/.mcp.json (global), then restart your MCP client — servers only load at startup. The full template lives in .mcp.json.example in this repo; below are the four canonical patterns.
Both proxy variants reuse the OAuth token cached by the Databricks CLI. If you haven't authenticated against the workspace yet, do it now:
databricks auth login --host https://<workspace-host>.cloud.databricks.com --profile <your-profile>
This kicks off a browser flow, mints a token, and caches it at ~/.databricks/token-cache.json keyed on the profile name. From then on the proxy can mint Bearer tokens silently. The token auto-refreshes; you don't need to redo this until the refresh token expires.
"vs-customer-support": {
"command": "uv",
"args": [
"run", "./mcp_proxy.py",
"--path",
"/api/2.0/mcp/vector-search/prod/support",
"--profile", "<your-profile>"
]
}
"vs-customer-support": {
"type": "stdio",
"command": "uvx",
"args": [
"uc-mcp-proxy",
"--url", "https://<host>/api/2.0/mcp/vector-search/prod/support",
"--auth-type", "databricks-cli",
"--profile", "<your-profile>"
]
}
"genie-billing": {
"command": "uv",
"args": [
"run", "./mcp_proxy.py",
"--path",
"/api/2.0/mcp/genie/<genie-space-id>",
"--profile", "<your-profile>"
]
}
"genie-billing": {
"type": "stdio",
"command": "uvx",
"args": [
"uc-mcp-proxy",
"--url", "https://<host>/api/2.0/mcp/genie/<genie-space-id>",
"--auth-type", "databricks-cli",
"--profile", "<your-profile>"
]
}
"dbsql": {
"command": "uv",
"args": [
"run", "./mcp_proxy.py",
"--path", "/api/2.0/mcp/sql",
"--profile", "<your-profile>"
]
}
"dbsql": {
"type": "stdio",
"command": "uvx",
"args": [
"uc-mcp-proxy",
"--url", "https://<host>/api/2.0/mcp/sql",
"--auth-type", "databricks-cli",
"--profile", "<your-profile>"
]
}
"ucfunc-retail": {
"command": "uv",
"args": [
"run", "./mcp_proxy.py",
"--path",
"/api/2.0/mcp/functions/prod/retail",
"--profile", "<your-profile>"
]
}
"ucfunc-retail": {
"type": "stdio",
"command": "uvx",
"args": [
"uc-mcp-proxy",
"--url", "https://<host>/api/2.0/mcp/functions/prod/retail",
"--auth-type", "databricks-cli",
"--profile", "<your-profile>"
]
}
The /api/2.0/mcp/sql endpoint picks the SQL warehouse server-side using a five-tier priority algorithm and ignores every client-side pinning attempt. For deterministic routing — cost attribution, team isolation, SLA enforcement — see the warehouse-pinning guide and the per-user override guide. The proxy itself doesn't influence warehouse selection.
Where the Bearer token actually comes from when you write --profile dev and don't think about it again. Three layers of resolution, all handled by the Databricks SDK.
~/.databrickscfg resolution. The proxy passes profile=<your-profile> into WorkspaceClient. The SDK opens ~/.databrickscfg, finds the matching [your-profile] section, and reads host plus whatever auth fields are configured (token, auth_type, client_id, …). For OAuth profiles created via databricks auth login, only host appears in the file; the rest is implicit.
The SDK looks for a cached token in ~/.databricks/token-cache.json, keyed by workspace host plus profile. If a non-expired access token exists, it's used directly. If the access token has expired but a refresh token is present, the SDK calls Databricks to exchange the refresh token for a new access token and updates the cache in place — no user interaction.
If neither an access nor a refresh token is usable, the SDK starts the OAuth code flow: opens a localhost listener, prints a URL, and waits for the user to complete the browser-based consent. This is the only path that requires human interaction; once it completes, the cache is populated and subsequent requests stay quiet.
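The decision across these layers can be modeled in a few lines. This is a simplified model of the behavior described above, not the SDK's implementation: `CachedToken` and `next_auth_step` are hypothetical names, and the real cache format is not shown here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CachedToken:
    access_token: str
    expires_at: float                    # Unix timestamp, in seconds
    refresh_token: Optional[str] = None


def next_auth_step(cached: Optional[CachedToken], now: float) -> str:
    """Return which path applies: 'use-cached', 'refresh', or 'browser-login'."""
    if cached is not None and cached.expires_at > now:
        return "use-cached"              # access token still valid
    if cached is not None and cached.refresh_token:
        return "refresh"                 # silent exchange, no user interaction
    return "browser-login"               # the only path that needs a human
```

Only the last branch interrupts the user, which is why a proxy that authenticated once usually stays silent afterwards.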
By the time the proxy POSTs the JSON-RPC frame, all three layers have collapsed into a single header. The endpoint never sees your profile name, never knows whether the token was cached or just minted, and never cares which CLI version produced it.
Authorization: Bearer <oauth-access-token>
If a request fails with 401, the fastest diagnostic is databricks current-user me --profile <your-profile>. If that command succeeds, your CLI auth is healthy and the issue is elsewhere (UC permissions, IP allowlist, scope mismatch). If it fails, re-run databricks auth login for the profile.
After editing .mcp.json, restart your MCP client. The new tools should appear in the tool list, prefixed by the server name you gave them. Three quick checks turn up most wiring problems.
Confirms the proxy will be able to mint a token. If this fails, fix the profile first; everything downstream depends on it.
databricks current-user me --profile <your-profile>
# expect: JSON with your userName, emails, id
An unauthenticated curl should hit the endpoint and come back with a 401 Unauthorized. That's the signal that DNS, TLS, and the workspace are reachable; the server simply rejected the request because it had no Bearer token.
curl -i https://<workspace-host>.cloud.databricks.com/api/2.0/mcp/sql \
-X POST -H "Content-Type: application/json" -d '{}'
# expect: HTTP/2 401 with a www-authenticate header
If you get 403, the workspace's IP allowlist is most likely blocking you. If the request hangs or TLS fails, the workspace URL or your DNS is wrong.
Run the proxy command directly, send a JSON-RPC initialize on stdin, and confirm a well-formed response comes back on stdout. This is the same call your MCP client makes at startup.
echo '{"jsonrpc":"2.0","method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"manual","version":"1"}},"id":1}' \
| uv run ./mcp_proxy.py --path /api/2.0/mcp/sql --profile <your-profile>
# expect: one JSON-RPC response with serverInfo and capabilities
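If you'd rather script this check, the same frame can be built and the reply validated in Python. A minimal sketch: the helper names are hypothetical, and the validation covers only the two fields the check above expects (serverInfo and capabilities).

```python
import json


def initialize_frame(request_id: int = 1) -> dict:
    """The same initialize request as the echo command above."""
    return {
        "jsonrpc": "2.0",
        "method": "initialize",
        "params": {
            "protocolVersion": "2024-11-05",
            "capabilities": {},
            "clientInfo": {"name": "manual", "version": "1"},
        },
        "id": request_id,
    }


def looks_like_initialize_result(line: str, request_id: int = 1) -> bool:
    """True when a stdout line is a JSON-RPC result with serverInfo and capabilities."""
    message = json.loads(line)
    result = message.get("result")
    if not isinstance(result, dict):
        return False  # an error response, or not a response at all
    return (
        message.get("jsonrpc") == "2.0"
        and message.get("id") == request_id
        and "serverInfo" in result
        and "capabilities" in result
    )
```

Feed `json.dumps(initialize_frame())` to the proxy's stdin, for example through `subprocess.run(..., input=...)`, and pass the first stdout line to `looks_like_initialize_result`.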
Tools never appear in client: the MCP client did not restart, or the JSON in .mcp.json is malformed (run python -m json.tool .mcp.json to validate).
401 from the endpoint, CLI auth passes: the workspace and the profile don't match. Make sure the profile's host is the same workspace that owns the resource you're trying to access.
403 or empty tool list: Unity Catalog permissions. The user behind the OAuth token needs grants on the underlying catalog, schema, function, index, or Genie space.
Proxy exits immediately: Python error from databricks-sdk. Run the proxy command from a terminal to see the stderr traceback.
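For the malformed-JSON case, a slightly stricter check than python -m json.tool can also catch structural mistakes. The sketch below assumes server entries live under a top-level "mcpServers" object, which is how Claude Code lays out .mcp.json; adjust the key if your client differs. `check_mcp_config` is a hypothetical helper.

```python
import json
from typing import List


def check_mcp_config(text: str) -> List[str]:
    """Return a list of problems found in an .mcp.json document (empty if none)."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg} (line {exc.lineno})"]

    # Assumption: entries sit under "mcpServers", keyed by server name.
    servers = config.get("mcpServers") if isinstance(config, dict) else None
    if not isinstance(servers, dict):
        return ["missing top-level 'mcpServers' object"]

    problems = []
    for name, entry in servers.items():
        if not isinstance(entry, dict) or not isinstance(entry.get("command"), str):
            problems.append(f"{name}: 'command' must be a string")
            continue
        args = entry.get("args", [])
        if not isinstance(args, list) or not all(isinstance(a, str) for a in args):
            problems.append(f"{name}: 'args' must be a list of strings")
    return problems
```

An empty list means the file parsed and every entry has a usable command line. It does not prove that the proxy can authenticate.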
A proxy is the right tool for stdio-only MCP clients connecting to Databricks-hosted endpoints with user-scoped auth. Three situations call for a different pattern.
The managed /api/2.0/mcp/sql endpoint chooses warehouses server-side; a proxy can't influence that selection. If you need per-agent or per-workload pinning — cost attribution, isolation, SLA-bound compute — run a custom MCP server that calls the SDK with an explicit warehouse_id. The rest of this repo is exactly that pattern.
Cursor and ChatGPT support direct streamable-http connections without a stdio bridge. In that case, point the client straight at the managed endpoint with either a static PAT in Authorization headers or the standard MCP OAuth flow (registered as an OAuth app in your Databricks account). See the connect-clients docs for client-specific recipes.
The proxy reuses your interactive CLI token, which is tied to a user identity. For service-principal-attributed traffic (a deployed agent making MCP calls on behalf of an automation, not a person), use OAuth M2M with client_id and client_secret directly in the WorkspaceClient — either inside your own agent code or inside a custom MCP server hosted as a Databricks App.
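For the service-principal case, a minimal sketch of constructing the client with OAuth M2M credentials. It assumes databricks-sdk is installed and that the credentials are supplied through the DATABRICKS_HOST, DATABRICKS_CLIENT_ID, and DATABRICKS_CLIENT_SECRET environment variables; the helper names are hypothetical.

```python
import os
from typing import Dict, Mapping


def m2m_kwargs(env: Mapping[str, str]) -> Dict[str, str]:
    """Collect the WorkspaceClient arguments for OAuth M2M; KeyError if one is missing."""
    return {
        "host": env["DATABRICKS_HOST"],
        "client_id": env["DATABRICKS_CLIENT_ID"],
        "client_secret": env["DATABRICKS_CLIENT_SECRET"],
    }


def service_principal_client(env: Mapping[str, str] = os.environ):
    """Build a WorkspaceClient that authenticates as a service principal."""
    # Imported lazily so m2m_kwargs() can be used without the SDK installed.
    from databricks.sdk import WorkspaceClient

    return WorkspaceClient(**m2m_kwargs(env))
```

Failing early on a missing variable avoids silently falling back to a user's interactive CLI profile, which would attribute traffic to the wrong identity.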
This guide covers connecting a stdio MCP client (Claude Code, Cursor, Windsurf, Claude Desktop) to a Databricks-hosted managed MCP endpoint, with auth resolved through your local Databricks CLI profile. That's the most common path. For the alternatives above, the connect-clients docs have client-specific recipes.