# Databricks Apps (Python)

Skill: `databricks-app-python`
## What You Can Build

You can deploy production Python web apps directly on the Databricks platform — dashboards with Dash, prototypes with Streamlit, ML demos with Gradio, or REST APIs with FastAPI. Each app gets OAuth authentication, resource bindings for SQL warehouses and Lakebase, and access to model serving endpoints. Ask your AI coding assistant for a specific framework and use case, and it will generate the app code, `app.yaml` config, and deployment commands in one pass.
## In Action

“Build a Streamlit app that queries customer order data from a SQL warehouse and displays an interactive revenue dashboard with filters for date range and region.”
```python
import os

import streamlit as st
from databricks import sql
from databricks.sdk.core import Config

st.set_page_config(page_title="Revenue Dashboard", layout="wide")

@st.cache_resource
def get_connection():
    cfg = Config()
    return sql.connect(
        server_hostname=cfg.host,
        http_path=f"/sql/1.0/warehouses/{os.getenv('DATABRICKS_WAREHOUSE_ID')}",
        credentials_provider=lambda: cfg.authenticate,
    )

conn = get_connection()

col1, col2 = st.columns(2)
date_range = col1.date_input("Date range", value=[])
region = col2.selectbox("Region", ["All", "NA", "EMEA", "APAC"])

if len(date_range) != 2:
    st.stop()  # wait until both a start and an end date are selected

query = """
    SELECT order_date, region, SUM(amount) AS revenue
    FROM catalog.schema.orders
    WHERE order_date BETWEEN :start AND :end
"""
params = {"start": date_range[0], "end": date_range[1]}
if region != "All":
    query += " AND region = :region"
    params["region"] = region
query += " GROUP BY order_date, region ORDER BY order_date"

with conn.cursor() as cur:
    cur.execute(query, params)
    df = cur.fetchall_arrow().to_pandas()

st.line_chart(df, x="order_date", y="revenue", color="region")
```

Key decisions:

- `@st.cache_resource` on the connection — Streamlit reruns the script on every interaction. Without caching, you open a new SQL connection per click and exhaust the pool within minutes.
- `Config()` for authentication — auto-detects `DATABRICKS_CLIENT_ID`/`DATABRICKS_CLIENT_SECRET` from the service principal injected at deploy time. Never hardcode tokens.
- `DATABRICKS_WAREHOUSE_ID` from environment — declared via `valueFrom` in `app.yaml` so the warehouse ID isn’t baked into code. Swap warehouses by changing the resource binding, not the app.
- `st.set_page_config()` as the first call — Streamlit throws a hard error if any other `st.*` call runs before it.
- Parameterized SQL — the Databricks SQL connector supports `:name` parameters, which prevents injection and improves warehouse query caching.
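The injection-safety point is easy to demonstrate without a warehouse: Python’s built-in `sqlite3` driver accepts the same `:name` placeholder style, so this runnable stand-in (sample table and values are ours, not from the dashboard above) shows values travelling as bound parameters rather than SQL text:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_date TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (:d, :r, :a)",
    [
        {"d": "2024-01-01", "r": "NA", "a": 100.0},
        {"d": "2024-01-01", "r": "EMEA", "a": 50.0},
        {"d": "2024-01-02", "r": "NA", "a": 75.0},
    ],
)

# The region value is bound as data, never interpolated into the SQL string,
# so input like "NA' OR '1'='1" cannot change the query's structure.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM orders WHERE region = :region GROUP BY region",
    {"region": "NA"},
).fetchall()
# rows == [("NA", 175.0)]
```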
## More Patterns

### FastAPI backend with Lakebase persistence

“Create a FastAPI app that stores and retrieves feature flags in Lakebase, with OAuth for service-to-service auth.”
```python
import os
import uuid

import psycopg
from fastapi import FastAPI
from pydantic import BaseModel
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()
INSTANCE = os.getenv("LAKEBASE_INSTANCE_NAME")

def get_connection():
    instance = w.database.get_database_instance(name=INSTANCE)
    cred = w.database.generate_database_credential(
        request_id=str(uuid.uuid4()),
        instance_names=[INSTANCE],
    )
    return psycopg.connect(
        host=instance.read_write_dns,
        dbname=os.getenv("LAKEBASE_DATABASE_NAME", "postgres"),
        user=w.current_user.me().user_name,
        password=cred.token,
        sslmode="require",
    )

class Flag(BaseModel):
    name: str
    enabled: bool = False

app = FastAPI()

@app.post("/flags")
def create_flag(flag: Flag):
    with get_connection() as conn:
        with conn.cursor() as cur:
            cur.execute(
                "INSERT INTO flags (name, enabled) VALUES (%s, %s) RETURNING id",
                (flag.name, flag.enabled),
            )
            return {"id": cur.fetchone()[0]}
```

Lakebase requires `psycopg` in `requirements.txt` — it is not pre-installed. Tokens expire after 1 hour, so production apps need a refresh loop or fresh credentials per request. For low-traffic APIs, generating a token per request is simpler than managing background refresh.
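If you do opt for reuse rather than per-request generation, the refresh logic can live in a small cache. A minimal sketch — the `CredentialCache` class and its TTL margin are ours; the 1-hour Lakebase token lifetime motivates refreshing early. `generate` would be a zero-arg callable wrapping `w.database.generate_database_credential(...)`:

```python
import time

class CredentialCache:
    """Caches a short-lived credential and regenerates it before expiry."""

    TTL_SECONDS = 50 * 60  # refresh 10 minutes before the 1-hour token expiry

    def __init__(self, generate):
        self._generate = generate  # zero-arg callable returning a token string
        self._token = None
        self._fetched_at = 0.0

    def get(self):
        # Regenerate on first use or once the safety margin has elapsed.
        if self._token is None or time.monotonic() - self._fetched_at > self.TTL_SECONDS:
            self._token = self._generate()
            self._fetched_at = time.monotonic()
        return self._token
```

`get_connection()` would then pass `cache.get()` as the `password` instead of minting a credential every call.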
### Gradio ML demo with model serving

“Build a Gradio chat app that sends user messages to my model serving endpoint and streams responses back.”
```python
import os

import gradio as gr
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.serving import ChatMessage, ChatMessageRole

w = WorkspaceClient()
endpoint = os.getenv("SERVING_ENDPOINT")

def chat(message, history):
    messages = [
        ChatMessage(role=ChatMessageRole.SYSTEM, content="You are a helpful assistant.")
    ]
    for user_msg, bot_msg in history:
        messages.append(ChatMessage(role=ChatMessageRole.USER, content=user_msg))
        messages.append(ChatMessage(role=ChatMessageRole.ASSISTANT, content=bot_msg))
    messages.append(ChatMessage(role=ChatMessageRole.USER, content=message))

    response = w.serving_endpoints.query(
        name=endpoint,
        messages=messages,
    )
    return response.choices[0].message.content

gr.ChatInterface(fn=chat, title="Ask the Model").launch(
    server_name="0.0.0.0",
    server_port=int(os.getenv("DATABRICKS_APP_PORT", "8000")),
)
```

The `WorkspaceClient()` handles service principal auth automatically. Bind the serving endpoint name through `app.yaml` resources so the same app code works across dev and production endpoints.
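A sketch of that binding, following the same `app.yaml` style used for the warehouse example on this page — the `serving_endpoint` resource key, the `CAN_QUERY` permission, and the endpoint name `my-chat-endpoint` are assumptions to adapt to your workspace:

```yaml
resources:
  - name: serving-endpoint
    serving_endpoint:
      name: my-chat-endpoint   # hypothetical endpoint name
      permission: CAN_QUERY    # assumed query-only permission level

env:
  - name: SERVING_ENDPOINT
    valueFrom: serving-endpoint
```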
### app.yaml resource bindings

“Configure my app to connect to a SQL warehouse and a Lakebase instance with proper resource declarations.”

```yaml
command: ["streamlit", "run", "app.py"]

resources:
  - name: sql-warehouse
    sql_warehouse:
      id: ${DATABRICKS_WAREHOUSE_ID}
      permission: CAN_USE
  - name: lakebase-db
    database:
      instance: my-lakebase-instance

env:
  - name: DATABRICKS_WAREHOUSE_ID
    valueFrom: sql-warehouse
  - name: LAKEBASE_INSTANCE_NAME
    value: my-lakebase-instance
```

Every external resource — warehouses, Lakebase instances, serving endpoints, secrets — gets declared here. The `valueFrom` pattern injects resource IDs as environment variables at runtime, keeping code portable across workspaces.
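On the app side, those injected variables can be read once at startup. A minimal sketch — the helper name `resolve_bindings` is ours, not a Databricks API; failing fast makes a missing binding show up in the deploy logs instead of as a confusing runtime error later:

```python
import os

def resolve_bindings(env=None):
    """Collect the env vars injected from app.yaml bindings; fail fast if one is missing."""
    env = os.environ if env is None else env
    required = ("DATABRICKS_WAREHOUSE_ID", "LAKEBASE_INSTANCE_NAME")
    missing = [name for name in required if name not in env]
    if missing:
        raise RuntimeError(f"Missing resource bindings: {missing}")
    return {
        "warehouse_id": env["DATABRICKS_WAREHOUSE_ID"],
        "lakebase_instance": env["LAKEBASE_INSTANCE_NAME"],
    }
```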
## Watch Out For

- Missing `requirements.txt` entries — Dash, Streamlit, Gradio, Flask, and FastAPI are pre-installed, but `psycopg2`, `asyncpg`, `dash-bootstrap-components`, and any other third-party package must be listed explicitly or the app crashes on deploy with an import error.
- Port binding on non-Streamlit frameworks — Streamlit auto-binds to the correct port. Flask, FastAPI, and Gradio must read `DATABRICKS_APP_PORT` (defaults to 8000). Using port 8080 or any other value causes a health check failure, and the app never reaches the “running” state.
- User auth tokens only exist when deployed — the `x-forwarded-access-token` header is injected by the Databricks Apps proxy. Locally, it does not exist. Use the backend toggle pattern (a `USE_MOCK_BACKEND` env var) so development works without deploy-only headers.
- Unstyled Dash layouts — Dash ships with no CSS. Add `dash-bootstrap-components` to `requirements.txt` and pass `external_stylesheets=[dbc.themes.BOOTSTRAP]` to the Dash constructor, or every component renders as unstyled HTML.
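The backend toggle can be as small as one helper. A sketch assuming the `USE_MOCK_BACKEND` env var described above; the `mock-token` value is a placeholder, and the header name is the one the Apps proxy injects:

```python
import os

def get_user_token(headers, use_mock=None):
    """Return the per-user access token, or a fake one during local development."""
    if use_mock is None:
        use_mock = os.getenv("USE_MOCK_BACKEND", "false").lower() == "true"
    if use_mock:
        return "mock-token"  # placeholder so local runs never touch real auth
    # Deployed apps receive this header from the Databricks Apps proxy.
    return headers["x-forwarded-access-token"]
```

Run locally with `USE_MOCK_BACKEND=true`; deployed apps leave the variable unset and read the real header.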