Lakebase OLTP

Skill: databricks-app-appkit

You can add persistent read/write storage to your AppKit app using Lakebase, which exposes a standard PostgreSQL interface. While config/queries/ handles read-only analytics from a SQL warehouse, Lakebase handles the transactional side — user submissions, app state, form data, and any table your app creates and modifies at runtime.

“Using TypeScript, scaffold an AppKit app with both Lakebase and analytics, then set up the connection pool.”

databricks apps init --name my-app --features analytics,lakebase \
  --set "analytics.sql-warehouse.id=<WAREHOUSE_ID>" \
  --set "lakebase.postgres.branch=<BRANCH_NAME>" \
  --set "lakebase.postgres.database=<DATABASE_NAME>" \
  --run none --profile <PROFILE>

import { createLakebasePool } from "@databricks/lakebase";
// Call once at module level -- not inside request handlers
const pool = createLakebasePool();

Key decisions:

  • createLakebasePool() reads connection environment variables automatically — no manual config needed
  • Initialize the pool once at module scope. Creating it per-request causes connection exhaustion.
  • Lakebase uses standard PostgreSQL syntax — not Databricks SQL. Parameters are positional ($1, $2), not named (:param).
  • The service principal cannot access the public schema. Create a custom schema at startup.
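Because the two dialects disagree on parameter style, porting a query between them means rewriting its placeholders. As a rough illustration (a hypothetical helper, not part of AppKit), named parameters can be mapped to PostgreSQL's positional form:

```typescript
// Hypothetical helper: rewrite Databricks-SQL-style named parameters
// (:name) as PostgreSQL positional placeholders ($1, $2) with an
// ordered values array. A sketch only -- it does not handle
// PostgreSQL "::type" casts or quoted strings containing colons.
export function toPositional(
  sql: string,
  params: Record<string, unknown>
): { text: string; values: unknown[] } {
  const values: unknown[] = [];
  const text = sql.replace(/:([a-zA-Z_][a-zA-Z0-9_]*)/g, (_, name) => {
    values.push(params[name]);
    return `$${values.length}`;
  });
  return { text, values };
}

const { text, values } = toPositional(
  "SELECT * FROM app_data.items WHERE status = :status LIMIT :n",
  { status: "pending", n: 10 }
);
// text   -> "SELECT * FROM app_data.items WHERE status = $1 LIMIT $2"
// values -> ["pending", 10]
```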

“Using TypeScript, add list and create endpoints backed by Lakebase.”

import { z } from "zod";

// `t` is the app's tRPC instance; `pool` is the module-level
// Lakebase pool created above
export const appRouter = t.router({
  listItems: t.procedure.query(async () => {
    const { rows } = await pool.query(
      "SELECT * FROM app_data.items ORDER BY created_at DESC LIMIT 100"
    );
    return rows;
  }),
  createItem: t.procedure
    .input(z.object({ name: z.string().min(1) }))
    .mutation(async ({ input }) => {
      const { rows } = await pool.query(
        "INSERT INTO app_data.items (name) VALUES ($1) RETURNING *",
        [input.name]
      );
      return rows[0];
    }),
});

Read operations use .query(), write operations use .mutation(). The Lakebase pool handles connection management and returns results in standard { rows } format.
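Writes can also fail for data reasons, not just bugs. One option (a sketch, assuming a UNIQUE constraint on the column involved, which the schema above does not declare) is to detect PostgreSQL's unique-violation SQLSTATE 23505 and surface a friendly message instead of a generic 500:

```typescript
// Sketch: recognize a Postgres unique-violation error (SQLSTATE 23505)
// so the mutation can report a duplicate instead of an opaque failure.
function isUniqueViolation(err: unknown): err is { code: string } {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { code?: unknown }).code === "23505"
  );
}

// Hypothetical use inside the mutation above:
// try { ...pool.query(INSERT...)... } catch (err) {
//   if (isUniqueViolation(err)) throw new Error("That name already exists");
//   throw err;
// }
```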

“Using TypeScript, create the app’s schema and tables on first deploy.”

async function initializeSchema() {
  await pool.query(`
    CREATE SCHEMA IF NOT EXISTS app_data;
    CREATE TABLE IF NOT EXISTS app_data.items (
      id SERIAL PRIMARY KEY,
      name TEXT NOT NULL,
      status TEXT DEFAULT 'pending',
      created_at TIMESTAMPTZ DEFAULT NOW()
    );
  `);
}

// Call (and await) once during app startup, before serving requests
await initializeSchema();

The IF NOT EXISTS clauses make this idempotent — safe to run on every deploy. Always use a custom schema name (not public) because the Lakebase service principal typically lacks access to the default schema.
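If request handlers can race the startup call, one option (a sketch, not an AppKit API) is to memoize the initialization promise so every caller awaits the same single run:

```typescript
// Sketch: wrap an async initializer so concurrent callers share one
// in-flight run; handlers can `await ensureSchema()` before querying.
function once<T>(fn: () => Promise<T>): () => Promise<T> {
  let p: Promise<T> | undefined;
  return () => (p ??= fn());
}

// const ensureSchema = once(initializeSchema);
```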

“Using TypeScript, build a dashboard that reads from a SQL warehouse and writes to Lakebase.”

// Read analytics data via query file + component
<BarChart queryKey="revenue_by_region" parameters={{}} xKey="region" yKey="revenue" />

// Write user feedback to Lakebase via tRPC
const handleSubmit = async (feedback: string) => {
  await trpc.submitFeedback.mutate({ text: feedback, region: selectedRegion });
};

// Server-side tRPC endpoint
submitFeedback: t.procedure
  .input(z.object({ text: z.string(), region: z.string() }))
  .mutation(async ({ input }) => {
    await pool.query(
      "INSERT INTO app_data.feedback (text, region) VALUES ($1, $2)",
      [input.text, input.region]
    );
    return { success: true };
  }),

Analytics (read-only, aggregated) comes from the SQL warehouse via queryKey. User-generated content (read/write, transactional) goes to Lakebase via tRPC. This separation keeps each backend doing what it does well.

  • Using useAnalyticsQuery for Lakebase data — useAnalyticsQuery routes queries to the SQL warehouse, not Lakebase. For Lakebase reads, use tRPC query procedures that call pool.query().
  • Putting Lakebase SQL in config/queries/ — query files are executed against the SQL warehouse using Databricks SQL dialect. Lakebase uses standard PostgreSQL. Mixing them causes syntax errors.
  • Using named parameters (:param) for Lakebase — Lakebase uses PostgreSQL’s positional parameter syntax ($1, $2). Named parameters like :param_name are a Databricks SQL convention that does not work here.
  • Creating the pool inside request handlers — createLakebasePool() opens a set of connections. Calling it per-request opens new pools on every interaction, exhausting connections within minutes. Initialize once at module scope.