SQL Scripting & Stored Procedures
Skill: databricks-dbsql
What You Can Build
Section titled “What You Can Build”SQL scripting brings procedural logic to Databricks SQL — variables, control flow, exception handling, and stored procedures that persist in Unity Catalog. You can write conditional ETL pipelines, reusable data processing routines, and recursive queries for hierarchy traversal, all without leaving SQL. Available in DBR 16.3+ and Databricks SQL.
In Action
Section titled “In Action”“Write a SQL script that checks for new orders, processes them into a target table with a timestamp, and returns the status and row count.”
BEGIN DECLARE v_count INT; DECLARE v_status STRING DEFAULT 'pending';
SET v_count = (SELECT COUNT(*) FROM catalog.schema.raw_orders WHERE status = 'new');
IF v_count > 0 THEN INSERT INTO catalog.schema.processed_orders SELECT *, current_timestamp() AS processed_at FROM catalog.schema.raw_orders WHERE status = 'new'; SET v_status = 'completed'; ELSE SET v_status = 'skipped'; END IF;
SELECT v_status AS result, v_count AS rows_processed;ENDKey decisions:
- Every SQL script starts with a
BEGIN...ENDcompound statement - Declarations must appear before executable statements — variables first, then conditions, then handlers
SETassigns values from scalar subqueries or expressions- The compound statement must be the sole statement in a notebook cell
- Variables are scoped to their enclosing
BEGIN...ENDblock
More Patterns
Section titled “More Patterns”Create a reusable stored procedure with error handling
Section titled “Create a reusable stored procedure with error handling”“Write a stored procedure that upserts customer data with MERGE, returns the affected row count, and handles failures gracefully.”
CREATE OR REPLACE PROCEDURE catalog.schema.upsert_customers( IN p_source STRING, OUT p_rows_affected INT)LANGUAGE SQLSQL SECURITY INVOKERBEGIN DECLARE EXIT HANDLER FOR SQLEXCEPTION BEGIN SET p_rows_affected = -1; SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT = concat('Upsert failed for source: ', p_source); END;
MERGE INTO catalog.schema.dim_customer AS t USING (SELECT * FROM identifier(p_source)) AS s ON t.customer_id = s.customer_id WHEN MATCHED THEN UPDATE SET * WHEN NOT MATCHED THEN INSERT *;
SET p_rows_affected = (SELECT COUNT(*) FROM identifier(p_source));END;
-- Invoke:CALL catalog.schema.upsert_customers('catalog.schema.staging_customers', ?);Stored procedures persist in Unity Catalog and are invoked with CALL. The EXIT HANDLER FOR SQLEXCEPTION catches any error inside the procedure body, sets the output to -1, and raises a custom error message. Available in DBR 17.0+ (Public Preview).
Traverse an organizational hierarchy with recursive CTEs
Section titled “Traverse an organizational hierarchy with recursive CTEs”“Write a SQL query that walks an employee hierarchy from the CEO down, tracking depth and the full path.”
WITH RECURSIVE org_chart AS ( SELECT employee_id, name, manager_id, 0 AS depth, ARRAY(name) AS path FROM catalog.schema.employees WHERE manager_id IS NULL
UNION ALL
SELECT e.employee_id, e.name, e.manager_id, o.depth + 1, array_append(o.path, e.name) FROM catalog.schema.employees e JOIN org_chart o ON e.manager_id = o.employee_id WHERE o.depth < 10)SELECT * FROM org_chart ORDER BY depth, name;Recursive CTEs need an anchor query (seed rows) joined to a recursive step via UNION ALL. The WHERE o.depth < 10 acts as a safety valve. Default recursion depth is 100 (override with MAX RECURSION LEVEL), and the default row limit is 1,000,000. Available in DBR 17.0+ and DBSQL 2025.20+.
Execute atomic multi-statement transactions
Section titled “Execute atomic multi-statement transactions”“Write a SQL script that inserts a customer, their first order, and an audit log entry as a single atomic operation.”
BEGIN ATOMIC INSERT INTO customers (id, name) VALUES (1, 'Alice'); INSERT INTO orders (id, customer_id, amount) VALUES (1, 1, 250.00); INSERT INTO audit_log (action, ts) VALUES ('new_customer_order', current_timestamp());END;If any statement fails, all changes roll back. Tables used in BEGIN ATOMIC blocks must have the catalogManaged table feature enabled. This is a Preview feature.
Build dynamic SQL for parameterized operations
Section titled “Build dynamic SQL for parameterized operations”“Write a SQL script that uses EXECUTE IMMEDIATE to run a parameterized query and capture the result.”
BEGIN DECLARE total INT; EXECUTE IMMEDIATE 'SELECT SUM(c1) FROM VALUES(:a), (:b) AS t(c1)' INTO total USING (5 AS a, 6 AS b); VALUES (total); -- Returns 11END;EXECUTE IMMEDIATE constructs SQL from strings at runtime. Use named parameters (:param) for clarity, and INTO to capture single-row results into declared variables. Available since DBR 14.3.
Watch Out For
Section titled “Watch Out For”- Mixing declarations with executable statements — declarations must come first in a
BEGIN...ENDblock. Variable declarations, then condition declarations, then handler declarations, then executable statements. Breaking this order causes syntax errors. - Forgetting
WHEREguards on recursive CTEs — without a depth limit or termination condition, recursive queries hit the 100-level default and error out. Always add aWHERE depth < Nclause. - Using
BEGIN ATOMICwithout thecatalogManagedtable feature — existing tables can’t be upgraded in place. You need to recreate them withTBLPROPERTIES ('delta.feature.catalogManaged' = 'supported'). - Expecting
CREATE OR REPLACE TEMP TABLE— it’s not supported yet. Drop the temp table first, then recreate it.