# Managed Iceberg Tables
Skill: databricks-iceberg
## What You Can Build

Managed Iceberg tables are native Apache Iceberg tables stored in Unity Catalog. You get full read/write access from Databricks and from any external engine via the Iceberg REST Catalog (IRC) endpoint. Use them when you need a true open table format that works across platforms — Spark, PyIceberg, Snowflake, Trino, Flink — without being limited to Delta-only features.
## In Action

“Write SQL to create a managed Iceberg table for order events, partitioned by order_date, and load data from a staging table.”
```sql
CREATE TABLE analytics.gold.order_events
USING ICEBERG
PARTITIONED BY (order_date)
AS SELECT * FROM analytics.staging.raw_orders;
```

Key decisions:

- `USING ICEBERG` makes this a native Iceberg table, not Delta — external engines can read and write via IRC without conversion.
- `PARTITIONED BY` over `CLUSTER BY` — both produce Liquid Clustering under the hood, but `PARTITIONED BY` auto-disables deletion vectors and row tracking, and works on any engine (EMR, OSS Spark, Trino), not just Databricks.
- Only plain column references work in `PARTITIONED BY` — expression transforms like `bucket()`, `years()`, and `months()` are not supported.
- If you must use `CLUSTER BY` on Iceberg v2, manually set `'delta.enableDeletionVectors' = false` and `'delta.enableRowTracking' = false` in TBLPROPERTIES, or you get `MANAGED_ICEBERG_ATTEMPTED_TO_ENABLE_CLUSTERING_WITHOUT_DISABLING_DVS_OR_ROW_TRACKING`.
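The last bullet's workaround, sketched as a full statement (the table name and columns here are illustrative, not from the examples above):

```sql
-- Iceberg v2 + CLUSTER BY: disable deletion vectors and row tracking explicitly,
-- otherwise table creation fails with the error named above.
CREATE TABLE analytics.gold.order_events_clustered (
  event_id   BIGINT,
  order_date DATE
)
USING ICEBERG
CLUSTER BY (order_date)
TBLPROPERTIES (
  'delta.enableDeletionVectors' = 'false',
  'delta.enableRowTracking'     = 'false'
);
```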
## More Patterns

### Upsert with MERGE

“Write SQL to upsert incoming event data into a managed Iceberg table, matching on event_id.”

```sql
MERGE INTO analytics.gold.order_events AS target
USING analytics.staging.new_events AS source
ON target.event_id = source.event_id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *;
```

Managed Iceberg tables support all standard DML — INSERT, UPDATE, DELETE, and MERGE. This is one of their advantages over UniForm, where the table is read-only from external engines.
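A sketch of the other DML statements on the same table — the `status` column and the `cancellations` staging table are hypothetical, added only for illustration:

```sql
-- Hypothetical: flag cancelled orders (assumes a `status` column exists).
UPDATE analytics.gold.order_events
SET status = 'cancelled'
WHERE event_id IN (SELECT event_id FROM analytics.staging.cancellations);

-- Hypothetical: purge old events.
DELETE FROM analytics.gold.order_events
WHERE order_date < DATE '2024-01-01';
```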
### Create an Iceberg v3 table with VARIANT

“Write SQL to create an Iceberg v3 table that stores semi-structured event payloads as VARIANT, clustered by event_date.”

```sql
CREATE TABLE analytics.gold.events_v3 (
  event_id   BIGINT,
  event_date DATE,
  payload    VARIANT
)
USING ICEBERG
TBLPROPERTIES ('format-version' = '3')
CLUSTER BY (event_date);
```

Iceberg v3 (Beta, DBR 17.3+) adds deletion vectors, the VARIANT type, and row lineage. On v3, CLUSTER BY works without manually disabling DVs or row tracking. The tradeoff: external engines must use Iceberg library 1.9.0+, and you cannot downgrade back to v2 once upgraded.
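Once created, the VARIANT column can be loaded and queried with Databricks' semi-structured SQL syntax — a hedged sketch; the JSON shape and field names are invented for illustration:

```sql
-- Load a JSON string into the VARIANT column (payload shape is hypothetical).
INSERT INTO analytics.gold.events_v3
VALUES (1, DATE '2025-06-01', PARSE_JSON('{"user": {"id": 42}, "action": "click"}'));

-- Extract fields with the `:` path syntax, casting to a concrete type as needed.
SELECT
  event_id,
  payload:user.id::BIGINT  AS user_id,
  payload:action::STRING   AS action
FROM analytics.gold.events_v3;
```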
### Convert a Delta table to managed Iceberg

“Write SQL to convert an existing Delta table to a managed Iceberg table using DEEP CLONE.”

```sql
CREATE TABLE analytics.gold.events_iceberg
USING ICEBERG
DEEP CLONE analytics.gold.events_delta;
```

DEEP CLONE copies data files into a new managed Iceberg table. SHALLOW CLONE is not supported for Iceberg. For foreign Iceberg tables (from Snowflake or another catalog), use CTAS instead:

```sql
CREATE TABLE analytics.gold.events_managed
USING ICEBERG
PARTITIONED BY (event_date)
AS SELECT * FROM foreign_catalog.schema.events;
```

### Query historical snapshots
“Write SQL to read the order_events table as it existed at a specific timestamp.”

```sql
SELECT * FROM analytics.gold.order_events TIMESTAMP AS OF '2025-06-01T00:00:00Z';

-- Or by snapshot ID
SELECT * FROM analytics.gold.order_events VERSION AS OF 1234567890;
```

Time travel works the same as on Delta. External engines can also query the snapshot history via the `table_name.snapshots` metadata table.
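From an external Iceberg-aware engine (for example, Spark with the catalog mounted), that snapshot history looks roughly like this — a sketch assuming the standard columns of Iceberg's `snapshots` metadata table:

```sql
-- Standard Iceberg metadata-table columns include
-- committed_at, snapshot_id, parent_id, operation, and summary.
SELECT committed_at, snapshot_id, operation
FROM analytics.gold.order_events.snapshots
ORDER BY committed_at DESC;
```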
## Watch Out For

- Never install an Iceberg library into DBR — Databricks Runtime includes built-in Iceberg support. Adding a JAR causes version conflicts and class-loading errors.
- Never set `write.metadata.path` or `write.metadata.previous-versions-max` — Databricks manages metadata locations automatically. Overriding these causes metadata corruption.
- Default compression is zstd, not snappy — older Iceberg readers that do not support zstd will fail silently. Check reader compatibility or set `write.parquet.compression-codec` to `snappy` in TBLPROPERTIES.
- No Change Data Feed (CDF) on Iceberg tables — CDF is Delta-only. If you need CDF for downstream streaming tables, use Delta + UniForm instead.
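The compression override from the third point, sketched as a statement against the table from the earlier examples:

```sql
-- Switch Parquet compression from the zstd default to snappy
-- for maximum compatibility with older Iceberg readers.
ALTER TABLE analytics.gold.order_events
SET TBLPROPERTIES ('write.parquet.compression-codec' = 'snappy');
```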