# Managed Iceberg Tables

Skill: databricks-iceberg
## What You Can Build

Managed Iceberg tables give you native Apache Iceberg storage inside Unity Catalog. You get full DML support in Databricks and read/write access from external engines via the Iceberg REST Catalog — no format conversion, no metadata sync delays. When you need a single table that Spark, PyIceberg, Snowflake, and Flink can all hit directly, this is the path.
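Before an external engine can hit the REST Catalog, the schema has to be opened up for external access. A minimal sketch, assuming current Unity Catalog privilege syntax and a hypothetical `data-eng-group` principal:

```sql
-- Assumption: external Iceberg REST reads require the
-- EXTERNAL USE SCHEMA privilege on the containing schema
GRANT EXTERNAL USE SCHEMA ON SCHEMA analytics.sales
  TO `data-eng-group`;
```

External clients then authenticate against the workspace's Iceberg REST endpoint with their own credentials; the grant only controls whether UC will serve the table metadata to them.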
## In Action

"Using SQL, create a managed Iceberg v2 table for an orders dataset with Liquid Clustering on order_date. Use PARTITIONED BY for cross-platform compatibility."

```sql
CREATE TABLE analytics.sales.orders (
  order_id    BIGINT,
  customer_id BIGINT,
  amount      DECIMAL(10,2),
  region      STRING,
  order_date  DATE
)
USING ICEBERG
PARTITIONED BY (order_date);
```

Key decisions:
- `USING ICEBERG` makes this a native Iceberg table, not Delta with UniForm.
- `PARTITIONED BY` auto-handles deletion vector and row-tracking properties — no manual TBLPROPERTIES needed on v2 or v3.
- Both `PARTITIONED BY` and `CLUSTER BY` produce the same Iceberg metadata for external engines — UC maps both to Liquid Clustering keys exposed as Iceberg partition fields.
- `PARTITIONED BY` is standard Iceberg DDL that any engine can create; `CLUSTER BY` is Databricks-only.
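Since both clauses map to the same clustering metadata, the Databricks-native spelling of the same table would look like this (shown on format-version 3, where CLUSTER BY needs no extra properties):

```sql
-- Databricks-only variant: CLUSTER BY yields the same Iceberg
-- partition metadata for external readers as PARTITIONED BY
CREATE TABLE analytics.sales.orders (
  order_id    BIGINT,
  customer_id BIGINT,
  amount      DECIMAL(10,2),
  region      STRING,
  order_date  DATE
)
USING ICEBERG
TBLPROPERTIES ('format-version' = '3')
CLUSTER BY (order_date);
```

External engines see `order_date` as an Iceberg partition field either way; the difference is only which engine can issue the DDL.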
## More Patterns

### Create from existing data with CTAS

"Using SQL, create a managed Iceberg table from raw events, clustered on event_date, targeting Iceberg v3."

```sql
CREATE TABLE analytics.bronze.events_archive
USING ICEBERG
TBLPROPERTIES ('format-version' = '3')
CLUSTER BY (event_date)
AS SELECT * FROM analytics.staging.raw_events
WHERE event_date < '2025-01-01';
```

On v3, CLUSTER BY works without extra TBLPROPERTIES because deletion vectors and row tracking are handled natively. On v2, you'd need to explicitly set `'delta.enableDeletionVectors' = false` and `'delta.enableRowTracking' = false`.
### Upsert with MERGE

"Using SQL, write a MERGE statement that upserts staged order data into a managed Iceberg table."

```sql
MERGE INTO analytics.sales.orders AS target
USING analytics.staging.new_orders AS source
ON target.order_id = source.order_id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *;
```

Managed Iceberg tables support full MERGE, INSERT, UPDATE, and DELETE. The one DML gap: Change Data Feed (CDF) is Delta-only. If you need CDF, use Delta with UniForm instead.
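If CDF is a hard requirement, the Delta-with-UniForm alternative looks roughly like this — a sketch assuming current Databricks UniForm table properties; verify the exact property names against your DBR release:

```sql
-- Assumption: Delta table with CDF enabled plus UniForm,
-- so external engines can still read it as Iceberg
CREATE TABLE analytics.sales.orders_cdf (
  order_id   BIGINT,
  amount     DECIMAL(10,2),
  order_date DATE
)
USING DELTA
TBLPROPERTIES (
  'delta.enableChangeDataFeed' = 'true',
  'delta.enableIcebergCompatV2' = 'true',
  'delta.universalFormat.enabledFormats' = 'iceberg'
);
```

The trade-off: Delta remains the source of truth and Iceberg metadata is generated asynchronously, so external Iceberg readers may lag slightly behind the latest Delta commit — unlike native managed Iceberg tables.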
### Time travel queries

"Using SQL, query an Iceberg table as it existed at a specific point in time."

```sql
-- By timestamp
SELECT * FROM analytics.sales.orders
TIMESTAMP AS OF '2025-06-01T00:00:00Z';

-- By snapshot ID
SELECT * FROM analytics.sales.orders
VERSION AS OF 1234567890;
```

Time travel works against Iceberg snapshots. External engines can also inspect snapshot history via `SELECT * FROM analytics.sales.orders.snapshots`.
### Convert a Delta table to managed Iceberg

"Using SQL, deep-clone an existing Delta table into a managed Iceberg table with Liquid Clustering."

```sql
CREATE TABLE analytics.sales.orders_iceberg
USING ICEBERG
TBLPROPERTIES ('format-version' = '3')
CLUSTER BY (order_date)
DEEP CLONE analytics.sales.orders_delta;
```

DEEP CLONE copies all data into native Iceberg format. SHALLOW CLONE is not supported for Iceberg tables — always use DEEP CLONE or CTAS.
## Watch Out For

- Expression transforms not supported — `bucket()`, `years()`, `months()`, `days()`, and `hours()` don't work with `PARTITIONED BY` on managed Iceberg tables. Only plain column references are allowed.
- `CLUSTER BY` on v2 requires manual TBLPROPERTIES — forgetting `'delta.enableDeletionVectors' = false` and `'delta.enableRowTracking' = false` produces a `MANAGED_ICEBERG_ATTEMPTED_TO_ENABLE_CLUSTERING_WITHOUT_DISABLING_DVS_OR_ROW_TRACKING` error. Use `PARTITIONED BY` to avoid this entirely.
- No CDF on Iceberg — Change Data Feed is a Delta feature. If downstream consumers depend on CDF, keep the table as Delta and add UniForm for external Iceberg reads.
- Do not install an Iceberg library into DBR — Databricks Runtime includes built-in Iceberg support. Adding an Iceberg JAR causes version conflicts.
- v3 is irreversible — once you upgrade a table to format-version 3, you cannot downgrade. External engines must use Iceberg library 1.9.0+ to read v3 tables.
- Enable Predictive Optimization — it’s recommended but not auto-enabled for managed Iceberg tables. Turn it on at the catalog or schema level to get automatic compaction, vacuum, and statistics collection.
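If you genuinely need `CLUSTER BY` on a v2 table, the workaround from the second bullet looks like this — a sketch using the property names listed above:

```sql
-- v2 + CLUSTER BY: both Delta-side features must be disabled
-- explicitly, or table creation fails with
-- MANAGED_ICEBERG_ATTEMPTED_TO_ENABLE_CLUSTERING_WITHOUT_DISABLING_DVS_OR_ROW_TRACKING
CREATE TABLE analytics.sales.orders_v2 (
  order_id   BIGINT,
  order_date DATE
)
USING ICEBERG
TBLPROPERTIES (
  'format-version' = '2',
  'delta.enableDeletionVectors' = false,
  'delta.enableRowTracking' = false
)
CLUSTER BY (order_date);
```

Preferring `PARTITIONED BY` (or upgrading to v3) sidesteps the ceremony entirely.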