UniForm & Compatibility Mode

Skill: databricks-iceberg

UniForm lets you keep writing Delta internally while external engines read the same table as Iceberg. No migration, no dual pipelines. Databricks auto-generates Iceberg metadata after each Delta transaction, so Snowflake, PyIceberg, or Trino can query via the IRC endpoint. Compatibility Mode extends this to streaming tables and materialized views created in Spark Declarative Pipelines.

“Write SQL to enable UniForm on an existing Delta customer table so Snowflake can read it as Iceberg.”

ALTER TABLE analytics.gold.customers
SET TBLPROPERTIES (
  'delta.columnMapping.mode' = 'name',
  'delta.enableIcebergCompatV2' = 'true',
  'delta.universalFormat.enabledFormats' = 'iceberg'
);

Key decisions:

  • Column mapping must be name mode — UniForm requires it. If the table uses id mode, migrate to name first.
  • Deletion vectors must be disabled — UniForm is incompatible with DVs. If they are currently enabled, disable and purge before enabling UniForm.
  • Iceberg metadata is generated asynchronously — there is a brief delay (typically seconds, occasionally minutes for large transactions) before external engines see the latest data.
  • External reads are read-only — the underlying format is still Delta. External engines cannot write back through the IRC endpoint.
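The async gap in the third bullet can bite pipelines that write in Databricks and immediately hand off to an external reader. A minimal guard sketch, assuming you can obtain the latest Delta commit version and the version already reflected in the Iceberg metadata; the two getter callables here are hypothetical placeholders (you might back them with DESCRIBE HISTORY and the Iceberg snapshot summary), not a Databricks API:

```python
import time

def wait_for_iceberg_sync(get_delta_version, get_iceberg_version,
                          timeout_s=120.0, poll_s=2.0, sleep=time.sleep):
    """Poll until the async Iceberg metadata catches up to the latest
    Delta commit, or raise TimeoutError after timeout_s seconds."""
    target = get_delta_version()      # snapshot the commit we need to see
    waited = 0.0
    while waited <= timeout_s:
        if get_iceberg_version() >= target:
            return target             # external engines now see this commit
        sleep(poll_s)
        waited += poll_s
    raise TimeoutError(
        f"Iceberg metadata still behind Delta version {target} "
        f"after {timeout_s}s")
```

The `sleep` parameter is injectable only so the loop is easy to test; in a real pipeline the defaults suffice.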

“Write SQL to create a new Delta table with UniForm enabled from the start.”

CREATE TABLE analytics.gold.customers (
  customer_id BIGINT,
  name STRING,
  region STRING,
  updated_at TIMESTAMP
)
TBLPROPERTIES (
  'delta.columnMapping.mode' = 'name',
  'delta.enableIcebergCompatV2' = 'true',
  'delta.universalFormat.enabledFormats' = 'iceberg'
);

Setting the properties at creation time avoids the disable-DVs-then-purge dance required on existing tables.

Fix an existing table that has deletion vectors enabled

“Write SQL to disable deletion vectors on an existing table, purge them, then enable UniForm.”

-- Step 1: Disable deletion vectors
ALTER TABLE analytics.gold.customers
SET TBLPROPERTIES ('delta.enableDeletionVectors' = 'false');

-- Step 2: Rewrite files to remove existing DVs
REORG TABLE analytics.gold.customers APPLY (PURGE);

-- Step 3: Enable UniForm
ALTER TABLE analytics.gold.customers
SET TBLPROPERTIES (
  'delta.columnMapping.mode' = 'name',
  'delta.enableIcebergCompatV2' = 'true',
  'delta.universalFormat.enabledFormats' = 'iceberg'
);

The REORG ... APPLY (PURGE) step is critical — it physically rewrites data files to remove deletion vector references. Without it, the Iceberg metadata generation will fail.
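If you script this sequence (say, across many tables), the ordering is the part worth encoding. A minimal sketch, assuming only a SparkSession-like object with a `.sql()` method; the helper name is ours, not a Databricks API:

```python
def enable_uniform_with_dv_purge(spark, table):
    """Run disable-DVs -> REORG purge -> enable-UniForm, in that order.

    `spark` is anything with a .sql(str) method (e.g. a SparkSession).
    Returns the statements executed, for logging.
    """
    statements = [
        # 1. Stop new deletion vectors from being written.
        f"ALTER TABLE {table} SET TBLPROPERTIES "
        "('delta.enableDeletionVectors' = 'false')",
        # 2. Physically rewrite files that still reference DVs.
        f"REORG TABLE {table} APPLY (PURGE)",
        # 3. Only now is it safe to enable UniForm.
        f"ALTER TABLE {table} SET TBLPROPERTIES ("
        "'delta.columnMapping.mode' = 'name', "
        "'delta.enableIcebergCompatV2' = 'true', "
        "'delta.universalFormat.enabledFormats' = 'iceberg')",
    ]
    for stmt in statements:
        spark.sql(stmt)
    return statements
```

Keeping the statements in one list makes it hard to accidentally reorder step 2 and step 3 when the script is later edited.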

Enable Compatibility Mode on a streaming table

“Write SQL to create a streaming table in an SDP pipeline that external engines can read as Iceberg.”

CREATE OR REFRESH STREAMING TABLE my_events
TBLPROPERTIES (
  'delta.universalFormat.enabledFormats' = 'compatibility',
  'delta.universalFormat.compatibility.location' = 's3://my-bucket/iceberg-compat/my_events/'
)
AS SELECT * FROM STREAM read_files('/Volumes/catalog/schema/raw/events/');

Compatibility Mode is the only way to expose streaming tables and materialized views as Iceberg. It writes a separate copy of the data to the external location in Iceberg-compatible format, so factor in additional storage costs. The Python SDP equivalent:

from pyspark import pipelines as dp

@dp.table(
    name="my_events",
    table_properties={
        "delta.universalFormat.enabledFormats": "compatibility",
        "delta.universalFormat.compatibility.location": "s3://my-bucket/iceberg-compat/my_events/",
    },
)
def my_events():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/Volumes/catalog/schema/raw/events/")
    )

Control refresh frequency for Compatibility Mode

“Write SQL to create a streaming table with Compatibility Mode that refreshes Iceberg metadata after every commit.”

CREATE OR REFRESH STREAMING TABLE my_events
TBLPROPERTIES (
  'delta.universalFormat.enabledFormats' = 'compatibility',
  'delta.universalFormat.compatibility.location' = 's3://my-bucket/iceberg-compat/my_events/',
  'delta.universalFormat.compatibility.targetRefreshInterval' = '0 MINUTES'
)
AS SELECT * FROM STREAM read_files('/Volumes/catalog/schema/raw/events/');

0 MINUTES checks for changes after every commit — this is the default for SDP tables. For non-SDP tables the default is 1 HOUR, and sub-hourly values such as 30 MINUTES will not actually refresh faster than hourly.
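The interaction between 0 MINUTES and the hourly floor is easy to misread, so here is a toy model of our reading of the semantics above (this is illustrative Python, not a Databricks API, and the clamping behavior is our interpretation of the prose):

```python
HOURLY_FLOOR = 60  # minutes; sub-hourly intervals are floored to this

def effective_refresh_minutes(value: str) -> int:
    """Model the effective targetRefreshInterval in minutes.

    '0 MINUTES' is a special case meaning per-commit (returned as 0);
    any other value below 1 HOUR behaves like 1 HOUR.
    """
    amount, unit = value.strip().split()
    minutes = int(amount) * (60 if unit.upper().startswith("HOUR") else 1)
    if minutes == 0:
        return 0                      # per-commit refresh
    return max(minutes, HOURLY_FLOOR) # 30 MINUTES acts like 1 HOUR
```

In other words, per this model, 30 MINUTES buys you nothing over 1 HOUR, while 0 MINUTES is the only way to go sub-hourly.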

  • Compatibility Mode doubles storage cost — it writes a full copy of the data to the external location, not a pointer. Plan for this on large tables.
  • The compatibility.location must be a pre-configured external location — if the path is not registered as an external location in Unity Catalog, the metadata generation fails silently.
  • UniForm async delay can surprise pipelines — if an external engine reads immediately after a Databricks write, it may see stale data. Use DESCRIBE EXTENDED table_name to check Iceberg metadata generation status.
  • Disabling UniForm is a one-liner, but re-enabling requires the DV purge cycle again — UNSET TBLPROPERTIES ('delta.universalFormat.enabledFormats') removes UniForm, but if DVs get re-enabled in the meantime, you must purge before turning UniForm back on.
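The first two requirements in this list (and in "Key decisions" above) are checkable before you flip any property. A minimal pre-flight sketch over a table-properties dict, such as one assembled from SHOW TBLPROPERTIES output; the function name and return shape are ours:

```python
def uniform_preflight(props: dict) -> list:
    """Return the blockers that would prevent enabling UniForm on a
    Delta table, per the requirements described above."""
    blockers = []
    # UniForm requires name-based column mapping.
    if props.get("delta.columnMapping.mode") != "name":
        blockers.append("column mapping must be 'name' mode")
    # UniForm is incompatible with deletion vectors.
    if props.get("delta.enableDeletionVectors", "false").lower() == "true":
        blockers.append("deletion vectors must be disabled and purged "
                        "(REORG ... APPLY (PURGE))")
    return blockers
```

An empty list means the table is at least property-compatible; it does not prove that previously written DVs have been purged, which only the REORG step guarantees.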