Skip to content

Creating & Managing Genie Spaces

Skill: databricks-genie

A Genie Space connects Unity Catalog tables to a natural language interface — users ask questions in plain English and get SQL-generated answers. You can ask your AI coding assistant to inspect your table schemas, create a space with curated sample questions, and test it via the Conversation API. When your space is production-ready, export it and migrate across workspaces with catalog remapping for dev/staging/prod promotion.

“Create a Genie Space for our retail sales data using the customers, orders, and products tables in my_catalog.sales, with sample questions that reference actual column names.”

# Step 1: Inspect tables to understand the data
get_table_details(
catalog="my_catalog",
schema="sales",
table_stat_level="SIMPLE"
)
# Step 2: Create the space with curated context
create_or_update_genie(
display_name="Sales Analytics",
table_identifiers=[
"my_catalog.sales.customers",
"my_catalog.sales.orders",
"my_catalog.sales.products"
],
description="""Explore retail sales data with three related tables:
- customers: demographics including region, segment, and signup date
- orders: transaction history with order_date, total_amount, and status
- products: catalog with category, price, and inventory
Tables join on customer_id and product_id.""",
sample_questions=[
"What were total sales last month?",
"Who are our top 10 customers by total_amount?",
"How many orders were placed in Q4 by region?",
"What's the average order value by customer segment?",
"Which product categories have the highest revenue?",
"Show me customers who haven't ordered in 90 days"
]
)

Key decisions:

  • Always inspect tables firstget_table_details returns column names, types, sample values, and cardinality. Sample questions that reference actual column names produce dramatically better SQL generation.
  • Descriptions explain relationships — Genie uses the description to understand how tables join. Without it, multi-table questions produce wrong joins or errors.
  • Silver/gold tables only — bronze tables have quality issues that confuse Genie. Use cleaned, validated tables for best results.
  • Warehouse auto-detection — when warehouse_id is omitted, the tool picks the smallest running warehouse. Specify it explicitly for production spaces.

“Update my Sales Analytics Genie Space with a new table and refreshed sample questions.”

create_or_update_genie(
display_name="Sales Analytics",
space_id="01abc123...",
table_identifiers=[
"my_catalog.sales.customers",
"my_catalog.sales.orders",
"my_catalog.sales.products",
"my_catalog.sales.returns"
],
sample_questions=[
"What were total sales last month?",
"Who are our top 10 customers by revenue?",
"What is the return rate by product category?"
],
description="Updated description with returns data.",
)

create_or_update_genie handles both create and update. Pass space_id to target a specific space, or omit it to match by display_name. If no match is found, a new space is created.

“Export my production Genie Space and create a dev copy in the same workspace.”

# Export the source space
source = migrate_genie(type="export", space_id="01abc123...")
# Clone as a new space
migrate_genie(
type="import",
warehouse_id=source["warehouse_id"],
serialized_space=source["serialized_space"],
title="Sales Analytics (Dev Copy)",
description=source["description"],
)

The export captures everything — tables, sample questions, certified SQL queries, join specs, filter snippets, and benchmarks. The import creates a brand new space with a new space_id.

Migrate across workspaces with catalog remapping

Section titled “Migrate across workspaces with catalog remapping”

“Migrate my Sales Analytics Genie Space from prod to dev, remapping the catalog name from prod_catalog to dev_catalog.”

# Step 1: Export from source workspace
exported = migrate_genie(type="export", space_id="01abc123...")
# Step 2: Remap catalog references in the serialized config
modified = exported["serialized_space"].replace(
"prod_catalog",
"dev_catalog"
)
# Step 3: Import to target workspace
migrate_genie(
type="import",
warehouse_id="target_warehouse_id",
serialized_space=modified,
title=exported["title"],
description=exported["description"]
)

Catalog names appear everywhere inside serialized_space — table identifiers, SQL FROM clauses, join specs, and filter snippets. A single .replace() on the whole string covers all occurrences. Each MCP server is workspace-scoped, so for cross-workspace migration, configure one server entry per workspace profile in your IDE’s MCP config and export from the source server, import via the target.

“Migrate three Genie Spaces from prod to dev in one operation.”

space_ids = ["id1", "id2", "id3"]
results = []
for sid in space_ids:
exported = migrate_genie(type="export", space_id=sid)
modified = exported["serialized_space"].replace("prod_catalog", "dev_catalog")
result = migrate_genie(
type="import",
warehouse_id="dev_warehouse_id",
serialized_space=modified,
title=exported["title"],
description=exported["description"]
)
results.append(result["space_id"])

After migration, update databricks.yml with the new dev space_id values under the dev target’s genie_space_ids variable.

  • Sample questions without actual column names — generic questions like “show me the data” teach Genie nothing. Questions that reference real columns (“total_amount by region”) dramatically improve SQL generation accuracy.
  • Missing table comments and descriptive column names — Genie uses Unity Catalog metadata for query generation. customer_name works better than cust_nm. Add COMMENT ON TABLE and COMMENT ON COLUMN for best results.
  • migrate_genie(type="export") returns empty serialized_space — this means the caller lacks CAN EDIT permission on the source space. Check space permissions.
  • serialized_space with multi-line SQL causes JSON parse errors on import — SQL arrays inside the config may contain \n escape sequences. Flatten to single-line strings before importing to avoid double-escaping issues.