Genie Conversations

Skill: databricks-genie

The Genie Conversation API lets you ask a curated Genie Space natural-language questions and get back SQL, results, and column metadata programmatically. Use it when you want your AI coding assistant to query business data through a space that already has curated logic, certified queries, and domain-specific instructions. Genie handles the SQL generation; you handle the conversation flow.

“Ask my Sales Genie what total revenue was last month, then follow up with a regional breakdown.”

# First question -- starts a new conversation
result = ask_genie(
    space_id="01abc123...",
    question="What were total sales last month?"
)
# result["status"] == "COMPLETED"
# result["sql"] contains the generated query
# result["data"] contains [[125430.50]]

# Follow-up -- reuse conversation_id for context
followup = ask_genie(
    space_id="01abc123...",
    question="Break that down by region",
    conversation_id=result["conversation_id"]
)

Key decisions:

  • conversation_id carries context — passing it from a previous response tells Genie that “that” refers to “total sales last month.” Omit it to start a fresh conversation on a different topic.
  • Use Genie when curated logic exists — if the space has certified queries defining “active customer” or “churn rate,” Genie applies those rules. For ad-hoc queries where you already know the SQL, use direct SQL execution instead.
  • The response includes the generated SQL — you can inspect, modify, or log it. The sql field shows exactly what Genie ran against the warehouse.
  • Genie may ask for clarification — if the question is ambiguous, text_response contains a follow-up question instead of results. Rephrase with more specifics.
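Because the response exposes the generated SQL, one option is to keep an audit trail that pairs each question with the SQL Genie ran. The following is a minimal sketch, assuming the result is a dict with the status, sql, and conversation_id keys shown above. The audit_entry helper and the log list are hypothetical names for illustration, not part of the Genie API.

```python
from datetime import datetime, timezone


def audit_entry(question, result):
    """Build a log record pairing a question with the SQL Genie generated.

    Assumes `result` is a dict shaped like the ask_genie responses above.
    `audit_entry` is a hypothetical helper, not part of the Genie API.
    """
    return {
        "asked_at": datetime.now(timezone.utc).isoformat(),
        "question": question,
        "conversation_id": result.get("conversation_id"),
        "status": result.get("status"),
        # sql may be absent when Genie asks for clarification or fails
        "sql": result.get("sql"),
    }


# Example with a hand-written result dict (no API call is made here)
sample = {
    "status": "COMPLETED",
    "conversation_id": "conv-1",
    "sql": "SELECT SUM(amount) FROM sales",
}
log = [audit_entry("What were total sales last month?", sample)]
```

Using .get() rather than indexing keeps the helper usable for failed or clarification responses, where some keys may be missing.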

“Write Python to handle success, failure, and timeout responses from the Genie Conversation API.”

result = ask_genie(space_id="01abc123...", question="Who are our top 10 customers?")
if result["status"] == "COMPLETED":
    print(f"SQL: {result['sql']}")
    print(f"Rows: {result['row_count']}")
    for row in result["data"]:
        print(row)
elif result["status"] == "FAILED":
    print(f"Error: {result['error']}")
    # Rephrase the question or check space configuration
elif result["status"] == "TIMEOUT":
    print("Query took too long -- simplify the question or increase timeout")

The status field is one of COMPLETED, FAILED, CANCELLED, or TIMEOUT. Always check it before accessing data or sql. Failed responses often mean the question is outside the scope of the space’s tables — not that something is broken.
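One way to enforce the "check status first" rule is to route every response through a small gate that either returns rows or raises. This is a sketch under the assumption that responses are dicts with the status, data, and error keys used above. GenieQueryError and rows_or_raise are hypothetical names, not part of the API.

```python
class GenieQueryError(Exception):
    """Raised when a Genie response did not complete (hypothetical name)."""


def rows_or_raise(result):
    """Return the data rows only after confirming status is COMPLETED."""
    status = result.get("status")
    if status == "COMPLETED":
        return result.get("data", [])
    if status == "FAILED":
        raise GenieQueryError(f"Genie failed: {result.get('error')}")
    if status in ("CANCELLED", "TIMEOUT"):
        raise GenieQueryError(f"Genie did not finish: {status}")
    # Anything else is outside the four documented statuses
    raise GenieQueryError(f"Unexpected status: {status!r}")
```

Callers can then catch GenieQueryError in one place instead of repeating the status checks around every question.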

“Write Python to detect when Genie asks for clarification instead of returning results.”

result = ask_genie(space_id="01abc123...", question="Show me the data")
if result.get("text_response"):
    # Genie is asking for clarification, not returning results
    print(f"Genie asks: {result['text_response']}")
    # Rephrase: "Show me the top 10 orders by amount this month"

Vague questions like “show me the data” or “what’s happening” trigger clarification instead of SQL generation. Questions that reference actual column names and time ranges (“total revenue by region last quarter”) get direct answers.
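The rephrasing advice can be sketched as a loop that tries progressively more specific phrasings until Genie stops asking for clarification. This assumes a clarification shows up as a non-empty text_response with no data rows, which may not hold for every response shape. The ask parameter stands in for ask_genie, and both helper names are hypothetical.

```python
def is_clarification(result):
    """True when Genie replied with text but no query results.

    Assumption: a clarification carries a non-empty text_response and
    no data rows. This helper is illustrative, not part of the API.
    """
    return bool(result.get("text_response")) and not result.get("data")


def first_direct_answer(ask, space_id, phrasings):
    """Try each phrasing in order; return the first non-clarification result.

    `ask` is any callable that accepts (space_id, question), as ask_genie
    is called above. Each phrasing starts a fresh conversation.
    Returns None if every phrasing triggered a clarification.
    """
    for question in phrasings:
        result = ask(space_id, question)
        if not is_clarification(result):
            return result
    return None
```

The returned result can still have a FAILED or TIMEOUT status, so the usual status check applies before reading data or sql.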

“Write Python showing appropriate timeout settings for different types of Genie questions.”

# Simple aggregation -- fast
ask_genie(space_id, "How many orders today?", timeout_seconds=30)

# Multi-table join -- moderate
ask_genie(space_id, "What's the average order value by customer segment?",
          timeout_seconds=60)

# Full-table scan with complex logic -- slow
ask_genie(space_id, "Calculate customer lifetime value for all customers",
          timeout_seconds=180)

The default timeout works for most questions. Bump it for questions that imply large scans or complex joins. If a question consistently times out, the underlying SQL may need optimization — check the generated SQL in the response.
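When it is unclear up front how heavy a question is, one approach is to start with a short timeout and escalate only on TIMEOUT. A minimal sketch, assuming a callable that accepts timeout_seconds the way ask_genie does above. The function name and the 30/60/180 ladder are illustrative choices, not API defaults.

```python
def ask_with_escalating_timeout(ask, space_id, question, timeouts=(30, 60, 180)):
    """Retry a question with progressively longer timeouts.

    `ask` is any callable accepting (space_id, question, timeout_seconds=...),
    matching how ask_genie is called above. Stops at the first result whose
    status is not TIMEOUT; otherwise returns the last (timed-out) result.
    """
    result = None
    for seconds in timeouts:
        result = ask(space_id, question, timeout_seconds=seconds)
        if result.get("status") != "TIMEOUT":
            break
    return result
```

Each retry asks the question again from scratch, so this trades extra warehouse work for not having to guess the timeout. For questions known to be slow, passing a single long timeout directly is cheaper.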

“Show a complete multi-turn data exploration session using Genie.”

# Start with a broad question
r1 = ask_genie(space_id, "What were total sales by month this year?")
conv_id = r1["conversation_id"]
# Drill into the interesting finding
r2 = ask_genie(space_id, "Which month had the highest growth?",
               conversation_id=conv_id)
# Go deeper
r3 = ask_genie(space_id, "What products drove that growth?",
               conversation_id=conv_id)
# Switch topics -- new conversation
r4 = ask_genie(space_id, "How many new customers signed up last quarter?")

Keep the same conversation_id for related follow-ups within a topic. Start a new conversation (omit conversation_id) when switching to an unrelated question. Mixing topics in one conversation confuses the context and produces worse SQL.
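To avoid threading conversation_id by hand, the bookkeeping can be wrapped in a small session object. This is a sketch, assuming a callable with ask_genie's signature and a conversation_id key in each response. GenieSession is a hypothetical helper, not something the API provides.

```python
class GenieSession:
    """Track conversation_id so follow-ups share context (hypothetical helper)."""

    def __init__(self, ask, space_id):
        self._ask = ask  # callable with ask_genie's signature
        self.space_id = space_id
        self.conversation_id = None

    def ask(self, question):
        """Ask within the current conversation, or start one if none exists."""
        kwargs = {}
        if self.conversation_id is not None:
            kwargs["conversation_id"] = self.conversation_id
        result = self._ask(self.space_id, question, **kwargs)
        # Remember the id so the next question continues this conversation
        self.conversation_id = result.get("conversation_id", self.conversation_id)
        return result

    def new_topic(self):
        """Drop the stored id so the next question starts a fresh conversation."""
        self.conversation_id = None
```

With this in place, a session built as GenieSession(ask_genie, space_id) handles follow-ups through repeated ask() calls, and new_topic() marks the switch to an unrelated question.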

Watch out for:

  • Reusing conversation_id across unrelated topics: Genie uses prior context to interpret follow-ups. Asking “How many employees?” after a sales conversation causes Genie to look for employee data in sales tables. Start a fresh conversation for new topics.
  • “Space not found” errors: verify that the space_id is correct and that you have access. Use get_genie(space_id) to confirm the space exists before querying.
  • Timeouts when the warehouse is stopped: a stopped SQL warehouse must start before it can execute the query, and that startup time counts against your timeout. Set a longer timeout or make sure the warehouse is running before querying.
  • Genie generates SQL but does not guarantee its correctness: always review the sql field before relying on results for critical decisions. Add SQL instructions and certified queries in the Genie Space UI to improve accuracy for your domain.