Genie Conversations
Skill: databricks-genie
What You Can Build
The Genie Conversation API lets you ask natural language questions of a curated Genie Space and get back SQL, results, and column metadata programmatically. Use it when you want your AI coding assistant to query business data through a space that already has curated logic, certified queries, and domain-specific instructions. Genie handles the SQL generation; you handle the conversation flow.
In Action
“Ask my Sales Genie what total revenue was last month, then follow up with a regional breakdown.”
    # First question -- starts a new conversation
    result = ask_genie(
        space_id="01abc123...",
        question="What were total sales last month?"
    )
    # result["status"] == "COMPLETED"
    # result["sql"] contains the generated query
    # result["data"] contains [[125430.50]]

    # Follow-up -- reuse conversation_id for context
    followup = ask_genie(
        space_id="01abc123...",
        question="Break that down by region",
        conversation_id=result["conversation_id"]
    )

Key decisions:
- conversation_id carries context: passing it from a previous response tells Genie that “that” refers to “total sales last month.” Omit it to start a fresh conversation on a different topic.
- Use Genie when curated logic exists: if the space has certified queries defining “active customer” or “churn rate,” Genie applies those rules. For ad-hoc queries where you already know the SQL, use direct SQL execution instead.
- The response includes the generated SQL: you can inspect, modify, or log it. The sql field shows exactly what Genie ran against the warehouse.
- Genie may ask for clarification: if the question is ambiguous, text_response contains a follow-up question instead of results. Rephrase with more specifics.
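Since the response exposes the generated SQL, one option is to keep a record of every query Genie ran so it can be reviewed later. The sketch below is illustrative only: record_generated_sql and the audit_log list are hypothetical names, and the sample dict is hand-written to match the response shape shown above, not real Genie output.

```python
import logging

logger = logging.getLogger("genie_audit")


def record_generated_sql(question, result, audit_log):
    """Append the question and the SQL Genie generated to audit_log.

    `result` is assumed to be a dict shaped like the examples above
    (keys such as "status", "sql", "conversation_id"); missing keys
    are tolerated. Returns the SQL string, or None if there was none.
    """
    sql = result.get("sql")
    if not sql:
        # No SQL usually means a failure or a clarification request.
        logger.info("No SQL returned for question: %s", question)
        return None
    audit_log.append(
        {
            "question": question,
            "conversation_id": result.get("conversation_id"),
            "sql": sql,
        }
    )
    logger.info("Genie ran: %s", sql)
    return sql


# Hand-written response dict for illustration (not real Genie output)
audit_log = []
sample = {
    "status": "COMPLETED",
    "conversation_id": "conv-1",
    "sql": "SELECT SUM(amount) FROM sales",
}
record_generated_sql("What were total sales last month?", sample, audit_log)
```

Passing the log in as an argument keeps the helper free of global state, so the same function can write to a list in tests and to a persistent store in a real workflow.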
More Patterns
Handle response statuses
“Write Python to handle success, failure, and timeout responses from the Genie Conversation API.”
    result = ask_genie(space_id="01abc123...", question="Who are our top 10 customers?")

    if result["status"] == "COMPLETED":
        print(f"SQL: {result['sql']}")
        print(f"Rows: {result['row_count']}")
        for row in result["data"]:
            print(row)
    elif result["status"] == "FAILED":
        print(f"Error: {result['error']}")
        # Rephrase the question or check space configuration
    elif result["status"] == "TIMEOUT":
        print("Query took too long -- simplify the question or increase timeout")

The status field is one of COMPLETED, FAILED, CANCELLED, or TIMEOUT. Always check it before accessing data or sql. Failed responses often mean the question is outside the scope of the space’s tables, not that something is broken.
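The snippet above branches on three of the four statuses. A minimal sketch that covers all four, plus a fallback for anything unexpected, might look like the following. The function name describe_result is hypothetical, and the keys it reads (row_count, sql, error) are assumed to match the response dicts shown in this section.

```python
def describe_result(result):
    """Return a one-line summary for each documented status value."""
    status = result.get("status")
    if status == "COMPLETED":
        return f"{result.get('row_count', 0)} rows via: {result.get('sql')}"
    if status == "FAILED":
        return f"Failed: {result.get('error', 'no error message')}"
    if status == "TIMEOUT":
        return "Timed out: simplify the question or raise the timeout"
    if status == "CANCELLED":
        return "Cancelled before completion"
    # Anything else is unexpected; surface it rather than guessing.
    return f"Unexpected status: {status!r}"


# Hand-written dicts for illustration (not real Genie output)
print(describe_result({"status": "COMPLETED", "row_count": 10, "sql": "SELECT 1"}))
print(describe_result({"status": "CANCELLED"}))
```

Using .get() with defaults means a response that lacks a key produces a readable message instead of a KeyError.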
Handle clarification requests
“Write Python to detect when Genie asks for clarification instead of returning results.”
    result = ask_genie(space_id="01abc123...", question="Show me the data")

    if result.get("text_response"):
        # Genie is asking for clarification, not returning results
        print(f"Genie asks: {result['text_response']}")
        # Rephrase: "Show me the top 10 orders by amount this month"

Vague questions like “show me the data” or “what’s happening” trigger clarification instead of SQL generation. Questions that reference actual column names and time ranges (“total revenue by region last quarter”) get direct answers.
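One way to act on a clarification request is to fall back to a more specific phrasing. The sketch below assumes a clarification shows up as a text_response with no data, which is an interpretation of the behaviour described above rather than a documented contract. ask_until_answered and fake_ask_clarify are hypothetical names; the fake stands in for ask_genie so the example runs without a workspace.

```python
def ask_until_answered(ask, space_id, questions):
    """Try progressively more specific phrasings until results come back.

    `ask` is any callable that accepts space_id and question keywords,
    as ask_genie does in the examples above. `questions` is ordered
    from vaguest to most specific. Returns (question_used, result), or
    (None, last_result) if every phrasing led to a clarification.
    """
    result = None
    for question in questions:
        result = ask(space_id=space_id, question=question)
        # Assumption: text_response without data means "please clarify".
        is_clarification = bool(result.get("text_response")) and not result.get("data")
        if not is_clarification:
            return question, result
    return None, result


# Stand-in for ask_genie with canned responses (not real Genie output)
def fake_ask_clarify(space_id, question):
    if "top 10" in question:
        return {"status": "COMPLETED", "data": [[1]]}
    return {"status": "COMPLETED", "text_response": "Which data do you mean?"}


used, answer = ask_until_answered(
    fake_ask_clarify,
    "01abc123...",
    ["Show me the data", "Show me the top 10 orders by amount this month"],
)
```

Each phrasing here starts a new conversation because no conversation_id is passed; whether to answer the clarification inside the same conversation instead is a design choice this sketch leaves open.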
Set timeouts based on query complexity
“Write Python showing appropriate timeout settings for different types of Genie questions.”
    # Simple aggregation -- fast
    ask_genie(space_id, "How many orders today?", timeout_seconds=30)

    # Multi-table join -- moderate
    ask_genie(space_id, "What's the average order value by customer segment?", timeout_seconds=60)

    # Full-table scan with complex logic -- slow
    ask_genie(space_id, "Calculate customer lifetime value for all customers", timeout_seconds=180)

The default timeout works for most questions. Raise it for questions that imply large scans or complex joins. If a question consistently times out, the underlying SQL may need optimization; check the generated SQL in the response.
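When it is hard to predict how heavy a question will be, one possible approach is to start with a short timeout and retry with longer ones. This is a sketch under assumptions: ask_with_escalating_timeout and fake_ask_slow are hypothetical names, the 30/60/180 ladder simply reuses the values from the examples above, and the fake callable stands in for ask_genie.

```python
def ask_with_escalating_timeout(ask, space_id, question, timeouts=(30, 60, 180)):
    """Retry a question with longer timeouts until it stops timing out.

    `ask` is any callable accepting space_id, question, and
    timeout_seconds keywords. Returns the first result whose status is
    not TIMEOUT, or the last TIMEOUT result if every attempt expires.
    """
    result = None
    for timeout in timeouts:
        result = ask(space_id=space_id, question=question, timeout_seconds=timeout)
        if result.get("status") != "TIMEOUT":
            break
    return result


# Stand-in for ask_genie: times out below 60 seconds (canned behaviour)
attempts = []


def fake_ask_slow(space_id, question, timeout_seconds):
    attempts.append(timeout_seconds)
    if timeout_seconds < 60:
        return {"status": "TIMEOUT"}
    return {"status": "COMPLETED", "data": [[42]]}


outcome = ask_with_escalating_timeout(
    fake_ask_slow, "01abc123...", "Calculate customer lifetime value for all customers"
)
```

Each retry submits the question again, so this trades extra warehouse work for not having to guess the timeout up front. For questions known to be heavy, setting a long timeout directly is likely cheaper.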
Multi-turn exploration workflow
“Show a complete multi-turn data exploration session using Genie.”
    # Start with a broad question
    r1 = ask_genie(space_id, "What were total sales by month this year?")
    conv_id = r1["conversation_id"]

    # Drill into the interesting finding
    r2 = ask_genie(space_id, "Which month had the highest growth?", conversation_id=conv_id)

    # Go deeper
    r3 = ask_genie(space_id, "What products drove that growth?", conversation_id=conv_id)

    # Switch topics -- new conversation
    r4 = ask_genie(space_id, "How many new customers signed up last quarter?")

Keep the same conversation_id for related follow-ups within a topic. Start a new conversation (omit conversation_id) when switching to an unrelated question. Mixing topics in one conversation confuses the context and produces worse SQL.
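The one-conversation-per-topic rule can be made harder to break by letting a small wrapper track the ids. The following is a minimal sketch, not part of the Genie API: GenieTopics and fake_ask_topics are hypothetical names, and the fake callable mimics ask_genie by echoing or minting a conversation_id.

```python
import itertools


class GenieTopics:
    """Keep one conversation_id per topic so follow-ups stay in context."""

    def __init__(self, ask, space_id):
        self._ask = ask
        self._space_id = space_id
        self._conversations = {}  # topic name -> conversation_id

    def ask(self, topic, question):
        kwargs = {"space_id": self._space_id, "question": question}
        conv_id = self._conversations.get(topic)
        if conv_id is not None:
            # Known topic: continue its existing conversation.
            kwargs["conversation_id"] = conv_id
        result = self._ask(**kwargs)
        # New topic: remember the id returned by the first answer.
        if conv_id is None and result.get("conversation_id"):
            self._conversations[topic] = result["conversation_id"]
        return result


# Stand-in for ask_genie that records calls (canned behaviour)
_ids = itertools.count(1)
calls = []


def fake_ask_topics(space_id, question, conversation_id=None):
    calls.append((question, conversation_id))
    return {
        "status": "COMPLETED",
        "conversation_id": conversation_id or f"conv-{next(_ids)}",
    }


topics = GenieTopics(fake_ask_topics, "01abc123...")
topics.ask("sales", "What were total sales by month this year?")
topics.ask("sales", "Which month had the highest growth?")
topics.ask("customers", "How many new customers signed up last quarter?")
```

Keying conversations by an explicit topic name makes the switch to a new conversation a visible decision in the calling code rather than something that depends on remembering to drop a variable.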
Watch Out For
- Reusing conversation_id across unrelated topics: Genie uses prior context to interpret follow-ups. Asking “How many employees?” after a sales conversation causes Genie to look for employee data in sales tables. Start a fresh conversation for new topics.
- “Space not found” errors: verify the space_id is correct and you have access. Use get_genie(space_id) to confirm the space exists before querying.
- Timeouts when the warehouse is stopped: if the SQL warehouse is stopped, it must start before the query can execute. The startup time counts against your timeout. Set a longer timeout or ensure the warehouse is running before querying.
- Genie generates SQL but does not guarantee correctness: always review the sql field for critical decisions. Add SQL instructions and certified queries in the Genie Space UI to improve accuracy for your domain.