Tools Integration
Skill: databricks-model-serving
What You Can Build
You can give your agent the ability to query databases, search documents, call APIs, and execute arbitrary Python, all through a standardized tool interface. Unity Catalog functions give you governed SQL and Python tools. Vector Search gives you semantic retrieval. Custom LangChain tools give you everything else. Combine them and the agent decides which tool to call based on the user’s question.
In Action
“Add Unity Catalog functions as tools to my agent so it can look up customer data. Use Python.”
from databricks_langchain import ChatDatabricks, UCFunctionToolkit

llm = ChatDatabricks(endpoint="databricks-meta-llama-3-3-70b-instruct")

uc_toolkit = UCFunctionToolkit(
    function_names=[
        "catalog.schema.get_customer_info",
        "catalog.schema.lookup_order_status",
        "system.ai.python_exec",
    ]
)

tools = []
tools.extend(uc_toolkit.tools)

llm_with_tools = llm.bind_tools(tools)

Key decisions:
- Explicit function names over wildcards (catalog.schema.*): list each function so you control exactly what the agent can access. Wildcards are convenient for development but risky in production.
- system.ai.python_exec gives the agent a sandboxed Python interpreter. Powerful for computation, but use with caution: it can execute arbitrary code.
- bind_tools attaches tool schemas to the LLM so it knows what tools are available and how to call them. This is a LangChain convention, not Databricks-specific.
- UC functions are governed by Unity Catalog permissions, so the agent inherits the deployer’s access level.
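The explicit-names decision can be enforced with a small guard that runs before the list reaches UCFunctionToolkit. The following is a minimal sketch: validate_function_names is a hypothetical helper, not part of databricks_langchain, and it assumes every entry is a three-part catalog.schema.function name.

```python
def validate_function_names(function_names):
    """Return the names as a list, or raise if any entry is malformed or a wildcard.

    Hypothetical helper for illustration; not part of databricks_langchain.
    """
    for name in function_names:
        parts = name.split(".")
        # Expect exactly catalog.schema.function, with no empty segments.
        if len(parts) != 3 or not all(parts):
            raise ValueError(f"Expected catalog.schema.function, got: {name!r}")
        # A '*' anywhere means the entry is a wildcard, not one reviewed function.
        if "*" in name:
            raise ValueError(f"Wildcard not allowed in production: {name!r}")
    return list(function_names)


production_functions = validate_function_names([
    "catalog.schema.get_customer_info",
    "catalog.schema.lookup_order_status",
])
```

The validated list could then be passed as function_names. Failing fast here keeps an accidental wildcard from widening the agent's access unnoticed.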
More Patterns
Add Vector Search as a Retriever Tool
“Give my agent a Vector Search index it can query for relevant documentation. Use Python.”
from databricks_langchain import VectorSearchRetrieverTool

vs_tool = VectorSearchRetrieverTool(
    index_name="catalog.schema.docs_index",
    num_results=5,
)

tools = [vs_tool]

The agent calls this tool when it needs to find relevant documents. num_results controls how many chunks come back. For more precise retrieval, add filters:

vs_tool = VectorSearchRetrieverTool(
    index_name="catalog.schema.docs_index",
    num_results=10,
    filters={"doc_type": "technical", "status": "published"},
    columns=["content", "title", "url"],
)

Filters narrow the search space before the vector similarity runs. Specifying columns reduces payload size by returning only what the agent needs.
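To make the roles of filters, columns, and num_results concrete, here is a toy in-memory model of filter-then-rank retrieval. It is only a conceptual sketch: filtered_search, the sample documents, and the two-dimensional vectors are all invented for illustration, and nothing here describes how Vector Search is implemented internally.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def filtered_search(docs, query_vector, filters, columns, num_results):
    """Toy filter-then-rank search over in-memory documents (illustration only)."""
    # 1. Keep only documents whose metadata matches every filter key.
    candidates = [d for d in docs if all(d.get(k) == v for k, v in filters.items())]
    # 2. Rank the surviving candidates by similarity to the query vector.
    candidates.sort(key=lambda d: cosine_similarity(d["vector"], query_vector), reverse=True)
    # 3. Return only the requested columns for the top num_results hits.
    return [{c: d[c] for c in columns} for d in candidates[:num_results]]


# Made-up sample data: the draft document is the closest vector but fails the filter.
docs = [
    {"title": "Install guide", "doc_type": "technical", "status": "published", "vector": [1.0, 0.0]},
    {"title": "Draft notes", "doc_type": "technical", "status": "draft", "vector": [1.0, 0.1]},
    {"title": "API reference", "doc_type": "technical", "status": "published", "vector": [0.0, 1.0]},
]

hits = filtered_search(
    docs,
    query_vector=[0.9, 0.1],
    filters={"doc_type": "technical", "status": "published"},
    columns=["title"],
    num_results=1,
)
```

In this toy data the draft document never reaches the ranking step, and the single hit carries only the title column.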
Create Custom LangChain Tools
“Build custom tools for my agent that get the current time and evaluate math expressions. Use Python.”
from langchain_core.tools import tool

@tool
def get_current_time(timezone: str = "UTC") -> str:
    """Get the current time in the specified timezone.

    Args:
        timezone: The timezone (e.g., 'UTC', 'America/New_York')
    """
    from datetime import datetime
    import pytz

    tz = pytz.timezone(timezone)
    now = datetime.now(tz)
    return now.strftime("%Y-%m-%d %H:%M:%S %Z")

@tool
def calculate(expression: str) -> str:
    """Evaluate a mathematical expression.

    Args:
        expression: A math expression like '2 + 2' or 'sqrt(16)'
    """
    import math
    allowed = {k: v for k, v in math.__dict__.items() if not k.startswith('_')}
    try:
        result = eval(expression, {"__builtins__": {}}, allowed)
        return str(result)
    except Exception as e:
        return f"Error: {e}"

tools = [get_current_time, calculate]

Custom tools run inside the serving endpoint process. They do not need resource declarations because they do not call external Databricks services. The docstring becomes the tool description the LLM sees, so write it clearly.
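The calculate tool above relies on eval with an emptied __builtins__. That restriction is not a complete sandbox, because attribute access on literals can still reach other objects. One possible alternative, sketched here as an assumption rather than the author's method, is to walk the parsed expression and allow only a small set of node types. The name safe_eval and the chosen operators, functions, and constants are illustrative.

```python
import ast
import math
import operator

# Operators the evaluator accepts; anything else is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}
# Allow-listed math functions and constants (an illustrative subset).
_FUNCTIONS = {"sqrt": math.sqrt, "sin": math.sin, "cos": math.cos, "log": math.log}
_CONSTANTS = {"pi": math.pi, "e": math.e}


def safe_eval(expression: str) -> float:
    """Evaluate an arithmetic expression by walking its AST instead of calling eval."""

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        # Numeric literals only; bool is excluded even though it subclasses int.
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        if isinstance(node, ast.Name) and node.id in _CONSTANTS:
            return _CONSTANTS[node.id]
        # Calls are allowed only to allow-listed names, with positional arguments.
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in _FUNCTIONS
            and not node.keywords
        ):
            return _FUNCTIONS[node.func.id](*[_eval(arg) for arg in node.args])
        raise ValueError(f"Unsupported expression element: {type(node).__name__}")

    return _eval(ast.parse(expression, mode="eval"))
```

Inside calculate, the eval call could be swapped for safe_eval(expression), with the existing try/except still turning failures into an error string. Exponentiation is left out on purpose, since very large powers can consume a lot of CPU and memory.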
Combine All Tool Types
“Set up a tool list with UC functions, Vector Search, and custom tools, then bind them to my LLM. Use Python.”
from databricks_langchain import ChatDatabricks, UCFunctionToolkit, VectorSearchRetrieverTool
from langchain_core.tools import tool

llm = ChatDatabricks(endpoint="databricks-meta-llama-3-3-70b-instruct")

tools = []

# 1. UC Functions
uc_toolkit = UCFunctionToolkit(function_names=["catalog.schema.get_customer_info"])
tools.extend(uc_toolkit.tools)

# 2. Vector Search
vs_tool = VectorSearchRetrieverTool(index_name="catalog.schema.docs_index")
tools.append(vs_tool)

# 3. Custom tools
@tool
def my_custom_tool(query: str) -> str:
    """Custom tool description."""
    return f"Result for: {query}"

tools.append(my_custom_tool)

llm_with_tools = llm.bind_tools(tools)

Order does not matter in the tools list. The LLM picks which tool to call based on the user’s question and the tool descriptions.
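One way to see why list order is irrelevant: a tool call identifies its target by name, so dispatch is a lookup rather than a positional choice. The sketch below uses plain Python functions as stand-ins for real tools and assumes a simplified tool-call shape (a name plus keyword arguments). The dispatch helper and both stub tools are hypothetical.

```python
def get_current_time_stub(timezone: str = "UTC") -> str:
    """Stand-in tool that returns a fixed string instead of the real time."""
    return f"12:00 in {timezone}"


def add_numbers(a: float, b: float) -> float:
    """Stand-in tool that adds two numbers."""
    return a + b


def dispatch(tool_call, tools):
    """Run the tool named in tool_call; list position plays no role."""
    registry = {fn.__name__: fn for fn in tools}
    name = tool_call["name"]
    if name not in registry:
        raise KeyError(f"Unknown tool: {name}")
    return registry[name](**tool_call["args"])


# Shape assumed for illustration: a tool name plus keyword arguments.
tool_call = {"name": "add_numbers", "args": {"a": 2, "b": 3}}

# The same call resolves to the same tool regardless of list order.
result_a = dispatch(tool_call, [get_current_time_stub, add_numbers])
result_b = dispatch(tool_call, [add_numbers, get_current_time_stub])
```

Because lookup is by name, two tools with the same name would collide in a registry like this one, so distinct and descriptive tool names matter more than ordering.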
Declare Resources for Logging
“Build the resource list for log_model that covers all my agent’s external dependencies. Use Python.”
import mlflow
from mlflow.models.resources import (
    DatabricksServingEndpoint,
    DatabricksFunction,
)
from unitycatalog.ai.langchain.toolkit import UnityCatalogTool
from databricks_langchain import VectorSearchRetrieverTool

# Assumed to be the same endpoint name that was passed to ChatDatabricks.
LLM_ENDPOINT = "databricks-meta-llama-3-3-70b-instruct"

# `tools` is the combined tool list built in the previous pattern.
resources = [DatabricksServingEndpoint(endpoint_name=LLM_ENDPOINT)]

for tool in tools:
    if isinstance(tool, UnityCatalogTool):
        resources.append(DatabricksFunction(function_name=tool.uc_function_name))
    elif isinstance(tool, VectorSearchRetrieverTool):
        resources.extend(tool.resources)
    # Custom tools don't need resources -- they run in-process

mlflow.pyfunc.log_model(
    name="agent",
    python_model="agent.py",
    resources=resources,
    pip_requirements=["mlflow==3.6.0", "databricks-langchain", "langgraph==0.3.4"],
)

Every external Databricks service your tools call must be declared in resources so that the serving endpoint can configure auth passthrough. Custom LangChain tools are excluded because they execute inside the endpoint.
Watch Out For
- Missing resource declarations: UC functions and Vector Search indexes need explicit entries in resources when you call log_model. Without them, the serving endpoint cannot authenticate and you get permission errors at query time, not at deployment time.
- Wildcard function names in production: UCFunctionToolkit(function_names=["catalog.schema.*"]) grabs every function in the schema. New functions added later become available to the agent without review. Use explicit names in production.
- Vague tool docstrings: the LLM uses the docstring to decide when to call a tool. A docstring like “Useful tool” gives the LLM no signal. Write specific descriptions: “Get customer information by customer ID from the CRM database.”
- Forgetting VectorSearchRetrieverTool.resources: this tool wraps both a vector index and its embedding endpoint. Call tool.resources to get both resource declarations, not just the index.
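The vague-docstring pitfall can be caught with a rough check before tools are bound. The sketch below is a heuristic, not a rule from any library: find_vague_tools and its five-word threshold are invented for illustration, and it inspects plain functions through __doc__. For objects produced by the @tool decorator, the description attribute would presumably be the thing to check instead.

```python
def find_vague_tools(tools, min_words=5):
    """Return names of tool functions whose docstring summary looks too thin.

    Hypothetical lint helper; the word-count threshold is an arbitrary heuristic.
    """
    vague = []
    for fn in tools:
        doc = (fn.__doc__ or "").strip()
        # Only the first docstring line is treated as the summary.
        summary = doc.splitlines()[0] if doc else ""
        if len(summary.split()) < min_words:
            vague.append(fn.__name__)
    return vague


def helper(query):
    """Useful tool."""
    return query


def get_customer_info(customer_id):
    """Get customer information by customer ID from the CRM database."""
    return {"customer_id": customer_id}


vague_tools = find_vague_tools([helper, get_customer_info])
```

A word count cannot judge whether a description is accurate, so a check like this only flags candidates for a human to rewrite.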