Development & Testing Workflow
Skill: databricks-model-serving
What You Can Build
You can go from a local `agent.py` file to a tested, working agent on Databricks in minutes. The workflow is: write locally, upload, install packages, test, and iterate — each step driven by a single prompt to your AI coding assistant. Catching errors here saves you the 15-minute deployment cycle for every bug.
In Action
“Upload my agent folder to Databricks and test it on a cluster. Use the agent files in ./my_agent/.”
```python
import mlflow
from mlflow.pyfunc import ResponsesAgent
from mlflow.types.responses import ResponsesAgentRequest, ResponsesAgentResponse
from databricks_langchain import ChatDatabricks

LLM_ENDPOINT = "databricks-meta-llama-3-3-70b-instruct"

class MyAgent(ResponsesAgent):
    def __init__(self):
        self.llm = ChatDatabricks(endpoint=LLM_ENDPOINT)

    def predict(self, request: ResponsesAgentRequest) -> ResponsesAgentResponse:
        messages = [{"role": m.role, "content": m.content} for m in request.input]
        response = self.llm.invoke(messages)
        return ResponsesAgentResponse(
            output=[self.create_text_output_item(text=response.content, id="msg_1")]
        )

AGENT = MyAgent()
mlflow.models.set_model(AGENT)
```

Key decisions:
- Write locally, run remotely — your AI coding assistant edits `agent.py` on your machine and pushes it to the workspace for execution. No notebook editing required.
- Test with real endpoints — local unit tests catch syntax errors but miss auth issues, missing packages, and endpoint connectivity. Remote testing on a cluster catches all of these.
- Keep the project structure flat — `agent.py`, `test_agent.py`, and `log_model.py` in one folder. The upload and execution tools work best with a single directory.
- Re-upload after every change — workspace files do not auto-sync. Each iteration requires an upload followed by a run.
More Patterns
Write a Local Test Script
“Create a test script that validates my agent handles basic requests. Use Python.”
```python
from agent import AGENT
from mlflow.types.responses import ResponsesAgentRequest, ChatContext

request = ResponsesAgentRequest(
    input=[{"role": "user", "content": "What is Databricks?"}],
    context=ChatContext(user_id="test@example.com"),
)

# Non-streaming
result = AGENT.predict(request)
print("Response:", result.model_dump(exclude_none=True))

# Streaming
for event in AGENT.predict_stream(request):
    print(event)
```

Run this on the cluster after uploading. It imports your agent directly and calls `predict`, exercising the same code path the serving endpoint will use. If this works, deployment will work.
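To assert on the reply instead of eyeballing printed output, a small helper can pull the text out of the dumped response. This is an illustrative sketch, not an MLflow API: `extract_text` is a hypothetical name, and the payload shape assumes output items carrying `content` lists of `output_text` blocks, as `create_text_output_item` produces.

```python
def extract_text(response_dict):
    """Concatenate text blocks from a Responses-style output payload.

    Hypothetical helper for illustration; assumes `output` items hold
    `content` lists with `output_text` blocks.
    """
    parts = []
    for item in response_dict.get("output", []):
        for block in item.get("content", []):
            if block.get("type") == "output_text":
                parts.append(block.get("text", ""))
    return "".join(parts)

# e.g. extract_text(result.model_dump(exclude_none=True))
sample = {"output": [{"id": "msg_1", "type": "message",
                      "content": [{"type": "output_text", "text": "Databricks is..."}]}]}
print(extract_text(sample))
```

With a helper like this, the test script can end in a plain `assert`, which makes pass/fail obvious when the script runs remotely.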
Install Packages on the Cluster
“Install the MLflow 3 agent packages on my Databricks cluster.”
```python
%pip install -U mlflow==3.6.0 databricks-langchain langgraph==0.3.4 databricks-agents pydantic
dbutils.library.restartPython()
```

The `restartPython()` call is mandatory after `%pip install`. Without it, the new packages are installed but the running Python process still has the old versions loaded.
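The caching behavior is plain Python, not Databricks-specific. This stdlib-only sketch (using `json` as a stand-in for any already-imported package) shows that a re-import returns the already-loaded module object, which is why a freshly installed version stays invisible until the process restarts.

```python
import importlib
import sys

import json  # stands in for any package already imported in this process

# The first import populated sys.modules; importing again just returns
# the cached object, so a newer version installed on disk is never seen.
cached = sys.modules["json"]
again = importlib.import_module("json")
print(again is cached)  # → True: the in-process copy, not a fresh load
```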
Verify the Environment
“Check which versions of the agent packages are installed on the cluster. Use Python.”
```python
import pkg_resources

for pkg in ['mlflow', 'langchain', 'langgraph', 'pydantic', 'databricks-langchain']:
    try:
        version = pkg_resources.get_distribution(pkg).version
        print(f"{pkg}: {version}")
    except pkg_resources.DistributionNotFound:
        print(f"{pkg}: NOT INSTALLED")
```

Run this before testing your agent. Version mismatches between your local `pip_requirements` and the cluster are the most common source of “works locally, fails remotely” bugs.
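To turn that eyeball check into a pass/fail comparison against your pins, a small stdlib helper works. `check_pins` is a hypothetical name for illustration, and `importlib.metadata` is used here as the modern stdlib replacement for the deprecated `pkg_resources`.

```python
from importlib import metadata

def check_pins(pins):
    """Return {name: (pinned, installed)} for every `pkg==ver` pin that
    does not match the running environment (installed=None means absent).

    Hypothetical helper for illustration, not a Databricks API.
    """
    mismatches = {}
    for pin in pins:
        name, _, pinned = pin.partition("==")
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches

# Feed it the same pins you put in pip_requirements; empty dict means all match.
print(check_pins(["mlflow==3.6.0", "langgraph==0.3.4"]))
```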
Smoke-Test a Foundation Model Endpoint
“Verify that my cluster can reach the foundation model endpoint before testing the full agent. Use Python.”
```python
from databricks_langchain import ChatDatabricks

llm = ChatDatabricks(endpoint="databricks-meta-llama-3-3-70b-instruct")
response = llm.invoke([{"role": "user", "content": "Hello!"}])
print(response.content)
```

If this fails with a permission or connectivity error, your agent will fail too. Isolate the problem before debugging your agent code.
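Transient endpoint errors sometimes clear on a second attempt. A generic retry wrapper keeps that logic out of your agent code; this is a sketch, and `with_retries` is nothing from `databricks_langchain` or MLflow.

```python
import time

def with_retries(fn, attempts=3, delay=1.0):
    """Call fn(), pausing `delay` seconds between attempts; re-raise the
    last exception if every attempt fails. Illustrative sketch only."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == attempts:
                raise
            time.sleep(delay)

# e.g. with_retries(lambda: llm.invoke([{"role": "user", "content": "Hello!"}]))
```

If the smoke test only passes with retries, treat that as a signal to investigate the endpoint rather than a fix to ship.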
Watch Out For
- Skipping `restartPython()` after `%pip install` — the Python process caches old module versions. You will see stale behavior or import errors until you restart.
- Reusing a stale execution context — if you see strange errors after multiple iterations, let your AI coding assistant create a fresh context rather than reusing the old one.
- Testing only locally — local tests with mocked endpoints miss auth failures, package version conflicts, and network issues that only surface on the cluster. Always run `test_agent.py` on Databricks before logging the model.
- Forgetting to re-upload after edits — workspace files do not auto-sync from your local machine. Every code change requires uploading the folder again before re-running.