Classical ML Models
Skill: databricks-model-serving
What You Can Build
You can go from a trained scikit-learn model to a live REST API in under ten minutes. MLflow autolog captures everything — parameters, metrics, model artifacts, input examples — and registers the model to Unity Catalog automatically. From there, you deploy to a serving endpoint that scales to zero when idle.
In Action
“Train a logistic regression classifier, auto-register it to Unity Catalog, and deploy it to a serving endpoint. Use Python.”
```python
import mlflow
import mlflow.sklearn
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Autolog handles logging, signature inference, and UC registration
mlflow.sklearn.autolog(
    log_input_examples=True,
    registered_model_name="main.models.churn_classifier"
)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)

model = LogisticRegression()
model.fit(X_train, y_train)
# Model is now logged and registered in Unity Catalog automatically
```

Key decisions:

- `autolog` captures parameters, metrics, model artifacts, and input examples with zero extra code
- `registered_model_name` auto-registers to Unity Catalog on every training run, creating new versions automatically
- `log_input_examples=True` saves sample inputs alongside the model, which helps with debugging and schema inference at serving time
- Supported frameworks: `sklearn`, `xgboost`, `lightgbm`, `pytorch`, `tensorflow`, `spark`
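To see why input examples matter for schema inference, here is a deliberately simplified, dependency-free sketch of the idea: given one example row, a server can map each feature's Python value type to a column type. This is an illustrative approximation only, not MLflow's actual `infer_signature` implementation.

```python
# Illustrative only: MLflow's real signature inference is richer than this.
input_example = {"age": 25, "income": 50000.0, "credit_score": 720}

TYPE_NAMES = {int: "long", float: "double", str: "string", bool: "boolean"}

# Infer a column type for each feature from the example value's type
schema = {col: TYPE_NAMES[type(val)] for col, val in input_example.items()}
print(schema)  # {'age': 'long', 'income': 'double', 'credit_score': 'long'}
```

Without a saved example row, the endpoint has no ground truth for this mapping, which is why missing input examples make payload debugging harder.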
More Patterns
Manual Logging with Custom Metrics
“Train a random forest with custom metrics and explicit signature logging for tighter control. Use Python.”
```python
import mlflow
from sklearn.ensemble import RandomForestClassifier
from mlflow.models import infer_signature

mlflow.set_registry_uri("databricks-uc")

with mlflow.start_run():
    model = RandomForestClassifier(n_estimators=100)
    model.fit(X_train, y_train)

    accuracy = model.score(X_test, y_test)
    mlflow.log_metric("accuracy", accuracy)

    signature = infer_signature(X_train, model.predict(X_train))

    mlflow.sklearn.log_model(
        model,
        artifact_path="model",
        signature=signature,
        input_example=X_train[:5],
        registered_model_name="main.models.random_forest"
    )
```

Manual logging gives you control over which metrics get recorded and how the model signature is inferred. Use it when autolog captures too much or too little for your use case.
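As an example of a custom metric you might compute and log alongside `accuracy`, here is a dependency-free F1 calculation for binary labels. The function name and sample labels are illustrative; you would pass the result to `mlflow.log_metric("f1", ...)` inside the run.

```python
def f1_score_binary(y_true, y_pred):
    """Custom metric suitable for mlflow.log_metric("f1", ...)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical labels and predictions
print(f1_score_binary([1, 0, 1, 1, 0], [1, 0, 0, 1, 1]))  # 0.666...
```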
Deploy to a Serving Endpoint
“Deploy my registered model to a serving endpoint with scale-to-zero using the Databricks SDK. Use Python.”
```python
from databricks.sdk import WorkspaceClient
from datetime import timedelta

w = WorkspaceClient()

endpoint = w.serving_endpoints.create_and_wait(
    name="churn-classifier",
    config={
        "served_entities": [{
            "entity_name": "main.models.churn_classifier",
            "entity_version": "1",
            "workload_size": "Small",
            "scale_to_zero_enabled": True
        }]
    },
    timeout=timedelta(minutes=10)
)
```

Classical ML endpoints typically provision in 2-5 minutes. `scale_to_zero_enabled` shuts down compute when there’s no traffic, which keeps costs near zero for dev and staging environments.
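If you deploy the same model across dev, staging, and production, a small helper that assembles the config payload keeps the per-environment differences in one place. This is a sketch assuming the dict-shaped config shown above; the helper name is hypothetical.

```python
def serving_config(model_name, version, workload_size="Small", scale_to_zero=True):
    # Builds the config payload for a single served entity, matching the
    # shape passed to the serving endpoints create call above.
    return {
        "served_entities": [{
            "entity_name": model_name,
            "entity_version": str(version),
            "workload_size": workload_size,
            "scale_to_zero_enabled": scale_to_zero,
        }]
    }

# Dev keeps scale-to-zero on; production might turn it off for latency
dev_cfg = serving_config("main.models.churn_classifier", 1)
prod_cfg = serving_config("main.models.churn_classifier", 1, "Medium", False)
print(dev_cfg["served_entities"][0]["scale_to_zero_enabled"])  # True
```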
Query the Live Endpoint
“Send a prediction request to my deployed sklearn model endpoint. Use Python.”
```python
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

response = w.serving_endpoints.query(
    name="churn-classifier",
    dataframe_records=[
        {"age": 25, "income": 50000, "credit_score": 720},
        {"age": 45, "income": 120000, "credit_score": 680}
    ]
)
print(response.predictions)
```

The `dataframe_records` format accepts a list of dictionaries where keys match your model’s input feature names. The response contains predictions in the same order as your input records.
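`dataframe_records` is also the JSON body that the endpoint’s REST route accepts, so you can query a serving endpoint with plain HTTP instead of the SDK. A minimal stdlib sketch of building that payload; the workspace host and token shown in comments are placeholders, not real values.

```python
import json

# Feature rows; keys must match the model's input feature names
records = [
    {"age": 25, "income": 50000, "credit_score": 720},
    {"age": 45, "income": 120000, "credit_score": 680},
]
payload = json.dumps({"dataframe_records": records})
print(payload)

# You would POST this body to the endpoint's invocations URL, e.g.
#   https://<workspace-host>/serving-endpoints/churn-classifier/invocations
# with an "Authorization: Bearer <token>" header (placeholder host/token).
```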
Watch Out For
- Forgetting `log_input_examples=True` — without input examples, the serving endpoint can’t validate the request schema, which makes debugging payload issues harder.
- Version numbers are auto-incremented — every `model.fit()` call with autolog creates a new version in Unity Catalog. Use `mlflow.sklearn.autolog(disable=True)` during experimentation to avoid version sprawl.
- Scale-to-zero adds cold start latency — the first request after idle triggers a cold start (30-60 seconds). For latency-sensitive production use, set `scale_to_zero_enabled=False`.
- Missing `mlflow.set_registry_uri("databricks-uc")` — without this, manual logging targets the workspace model registry instead of Unity Catalog. Autolog with `registered_model_name` in three-level format handles this automatically.