Databricks¶
This guide walks you through setting up a Haute pipeline to deploy to Databricks Model Serving - step by step, from scratch. Once set up, every time you merge a change to main, CI automatically deploys your pipeline as a live API.
What is Databricks Model Serving?
Databricks is a cloud platform for data and AI. Model Serving is a feature that takes a model (in this case, your pricing pipeline) and hosts it as a live web address (called an API) that accepts quote data and returns results. You don't need to manage any servers.
New to the command line?
This guide involves typing a few setup commands in a terminal. If you've never done that before, read Before You Start first - it takes five minutes and will make everything below much clearer.
What this guide covers¶
This guide has 9 steps. You don't need to do them all in one sitting.
- Steps 1-3 are one-time credential setup - get your workspace URL, create a token, and give them to CI. This takes 10 minutes.
- Steps 4-6 are Databricks infrastructure - set up Unity Catalog, MLflow, and Model Serving. You or your data engineer do these once. If your workspace is already set up, you may be able to skip some of these.
- Steps 7-9 are your ongoing workflow - review your config, deploy by merging to main, and call the API.
How it works¶
You never run haute deploy yourself. Your workflow is:
- Edit your pipeline locally and preview with
haute serve - Push your changes and open a pull request - CI validates automatically
- Merge to main - CI deploys to a staging endpoint, runs smoke tests, generates an impact report
- Review the impact report and approve - CI deploys to production
All you need to set up is the configuration (haute.toml), the Databricks infrastructure (steps below), and CI secrets. After that, merging to main triggers everything automatically.
Prerequisites¶
Before you start, you need:
- A Databricks workspace (your organisation probably already has one)
- Python 3.11+ installed on your machine
- Haute installed with Databricks extras (
uv add "haute[databricks]"- see Installing Haute)
If you haven't initialised your project yet, open your VS Code terminal and run:
Your team may have already done this
If you cloned an existing project that already has a haute.toml file, skip this step - it's already initialised. This command is only needed once per project, and whoever set up the repository probably ran it already.
Step 1: Get your Databricks workspace URL¶
Your workspace URL is the web address you use to log into Databricks. It looks like one of these:
| Cloud | URL format |
|---|---|
| Azure (most common for UK insurance) | https://adb-1234567890123456.12.azuredatabricks.net |
| AWS | https://dbc-abc12345-1234.cloud.databricks.com |
| GCP | https://1234567890123456.gcp.databricks.com |
If you don't know your workspace URL, open Databricks in your browser and copy the URL from the address bar (just the part before any / path).
Don't have a Databricks workspace?
Ask your IT team - most insurance companies with a data platform already have one. If you need to create one, the cheapest option is Azure Databricks on the Premium tier (required for Model Serving).
Step 2: Create a Personal Access Token¶
A Personal Access Token (PAT) is like a password that Haute uses to talk to Databricks on your behalf. To create one:
- Open your Databricks workspace in a browser
- Click your user icon in the top-right corner → Settings
- Go to Developer → Access tokens
- Click Manage → Generate new token
- Fill in:
- Comment:
haute-deploy(so you remember what it's for) - Lifetime: 90 days (or whatever your organisation allows)
- Comment:
- Click Generate
- Copy the token immediately - you won't be able to see it again
The token looks like: dapi_your_token_here
Step 3: Add credentials to CI¶
Since deployment runs in CI (not on your laptop), your credentials need to be stored as encrypted secrets in your CI provider. This is a one-time setup.
The two values you need are:
| Secret name | Value |
|---|---|
DATABRICKS_HOST |
Your workspace URL from Step 1 |
DATABRICKS_TOKEN |
Your personal access token from Step 2 |
How to add them depends on your CI provider:
- GitHub Actions - see Adding GitHub Secrets
- GitLab CI/CD - see Adding CI/CD Variables
- Azure DevOps - see Creating a Variable Group
Don't know which CI provider you're using?
It depends on where your code is hosted. If you access your project on github.com, you're using GitHub. If it's gitlab.com (or a company GitLab server), you're using GitLab. If it's dev.azure.com, you're using Azure DevOps. If you're not sure, ask your IT team or tech lead: "Where is our code repository hosted?" They'll tell you GitHub, GitLab, or Azure DevOps.
You don't need credentials on your laptop
You never run haute deploy locally, so you don't need a .env file with passwords on your machine. The CI runner has the credentials; you just merge to main. The .env.example file in your project is a reference for whoever sets up the CI secrets.
Step 4: Set up Unity Catalog¶
Unity Catalog is where Databricks registers your deployed model. Think of it as a filing system - you need a catalog (like a cabinet) and a schema (like a folder inside it).
Check if Unity Catalog is enabled¶
- In your Databricks workspace, click Catalog in the left sidebar
- If you see a catalog browser with tables and schemas, it's enabled
- If you don't see it, ask your workspace admin to enable it
Create a schema for your models¶
Most workspaces already have a main catalog. You just need a schema inside it.
To run the SQL command below, open your Databricks workspace in a browser and do one of the following:
- Option A: SQL Editor - click SQL Editor in the left sidebar, paste the command, and click Run
- Option B: Notebook - click New → Notebook, change the language to SQL (dropdown at the top), paste the command, and click Run
You can name the schema anything you like - pricing is a sensible default.
Update haute.toml¶
Make sure your haute.toml matches:
Step 5: Set up an MLflow experiment¶
What is MLflow?
MLflow is a logging system built into Databricks that keeps an audit trail of every version you deploy - think of it as a version history for your pricing models. Every time you deploy, Haute creates an MLflow run recording what was deployed, when, and by whom. These runs are grouped inside an experiment (which is just a named folder for runs).
Every time you deploy, Haute logs the deployment as an MLflow run inside an experiment. This gives you a full history of every version you've ever deployed.
Option A: Let Haute create it automatically¶
If the experiment doesn't exist, Haute will create it on first deploy. Just set the path in haute.toml:
Option B: Create it manually¶
- In Databricks, click Experiments in the left sidebar (under Machine Learning)
- Click Create Experiment
- Set the name to
/Shared/haute/motor-pricing(or whatever matches your pipeline) - Click Create
Naming convention
We recommend /Shared/haute/<your-model-name> so all team members can access it:
Step 6: Check that Model Serving is available¶
Databricks Model Serving is the feature that actually hosts your pipeline as an API. To check it's available:
- Click Serving in the left sidebar
- If you see the Serving page, you're good
- If not, ask your workspace admin - Model Serving requires a Premium tier workspace
Cost
Model Serving is billed per compute-second. With serving_scale_to_zero = true in your config, you only pay when the endpoint is actually receiving requests. For a Small workload with occasional traffic, expect less than £5/month for dev/test.
Step 7: Review your full configuration¶
Here's what your complete haute.toml should look like:
[project]
name = "motor-pricing"
pipeline = "main.py"
[deploy]
target = "databricks"
model_name = "motor-pricing"
endpoint_name = "motor-pricing"
[deploy.databricks]
experiment_name = "/Shared/haute/motor-pricing"
catalog = "main"
schema = "pricing"
serving_workload_size = "Small"
serving_scale_to_zero = true
[test_quotes]
dir = "tests/quotes"
What each setting means¶
| Setting | What it does | Example |
|---|---|---|
target |
Tells Haute to deploy to Databricks | "databricks" |
model_name |
The name of your model in the Databricks Model Registry | "motor-pricing" |
endpoint_name |
The name of the serving endpoint (becomes part of the API URL) | "motor-pricing" |
experiment_name |
Where MLflow logs each deployment | "/Shared/haute/motor-pricing" |
catalog |
Unity Catalog name | "main" |
schema |
Unity Catalog schema | "pricing" |
serving_workload_size |
How much compute to allocate - Small, Medium, or Large |
"Small" |
serving_scale_to_zero |
Whether the endpoint shuts down when idle (saves money) | true |
Step 8: Deploy by merging to main¶
Once your configuration is set up and CI secrets are in place, you're ready to deploy. You don't run any deploy command - you just merge to main:
- Push your changes to a branch and open a pull request
- CI validates automatically - lints your code, runs tests, and does a dry-run deploy to check your pipeline parses and test quotes pass
- Merge the PR - CI deploys to a staging endpoint (
motor-pricing-staging), runs smoke tests, and generates an impact report - Review the impact report - download it from CI and check the premium changes make sense
- Approve - CI deploys to the real production endpoint
The first deploy will also create the serving endpoint. This can take 5–10 minutes to provision.
What does success look like?
After a successful deploy, you should see:
- In your CI provider - all pipeline steps show green ✓ (validation, staging deploy, smoke test, impact analysis)
- In Databricks - click Serving in the left sidebar and you'll see your endpoint (e.g.
motor-pricing) with status Ready and a green indicator - In MLflow - click Experiments in the left sidebar, navigate to your experiment (e.g.
/Shared/haute/motor-pricing), and you'll see a new run logged with the deployment details
If you see all three, congratulations - your pipeline is live and serving premiums!
If CI reports errors during validation, the most common causes are:
- Missing model files - check that the files referenced in your pipeline (e.g.
models/freq.cbm) exist - Test quote schema mismatch - your test quote JSON fields don't match what your pipeline expects
- Pipeline syntax error - there's a bug in your Python file
See your CI provider's setup guide for full details: GitHub Actions, GitLab, or Azure DevOps.
Step 9: Call the API¶
Once the endpoint is live, you can call it from any system that can make HTTP requests.
Using Python (recommended)¶
If you're more comfortable with Python than the command line, this is the easiest way to test:
import requests
# Replace these with your actual values
url = "https://adb-xxx.12.azuredatabricks.net/serving-endpoints/motor-pricing/invocations"
token = "dapi_your_token_here" # Your Databricks personal access token from Step 2
headers = {
"Authorization": f"Bearer {token}",
"Content-Type": "application/json",
}
payload = {
"dataframe_records": [
{"IDpol": 99001, "VehPower": 7, "DrivAge": 42, "Area": "C", "VehBrand": "B12"}
]
}
response = requests.post(url, json=payload, headers=headers)
print(response.json())
Using the Databricks SDK¶
from databricks.sdk import WorkspaceClient
w = WorkspaceClient()
response = w.serving_endpoints.query(
name="motor-pricing",
dataframe_records=[
{"IDpol": 99001, "VehPower": 7, "DrivAge": 42, "Area": "C", "VehBrand": "B12"}
],
)
print(response.predictions)
Using curl (advanced)¶
curl is a command-line tool for sending web requests. You don't need to use it - the Python examples above are easier and work the same on Windows - but it's included for reference:
curl -X POST `
"https://<your-workspace-url>/serving-endpoints/motor-pricing/invocations" `
-H "Authorization: Bearer <your-token-here>" `
-H "Content-Type: application/json" `
-d '{\"dataframe_records\": [{\"IDpol\": 99001, \"VehPower\": 7, \"DrivAge\": 42, \"Area\": \"C\", \"VehBrand\": \"B12\"}]}'
Prefer the Python example above
The Python examples are simpler and avoid Windows command-line quoting issues. Use those unless you have a specific reason to use curl.
Troubleshooting¶
"PERMISSION_DENIED" on deploy¶
Your token needs these permissions:
- Can Manage on the MLflow experiment
- USE CATALOG and USE SCHEMA on Unity Catalog
- Can Manage on serving endpoints (or ask an admin to create the endpoint first)
Ask your Databricks admin to grant these if you see permission errors. This is usually someone in your IT or data engineering team - the person who set up the Databricks workspace.
"Endpoint not found" when calling the API¶
After deploying, the serving endpoint can take 5–10 minutes to provision. Check its status:
Or in the Databricks UI: click Serving in the left sidebar and look for your endpoint.
Token expired¶
Tokens have a lifetime (default 90 days). If your deploy suddenly fails with an authentication error, generate a new token (Step 2) and update the DATABRICKS_TOKEN secret in your CI provider.
Slow first request (cold start)¶
With serving_scale_to_zero = true, the endpoint shuts down when it's idle. The first request after idle can take 30–60 seconds. For production endpoints that need consistent response times, set serving_scale_to_zero = false in haute.toml.
Missing dependency error on the endpoint¶
The deployed model needs the haute package. If you see No module named 'haute' in the endpoint logs, make sure haute is published and accessible from your Databricks workspace.
Checklist¶
Before your first deploy, confirm:
- [ ] You have your Databricks workspace URL
- [ ] You have a Personal Access Token
- [ ] Both are added as CI secrets (
DATABRICKS_HOST,DATABRICKS_TOKEN) - [ ] Unity Catalog is enabled with a catalog and schema
- [ ]
haute.tomlhas the correctexperiment_name,catalog, andschema - [ ] Model Serving is available in your workspace
- [ ] You have at least one test quote JSON file in
tests/quotes/ - [ ] CI/CD workflows are committed to your repository
- [ ] CI validation passes on a pull request