content/recipes/ai-chat-model-serving.md (156 additions, 61 deletions)
## Streaming AI Chat with Model Serving

Build a streaming AI chat experience in a Databricks App using the Vercel AI SDK with Databricks Model Serving and OpenAI-compatible endpoints.

### 1. Follow the prerequisite recipes first
Complete these recipes before adding chat:

- [`Databricks Local Bootstrap`](/resources/base-app-template#databricks-local-bootstrap)
- [`Lakebase Data Persistence`](/resources/data-app-template#lakebase-data-persistence)
- [`Create a Databricks Model Serving endpoint`](/resources/ai-chat-app-template#create-a-databricks-model-serving-endpoint)
- [`Query AI Gateway Endpoints`](/resources/ai-chat-app-template#query-ai-gateway-endpoints)

:::tip[Deploy before local development]
Lakebase tables are owned by the identity that creates them. Deploy the app first so the service principal creates and owns the schemas. Then grant yourself local dev access:

```bash
databricks psql --project <project-name> --profile <PROFILE> -- -c "
CREATE EXTENSION IF NOT EXISTS databricks_auth;
SELECT databricks_create_role('<your-email>', 'USER');
GRANT databricks_superuser TO \"<your-email>\";
"
```

> **Note**: If you are the Lakebase project owner, `databricks_create_role` may fail with `role already exists` and `GRANT databricks_superuser` may fail with `permission denied to grant role`. Both errors are safe to ignore — the project owner already has the necessary access.

If you run `npm run dev` before deploying, your user creates schemas that the deployed service principal cannot access. `CREATE SCHEMA IF NOT EXISTS` appears to succeed even on schemas you don't own, but subsequent `CREATE TABLE` fails with `permission denied`.
:::

### 2. Install AI SDK packages

```bash
npm install ai@6 @ai-sdk/react@3 @ai-sdk/openai @databricks/sdk-experimental
```

> **Version note**: This recipe uses AI SDK v6 APIs (`TextStreamChatTransport`, `sendMessage({ text })`, transport-based `useChat`). Tested with `ai@6.1`, `@ai-sdk/react@3.1`, and `@ai-sdk/openai@3.x`.

> **Note**: `@databricks/sdk-experimental` is included in the scaffolded `package.json`. It is listed here for reference if adding AI chat to an existing project.

> **Optional**: For pre-built chat UI components, initialize shadcn and add AI Elements:
>
> ```bash
> npx shadcn@latest init
> npx shadcn@latest add @ai-elements/all
> ```
>
> This basic recipe works without AI Elements — they're optional prebuilt components.

### 3. Configure environment variables for AI Gateway

Configure your Databricks workspace ID and model endpoint:

For local development (`.env`):

```bash
echo 'DATABRICKS_WORKSPACE_ID=<your-workspace-id>' >> .env
echo 'DATABRICKS_ENDPOINT=<your-endpoint>' >> .env
echo 'DATABRICKS_CONFIG_PROFILE=DEFAULT' >> .env
```
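If either variable is missing, the failure otherwise surfaces as a confusing error deep in the first chat request. A minimal startup check can fail fast instead (a sketch; `requireEnv` and `loadChatConfig` are hypothetical helpers, not part of AppKit):

```typescript
// Hypothetical helper: throw at server startup if required
// configuration is absent, rather than on the first chat request.
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Resolve the chat-related configuration in one place.
function loadChatConfig() {
  return {
    workspaceId: requireEnv("DATABRICKS_WORKSPACE_ID"),
    endpoint: requireEnv("DATABRICKS_ENDPOINT"),
  };
}
```

Calling `loadChatConfig()` once when the server boots makes misconfiguration obvious in the deploy logs.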

For deployment in Databricks Apps (`app.yaml`):

```yaml
env:
  - name: DATABRICKS_WORKSPACE_ID
    value: "<your-workspace-id>"
  - name: DATABRICKS_ENDPOINT
    value: "<your-endpoint>"
```

> **Workspace ID**: AppKit auto-discovers this at runtime. For explicit setup, run `databricks api get /api/2.1/unity-catalog/current-metastore-assignment --profile <PROFILE>` and use the `workspace_id` field.

> **Model compatibility**: This recipe uses OpenAI-compatible models served via Databricks AI Gateway, which support the AI SDK's streaming API. The AI Gateway URL uses the `/mlflow/v1` path (not `/openai/v1`).

> **Find your endpoint**: Run `databricks serving-endpoints list --profile <PROFILE>` to see available models. Common endpoints include `databricks-meta-llama-3-3-70b-instruct` and `databricks-claude-sonnet-4`, but availability varies by workspace.
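Putting the notes above together, the gateway base URL is a pure function of the workspace ID (a sketch; the host format mirrors the route example later in this recipe):

```typescript
// Build the OpenAI-compatible AI Gateway base URL for a workspace.
// Note the /mlflow/v1 path, not /openai/v1.
function gatewayBaseUrl(workspaceId: string): string {
  return `https://${workspaceId}.ai-gateway.cloud.databricks.com/mlflow/v1`;
}
```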

### 4. Configure authentication helper

Create a helper function that works for both local development and deployed apps:

```typescript
import { Config } from "@databricks/sdk-experimental";

async function getDatabricksToken() {
  // For deployed apps, use service principal token
  if (process.env.DATABRICKS_TOKEN) {
    return process.env.DATABRICKS_TOKEN;
  }

  // For local dev, use CLI profile auth via Databricks SDK
  const config = new Config({
    profile: process.env.DATABRICKS_CONFIG_PROFILE || "DEFAULT",
  });
  await config.ensureResolved();
  const headers = new Headers();
  await config.authenticate(headers);
  const authHeader = headers.get("Authorization");
  if (!authHeader) {
    throw new Error(
      "Failed to get Databricks token. Check your CLI profile or set DATABRICKS_TOKEN.",
    );
  }
  return authHeader.replace("Bearer ", "");
}
```

This function uses the Databricks SDK auth chain, which reads `~/.databrickscfg` profiles and handles OAuth token refresh. For deployed apps, set `DATABRICKS_TOKEN` directly.

> **User identity in deployed apps**: Databricks Apps injects user identity via request headers. Extract it with `req.header("x-forwarded-email")` or `req.header("x-forwarded-user")`. Use this for chat persistence and access control.
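That header lookup fits in one small helper (a sketch; the fallback order from email to username is an assumption):

```typescript
// Resolve the calling user's identity from the headers Databricks Apps
// injects on each request. Returns null when neither header is present
// (e.g. during local development).
function resolveUserIdentity(
  getHeader: (name: string) => string | undefined,
): string | null {
  return (
    getHeader("x-forwarded-email") ?? getHeader("x-forwarded-user") ?? null
  );
}
```

In an Express handler this becomes `resolveUserIdentity((name) => req.header(name))`.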

### 5. Add `/api/chat` route with streaming

Create a server route using the AI SDK's streaming support:

```typescript
import { createOpenAI } from "@ai-sdk/openai";
import { streamText, type UIMessage } from "ai";

app.post("/api/chat", async (req, res) => {
  const { messages } = req.body;

  // AI SDK v6 client sends UIMessage objects with a parts array.
  // Convert to CoreMessage format for streamText().
  const coreMessages = (messages as UIMessage[]).map((m) => ({
    role: m.role as "user" | "assistant" | "system",
    content:
      m.parts
        ?.filter((p) => p.type === "text" && p.text)
        .map((p) => p.text)
        .join("") ??
      m.content ??
      "",
  }));

  try {
    const token = await getDatabricksToken();
    const endpoint = process.env.DATABRICKS_ENDPOINT || "<your-endpoint>";

    // Configure Databricks AI Gateway as OpenAI-compatible provider
    const databricks = createOpenAI({
      baseURL: `https://${process.env.DATABRICKS_WORKSPACE_ID}.ai-gateway.cloud.databricks.com/mlflow/v1`,
      apiKey: token,
    });

    // Stream the response using AI SDK v6
    const result = streamText({
      model: databricks.chat(endpoint),
      messages: coreMessages,
      maxOutputTokens: 1000,
    });

    // v6 API: pipe the text stream to the Express response
    result.pipeTextStreamToResponse(res);
  } catch (err) {
    const message = (err as Error).message;
    console.error("[chat] Streaming request failed:", message);
    res.status(502).json({
      error: "Chat request failed",
      detail: message,
    });
  }
});
```
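The UIMessage-to-text conversion inside the route can be exercised in isolation (a sketch; `TextPart` and `Part` are simplified stand-ins for the v6 part types):

```typescript
// Simplified stand-ins for the v6 UIMessage part types handled above.
type TextPart = { type: "text"; text: string };
type Part = TextPart | { type: "reasoning" } | { type: "file" };

// Concatenate the text parts of one message, skipping non-text parts.
function partsToText(parts: Part[] | undefined): string {
  return (
    parts
      ?.filter((p): p is TextPart => p.type === "text")
      .map((p) => p.text)
      .join("") ?? ""
  );
}
```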

### 6. Render the streaming chat UI

Use `useChat` from the AI SDK with `TextStreamChatTransport` for streaming support:

```tsx
import { useChat } from "@ai-sdk/react";
import { TextStreamChatTransport } from "ai";
import { useState } from "react";

export function ChatPage() {
  const [input, setInput] = useState("");

  const { messages, sendMessage, status } = useChat({
    transport: new TextStreamChatTransport({ api: "/api/chat" }),
  });

  return (
    <div className="flex flex-col h-full">
      <div className="flex-1 overflow-y-auto space-y-4 p-4">
        {messages.map((m) => (
          <div key={m.id} className={m.role === "user" ? "text-right" : ""}>
            <span className="text-sm font-medium">
              {m.role === "user" ? "You" : "Assistant"}
            </span>
            {m.parts.map((part, i) =>
              part.type === "text" ? (
                <p key={`${m.id}-${i}`} className="whitespace-pre-wrap">
                  {part.text}
                </p>
              ) : null,
            )}
          </div>
        ))}
        {status === "submitted" && <div className="p-4">Loading...</div>}
      </div>
      <form
        onSubmit={(e) => {
          e.preventDefault();
          if (input.trim()) {
            void sendMessage({ text: input });
            setInput("");
          }
        }}
        className="border-t p-4 flex gap-2"
      >
        <input
          value={input}
          onChange={(e) => setInput(e.target.value)}
          placeholder="Ask a question..."
          className="flex-1 border rounded px-3 py-2"
          disabled={status !== "ready"}
        />
        <button type="submit" disabled={status !== "ready"}>
          {status === "submitted" || status === "streaming"
            ? "Sending..."
            : "Send"}
        </button>
      </form>
    </div>
  );
}
```

### 7. Deploy and verify

```bash
databricks apps deploy --profile <PROFILE>
databricks apps list --profile <PROFILE>
databricks apps logs <app-name> --profile <PROFILE>
```

Open the app URL while signed in to Databricks, send a message, and verify streaming responses appear token-by-token from the AI Gateway endpoint.

#### References

- [Model Serving Overview](https://docs.databricks.com/aws/en/machine-learning/model-serving/)
- [Serving Endpoints](https://docs.databricks.com/aws/en/machine-learning/model-serving/create-foundation-model-endpoints)
- [AI Elements docs](https://ui.shadcn.com/docs/registry/ai-elements)
content/recipes/databricks-local-bootstrap.md (8 additions, 1 deletion)

### 5. Scaffold a new Databricks app

Run this from the directory where you want the project created. The CLI creates a new folder named after `--name` inside the current directory. Use `--version latest` to get the latest AppKit template. Add `--features` to enable specific plugins (e.g., `--features=lakebase` for database persistence).

```bash
databricks apps init \
--profile <PROFILE>
```

Then enter the project and install dependencies:

```bash
cd <app-name>
npm install
```

### 6. Install Databricks agent skills

Install the agent skills for the agents and editors you actively use.