Commit 70ee03b

Refactor apps-mcp to use CLI-based approach
1 parent bc1902c commit 70ee03b

16 files changed: +664 additions, −333 deletions

experimental/apps-mcp/README.md

Lines changed: 96 additions & 168 deletions
@@ -1,26 +1,24 @@
 # Databricks MCP Server
 
-A Model Context Protocol (MCP) server for generating production-ready Databricks applications with testing,
-linting and deployment setup from a single prompt. This agent relies heavily on scaffolding and
-extensive validation to ensure high-quality outputs.
+A Model Context Protocol (MCP) server for working with Databricks through natural language. This server provides tools for data exploration, workspace management, and executing Databricks CLI commands through AI-powered conversations.
 
 ## TL;DR
 
-**Primary Goal:** Create and deploy production-ready Databricks applications from a single natural language prompt. This MCP server combines scaffolding, validation, and deployment into a seamless workflow that goes from idea to running application.
+**Primary Goal:** Interact with Databricks workspaces, manage Databricks Asset Bundles (DABs), deploy Databricks Apps, and query data through natural language conversations.
 
 **How it works:**
-1. **Explore your data** - Query Databricks catalogs, schemas, and tables to understand your data
-2. **Generate the app** - Scaffold a full-stack TypeScript application (tRPC + React) with proper structure
-3. **Customize with AI** - Use workspace tools to read, write, and edit files naturally through conversation
-4. **Validate rigorously** - Run builds, type checks, and tests to ensure quality
-5. **Deploy confidently** - Push validated apps directly to Databricks Apps platform
+1. **Explore your workspace** - Discover workspace resources and get CLI command examples and workflow recommendations
+2. **Query your data** - Browse catalogs, schemas, and tables; execute SQL queries via CLI commands
+3. **Manage bundles** - Initialize, validate, deploy, and run Databricks Asset Bundles
+4. **Deploy apps** - Deploy and manage Databricks Apps through CLI commands
+5. **Execute any CLI command** - Run the full Databricks CLI through the `invoke_databricks_cli` tool
 
 **Why use it:**
-- **Speed**: Go from concept to deployed Databricks app in minutes, not hours or days
-- **Quality**: Extensive validation ensures your app builds, passes tests, and is production-ready
-- **Simplicity**: One natural language conversation handles the entire workflow
+- **Conversational interface**: Work with Databricks using natural language instead of memorizing CLI commands
+- **Context-aware**: Get relevant command suggestions based on your workspace configuration
+- **Unified workflow**: Combine data exploration, bundle management, and app deployment in one tool
 
-Perfect for data engineers and developers who want to build Databricks apps without the manual overhead of project setup, configuration, testing infrastructure, and deployment pipelines.
+Perfect for data engineers and developers who want to streamline their Databricks workflows with AI-powered assistance.
 
 ---

@@ -54,229 +52,164 @@ Perfect for data engineers and developers who want to build Databricks apps with
 }
 ```
 
-3. **Create your first Databricks app:**
+3. **Start using Databricks with natural language:**
 
    Restart your MCP client and try:
    ```
-   Create a Databricks app that shows sales data from main.sales.transactions
-   with a chart showing revenue by region. Deploy it as "sales-dashboard".
+   Explore my Databricks workspace and show me what catalogs are available
   ```
 
-   The AI will:
-   - Explore your Databricks tables
-   - Generate a full-stack application
-   - Customize it based on your requirements
-   - Validate it passes all tests
-   - Deploy it to Databricks Apps
-
----
-
-## Features
-
-All features are designed to support the end-to-end workflow of creating production-ready Databricks applications:
-
-### 1. Data Exploration (Foundation)
-
-Understand your Databricks data before building:
-
-- **`databricks_list_catalogs`** - Discover available data catalogs
-- **`databricks_list_schemas`** - Browse schemas in a catalog
-- **`databricks_find_tables`** - Find tables in a schema
-- **`databricks_describe_table`** - Get table details, columns, and sample data
-- **`databricks_execute_query`** - Test queries and preview data
-
-*These tools help the AI understand your data structure so it can generate relevant application code.*
-
-### 2. Application Generation (Core)
-
-Create the application structure:
-
-- **`scaffold_data_app`** - Generate a full-stack TypeScript application
-  - Modern stack: Node.js, TypeScript, React, tRPC
-  - Pre-configured build system, linting, and testing
-  - Production-ready project structure
-  - Databricks SDK integration
+   ```
+   Initialize a new Databricks Asset Bundle for a data pipeline project
+   ```
 
-*This is the foundation of your application - a working, tested template ready for customization.*
+   ```
+   Query the main.sales.transactions table and show me the top 10 customers by revenue
+   ```
 
-### 3. Validation (Quality Assurance)
+   The AI will use the appropriate Databricks tools to help you complete these tasks.
 
-Ensure production-readiness before deployment:
+---
 
-- **`validate_data_app`** - Comprehensive validation
-  - Build verification (npm build)
-  - Type checking (TypeScript compiler)
-  - Test execution (full test suite)
+## Features
 
-*This step guarantees your application is tested and ready for production before deployment.*
+The Databricks MCP server provides CLI-based tools for workspace interaction:
 
-### 4. Deployment (Production Release)
+Two tools let you execute Databricks CLI commands and explore workspace resources:
 
-Deploy validated applications to Databricks (enable with `--allow-deployment`):
+- **`explore`** - Discover workspace resources and get CLI command recommendations
+  - Lists workspace URL, SQL warehouse details, and authentication profiles
+  - Provides command examples for jobs, clusters, catalogs, tables, and workspace files
+  - Gives workflow guidance for Databricks Asset Bundles and Apps
 
-- **`deploy_databricks_app`** - Push to Databricks Apps platform
-  - Automatic deployment configuration
-  - Environment management
-  - Production-grade setup
+- **`invoke_databricks_cli`** - Execute any Databricks CLI command
+  - Run bundle commands: `bundle init`, `bundle validate`, `bundle deploy`, `bundle run`
+  - Run apps commands: `apps deploy`, `apps list`, `apps get`, `apps start`, `apps stop`
+  - Run workspace commands: `workspace list`, `workspace export`, `jobs list`, `clusters list`
+  - Run catalog commands: `catalogs list`, `schemas list`, `tables list`
+  - Supports all Databricks CLI functionality, subject to user allowlisting
 
-*The final step: your validated application running on Databricks.*
+*These tools provide a conversational interface to the full Databricks CLI, including Unity Catalog exploration and SQL query execution.*
 
 ---
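The `invoke_databricks_cli` tool described in the new Features section wraps the stock Databricks CLI, so each capability corresponds to a plain shell command. A minimal sketch of the underlying invocations (command names are taken from the bullet list above; the catalog and schema arguments are illustrative):

```bash
# Bundle lifecycle (run from a directory containing databricks.yml)
databricks bundle validate
databricks bundle deploy

# Databricks Apps
databricks apps list

# Unity Catalog exploration
databricks catalogs list
databricks schemas list main
databricks tables list main sales
```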
 
 ## Example Usage
 
-Here are example conversations showing the end-to-end workflow for creating Databricks applications:
+Here are example conversations showing common workflows:
 
-### Complete Workflow: Analytics Dashboard
+### Data Exploration
 
-This example shows how to go from data exploration to deployed application:
-
-**User:**
+**Explore workspace resources:**
 ```
-I want to create a Databricks app that visualizes customer purchases. The data is
-in the main.sales catalog. Show me what tables are available and create a dashboard
-with charts for total revenue by region and top products. Deploy it as "sales-insights".
+Explore my Databricks workspace and show me what's available
 ```
 
-**What happens:**
-1. **Data Discovery** - AI lists schemas and tables in main.sales
-2. **Data Inspection** - AI describes the purchases table structure
-3. **App Generation** - AI scaffolds a TypeScript application
-4. **Customization** - AI adds visualization components and queries
-5. **Validation** - AI runs build, type check, and tests in container
-6. **Deployment** - AI deploys to Databricks Apps as "sales-insights"
-
-**Result:** A production-ready Databricks app running in minutes with proper testing.
-
----
-
-### Quick Examples for Specific Use Cases
-
-#### Data App from Scratch
-
+**Query data:**
 ```
-Create a Databricks app in ~/projects/user-analytics that shows daily active users
-from main.analytics.events. Include a line chart and data table.
+Show me the schema of the main.sales.transactions table and give me a sample of 10 rows
 ```
 
-#### Real-Time Monitoring Dashboard
-
+**Find specific tables:**
 ```
-Build a monitoring dashboard for the main.logs.system_metrics table. Show CPU,
-memory, and disk usage over time. Add alerts for values above thresholds.
+Find all tables in the main catalog that contain the word "customer"
 ```
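The "Query data" prompt above has no dedicated query tool behind it; per the new README, SQL runs through CLI commands. One way to express that query with the stock CLI is the SQL Statement Execution API via `databricks api` (a sketch; the warehouse ID is a placeholder matching the Environment Variables table below):

```bash
databricks api post /api/2.0/sql/statements --json '{
  "warehouse_id": "abc123def456",
  "statement": "SELECT * FROM main.sales.transactions LIMIT 10"
}'
```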
 
-#### Report Generator
+### Databricks Asset Bundles (DABs)
 
+**Create a new bundle project:**
 ```
-Create an app that generates weekly reports from main.sales.transactions.
-Include revenue trends, top customers, and product performance. Add export to CSV.
+Initialize a new Databricks Asset Bundle for a data pipeline project
 ```
 
-#### Data Quality Dashboard
-
+**Deploy a bundle:**
 ```
-Build a data quality dashboard for main.warehouse.inventory. Check for nulls,
-duplicates, and out-of-range values. Show data freshness metrics.
+Validate and deploy my Databricks bundle to the dev environment
 ```
 
----
-
-### Working with Existing Applications
-
-Once an app is scaffolded, you can continue development through conversation:
-
+**Run a job from a bundle:**
 ```
-Add a filter to show only transactions from the last 30 days
+Run the data_processing job from my bundle
 ```
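For reference, the bundle prompts in this section map onto a short CLI sequence. A sketch, assuming a `dev` target and a job resource named `data_processing` as in the example prompts:

```bash
databricks bundle init                        # scaffold a new bundle project
databricks bundle validate                    # check the bundle configuration
databricks bundle deploy -t dev               # deploy to the dev target
databricks bundle run data_processing -t dev  # run the deployed job
```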
 
-```
-Update the chart to use a bar chart instead of line chart
-```
+### Databricks Apps
 
+**Initialize an app from template:**
 ```
-Add a new API endpoint to fetch customer details
+Initialize a new Streamlit app using the Databricks bundle template
 ```
 
+**Deploy an app:**
 ```
-Run the tests and fix any failures
+Deploy my app in the current directory to Databricks Apps as "sales-dashboard"
 ```
 
+**Manage apps:**
 ```
-Add error handling for failed database queries
+List all my Databricks Apps and show me their status
 ```
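The app-management prompts above correspond to the `apps` subcommands listed in the Features section; a sketch using the `sales-dashboard` name from the deploy example:

```bash
databricks apps list                  # all apps and their status
databricks apps get sales-dashboard   # details for a single app
databricks apps stop sales-dashboard
databricks apps start sales-dashboard
```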
 
----
-
-### Iterative Development Workflow
+### Working with Jobs and Clusters
 
-**Initial Request:**
+**List and inspect jobs:**
 ```
-Create a simple dashboard for main.sales.orders
+Show me all jobs in the workspace and their recent run status
 ```
 
-**Refinement:**
+**Get cluster details:**
 ```
-Add a date range picker to filter orders
+List all clusters and show me the configuration of the production cluster
 ```
 
-**Enhancement:**
-```
-Include a summary card showing total orders and revenue
-```
+### Complex Workflows
 
-**Quality Check:**
+**End-to-end data pipeline:**
 ```
-Validate the app and show me any test failures
+1. Show me what tables are in the main.raw catalog
+2. Create a new bundle for an ETL pipeline
+3. Deploy it to the dev environment
+4. Run the pipeline and show me the results
 ```
 
-**Production:**
+**Multi-environment deployment:**
 ```
-Deploy the app to Databricks as "orders-dashboard"
+Validate my bundle, then deploy it to dev, staging, and production environments
 ```
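The multi-environment prompt assumes the bundle defines matching targets in `databricks.yml`. As a sketch with targets named `dev`, `staging`, and `prod` (names assumed from the prompt's wording):

```bash
databricks bundle validate
for target in dev staging prod; do
  databricks bundle deploy -t "$target"
done
```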
 
 ---
 
-## Why This Approach Works
+## Benefits
 
-### Traditional Development vs. Databricks MCP
+### Natural Language Interface
 
-| Traditional Approach | With Databricks MCP |
-|---------------------|-------------|
-| Manual project setup (hours) | Instant scaffolding (seconds) |
-| Configure build tools manually | Pre-configured and tested |
-| Set up testing infrastructure | Built-in test suite |
-| Manual code changes and debugging | AI-powered development with validation |
-| Local testing only | Containerized validation (reproducible) |
-| Manual deployment setup | Automated deployment to Databricks |
-| **Time to production: days/weeks** | **Time to production: minutes** |
+Instead of memorizing complex CLI commands and flags, you can:
+- Ask questions in plain English
+- Get context-aware command suggestions
+- Execute commands through conversation
+- Receive explanations of results
 
-### Key Advantages
+### Workspace Awareness
 
-**1. Scaffolding + Validation = Quality**
-- Start with a working, tested template
-- Every change is validated before deployment
-- No broken builds reach production
+The `explore` tool provides:
+- Automatic workspace configuration detection
+- SQL warehouse information
+- Authentication profile details
+- Relevant command examples based on your setup
 
-**2. Natural Language = Productivity**
-- Describe what you want, not how to build it
-- AI handles implementation details
-- Focus on requirements, not configuration
+### Unified Workflow
 
-**3. End-to-End Workflow = Simplicity**
-- Single tool for entire lifecycle
-- No context switching between tools
-- Seamless progression from idea to deployment
+Work with all Databricks functionality from one place:
+- **Data exploration**: Query catalogs, schemas, and tables
+- **Bundle management**: Create, validate, and deploy DABs
+- **App deployment**: Deploy and manage Databricks Apps
+- **Workspace operations**: Manage jobs, clusters, and notebooks
 
-### What Makes It Production-Ready
+### Safe Command Execution
 
-The Databricks MCP server doesn't just generate code—it ensures quality:
-
-- **TypeScript** - Type safety catches errors early
-- **Build verification** - Ensures code compiles
-- **Test suite** - Validates functionality
-- **Linting** - Enforces code quality
-- **Databricks integration** - Native SDK usage
+The `invoke_databricks_cli` tool:
+- Allows users to allowlist specific commands
+- Provides clear tracking of executed operations
+- Maintains an audit trail of AI actions
+- Prevents unauthorized operations
 
 ---
 
@@ -290,18 +223,14 @@ databricks experimental apps-mcp --warehouse-id <warehouse-id>
 
 # Enable workspace tools
 databricks experimental apps-mcp --warehouse-id <warehouse-id> --with-workspace-tools
-
-# Enable deployment
-databricks experimental apps-mcp --warehouse-id <warehouse-id> --allow-deployment
 ```
 
 ### CLI Flags
 
 | Flag | Description | Default |
 |------|-------------|---------|
-| `--warehouse-id` | Databricks SQL Warehouse ID (required) | - |
+| `--warehouse-id` | Databricks SQL Warehouse ID (required for SQL queries) | - |
 | `--with-workspace-tools` | Enable workspace file operations | `false` |
-| `--allow-deployment` | Enable deployment operations | `false` |
 | `--help` | Show help | - |
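Combining the flags above, a typical launch line looks like the following (the warehouse ID is a placeholder):

```bash
# Minimal: SQL access only
databricks experimental apps-mcp --warehouse-id abc123def456

# With workspace file operations enabled
databricks experimental apps-mcp --warehouse-id abc123def456 --with-workspace-tools
```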
 
 ### Environment Variables
@@ -312,7 +241,6 @@ databricks experimental apps-mcp --warehouse-id <warehouse-id> --allow-deploymen
 | `DATABRICKS_TOKEN` | Databricks personal access token | `dapi...` |
 | `WAREHOUSE_ID` | Databricks SQL warehouse ID (preferred) | `abc123def456` |
 | `DATABRICKS_WAREHOUSE_ID` | Alternative name for warehouse ID | `abc123def456` |
-| `ALLOW_DEPLOYMENT` | Enable deployment operations | `true` or `false` |
 | `WITH_WORKSPACE_TOOLS` | Enable workspace tools | `true` or `false` |
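The same settings can be supplied through the environment instead of flags, assuming the server reads the variables as the table implies (values are placeholders):

```bash
export DATABRICKS_TOKEN="dapi..."    # personal access token
export WAREHOUSE_ID="abc123def456"   # preferred warehouse variable
export WITH_WORKSPACE_TOOLS=true     # enable workspace tools

databricks experimental apps-mcp
```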
 
 ### Authentication
