
Commit 5759c24

igrekun, arsenyinfo, and fjakobs authored
Refactor apps-mcp to use CLI-based approach (#4003)
This refactors the apps-mcp server to use CLI commands instead of direct API providers, significantly simplifying the architecture and leveraging existing bundle command functionality.

## Changes

**New CLI-based provider:**
- Add experimental/apps-mcp/lib/providers/clitools package
- Implement workspace exploration via CLI commands
- Add invoke_databricks_cli helper for executing CLI commands
- Update prompts to support apps exploration workflow

**Removed providers and templates:**
- Remove databricks provider (replaced by CLI invocation)
- Remove IO provider (scaffolding/validation, now handled by bundle commands)
- Remove deployment provider (superseded by bundle deploy commands)
- Remove entire templates system including trpc template

**Clean up old development features:**
- Remove cmd/workspace/apps/dev.go and vite bridge
- Remove vite development server integration
- Drop experimental development workflow in favor of bundle-based approach

## Why

This change reduces code complexity while providing a more maintainable architecture that reuses existing CLI commands rather than duplicating API logic.

Co-authored-by: Arseny Kravchenko <[email protected]>
Co-authored-by: Fabian Jakobs <[email protected]>
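To make the mapping concrete: each removed provider corresponds to functionality the Databricks CLI already exposes. A minimal sketch of the kinds of invocations the server now delegates to (illustrative commands only, not taken from this diff):

```
databricks bundle init       # scaffolding, formerly the templates system
databricks bundle validate   # validation, formerly the IO provider
databricks bundle deploy     # deployment, formerly the deployment provider
databricks apps list         # API access, formerly the databricks provider
```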
1 parent 89cea48 commit 5759c24

38 files changed (+1633, -3017 lines)

experimental/apps-mcp/README.md

Lines changed: 90 additions & 170 deletions
@@ -1,12 +1,10 @@
# Databricks MCP Server

-A Model Context Protocol (MCP) server for generating production-ready Databricks applications with testing,
-linting and deployment setup from a single prompt. This agent relies heavily on scaffolding and
-extensive validation to ensure high-quality outputs.
+A Model Context Protocol (MCP) server for working with Databricks through natural language. This server provides tools for data exploration, workspace management, and executing Databricks CLI commands through AI-powered conversations.

## TL;DR

-**Primary Goal:** Create and deploy production-ready Databricks applications from a single natural language prompt. This MCP server combines scaffolding, validation, and deployment into a seamless workflow that goes from idea to running application.
+**Primary Goal:** Interact with Databricks workspaces, manage Databricks Asset Bundles (DABs), deploy Databricks Apps, and query data through natural language conversations.

**How it works:**
1. **Explore your data** - Query Databricks catalogs, schemas, and tables to understand your data
@@ -16,11 +14,11 @@ extensive validation to ensure high-quality outputs.
5. **Deploy confidently** - Push validated apps directly to Databricks Apps platform

**Why use it:**
-- **Speed**: Go from concept to deployed Databricks app in minutes, not hours or days
-- **Quality**: Extensive validation ensures your app builds, passes tests, and is production-ready
-- **Simplicity**: One natural language conversation handles the entire workflow
+- **Conversational interface**: Work with Databricks using natural language instead of memorizing CLI commands
+- **Context-aware**: Get relevant command suggestions based on your workspace configuration
+- **Unified workflow**: Combine data exploration, bundle management, and app deployment in one tool

-Perfect for data engineers and developers who want to build Databricks apps without the manual overhead of project setup, configuration, testing infrastructure, and deployment pipelines.
+Perfect for data engineers and developers who want to streamline their Databricks workflows with AI-powered assistance.

---

@@ -52,16 +50,18 @@ Perfect for data engineers and developers who want to build Databricks apps with

Try this in your MCP client:
```
-Create a Databricks app that shows sales data from main.sales.transactions
-with a chart showing revenue by region. Deploy it as "sales-dashboard".
+Explore my Databricks workspace and show me what catalogs are available
```

-The AI will:
-- Explore your Databricks tables
-- Generate a full-stack application
-- Customize it based on your requirements
-- Validate it passes all tests
-- Deploy it to Databricks Apps
+```
+Initialize a new Databricks Asset Bundle for a data pipeline project
+```
+
+```
+Query the main.sales.transactions table and show me the top 10 customers by revenue
+```
+
+The AI will use the appropriate Databricks tools to help you complete these tasks.

---

@@ -92,210 +92,143 @@ Then restart your MCP client for changes to take effect

## Features

-All features are designed to support the end-to-end workflow of creating production-ready Databricks applications:
-
-### 1. Data Exploration (Foundation)
-
-Understand your Databricks data before building:
-
-- **`databricks_list_catalogs`** - Discover available data catalogs
-- **`databricks_list_schemas`** - Browse schemas in a catalog
-- **`databricks_find_tables`** - Find tables in a schema
-- **`databricks_describe_table`** - Get table details, columns, and sample data
-- **`databricks_execute_query`** - Test queries and preview data
-
-*These tools help the AI understand your data structure so it can generate relevant application code.*
-
-### 2. Application Generation (Core)
+The Databricks MCP server provides CLI-based tools for workspace interaction:

-Create the application structure:
+Execute Databricks CLI commands and explore workspace resources:

-- **`scaffold_databricks_app`** - Generate a full-stack TypeScript application
-  - Modern stack: Node.js, TypeScript, React, tRPC
-  - Pre-configured build system, linting, and testing
-  - Production-ready project structure
-  - Databricks SDK integration
+- **`explore`** - Discover workspace resources and get CLI command recommendations
+  - Lists workspace URL, SQL warehouse details, and authentication profiles
+  - Provides command examples for jobs, clusters, catalogs, tables, and workspace files
+  - Gives workflow guidance for Databricks Asset Bundles and Apps

-*This is the foundation of your application - a working, tested template ready for customization.*
+- **`invoke_databricks_cli`** - Execute any Databricks CLI command
+  - Run bundle commands: `bundle init`, `bundle validate`, `bundle deploy`, `bundle run`
+  - Run apps commands: `apps deploy`, `apps list`, `apps get`, `apps start`, `apps stop`
+  - Run workspace commands: `workspace list`, `workspace export`, `jobs list`, `clusters list`
+  - Run catalog commands: `catalogs list`, `schemas list`, `tables list`
+  - Supports all Databricks CLI functionality with proper user allowlisting

-### 3. Validation (Quality Assurance)
-
-Ensure production-readiness before deployment:
-
-- **`validate_databricks_app`** - Comprehensive validation
-  - Build verification (npm build)
-  - Type checking (TypeScript compiler)
-  - Test execution (full test suite)
-
-*This step guarantees your application is tested and ready for production before deployment.*
-
-### 4. Deployment (Production Release)
-
-Deploy validated applications to Databricks:
-
-- **`deploy_databricks_app`** - Push to Databricks Apps platform
-  - Automatic deployment configuration
-  - Environment management
-  - Production-grade setup
-
-*The final step: your validated application running on Databricks.*
+*These tools provide a conversational interface to the full Databricks CLI, including Unity Catalog exploration and SQL query execution.*
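As a sketch of the plain CLI calls behind the catalog bullets above (assuming a configured Databricks CLI profile; exact arguments may vary by CLI version):

```
databricks catalogs list               # discover available catalogs
databricks schemas list main           # browse schemas in the main catalog
databricks tables list main sales      # list tables in main.sales
```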

---

## Example Usage

-Here are example conversations showing the end-to-end workflow for creating Databricks applications:
-
-### Complete Workflow: Analytics Dashboard
+Here are example conversations showing common workflows:

-This example shows how to go from data exploration to deployed application:
+### Data Exploration

-**User:**
+**Explore workspace resources:**
```
-I want to create a Databricks app that visualizes customer purchases. The data is
-in the main.sales catalog. Show me what tables are available and create a dashboard
-with charts for total revenue by region and top products. Deploy it as "sales-insights".
+Explore my Databricks workspace and show me what's available
```

-**What happens:**
-1. **Data Discovery** - AI lists schemas and tables in main.sales
-2. **Data Inspection** - AI describes the purchases table structure
-3. **App Generation** - AI scaffolds a TypeScript application
-4. **Customization** - AI adds visualization components and queries
-5. **Validation** - AI runs build, type check, and tests in container
-6. **Deployment** - AI deploys to Databricks Apps as "sales-insights"
-
-**Result:** A production-ready Databricks app running in minutes with proper testing.
-
----
-
-### Quick Examples for Specific Use Cases
-
-#### Data App from Scratch
-
+**Query data:**
```
-Create a Databricks app in ~/projects/user-analytics that shows daily active users
-from main.analytics.events. Include a line chart and data table.
+Show me the schema of the main.sales.transactions table and give me a sample of 10 rows
```

-#### Real-Time Monitoring Dashboard
-
+**Find specific tables:**
```
-Build a monitoring dashboard for the main.logs.system_metrics table. Show CPU,
-memory, and disk usage over time. Add alerts for values above thresholds.
+Find all tables in the main catalog that contain the word "customer"
```

-#### Report Generator
+### Databricks Asset Bundles (DABs)

+**Create a new bundle project:**
```
-Create an app that generates weekly reports from main.sales.transactions.
-Include revenue trends, top customers, and product performance. Add export to CSV.
+Initialize a new Databricks Asset Bundle for a data pipeline project
```

-#### Data Quality Dashboard
-
+**Deploy a bundle:**
```
-Build a data quality dashboard for main.warehouse.inventory. Check for nulls,
-duplicates, and out-of-range values. Show data freshness metrics.
+Validate and deploy my Databricks bundle to the dev environment
```

----
-
-### Working with Existing Applications
-
-Once an app is scaffolded, you can continue development through conversation:
-
+**Run a job from a bundle:**
```
-Add a filter to show only transactions from the last 30 days
+Run the data_processing job from my bundle
```

-```
-Update the chart to use a bar chart instead of line chart
-```
+### Databricks Apps

+**Initialize an app from template:**
```
-Add a new API endpoint to fetch customer details
+Initialize a new Streamlit app using the Databricks bundle template
```

+**Deploy an app:**
```
-Run the tests and fix any failures
+Deploy my app in the current directory to Databricks Apps as "sales-dashboard"
```

+**Manage apps:**
```
-Add error handling for failed database queries
+List all my Databricks Apps and show me their status
```

----
+### Working with Jobs and Clusters

-### Iterative Development Workflow
-
-**Initial Request:**
+**List and inspect jobs:**
```
-Create a simple dashboard for main.sales.orders
+Show me all jobs in the workspace and their recent run status
```

-**Refinement:**
+**Get cluster details:**
```
-Add a date range picker to filter orders
+List all clusters and show me the configuration of the production cluster
```

-**Enhancement:**
-```
-Include a summary card showing total orders and revenue
-```
+### Complex Workflows

-**Quality Check:**
+**End-to-end data pipeline:**
```
-Validate the app and show me any test failures
+1. Show me what tables are in the main.raw catalog
+2. Create a new bundle for an ETL pipeline
+3. Deploy it to the dev environment
+4. Run the pipeline and show me the results
```

-**Production:**
+**Multi-environment deployment:**
```
-Deploy the app to Databricks as "orders-dashboard"
+Validate my bundle, then deploy it to dev, staging, and production environments
```

---

-## Why This Approach Works
-
-### Traditional Development vs. Databricks MCP
+## Benefits

-| Traditional Approach | With Databricks MCP |
-|---------------------|-------------|
-| Manual project setup (hours) | Instant scaffolding (seconds) |
-| Configure build tools manually | Pre-configured and tested |
-| Set up testing infrastructure | Built-in test suite |
-| Manual code changes and debugging | AI-powered development with validation |
-| Local testing only | Containerized validation (reproducible) |
-| Manual deployment setup | Automated deployment to Databricks |
-| **Time to production: days/weeks** | **Time to production: minutes** |
+### Natural Language Interface

-### Key Advantages
+Instead of memorizing complex CLI commands and flags, you can:
+- Ask questions in plain English
+- Get context-aware command suggestions
+- Execute commands through conversation
+- Receive explanations of results

-**1. Scaffolding + Validation = Quality**
-- Start with a working, tested template
-- Every change is validated before deployment
-- No broken builds reach production
+### Workspace Awareness

-**2. Natural Language = Productivity**
-- Describe what you want, not how to build it
-- AI handles implementation details
-- Focus on requirements, not configuration
+The `explore` tool provides:
+- Automatic workspace configuration detection
+- SQL warehouse information
+- Authentication profile details
+- Relevant command examples based on your setup

-**3. End-to-End Workflow = Simplicity**
-- Single tool for entire lifecycle
-- No context switching between tools
-- Seamless progression from idea to deployment
+### Unified Workflow

-### What Makes It Production-Ready
+Work with all Databricks functionality from one place:
+- **Data exploration**: Query catalogs, schemas, and tables
+- **Bundle management**: Create, validate, and deploy DABs
+- **App deployment**: Deploy and manage Databricks Apps
+- **Workspace operations**: Manage jobs, clusters, and notebooks

-The Databricks MCP server doesn't just generate code—it ensures quality:
+### Safe Command Execution

-- **TypeScript** - Type safety catches errors early
-- **Build verification** - Ensures code compiles
-- **Test suite** - Validates functionality
-- **Linting** - Enforces code quality
-- **Databricks integration** - Native SDK usage
+The `invoke_databricks_cli` tool:
+- Allows users to allowlist specific commands
+- Provides better tracking of executed operations
+- Maintains audit trail of AI actions
+- Prevents unauthorized operations

---

@@ -308,29 +241,16 @@ The Databricks MCP server doesn't just generate code—it ensures quality:
databricks experimental apps-mcp install

# Start MCP server (default mode)
-databricks experimental apps-mcp --warehouse-id <warehouse-id>
-
-# Enable workspace tools
-databricks experimental apps-mcp --warehouse-id <warehouse-id> --with-workspace-tools
+databricks experimental apps-mcp
```

-### CLI Flags
-
-| Flag | Description | Default |
-|------|-------------|---------|
-| `--warehouse-id` | Databricks SQL Warehouse ID (required) | - |
-| `--with-workspace-tools` | Enable workspace file operations | `false` |
-| `--help` | Show help | - |
-
### Environment Variables

| Variable | Description | Example |
|----------|-------------|---------|
| `DATABRICKS_HOST` | Databricks workspace URL | `https://your-workspace.databricks.com` |
-| `DATABRICKS_TOKEN` | Databricks personal access token | `dapi...` |
| `WAREHOUSE_ID` | Databricks SQL warehouse ID (preferred) | `abc123def456` |
| `DATABRICKS_WAREHOUSE_ID` | Alternative name for warehouse ID | `abc123def456` |
-| `WITH_WORKSPACE_TOOLS` | Enable workspace tools | `true` or `false` |

### Authentication

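For reference, a minimal startup sketch using the environment variables from the table above (placeholder values):

```
export DATABRICKS_HOST="https://your-workspace.databricks.com"   # your workspace URL
export WAREHOUSE_ID="abc123def456"                               # your SQL warehouse ID
databricks experimental apps-mcp                                 # start the MCP server
```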