feat: Add adb-coding-assistants-cluster module#227
feat: Add adb-coding-assistants-cluster module#227dgokeeffe wants to merge 2 commits intodatabricks:mainfrom
Conversation
Add Terraform module for deploying Claude Code CLI on Databricks clusters with MLflow tracing integration. Features: - Claude Code CLI installation with Node.js runtime - Databricks authentication integration via proxy endpoints - MLflow tracing for Claude Code sessions - VS Code/Cursor Remote SSH support - Token refresh helpers and cron automation - Databricks skills for common patterns - Network dependency validation script - Minimal installation option for constrained environments The module includes init scripts that: - Install Claude Code CLI and dependencies - Configure authentication via DATABRICKS_TOKEN - Set up bashrc helpers for token management - Support profile-based Azure authentication - Disable experimental betas for stability Co-authored-by: Cursor <[email protected]>
There was a problem hiding this comment.
Pull request overview
Adds a new Terraform module + example to provision a Databricks cluster that installs/configures Claude Code CLI (with MLflow tracing) via init scripts, plus helper scripts for Remote SSH setup and network dependency checks.
Changes:
- Introduces
adb-coding-assistants-clusterTerraform module (UC volume + init script upload + cluster creation + outputs). - Adds installation and helper scripts (full + minimal installer, VS Code/Cursor Remote SSH helper, network dependency checker) and accompanying docs.
- Adds an end-to-end example deployment (providers/auth options, tfvars template, outputs, documentation) and indexes it in the repo README.
Reviewed changes
Copilot reviewed 20 out of 21 changed files in this pull request and generated 11 comments.
Show a summary per file
| File | Description |
|---|---|
| modules/adb-coding-assistants-cluster/versions.tf | Defines Terraform + Databricks provider constraints for the new module. |
| modules/adb-coding-assistants-cluster/variables.tf | Adds module inputs for cluster/volume/tracing configuration. |
| modules/adb-coding-assistants-cluster/main.tf | Creates UC volume, uploads init script, provisions single-user cluster wiring init script. |
| modules/adb-coding-assistants-cluster/outputs.tf | Exposes cluster and init-script/volume details for consumers. |
| modules/adb-coding-assistants-cluster/README.md | Documents module usage, assumptions, and generated TF docs. |
| modules/adb-coding-assistants-cluster/Makefile | Adds terraform-docs generation/check targets for module docs. |
| modules/adb-coding-assistants-cluster/scripts/install-claude.sh | Full online installer + bash helpers for tokens, tracing, and VS Code guidance. |
| modules/adb-coding-assistants-cluster/scripts/install-claude-minimal.sh | Minimal installer with basic PATH + env var setup. |
| modules/adb-coding-assistants-cluster/scripts/vscode-setup.sh | Standalone Remote SSH setup/check/settings generator for IDEs. |
| modules/adb-coding-assistants-cluster/scripts/check-network-deps.sh | Preflight connectivity validator for required external domains. |
| modules/adb-coding-assistants-cluster/scripts/README.md | Documents the scripts and operational guidance. |
| examples/adb-coding-assistants-cluster/versions.tf | Sets example Terraform version constraint. |
| examples/adb-coding-assistants-cluster/variables.tf | Example inputs including auth selection validation. |
| examples/adb-coding-assistants-cluster/providers.tf | Example provider config (profile vs Azure resource-id path). |
| examples/adb-coding-assistants-cluster/main.tf | Wires example variables into the new module. |
| examples/adb-coding-assistants-cluster/outputs.tf | Prints cluster+volume outputs and a user-facing instruction block. |
| examples/adb-coding-assistants-cluster/README.md | Step-by-step example deployment + post-deploy workflow. |
| examples/adb-coding-assistants-cluster/terraform.tfvars.example | Provides an example tfvars template for quick start. |
| examples/adb-coding-assistants-cluster/Makefile | Adds terraform-docs generation/check targets for example docs. |
| README.md | Adds the new example/module to the repository index tables. |
| .gitignore | Ignores terraform.tfvars and *.plan files (keeps tfvars.example). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| fi | ||
| fi | ||
|
|
||
| W="${DATABRICKS_HOST}" |
There was a problem hiding this comment.
install-claude.sh runs with set -u, but setup_bashrc dereferences DATABRICKS_HOST without a default. If DATABRICKS_HOST is not present in the init-script environment (common), this will abort the init script and can fail cluster startup. Use a safe expansion (e.g., ${DATABRICKS_HOST:-}) and/or avoid substituting host at init time (leave resolution to login-time env vars).
| W="${DATABRICKS_HOST}" | |
| W="${DATABRICKS_HOST:-}" |
| local venv_path | ||
| venv_path=$(claude-vscode-env 2>/dev/null) | ||
|
|
||
| echo "=== VS Code/Cursor settings.json Configuration ===" | ||
| echo "" | ||
| echo "Add this to your VS Code/Cursor settings.json:" | ||
| echo "" | ||
| echo "{" | ||
| echo " \"remote.SSH.defaultExtensions\": [" | ||
| echo " \"ms-Python.python\"," | ||
| echo " \"ms-toolsai.jupyter\"" | ||
| echo " ]" | ||
| if [ $? -eq 0 ] && [ -n "$venv_path" ]; then | ||
| echo "," | ||
| echo " \"python.defaultInterpreterPath\": \"$venv_path/bin/python\"" | ||
| fi | ||
| echo "}" | ||
| echo "" | ||
| if [ $? -eq 0 ] && [ -n "$venv_path" ]; then |
There was a problem hiding this comment.
claude-vscode-config checks $? long after venv_path=$(...), but multiple echo calls overwrite $? to 0. This makes the success checks effectively meaningless. Capture the exit status immediately (e.g., rc=$?) or just key off -n "$venv_path" (and/or have claude-vscode-env print nothing on failure) so the conditional reflects the actual detection result.
| fi | ||
|
|
||
| log "Installing Claude Code CLI..." | ||
| if curl -fsSL https://claude.ai/install.sh | bash &>>$L; then |
There was a problem hiding this comment.
Piping a remote script directly into bash is a supply-chain risk (no integrity verification, TOCTOU exposure). Prefer downloading to a temporary file, validating integrity (checksum/signature or pinned version), then executing it; at minimum, write the script to disk and log/inspect it before running.
| # Install Claude Code CLI | ||
| if ! command -v claude >/dev/null 2>&1; then | ||
| log "Installing Claude Code CLI..." | ||
| curl -fsSL https://claude.ai/install.sh | bash >> "$LOG_FILE" 2>&1 |
There was a problem hiding this comment.
Same issue as the full installer: curl | bash executes unverified remote content. Use a download + verification step (checksum/signature/pinned version) before execution (or mirror the installer internally for controlled environments).
| curl -fsSL https://claude.ai/install.sh | bash >> "$LOG_FILE" 2>&1 | |
| npm install -g @anthropic-ai/claude-code >> "$LOG_FILE" 2>&1 |
|
|
||
| # Install MLflow with Databricks support | ||
| log "Installing MLflow with Databricks support..." | ||
| if pip install --quiet --upgrade "mlflow[databricks]>=3.4" &>>$L; then |
There was a problem hiding this comment.
Using pip directly in init scripts can install into an unexpected interpreter (or fail if pip isn’t on PATH / points to a different Python). Prefer python3 -m pip install ... (and consider explicitly targeting the Databricks runtime env if required) so the installed mlflow CLI matches the Python environment you later invoke.
| if pip install --quiet --upgrade "mlflow[databricks]>=3.4" &>>$L; then | |
| if python3 -m pip install --quiet --upgrade "mlflow[databricks]>=3.4" &>>$L; then |
| | Script | Purpose | Network Required | | ||
| |--------|---------|------------------| | ||
| | `install-claude.sh` | Online installation (default) | ✅ Yes | |
There was a problem hiding this comment.
The scripts overview table lists only install-claude.sh, but this directory also includes install-claude-minimal.sh, vscode-setup.sh, and check-network-deps.sh. Also, the doc claims wget is installed, but the installer installs curl git jq (and minimal installs curl git)—either install wget or update the documentation to match actual behavior.
| - ✅ **Node.js 20.x** - Required runtime for Claude CLI | ||
| - ✅ **Claude Code CLI** - AI coding assistant | ||
| - ✅ **MLflow** - For tracing Claude interactions | ||
| - ✅ **System tools** - curl, wget, git, jq |
There was a problem hiding this comment.
The scripts overview table lists only install-claude.sh, but this directory also includes install-claude-minimal.sh, vscode-setup.sh, and check-network-deps.sh. Also, the doc claims wget is installed, but the installer installs curl git jq (and minimal installs curl git)—either install wget or update the documentation to match actual behavior.
| │ Databricks Cluster (on startup) │ | ||
| │ │ | ||
| │ 1. Executes init script from volume │ | ||
| │ 2. Installs Node.js, OpenCode, Claude CLI │ |
There was a problem hiding this comment.
The module README describes installing/configuring OpenCode (opencode) and generating ~/.opencode/config.json, but the provided init scripts in this PR don’t install OpenCode or add related helpers. Either implement OpenCode installation/configuration in the init script(s), or remove/update these README sections to avoid incorrect guidance.
| │ • DATABRICKS_TOKEN available from environment │ | ||
| │ • Configs auto-generate: │ | ||
| │ - ~/.claude/settings.json │ | ||
| │ - ~/.opencode/config.json │ | ||
| │ • Commands ready: claude, opencode │ |
There was a problem hiding this comment.
The module README describes installing/configuring OpenCode (opencode) and generating ~/.opencode/config.json, but the provided init scripts in this PR don’t install OpenCode or add related helpers. Either implement OpenCode installation/configuration in the init script(s), or remove/update these README sections to avoid incorrect guidance.
| local cron_cmd="[ -n \"\$DATABRICKS_TOKEN\" ] && [ -n \"\$DATABRICKS_HOST\" ] && source \"\$HOME/.bashrc\" && _check_and_refresh_token >/dev/null 2>&1" | ||
| local cron_job="0 * * * * $cron_cmd" |
There was a problem hiding this comment.
cron_cmd and cron_job are defined but not used (the function later installs cron_file directly into crontab). Removing unused locals (or actually using cron_job) will reduce confusion and keep the function as a single source of truth for the scheduled command.
| local cron_cmd="[ -n \"\$DATABRICKS_TOKEN\" ] && [ -n \"\$DATABRICKS_HOST\" ] && source \"\$HOME/.bashrc\" && _check_and_refresh_token >/dev/null 2>&1" | |
| local cron_job="0 * * * * $cron_cmd" |
- Add safe expansion for DATABRICKS_HOST to prevent crash under set -u - Remove unused local variables in claude-setup-token-refresh() - Fix broken $? check in claude-vscode-config() by capturing exit code - Use python3 -m pip instead of bare pip for safer execution - Add supply-chain verification comment to minimal installer - Fix node_type_id description to match actual default value - Update scripts README with all available scripts - Remove wget from system tools list (not installed) - Remove OpenCode references throughout documentation Co-Authored-By: Claude Opus 4.5 <[email protected]>
Summary
This PR adds a new Terraform module for deploying Claude Code CLI on Databricks clusters with MLflow tracing integration.
Key Features
Module Structure
modules/adb-coding-assistants-cluster/- Main module with full featuresexamples/adb-coding-assistants-cluster/- Example deployment configurationInit Scripts
Authentication
The module configures Claude Code to use Databricks as the model provider:
Helper Commands
The init scripts add several helper commands to the cluster:
check-claude- Verify installation statusclaude-refresh-token- Regenerate authentication settingsclaude-token-status- Check token freshnessclaude-tracing-enable/disable/status- Manage MLflow tracingclaude-vscode-setup- Remote SSH setup guideTest Plan
Made with Cursor