experimental: import ai-dev-kit skills into experimental/ directory by jamesbroadhead · Pull Request #73 · databricks/databricks-agent-skills

jamesbroadhead · 2026-05-12T15:52:22Z

Summary

Adds an experimental/ directory containing 18 agent skills from databricks-solutions/ai-dev-kit databricks-skills/, imported as a snapshot on a best-effort basis. Excluded:

databricks-model-serving (TODO #1b: different surface than stable, heavy MCP coupling)
databricks-spark-declarative-pipelines (TODO #5: different surface than stable databricks-pipelines)
databricks-genie — removed during review per @lennartkats-db; deferred to a future revision
databricks-lakebase-provisioned — not in the upstream experimental branch either

The manifest exposes both stable and experimental skills in a single skills map. Each entry carries a repo_dir field ("skills" or "experimental") that points to the directory the skill lives in. Consumers derive experimental state from repo_dir — there is no parallel experimental_skills map and no per-skill experimental bool.
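A minimal sketch of how a consumer might derive experimental state from `repo_dir`. Only the `skills` map and the `repo_dir` field come from the description above; the `version` values and the helper names `is_experimental` and `partition_skills` are illustrative assumptions, not the real manifest or consumer code.

```python
import json

# Hypothetical manifest excerpt: only `skills` and `repo_dir` are taken from
# the PR description; the version values are illustrative.
MANIFEST_JSON = """
{
  "skills": {
    "databricks-jobs":    {"repo_dir": "skills",       "version": "0.2.0"},
    "databricks-iceberg": {"repo_dir": "experimental", "version": "0.0.1"}
  }
}
"""


def is_experimental(entry: dict) -> bool:
    """Experimental state is derived from repo_dir; there is no separate bool."""
    return entry.get("repo_dir") == "experimental"


def partition_skills(manifest: dict) -> tuple:
    """Return (stable_names, experimental_names), each sorted by name."""
    stable, experimental = [], []
    for name, entry in manifest["skills"].items():
        (experimental if is_experimental(entry) else stable).append(name)
    return sorted(stable), sorted(experimental)


manifest = json.loads(MANIFEST_JSON)
stable_names, experimental_names = partition_skills(manifest)
print("stable:", stable_names)
print("experimental:", experimental_names)
```

Because the state is computed from one field, a skill cannot be listed as both stable and experimental at the same time.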

Paired with databricks/cli#5243 which teaches databricks aitools install (top-level) to:

read repo_dir and skip experimental entries by default,
install all of them with --experimental,
install one by name with --experimental required.
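The three install rules above can be paraphrased as a small selection function. The real logic lives in the Go CLI (databricks/cli#5243) and is not shown in this PR, so the function name `select_skills`, its parameters, and the error messages below are assumptions made for illustration.

```python
def select_skills(skills: dict, name=None, include_experimental: bool = False) -> list:
    """Pick which manifest entries to install (hypothetical paraphrase).

    - no name, no flag: stable skills only
    - no name, flag set: stable and experimental skills
    - name given: that one skill; an experimental one also requires the flag
    """
    def is_exp(skill_name: str) -> bool:
        return skills[skill_name].get("repo_dir") == "experimental"

    if name is not None:
        if name not in skills:
            raise KeyError(f"unknown skill: {name}")
        if is_exp(name) and not include_experimental:
            raise ValueError(f"{name} is experimental; pass --experimental to install it")
        return [name]

    return sorted(n for n in skills if include_experimental or not is_exp(n))


# Illustrative manifest entries (not the real manifest).
SKILLS = {
    "databricks-jobs": {"repo_dir": "skills"},
    "databricks-iceberg": {"repo_dir": "experimental"},
}

print(select_skills(SKILLS))
print(select_skills(SKILLS, include_experimental=True))
print(select_skills(SKILLS, name="databricks-iceberg", include_experimental=True))
```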

Experimental and stable skills install under their plain names (e.g. ~/.claude/skills/databricks-iceberg/); the upstream repo enforces name-uniqueness across both directories, so no install-side suffix is needed.

Source

Final sync at merge time was 20a92a3 on databricks-solutions/ai-dev-kit:experimental ("tests: outcome-oriented rewrite across 7 skills + strip MCP tool_modules from all manifests").

Initial import was 9c7a5b3 (head of a-d-k PR #533 on the appkit-on-experimental branch). PR #533 has since merged into experimental (7b07f18). The branch went through periodic re-syncs during review to pull in upstream updates.

The rename (databricks-app-python → databricks-apps-python) is preserved in the merged version, which is what prevents a 3rd skill-name collision with d-a-s's stable databricks-apps.

Direction caveat

In the Apr 28 thread (Slack link), Dustin's stated plan was to move databricks-agent-skills skills into ai-dev-kit's experimental branch as defaults. This PR went the other direction (a-d-k content → d-a-s/experimental). The plan post-merge is to invert the direction via git subtree — see TODO #3 below and a-d-k RFC PR #530.

TODOs / caveats for iteration

Name collisions. Resolved in this PR:
- 1a. databricks-jobs — merged into stable. Imported the comprehensive reference content from a-d-k's databricks-jobs skill into skills/databricks-jobs/, bumping version to 0.2.0. The merged skill keeps stable's scaffolding workflow + parent: databricks-core hierarchy + Codex agents/openai.yaml + compatibility note, and adds the experimental's full task-types reference (9 types), trigger types (6), notifications/health/retries/queues, and 7 worked end-to-end examples. Layered structure: SKILL.md as overview + four reference files (task-types.md, triggers-schedules.md, notifications-monitoring.md, examples.md). The experimental copy is removed.
  
  With the single-map manifest shape, collisions are no longer possible — _add_skill raises if the same skill name shows up under both skills/ and experimental/, so any future drift fails generation loudly.
- 1b. databricks-model-serving — dropped from this PR. After a deep compare, the two skills cover almost entirely different surfaces: stable is ops-focused (manage existing endpoints via CLI); experimental is dev-focused (build & ship MLflow models / GenAI agents with autolog → mlflow.pyfunc.log_model → databricks.agents.deploy() → query, with full Classical ML / Custom PyFunc / ResponsesAgent + LangGraph / UCFunctionToolkit / VectorSearchRetrieverTool coverage). Near-zero content overlap. Experimental version also has heavy MCP-tool dependency (60+ refs to ai-dev-kit's manage_serving_endpoint, etc., that don't exist in the d-a-s/databricks aitools flow). Follow-up: port the high-value dev-side content into the stable skill — classical-ml autolog patterns, Custom PyFunc signatures, ResponsesAgent pattern with the create_text_output_item helper-method gotcha, UCFunctionToolkit + VectorSearchRetrieverTool with resource passthrough for auth, the Foundation Model API endpoint table. Strip MCP refs; replace with CLI/SDK equivalents. Owners: @databricks/eng-apps-devex (per CODEOWNERS).
~~CODEOWNERS for experimental/~~ Resolved. Per @lennartkats-db review, /experimental/ owners are @lennartkats-db @simonfaltum @calreynolds @dustinvannoy-db (kept compact for now; broad participation invited via the broader collaborator set).
~~No sync mechanism with upstream a-d-k.~~ Resolved with a paired RFC. Two-part plan:
- Pre-lock (this PR): periodic manual re-syncs from upstream ai-dev-kit into experimental/. Documented in experimental/README.md.
- Post-lock (follow-up): invert the direction. a-d-k becomes the consumer; databricks-skills/imported/ in a-d-k is a git subtree of this repo's experimental/. An RFC PR is open against a-d-k: databricks-solutions/ai-dev-kit#530 ("RFC: subtree-sync skills from databricks-agent-skills/experimental", draft). To make subtree work, d-a-s needs to publish an experimental-only branch via git subtree split --prefix=experimental after every push to main; that is a small workflow to add here in a follow-up PR. A one-shot preview branch, experimental-only-preview, was pushed to this repo to enable the RFC demo and should be deleted once the auto-publish workflow lands.
~~No agent metadata.~~ Resolved. scripts/skills.py auto-generates agents/openai.yaml + copies shared assets for each experimental skill on generate, using SKILL.md frontmatter. A DISPLAY_NAME_OVERRIDES map handles names whose hyphen-titlecase rendering breaks well-known capitalisation (AI Functions, AI/BI Dashboards, MLflow Evaluation, Unstructured PDF Generation — fixed per @lennartkats-db review). Stubs are only written when missing so upstream a-d-k can override by shipping its own files.
~~databricks-pipelines was deliberately excluded.~~ Resolved. a-d-k doesn't ship a databricks-pipelines skill under that name, but it does ship databricks-spark-declarative-pipelines covering the same product. After a deep compare, that experimental version covers a different surface than stable. Removed experimental/databricks-spark-declarative-pipelines/ from this PR. Follow-up TODO (post-merge): port the high-value pieces into stable skills/databricks-pipelines/ — DLT migration guide, workflow A/B/C decision matrix, per-language performance reference, language-selection rules. Strip MCP-tool refs. Owners: @lennartkats-db / @camielstee-db (per CODEOWNERS).
~~spark-python-data-source naming exception.~~ Kept as-is. The skill is about the OSS Apache Spark 4+ PySpark DataSource API (building custom connector libraries), not a Databricks product — only lightly flavored with Databricks idioms. The convention break is acceptable given the content.
~~Versioning.~~ Resolved. Bumped the extract_version_from_skill fallback in scripts/skills.py from 0.0.0 → 0.0.1 so the manifest never reports 0.0.0. Applies to skills with no explicit version: in their SKILL.md frontmatter. Sync-safe: when upstream a-d-k eventually adds version fields, those win; until then, the manifest reports the floor.
~~installed_dir for experimental skills.~~ Reversed during review per @dustinvannoy-db. Originally proposed: every experimental skill installs to ~/.claude/skills/<name>-experimental/. Replaced with: experimental skills install under their plain name (e.g. ~/.claude/skills/databricks-iceberg/). Upstream guarantees name-uniqueness across skills/ and experimental/, and the manifest generator (scripts/skills.py _add_skill) raises on any future collision so drift fails loudly rather than silently overwriting. CLI-side change shipped in databricks/cli#5243 (6d9c479f: drop SourceName and the suffixing in normalizeManifest).
~~Excluded a-d-k content.~~ Confirmed scope. Excluded: TEMPLATE/ (template, not a skill), install_skills.sh + install_genie_code_skills.py (a-d-k's installers — we use the cli installer instead), databricks-builder-app/ (a Python app for a-d-k's builder UI), databricks-mcp-server/ (the a-d-k MCP server — separate concern from skills), databricks-tools-core/ (Python lib used by a-d-k tooling — no experimental skill references it), hooks/hooks.json (a-d-k plugin lifecycle hooks tied to ${CLAUDE_PLUGIN_ROOT}/.claude-plugin/setup.sh/check_update.sh — plugin-specific, not skill content), plus top-level repo metadata (.github/, LICENSE.md, README.md, VERSION, install.{sh,ps1}, etc.). Verified no experimental skill cross-references any excluded path.
~~README placement.~~ Verified. experimental/README.md retains the adapted a-d-k skill list with a top warning block; the root README.md has an "Experimental Skills" section with an install-by-name example. Install commands use the new top-level surface: databricks aitools install [name] --experimental.
~~Manifest shape.~~ Resolved. Replaced the original two-map design (top-level skills + experimental_skills plus per-skill experimental bool) with a single skills map where each entry's repo_dir field is the source of truth. The manifest generator (scripts/skills.py) raises a clear error if the same skill name appears under both skills/ and experimental/.
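The collision guard described in items 1a and "Manifest shape" can be sketched as follows. The real `_add_skill` in scripts/skills.py is not shown in this PR text, so the signature, entry fields, and error wording here are assumptions; the sketch only illustrates the "raise when a name appears under both directories" behaviour.

```python
def add_skill(skills: dict, name: str, repo_dir: str, version: str) -> None:
    """Add one entry to the single `skills` map (hypothetical stand-in for _add_skill).

    Raises ValueError if `name` is already present, so a skill that shows up
    under both skills/ and experimental/ fails manifest generation loudly.
    """
    if name in skills:
        existing = skills[name]["repo_dir"]
        raise ValueError(
            f"duplicate skill name {name!r}: found under both {existing}/ and {repo_dir}/"
        )
    skills[name] = {"repo_dir": repo_dir, "version": version}


skills_map = {}
add_skill(skills_map, "databricks-jobs", "skills", "0.2.0")
add_skill(skills_map, "databricks-iceberg", "experimental", "0.0.1")

try:
    # Simulates drift: the same name reappearing in the other directory.
    add_skill(skills_map, "databricks-jobs", "experimental", "0.0.1")
except ValueError as err:
    print("generation would fail:", err)
```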

Test plan

python3 scripts/skills.py generate regenerates the manifest cleanly.
python3 scripts/skills.py validate passes.
CI green on this branch.
Manual: databricks aitools install (no flag) installs only stable skills.
Manual: databricks aitools install --experimental installs both.
Manual: databricks aitools install databricks-iceberg errors because it's experimental.
Manual: databricks aitools install databricks-iceberg --experimental installs that one skill.

Post-merge follow-ups (reviewer-flagged)

Move reference files into references/ directories for consistency across skills (Dustin + Lennart).
Long-term-intention review of all 19 skills — confirm each is intended to stay (Dustin).
databricks-execution-compute/scripts/compute.py: when promoting out of experimental, rely on the Databricks CLI rather than the bundled compute script (Lennart, non-blocking).
databricks-mlflow-evaluation: review overlap with MLflow-repo skills (Lennart, non-blocking; cc'd @simonfaltum).
databricks-unstructured-pdf-generation: re-evaluate inclusion — "not very Databricks-specific" (Lennart, non-blocking; cc'd @dustinvannoy-db).
databricks-vector-search: "no longer experimental in Genie Code world" — move to stable or out (Lennart, non-blocking; cc'd @simonfaltum).
skills/databricks-jobs/SKILL.md: non-experimental skill additions should also propagate into Genie Code (Lennart, for @simonfaltum).

This pull request and its description were written by Claude.

The default DATABRICKS_SKILLS_REF pin is a release tag that pre-dates the experimental_skills manifest section (see databricks/databricks-agent-skills#73). Users who pass --experimental against that ref today silently get no experimental skills installed. Log a Warnf at install time pointing them at the env var override (=main, or a future release that includes the section). Helper: manifestHasExperimental(), unit-tested in source_test.go. Co-authored-by: Isaac

Replaces the previous import (a-d-k commit 2228c3e on add_appkit) with the head of a-d-k PR #533 (commit 9c7a5b3 on appkit-on-experimental), which targets a-d-k's experimental branch.

Changes:
- Refresh 23 experimental skill directories from the new source.
- Drop databricks-lakebase-provisioned (removed on a-d-k experimental).
- databricks-apps-python: rename + SKILL.md now leads with AppKit (TypeScript + React SDK) and demotes Python frameworks to alternatives; 6-mcp-approach.md replaced with 6-cli-approach.md.
- databricks-lakebase-autoscale/references/connection-patterns.md: change placeholder `user:password` to `<user>:<password>` so the secret scanner doesn't flag the doc-only example. Cosmetic only.
- Continue to exclude databricks-model-serving and databricks-spark-declarative-pipelines (PR #73 TODOs #1b and #5).
- Regenerate manifest.json and agents/openai.yaml stubs via scripts/skills.py generate.
- Update experimental/README.md provenance section with the new SHA, branch, and divergence notes.

Co-authored-by: Isaac

Merges the comprehensive jobs reference content from experimental/databricks-jobs/ into skills/databricks-jobs/ and removes the experimental copy.

What's new in stable databricks-jobs (v0.2.0):
- Full task-types reference (9 types: notebook, spark_python, python_wheel, sql, dbt, pipeline, spark_jar, run_job, for_each)
- All 6 trigger types with examples (cron, periodic, file_arrival, table_update, continuous, manual) + combining + pause/resume
- Notifications + health rules + retries + timeouts + queues
- 7 end-to-end worked examples (ETL, warehouse refresh, event-driven, ML training, multi-env, streaming, cross-job orchestration)
- run_if conditions, environments (serverless deps), permissions

What's retained from the prior stable skill:
- parent: databricks-core hierarchy
- Compatibility note + version metadata (bumped 0.1.0 → 0.2.0)
- Scaffolding workflow (databricks bundle init + CLAUDE.md/AGENTS.md template + project structure)
- Unit testing + development workflow sections
- agents/openai.yaml + assets/

Cleanups during the merge:
- Replaced the trigger-spam description with a terse one
- Normalized hard-coded /Workspace/Users/user@example.com/ paths in the imported reference files to /Workspace/Shared/

scripts/skills.py: updated SKILL_METADATA description for jobs to reflect the broader scope. Manifest regenerated; experimental count drops from 23 to 22.

Resolves PR #73 TODO #1a.

Co-authored-by: Isaac

Adds an experimental/ directory containing the 26 agent skills from databricks-solutions/ai-dev-kit. These are imported as a snapshot on a best-effort basis — they are not officially supported skills and follow a looser contract than skills/ (no agents/openai.yaml, no shared-asset sync, no SKILL_METADATA gate). The manifest now exposes them under a new top-level experimental_skills map so consumers can distinguish them from stable skills and skip them by default. scripts/skills.py handles the new directory; the existing generate / validate flow is unchanged for stable skills. Co-authored-by: Isaac

Owners are the top contributors to databricks-solutions/ai-dev-kit (>=10 commits at import time). The cross-org team @databricks-solutions/ai-dev-kit-maintainers is the canonical owner; this line can be replaced with that team handle if the team gets write access to this repo. Co-authored-by: Isaac

Adds a Provenance & sync model section to experimental/README.md:
- Transition phase: source of truth is upstream ai-dev-kit; this dir gets periodic manual re-syncs.
- Post-lock: source of truth is this repo; ai-dev-kit's databricks-skills/ becomes read-only.

Co-authored-by: Isaac

Adds two helpers to scripts/skills.py that run as part of `generate`:
- `ensure_experimental_codex_metadata` copies the shared assets/databricks.{svg,png} into each experimental skill (mirroring what stable skills get via sync_assets).
- `synthesize_openai_yaml` writes agents/openai.yaml from each experimental skill's SKILL.md frontmatter (display_name from the skill name, short_description from the first sentence of the frontmatter description, brand_color and icon paths fixed).

Both run only when the destination file is absent, so upstream ai-dev-kit can override by shipping its own openai.yaml or assets. This closes the cosmetic gap that experimental skills installed into Codex CLI would render without an icon or marketplace metadata.

Co-authored-by: Isaac
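A rough sketch of the three behaviours this commit describes: a display name derived from the skill name (with an override map for well-known capitalisation), a short description taken from the first sentence, and writing a stub only when the destination is absent. The helper names, the override entries, and their values are assumptions for illustration; the real `synthesize_openai_yaml` and `DISPLAY_NAME_OVERRIDES` are not shown in this PR text.

```python
import re
from pathlib import Path

# Illustrative overrides only; the real DISPLAY_NAME_OVERRIDES contents are not shown.
DISPLAY_NAME_OVERRIDES = {
    "databricks-aibi-dashboards": "Databricks AI/BI Dashboards",
    "databricks-mlflow-evaluation": "Databricks MLflow Evaluation",
}


def display_name(skill_name: str) -> str:
    """Hyphen-titlecase the skill name unless an override exists."""
    if skill_name in DISPLAY_NAME_OVERRIDES:
        return DISPLAY_NAME_OVERRIDES[skill_name]
    return " ".join(part.capitalize() for part in skill_name.split("-"))


def first_sentence(description: str) -> str:
    """Return the text up to the first sentence terminator followed by whitespace or end."""
    text = " ".join(description.split())  # collapse newlines and repeated spaces
    match = re.search(r"(.+?[.!?])(\s|$)", text)
    return match.group(1) if match else text


def write_if_absent(path: Path, content: str) -> bool:
    """Write `content` only when `path` does not exist; return True if written."""
    if path.exists():
        return False  # an upstream-shipped file wins
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(content, encoding="utf-8")
    return True


print(display_name("databricks-iceberg"))
print(display_name("databricks-mlflow-evaluation"))
print(first_sentence("Build Iceberg tables. Use when the user asks about Iceberg."))
```

Plain title-casing would render `mlflow` as `Mlflow` and `aibi` as `Aibi`, which is why an override map is needed at all.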

Same pattern as databricks-model-serving: the experimental version covers a different surface than the stable databricks-pipelines skill (workflow / scaffolding / DLT-migration / per-language performance vs feature reference / decision tree / common traps / format options). The DAB-coupled scaffolding workflow is also the specific concern Dustin flagged in the Apr 28 Slack thread for demo-generator flows. Removed experimental/databricks-spark-declarative-pipelines/; manifest regenerated (24 experimental skills). Follow-up TODO: port the high-value pieces (DLT migration guide, workflow A/B/C decision matrix, per-language performance reference, language-selection rules) into skills/databricks-pipelines/, stripping MCP-tool refs. Co-authored-by: Isaac

Adds the actual a-d-k commit hash (2228c3e on the add_appkit branch, 5 commits ahead of origin/main at import time) along with a note about the local deltas vs public main. Surfaces the key one: the databricks-app-python -> databricks-apps-python rename hadn't merged upstream, and pulling from the renamed version is what avoids a 3rd skill-name collision with d-a-s's stable databricks-apps. Co-authored-by: Isaac

extract_version_from_skill() falls back to a synthetic version when a skill's SKILL.md frontmatter has no version: field. The previous fallback was 0.0.0, which several install tools treat as "unset" rather than "first release". Bump to 0.0.1. Affects all 24 experimental skills (imported from ai-dev-kit without versions) plus the stable databricks-dabs skill. Skills with an explicit version are unchanged. Co-authored-by: Isaac
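As a minimal sketch of the fallback behaviour: the function below returns an explicit `version:` from the frontmatter when present and the 0.0.1 floor otherwise. It is a hypothetical stand-in, not the body of the real `extract_version_from_skill()`; the regex-based frontmatter handling and the constant name are assumptions.

```python
import re

FALLBACK_VERSION = "0.0.1"  # previously 0.0.0, which some tools read as "unset"


def extract_version(skill_md: str) -> str:
    """Return the frontmatter `version:` value, or the fallback when absent."""
    frontmatter = re.match(r"^---\n(.*?)\n---", skill_md, re.DOTALL)
    if frontmatter:
        for line in frontmatter.group(1).splitlines():
            found = re.match(r"^version:\s*[\"']?([^\"'\s]+)[\"']?\s*$", line)
            if found:
                return found.group(1)  # an explicit version always wins
    return FALLBACK_VERSION


WITH_VERSION = '---\nname: databricks-jobs\nversion: "0.2.0"\n---\n# Jobs\n'
WITHOUT_VERSION = "---\nname: databricks-iceberg\n---\n# Iceberg\n"

print(extract_version(WITH_VERSION))
print(extract_version(WITHOUT_VERSION))
```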

- experimental/README.md: install examples now use the -experimental suffix on the skill name + the --experimental flag (matching the install-path behaviour landed in databricks/cli#5243). Adds a short note explaining why the in-repo dir name and the install dir name differ.
- experimental/README.md: drop databricks-model-serving from the collision example (it was removed from this PR earlier).
- experimental/README.md: update the (also available as stable skill) note for databricks-jobs to point at the open TODO #1a.
- Root README: clarify the suffixed install name in the by-name install example.

Co-authored-by: Isaac

The previous regex-only parser in extract_description_from_skill() captured the YAML block-scalar indicator (`>-`) verbatim, so any SKILL.md that wrote `description: >-\n multi-line content` produced a manifest entry of `">-"`. The new ai-dev-kit import (PR #533) brought two such files — databricks-dbsql and databricks-execution-compute — which landed corrupted descriptions in manifest.json and corrupted short_description / default_prompt in agents/openai.yaml. Walk the frontmatter line by line: if the value is a block-scalar indicator (|, |-, |+, >, >-, >+), aggregate the indented continuation lines (folded with spaces for `>`-style, newlines for `|`-style). Regenerate manifest.json and the two affected agents/openai.yaml stubs. Co-authored-by: Isaac
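The line-by-line walk described in this commit might look roughly like the sketch below. It is a simplified illustration rather than the code that was committed: it ignores chomping differences between `-` and `+`, treats blank lines in folded scalars as plain separators, and the function name `extract_description` is an assumption.

```python
BLOCK_SCALAR_INDICATORS = {"|", "|-", "|+", ">", ">-", ">+"}


def extract_description(frontmatter: str) -> str:
    """Return the `description:` value, aggregating block-scalar continuation lines."""
    lines = frontmatter.splitlines()
    for index, line in enumerate(lines):
        if not line.startswith("description:"):
            continue
        value = line[len("description:"):].strip()
        if value not in BLOCK_SCALAR_INDICATORS:
            return value.strip("\"'")  # plain or quoted single-line value

        # Collect indented continuation lines until the next top-level key.
        parts = []
        for continuation in lines[index + 1:]:
            if continuation.strip() and not continuation.startswith((" ", "\t")):
                break
            parts.append(continuation.strip())
        while parts and not parts[-1]:
            parts.pop()  # drop trailing blank lines

        if value.startswith(">"):
            return " ".join(part for part in parts if part)  # folded: join with spaces
        return "\n".join(parts)  # literal: keep newlines
    return ""


FOLDED = "name: databricks-dbsql\ndescription: >-\n  Query data with\n  Databricks SQL.\nversion: 1\n"
print(extract_description(FOLDED))
```

With the earlier regex-only approach, the same input would have yielded the indicator `>-` itself as the description.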

Replaces the hand-rolled block-scalar walker (added one commit ago) with PyYAML's safe_load. PyYAML's default SafeLoader is pure-Python — no C extension required — and handles every YAML edge case for free instead of reimplementing them. Side-benefit: also fixes a second latent bug. The regex parser stripped the outer YAML quotes but left inner `\"` escapes intact as literal backslash-quote characters, so descriptions like `"... mentions \"switch workspace\"..."` ended up in manifest.json with the backslashes preserved. yaml.safe_load resolves these correctly. Regenerated manifest reflects the fix for databricks-config. Co-authored-by: Isaac

PR #533 has merged into upstream a-d-k experimental. The databricks-skills/ tree is byte-identical between the previous import SHA (9c7a5b3) and the merge commit (7b07f18) — only install.{sh,ps1} changed, which we don't import. README updated to point at the now-authoritative branch + SHA and drop the "one commit ahead of origin/experimental" caveat. Co-authored-by: Isaac

jamesbroadhead · 2026-05-15T11:13:05Z

Stable ↔ experimental skill overlap

Surfacing the overlap explicitly per Quentin's question. There are 6 stable/experimental pairs with non-trivial topical overlap; the remaining 16 experimental skills cover surfaces with no stable equivalent.

| # | Stable | Experimental (this PR) | Topic | Overlap | Stable strengths (kept) | Experimental strengths (potential merge candidates) | Direction |
|---|---|---|---|---|---|---|---|
| 1 | `databricks-apps` (196L SKILL + 4 refs) | `databricks-apps-python` (259L SKILL + 6 refs + 4 examples) | Building Databricks Apps | Both scaffold apps on the Apps platform; both cover Lakebase integration, model serving, deployment | AppKit-first (TypeScript/React): SQL queries, tRPC, smoke tests, Playwright selector rules, proto-first contracts, `appkit lint` cast rules, AppKit version pinning, Genie agent workflow | Python framework menu (Streamlit/Dash/Gradio/Flask/FastAPI/Reflex), explicit OAuth authorization patterns, Foundation Model API examples (`fm-minimal-chat`, `fm-parallel-calls`, `fm-structured-outputs`), app-resources schema | Keep both. Disjoint primary audience (TS vs Python). Port FM-API examples + Python-framework matrix into stable's `references/other-frameworks.md` post-merge. |
| 2 | `databricks-dabs` (39L SKILL + 5 refs, 450L total) | `databricks-bundles` (324L SKILL + SDP/alerts files, 497L total) | Declarative Automation Bundles | Both cover `databricks.yml`, resources, targets, deploy/validate, SDP pipelines, SQL alerts | Layered references (bundle-structure / deploy-and-run / resource-permissions / alerts / sdp-pipelines); naming convention `<name>.<type>.yml`; `--strict` validate guidance; permissions matrix | Self-contained single-file primer with full `databricks.yml` skeleton (variables, dev/prod targets); similar SDP + alerts content but flatter | Stable supersedes. Drop the experimental copy, or trim it to a thin pointer. The duplication is the cleanest example of what Quentin flagged. |
| 3 | `databricks-lakebase` (300L SKILL + 3 refs, 871L total) | `databricks-lakebase-autoscale` (232L SKILL + 5 refs, 1146L total) | Lakebase Postgres (Autoscaling) | Both cover projects/branches/endpoints, connectivity, OAuth token refresh, reverse-ETL via synced tables, compute sizing | Resource-hierarchy diagram, capacity planning + sizing details, Data API / PostgREST, app-integration via `databricks-apps`, compliance matrix (HIPAA/C5/TISAX) | Per-area refs (projects / branches / computes / connection-patterns / reverse-etl); 354L connection-patterns deep-dive; field-mask `update-*` CLI examples; region matrix (AWS + Azure-beta) | Stable supersedes on hierarchy/synced-tables. Port the deeper connection-patterns content + field-mask CLI usage into stable's `references/connectivity.md`. |
| 4 | `databricks-core` (CLI/auth/profile entrypoint) | `databricks-config` | Workspace/profile management | Both cover `databricks auth`, `~/.databrickscfg`, profile switching, OAuth + PAT login | "NEVER auto-select a profile" rule, CLI version-check + install flow, REST-API fallback for sandboxed envs, Claude Code separate-shell guidance, parent-skill index | Plain cheatsheet of `auth describe` / `config get` / `auth login` / `configure` | Stable supersedes. Experimental is a strict subset; drop. |
| 5 | `databricks-core` (CLI-first) | `databricks-python-sdk` | SDK + Connect quickstart | Both touch Databricks Connect, CLI version, profile selection | Profile-selection rules, CLI install, REST fallback | `databricks-sdk` Python usage, `WorkspaceClient` REST patterns, Connect quickstart, doc-index reference | Mostly disjoint. Keep as SDK-focused complement, or fold the SDK section into a sibling stable skill; it does not collide with `databricks-core`'s CLI scope. |
| 6 | `databricks-serverless-migration` | `databricks-execution-compute` | Compute selection / execution | Both mention serverless vs classic / Databricks Connect | Migration lifecycle (Ingest/Analyze/Test/Validate); blocker detection (RDDs, DBFS, HMS, streaming triggers, init scripts); Spark Connect fixes | 3-mode decision matrix (Connect / Serverless Job / Interactive Cluster); compute + warehouse CRUD cheatsheet; cold-start expectations | Keep both. Different angles: migration vs day-to-day execution choice. Cross-link from stable. |

No stable equivalent (16 skills — no overlap)

databricks-agent-bricks, databricks-ai-functions, databricks-aibi-dashboards, databricks-dbsql, databricks-docs, databricks-genie, databricks-iceberg, databricks-metric-views, databricks-mlflow-evaluation, databricks-spark-structured-streaming, databricks-synthetic-data-gen, databricks-unity-catalog, databricks-unstructured-pdf-generation, databricks-vector-search, databricks-zerobus-ingest, spark-python-data-source.

TL;DR

Three pairs are real duplicates that should converge over time: dabs/bundles (drop experimental), lakebase/lakebase-autoscale (drop experimental, port connection-patterns content first), core/config (drop experimental).

Two pairs are complementary and worth keeping side-by-side: apps/apps-python (TS vs Python audiences), serverless-migration/execution-compute (migration vs runtime selection).

One pair is mostly disjoint: core/python-sdk (CLI vs SDK).

Intentional for this PR per the rationale in the description: port a-d-k mostly intact, then resolve the duplicates as a series of single-skill PRs against stable, with the experimental copies removed in the same change. Happy to start that series if folks agree on the deltas above.

This comment was written by Claude.

Removes three experimental skills that duplicate stable equivalents without adding net new content:
- databricks-bundles → use stable `databricks-dabs`
- databricks-lakebase-autoscale → use stable `databricks-lakebase`
- databricks-config → use stable `databricks-core`

Cross-references in the surviving experimental skills (apps-python, python-sdk, zerobus-ingest, synthetic-data-gen, 4-deployment.md) now point at the stable names by bare name, matching the convention already used in stable skills' "Related Skills" / "Product Skills" sections. `experimental/README.md` records the removals and points readers at the stable replacements. Manifest regenerated; experimental skill count drops from 22 to 19.

Constraint per James: root must not depend on experimental, but experimental may depend on root. Honored: all new references go in that direction, and no stable skill content was changed by this commit.

Co-authored-by: Isaac
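The dependency-direction constraint (root must not depend on experimental) could be checked mechanically. The sketch below is a hypothetical check, not something this PR adds: the function name, the in-memory input shape, and the sample texts are all assumptions. It flags any stable skill whose text mentions an experimental skill by bare name.

```python
import re


def find_forbidden_references(stable_docs: dict, experimental_names: set) -> list:
    """Return (stable_skill, experimental_name) pairs where stable text names an experimental skill."""
    violations = []
    for skill, text in sorted(stable_docs.items()):
        for exp_name in sorted(experimental_names):
            # Guard both sides so a longer name never matches inside another
            # identifier, and databricks-apps never matches databricks-apps-python.
            pattern = rf"(?<![\w-]){re.escape(exp_name)}(?![\w-])"
            if re.search(pattern, text):
                violations.append((skill, exp_name))
    return violations


# Illustrative inputs: skill name -> concatenated markdown text.
STABLE_DOCS = {
    "databricks-apps": "For Python frameworks see databricks-apps-python.",
    "databricks-core": "Run databricks auth login to create a profile.",
}
EXPERIMENTAL_NAMES = {"databricks-apps-python", "databricks-iceberg"}

print(find_forbidden_references(STABLE_DOCS, EXPERIMENTAL_NAMES))
```

References in the other direction (experimental text naming a stable skill) are allowed by the constraint, so the check deliberately scans stable text only.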

dustinvannoy-db

LGTM

jamesbroadhead · 2026-05-25T13:46:49Z

👋 Claude here on James's behalf — per-skill audit of the 5 non-blocking items from this PR's review threads. Each item gets a finding + recommendation; owners can decide what (if anything) becomes a follow-up PR.

#7 — databricks-execution-compute/scripts/compute.py (Lennart, cc @dustinvannoy-db)

Inspected the script: 743 lines, 19 functions. The vast majority shadow what's already in the Databricks CLI:

| Function in compute.py | CLI equivalent |
|---|---|
| `list_clusters`, `get_best_cluster`, `start_cluster`, `get_cluster_status` | `databricks clusters list/get/start` |
| `create_cluster`, `terminate_cluster`, `delete_cluster` | `databricks clusters create/permanent-delete/delete` |
| `list_node_types`, `list_spark_versions` | `databricks clusters list-node-types/spark-versions` |

The one differentiator is execute_databricks_command / run_code_on_serverless: the CLI has no one-shot "run python on cluster" command, so that path requires a databricks api post .../execution-context workflow (which the SKILL could document instead).

Recommendation: when promoting databricks-execution-compute out of experimental, delete the bundled compute.py and document the CLI surface for the duplicated paths. For code execution, either (a) document the databricks api workflow against command-execution, or (b) propose a databricks compute execute-code subcommand in databricks/cli to close the gap. Not blocking; just a lock-in checklist item.

#8 — databricks-mlflow-evaluation vs MLflow-repo skills (Lennart, cc @dustinvannoy-db / @simonfaltum)

Searched github.com/mlflow/mlflow for a skills/ directory or an agent-skills-style repo under the mlflow org — neither exists publicly. The comparison Lennart was implicitly drawing isn't currently possible against a public source.

Recommendation: defer until the MLflow-repo skill source is identified (internal repo? planned upstream?). databricks-mlflow-evaluation (~6700 lines across 11 references) currently has no obvious external duplicate.

#9 — databricks-unstructured-pdf-generation Databricks-specificity (Lennart, cc @dustinvannoy-db)

Confirmed: the skill is ~80% Databricks-agnostic HTML → PDF tooling (weasyprint / wkhtmltopdf, HTML template patterns), ~20% Unity Catalog volume upload (databricks fs cp). The valuable Databricks-specific piece is the workflow shape (generate test PDFs → upload to UC volume → use them as a RAG evaluation dataset).

Recommendation: keep the skill, but either (a) trim to just the UC-volume upload step and link to standard HTML→PDF tooling externally, or (b) rename to something like databricks-rag-test-data to make the workflow shape the headline. The current name + scope makes the Databricks angle look thin.

#10 — databricks-vector-search "no longer experimental in Genie Code world" (Lennart, cc @simonfaltum)

Confirmed: there's no stable databricks-vector-search skill in skills/. If Genie Code treats Vector Search as a stable feature, the skill in experimental/ is mis-classified.

Recommendation: promote experimental/databricks-vector-search → skills/databricks-vector-search. The skill itself (335 lines SKILL.md + 4 references) is in good shape; the move is a one-PR git mv + manifest regen + version bump. Genie Code's @simonfaltum is best-placed to confirm the stable classification.

#11 — skills/databricks-jobs/SKILL.md propagation to Genie Code (Lennart, for @simonfaltum)

This isn't a d-a-s repo change — Genie Code consumes d-a-s via databricks aitools install (per databricks/cli PR #5243, now merged). The stable databricks-jobs at v0.2.0 (with the four task-type/triggers/notifications/examples refs added during this PR) ships automatically when users run the install in their Genie Code workspace context.

Recommendation: action is in the Genie Code product, not in d-a-s. @simonfaltum to verify the install path picks up skills/databricks-jobs/ correctly in Genie Code's workspace context.

Bonus finding (during audit): skills/databricks-jobs/ has its references at the top level (task-types.md, triggers-schedules.md, etc.) rather than under references/, unlike the other stable skills (databricks-apps, databricks-lakebase, databricks-pipelines). The newly-opened PR #86 restructures the experimental skills only; happy to extend it to cover stable databricks-jobs too if you'd like consistency across the repo.

(comment posted by Claude)

Phase 2 of the a-d-k → d-a-s port for databricks-spark-declarative-pipelines.

Adds three new references that fill the dev-side gaps that stable's per-feature × per-language reference files don't cover:

- references/workflows.md: Workflow A/B/C chooser (standalone bundle via `databricks pipelines init`, pipeline-in-existing-bundle, rapid CLI iteration with no bundle); language selection rules; start-update + poll-the-update pattern with the "never poll top-level pipeline state" rationale; edit/re-upload/restart flow.
- references/pipeline-configuration.md: Full JSON config reference for `pipelines create|update` (top-level fields, clusters, event_log, notifications, configuration, run_as, restart_window, environment, deployment); variant snippets (dev mode, non-serverless, continuous, notifications, autoscaling, custom event log, serverless Python deps); multi-schema patterns; platform constraints.
- references/performance.md: Liquid Clustering with per-layer key guidance (bronze/silver/gold), cluster-key type rules, table properties, state management strategies for streaming, join optimization, query optimization, pre-aggregation, compute config, monitoring.

SKILL.md updates:

- New "Choose Your Workflow" and "Language Selection" sections.
- Scaffolding section documents both `databricks pipelines init` and `databricks bundle init lakeflow-pipelines`.
- Pipeline API Reference list reorganized into Project & Lifecycle and Datasets, Flows & Quality groups.
- Version bumped to 0.3.0.

Deliberately dropped from a-d-k's databricks-spark-declarative-pipelines:

- 2-mcp-approach.md (a-d-k experimental already replaced with 2-cli-approach.md; MCP tool refs removed per PR #73 policy).
- python/{1..4}-*.md and sql/{1..4}-*.md (covered by stable's existing per-feature × per-language refs: python-basics, sql-basics, auto-loader-*, auto-cdc-*, streaming-table-*, sink-*, foreach-batch-sink-*, etc.).
- scripts/exploration_notebook.py (stable convention has no scripts/; users use the CLI directly or the explorations/ folder generated by `pipelines init`).

Source: databricks-solutions/ai-dev-kit@experimental.

Co-authored-by: Isaac

Phase 1 of #73's TODO #1b.

Adds references/fm-api-endpoints.md with the curated Foundation Model API endpoint table (chat/instruct + embedding models) from databricks-solutions/ai-dev-kit's model-serving skill, plus common defaults and query examples (CLI + SDK). Stripped: the cloud/language prefix on the docs link, and the leftover MCP-tool references in the source. The endpoint table itself is static catalog data, with no MCP coupling.

SKILL.md updates:

- bump version to 0.2.0
- point Endpoint Types table at the new reference
- point the Foundation Model discovery bullet at the new reference

Subsequent phases (separate PRs / commits) port the remaining dev-side content: classical-ml autolog patterns, Custom PyFunc signatures, ResponsesAgent with the create_text_output_item gotcha, UCFunctionToolkit + VectorSearchRetrieverTool resource passthrough.

Co-authored-by: Isaac

…-d-k (#85) ## Summary Ports the `databricks-spark-declarative-pipelines` skill from [`databricks-solutions/ai-dev-kit`](https://github.com/databricks-solutions/ai-dev-kit/tree/experimental/databricks-skills/databricks-spark-declarative-pipelines) into stable `skills/databricks-pipelines/`. Source: `databricks-solutions/ai-dev-kit:experimental`. Completes d-a-s [PR #73](#73 TODO #5. Pairs with a-d-k [PR #546](databricks-solutions/ai-dev-kit#546), which tombstones the a-d-k skill once this lands. Stable's `databricks-pipelines` already covered the per-feature × per-language API/options surface (decision tree, common traps, format options, dataset/flow/quality references). a-d-k's version covered scaffolding/workflows, configuration, performance tuning, DLT migration, and several streaming patterns + Kafka ingestion + SCD-2 query patterns that stable lacked. This PR adds a-d-k's net-new content as new `references/` files; the per-feature reference structure is preserved. ## Changes ### New `references/` - `dlt-migration.md` — both migration paths (DLT Python → SDP Python via `pyspark.pipelines`, DLT Python → SDP SQL) with side-by-side conversions for decorators, reads, expectations, CDC/SCD, and partitioning → liquid clustering. - `workflows.md` — Workflow A/B/C chooser (standalone bundle via `databricks pipelines init`, pipeline-in-existing-bundle, rapid CLI iteration with no bundle); language-selection rules; start-update + poll-the-update pattern (with the "never poll top-level pipeline state because RETRY_ON_FAILURE flips it back to RUNNING" rationale); edit/re-upload/restart flow; Python SDK alternative. 
- `pipeline-configuration.md` — Full JSON config reference for `pipelines create|update` (top-level fields, `clusters`, `event_log`, `notifications`, `configuration`, `run_as`, `restart_window`, `environment`, `deployment`); variant snippets (dev mode, non-serverless, continuous, notifications, autoscaling, custom event log, serverless Python deps); multi-schema patterns; platform constraints. - `performance.md` — Liquid Clustering with per-layer key guidance (bronze/silver/gold); cluster-key type rules; table properties; state-management strategies for streaming; join optimization (stream-to-static, stream-to-stream with time bounds); query optimization; pre-aggregation; compute config; monitoring. - `streaming-patterns.md` — Deduplication (by key, with time window, composite); windowed aggregations (tumbling, multi-size, session windows); event-time vs processing-time; rescue-data quarantine (Auto Loader `_rescued_data` → bronze_quarantine + silver_clean fanout); stream-to-stream join as a pattern; running totals; anomaly detection (rolling z-score outlier flag); end-to-end lag monitoring. - `kafka.md` — Basic Kafka read (Python + SQL); JSON payload parsing with explicit schemas; Databricks Secrets SASL/PLAIN auth; mTLS notes; Event Hubs via the Kafka protocol; pipeline-config plumbing for brokers/topics; pointer to `sink.md` for writing back to Kafka. Fills a full gap — stable's SKILL.md API table listed `read_kafka` and `format(\"kafka\")` with no linked skill. - `scd-2-querying.md` — `__START_AT` / `__END_AT` temporal semantics; current-state materialized views; point-in-time queries with the inclusive-lower / exclusive-upper boundary; per-entity history; period-bounded change analysis; joining facts with historical dimensions (as-of-transaction-time and current-dim variants); pre-filter MV optimization; clustering on `(entity_key, __START_AT)`. ### `SKILL.md` - New "Choose Your Workflow" and "Language Selection" sections near scaffolding. 
- Scaffolding section documents both `databricks pipelines init` (newer, focused) and `databricks bundle init lakeflow-pipelines` (template-based). - Pipeline API Reference list reorganized: **Project & Lifecycle** (workflows, configuration, performance, DLT migration) and **Datasets, Flows & Quality** (the existing per-feature refs + new kafka, scd-2-querying, streaming-patterns). - Version bumped to `0.3.0`. ### Cross-references in existing references - `auto-loader.md` → `streaming-patterns.md` (quarantine), `kafka.md`, lag monitoring. - `auto-cdc.md` → `scd-2-querying.md` for reading SCD-2 history tables. ## Deliberately dropped from a-d-k | a-d-k file | Why dropped | |------------|-------------| | `references/2-mcp-approach.md` | a-d-k experimental already renamed this to `2-cli-approach.md`; MCP tool refs stripped per d-a-s PR #73 policy. CLI flow now lives in `workflows.md` as Workflow C. | | `references/python/1-syntax-basics.md`, `references/sql/1-syntax-basics.md` | Covered by stable's `python-basics.md`, `sql-basics.md`, and the per-feature references (streaming-table, materialized-view, temporary-view, view-sql). | | `references/python/{2,3,4}-*.md`, `references/sql/{2,3,4}-*.md` | Pattern content ported into `streaming-patterns.md`, `kafka.md`, `scd-2-querying.md` (this PR); API/options content already covered by stable's per-feature × per-language references. | | `scripts/exploration_notebook.py` | Stable convention has no `scripts/` directory under a skill. `databricks pipelines init` generates an `explorations/` folder; users use the CLI or the generated notebook directly. | ## Test plan - [x] `python3 scripts/skills.py generate` clean. - [x] `python3 scripts/skills.py validate` passes. - [x] Merged `origin/main` mid-port (resolved version conflict — kept `0.3.0`; took main's CLI install command + compatibility bump). - [ ] CI green on this branch. - [ ] Owner review (`@lennartkats-db` / `@camielstee-db` per CODEOWNERS). 
This pull request and its description were written by Claude.
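The "poll the update, not the pipeline" rule from `workflows.md` can be sketched as below. This is a minimal illustration, not the code in the reference file: `wait_for_update`, the injected `get_update_state` callable, and the terminal-state set are assumptions made for the example. The point is only that the loop watches the state of one specific update, so a retry that flips the top-level pipeline back to RUNNING cannot be mistaken for progress.

```python
import time
from typing import Callable

# Assumed terminal states for a single pipeline update (illustrative set).
TERMINAL_UPDATE_STATES = {"COMPLETED", "FAILED", "CANCELED"}


def wait_for_update(
    get_update_state: Callable[[], str],
    poll_seconds: float = 10.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll one update until it reaches a terminal state and return that state.

    get_update_state is a hypothetical callable that returns the state of the
    update that was started, never the top-level pipeline state.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        state = get_update_state()
        if state in TERMINAL_UPDATE_STATES:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"update still in state {state!r} after {timeout_seconds}s")
        sleep(poll_seconds)
```

In practice the callable would wrap whatever CLI or SDK call returns the update's state for a known pipeline ID and update ID.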

## Summary Per [d-a-s #73](#73 post-merge follow-ups (`@dustinvannoy-db` and `@lennartkats-db`): consolidate every skill's reference layout to use `references/<file>.md`, matching the convention already in place for `databricks-apps`, `databricks-lakebase`, and `databricks-pipelines`. ## Changes **Experimental skills** (commit 1): - Move 51 top-level `.md` files (excluding `SKILL.md`) into per-skill `references/` directories across 12 experimental skills. - Rewrite all 193 path references in the corresponding `SKILL.md` files. - Wire in one orphan: `databricks-python-sdk/doc-index.md` was never referenced from its SKILL.md. Added a "SDK Reference" section in SKILL.md pointing at it + the existing `examples/` directory. **Stable `databricks-jobs`** (commit 2): - Same move for the 4 top-level reference files (`examples.md`, `notifications-monitoring.md`, `task-types.md`, `triggers-schedules.md`) → `skills/databricks-jobs/references/`. - Rewrite 36 path references in `SKILL.md`. - `databricks-jobs` was the last stable skill with the references-at-top-level layout. Manifest regenerated; `python3 scripts/skills.py validate` passes. ## Skills affected **12 experimental** (51 files moved): `databricks-agent-bricks` (2), `databricks-ai-functions` (4), `databricks-aibi-dashboards` (5), `databricks-apps-python` (6), `databricks-dbsql` (5), `databricks-iceberg` (5), `databricks-metric-views` (2), `databricks-python-sdk` (1, was orphaned), `databricks-spark-structured-streaming` (9), `databricks-unity-catalog` (3), `databricks-vector-search` (4), `databricks-zerobus-ingest` (5). **1 stable** (4 files moved): `databricks-jobs` (4). 
Other skills already had `references/` and were untouched: `databricks-execution-compute`, `databricks-mlflow-evaluation`, `databricks-synthetic-data-gen`, `spark-python-data-source` (experimental), and `databricks-apps`, `databricks-lakebase`, `databricks-pipelines`, `databricks-model-serving`, `databricks-core`, `databricks-dabs`, `databricks-serverless-migration` (stable). ## Test plan - [x] `python3 scripts/skills.py generate` clean. - [x] `python3 scripts/skills.py validate` passes (`Everything is up to date.`). - [x] All 55 moves are renames (verified with `git diff -M`). - [x] Spot-checked `databricks-agent-bricks/SKILL.md` and `databricks-jobs/SKILL.md` to confirm path-rewrites. - [ ] CI green on this branch. This pull request and its description were written by Claude.
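For reviewers who want to see the shape of the move, here is a rough sketch of the per-skill restructure. It is an assumption-laden illustration rather than the script used for this PR: `move_references` is a hypothetical helper, it uses plain filesystem renames instead of `git mv`, and it only rewrites markdown link targets of the form `(file.md)` or `(file.md#anchor)` in `SKILL.md`.

```python
import re
from pathlib import Path


def move_references(skill_dir: Path) -> int:
    """Move top-level .md files (except SKILL.md) into references/ and
    rewrite the matching link targets in SKILL.md. Returns files moved."""
    to_move = [p for p in sorted(skill_dir.glob("*.md")) if p.name != "SKILL.md"]
    if not to_move:
        return 0

    refs_dir = skill_dir / "references"
    refs_dir.mkdir(exist_ok=True)

    skill_md = skill_dir / "SKILL.md"
    text = skill_md.read_text()
    for path in to_move:
        path.rename(refs_dir / path.name)
        # Match "(name.md" or "(./name.md" when followed by ")" or "#".
        pattern = r"\((?:\./)?" + re.escape(path.name) + r"(?=[)#])"
        new_target = "(references/" + path.name
        text = re.sub(pattern, lambda _m, t=new_target: t, text)
    skill_md.write_text(text)
    return len(to_move)
```

Running it a second time on the same skill is a no-op, because nothing but `SKILL.md` is left at the top level.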

@simonfaltum

## Summary

Resolves the open follow-up from the d-a-s PR #73 reviewer-flagged list:

> **`databricks-mlflow-evaluation`**: review overlap with MLflow-repo skills (Lennart, non-blocking; cc'd @simonfaltum).

The OSS [`mlflow/skills`](https://github.com/mlflow/skills) repo ships [`agent-evaluation`](https://github.com/mlflow/skills/tree/main/agent-evaluation) and four related skills (`instrumenting-with-mlflow-tracing`, `analyze-mlflow-trace`, `retrieving-mlflow-traces`, `querying-mlflow-metrics`) that cover the generic MLflow GenAI evaluation workflow: `mlflow.genai.evaluate()`, scorers/judges, datasets, tracing setup, the 5-step evaluation loop. Substantial topic overlap with this skill.

Rather than dedupe content or split the skill, this PR adds a short "Scope vs upstream `mlflow/skills`" section at the top of `SKILL.md` that:

- Names the upstream skills.
- Scopes this skill to **Databricks-specific patterns layered on top** of that workflow: UC trace ingestion, MemAlign judge alignment via UC SME labeling sessions, `optimize_prompts()` GEPA loop, UC-table-backed datasets.
- Defers everything else to upstream rather than restating it.

Picked this approach because option (b), pushing the Databricks-specific patterns upstream into `mlflow/skills/agent-evaluation/references/` as a fork-PR, would split source of truth and force the MLflow team to own Databricks-specific docs.

## Test plan

- [x] `python3 scripts/skills.py validate` passes.
- [x] Manifest unchanged (file list identical; only `SKILL.md` content changed).
- [ ] Reviewer ack that the scope-boundary is the right place to draw the line.

This pull request and its description were written by Claude.

Signed-off-by: James Broadhead <james.broadhead@databricks.com>

…sorted manifest (#95)

## Summary

Two related cleanups to `manifest.json` hygiene.

### 1. Drop `base_revision` plumbing

The `base_revision` field was added in [#30](#30) as a per-skill upstream-revision tag, intended to track where a synced skill came from in a-d-k. It never got a consumer:

- `databricks/cli`'s `aitools install` ignores it (`gh search code base_revision --owner=databricks` returns only this repo's own `scripts/skills.py`).
- No skill on `main` carries it. A few feature branches populated it while iterating on a-d-k ports (e.g. `origin/experimental-aidevkit` had `"base_revision": "e742f36e8ab1"` for some skills), but those values never reached `main`.
- The generator just round-tripped the field if present in the prior manifest, and `normalize_manifest` stripped it before the validate diff, so it couldn't fail CI either.

It's dead weight that future maintainers have to reason about. Dropped:

- `_build_stable_entry`'s existing-entry-preservation branch (and the now-unused `existing_skills` parameter).
- `_build_experimental_entry`'s equivalent.
- `generate_manifest`'s existing-manifest read (only used to feed `existing_skills`).
- `normalize_manifest` and `_normalize_skill_map`: the only volatile field they normalized was `base_revision`, so `validate_manifest` can compare dicts directly now.

If per-skill upstream sync tracking does become useful later, re-add the field at the same time as the tool that consumes it. The long-term sync plan in [#73 TODO #3](#73) is git-subtree-based, which tracks revisions intrinsically; `base_revision` doesn't help there.

### 2. Enforce canonical sorted form

The on-disk `manifest.json` is now required to be byte-equal to the canonical serialization (`json.dumps(manifest, indent=2, sort_keys=True) + "\n"`). That means:

- The `skills` map's keys are alphabetical across stable **+** experimental (no more stable-first grouping; that was a side-effect of build order, not a deliberate choice).
- Each skill entry's keys are alphabetical: `description`, `files`, `min_cli_version?`, `repo_dir`, `version`.
- The top-level keys (`skills`, `version`) are alphabetical.
- `files` arrays remain sorted by the generator (already enforced).

`scripts/skills.py validate` now does two checks:

1. **Content lint**: the parsed dict equals what `generate_manifest` would produce. Catches stale content, unsorted/missing/extra `files` entries, drift between SKILL.md frontmatter and manifest.
2. **Canonical-form lint**: the on-disk bytes equal `serialize_manifest(current)`. Catches hand-edited sort drift (re-ordered skill names, re-ordered per-skill keys) even when the parsed content is correct.

The writer at `scripts/skills.py generate` uses the same `serialize_manifest` helper, so files emitted by the generator always pass the lint.

## Diff shape

| File | Change |
|---|---|
| `scripts/skills.py` | −47 / +35 lines: drop `base_revision` branches + dead normalize machinery; add `serialize_manifest` helper; rework `validate_manifest` to do the two checks above. |
| `manifest.json` | 220-line swap: skills now in one alphabetical run (`databricks-agent-bricks`, `databricks-ai-functions`, ...) and per-skill keys alphabetised. No semantic change for `aitools install`. |

## Test plan

- [x] `python3 scripts/skills.py generate` clean (regenerates the canonical sorted manifest).
- [x] `python3 scripts/skills.py validate` passes.
- [x] Lint catches a manually-reversed `skills` map: emits `ERROR: manifest.json is not in canonical sorted form. Keys must be alphabetical at every level.`
- [x] Lint catches a manually-reversed `files` array inside a skill entry: emits `ERROR: manifest.json content is out of date`.
- [ ] CI green on this branch.

This pull request and its description were written by Claude.

---------

Co-authored-by: simon <simon.faltum@databricks.com>
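The two lints described above can be sketched as follows. This is a simplified stand-in for the logic in `scripts/skills.py`, not a copy of it: the function signatures are assumptions (the real `validate_manifest` reads the repo itself), and only the serialization expression and the two error strings are taken from this description.

```python
import json

CONTENT_ERROR = "ERROR: manifest.json content is out of date"
CANONICAL_ERROR = (
    "ERROR: manifest.json is not in canonical sorted form. "
    "Keys must be alphabetical at every level."
)


def serialize_manifest(manifest: dict) -> str:
    """Canonical form: 2-space indent, sorted keys, trailing newline."""
    return json.dumps(manifest, indent=2, sort_keys=True) + "\n"


def validate_manifest(on_disk_text: str, expected: dict) -> list[str]:
    """Return lint errors for the on-disk manifest text.

    expected stands in for what generate_manifest would produce
    (passed in here as an assumption, to keep the sketch self-contained).
    """
    errors = []
    current = json.loads(on_disk_text)
    # Content lint: parsed dicts must match (list order matters, key order does not).
    if current != expected:
        errors.append(CONTENT_ERROR)
    # Canonical-form lint: bytes on disk must match the canonical serialization.
    if on_disk_text != serialize_manifest(current):
        errors.append(CANONICAL_ERROR)
    return errors
```

Because `sort_keys=True` sorts object keys but not arrays, a reversed `files` array trips only the content lint, while a re-ordered `skills` map trips only the canonical-form lint, which matches the two test-plan cases.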

…ental (#91)

## Summary

Stable and experimental skills had two different contracts for Codex CLI marketplace metadata (`agents/openai.yaml` + `assets/databricks.{svg,png}`):

| | stable (`skills/`) | experimental (`experimental/`) |
|---|---|---|
| `agents/openai.yaml` | hand-authored, required | auto-synthesised from `SKILL.md` frontmatter |
| `assets/databricks.{svg,png}` | manual copy needed | `sync_assets()` copied them in |
| CI enforcement | partial (only via `_build_stable_entry`) | full (`check_assets_synced` + `ensure_experimental_codex_metadata`) |

So adding a stable skill needed manual icon copies + a hand-authored YAML; adding an experimental one was `python3 scripts/skills.py generate`. Stable also had no CI check that the icons were actually present.

This PR makes the contract uniform: every skill gets icons + `agents/openai.yaml` auto-generated when missing, hand-authored YAML is preserved as an override, and one CI cycle (`python3 scripts/skills.py validate`) enforces it for both directories.

- `iter_all_skill_dirs(repo_root)` walks every skill across `skills/` and `experimental/`.
- `ensure_codex_metadata(repo_root)` replaces `sync_assets()` and `ensure_experimental_codex_metadata()`. Copies shared icons if missing/stale; synthesises `agents/openai.yaml` only when absent (so curated YAML like `databricks-core`'s `display_name: "Databricks"` survives).
- `check_codex_metadata(repo_root)` mirrors the same checks for `validate`. The redundant per-skill openai.yaml existence check in `_build_stable_entry` is gone.
- `main()` `sync` + `generate` call `ensure_codex_metadata`; `validate` calls `check_codex_metadata` then `validate_manifest`.
- `.github/workflows/validate-manifest.yml` already covered both `skills/**` and `experimental/**`; no workflow change needed, since one CI cycle covers both.
- `CONTRIBUTING.md` gets a new "Skill anatomy" section explaining what these files are, who consumes them (Codex CLI marketplace), why the repo ships them for every skill, the auto-generation + hand-authoring escape hatch, and `DISPLAY_NAME_OVERRIDES` for acronym/product-name casing. `CLAUDE.md` links to it.

## Why

PR #73 (experimental import) added the auto-gen path for experimental skills only. Reviewers' question on that PR ("what are these files, why do we ship them, how do new skills get them?") pointed at the asymmetry. This is the follow-up.

## Test plan

- [x] `python3 scripts/skills.py generate` on clean tree → no diff (no-op).
- [x] `python3 scripts/skills.py sync` on clean tree → no diff.
- [x] Removed `skills/databricks-core/{agents/openai.yaml,assets/databricks.png}` + `experimental/databricks-ai-functions/agents/openai.yaml` → `validate` exits 1 listing all three missing files; `generate` heals all three; hand-authored stable YAML preserved when restored from backup before re-running `generate`.
- [x] No stale references to the removed helpers (`sync_assets`, `check_assets_synced`, `ensure_experimental_codex_metadata`) anywhere in the repo.

This pull request and its description were written by Claude.

---------

Signed-off-by: James Broadhead <james.broadhead@databricks.com>
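A rough sketch of the "generate only when absent, check everywhere" contract follows. The function names mirror the helpers named in this description, but the bodies are assumptions: icon copying is left out, and the synthesised YAML is a one-line placeholder because the real `agents/openai.yaml` schema is not shown in this thread.

```python
from pathlib import Path
from typing import Iterator

SKILL_ROOTS = ("skills", "experimental")
CODEX_YAML = Path("agents") / "openai.yaml"


def iter_all_skill_dirs(repo_root: Path) -> Iterator[Path]:
    """Yield every directory under skills/ and experimental/ that holds a SKILL.md."""
    for root in SKILL_ROOTS:
        base = repo_root / root
        if not base.is_dir():
            continue
        for child in sorted(base.iterdir()):
            if (child / "SKILL.md").is_file():
                yield child


def synthesize_yaml(skill_dir: Path) -> str:
    # Placeholder content only (assumption): title-case the directory name.
    title = skill_dir.name.replace("-", " ").title()
    return f'display_name: "{title}"\n'


def ensure_codex_metadata(repo_root: Path) -> list[Path]:
    """Write agents/openai.yaml where it is missing; never overwrite an existing file."""
    written = []
    for skill_dir in iter_all_skill_dirs(repo_root):
        target = skill_dir / CODEX_YAML
        if target.exists():
            continue  # hand-authored YAML acts as the override
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(synthesize_yaml(skill_dir))
        written.append(target)
    return written


def check_codex_metadata(repo_root: Path) -> list[Path]:
    """Return the metadata files that validate would report as missing."""
    return [
        skill_dir / CODEX_YAML
        for skill_dir in iter_all_skill_dirs(repo_root)
        if not (skill_dir / CODEX_YAML).exists()
    ]
```

Naive title-casing turns `databricks-ai-functions` into "Databricks Ai Functions", which is the kind of casing problem an override table such as `DISPLAY_NAME_OVERRIDES` would presumably correct.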

…apx Related Skills entry `databricks-app-apx` was the FastAPI+React stack referenced from ai-dev-kit's `databricks-apps-python` skill. It has been removed upstream (a-d-k is deprecated; the apx-on-CLI flow merged into the stable `databricks-apps` skill via #84/#73). The "Related Skills" bullet is the last dangling reference inside this repo. This PR was prepared by Claude.

## Summary The skills manifest in `databricks/databricks-agent-skills` is gaining experimental skills sourced from a new `experimental/` directory in the repo (see paired [d-a-s PR #73](databricks/databricks-agent-skills#73), which imports the ai-dev-kit skill catalog into `experimental/`). This wires the parsing through the aitools installer: - `Manifest.Skills` is a **single map** holding both stable and experimental entries; the per-skill `repo_dir` field ("skills" or "experimental") is the source of truth for whether a skill is experimental. `SkillMeta.IsExperimental()` derives state from `RepoDir`. - Experimental skills get a `-experimental` suffix on their install-side key during `normalizeManifest`; `SourceName` preserves the unsuffixed name for fetch URLs. - The existing `--experimental` flag (already wired in `cmd/skills.go`) now has experimental skills to install; without it, `resolveSkills` filters them out as before. ## UX ``` # default — only stable skills databricks experimental aitools skills install # all experimental skills, plus stable databricks experimental aitools skills install --experimental # one experimental skill by name (--experimental still required by resolveSkills) databricks experimental aitools skills install databricks-iceberg-experimental --experimental ``` ## TODOs / caveats for iteration 1. ~~**`DATABRICKS_SKILLS_REF` pin.**~~ **Partially resolved.** The default ref is still the latest stable release tag (sourced from `experimental/aitools/lib/installer/SKILLS_VERSION`); experimental entries won't exist there until d-a-s cuts a release with [PR #73](databricks/databricks-agent-skills#73) merged. The default ref bump is a follow-up automated by the SKILLS_VERSION file. **UX fix shipped in this PR**: if `--experimental` is passed but the manifest at the resolved ref exposes no experimental skills, a warning is logged pointing users at `DATABRICKS_SKILLS_REF=main`. 2. 
~~**Collision handling is naive.**~~ **Resolved.** Every experimental skill gets a `-experimental` suffix on its install-side key during `normalizeManifest`. The manifest key + install dir both carry the suffix; the `SourceName` field on `SkillMeta` preserves the upstream repo dir name for fetch URLs. Users see at a glance which installed skills are experimental. Also handled: **experimental↔stable transitions**. If a skill flips its experimental status upstream (the same logical skill changes manifest key), `install` removes the stale variant on disk + state before installing the new one, and `uninstall` accepts either variant name (and removes both if both are present). Helper: `alternateVariantKey()`. Covered by tests `TestInstallReplacesAlternateVariant`, `TestUninstallByEitherVariantRemovesBoth`, `TestUninstallByAlternateNameWhenOnlyOneVariantInstalled`. 3. ~~**`list` UX.**~~ **Resolved.** `aitools skills list` shows experimental skills with an `[experimental]` tag in the NAME column (driven by `meta.IsExperimental()`). Combined with the TODO #2 resolution (`-experimental` suffix in the manifest key), every experimental row reads e.g. `databricks-iceberg-experimental [experimental]` — slightly redundant but a clear visual anchor. Hide-by-default was considered but rejected: users running `list` are usually looking for what's available, and silently omitting experimental skills makes them un-discoverable. 4. ~~**State tracking.**~~ **Resolved — kept additive semantics.** `InstallState.IncludeExperimental` records what was last requested but is not used to drive retroactive removal. Running `install` without `--experimental` leaves previously-installed experimental skills in place. 
Rationale: (a) users running `install` are typically adding/updating, not declaring set membership; (b) silently uninstalling things the user previously asked for is surprising; (c) the transition cleanup shipped under TODO #2 handles the actual drift case (skill's experimental status flipping upstream). Removal is what `uninstall` is for. 5. ~~**No acceptance test yet.**~~ **Resolved.** Added acceptance tests under `acceptance/experimental/aitools/skills/install*/` covering the install flow against a mocked manifest server: - Stable-only install (no flag) → 1 skill installed - `--experimental` install adds the experimental skill (with `-experimental` suffix per the install-path model) → 2 skills total - Re-running `--experimental` is idempotent - Specific-skill install (`install --skills <name>`) for both stable and experimental - `--experimental` against a manifest with no experimental entries logs a nudge To make these reachable, exposed a new env-var override `DATABRICKS_SKILLS_BASE_URL` that overrides the hard-coded `raw.githubusercontent.com` base URL used by `GitHubManifestSource.FetchManifest` and `fetchSkillFile`. Defaults to the canonical URL when unset, so no production behavior change. Updated `Taskfile.yml`'s `test-exp-aitools` task to include `acceptance/experimental/aitools/**`. Variants left as follow-up acceptance tests (the structure is now in place): - Variant transition cleanup (stable → experimental, experimental → stable) - Uninstall flow (with both variants installed) 6. ~~**`--experimental` flag scope.**~~ **Resolved — kept current scope.** Each command has internally consistent behavior: - `install --experimental` → explicit opt-in (required to install experimental skills). - `update` → state-driven (honors `InstallState.IncludeExperimental` from the last `install`). If you opted in once, future updates refresh experimentals; otherwise they're skipped. 
- `list` → shows all skills with an `[experimental]` tag (no filtering — discovery first, opt-in to install). Adding `--experimental` / `--no-experimental` to `update` for one-off overrides was considered but rejected: the natural workflow is to re-run `install --experimental` (or just `install`), which already sets the desired state. Follow-up if real users hit a use case for the override. 7. ~~**Manifest shape.**~~ **Resolved.** Replaced the original two-map design (`skills` + `experimental_skills` + a per-skill `experimental` bool) with a single `skills` map where each entry's `repo_dir` (`"skills"` or `"experimental"`) is the source of truth. The cli derives experimental state from `RepoDir` via `SkillMeta.IsExperimental()`. Collisions between stable and experimental skills with the same repo dir name must be resolved upstream in d-a-s (which they already are — d-a-s PR #73's TODO #1a merged the only known collision into stable). The d-a-s manifest generator should be updated to emit `repo_dir` per skill; until then `normalizeManifest` defaults a missing `RepoDir` to `"skills"` so older manifests still parse. ## Test plan - [x] `go build ./...` passes. - [x] `go test ./experimental/aitools/...` passes (`source_test.go` covers the normalize/IsExperimental cases). - [x] `go test ./acceptance -run TestAccept/experimental/aitools` passes (a pre-existing flake intermittently surfaces an `lstat` warning during copyDir, ~10% of multi-test runs; unrelated to this refactor). - [ ] Run `./task lint` and `./task fmt` before merge. - [ ] Manual: against a d-a-s ref containing experimental entries with `repo_dir`, verify the four UX cases above behave correctly. This pull request and its description were written by Claude. --------- Co-authored-by: simon <4305831+simonfaltum@users.noreply.github.com> Co-authored-by: simon <simon.faltum@databricks.com>
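The single-map manifest model from TODO #7 can be illustrated with a short sketch. The CLI itself is Go; this Python version is only a reading aid, and everything in it is an assumption except the behaviours stated above: `repo_dir` is the source of truth, a missing `repo_dir` defaults to `"skills"`, experimental entries are skipped unless opted in, and naming an experimental skill still requires the flag. The sketch leaves out the install-key suffix and state tracking, and raising an error for a named experimental skill without the flag is a guess at the behaviour.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SkillMeta:
    name: str
    version: str
    repo_dir: str = "skills"

    def is_experimental(self) -> bool:
        # Experimental state is derived from repo_dir, never stored separately.
        return self.repo_dir == "experimental"


def normalize_manifest(raw: dict) -> dict[str, SkillMeta]:
    """Build SkillMeta entries, defaulting a missing repo_dir to "skills"."""
    return {
        name: SkillMeta(
            name=name,
            version=entry.get("version", ""),
            repo_dir=entry.get("repo_dir") or "skills",
        )
        for name, entry in raw.get("skills", {}).items()
    }


def resolve_skills(
    skills: dict[str, SkillMeta],
    requested: Optional[list[str]] = None,
    include_experimental: bool = False,
) -> list[SkillMeta]:
    """Pick the skills to install for the given names and opt-in flag."""
    if requested:
        selected = []
        for name in requested:
            meta = skills[name]  # KeyError for unknown names
            if meta.is_experimental() and not include_experimental:
                raise ValueError(f"{name} is experimental; pass --experimental")
            selected.append(meta)
        return selected
    return [m for m in skills.values() if include_experimental or not m.is_experimental()]
```

Deriving the flag from `repo_dir` means there is no second field that can drift out of sync with the directory the skill actually lives in.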

…apx Related Skills entry (#106)

## Summary

Removes the last dangling `databricks-app-apx` reference in this repo: one line in `experimental/databricks-apps-python/SKILL.md` ("Related Skills" bullet).

## Why

`databricks-app-apx` was the FastAPI+React stack referenced from ai-dev-kit's `databricks-apps-python`. It has been removed upstream (a-d-k is deprecated; the apx-on-CLI flow merged into the stable `databricks-apps` skill via #84/#73). I grepped the entire repo and this bullet is the only remaining mention; README, install scripts, and stable skills no longer reference it.

## Test plan

- [x] `python3 scripts/skills.py validate` passes (`Everything is up to date.`)
- [x] `grep -rn databricks-app-apx .` returns no remaining hits.
- [ ] CI green.

This pull request and its description were written by Claude.

…tmatter (#105) ## Summary Backfills three frontmatter fields on 17 `experimental/` SKILL.md files that stable skills already carry but the imported a-d-k snapshot does not: - `compatibility: Requires databricks CLI (>= v0.294.0)` - `metadata.version: "0.1.0"` (was the `0.0.1` `scripts/skills.py` fallback floor) - `parent: databricks-core` Closes the frontmatter-version / parent-skill / CLI-compatibility gaps in one mechanical pass — they all touch the same files. ## Why The stable-side standard is documented in CLAUDE.md and consistently applied across `skills/`. Experimental skills carry none of these fields because [#73](#73) imported the a-d-k snapshot verbatim. PR [#73 TODO #7](#73) explicitly leaves the version backfill open ("when upstream a-d-k eventually adds version fields, those win; until then, the manifest reports the floor"). With a-d-k now deprecated, this repo is source of truth for `experimental/` and the backfill lands here. The promotion-time pattern (cf. [#87](#87) vector-search) adds these fields on the way out of `experimental/`. This PR closes the gap for the remaining skills that haven't been promoted yet. ## Changes 17 SKILL.md files in `experimental/`, plus manifest regeneration. `experimental/databricks-vector-search/SKILL.md` is intentionally **skipped** — #87 promotes it to `skills/` and adds the same fields as part of the move; including it here would create fake conflicts. Whichever lands first, the other rebases cleanly. ## Manifest deltas Every experimental skill's `version` flips from `0.0.1` (the `extract_version_from_skill` fallback floor) to `0.1.0`. `compatibility` and `parent` are SKILL.md-only — not surfaced in manifest.json today. ## Test plan - [x] `python3 scripts/skills.py generate` clean - [x] `python3 scripts/skills.py validate` passes (`Everything is up to date.`) - [ ] CI green on this branch. This pull request and its description were written by Claude. 
--------- Signed-off-by: simon <simon.faltum@databricks.com> Co-authored-by: simon <simon.faltum@databricks.com>
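To make the manifest delta concrete, here is a minimal sketch of a version lookup with a fallback floor. It is not the `extract_version_from_skill` implementation from `scripts/skills.py`: the function name `extract_version`, the line-based frontmatter scan (no YAML parser), and the sample frontmatter are assumptions. Only the `metadata.version` field and the `0.0.1` floor come from this description.

```python
import re

FALLBACK_VERSION = "0.0.1"  # floor reported when no version is declared


def extract_version(skill_md_text: str) -> str:
    """Return metadata.version from SKILL.md frontmatter, or the fallback floor."""
    lines = skill_md_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return FALLBACK_VERSION

    in_metadata = False
    for line in lines[1:]:
        if line.strip() == "---":  # end of frontmatter
            break
        if re.match(r"^metadata:\s*$", line):
            in_metadata = True
            continue
        if in_metadata:
            match = re.match(r"^\s+version:\s*[\"']?([^\"'\s]+)[\"']?\s*$", line)
            if match:
                return match.group(1)
            if line and not line[0].isspace():
                in_metadata = False  # left the metadata block
    return FALLBACK_VERSION
```

With the backfilled frontmatter the lookup returns `0.1.0`; a verbatim a-d-k snapshot without a `metadata` block falls through to `0.0.1`.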

…#88)

## Summary

Per Lennart's [audit](#73 (comment)) on #73, item #9: `databricks-unstructured-pdf-generation` reads as "not very Databricks-specific" because the headline is local HTML → PDF generation, with the Databricks workflow (UC volume + RAG-eval dataset) buried. This reframes the skill to put the Databricks-specific value up front, without removing any content.

## Changes

`experimental/databricks-unstructured-pdf-generation/SKILL.md`:

- Frontmatter `description` now leads with **"Build RAG / unstructured-document evaluation datasets on Databricks"**. PDF generation is positioned as a step, not the headline.
- Body intro states explicitly that the Databricks-specific value is the workflow shape (UC volume layout, paired question files, hand-off to downstream `ai_extract` / `ai_parse_document` / `mlflow.genai.evaluate()`), not the HTML → PDF tooling itself.
- One-line escape hatch added: *"If you only need ad-hoc PDFs (no Databricks workflow), any HTML → PDF tool works directly — this skill exists for the synthetic-dataset-on-UC end-to-end shape, not as a general PDF generator."*

Manifest regenerated to pick up the new description. No deletions; this is a framing change.

## What this doesn't do

Two stronger alternatives in the audit are *not* implemented here:

- **Trim** the local HTML → PDF tooling and link to an external tool. Would destroy useful content; the templates and parallel-conversion patterns are still valuable for users following the end-to-end workflow.
- **Rename** to e.g. `databricks-rag-test-data`. Has cross-PR implications (the a-d-k tombstone PR [databricks-solutions/ai-dev-kit#546](databricks-solutions/ai-dev-kit#546) references the current name) and changes the install command for existing users.

If either is preferred over this lighter reframe, happy to open a follow-up.

## Test plan

- [x] `python3 scripts/skills.py generate` clean.
- [x] `python3 scripts/skills.py validate` passes.
- [ ] CI green.
- [ ] Reviewer sign-off (`@lennartkats-db` raised this; `@dustinvannoy-db` cc'd).

This pull request and its description were written by Claude.

---------

Co-authored-by: simon <simon.faltum@databricks.com>

## Summary

Per Lennart's [audit](#73 (comment)) on #73, item #10: Vector Search is no longer considered experimental in Genie Code. Promote `experimental/databricks-vector-search/` → `skills/databricks-vector-search/`.

## Changes

- `git mv experimental/databricks-vector-search skills/databricks-vector-search`.
- Move the 4 top-level reference files into `references/` to match the stable-skills layout convention (apps, lakebase, pipelines).
- SKILL.md frontmatter: add `parent: databricks-core` and `metadata.version: "0.1.0"`. Body: add the standard "FIRST: Use the parent `databricks-core` skill" prelude.
- Rewrite 10 path references in SKILL.md for the new `references/<file>.md` locations.
- `scripts/skills.py`: add `databricks-vector-search` to `SKILL_METADATA`.
- Root `README.md` "Available Skills" list: add `databricks-vector-search`.
- `experimental/README.md`: remove from the experimental skill list.

Manifest regenerated; `python3 scripts/skills.py validate` passes.

## Cross-repo

- a-d-k tombstone PR [databricks-solutions/ai-dev-kit#546](databricks-solutions/ai-dev-kit#546) currently redirects `databricks-vector-search` with `--experimental`. I'll drop the flag in that PR to match this promotion.
- Stacking concern: this PR has zero overlap with the open references-restructure PR [#86](#86) since the vector-search files move in this PR too (commit includes the `references/` move). Whichever lands first, the other will rebase cleanly with no conflicts in the vector-search subtree.

## Test plan

- [x] `python3 scripts/skills.py generate` clean.
- [x] `python3 scripts/skills.py validate` passes (`Everything is up to date.`).
- [ ] CI green.
- [ ] Reviewer confirmation that stable classification is correct (`@simonfaltum` flagged this).

This pull request and its description were written by Claude.
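
For illustration only, the "rewrite path references" step in the vector-search promotion could be sketched as below. This is not the script used in the PR: the function name `rewrite_reference_paths` and the file names in the usage example are hypothetical, and the sketch assumes the references appear as relative Markdown link targets in SKILL.md.

```python
import re


def rewrite_reference_paths(text: str, moved_files: list[str]) -> str:
    """Point Markdown links at references/<file> for each moved file.

    Hypothetical helper: only link targets of the form (file.md) or
    (./file.md) are rewritten, so links that already start with
    references/ are left untouched and the rewrite is idempotent.
    """
    for name in moved_files:
        # "(" then an optional "./", then the exact file name, followed by
        # either ")" or "#" (an anchor), which the lookahead does not consume.
        pattern = re.compile(r"\((?:\./)?" + re.escape(name) + r"(?=[)#])")
        text = pattern.sub("(references/" + name, text)
    return text


# Hypothetical file names; the PR does not list the four moved files.
sample = "See [index types](index-types.md) and [search](./search-modes.md#hybrid)."
moved = ["index-types.md", "search-modes.md"]
rewritten = rewrite_reference_paths(sample, moved)
```

Running the helper a second time on its own output leaves the text unchanged, which matters if the rewrite is repeated after a rebase.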

jamesbroadhead requested review from a team, lennartkats-db and simonfaltum as code owners May 12, 2026 15:52

jamesbroadhead mentioned this pull request May 12, 2026

aitools: parse experimental_skills manifest section databricks/cli#5243

Merged

5 tasks

jamesbroadhead force-pushed the experimental-aidevkit branch from 2035bab to 2a45b83 Compare May 12, 2026 16:06

jamesbroadhead requested a review from fjakobs as a code owner May 12, 2026 16:09

jamesbroadhead force-pushed the experimental-aidevkit branch from 8387654 to a0fbb21 Compare May 12, 2026 16:14

jamesbroadhead mentioned this pull request May 12, 2026

RFC: subtree-sync skills from databricks-agent-skills/experimental databricks-solutions/ai-dev-kit#530

Closed

jamesbroadhead changed the title ~~experimental: import ai-dev-kit skills as best-effort skills~~ experimental: import ai-dev-kit skills into experimental/ director May 12, 2026

jamesbroadhead changed the title ~~experimental: import ai-dev-kit skills into experimental/ director~~ experimental: import ai-dev-kit skills into experimental/ directory May 12, 2026

jamesbroadhead requested a review from camielstee-db as a code owner May 12, 2026 22:26

jamesbroadhead added 13 commits May 15, 2026 09:44

jamesbroadhead force-pushed the experimental-aidevkit branch from 50467c3 to 5719156 Compare May 15, 2026 09:45

dustinvannoy-db approved these changes May 23, 2026

View reviewed changes

lennartkats-db approved these changes May 24, 2026

View reviewed changes

lennartkats-db merged commit 482f9ff into main May 24, 2026
1 check passed

This was referenced May 26, 2026

docs(mlflow-evaluation): cross-reference upstream mlflow/skills #94

Merged

docs: reference DAS skills via CLI; drop ai-dev-kit link databricks-solutions/genie-code-skills-demo#4

Closed

jamesbroadhead mentioned this pull request May 26, 2026

scripts(skills): drop dead base_revision plumbing, enforce canonical sorted manifest #95

Merged

5 tasks

This was referenced May 27, 2026

experimental: backfill metadata.version + parent + compatibility frontmatter #105

Merged

docs(experimental/databricks-apps-python): drop stale databricks-app-apx Related Skills entry #106

Merged

lennartkats-db merged 27 commits into main from experimental-aidevkit


4 participants


jamesbroadhead commented May 12, 2026 (edited)

Summary

Source

Direction caveat

TODOs / caveats for iteration

Test plan

Post-merge follow-ups (reviewer-flagged)


jamesbroadhead commented May 15, 2026

Stable ↔ experimental skill overlap

No stable equivalent (16 skills — no overlap)

TL;DR


dustinvannoy-db left a comment


jamesbroadhead commented May 25, 2026
