experimental: import ai-dev-kit skills into experimental/ directory #73

Merged
lennartkats-db merged 27 commits into main from experimental-aidevkit
May 24, 2026

@jamesbroadhead
Contributor

@jamesbroadhead jamesbroadhead commented May 12, 2026

Summary

Adds an experimental/ directory containing 18 agent skills from databricks-solutions/ai-dev-kit databricks-skills/, imported as a snapshot on a best-effort basis. Excluded:

  • databricks-model-serving (TODO #1b — different surface than stable, heavy MCP coupling)
  • databricks-spark-declarative-pipelines (TODO #5: different surface than stable databricks-pipelines)
  • databricks-genie — removed during review per @lennartkats-db; deferred to a future revision
  • databricks-lakebase-provisioned — not in the upstream experimental branch either

The manifest exposes both stable and experimental skills in a single skills map. Each entry carries a repo_dir field ("skills" or "experimental") that points to the directory the skill lives in. Consumers derive experimental state from repo_dir — there is no parallel experimental_skills map and no per-skill experimental bool.
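
A minimal sketch of how a consumer might derive experimental state from repo_dir. Only the single skills map and the repo_dir field come from the description above; the manifest literal and the is_experimental helper are hypothetical and are not taken from scripts/skills.py or the CLI.

```python
# Hypothetical manifest fragment. Only the "skills" map and the per-entry
# "repo_dir" field are described in the PR; everything else is illustrative.
manifest = {
    "skills": {
        "databricks-jobs": {"repo_dir": "skills"},
        "databricks-iceberg": {"repo_dir": "experimental"},
    }
}


def is_experimental(entry: dict) -> bool:
    """Derive experimental state from repo_dir; there is no separate flag."""
    return entry.get("repo_dir") == "experimental"


# Partition the single map into stable and experimental skill names.
stable = sorted(n for n, e in manifest["skills"].items() if not is_experimental(e))
experimental = sorted(n for n, e in manifest["skills"].items() if is_experimental(e))
print("stable:", stable)
print("experimental:", experimental)
```

Keeping one field as the source of truth means a skill cannot be listed as stable in one place and experimental in another.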

Paired with databricks/cli#5243 which teaches databricks aitools install (top-level) to:

  • read repo_dir and skip experimental entries by default,
  • install all of them with --experimental,
  • install one by name with --experimental required.
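
The three install rules above can be sketched as a small selection function. This is an illustration in Python, not the CLI's implementation (that lives in databricks/cli#5243); select_skills, its parameters, and the error messages are all hypothetical.

```python
from typing import Optional


def select_skills(skills: dict, name: Optional[str] = None,
                  experimental: bool = False) -> list:
    """Return the skill names an install would pick under the three rules."""

    def is_exp(skill_name: str) -> bool:
        return skills[skill_name].get("repo_dir") == "experimental"

    if name is not None:
        if name not in skills:
            raise KeyError(f"unknown skill: {name}")
        # Installing an experimental skill by name requires the flag.
        if is_exp(name) and not experimental:
            raise ValueError(f"{name} is experimental; pass --experimental")
        return [name]

    # No name given: stable only by default, everything with the flag.
    return sorted(n for n in skills if experimental or not is_exp(n))


# Hypothetical two-entry manifest used only for this demonstration.
demo = {
    "databricks-jobs": {"repo_dir": "skills"},
    "databricks-iceberg": {"repo_dir": "experimental"},
}
print(select_skills(demo))
print(select_skills(demo, experimental=True))
print(select_skills(demo, name="databricks-iceberg", experimental=True))
```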

Experimental and stable skills install under their plain names (e.g. ~/.claude/skills/databricks-iceberg/); the upstream repo enforces name-uniqueness across both directories, so no install-side suffix is needed.

Source

Final sync at merge time was 20a92a3 on databricks-solutions/ai-dev-kit:experimental ("tests: outcome-oriented rewrite across 7 skills + strip MCP tool_modules from all manifests").

Initial import was 9c7a5b3 (head of a-d-k PR #533 on the appkit-on-experimental branch). PR #533 has since merged into experimental (7b07f18). The branch went through periodic re-syncs during review to pull in upstream updates.

The rename (databricks-app-python → databricks-apps-python) is preserved in the merged version, which is what prevents a third skill-name collision with d-a-s's stable databricks-apps.

Direction caveat

In the Apr 28 thread (Slack link), Dustin's stated plan was to move databricks-agent-skills skills into ai-dev-kit's experimental branch as defaults. This PR went the other direction (a-d-k content → d-a-s/experimental). The plan post-merge is to invert the direction via git subtree — see TODO #3 below and a-d-k RFC PR #530.

TODOs / caveats for iteration

  1. Name collisions. Resolved in this PR:

    • 1a. databricks-jobs — merged into stable. Imported the comprehensive reference content from a-d-k's databricks-jobs skill into skills/databricks-jobs/, bumping version to 0.2.0. The merged skill keeps stable's scaffolding workflow + parent: databricks-core hierarchy + Codex agents/openai.yaml + compatibility note, and adds the experimental's full task-types reference (9 types), trigger types (6), notifications/health/retries/queues, and 7 worked end-to-end examples. Layered structure: SKILL.md as overview + four reference files (task-types.md, triggers-schedules.md, notifications-monitoring.md, examples.md). The experimental copy is removed.

      With the single-map manifest shape, collisions are no longer possible — _add_skill raises if the same skill name shows up under both skills/ and experimental/, so any future drift fails generation loudly.

    • 1b. databricks-model-serving: dropped from this PR. After a deep compare, the two skills cover almost entirely different surfaces: stable is ops-focused (manage existing endpoints via CLI); experimental is dev-focused (build & ship MLflow models / GenAI agents with autolog → mlflow.pyfunc.log_model → databricks.agents.deploy() → query, with full Classical ML / Custom PyFunc / ResponsesAgent + LangGraph / UCFunctionToolkit / VectorSearchRetrieverTool coverage). Near-zero content overlap. Experimental version also has heavy MCP-tool dependency (60+ refs to ai-dev-kit's manage_serving_endpoint, etc., that don't exist in the d-a-s/databricks aitools flow). Follow-up: port the high-value dev-side content into the stable skill (classical-ml autolog patterns, Custom PyFunc signatures, ResponsesAgent pattern with the create_text_output_item helper-method gotcha, UCFunctionToolkit + VectorSearchRetrieverTool with resource passthrough for auth, the Foundation Model API endpoint table). Strip MCP refs; replace with CLI/SDK equivalents. Owners: @databricks/eng-apps-devex (per CODEOWNERS).

  2. CODEOWNERS for experimental/ Resolved. Per @lennartkats-db review, /experimental/ owners are @lennartkats-db @simonfaltum @calreynolds @dustinvannoy-db (kept compact for now; broad participation invited via the broader collaborator set).

  3. No sync mechanism with upstream a-d-k. Resolved with a paired RFC. Two-part plan:

    • Pre-lock (this PR): periodic manual re-syncs from upstream ai-dev-kit into experimental/. Documented in experimental/README.md.
    • Post-lock (follow-up): invert the direction. a-d-k becomes the consumer; databricks-skills/imported/ in a-d-k is a git subtree of this repo's experimental/. RFC PR opened against a-d-k: RFC: subtree-sync skills from databricks-agent-skills/experimental databricks-solutions/ai-dev-kit#530 (draft). To make subtree work, d-a-s needs to publish an experimental-only branch via git subtree split --prefix=experimental after every push to main — that's a small workflow to add here in a follow-up PR. A one-shot preview branch experimental-only-preview was pushed to this repo to enable the RFC demo and should be deleted once the auto-publish workflow lands.
  4. No agent metadata. Resolved. scripts/skills.py auto-generates agents/openai.yaml + copies shared assets for each experimental skill on generate, using SKILL.md frontmatter. A DISPLAY_NAME_OVERRIDES map handles names whose hyphen-titlecase rendering breaks well-known capitalisation (AI Functions, AI/BI Dashboards, MLflow Evaluation, Unstructured PDF Generation — fixed per @lennartkats-db review). Stubs are only written when missing so upstream a-d-k can override by shipping its own files.

  5. databricks-pipelines was deliberately excluded. Resolved. a-d-k doesn't ship a databricks-pipelines skill under that name, but it does ship databricks-spark-declarative-pipelines covering the same product. After a deep compare, that experimental version covers a different surface than stable. Removed experimental/databricks-spark-declarative-pipelines/ from this PR. Follow-up TODO (post-merge): port the high-value pieces into stable skills/databricks-pipelines/ — DLT migration guide, workflow A/B/C decision matrix, per-language performance reference, language-selection rules. Strip MCP-tool refs. Owners: @lennartkats-db / @camielstee-db (per CODEOWNERS).

  6. spark-python-data-source naming exception. Kept as-is. The skill is about the OSS Apache Spark 4+ PySpark DataSource API (building custom connector libraries), not a Databricks product — only lightly flavored with Databricks idioms. The convention break is acceptable given the content.

  7. Versioning. Resolved. Bumped the extract_version_from_skill fallback in scripts/skills.py from 0.0.0 to 0.0.1 so the manifest never reports 0.0.0. Applies to skills with no explicit version: in their SKILL.md frontmatter. Sync-safe: when upstream a-d-k eventually adds version fields, those win; until then, the manifest reports the floor.

  8. installed_dir for experimental skills. Reversed during review per @dustinvannoy-db. Originally proposed: every experimental skill installs to ~/.claude/skills/<name>-experimental/. Replaced with: experimental skills install under their plain name (e.g. ~/.claude/skills/databricks-iceberg/). Upstream guarantees name-uniqueness across skills/ and experimental/, and the manifest generator (scripts/skills.py _add_skill) raises on any future collision so drift fails loudly rather than silently overwriting. CLI-side change shipped in databricks/cli#5243 (6d9c479f: drop SourceName and the suffixing in normalizeManifest).

  9. Excluded a-d-k content. Confirmed scope. Excluded: TEMPLATE/ (template, not a skill), install_skills.sh + install_genie_code_skills.py (a-d-k's installers — we use the cli installer instead), databricks-builder-app/ (a Python app for a-d-k's builder UI), databricks-mcp-server/ (the a-d-k MCP server — separate concern from skills), databricks-tools-core/ (Python lib used by a-d-k tooling — no experimental skill references it), hooks/hooks.json (a-d-k plugin lifecycle hooks tied to ${CLAUDE_PLUGIN_ROOT}/.claude-plugin/setup.sh/check_update.sh — plugin-specific, not skill content), plus top-level repo metadata (.github/, LICENSE.md, README.md, VERSION, install.{sh,ps1}, etc.). Verified no experimental skill cross-references any excluded path.

  10. README placement. Verified. experimental/README.md retains the adapted a-d-k skill list with a top warning block; the root README.md has an "Experimental Skills" section with an install-by-name example. Install commands use the new top-level surface: databricks aitools install [name] --experimental.

  11. Manifest shape. Resolved. Replaced the original two-map design (top-level skills + experimental_skills plus per-skill experimental bool) with a single skills map where each entry's repo_dir field is the source of truth. The manifest generator (scripts/skills.py) raises a clear error if the same skill name appears under both skills/ and experimental/.
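
The collision check described in items 1a, 8, and 11 can be sketched as follows. The real _add_skill in scripts/skills.py is not shown in this thread, so the function name add_skill, its signature, and the error text below are assumptions made for illustration.

```python
def add_skill(skills: dict, name: str, repo_dir: str) -> None:
    """Register a skill in the single map; raise if the name is already taken."""
    if name in skills:
        # Fail generation loudly instead of letting one entry overwrite another.
        raise ValueError(
            f"duplicate skill name {name!r}: already registered from "
            f"{skills[name]['repo_dir']}/, seen again under {repo_dir}/"
        )
    skills[name] = {"repo_dir": repo_dir}


skills: dict = {}
add_skill(skills, "databricks-jobs", "skills")
add_skill(skills, "databricks-iceberg", "experimental")

try:
    # Simulates drift: the same name reappears under experimental/.
    add_skill(skills, "databricks-jobs", "experimental")
except ValueError as err:
    print("generation failed:", err)
```

Because the map is keyed by skill name, the duplicate is caught at insertion time and no separate cross-directory pass is needed.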

Test plan

  • python3 scripts/skills.py generate regenerates the manifest cleanly.
  • python3 scripts/skills.py validate passes.
  • CI green on this branch.
  • Manual: databricks aitools install (no flag) installs only stable skills.
  • Manual: databricks aitools install --experimental installs both.
  • Manual: databricks aitools install databricks-iceberg errors because it's experimental.
  • Manual: databricks aitools install databricks-iceberg --experimental installs that one skill.

Post-merge follow-ups (reviewer-flagged)

  • Move reference files into references/ directories for consistency across skills (Dustin + Lennart).
  • Long-term-intention review of all 19 skills — confirm each is intended to stay (Dustin).
  • databricks-execution-compute/scripts/compute.py: when promoting out of experimental, rely on the Databricks CLI rather than the bundled compute script (Lennart, non-blocking).
  • databricks-mlflow-evaluation: review overlap with MLflow-repo skills (Lennart, non-blocking; cc'd @simonfaltum).
  • databricks-unstructured-pdf-generation: re-evaluate inclusion — "not very Databricks-specific" (Lennart, non-blocking; cc'd @dustinvannoy-db).
  • databricks-vector-search: "no longer experimental in Genie Code world" — move to stable or out (Lennart, non-blocking; cc'd @simonfaltum).
  • skills/databricks-jobs/SKILL.md: non-experimental skill additions should also propagate into Genie Code (Lennart, for @simonfaltum).

This pull request and its description were written by Claude.

@jamesbroadhead jamesbroadhead force-pushed the experimental-aidevkit branch from 2035bab to 2a45b83 Compare May 12, 2026 16:06
@jamesbroadhead jamesbroadhead requested a review from fjakobs as a code owner May 12, 2026 16:09
@jamesbroadhead jamesbroadhead force-pushed the experimental-aidevkit branch from 8387654 to a0fbb21 Compare May 12, 2026 16:14
jamesbroadhead added a commit to databricks/cli that referenced this pull request May 12, 2026
The default DATABRICKS_SKILLS_REF pin is a release tag that pre-dates
the experimental_skills manifest section (see
databricks/databricks-agent-skills#73). Users who pass --experimental
against that ref today silently get no experimental skills installed.

Log a Warnf at install time pointing them at the env var override
(=main, or a future release that includes the section).

Helper: manifestHasExperimental(), unit-tested in source_test.go.

Co-authored-by: Isaac
@jamesbroadhead jamesbroadhead changed the title experimental: import ai-dev-kit skills as best-effort skills experimental: import ai-dev-kit skills into experimental/ director May 12, 2026
@jamesbroadhead jamesbroadhead changed the title experimental: import ai-dev-kit skills into experimental/ director experimental: import ai-dev-kit skills into experimental/ directory May 12, 2026
jamesbroadhead added a commit that referenced this pull request May 12, 2026
Replaces the previous import (a-d-k commit 2228c3e on add_appkit) with the
head of a-d-k PR #533 (commit 9c7a5b3 on appkit-on-experimental), which
targets a-d-k's experimental branch.

Changes:
- Refresh 23 experimental skill directories from the new source.
- Drop databricks-lakebase-provisioned — removed on a-d-k experimental.
- databricks-apps-python: rename + SKILL.md now leads with AppKit
  (TypeScript + React SDK) and demotes Python frameworks to alternatives;
  6-mcp-approach.md replaced with 6-cli-approach.md.
- databricks-lakebase-autoscale/references/connection-patterns.md: change
  placeholder `user:password` to `<user>:<password>` so the secret scanner
  doesn't flag the doc-only example. Cosmetic only.
- Continue to exclude databricks-model-serving and
  databricks-spark-declarative-pipelines (PR #73 TODOs #1b and #5).
- Regenerate manifest.json and agents/openai.yaml stubs via
  scripts/skills.py generate.
- Update experimental/README.md provenance section with the new SHA,
  branch, and divergence notes.

Co-authored-by: Isaac
jamesbroadhead added a commit that referenced this pull request May 12, 2026
Merges the comprehensive jobs reference content from
experimental/databricks-jobs/ into skills/databricks-jobs/ and removes
the experimental copy.

What's new in stable databricks-jobs (v0.2.0):
- Full task-types reference (9 types: notebook, spark_python,
  python_wheel, sql, dbt, pipeline, spark_jar, run_job, for_each)
- All 6 trigger types with examples (cron, periodic, file_arrival,
  table_update, continuous, manual) + combining + pause/resume
- Notifications + health rules + retries + timeouts + queues
- 7 end-to-end worked examples (ETL, warehouse refresh, event-driven,
  ML training, multi-env, streaming, cross-job orchestration)
- run_if conditions, environments (serverless deps), permissions

What's retained from the prior stable skill:
- parent: databricks-core hierarchy
- Compatibility note + version metadata (bumped 0.1.0 → 0.2.0)
- Scaffolding workflow (databricks bundle init + CLAUDE.md/AGENTS.md
  template + project structure)
- Unit testing + development workflow sections
- agents/openai.yaml + assets/

Cleanups during the merge:
- Replaced the trigger-spam description with a terse one
- Normalized hard-coded /Workspace/Users/user@example.com/ paths in
  the imported reference files to /Workspace/Shared/

scripts/skills.py: updated SKILL_METADATA description for jobs to
reflect the broader scope. Manifest regenerated; experimental count
drops from 23 to 22.

Resolves PR #73 TODO #1a.

Co-authored-by: Isaac
Adds an experimental/ directory containing the 26 agent skills from
databricks-solutions/ai-dev-kit. These are imported as a snapshot on a
best-effort basis — they are not officially supported skills and follow a
looser contract than skills/ (no agents/openai.yaml, no shared-asset sync,
no SKILL_METADATA gate).

The manifest now exposes them under a new top-level experimental_skills
map so consumers can distinguish them from stable skills and skip them by
default. scripts/skills.py handles the new directory; the existing
generate / validate flow is unchanged for stable skills.

Co-authored-by: Isaac
Owners are the top contributors to databricks-solutions/ai-dev-kit
(>=10 commits at import time). The cross-org team
@databricks-solutions/ai-dev-kit-maintainers is the canonical owner;
this line can be replaced with that team handle if the team gets
write access to this repo.

Co-authored-by: Isaac
Adds a Provenance & sync model section to experimental/README.md:
- Transition phase: source of truth is upstream ai-dev-kit; this dir
  gets periodic manual re-syncs.
- Post-lock: source of truth is this repo; ai-dev-kit's
  databricks-skills/ becomes read-only.

Co-authored-by: Isaac
Adds two helpers to scripts/skills.py that run as part of `generate`:

- `ensure_experimental_codex_metadata` copies the shared
  assets/databricks.{svg,png} into each experimental skill (mirroring
  what stable skills get via sync_assets).
- `synthesize_openai_yaml` writes agents/openai.yaml from each
  experimental skill's SKILL.md frontmatter (display_name from the
  skill name, short_description from the first sentence of the
  frontmatter description, brand_color and icon paths fixed).

Both run only when the destination file is absent, so upstream
ai-dev-kit can override by shipping its own openai.yaml or assets.

This closes the cosmetic gap that experimental skills installed into
Codex CLI would render without an icon or marketplace metadata.

Co-authored-by: Isaac
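
A rough sketch of the stub-generation rules in the commit message above: display name from the skill name, short description from the first sentence, and write only when the file is absent. The helper names, the override keys, and the exact override strings are assumptions; the PR description names the four affected display names but does not show the real DISPLAY_NAME_OVERRIDES map or the openai.yaml schema.

```python
import re
from pathlib import Path

# Hypothetical override map. The keys are assumed skill directory names and
# the values copy the display names listed in the PR description.
DISPLAY_NAME_OVERRIDES = {
    "databricks-ai-functions": "AI Functions",
    "databricks-aibi-dashboards": "AI/BI Dashboards",
}


def display_name(skill_name: str) -> str:
    """Title-case the hyphenated name unless an override exists."""
    if skill_name in DISPLAY_NAME_OVERRIDES:
        return DISPLAY_NAME_OVERRIDES[skill_name]
    return " ".join(part.capitalize() for part in skill_name.split("-"))


def first_sentence(description: str) -> str:
    """Return the text up to the first sentence-ending punctuation mark."""
    return re.split(r"(?<=[.!?])\s+", description.strip(), maxsplit=1)[0]


def write_if_missing(path: Path, text: str) -> bool:
    """Write the stub only when absent so an upstream-shipped file wins."""
    if path.exists():
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text)
    return True


print(display_name("databricks-vector-search"))
print(display_name("databricks-ai-functions"))
print(first_sentence("Build vector indexes. Use when asked about search."))
```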
Same pattern as databricks-model-serving: the experimental version
covers a different surface than the stable databricks-pipelines skill
(workflow / scaffolding / DLT-migration / per-language performance vs
feature reference / decision tree / common traps / format options).
The DAB-coupled scaffolding workflow is also the specific concern
Dustin flagged in the Apr 28 Slack thread for demo-generator flows.

Removed experimental/databricks-spark-declarative-pipelines/;
manifest regenerated (24 experimental skills). Follow-up TODO: port
the high-value pieces (DLT migration guide, workflow A/B/C decision
matrix, per-language performance reference, language-selection rules)
into skills/databricks-pipelines/, stripping MCP-tool refs.

Co-authored-by: Isaac
Adds the actual a-d-k commit hash (2228c3e on the add_appkit branch,
5 commits ahead of origin/main at import time) along with a note
about the local deltas vs public main. Surfaces the key one: the
databricks-app-python -> databricks-apps-python rename hadn't merged
upstream, and pulling from the renamed version is what avoids a 3rd
skill-name collision with d-a-s's stable databricks-apps.

Co-authored-by: Isaac
extract_version_from_skill() falls back to a synthetic version when a
skill's SKILL.md frontmatter has no version: field. The previous
fallback was 0.0.0, which several install tools treat as "unset"
rather than "first release".

Bump to 0.0.1. Affects all 24 experimental skills (imported from
ai-dev-kit without versions) plus the stable databricks-dabs skill.
Skills with an explicit version are unchanged.

Co-authored-by: Isaac
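
The fallback behaviour described above can be sketched like this. It is a simplified stand-in for extract_version_from_skill, whose real body is not shown here; the function name extract_version, the constant name, and the line-based frontmatter scan are assumptions.

```python
import re

FALLBACK_VERSION = "0.0.1"  # floor reported when frontmatter has no version


def extract_version(skill_md: str) -> str:
    """Return the frontmatter version, or the fallback when it is absent."""
    match = re.match(r"^---\n(.*?)\n---", skill_md, re.DOTALL)
    if match:
        for line in match.group(1).splitlines():
            key, sep, value = line.partition(":")
            if sep and key.strip() == "version":
                # An explicit version always wins over the fallback.
                return value.strip().strip("\"'") or FALLBACK_VERSION
    return FALLBACK_VERSION


with_version = "---\nname: databricks-jobs\nversion: 0.2.0\n---\n# Jobs\n"
without_version = "---\nname: databricks-iceberg\n---\n# Iceberg\n"
print(extract_version(with_version))
print(extract_version(without_version))
```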
- experimental/README.md: install examples now use the -experimental
  suffix on the skill name + the --experimental flag (matching the
  install-path behaviour landed in databricks/cli#5243). Adds a short
  note explaining why the in-repo dir name and the install dir name
  differ.
- experimental/README.md: drop databricks-model-serving from the
  collision example (it was removed from this PR earlier).
- experimental/README.md: update the (also available as stable skill)
  note for databricks-jobs to point at the open TODO #1a.
- Root README: clarify the suffixed install name in the by-name install
  example.

Co-authored-by: Isaac
The previous regex-only parser in extract_description_from_skill()
captured the YAML block-scalar indicator (`>-`) verbatim, so any SKILL.md
that wrote `description: >-\n  multi-line content` produced a manifest
entry of `">-"`. The new ai-dev-kit import (PR #533) brought two such
files — databricks-dbsql and databricks-execution-compute — which
landed corrupted descriptions in manifest.json and corrupted
short_description / default_prompt in agents/openai.yaml.

Walk the frontmatter line by line: if the value is a block-scalar
indicator (|, |-, |+, >, >-, >+), aggregate the indented continuation
lines (folded with spaces for `>`-style, newlines for `|`-style).

Regenerate manifest.json and the two affected agents/openai.yaml stubs.

Co-authored-by: Isaac
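
A simplified sketch of the line-by-line walk described above. It is not the code from scripts/skills.py: the function name read_description is hypothetical, and the sketch ignores YAML chomping rules and nested indentation, always dropping trailing blank lines.

```python
BLOCK_INDICATORS = {"|", "|-", "|+", ">", ">-", ">+"}


def read_description(frontmatter: str) -> str:
    """Return the description value, expanding YAML block scalars."""
    lines = frontmatter.splitlines()
    for i, line in enumerate(lines):
        if not line.startswith("description:"):
            continue
        value = line[len("description:"):].strip()
        if value not in BLOCK_INDICATORS:
            return value.strip("\"'")  # plain or quoted single-line value
        # Collect the indented continuation lines after the indicator.
        body = []
        for cont in lines[i + 1:]:
            if cont.strip() and not cont.startswith((" ", "\t")):
                break  # the next top-level key ends the block
            body.append(cont.strip())
        while body and not body[-1]:
            body.pop()  # drop trailing blank lines
        # Folded style (>) joins with spaces; literal style (|) keeps newlines.
        joiner = " " if value.startswith(">") else "\n"
        return joiner.join(body)
    return ""


folded = "name: databricks-dbsql\ndescription: >-\n  multi-line\n  content\nversion: 1"
print(read_description(folded))
```

With the regex-only parser the same input produced the bare indicator `>-`, which is the corruption the commit message describes.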
Replaces the hand-rolled block-scalar walker (added one commit ago) with
PyYAML's safe_load. PyYAML's default SafeLoader is pure-Python — no C
extension required — and handles every YAML edge case for free instead
of reimplementing them.

Side-benefit: also fixes a second latent bug. The regex parser stripped
the outer YAML quotes but left inner `\"` escapes intact as literal
backslash-quote characters, so descriptions like
`"... mentions \"switch workspace\"..."` ended up in manifest.json
with the backslashes preserved. yaml.safe_load resolves these
correctly. Regenerated manifest reflects the fix for databricks-config.

Co-authored-by: Isaac
PR #533 has merged into upstream a-d-k experimental. The
databricks-skills/ tree is byte-identical between the previous
import SHA (9c7a5b3) and the merge commit (7b07f18) — only
install.{sh,ps1} changed, which we don't import. README updated
to point at the now-authoritative branch + SHA and drop the "one
commit ahead of origin/experimental" caveat.

Co-authored-by: Isaac
@jamesbroadhead jamesbroadhead force-pushed the experimental-aidevkit branch from 50467c3 to 5719156 Compare May 15, 2026 09:45
@jamesbroadhead
Contributor Author

Stable ↔ experimental skill overlap

Surfacing the overlap explicitly per Quentin's question. There are 6 stable/experimental pairs with non-trivial topical overlap; the remaining 16 experimental skills cover surfaces with no stable equivalent.

| # | Stable | Experimental (this PR) | Topic | Overlap | Stable strengths (kept) | Experimental strengths (potential merge candidates) | Direction |
|---|---|---|---|---|---|---|---|
| 1 | databricks-apps (196L SKILL + 4 refs) | databricks-apps-python (259L SKILL + 6 refs + 4 examples) | Building Databricks Apps | Both scaffold apps on the Apps platform; both cover Lakebase integration, model serving, deployment | AppKit-first (TypeScript/React): SQL queries, tRPC, smoke tests, Playwright selector rules, proto-first contracts, appkit lint cast rules, AppKit version pinning, Genie agent workflow | Python framework menu (Streamlit/Dash/Gradio/Flask/FastAPI/Reflex), explicit OAuth authorization patterns, Foundation Model API examples (fm-minimal-chat, fm-parallel-calls, fm-structured-outputs), app-resources schema | Keep both. Disjoint primary audience (TS vs Python). Port FM-API examples + Python-framework matrix into stable's references/other-frameworks.md post-merge. |
| 2 | databricks-dabs (39L SKILL + 5 refs, 450L total) | databricks-bundles (324L SKILL + SDP/alerts files, 497L total) | Declarative Automation Bundles | Both cover databricks.yml, resources, targets, deploy/validate, SDP pipelines, SQL alerts | Layered references (bundle-structure / deploy-and-run / resource-permissions / alerts / sdp-pipelines); naming convention <name>.<type>.yml; --strict validate guidance; permissions matrix | Self-contained single-file primer with full databricks.yml skeleton (variables, dev/prod targets); similar SDP + alerts content but flatter | Stable supersedes. Drop the experimental copy, or trim it to a thin pointer. The duplication is the cleanest example of what Quentin flagged. |
| 3 | databricks-lakebase (300L SKILL + 3 refs, 871L total) | databricks-lakebase-autoscale (232L SKILL + 5 refs, 1146L total) | Lakebase Postgres (Autoscaling) | Both cover projects/branches/endpoints, connectivity, OAuth token refresh, reverse-ETL via synced tables, compute sizing | Resource-hierarchy diagram, capacity planning + sizing details, Data API / PostgREST, app-integration via databricks-apps, compliance matrix (HIPAA/C5/TISAX) | Per-area refs (projects / branches / computes / connection-patterns / reverse-etl); 354L connection-patterns deep-dive; field-mask update-* CLI examples; region matrix (AWS + Azure-beta) | Stable supersedes on hierarchy/synced-tables. Port the deeper connection-patterns content + field-mask CLI usage into stable's references/connectivity.md. |
| 4 | databricks-core (CLI/auth/profile entrypoint) | databricks-config | Workspace/profile management | Both cover databricks auth, ~/.databrickscfg, profile switching, OAuth + PAT login | "NEVER auto-select a profile" rule, CLI version-check + install flow, REST-API fallback for sandboxed envs, Claude Code separate-shell guidance, parent-skill index | Plain cheatsheet of auth describe / config get / auth login / configure | Stable supersedes. Experimental is a strict subset: drop. |
| 5 | databricks-core (CLI-first) | databricks-python-sdk | SDK + Connect quickstart | Both touch Databricks Connect, CLI version, profile selection | Profile-selection rules, CLI install, REST fallback | databricks-sdk Python usage, WorkspaceClient REST patterns, Connect quickstart, doc-index reference | Mostly disjoint. Keep as SDK-focused complement, or fold the SDK section into a sibling stable skill; does not collide with databricks-core's CLI scope. |
| 6 | databricks-serverless-migration | databricks-execution-compute | Compute selection / execution | Both mention serverless vs classic / Databricks Connect | Migration lifecycle (Ingest/Analyze/Test/Validate); blocker detection (RDDs, DBFS, HMS, streaming triggers, init scripts); Spark Connect fixes | 3-mode decision matrix (Connect / Serverless Job / Interactive Cluster); compute + warehouse CRUD cheatsheet; cold-start expectations | Keep both. Different angles: migration vs day-to-day execution choice. Cross-link from stable. |

No stable equivalent (16 skills — no overlap)

databricks-agent-bricks, databricks-ai-functions, databricks-aibi-dashboards, databricks-dbsql, databricks-docs, databricks-genie, databricks-iceberg, databricks-metric-views, databricks-mlflow-evaluation, databricks-spark-structured-streaming, databricks-synthetic-data-gen, databricks-unity-catalog, databricks-unstructured-pdf-generation, databricks-vector-search, databricks-zerobus-ingest, spark-python-data-source.

TL;DR

Three pairs are real duplicates that should converge over time: dabs/bundles (drop experimental), lakebase/lakebase-autoscale (drop experimental, port connection-patterns content first), core/config (drop experimental).

Two pairs are complementary and worth keeping side-by-side: apps/apps-python (TS vs Python audiences), serverless-migration/execution-compute (migration vs runtime selection).

One pair is mostly disjoint: core/python-sdk (CLI vs SDK).

The overlap is intentional for this PR, per the rationale in the description: port a-d-k mostly intact, then resolve the duplicates as a series of single-skill PRs against stable, with the experimental copies removed in the same change. Happy to start that series if folks agree on the deltas above.

This comment was written by Claude.

Removes three experimental skills that duplicate stable equivalents
without adding net new content:

- databricks-bundles → use stable `databricks-dabs`
- databricks-lakebase-autoscale → use stable `databricks-lakebase`
- databricks-config → use stable `databricks-core`

Cross-references in the surviving experimental skills (apps-python,
python-sdk, zerobus-ingest, synthetic-data-gen, 4-deployment.md) now
point at the stable names by bare name — matching the convention
already used in stable skills' "Related Skills" / "Product Skills"
sections.

`experimental/README.md` records the removals and points readers at
the stable replacements. Manifest regenerated — experimental skill
count drops from 22 to 19.

Constraint per James: root must not depend on experimental, but
experimental may depend on root. Honored — all new references go in
that direction, and no stable skill content was changed by this
commit.

Co-authored-by: Isaac
Collaborator

@dustinvannoy-db dustinvannoy-db left a comment

LGTM

@jamesbroadhead
Contributor Author

👋 Claude here on James's behalf — per-skill audit of the 5 non-blocking items from this PR's review threads. Each item gets a finding + recommendation; owners can decide what (if anything) becomes a follow-up PR.

#7: databricks-execution-compute/scripts/compute.py (Lennart, cc @dustinvannoy-db)

Inspected the script: 743 lines, 19 functions. The vast majority shadow what's already in the Databricks CLI:

| Function in compute.py | CLI equivalent |
|---|---|
| list_clusters, get_best_cluster, start_cluster, get_cluster_status | databricks clusters list/get/start |
| create_cluster, terminate_cluster, delete_cluster | databricks clusters create/delete/permanent-delete |
| list_node_types, list_spark_versions | databricks clusters list-node-types/spark-versions |

The one differentiator is execute_databricks_command / run_code_on_serverless: the CLI has no one-shot "run python on cluster" command, so that path requires a databricks api post .../execution-context workflow (which the SKILL could document instead).

Recommendation: when promoting databricks-execution-compute out of experimental, delete the bundled compute.py and document the CLI surface for the duplicated paths. For code execution, either (a) document the databricks api workflow against command-execution, or (b) propose a databricks compute execute-code subcommand in databricks/cli to close the gap. Not blocking; just a lock-in checklist item.


#8: databricks-mlflow-evaluation vs MLflow-repo skills (Lennart, cc @dustinvannoy-db / @simonfaltum)

Searched github.com/mlflow/mlflow for a skills/ directory or an agent-skills-style repo under the mlflow org — neither exists publicly. The comparison Lennart was implicitly drawing isn't currently possible against a public source.

Recommendation: defer until the MLflow-repo skill source is identified (internal repo? planned upstream?). databricks-mlflow-evaluation (~6700 lines across 11 references) currently has no obvious external duplicate.


#9: databricks-unstructured-pdf-generation Databricks-specificity (Lennart, cc @dustinvannoy-db)

Confirmed: the skill is ~80% Databricks-agnostic HTML → PDF tooling (weasyprint / wkhtmltopdf, HTML template patterns), ~20% Unity Catalog volume upload (databricks fs cp). The valuable Databricks-specific piece is the workflow shape (generate test PDFs → upload to UC volume → use them as a RAG evaluation dataset).

Recommendation: keep the skill, but either (a) trim to just the UC-volume upload step and link to standard HTML→PDF tooling externally, or (b) rename to something like databricks-rag-test-data to make the workflow shape the headline. The current name + scope makes the Databricks angle look thin.


#10: databricks-vector-search "no longer experimental in Genie Code world" (Lennart, cc @simonfaltum)

Confirmed: there's no stable databricks-vector-search skill in skills/. If Genie Code treats Vector Search as a stable feature, the skill in experimental/ is mis-classified.

Recommendation: promote experimental/databricks-vector-search → skills/databricks-vector-search. The skill itself (335 lines SKILL.md + 4 references) is in good shape; the move is a one-PR git mv + manifest regen + version bump. Genie Code's @simonfaltum is best-placed to confirm the stable classification.


#11: skills/databricks-jobs/SKILL.md propagation to Genie Code (Lennart, for @simonfaltum)

This isn't a d-a-s repo change — Genie Code consumes d-a-s via databricks aitools install (per databricks/cli PR #5243, now merged). The stable databricks-jobs at v0.2.0 (with the four task-type/triggers/notifications/examples refs added during this PR) ships automatically when users run the install in their Genie Code workspace context.

Recommendation: action is in the Genie Code product, not in d-a-s. @simonfaltum to verify the install path picks up skills/databricks-jobs/ correctly in Genie Code's workspace context.


Bonus finding (during audit): skills/databricks-jobs/ has its references at the top level (task-types.md, triggers-schedules.md, etc.) rather than under references/, unlike the other stable skills (databricks-apps, databricks-lakebase, databricks-pipelines). The newly-opened PR #86 restructures the experimental skills only; happy to extend it to cover stable databricks-jobs too if you'd like consistency across the repo.

(comment posted by Claude)

jamesbroadhead added a commit that referenced this pull request May 26, 2026
Phase 2 of the a-d-k → d-a-s port for databricks-spark-declarative-pipelines.

Adds three new references that fill the dev-side gaps that stable's per-feature
× per-language reference files don't cover:

- references/workflows.md — Workflow A/B/C chooser (standalone bundle via
  `databricks pipelines init`, pipeline-in-existing-bundle, rapid CLI iteration
  with no bundle); language selection rules; start-update + poll-the-update
  pattern with the "never poll top-level pipeline state" rationale; edit/
  re-upload/restart flow.
- references/pipeline-configuration.md — Full JSON config reference for
  `pipelines create|update` (top-level fields, clusters, event_log,
  notifications, configuration, run_as, restart_window, environment,
  deployment); variant snippets (dev mode, non-serverless, continuous,
  notifications, autoscaling, custom event log, serverless Python deps);
  multi-schema patterns; platform constraints.
- references/performance.md — Liquid Clustering with per-layer key guidance
  (bronze/silver/gold), cluster-key type rules, table properties, state
  management strategies for streaming, join optimization, query optimization,
  pre-aggregation, compute config, monitoring.

SKILL.md updates:
- New "Choose Your Workflow" and "Language Selection" sections.
- Scaffolding section documents both `databricks pipelines init` and
  `databricks bundle init lakeflow-pipelines`.
- Pipeline API Reference list reorganized into Project & Lifecycle and
  Datasets, Flows & Quality groups.
- Version bumped to 0.3.0.

Deliberately dropped from a-d-k's databricks-spark-declarative-pipelines:
- 2-mcp-approach.md (a-d-k experimental already replaced with 2-cli-approach.md
  — MCP tool refs removed per PR #73 policy).
- python/{1..4}-*.md and sql/{1..4}-*.md (covered by stable's existing per-
  feature × per-language refs: python-basics, sql-basics, auto-loader-*,
  auto-cdc-*, streaming-table-*, sink-*, foreach-batch-sink-*, etc.).
- scripts/exploration_notebook.py (stable convention has no scripts/; users
  use the CLI directly or the explorations/ folder generated by `pipelines
  init`).

Source: databricks-solutions/ai-dev-kit@experimental.

Co-authored-by: Isaac
jamesbroadhead added a commit that referenced this pull request May 26, 2026
Phase 1 of #73's TODO #1b. Adds references/fm-api-endpoints.md with the
curated Foundation Model API endpoint table (chat/instruct + embedding
models) from databricks-solutions/ai-dev-kit's model-serving skill,
plus common defaults and query examples (CLI + SDK).

Stripped: the cloud/language prefix on the docs link, and the leftover
MCP-tool references in the source. The endpoint table itself is static
catalog data — no MCP coupling.

SKILL.md updates:
- bump version to 0.2.0
- point Endpoint Types table at the new reference
- point the Foundation Model discovery bullet at the new reference

Subsequent phases (separate PRs / commits) port the remaining dev-side
content: classical-ml autolog patterns, Custom PyFunc signatures,
ResponsesAgent with the create_text_output_item gotcha, UCFunctionToolkit
+ VectorSearchRetrieverTool resource passthrough.

Co-authored-by: Isaac
jamesbroadhead added a commit that referenced this pull request May 26, 2026
…-d-k (#85)

## Summary

Ports the `databricks-spark-declarative-pipelines` skill from
[`databricks-solutions/ai-dev-kit`](https://github.com/databricks-solutions/ai-dev-kit/tree/experimental/databricks-skills/databricks-spark-declarative-pipelines)
into stable `skills/databricks-pipelines/`. Source:
`databricks-solutions/ai-dev-kit:experimental`.

Completes d-a-s [PR
#73](#73)
TODO #5. Pairs with a-d-k [PR
#546](databricks-solutions/ai-dev-kit#546),
which tombstones the a-d-k skill once this lands.

Stable's `databricks-pipelines` already covered the per-feature ×
per-language API/options surface (decision tree, common traps, format
options, dataset/flow/quality references). a-d-k's version covered
scaffolding/workflows, configuration, performance tuning, DLT migration,
and several streaming patterns + Kafka ingestion + SCD-2 query patterns
that stable lacked. This PR adds a-d-k's net-new content as new
`references/` files; the per-feature reference structure is preserved.

## Changes

### New `references/`

- `dlt-migration.md` — both migration paths (DLT Python → SDP Python via
`pyspark.pipelines`, DLT Python → SDP SQL) with side-by-side conversions
for decorators, reads, expectations, CDC/SCD, and partitioning → liquid
clustering.
- `workflows.md` — Workflow A/B/C chooser (standalone bundle via
`databricks pipelines init`, pipeline-in-existing-bundle, rapid CLI
iteration with no bundle); language-selection rules; start-update +
poll-the-update pattern (with the "never poll top-level pipeline state
because RETRY_ON_FAILURE flips it back to RUNNING" rationale);
edit/re-upload/restart flow; Python SDK alternative.
- `pipeline-configuration.md` — Full JSON config reference for
`pipelines create|update` (top-level fields, `clusters`, `event_log`,
`notifications`, `configuration`, `run_as`, `restart_window`,
`environment`, `deployment`); variant snippets (dev mode,
non-serverless, continuous, notifications, autoscaling, custom event
log, serverless Python deps); multi-schema patterns; platform
constraints.
- `performance.md` — Liquid Clustering with per-layer key guidance
(bronze/silver/gold); cluster-key type rules; table properties;
state-management strategies for streaming; join optimization
(stream-to-static, stream-to-stream with time bounds); query
optimization; pre-aggregation; compute config; monitoring.
- `streaming-patterns.md` — Deduplication (by key, with time window,
composite); windowed aggregations (tumbling, multi-size, session
windows); event-time vs processing-time; rescue-data quarantine (Auto
Loader `_rescued_data` → bronze_quarantine + silver_clean fanout);
stream-to-stream join as a pattern; running totals; anomaly detection
(rolling z-score outlier flag); end-to-end lag monitoring.
- `kafka.md` — Basic Kafka read (Python + SQL); JSON payload parsing
with explicit schemas; Databricks Secrets SASL/PLAIN auth; mTLS notes;
Event Hubs via the Kafka protocol; pipeline-config plumbing for
brokers/topics; pointer to `sink.md` for writing back to Kafka. Fills a
full gap — stable's SKILL.md API table listed `read_kafka` and
`format("kafka")` with no linked skill.
- `scd-2-querying.md` — `__START_AT` / `__END_AT` temporal semantics;
current-state materialized views; point-in-time queries with the
inclusive-lower / exclusive-upper boundary; per-entity history;
period-bounded change analysis; joining facts with historical dimensions
(as-of-transaction-time and current-dim variants); pre-filter MV
optimization; clustering on `(entity_key, __START_AT)`.
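
The start-update + poll-the-update pattern from `workflows.md` could look roughly like the sketch below. It polls the state of one specific update rather than the top-level pipeline state. `get_update_state` is a hypothetical callable wrapping whatever call fetches the update, and the set of terminal state names is an assumption about the Pipelines API, not something quoted from the reference file.

```python
import time
from typing import Callable

# Assumed terminal states for a single pipeline update.
TERMINAL_UPDATE_STATES = {"COMPLETED", "FAILED", "CANCELED"}


def wait_for_update(
    get_update_state: Callable[[], str],
    poll_seconds: float = 10.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll one update until it reaches a terminal state and return that state.

    The caller supplies get_update_state, which must report the state of the
    update that was just started, not the pipeline's top-level state.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        state = get_update_state()
        if state in TERMINAL_UPDATE_STATES:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"update still in state {state!r} after timeout")
        sleep(poll_seconds)
```

Injecting `sleep` keeps the loop testable without real waiting; a caller would normally leave it at the default.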

### `SKILL.md`

- New "Choose Your Workflow" and "Language Selection" sections near
scaffolding.
- Scaffolding section documents both `databricks pipelines init` (newer,
focused) and `databricks bundle init lakeflow-pipelines`
(template-based).
- Pipeline API Reference list reorganized: **Project & Lifecycle**
(workflows, configuration, performance, DLT migration) and **Datasets,
Flows & Quality** (the existing per-feature refs + new kafka,
scd-2-querying, streaming-patterns).
- Version bumped to `0.3.0`.

### Cross-references in existing references

- `auto-loader.md` → `streaming-patterns.md` (quarantine), `kafka.md`,
lag monitoring.
- `auto-cdc.md` → `scd-2-querying.md` for reading SCD-2 history tables.
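
To make the inclusive-lower / exclusive-upper boundary from `scd-2-querying.md` concrete, here is a minimal in-memory sketch. The sample rows and helper names are hypothetical, and treating a null `__END_AT` as "still current" is my assumption about how SCD-2 history tables mark the open-ended row.

```python
from datetime import datetime
from typing import Optional


def is_active_at(start_at: datetime, end_at: Optional[datetime],
                 as_of: datetime) -> bool:
    """True when as_of falls in [start_at, end_at); end_at=None means open-ended."""
    return start_at <= as_of and (end_at is None or as_of < end_at)


def rows_as_of(history: list[dict], as_of: datetime) -> list[dict]:
    """Point-in-time view: keep only the versions active at as_of."""
    return [
        row for row in history
        if is_active_at(row["__START_AT"], row["__END_AT"], as_of)
    ]


# Hypothetical history for one entity: a tier change on 2024-06-01.
history = [
    {"id": 1, "tier": "silver",
     "__START_AT": datetime(2024, 1, 1), "__END_AT": datetime(2024, 6, 1)},
    {"id": 1, "tier": "gold",
     "__START_AT": datetime(2024, 6, 1), "__END_AT": None},
]
```

Querying exactly at the change timestamp returns only the newer version, which is the property the half-open interval is meant to guarantee: no instant matches two versions of the same entity.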

## Deliberately dropped from a-d-k

| a-d-k file | Why dropped |
|------------|-------------|
| `references/2-mcp-approach.md` | a-d-k experimental already renamed
this to `2-cli-approach.md`; MCP tool refs stripped per d-a-s PR #73
policy. CLI flow now lives in `workflows.md` as Workflow C. |
| `references/python/1-syntax-basics.md`,
`references/sql/1-syntax-basics.md` | Covered by stable's
`python-basics.md`, `sql-basics.md`, and the per-feature references
(streaming-table, materialized-view, temporary-view, view-sql). |
| `references/python/{2,3,4}-*.md`, `references/sql/{2,3,4}-*.md` |
Pattern content ported into `streaming-patterns.md`, `kafka.md`,
`scd-2-querying.md` (this PR); API/options content already covered by
stable's per-feature × per-language references. |
| `scripts/exploration_notebook.py` | Stable convention has no
`scripts/` directory under a skill. `databricks pipelines init`
generates an `explorations/` folder; users use the CLI or the generated
notebook directly. |

## Test plan

- [x] `python3 scripts/skills.py generate` clean.
- [x] `python3 scripts/skills.py validate` passes.
- [x] Merged `origin/main` mid-port (resolved version conflict — kept
`0.3.0`; took main's CLI install command + compatibility bump).
- [ ] CI green on this branch.
- [ ] Owner review (`@lennartkats-db` / `@camielstee-db` per
CODEOWNERS).

This pull request and its description were written by Claude.
jamesbroadhead added a commit that referenced this pull request May 26, 2026
## Summary

Per [d-a-s
#73](#73)
post-merge follow-ups (`@dustinvannoy-db` and `@lennartkats-db`):
consolidate every skill's reference layout to use
`references/<file>.md`, matching the convention already in place for
`databricks-apps`, `databricks-lakebase`, and `databricks-pipelines`.

## Changes

**Experimental skills** (commit 1):
- Move 51 top-level `.md` files (excluding `SKILL.md`) into per-skill
`references/` directories across 12 experimental skills.
- Rewrite all 193 path references in the corresponding `SKILL.md` files.
- Wire in one orphan: `databricks-python-sdk/doc-index.md` was never
referenced from its SKILL.md. Added a "SDK Reference" section in
SKILL.md pointing at it + the existing `examples/` directory.

**Stable `databricks-jobs`** (commit 2):
- Same move for the 4 top-level reference files (`examples.md`,
`notifications-monitoring.md`, `task-types.md`, `triggers-schedules.md`)
→ `skills/databricks-jobs/references/`.
- Rewrite 36 path references in `SKILL.md`.
- `databricks-jobs` was the last stable skill with the
references-at-top-level layout.

Manifest regenerated; `python3 scripts/skills.py validate` passes.
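
For readers who want to reproduce the move on another skill, the mechanical part can be sketched as below. This is a hypothetical helper, not the script used for this PR (the PR itself records the moves as git renames); it uses plain filesystem renames and a simple regex, and assumes `SKILL.md` mentions each reference by its bare file name.

```python
import re
from pathlib import Path


def move_references(skill_dir: Path) -> list[str]:
    """Move top-level *.md files (except SKILL.md) into references/ and
    rewrite their mentions in SKILL.md. Returns the moved file names."""
    moved = sorted(
        p.name for p in skill_dir.glob("*.md") if p.name != "SKILL.md"
    )
    if not moved:
        return []

    refs_dir = skill_dir / "references"
    refs_dir.mkdir(exist_ok=True)

    skill_md = skill_dir / "SKILL.md"
    text = skill_md.read_text(encoding="utf-8")
    for name in moved:
        (skill_dir / name).rename(refs_dir / name)
        # Match the bare file name only: skip mentions that already carry a
        # path prefix or that are the tail of a longer file name.
        pattern = re.compile(r"(?<![\w/.-])" + re.escape(name))
        text = pattern.sub("references/" + name, text)
    skill_md.write_text(text, encoding="utf-8")
    return moved
```

Running `python3 scripts/skills.py generate` afterwards would still be needed to refresh the manifest's `files` lists.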

## Skills affected

**12 experimental** (51 files moved):
`databricks-agent-bricks` (2), `databricks-ai-functions` (4),
`databricks-aibi-dashboards` (5), `databricks-apps-python` (6),
`databricks-dbsql` (5), `databricks-iceberg` (5),
`databricks-metric-views` (2), `databricks-python-sdk` (1, was
orphaned), `databricks-spark-structured-streaming` (9),
`databricks-unity-catalog` (3), `databricks-vector-search` (4),
`databricks-zerobus-ingest` (5).

**1 stable** (4 files moved):
`databricks-jobs` (4).

Other skills already had `references/` and were untouched:
`databricks-execution-compute`, `databricks-mlflow-evaluation`,
`databricks-synthetic-data-gen`, `spark-python-data-source`
(experimental), and `databricks-apps`, `databricks-lakebase`,
`databricks-pipelines`, `databricks-model-serving`, `databricks-core`,
`databricks-dabs`, `databricks-serverless-migration` (stable).

## Test plan

- [x] `python3 scripts/skills.py generate` clean.
- [x] `python3 scripts/skills.py validate` passes (`Everything is up to
date.`).
- [x] All 55 moves are renames (verified with `git diff -M`).
- [x] Spot-checked `databricks-agent-bricks/SKILL.md` and
`databricks-jobs/SKILL.md` to confirm path-rewrites.
- [ ] CI green on this branch.

This pull request and its description were written by Claude.
simonfaltum pushed a commit that referenced this pull request May 27, 2026
## Summary

Resolves the open follow-up from d-a-s [PR
#73](#73)'s
reviewer-flagged list:

> **`databricks-mlflow-evaluation`**: review overlap with MLflow-repo
skills (Lennart, non-blocking; cc'd @simonfaltum).

The OSS [`mlflow/skills`](https://github.com/mlflow/skills) repo ships
[`agent-evaluation`](https://github.com/mlflow/skills/tree/main/agent-evaluation)
and four related skills (`instrumenting-with-mlflow-tracing`,
`analyze-mlflow-trace`, `retrieving-mlflow-traces`,
`querying-mlflow-metrics`) that cover the generic MLflow GenAI
evaluation workflow: `mlflow.genai.evaluate()`, scorers/judges,
datasets, tracing setup, the 5-step evaluation loop. Substantial topic
overlap with this skill.

Rather than dedupe content or split the skill, this PR adds a short
"Scope vs upstream `mlflow/skills`" section at the top of
`SKILL.md` that:

- Names the upstream skills.
- Scopes this skill to **Databricks-specific patterns layered on top**
of that workflow — UC trace ingestion, MemAlign judge alignment via UC
SME labeling sessions, `optimize_prompts()` GEPA loop, UC-table-backed
datasets.
- Defers everything else to upstream rather than restating it.

Picked this approach because option (b), pushing the
Databricks-specific patterns upstream into
`mlflow/skills/agent-evaluation/references/` as a fork-PR, would
split source of truth and force the MLflow team to own
Databricks-specific docs.

## Test plan

- [x] `python3 scripts/skills.py validate` passes.
- [x] Manifest unchanged (file list identical; only `SKILL.md` content
changed).
- [ ] Reviewer ack that the scope-boundary is the right place to draw
the line.

This pull request and its description were written by Claude.

Signed-off-by: James Broadhead <james.broadhead@databricks.com>
simonfaltum added a commit that referenced this pull request May 27, 2026
…sorted manifest (#95)

## Summary

Two related cleanups to `manifest.json` hygiene.

### 1. Drop `base_revision` plumbing

The `base_revision` field was added in
[#30](#30) as
a per-skill upstream-revision tag, intended to track where a synced
skill came from in a-d-k. It never got a consumer:

- `databricks/cli`'s `aitools install` ignores it (`gh search code
base_revision --owner=databricks` returns only this repo's own
`scripts/skills.py`).
- No skill on `main` carries it. A few feature branches populated it
while iterating on a-d-k ports (e.g. `origin/experimental-aidevkit` had
`"base_revision": "e742f36e8ab1"` for some skills), but those values
never reached `main`.
- The generator just round-tripped the field if present in the prior
manifest, and `normalize_manifest` stripped it before the validate diff
— so it couldn't fail CI either.

It's dead weight that future maintainers have to reason about. Dropped:

- `_build_stable_entry`'s existing-entry-preservation branch (and the
now-unused `existing_skills` parameter).
- `_build_experimental_entry`'s equivalent.
- `generate_manifest`'s existing-manifest read (only used to feed
`existing_skills`).
- `normalize_manifest` and `_normalize_skill_map` — the only volatile
field they normalized was `base_revision`, so `validate_manifest` can
compare dicts directly now.

If per-skill upstream sync tracking does become useful later, re-add the
field at the same time as the tool that consumes it. The long-term sync
plan in [#73 TODO
#3](#73) is
git-subtree-based, which tracks revisions intrinsically —
`base_revision` doesn't help there.

### 2. Enforce canonical sorted form

The on-disk `manifest.json` is now required to be byte-equal to the
canonical serialization (`json.dumps(manifest, indent=2, sort_keys=True)
+ "\n"`). That means:

- The `skills` map's keys are alphabetical across stable **+**
experimental (no more stable-first grouping; that was a side-effect of
build order, not a deliberate choice).
- Each skill entry's keys are alphabetical: `description`, `files`,
`min_cli_version?`, `repo_dir`, `version`.
- The top-level keys (`skills`, `version`) are alphabetical.
- `files` arrays remain sorted by the generator (already enforced).

`scripts/skills.py validate` now does two checks:

1. **Content lint** — the parsed dict equals what `generate_manifest`
would produce. Catches stale content, unsorted/missing/extra `files`
entries, drift between SKILL.md frontmatter and manifest.
2. **Canonical-form lint** — the on-disk bytes equal
`serialize_manifest(current)`. Catches hand-edited sort drift
(re-ordered skill names, re-ordered per-skill keys) even when the parsed
content is correct.

The writer at `scripts/skills.py generate` uses the same
`serialize_manifest` helper, so files emitted by the generator always
pass the lint.
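
The two checks can be sketched in a few lines. `serialize_manifest` follows the canonical form quoted above; `validate_manifest_text` is a hypothetical stand-in for the real `validate_manifest`, which presumably reads the file and builds the expected dict itself. The error strings are the ones listed in the test plan.

```python
import json


def serialize_manifest(manifest: dict) -> str:
    """Canonical on-disk form: sorted keys, two-space indent, trailing newline."""
    return json.dumps(manifest, indent=2, sort_keys=True) + "\n"


def validate_manifest_text(on_disk: str, expected: dict) -> list[str]:
    """Return lint errors for the manifest text; an empty list means it passes."""
    # 1. Content lint: parsed dicts must be equal. Dict comparison ignores key
    #    order but list comparison does not, so a reordered `files` array
    #    fails here.
    if json.loads(on_disk) != expected:
        return ["ERROR: manifest.json content is out of date"]
    # 2. Canonical-form lint: bytes must match the canonical serialization,
    #    which catches key reordering that the dict comparison cannot see.
    if on_disk != serialize_manifest(expected):
        return [
            "ERROR: manifest.json is not in canonical sorted form. "
            "Keys must be alphabetical at every level."
        ]
    return []
```

Checking content first means a stale manifest is reported as stale rather than as a formatting problem, which matches the two error messages in the test plan.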

## Diff shape

| File | Change |
|---|---|
| `scripts/skills.py` | −47 / +35 lines: drop `base_revision` branches +
dead normalize machinery; add `serialize_manifest` helper; rework
`validate_manifest` to do the two checks above. |
| `manifest.json` | 220-line swap: skills now in one alphabetical run
(`databricks-agent-bricks`, `databricks-ai-functions`, ...) and
per-skill keys alphabetised. No semantic change for `aitools install`. |

## Test plan

- [x] `python3 scripts/skills.py generate` clean — regenerates the
canonical sorted manifest.
- [x] `python3 scripts/skills.py validate` passes.
- [x] Lint catches a manually-reversed `skills` map: emits `ERROR:
manifest.json is not in canonical sorted form. Keys must be alphabetical
at every level.`
- [x] Lint catches a manually-reversed `files` array inside a skill
entry: emits `ERROR: manifest.json content is out of date`.
- [ ] CI green on this branch.

This pull request and its description were written by Claude.

---------

Co-authored-by: simon <simon.faltum@databricks.com>
simonfaltum pushed a commit that referenced this pull request May 27, 2026
…ental (#91)

## Summary

Stable and experimental skills had two different contracts for Codex CLI
marketplace metadata (`agents/openai.yaml` +
`assets/databricks.{svg,png}`):

| | stable (`skills/`) | experimental (`experimental/`) |
|---|---|---|
| `agents/openai.yaml` | hand-authored, required | auto-synthesised from
`SKILL.md` frontmatter |
| `assets/databricks.{svg,png}` | manual copy needed | `sync_assets()`
copied them in |
| CI enforcement | partial (only via `_build_stable_entry`) | full
(`check_assets_synced` + `ensure_experimental_codex_metadata`) |

So adding a stable skill needed manual icon copies + a hand-authored
YAML; adding an experimental one was `python3 scripts/skills.py
generate`. Stable also had no CI check that the icons were actually
present.

This PR makes the contract uniform: every skill gets icons +
`agents/openai.yaml` auto-generated when missing, hand-authored YAML is
preserved as an override, and one CI cycle (`python3 scripts/skills.py
validate`) enforces it for both directories.

- `iter_all_skill_dirs(repo_root)` walks every skill across `skills/`
and `experimental/`.
- `ensure_codex_metadata(repo_root)` replaces `sync_assets()` and
`ensure_experimental_codex_metadata()`. Copies shared icons if
missing/stale; synthesises `agents/openai.yaml` only when absent (so
curated YAML like `databricks-core`'s `display_name: "Databricks"`
survives).
- `check_codex_metadata(repo_root)` mirrors the same checks for
`validate`. The redundant per-skill openai.yaml existence check in
`_build_stable_entry` is gone.
- `main()` `sync` + `generate` call `ensure_codex_metadata`; `validate`
calls `check_codex_metadata` then `validate_manifest`.
- `.github/workflows/validate-manifest.yml` already covered both
`skills/**` and `experimental/**`; no workflow change needed — one CI
cycle covers both.
- `CONTRIBUTING.md` gets a new "Skill anatomy" section explaining what
these files are, who consumes them (Codex CLI marketplace), why the repo
ships them for every skill, the auto-generation + hand-authoring escape
hatch, and `DISPLAY_NAME_OVERRIDES` for acronym/product-name casing.
`CLAUDE.md` links to it.
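
The "synthesise only when absent" rule is the part most likely to be misread, so here is a reduced sketch of it. The function names match the ones listed above, but the bodies are my own guess at the behavior: icon copying is omitted, and the single-key YAML written here is a placeholder, not the real `agents/openai.yaml` schema.

```python
from pathlib import Path
from typing import Iterator


def iter_all_skill_dirs(repo_root: Path) -> Iterator[Path]:
    """Yield every skill directory under skills/ and experimental/."""
    for parent in ("skills", "experimental"):
        base = repo_root / parent
        if not base.is_dir():
            continue
        for child in sorted(base.iterdir()):
            if (child / "SKILL.md").is_file():
                yield child


def ensure_codex_metadata(repo_root: Path) -> list[Path]:
    """Create agents/openai.yaml where missing; never overwrite an existing one."""
    created = []
    for skill_dir in iter_all_skill_dirs(repo_root):
        target = skill_dir / "agents" / "openai.yaml"
        if target.exists():
            continue  # hand-authored YAML acts as the override
        target.parent.mkdir(parents=True, exist_ok=True)
        # Placeholder display name derived from the directory name.
        display_name = skill_dir.name.replace("-", " ").title()
        target.write_text(f'display_name: "{display_name}"\n', encoding="utf-8")
        created.append(target)
    return created
```

Because existing files are skipped, a second run is a no-op, which is the behavior the `generate` on clean tree check in the test plan relies on.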

## Why

PR #73 (experimental import) added the auto-gen path for experimental
skills only. Reviewers' question on that PR ("what are these files,
why do we ship them, how do new skills get them?") pointed at the
asymmetry. This is the follow-up.

## Test plan

- [x] `python3 scripts/skills.py generate` on clean tree → no diff
(no-op).
- [x] `python3 scripts/skills.py sync` on clean tree → no diff.
- [x] Removed
`skills/databricks-core/{agents/openai.yaml,assets/databricks.png}` +
`experimental/databricks-ai-functions/agents/openai.yaml` → `validate`
exits 1 listing all three missing files; `generate` heals all three;
hand-authored stable YAML preserved when restored from backup before
re-running `generate`.
- [x] No stale references to the removed helpers (`sync_assets`,
`check_assets_synced`, `ensure_experimental_codex_metadata`) anywhere in
the repo.

This pull request and its description were written by Claude.

---------

Signed-off-by: James Broadhead <james.broadhead@databricks.com>
jamesbroadhead added a commit that referenced this pull request May 27, 2026
…apx Related Skills entry

`databricks-app-apx` was the FastAPI+React stack referenced from
ai-dev-kit's `databricks-apps-python` skill. It has been removed
upstream (a-d-k is deprecated; the apx-on-CLI flow merged into the
stable `databricks-apps` skill via #84/#73). The "Related Skills"
bullet is the last dangling reference inside this repo.

This PR was prepared by Claude.
denik pushed a commit to databricks/cli that referenced this pull request May 28, 2026
## Summary

The skills manifest in `databricks/databricks-agent-skills` is gaining
experimental skills sourced from a new `experimental/` directory in the
repo (see paired [d-a-s PR
#73](databricks/databricks-agent-skills#73),
which imports the ai-dev-kit skill catalog into `experimental/`).

This wires the parsing through the aitools installer:

- `Manifest.Skills` is a **single map** holding both stable and
experimental entries; the per-skill `repo_dir` field ("skills" or
"experimental") is the source of truth for whether a skill is
experimental. `SkillMeta.IsExperimental()` derives state from `RepoDir`.
- Experimental skills get a `-experimental` suffix on their install-side
key during `normalizeManifest`; `SourceName` preserves the unsuffixed
name for fetch URLs.
- The existing `--experimental` flag (already wired in `cmd/skills.go`)
now has experimental skills to install; without it, `resolveSkills`
filters them out as before.

## UX

```
# default — only stable skills
databricks experimental aitools skills install

# all experimental skills, plus stable
databricks experimental aitools skills install --experimental

# one experimental skill by name (--experimental still required by resolveSkills)
databricks experimental aitools skills install databricks-iceberg-experimental --experimental
```

## TODOs / caveats for iteration

1. ~~**`DATABRICKS_SKILLS_REF` pin.**~~ **Partially resolved.** The
default ref is still the latest stable release tag (sourced from
`experimental/aitools/lib/installer/SKILLS_VERSION`); experimental
entries won't exist there until d-a-s cuts a release with [PR
#73](databricks/databricks-agent-skills#73)
merged. The default ref bump is a follow-up automated by the
SKILLS_VERSION file. **UX fix shipped in this PR**: if `--experimental`
is passed but the manifest at the resolved ref exposes no experimental
skills, a warning is logged pointing users at
`DATABRICKS_SKILLS_REF=main`.
2. ~~**Collision handling is naive.**~~ **Resolved.** Every experimental
skill gets a `-experimental` suffix on its install-side key during
`normalizeManifest`. The manifest key + install dir both carry the
suffix; the `SourceName` field on `SkillMeta` preserves the upstream
repo dir name for fetch URLs. Users see at a glance which installed
skills are experimental.

Also handled: **experimental↔stable transitions**. If a skill flips its
experimental status upstream (the same logical skill changes manifest
key), `install` removes the stale variant on disk + state before
installing the new one, and `uninstall` accepts either variant name (and
removes both if both are present). Helper: `alternateVariantKey()`.
Covered by tests `TestInstallReplacesAlternateVariant`,
`TestUninstallByEitherVariantRemovesBoth`,
`TestUninstallByAlternateNameWhenOnlyOneVariantInstalled`.
3. ~~**`list` UX.**~~ **Resolved.** `aitools skills list` shows
experimental skills with an `[experimental]` tag in the NAME column
(driven by `meta.IsExperimental()`). Combined with the TODO #2
resolution (`-experimental` suffix in the manifest key), every
experimental row reads e.g. `databricks-iceberg-experimental
[experimental]` — slightly redundant but a clear visual anchor.
Hide-by-default was considered but rejected: users running `list` are
usually looking for what's available, and silently omitting experimental
skills makes them un-discoverable.
4. ~~**State tracking.**~~ **Resolved — kept additive semantics.**
`InstallState.IncludeExperimental` records what was last requested but
is not used to drive retroactive removal. Running `install` without
`--experimental` leaves previously-installed experimental skills in
place. Rationale: (a) users running `install` are typically
adding/updating, not declaring set membership; (b) silently uninstalling
things the user previously asked for is surprising; (c) the transition
cleanup shipped under TODO #2 handles the actual drift case (skill's
experimental status flipping upstream). Removal is what `uninstall` is
for.
5. ~~**No acceptance test yet.**~~ **Resolved.** Added acceptance tests
under `acceptance/experimental/aitools/skills/install*/` covering the
install flow against a mocked manifest server:
   - Stable-only install (no flag) → 1 skill installed
   - `--experimental` install adds the experimental skill (with
     `-experimental` suffix per the install-path model) → 2 skills total
   - Re-running `--experimental` is idempotent
   - Specific-skill install (`install --skills <name>`) for both stable and
     experimental
   - `--experimental` against a manifest with no experimental entries logs
     a nudge

To make these reachable, exposed a new env-var override
`DATABRICKS_SKILLS_BASE_URL` that overrides the hard-coded
`raw.githubusercontent.com` base URL used by
`GitHubManifestSource.FetchManifest` and `fetchSkillFile`. Defaults to
the canonical URL when unset, so no production behavior change. Updated
`Taskfile.yml`'s `test-exp-aitools` task to include
`acceptance/experimental/aitools/**`.

Variants left as follow-up acceptance tests (the structure is now in
place):
   - Variant transition cleanup (stable → experimental, experimental →
     stable)
   - Uninstall flow (with both variants installed)
6. ~~**`--experimental` flag scope.**~~ **Resolved — kept current
scope.** Each command has internally consistent behavior:
   - `install --experimental` → explicit opt-in (required to install
     experimental skills).
   - `update` → state-driven (honors `InstallState.IncludeExperimental`
     from the last `install`). If you opted in once, future updates refresh
     experimentals; otherwise they're skipped.
   - `list` → shows all skills with an `[experimental]` tag (no filtering:
     discovery first, opt-in to install).

Adding `--experimental` / `--no-experimental` to `update` for one-off
overrides was considered but rejected: the natural workflow is to re-run
`install --experimental` (or just `install`), which already sets the
desired state. Follow-up if real users hit a use case for the override.
7. ~~**Manifest shape.**~~ **Resolved.** Replaced the original two-map
design (`skills` + `experimental_skills` + a per-skill `experimental`
bool) with a single `skills` map where each entry's `repo_dir`
(`"skills"` or `"experimental"`) is the source of truth. The cli derives
experimental state from `RepoDir` via `SkillMeta.IsExperimental()`.
Collisions between stable and experimental skills with the same repo dir
name must be resolved upstream in d-a-s (which they already are — d-a-s
PR #73's TODO #1a merged the only known collision into stable). The
d-a-s manifest generator should be updated to emit `repo_dir` per skill;
until then `normalizeManifest` defaults a missing `RepoDir` to
`"skills"` so older manifests still parse.
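
The single-map model in TODO #7 can be illustrated with a small Python model. The installer itself is Go, so this is not the CLI's code: the function names only mirror `normalizeManifest`, `IsExperimental()` and `resolveSkills`, and the `-experimental` suffixing from TODO #2 is deliberately left out.

```python
def normalize_manifest(manifest: dict) -> dict:
    """Default a missing or empty repo_dir to "skills" so older manifests parse."""
    skills = {}
    for name, meta in manifest.get("skills", {}).items():
        entry = dict(meta)
        if not entry.get("repo_dir"):
            entry["repo_dir"] = "skills"
        skills[name] = entry
    return {**manifest, "skills": skills}


def is_experimental(meta: dict) -> bool:
    """Experimental state is derived from repo_dir, not from a separate flag."""
    return meta.get("repo_dir") == "experimental"


def resolve_skills(manifest: dict, include_experimental: bool = False) -> list[str]:
    """Names to install: stable always, experimental only when opted in."""
    return sorted(
        name
        for name, meta in manifest["skills"].items()
        if include_experimental or not is_experimental(meta)
    )


# Hypothetical manifest: one old-style entry without repo_dir, one experimental.
manifest = normalize_manifest({
    "version": "2",
    "skills": {
        "databricks-jobs": {"version": "0.2.0"},
        "databricks-iceberg": {"version": "0.0.1", "repo_dir": "experimental"},
    },
})
```

With this shape there is one place to look for a skill and one field that decides its status, which is the point of dropping the parallel `experimental_skills` map.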

## Test plan

- [x] `go build ./...` passes.
- [x] `go test ./experimental/aitools/...` passes (`source_test.go`
covers the normalize/IsExperimental cases).
- [x] `go test ./acceptance -run TestAccept/experimental/aitools` passes
(a pre-existing flake intermittently surfaces an `lstat` warning during
copyDir, ~10% of multi-test runs; unrelated to this refactor).
- [ ] Run `./task lint` and `./task fmt` before merge.
- [ ] Manual: against a d-a-s ref containing experimental entries with
`repo_dir`, verify the four UX cases above behave correctly.

This pull request and its description were written by Claude.

---------

Co-authored-by: simon <4305831+simonfaltum@users.noreply.github.com>
Co-authored-by: simon <simon.faltum@databricks.com>
simonfaltum pushed a commit that referenced this pull request Jun 2, 2026
…apx Related Skills entry (#106)

## Summary

Removes the last dangling `databricks-app-apx` reference in this repo —
one line in `experimental/databricks-apps-python/SKILL.md` ("Related
Skills" bullet).

## Why

`databricks-app-apx` was the FastAPI+React stack referenced from
ai-dev-kit's `databricks-apps-python`. It has been removed upstream
(a-d-k is deprecated; the apx-on-CLI flow merged into the stable
`databricks-apps` skill via #84/#73). I grepped the entire repo and this
bullet is the only remaining mention — README, install scripts, and
stable skills no longer reference it.

## Test plan

- [x] `python3 scripts/skills.py validate` passes (`Everything is up to
date.`)
- [x] `grep -rn databricks-app-apx .` returns no remaining hits.
- [ ] CI green.

This pull request and its description were written by Claude.
simonfaltum added a commit that referenced this pull request Jun 2, 2026
…tmatter (#105)

## Summary

Backfills three frontmatter fields on 17 `experimental/` SKILL.md files
that stable skills already carry but the imported a-d-k snapshot does
not:

- `compatibility: Requires databricks CLI (>= v0.294.0)`
- `metadata.version: "0.1.0"` (previously reported as `0.0.1`, the
`scripts/skills.py` fallback floor)
- `parent: databricks-core`

Closes the frontmatter-version / parent-skill / CLI-compatibility gaps
in one mechanical pass — they all touch the same files.
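
A minimal sketch of what such a mechanical pass could look like, assuming each SKILL.md starts with a `---` fenced frontmatter block made of simple top-level keys. The helper name `backfill_frontmatter` and the line-based approach are assumptions for illustration; the PR does not say how the edit was actually applied.

```python
# Top-level key -> frontmatter lines to add when that key is missing.
BACKFILL = {
    "compatibility": ["compatibility: Requires databricks CLI (>= v0.294.0)"],
    "metadata": ["metadata:", '  version: "0.1.0"'],
    "parent": ["parent: databricks-core"],
}


def backfill_frontmatter(text: str) -> str:
    """Append any missing top-level keys just before the closing fence."""
    lines = text.split("\n")
    if not lines or lines[0].strip() != "---":
        raise ValueError("expected SKILL.md to start with a '---' fence")
    # Locate the closing fence: the first '---' after the opening one.
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:
        raise ValueError("frontmatter has no closing '---' fence")
    # Collect top-level keys only (indented lines belong to nested blocks).
    present = {
        line.split(":", 1)[0]
        for line in lines[1:end]
        if line and not line[0].isspace() and ":" in line
    }
    additions = []
    for key, block in BACKFILL.items():
        if key not in present:
            additions.extend(block)
    return "\n".join(lines[:end] + additions + lines[end:])
```

The sketch is idempotent: running it on a file that already carries the three keys returns the text unchanged. It deliberately does not merge into an existing `metadata:` block, which a real pass would need to handle.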

## Why

The stable-side standard is documented in CLAUDE.md and consistently
applied across `skills/`. Experimental skills carry none of these fields
because #73 imported the a-d-k snapshot verbatim. #73's TODO #7
explicitly leaves the version backfill open ("when upstream a-d-k
eventually adds version fields, those win; until then, the manifest
reports the floor"). With a-d-k now deprecated, this repo is the source
of truth for `experimental/` and the backfill lands here.

The promotion-time pattern (cf. #87, vector-search) adds these fields on
the way out of `experimental/`. This PR closes the gap for the remaining
skills that haven't been promoted yet.

## Changes

17 SKILL.md files in `experimental/`, plus manifest regeneration.

`experimental/databricks-vector-search/SKILL.md` is intentionally
**skipped**: #87 promotes it to `skills/` and adds the same fields as
part of the move, so including it here would create spurious conflicts.
Whichever lands first, the other rebases cleanly.

## Manifest deltas

Every experimental skill's `version` flips from `0.0.1` (the
`extract_version_from_skill` fallback floor) to `0.1.0`. `compatibility`
and `parent` are SKILL.md-only — not surfaced in manifest.json today.
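
The fallback behaviour can be illustrated with a short sketch. `read_skill_version` is a hypothetical stand-in for `extract_version_from_skill`, whose real implementation is not shown in this PR; the sketch assumes the version lives under a `metadata:` block in the frontmatter.

```python
import re

FALLBACK_VERSION = "0.0.1"  # floor reported when no version is declared

_FRONTMATTER = re.compile(r"---\n(.*?\n)---[ \t]*(?:\n|$)", re.DOTALL)
_VERSION = re.compile(
    r"^metadata:[ \t]*\n"          # the metadata block header
    r"(?:[ \t]+.*\n)*?"            # any other indented keys before version
    r"[ \t]+version:\s*[\"']?([^\"'\s]+)",
    re.MULTILINE,
)


def read_skill_version(skill_md: str) -> str:
    """Return metadata.version from the frontmatter, or the fallback floor."""
    frontmatter = _FRONTMATTER.match(skill_md)
    if frontmatter is None:
        return FALLBACK_VERSION
    version = _VERSION.search(frontmatter.group(1))
    return version.group(1) if version else FALLBACK_VERSION
```

Under these assumptions, a SKILL.md without `metadata.version` yields `0.0.1`, and adding the field changes the reported value to whatever the frontmatter declares.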

## Test plan

- [x] `python3 scripts/skills.py generate` clean
- [x] `python3 scripts/skills.py validate` passes (`Everything is up to
date.`)
- [ ] CI green on this branch.

This pull request and its description were written by Claude.

---------

Signed-off-by: simon <simon.faltum@databricks.com>
Co-authored-by: simon <simon.faltum@databricks.com>
simonfaltum added a commit that referenced this pull request Jun 2, 2026
…#88)

## Summary

Per Lennart's audit comment on #73, item #9:
`databricks-unstructured-pdf-generation` reads as "not
very Databricks-specific" because the headline is local HTML → PDF
generation, with the Databricks workflow (UC volume + RAG-eval dataset)
buried.

This reframes the skill to put the Databricks-specific value up front,
without removing any content.

## Changes

`experimental/databricks-unstructured-pdf-generation/SKILL.md`:
- Frontmatter `description` now leads with **"Build RAG /
unstructured-document evaluation datasets on Databricks"**. PDF
generation is positioned as a step, not the headline.
- Body intro states explicitly that the Databricks-specific value is the
workflow shape (UC volume layout, paired question files, hand-off to
downstream `ai_extract` / `ai_parse_document` /
`mlflow.genai.evaluate()`), not the HTML → PDF tooling itself.
- One-line escape hatch added: *"If you only need ad-hoc PDFs (no
Databricks workflow), any HTML → PDF tool works directly — this skill
exists for the synthetic-dataset-on-UC end-to-end shape, not as a
general PDF generator."*

Manifest regenerated to pick up the new description. No deletions; this
is a framing change.

## What this doesn't do

Two stronger alternatives in the audit are *not* implemented here:
- **Trim** the local HTML → PDF tooling and link to an external tool.
Would destroy useful content; the templates and parallel-conversion
patterns are still valuable for users following the end-to-end workflow.
- **Rename** to e.g. `databricks-rag-test-data`. This has cross-PR
implications (the a-d-k tombstone PR
databricks-solutions/ai-dev-kit#546 references the current name) and
changes the install command for existing users.

If either is preferred over this lighter reframe, happy to open a
follow-up.

## Test plan

- [x] `python3 scripts/skills.py generate` clean.
- [x] `python3 scripts/skills.py validate` passes.
- [ ] CI green.
- [ ] Reviewer sign-off (`@lennartkats-db` raised this;
`@dustinvannoy-db` cc'd).

This pull request and its description were written by Claude.

---------

Co-authored-by: simon <simon.faltum@databricks.com>
simonfaltum pushed a commit that referenced this pull request Jun 2, 2026
## Summary

Per Lennart's audit comment on #73, item #10: Vector Search is no
longer considered experimental in
Genie Code. Promote `experimental/databricks-vector-search/` →
`skills/databricks-vector-search/`.

## Changes

- `git mv experimental/databricks-vector-search
skills/databricks-vector-search`.
- Move the 4 top-level reference files into `references/` to match the
stable-skills layout convention (apps, lakebase, pipelines).
- SKILL.md frontmatter: add `parent: databricks-core` and
`metadata.version: "0.1.0"`. Body: add the standard "FIRST: Use the
parent `databricks-core` skill" prelude.
- Rewrite 10 path references in SKILL.md for the new
`references/<file>.md` locations.
- `scripts/skills.py`: add `databricks-vector-search` to
`SKILL_METADATA`.
- Root `README.md` "Available Skills" list: add
`databricks-vector-search`.
- `experimental/README.md`: remove from the experimental skill list.

Manifest regenerated; `python3 scripts/skills.py validate` passes.
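
For illustration, the path rewrite could be done along these lines. The function name `rewrite_reference_links` and the file name in the usage example are hypothetical; the PR does not list the four moved files or say whether the rewrite was scripted.

```python
import re


def rewrite_reference_links(markdown: str, moved_files: set) -> str:
    """Point markdown links at references/<file> for each moved file."""

    def fix(match: re.Match) -> str:
        target = match.group(1)
        # Keep any "#anchor" suffix attached to the rewritten path.
        path, sep, fragment = target.partition("#")
        if path in moved_files:
            return f"](references/{path}{sep}{fragment})"
        return match.group(0)

    # Match the "](target)" tail of a markdown link; targets contain no spaces.
    return re.sub(r"\]\(([^)\s]+)\)", fix, markdown)


if __name__ == "__main__":
    text = "See [index types](index-types.md#delta-sync) for details."
    print(rewrite_reference_links(text, {"index-types.md"}))
```

Links that already point into `references/` do not match a bare moved file name, so applying the rewrite twice leaves the text unchanged.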

## Cross-repo

- The a-d-k tombstone PR databricks-solutions/ai-dev-kit#546
currently redirects `databricks-vector-search` with `--experimental`.
I'll drop the flag in that PR to match this promotion.
- Stacking concern: this PR has no overlap with the open
references-restructure PR #86, since this PR's commit also performs the
`references/` move for the vector-search files. Whichever lands first,
the other will rebase cleanly with no conflicts in the vector-search
subtree.

## Test plan

- [x] `python3 scripts/skills.py generate` clean.
- [x] `python3 scripts/skills.py validate` passes (`Everything is up to
date.`).
- [ ] CI green.
- [ ] Reviewer confirmation that stable classification is correct
(`@simonfaltum` flagged this).

This pull request and its description were written by Claude.