Skip to content

Add language-agnostic emitter-diff tool + python adapter#11122

Draft
l0lawrence wants to merge 22 commits into
mainfrom
l0lawrence-emitter-diff
Draft

Add language-agnostic emitter-diff tool + python adapter#11122
l0lawrence wants to merge 22 commits into
mainfrom
l0lawrence-emitter-diff

Conversation

@l0lawrence

@l0lawrence l0lawrence commented Jun 30, 2026

Copy link
Copy Markdown
Member

Draft / RFC. Adds the language-agnostic emitter-diff tool (built in Azure/typespec-azure) to microsoft/typespec. CI runs end-to-end; the PR-comment + approved-baseline gate are verified on this PR.

What

A language-agnostic tool, eng/emitter-diff (@typespec/emitter-diff), that diffs the generated code produced by two versions of a TypeSpec emitter and optionally runs the emitter's generated-code test suites. It resolves a baseline emitter from an npm version, a local folder, or a GitHub url/sha, generates with both baseline and head, and renders the diff as a clickable HTML report (default), a VS Code diff, or a terminal patch.

Ships the python adapter for @typespec/http-client-python, which drives the package's existing regenerate.ts two-phase native pipeline (TypeSpec → YAML, then a batched Python subprocess), building and creating a venv per emitter version.

Changes

  • eng/emitter-diff/ — the tool: CLI/orchestrator, ref resolver (npm/local/github), generic EmitterAdapter interface + registry, diff engine (HTML/VS Code/terminal), python adapter. Baseline and head generate in parallel by default (prefixed logs; --sequential to opt out).
  • packages/http-client-python/eng/scripts/ci/regenerate.ts — adds --httpSpecsDir, --azureSpecsDir, --no-baseline so the tool can pin specs and skip the published-baseline clone; and makes the batched Python phase honor a custom --generatedFolder. Defaults preserve current behavior.
  • pnpm-workspace.yaml — adds eng/emitter-diff as a workspace member and diff2html to the catalog (lockfile updated).
  • .github/workflows/ci-emitter-diff-python.yml — PR workflow with an approved-baseline gate.

Approved-baseline gate

The approved generated-output baseline is a commit SHA in eng/emitter-diff/baselines/python.sha. CI builds the emitter for both the PR checkout and a worktree of that commit, diffs them, uploads the HTML artifact, and posts a sticky PR comment. If the output differs, the run exits 2 and the job fails. To approve an intended change, update the SHA to a commit on your branch containing your emitter changes and push — once the baseline matches head, the diff is empty and the check passes.

Notes / open questions

  • Python's native pipeline needs a venv per emitter version, so the adapter builds + npm run installs each side. --use-pyodide from the azure variant is dropped (no pyodide path in core).
  • The tool is designed to host future c#/java/go/rust adapters with no core changes.
  • Companion PR in Azure/typespec-azure (Support generating Python SDK with Pyodide #4784).

Supersedes the fork-based draft #11119 (closed); moved here so the CI PR-comment + gate run with a read-write token.

l0lawrence and others added 7 commits June 30, 2026 13:22
Ports the eng/emitter-diff tool: diffs generated code between two emitter
versions (npm version, local folder, or github ref) and optionally runs the
emitter's generated-code test suites. Ships the python adapter for
@typespec/http-client-python, driving its native two-phase regenerate pipeline
with a venv per emitter version.

- eng/emitter-diff: @typespec/emitter-diff package (CLI, ref resolver, diff
  engine with HTML/VS Code/terminal output, adapter registry).
- regenerate.ts: add --httpSpecsDir, --azureSpecsDir, --no-baseline flags.
- pnpm-workspace.yaml: add eng/emitter-diff member + diff2html catalog entry.
- ci-emitter-diff-python.yml: PR workflow that diffs head vs the PR base commit,
  uploads the rendered HTML artifact, and posts a sticky PR comment.

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
CI now diffs the PR's emitter against the emitter at the commit recorded in
eng/emitter-diff/baselines/python.sha and fails when generated output differs.
To approve an intended change, update that SHA to a commit on the branch that
contains the emitter changes and push; once the baseline matches head the diff
is empty and the check passes.

- cli.ts: --fail-on-diff now exits 2 for "diff present" (vs 1 for hard errors)
  so CI can distinguish an unapproved diff from a failure.
- baselines/python.sha: the approved baseline commit.
- ci-emitter-diff-python.yml: read the SHA file as baseline, run with
  --fail-on-diff, enforce the gate, and explain approval in the summary/comment.

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
Use a non-literal import specifier + local type so a typecheck that doesn't
install this package's deps (e.g. the parent repo's check:eng, which includes
core/eng) doesn't fail with TS2307; availability is validated at runtime.

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
run_batch.py was always passed PLUGIN_DIR as --generated-dir, so when the
emitter-diff tool redirects output via --generatedFolder the batch phase looked
in the wrong tree, found no .tsp-codegen config files, and wrote zero .py files
(leaving the path-bearing config files behind as spurious diffs). Pass the
parent of GENERATED_FOLDER instead; it equals PLUGIN_DIR in the default case so
normal regeneration is unchanged.

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
The two sides write to isolated output dirs, use separate venvs, and emit
uniquely-named temp YAML, so they can regenerate concurrently. Run them with
Promise.all and tag each side's streamed output with a [baseline]/[head] prefix
so the interleaved logs stay attributable. Add --sequential to opt back into
one-at-a-time generation (quieter logs, lower peak CPU/memory).

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
Temporarily set the approved baseline to #10947 (timedelta duration encoding)
so the diff PR comment and approval gate can be exercised end-to-end. Revert to
the blessed HEAD SHA after the demo.

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
@microsoft-github-policy-service microsoft-github-policy-service Bot added emitter:client:python Issue for the Python client emitter: @typespec/http-client-python eng labels Jun 30, 2026
@pkg-pr-new

pkg-pr-new Bot commented Jun 30, 2026

Copy link
Copy Markdown

Open in StackBlitz

npm i https://pkg.pr.new/@typespec/http-client-python@11122

commit: 4601b99

@github-actions

Copy link
Copy Markdown
Contributor

All changed packages have been documented.

  • @typespec/http-client-python
Show changes

@typespec/http-client-python - internal ✏️

Add --httpSpecsDir, --azureSpecsDir, and --no-baseline flags to regenerate.ts so the language-agnostic eng/emitter-diff tool can drive code generation against pinned specs without cloning the published baseline.

@github-actions

github-actions Bot commented Jun 30, 2026

Copy link
Copy Markdown
Contributor

Python emitter diff

Baseline 3e10b8376f3a20dfcd46050c712d6bc64d537b2c vs this PR.

Diff summary: 400 file(s), +400 / -0

Full rendered diff: emitter-diff-html artifact on the run.

Informational check (eng/emitter-diff); does not block the PR.

@l0lawrence l0lawrence changed the title Add language-agnostic emitter-diff tool + python adapter (same-repo test) Add language-agnostic emitter-diff tool + python adapter Jun 30, 2026
l0lawrence and others added 2 commits June 30, 2026 14:53
A non-zero exit that is not the dedicated 'diff present' code (2) means the tool
failed to run (build/venv/generate threw), not that output is unchanged. Drive
the comment off the diff step's exit status so a hard error shows a clear
failure notice instead of a misleading 'no changes'.

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
Revert the approved baseline to the blessed HEAD SHA and add a one-line comment
to the generated package __init__.py so the PR's generated output differs from
the baseline, exercising the emitter-diff PR comment + approval gate end to end.
Both changes are temporary and must be reverted before merge.

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
@azure-sdk-automation

azure-sdk-automation Bot commented Jun 30, 2026

Copy link
Copy Markdown

You can try these changes here

🛝 Playground 🌐 Website 🛝 VSCode Extension

l0lawrence and others added 12 commits June 30, 2026 15:10
pnpm's recursive 'pnpm --filter <pkg> exec' collapses any non-zero child exit
into its own generic exit 1 (ERR_PNPM_RECURSIVE_EXEC_FIRST_FAIL), which masked
the tool's dedicated 'diff present' code (2) as a hard error (1). Run the CLI
from the package directory via plain (non-recursive) 'pnpm exec' so the real
exit code propagates and the approval gate can distinguish a diff from a crash.

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
…ocs)

- resolver: on the full-fetch fallback, check out a github ref by name rather
  than FETCH_HEAD (which could be a branch head), so a SHA baseline/head never
  silently diffs the wrong commit.
- resolver: gh:<ref> now resolves the origin repo from the local remote instead
  of hard-coding microsoft/typespec, so forks/renames work.
- python adapter: recognize packages/http-specs and packages/azure-http-specs
  layouts for external --specs, and warn (instead of silently falling back to
  the current checkout) when an explicit --specs dir can't be resolved.
- diff: --html always writes a file (a 'No differences' stub when unchanged) and
  inlines the diff2html stylesheet so the CI artifact renders offline.
- cli: reject an invalid --test-target instead of silently running no suites.
- docs: fix the broken github-sha example, note --specs doesn't accept npm refs,
  soften the 'no core changes' wording, and correct the stale workflow comment.

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
- cli.ts: absolutize --work-dir (resolve, not bare join). The python adapter
  runs regenerate.ts with a different cwd, so a relative work-dir made
  --generatedFolder/--pluginDir resolve against the wrong directory: outputs
  landed elsewhere while the diff engine compared empty dirs -> silent false
  'no differences'. Now every path handed to an adapter is absolute.
- cli.ts: reject --specs npm: before building emitters (fast-fail).
- python.ts: ensureDeps() runs 'npm ci' when node_modules is missing so a
  fresh gh:/github: clone builds instead of failing with 'cannot find module'.

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
The sticky comment now shows only the one-line summary (file count, +/-) and
the emitter-diff-html artifact link, not an enumerated list of changed files.

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
…ent)

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
Security:
- Pass the untrusted workflow_dispatch baseline input (and derived values) via
  env vars instead of interpolating GitHub Actions expressions into run: bash
  (Actions expression injection)
- Validate git ref/repo and use "git checkout --end-of-options" (arg injection)
- Add "npm install --ignore-scripts" when materializing a baseline package
- Drop the remote CDN CSS fallback so the HTML report never fetches off-box

Cleanup:
- De-duplicate the two HTML report templates into one htmlDoc() helper
- Remove emojis from CLI output, job summary, and PR comment

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
The baseline is now the git merge-base with the PR base branch instead of
a pinned SHA file, so it survives squash-merge / rebase / force-push. Diff
approval is granted by adding the `emitter-diff-approved` label to the PR;
labeled/unlabeled events re-run the check. Removes eng/emitter-diff/baselines/python.sha.

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
The workflow no longer fails on a generated-output diff or gates on a label.
It runs the tool without --fail-on-diff, so a diff exits 0 and only a real
tool/build error fails the job. Diffs are reported via job summary, a sticky
PR comment, and the HTML artifact. Drops the label trigger, the approval gate,
and the emitter-diff-approved label requirement.

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
- Replace diff2html with a self-contained, GitHub-style HTML renderer
  (inline CSS, light/dark, no external requests). Runtime deps now empty.
- Remove diff2html from package.json, pnpm catalog, and lockfile.
- Drop the --run-tests/--test-env/--test-target surface and the adapter
  runTests contract; parallel generation and --sequential are unchanged.

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
Now that the tool has zero runtime dependencies, it no longer needs to be
a workspace package. Deleting the package.json and its name removes any
npm dependency-confusion surface, and the tool runs with plain
ode
(Node 24 strips TypeScript natively) instead of tsx.

- Delete eng/emitter-diff/package.json; drop it from pnpm-workspace.yaml.
- Rewrite relative imports to .ts extensions and set
  allowImportingTsExtensions so
ode runs it and 	sc typechecks.
- Invoke via
ode src/cli.ts in CI, the shebang, and the README.

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
Running emitter-diff with plain node requires .ts import specifiers, but
the repo-wide check:eng gate (tsconfig.eng.json, --noEmit) rejected them.
Enable allowImportingTsExtensions there so the tool stays covered by the
existing typecheck. Also reword the ensureDeps comment: http-client-python
commits its own package-lock.json to the repo (lockfiles aren't in the
published tarball), which is why npm ci works on a source checkout.

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
A local run now prints the exact generated trees the diff compares, so
developers can open the head (current checkout) output directly instead of
inferring it from the work dir. Language-agnostic: lives in cli.ts, labeled
per side by the resolved emitter.

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
@l0lawrence

Copy link
Copy Markdown
Member Author

should emitter-diff live under eng/common bc we need this in tsp-azure too (like for rust/js/go etc)

Cut the python workflow to ~half its size:
- trim explanatory comment headers
- drop the hand-rolled github-script patch parser; reuse the tool's own
  'Diff summary:' line (ANSI-stripped) as a step output for the PR comment
- drop the redundant job-summary step and the now-unused --patch output
- collapse 'Fail on tool error' to a one-liner

Behavior is unchanged: sticky PR comment, HTML artifact, informational-only
(fails only on a tool/build error, never on a diff).

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

emitter:client:python Issue for the Python client emitter: @typespec/http-client-python eng

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant