DOJ-4839: harden workbook output — knob split-bar, lede/quote/code-highlight, output validator by dbejarano820 · Pull Request #30 · DojoCodingLabs/instructional-design-toolkit

dbejarano820 · 2026-06-02T21:58:04Z

Summary

Output-quality hardening surfaced by reviewing a real generated workbook (intro-to-quantum). Three cohesive items; no invariant change (the validator enforces the existing rules, it doesn't alter them), no architecture change.

Changes

Knob is-split variant (workbook-base.html) — a generated knob captioned "probability of measuring 1 (coral) vs 0 (lilac)" out-promised the single-color --knob preview. Added a full-width two-color A-vs-B bar (value portion = --wb-accent, remainder = --wb-accent-2) driven by the same --knob var (no JS change) + a 2-item legend. Standard knob unchanged.
Promote bespoke patterns into the kit — wb-lede (lead paragraph) and wb-quote (pull-quote + cite), plus a dependency-free code-highlight convention (kw/str/num/fn/cmt token spans, fixed legible hues on the code surface). These were being re-invented per generation; now tested + themed in the base.
scripts/validate_workbook.py (NEW, stdlib only) — an author tool that validates a generated workbook against its invariants: standalone (no external <script src>/CDN), in-memory-only (no localStorage/sessionStorage/indexedDB/postMessage), globally-unique element ids (errors), + a11y-landmark / system-ui-fallback (warnings). Wired into the command as Phase 6.
Skill + command updated: new components in the catalog + mapping; the validate step documented.

Verification

Ran the validator on the real intro-to-quantum workbook → 0 errors, 0 warnings.
New components render; the split bar's gradient stop tracks --knob (verified in preview); standard knob stays single-color; zero console errors.
CI lint green (frontmatter / agent-references / json-schemas).
No :root/invariant change; brand stays token-driven.

Out of scope

Density / companion-vs-duplicate guidance and the empirical variety test (separate follow-ups).

Closes DOJ-4839

Created by Claude Code on behalf of @dbejarano820

🤖 Generated with Claude Code

@daniel

…ghlight, validator Improvements surfaced reviewing the real intro-to-quantum workbook. - Template: knob `is-split` variant (two-color A-vs-B bar driven by the existing --knob var + a 2-item legend; standard knob unchanged); promote `wb-lede` and `wb-quote` into the kit; dependency-free code-highlight token classes (kw/str/num/fn/cmt) with fixed legible hues on the code surface. - Skill: add lede / statement / quote / split-knob to the catalog; document the highlight convention. - Command: map the new components; add Phase 6 — run the output validator. - scripts/validate_workbook.py (NEW, stdlib only): checks a generated workbook is standalone (no external script/CDN), in-memory-only (no localStorage/ postMessage/etc.), has globally-unique ids (critical post fan-out), and warns on missing a11y landmarks / system-ui fallback. Validated the real intro-to-quantum workbook: 0 errors, 0 warnings. New template components render + the split bar tracks --knob; zero console errors. CI lint green. No invariant change. Created by Claude Code on behalf of @daniel Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

dbejarano820 · 2026-06-02T21:58:06Z

@dojo-code-reviewer review

dojo-code-reviewer

✅ Approved

Approved — 0 blockers, 5 P2, 1 P3. Confidence: 2/5.

🟡 P2 — Major

assets/templates/workbook/workbook-base.html:175 — 🟡 P2 (major) — The pull-quote citation text color uses var(--wb-muted), leading to an inaccessible 2.42:1 contrast ratio against the light page backgrounds. Switching to var(--wb-sub) meets the 4.5:1 WCAG AA threshold.

.wb-quote cite { display: block; margin-top: var(--wb-sp-2); font-style: normal; font-size: .9rem; color: var(--wb-sub); }

assets/templates/workbook/workbook-base.html:292 — 🟡 P2 (major) — Using var(--wb-muted) (#8a8a93) on a white page background results in a low 2.42:1 contrast ratio, failing WCAG AA (minimum 4.5:1) for outcome labels. We should switch the legend text color to var(--wb-sub) (#57575f, 5.4:1 contrast ratio) to ensure high readability.

.wb-knob__legend { display: flex; gap: var(--wb-sp-4); margin-top: var(--wb-sp-2); font-size: .8rem; color: var(--wb-sub); }

scripts/validate_workbook.py:28 — 🟡 P2 (major) — Scanning the entire document for CDN URLs flags safe outbound hyperlinks (e.g., anchors or references in text) rather than actual runtime dependencies. We should target only links and resource imports from tags like , , <iframe>, or <script> to prevent false validation errors on educational resources.

CDN_RE = re.compile(r"<(?:link|script|img|iframe)\b[^>]*(?:href|src)\s*=\s*[\"'](https?://[^\1'\s]*(?:cdn\.|unpkg\.com|jsdelivr\.net|cdnjs\.))", re.I)

scripts/validate_workbook.py:54 — 🟡 P2 (major) — Checking state-persistence APIs on the raw HTML content will raise errors and fail the build if terms like 'localStorage' are mentioned in instructional paragraphs, code syntax blocks, or annotations. We should parse out and validate only the JavaScript contents within <script> tags to allow safe conceptual discussions.

    scripts_content = "\n".join(m.group(1) for m in re.finditer(r"<script\b[^>]*>(.*?)</script>", html, re.DOTALL | re.IGNORECASE))
    state_hits = sorted({m.group(1) for m in STATE_RE.finditer(scripts_content)})

scripts/validate_workbook.py:63 — 🟡 P2 (major) — Detecting duplicate IDs globally across the raw HTML will trigger false errors if instructional HTML code blocks contain sample code with placeholder IDs. We should strip
```
 and  elements prior to identifying document-level IDs.
```


    html_no_code = re.sub(r"<(pre|code)\b[^>]*>.*?</\1>", "", html, flags=re.DOTALL | re.IGNORECASE)
    for m in ID_RE.finditer(html_no_code):

🔵 P3 — Minor

scripts/validate_workbook.py:32 — 🔵 P3 (minor) — Strict double-quote matches in accessibility landmark regexes will cause false warnings on perfectly valid HTML using single-quoted or unquoted role attributes. We should allow alternative quotes in role matches.

ROLE_REGION_RE = re.compile(r'role\s*=\s*["\']?region["\']?', re.I)
ROLE_PROGRESS_RE = re.compile(r'role\s*=\s*["\']?progressbar["\']?', re.I)


Total findings: 5 compliance, 1 business context (6 total)


ℹ️ Inline anchoring failed for the comments below; they are listed here instead.


assets/templates/workbook/workbook-base.html:175 — 🟡 P2 (major) — The pull-quote citation text color uses var(--wb-muted), leading to an inaccessible 2.42:1 contrast ratio against the light page backgrounds. Switching to var(--wb-sub) meets the 4.5:1 WCAG AA threshold.

.wb-quote cite { display: block; margin-top: var(--wb-sp-2); font-style: normal; font-size: .9rem; color: var(--wb-sub); }


assets/templates/workbook/workbook-base.html:292 — 🟡 P2 (major) — Using var(--wb-muted) (#8a8a93) on a white page background results in a low 2.42:1 contrast ratio, failing WCAG AA (minimum 4.5:1) for outcome labels. We should switch the legend text color to var(--wb-sub) (#57575f, 5.4:1 contrast ratio) to ensure high readability.

.wb-knob__legend { display: flex; gap: var(--wb-sp-4); margin-top: var(--wb-sp-2); font-size: .8rem; color: var(--wb-sub); }


scripts/validate_workbook.py:28 — 🟡 P2 (major) — Scanning the entire document for CDN URLs flags safe outbound hyperlinks (e.g.,  anchors or references in text) rather than actual runtime dependencies. We should target only links and resource imports from tags like , , <iframe>, or <script> to prevent false validation errors on educational resources.

CDN_RE = re.compile(r"<(?:link|script|img|iframe)\b[^>]*(?:href|src)\s*=\s*[\"'](https?://[^\1'\s]*(?:cdn\.|unpkg\.com|jsdelivr\.net|cdnjs\.))", re.I)


scripts/validate_workbook.py:32 — 🔵 P3 (minor) — Strict double-quote matches in accessibility landmark regexes will cause false warnings on perfectly valid HTML using single-quoted or unquoted role attributes. We should allow alternative quotes in role matches.

ROLE_REGION_RE = re.compile(r'role\s*=\s*["\']?region["\']?', re.I)
ROLE_PROGRESS_RE = re.compile(r'role\s*=\s*["\']?progressbar["\']?', re.I)


scripts/validate_workbook.py:54 — 🟡 P2 (major) — Checking state-persistence APIs on the raw HTML content will raise errors and fail the build if terms like 'localStorage' are mentioned in instructional paragraphs, code syntax blocks, or annotations. We should parse out and validate only the JavaScript contents within <script> tags to allow safe conceptual discussions.

    scripts_content = "\n".join(m.group(1) for m in re.finditer(r"<script\b[^>]*>(.*?)</script>", html, re.DOTALL | re.IGNORECASE))
    state_hits = sorted({m.group(1) for m in STATE_RE.finditer(scripts_content)})


scripts/validate_workbook.py:63 — 🟡 P2 (major) — Detecting duplicate IDs globally across the raw HTML will trigger false errors if instructional HTML code blocks contain sample code with placeholder IDs. We should strip  and  elements prior to identifying document-level IDs.


    html_no_code = re.sub(r"<(pre|code)\b[^>]*>.*?</\1>", "", html, flags=re.DOTALL | re.IGNORECASE)
    for m in ID_RE.finditer(html_no_code):

@daniel

…ardening - quote `cite` + knob legend: var(--wb-muted) -> var(--wb-sub) (WCAG AA, P2 x2). - validator (P2 x3 + P3): CDN check scoped to resource-tag href/src (not <a> links); state-API check scans <script> contents only (not prose/code that merely mentions localStorage); id-uniqueness excludes <pre>/<code> sample code; role regexes accept single/unquoted attributes. Verified: quantum workbook still 0/0; synthetic false-positive file (localStorage in prose, <a> cdn link, sample-code id) -> 0 errors; true-positive file (cdn <script>, localStorage in script, dup id) -> 4 errors. CI lint green. Created by Claude Code on behalf of @daniel Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

dojo-code-reviewer

✅ Approved

Approved — 0 blockers, 3 P3. Confidence: 4/5.

🔵 P3 — Minor

scripts/validate_workbook.py:29 — 🔵 P3 (minor) — Protocol-relative URLs (e.g., '//unpkg.com/...') are not captured by the current regex because it expects 'https?://'. This can cause some external CDN references to go undetected. Updating the pattern to support optional protocols makes the validation more robust.

    r"""<(?:link|script|img|iframe)\b[^>]*?\b(?:href|src)\s*=\s*["']((?:https?:)?//[^"']*(?:cdn\.|unpkg\.com|jsdelivr\.net|cdnjs\.)[^"']*)""",

scripts/validate_workbook.py:37 — 🔵 P3 (minor) — Single-quoted element IDs (e.g., id='my-id') will bypass this duplicate-check regex since it is strictly anchored to double-quotes. Supporting both single and double quotes ensures any valid HTML markup pattern is accurately checked.

ID_RE = re.compile(r'\sid\s*=\s*["\']([^"\']+)["\']', re.I)

scripts/validate_workbook.py:72 — 🔵 P3 (minor) — Script blocks containing variable assignments, string constants, or template literals with the string 'id="..."' can trigger false positives during duplicate-ID scanning because script content is not stripped. Removing script blocks before scanning prevents JS logic from being mistakenly classified as HTML element IDs.

    html_no_code = SCRIPT_BLOCK_RE.sub("", CODE_BLOCK_RE.sub("", html))

Total findings: 3 business context (3 total)

ℹ️ Inline anchoring failed for the comments below; they are listed here instead.

scripts/validate_workbook.py:29 — 🔵 P3 (minor) — Protocol-relative URLs (e.g., '//unpkg.com/...') are not captured by the current regex because it expects 'https?://'. This can cause some external CDN references to go undetected. Updating the pattern to support optional protocols makes the validation more robust.

    r"""<(?:link|script|img|iframe)\b[^>]*?\b(?:href|src)\s*=\s*["']((?:https?:)?//[^"']*(?:cdn\.|unpkg\.com|jsdelivr\.net|cdnjs\.)[^"']*)""",

scripts/validate_workbook.py:37 — 🔵 P3 (minor) — Single-quoted element IDs (e.g., id='my-id') will bypass this duplicate-check regex since it is strictly anchored to double-quotes. Supporting both single and double quotes ensures any valid HTML markup pattern is accurately checked.

ID_RE = re.compile(r'\sid\s*=\s*["\']([^"\']+)["\']', re.I)

scripts/validate_workbook.py:72 — 🔵 P3 (minor) — Script blocks containing variable assignments, string constants, or template literals with the string 'id="..."' can trigger false positives during duplicate-ID scanning because script content is not stripped. Removing script blocks before scanning prevents JS logic from being mistakenly classified as HTML element IDs.

    html_no_code = SCRIPT_BLOCK_RE.sub("", CODE_BLOCK_RE.sub("", html))

dojo-code-reviewer Bot approved these changes Jun 2, 2026

View reviewed changes

dbejarano820 merged commit 1d51b8b into main Jun 2, 2026
1 check passed

dbejarano820 deleted the daniel/doj-workbook-output-hardening branch June 2, 2026 22:03

dojo-code-reviewer Bot approved these changes Jun 2, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

DOJ-4839: harden workbook output — knob split-bar, lede/quote/code-highlight, output validator#30

DOJ-4839: harden workbook output — knob split-bar, lede/quote/code-highlight, output validator#30
dbejarano820 merged 2 commits into
mainfrom
daniel/doj-workbook-output-hardening

dbejarano820 commented Jun 2, 2026

Uh oh!

dbejarano820 commented Jun 2, 2026

Uh oh!

dojo-code-reviewer Bot left a comment

Uh oh!

Uh oh!

dojo-code-reviewer Bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

dbejarano820 commented Jun 2, 2026

Summary

Changes

Verification

Out of scope

Uh oh!

dbejarano820 commented Jun 2, 2026

Uh oh!

dojo-code-reviewer Bot left a comment

Choose a reason for hiding this comment

✅ Approved

🟡 P2 — Major

🔵 P3 — Minor

Uh oh!

Uh oh!

dojo-code-reviewer Bot left a comment

Choose a reason for hiding this comment

✅ Approved

🔵 P3 — Minor

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant