context-engine
diff --git a/.agent/workflows/test-quality-review.md b/.agent/workflows/test-quality-review.md
Lines changed: 250 additions & 198 deletions
diff --git a/.agent/workflows/write-tests-from-specs.md b/.agent/workflows/write-tests-from-specs.md
Lines changed: 410 additions & 0 deletions
diff --git a/.claude/skills/analyze-mutation-survivors/SKILL.md b/.claude/skills/analyze-mutation-survivors/SKILL.md
Lines changed: 72 additions & 0 deletions
diff --git a/.claude/skills/detect-test-antipatterns/SKILL.md b/.claude/skills/detect-test-antipatterns/SKILL.md
Lines changed: 127 additions & 0 deletions
diff --git a/.claude/skills/map-spec-to-tests/SKILL.md b/.claude/skills/map-spec-to-tests/SKILL.md
Lines changed: 90 additions & 0 deletions
diff --git a/.claude/skills/run-quality-checks/SKILL.md b/.claude/skills/run-quality-checks/SKILL.md
Lines changed: 58 additions & 0 deletions
@@ -0,0 +1,72 @@
+---
+name: analyze-mutation-survivors
+description: Analyze surviving mutations from Stryker mutation testing and write targeted tests to kill them. Use when mutation score is below threshold or when Stryker reports survivors.
+allowed-tools: Bash Read Edit Write Grep
+---
+
+# Analyze Mutation Survivors
+
+Analyze surviving mutants and write targeted tests to kill them.
+
+## Instructions
+
+### 1. Run Mutation Testing
+
+```bash
+bun run test:mutate
+```
+
+### 2. Open the Report
+
+The HTML report is at: `reports/mutation/index.html`
+
+Alternatively, read the console output, which lists each survivor.
+
+### 3. For Each Survivor, Analyze
+
+Each surviving mutant shows:
+- **File and line number** - where the mutation was made
+- **Original code** - what the code looked like
+- **Mutated code** - what Stryker changed it to
+
+Ask: "Why didn't the existing tests catch this change?"
+
+### 4. Common Mutations and Fixes
+
+| Mutation | Why It Survived | Fix |
+|----------|-----------------|-----|
+| `>=` → `>` | No boundary test | Add test with exact boundary value |
+| `&&` → `\|\|` | Only tested when both true | Add test where only one is true |
+| `if (x)` → `if (true)` | No test where x is falsy | Add test with falsy value |
+| Block removed | Side effect not verified | Assert the block's effect |
+| Return changed | Only checked `toBeDefined()` | Assert exact return value |
+
+### 5. Write Targeted Test
+
+Create a test specifically designed to fail if the mutation is applied.
+
+Example:
+```typescript
+// Mutation: timestamp >= startTime → timestamp > startTime
+// Fix: Test with exact boundary
+test("includes message at exact startTime", () => {
+  const exactTime = new Date();
+  store.store(createMessage({ timestamp: exactTime }));
+  const results = store.query({ startTime: exactTime });
+  expect(results.length).toBe(1);  // Would be 0 with >
+});
+```
+
+### 6. Re-run Until Threshold Met
+
+```bash
+bun run test:mutate
+```
+
+Target: ≥ 80% mutation score
+
+## Tips
+
+- Focus on high-impact survivors first (core logic)
+- Some survivors are acceptable (defensive code, logging)
+- Property-based tests can kill many mutations at once
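The `&&` → `||` row in the table of common mutations can be made concrete with a small sketch. `canAccess` and `canAccessMutant` are hypothetical names, not code from this repository; the mutant is written out by hand only to show which inputs tell it apart from the original.

```typescript
// Hypothetical guard under test; the name and signature are illustrative only.
function canAccess(isAuthenticated: boolean, hasPermission: boolean): boolean {
  return isAuthenticated && hasPermission;
}

// The `&&` → `||` mutant, written out by hand for comparison.
function canAccessMutant(isAuthenticated: boolean, hasPermission: boolean): boolean {
  return isAuthenticated || hasPermission;
}

// Inputs where both operands agree cannot tell the two versions apart...
const weakCases: Array<[boolean, boolean]> = [[true, true], [false, false]];
// ...but inputs where exactly one operand is true can.
const killingCases: Array<[boolean, boolean]> = [[true, false], [false, true]];

// True when at least one input produces different results for original and mutant.
function distinguishes(cases: Array<[boolean, boolean]>): boolean {
  return cases.some(([a, b]) => canAccess(a, b) !== canAccessMutant(a, b));
}

console.log(distinguishes(weakCases));    // false
console.log(distinguishes(killingCases)); // true
```

A suite that only exercises `weakCases` lets the mutant survive; asserting the result for either of the `killingCases` kills it.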
@@ -0,0 +1,127 @@
+---
+name: detect-test-antipatterns
+description: Detect fake safety patterns in tests that provide false confidence. Use when reviewing test quality, after automated checks pass, or when tests seem suspicious.
+allowed-tools: Read Grep Glob
+---
+
+# Detect Test Anti-Patterns
+
+Find patterns that create false confidence: tests that appear to verify behavior but don't.
+
+## Instructions
+
+### 1. Hidden Assertions (CRITICAL)
+
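+Search for assertions inside callbacks. `grep` matches line by line, so this only flags an `expect(` that shares a line with an arrow function; review multi-line callbacks manually: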
+```bash
+grep -rn "expect(" packages/*/src/*.test.ts | grep -E "\([^)]+\) =>"
+```
+
+**Anti-pattern:**
+```typescript
+// ❌ BAD - expect may never run if callback doesn't execute
+pipeline.use(async (ctx) => {
+  expect(ctx.value).toBe("something");
+});
+await pipeline.run(ctx);
+```
+
+**Fix:**
+```typescript
+// ✅ GOOD - capture and assert after
+let capturedValue: string | undefined; // undefined until the callback runs
+let executed = false;
+pipeline.use(async (ctx) => {
+  executed = true;
+  capturedValue = ctx.value;
+});
+await pipeline.run(ctx);
+expect(executed).toBe(true);
+expect(capturedValue).toBe("something");
+```
+
+### 2. Weak toBeDefined() Assertions
+
+Search for:
+```bash
+grep -rn "toBeDefined()" packages/*/src/*.test.ts
+```
+
+**Anti-pattern:**
+```typescript
+// ❌ BAD - passes even if value is wrong
+expect(event.id).toBeDefined();
+```
+
+**Fix:**
+```typescript
+// ✅ GOOD - verify the format (or the exact value, when it is known)
+expect(event.id).toMatch(/^[0-9a-f-]{36}$/i);
+```
+
+### 3. Tautological Comparisons
+
+Search for:
+```bash
+grep -rn "toBeGreaterThanOrEqual" packages/*/src/*.test.ts
+```
+
+**Anti-pattern:**
+```typescript
+// ❌ BAD - passes even when the timestamp was never updated
+expect(updated.getTime()).toBeGreaterThanOrEqual(original.getTime());
+```
+
+**Fix:**
+```typescript
+// ✅ GOOD - strict comparison; the delay must run before the update that produces `updated`
+await new Promise(r => setTimeout(r, 5));
+expect(updated.getTime()).toBeGreaterThan(original.getTime());
+```
+
+### 4. Length-Only Checks
+
+Search for:
+```bash
+grep -rn "\.length).toBe" packages/*/src/*.test.ts
+```
+
+**Anti-pattern:**
+```typescript
+// ❌ BAD - could be wrong items
+expect(sessions.length).toBe(3);
+```
+
+**Fix:**
+```typescript
+// ✅ GOOD - verify content too
+expect(sessions.length).toBe(3);
+expect(sessions.map(s => s.id)).toContain(expected1.id);
+```
+
+### 5. Existence-Only Tests
+
+**Anti-pattern:**
+```typescript
+// ❌ BAD - doesn't test behavior
+it("exports SessionManager", () => {
+  expect(SessionManager).toBeDefined();
+});
+```
+
+**Fix:**
+```typescript
+// ✅ GOOD - verify functionality
+it("exports working SessionManager", () => {
+  const manager = new SessionManager();
+  const session = manager.create({ name: "test", transport: "stdio" });
+  expect(session.state).toBe("CREATED");
+});
+```
+
+## Output
+
+For each anti-pattern found, document:
+1. File and line number
+2. The problematic code
+3. Suggested fix
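The three output fields listed above (file and line number, problematic code, suggested fix) could be captured in a small record type. The sketch below is an assumption about how such a report might be produced, not part of the skill: `Finding`, `RULES`, and `scan` are hypothetical names, and the line-based patterns mirror the `grep` searches for sections 2 to 4, so they share the same single-line limitation.

```typescript
// One reported anti-pattern: where it is, what it looks like, and how to fix it.
interface Finding {
  file: string;
  line: number; // 1-based line number
  code: string;
  suggestion: string;
}

// Line-based heuristics mirroring the grep searches in sections 2 to 4.
const RULES: Array<{ pattern: RegExp; suggestion: string }> = [
  { pattern: /\.toBeDefined\(\)/, suggestion: "Assert the exact value or its format" },
  { pattern: /\.toBeGreaterThanOrEqual\(/, suggestion: "Check whether a strict comparison is intended" },
  { pattern: /\.length\)\.toBe\(/, suggestion: "Also assert the contents, not only the count" },
];

// Scan one file's text and return a finding for every line that matches a rule.
function scan(file: string, source: string): Finding[] {
  const findings: Finding[] = [];
  source.split("\n").forEach((text, index) => {
    for (const rule of RULES) {
      if (rule.pattern.test(text)) {
        findings.push({ file, line: index + 1, code: text.trim(), suggestion: rule.suggestion });
      }
    }
  });
  return findings;
}

const sample = [
  'it("creates sessions", () => {',
  "  expect(event.id).toBeDefined();",
  "  expect(sessions.length).toBe(3);",
  "});",
].join("\n");

for (const f of scan("manager.test.ts", sample)) {
  console.log(`${f.file}:${f.line}  ${f.code}  ->  ${f.suggestion}`);
}
```

Every match is only a candidate: a length check followed by a content assertion, for example, is fine, so each finding still needs a manual look before it goes into the report.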
@@ -0,0 +1,90 @@
+---
+name: map-spec-to-tests
+description: Map specification scenarios to test cases and identify coverage gaps. Use when verifying test completeness against requirements or during test quality reviews.
+allowed-tools: Read Grep Glob
+---
+
+# Map Spec to Tests
+
+Create a traceability matrix linking spec scenarios to test implementations.
+
+## Instructions
+
+### 1. Locate the Spec Document
+
+Find the relevant spec file under the phased-implementation specs directory:
+```
+say2/3-how/multi-protocols/say2/specs/v1/02-architecture/shared/03-phased-implementation/
+```
+
+### 2. Extract Test Scenarios
+
+Look for sections with scenario lists:
+- Checkboxes: `- [ ]` or `- [x]`
+- Numbered lists describing expected behavior
+- Tables with acceptance criteria
+
+### 3. For Each Scenario
+
+Search for corresponding tests:
+```bash
+grep -rn "scenario keyword" packages/*/src/*.test.ts
+```
+
+### 4. Create Traceability Matrix
+
+Document in this format:
+
+```markdown
+| Spec Scenario | Test Location | Status | Notes |
+|---------------|---------------|--------|-------|
+| Sessions have unique IDs | manager.test.ts:L25 | ✅ | |
+| Messages preserve order | message-store.test.ts:L74 | ✅ | |
+| Middleware chain order | pipeline.test.ts:L30 | ✅ | |
+| Error state transition | manager.test.ts:L89 | ⚠️ | Only happy path |
+| Query by time range | - | ❌ | Not implemented |
+```
+
+### Status Legend
+
+- ✅ **Fully covered** - Test exists and verifies all aspects
+- ⚠️ **Partially covered** - Test exists but missing edge cases
+- ❌ **Not covered** - No test found
+
+### 5. For Missing or Partial Coverage
+
+Document:
+1. What specific behavior is not tested
+2. Which edge cases are missing
+3. Recommended test to add
+
+### 6. Update Spec Document
+
+After analysis, update the spec with verification status:
+
+```markdown
+## Test Scenarios
+
+- [x] Sessions are created with unique IDs *(manager.test.ts)*
+- [x] Messages can be stored and retrieved *(message-store.test.ts)*
+- [ ] Query filtering by time range *(NOT TESTED)*
+```
+
+## Example Output
+
+```markdown
+## Traceability Report for Phase 0
+
+**Coverage Summary:** 12/14 scenarios (86%)
+
+### Fully Covered (10)
+- Session lifecycle management
+- Message storage and retrieval
+- Middleware execution order
+
+### Partially Covered (2)
+- Error handling: missing timeout scenarios
+
+### Not Covered (2)
+- Concurrent session handling
+```
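The coverage summary line in the example report can be derived mechanically from the traceability matrix. The sketch below assumes that partially covered scenarios count toward the total, which matches the example above (10 full + 2 partial = 12 of 14); `MatrixRow`, `coverageSummary`, and `gaps` are hypothetical names, and the rows are copied from the sample matrix in step 4.

```typescript
type Status = "full" | "partial" | "none";

interface MatrixRow {
  scenario: string;
  testLocation: string | null; // null when no test was found
  status: Status;
}

// Build the "covered/total scenarios (percent%)" summary line.
// Assumption: partially covered scenarios count as covered.
function coverageSummary(rows: MatrixRow[]): string {
  const count = (status: Status): number => rows.filter((r) => r.status === status).length;
  const covered = count("full") + count("partial");
  const percent = rows.length === 0 ? 0 : Math.round((covered / rows.length) * 100);
  return `${covered}/${rows.length} scenarios (${percent}%)`;
}

// Scenarios that still need work: partially covered or not covered at all.
function gaps(rows: MatrixRow[]): string[] {
  return rows.filter((r) => r.status !== "full").map((r) => r.scenario);
}

const rows: MatrixRow[] = [
  { scenario: "Sessions have unique IDs", testLocation: "manager.test.ts:L25", status: "full" },
  { scenario: "Messages preserve order", testLocation: "message-store.test.ts:L74", status: "full" },
  { scenario: "Middleware chain order", testLocation: "pipeline.test.ts:L30", status: "full" },
  { scenario: "Error state transition", testLocation: "manager.test.ts:L89", status: "partial" },
  { scenario: "Query by time range", testLocation: null, status: "none" },
];

console.log(coverageSummary(rows)); // 4/5 scenarios (80%)
console.log(gaps(rows));
```

If partial coverage should not count, only the `covered` line changes; stating the rule next to the number keeps the report unambiguous.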
@@ -0,0 +1,58 @@
+---
+name: run-quality-checks
+description: Run automated test quality checks including tests, coverage, assertion density, and linting. Use when you need to verify test quality metrics or when tests might have issues.
+allowed-tools: Bash Read
+---
+
+# Run Quality Checks
+
+Run the automated quality checks to assess test health.
+
+## Instructions
+
+### 1. Quick Quality Check
+
+```bash
+bun run quality
+```
+
+This runs:
+- All tests
+- Assertion density check (≥ 1.0 per test)
+- Lint checks
+
+### 2. Full Quality Check (with mutation testing)
+
+```bash
+bun run quality:full
+```
+
+This adds mutation testing, which is slower but more thorough.
+
+### 3. Individual Checks
+
+| Check | Command |
+|-------|---------|
+| Tests only | `bun test` |
+| Coverage | `bun test --coverage` |
+| Assertion density | `bun run test:density` |
+| Lint | `bun run lint` |
+| Mutation testing | `bun run test:mutate` |
+| Property tests | `bun run test:property` |
+
+## On Failure
+
+1. Read the output to identify which check failed
+2. Fix the issue in the test or source files
+3. Re-run until all checks pass
+
+## Expected Output
+
+```
+✅ All quality checks passed
+```
+
+With metrics:
+- Mutation score: ≥ 80%
+- Assertion density: ≥ 1.0 per test
+- Line coverage: ≥ 80%
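The diff does not show how `bun run test:density` is implemented, so the following is only a rough sketch of what an assertion density of "≥ 1.0 per test" could mean: the number of `expect(` calls divided by the number of `it(` or `test(` declarations in a file. `assertionDensity` and `MIN_DENSITY` are hypothetical names.

```typescript
// Rough density estimate: assertions per declared test in one source file.
// Assumption: tests are declared with `it(` or `test(` and assertions use `expect(`.
// This is a text match, so occurrences inside comments or strings are counted too.
function assertionDensity(source: string): number {
  const tests = (source.match(/\b(?:it|test)\(/g) ?? []).length;
  const assertions = (source.match(/\bexpect\(/g) ?? []).length;
  return tests === 0 ? 0 : assertions / tests;
}

const MIN_DENSITY = 1.0; // threshold stated in this skill

const sample = [
  'test("stores a message", () => {',
  "  expect(store.size()).toBe(1);",
  '  expect(store.last().id).toBe("m1");',
  "});",
  'it("rejects empty input", () => {',
  "  expect(() => store.store(null)).toThrow();",
  "});",
].join("\n");

const density = assertionDensity(sample);
console.log(density, density >= MIN_DENSITY); // 1.5 true
```

An average of 1.0 can hide individual tests with no assertions at all, which is one reason to pair this metric with mutation testing.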