test(cli): integration tests for all cf proof commands (#455)#462
17 tests across 5 test classes cover every acceptance criterion:
- AC1 — TestCapture: creates REQ-0001, persists to ledger, increments IDs, rejects invalid severity/source with exit code 1
- AC2 — TestRun: PASS/FAIL exit codes, no-obligations empty path, invalid gate error, `_run_gate` patched via `unittest.mock`
- AC3 — TestWaive: marks waived with expiry and persists to ledger, works without expiry, rejects missing REQ (exit 1), rejects bad date format with descriptive error
- AC4 — TestStatus: empty workspace message, open count, waived count, expired waiver reverts to open and prints Expired notice
- AC5 — TestClosedLoop: capture → fail run (exit 1, FAIL in output) → status still open → pass run (exit 0, PASS) → evidence recorded
All 17 tests pass. No production code changes required.
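The test pattern the classes above share can be sketched with Typer's `CliRunner`. The toy `capture` command below is a stand-in for illustration only; the real project commands, options, and module paths are not reproduced here.

```python
import typer
from typer.testing import CliRunner

# Toy stand-in app; the real `cf proof` CLI lives in the project.
app = typer.Typer()

_VALID_SEVERITIES = {"low", "medium", "high"}


@app.command()
def capture(severity: str = "medium"):
    """Capture a requirement, rejecting invalid severities with exit code 1."""
    if severity not in _VALID_SEVERITIES:
        typer.echo(f"Invalid severity: {severity}")
        raise typer.Exit(code=1)
    typer.echo("REQ-0001")


runner = CliRunner()


def test_capture_succeeds():
    result = runner.invoke(app, ["--severity", "high"])
    assert result.exit_code == 0, result.output
    assert "REQ-0001" in result.output


def test_capture_invalid_severity_exits_one():
    result = runner.invoke(app, ["--severity", "bogus"])
    assert result.exit_code == 1
```

`CliRunner.invoke` captures both the exit code and the printed output in-process, which is why the suite can assert on them without spawning subprocesses.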
Walkthrough: Adds a new end-to-end CLI test module that exercises the full `cf proof` command surface.
Actionable comments posted: 3
🧹 Nitpick comments (1)
tests/cli/test_proof_commands.py (1)
51-55: Assert the seed capture in `ws_with_req`. If the setup command fails here, the dependent tests fail later with misleading assertions. Failing fast in the fixture makes the regression obvious.
Example tweak:

```diff
 @pytest.fixture()
 def ws_with_req(ws):
     """Workspace that already has one captured requirement (REQ-0001)."""
     workspace, workspace_path = ws
-    runner.invoke(app, ["proof", "capture", "-w", str(workspace_path)] + _CAPTURE_ARGS)
+    result = runner.invoke(
+        app, ["proof", "capture", "-w", str(workspace_path)] + _CAPTURE_ARGS
+    )
+    assert result.exit_code == 0, result.output
     return workspace, workspace_path
```
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/cli/test_proof_commands.py`:
- Around line 205-214: Update the test_waive_nonexistent_req_exits_nonzero test
to assert the CLI error message contains a stable "not found" indicator in
addition to exit code; after calling runner.invoke(app, [...]) check
result.exit_code == 1 and add an assertion like "'not found' in result.output"
(or result.stdout/result.stderr depending on runner) to ensure the failure is
the missing-REQ case rather than an unrelated usage/CLI error.
- Around line 305-315: After verifying evidence via ledger.list_evidence for
"REQ-0001", also assert the canonical satisfied state: call
ledger.get_requirement(workspace, "REQ-0001") and assert its .status (or .state)
equals the satisfied value your model uses (e.g., "satisfied" or
RequirementStatus.SATISFIED), or alternatively invoke the CLI with
runner.invoke(app, ["proof", "status", "-w", str(workspace_path), "REQ-0001"])
and assert the output shows it as satisfied; attach this assertion after the
existing evidence checks to ensure the requirement is truly recorded as closed.
- Around line 241-273: The tests use loose substring checks like assert "1" in
result.output which can match unrelated digits; tighten assertions by checking
the exact summary lines and verifying the waive CLI succeeded: in
test_status_shows_open_count assert the status output contains a line like
"Open: 1" (or use a regex anchored to the summary label) instead of just "1"; in
test_status_shows_waived_count first capture the result of runner.invoke(app,
["proof", "waive", ...]) and assert its exit_code == 0, then assert the status
contains an exact "Waived: 1" line; in
test_status_expired_waiver_reverts_to_open after injecting the past Waiver via
ledger.waive_requirement confirm the status contains both the "Expired" notice
and that the summary shows "Open: 1" (not just that "Expired" appears). Use the
test function names (test_status_shows_open_count,
test_status_shows_waived_count, test_status_expired_waiver_reverts_to_open),
runner.invoke, ledger.waive_requirement and the Waiver model to locate the
changes.
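The tightening suggested in the comments above, anchoring assertions to the labeled summary line rather than a bare digit, can be illustrated like this (the `output` string is made up for the example):

```python
import re

# Made-up status output: the digit "1" also appears inside "REQ-0001",
# so a bare `"1" in output` check proves nothing about the counts.
output = "Workspace status\nREQ-0001 open\nOpen: 1\nWaived: 0\n"

# Loose check: would still pass on the stray "1" in REQ-0001
# even if the "Open:" line reported a different count.
assert "Open:" in output and "1" in output

# Tight check: read the count from the labeled summary line itself.
match = re.search(r"^Open:\s*(\d+)$", output, re.MULTILINE)
assert match is not None
assert match.group(1) == "1"
```

The anchored regex with `re.MULTILINE` pins the match to a whole summary line, so the assertion fails as soon as the label or the count is wrong.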
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 85023c51-de0a-49d1-be45-8327a263963b
📒 Files selected for processing (1)
tests/cli/test_proof_commands.py
Review: Integration Tests for
- ws_with_req fixture: assert capture exit_code == 0 for fast failure on setup errors
- test_waive_nonexistent_req: add "not found" message assertion alongside exit code check
- test_status_shows_open_count/waived_count: tighten to "Open:"/"Waived:" substrings and verify waive exit_code == 0 before status check
- test_status_expired_waiver_reverts_to_open: assert "Open:" present after expiry in addition to "Expired"
- test_capture_run_enforced_then_satisfied: assert req.status == ReqStatus.SATISFIED after passing run
Follow-up Review - status of issues after fix commit a23ef3a

Resolved items from prior round: the three critical issues from the first review round are all resolved.

Still open:

- Issue 1 - Important - Fragile count assertions in TestStatus (lines 243, 248, 262, 266). The two-assertion pattern still exists: `assert 'Open:' in result.output` followed by `assert '1' in result.output`. The digit '1' appears in REQ-0001 and in tmp path names, so this can pass even when the count label is absent. The fix is to combine into a single specific string, for example: `assert 'Open: 1' in result.output`. Check the actual output format from cf proof status to confirm the exact separator before applying.
- Issue 2 - Minor - Late import inside test body. The import `from codeframe.core.proof.models import Waiver` remains inside the test_status_expired_waiver_reverts_to_open function body. Move it to the module-level imports at the top of the file alongside the existing ReqStatus import for consistency.
- Issue 3 - Minor - Inconsistent exit code assertions. test_capture_invalid_severity_exits_nonzero and test_capture_invalid_source_exits_nonzero use `assert result.exit_code != 0`. All other error tests in this file use `== 1`. Prefer `== 1` for precision unless the command can return multiple distinct non-zero values on error.

Summary: Issue 1 is the highest priority remaining item since it creates a false-passing test condition. Issues 2 and 3 are cleanup items. The overall test design is solid: isolated workspaces, direct ledger verification, correct patch target for _run_gate, and good fixture composition.
Actionable comments posted: 1
♻️ Duplicate comments (1)
tests/cli/test_proof_commands.py (1)
243-278: ⚠️ Potential issue | 🟠 Major: Tighten the status summary assertions.
These checks can still pass on unrelated digits like REQ-0001, and the expired-waiver case still doesn't prove the requirement is counted as open again. Please assert the summary lines directly (`Open: 1`, `Waived: 1`, and `Open: 1` after expiry), or use an anchored regex against those labels.

Suggested tightening:

```diff
 result = runner.invoke(app, ["proof", "status", "-w", str(workspace_path)])
 assert result.exit_code == 0, result.output
-assert "Open:" in result.output
-assert "1" in result.output
+assert "Open: 1" in result.output
@@
 result = runner.invoke(app, ["proof", "status", "-w", str(workspace_path)])
 assert result.exit_code == 0, result.output
-assert "Waived:" in result.output
-assert "1" in result.output
+assert "Waived: 1" in result.output
@@
 result = runner.invoke(app, ["proof", "status", "-w", str(workspace_path)])
 assert result.exit_code == 0, result.output
 assert "Expired" in result.output
-assert "Open:" in result.output
+assert "Open: 1" in result.output
```
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: ccb268b7-3683-4260-b68a-9368cb1af409
📒 Files selected for processing (1)
tests/cli/test_proof_commands.py
```python
# Step 3 — status: still open (run failure doesn't auto-satisfy)
result = runner.invoke(app, ["proof", "status", "-w", str(workspace_path)])
assert result.exit_code == 0, result.output
assert "Open" in result.output
```
Assert the failed run leaves REQ-0001 open.
Having "Open" somewhere in the status output doesn't prove the failing run preserved the requirement state. Query the ledger here, or at least assert "Open: 1", so the “fail doesn't auto-satisfy” path is actually covered.
Suggested assertion:

```diff
 result = runner.invoke(app, ["proof", "status", "-w", str(workspace_path)])
 assert result.exit_code == 0, result.output
-assert "Open" in result.output
+assert "Open: 1" in result.output
+req = ledger.get_requirement(workspace, "REQ-0001")
+assert req is not None
+assert req.status == ReqStatus.OPEN
```
Summary

Closes #455. Adds `tests/cli/test_proof_commands.py` — 17 integration tests covering the complete `cf proof` CLI surface area through the Typer `CliRunner` against real SQLite workspaces.

Acceptance criteria
- `cf proof capture` creates a REQ and persists it
- `cf proof run` evaluates workspace against open REQs
- `cf proof waive` marks a REQ waived with expiry
- `cf proof status` shows correct summary

What's in the test file
- TestCapture
- TestRun (`_run_gate` patched)
- TestWaive
- TestStatus
- TestClosedLoop

Design decisions
- `_run_gate` is patched at `codeframe.core.proof.runner._run_gate` — same pattern as `TestRunner` in `tests/core/test_proof9.py`
- Ledger state is verified directly via `ledger.get_requirement` / `ledger.list_evidence` — output checks alone are insufficient
- Isolated `tmp_path` workspace via `create_or_load_workspace`

Test plan
- `uv run ruff check` clean
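The patch-target decision above (mock the gate at the attribute the runner actually looks up) can be sketched with `unittest.mock`. The `GateRunner` class below is a stand-in for illustration; the real tests patch the module attribute `codeframe.core.proof.runner._run_gate` instead of a class attribute.

```python
from unittest import mock


class GateRunner:
    """Stand-in for the proof runner (illustrative only)."""

    def _run_gate(self, gate):
        # In real runs this would execute lint/type/test gates,
        # which is exactly what the mock must avoid.
        raise RuntimeError("real gates are not executed in unit tests")

    def run_all(self, gates):
        # Overall PASS only if every gate passes.
        return all(self._run_gate(g) for g in gates)


runner = GateRunner()

# Patch the lookup site so run_all never touches the real implementation.
with mock.patch.object(GateRunner, "_run_gate", return_value=True) as fake:
    assert runner.run_all(["lint", "tests"]) is True
    assert fake.call_count == 2

# A failing gate propagates to an overall FAIL.
with mock.patch.object(GateRunner, "_run_gate", return_value=False):
    assert runner.run_all(["lint", "tests"]) is False
```

Because `run_all` resolves `_run_gate` through normal attribute lookup, patching that one attribute is enough; no other seam is needed, which mirrors the "same pattern as TestRunner" note in the design decisions.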