Adopt Production-Grade Loop: minimax implementer + 5-reviewer cycle#66
Adopt Production-Grade Loop: minimax implementer + 5-reviewer cycle#66ktsaou wants to merge 4 commits into
Conversation
Reduce ingest write-model complexity with behavior-preserving helper extraction and characterization coverage. Move SOW-0055 to completed and track residual writer file-size debt in SOW-0060.
Encode the operator's new operating model in the project contract: - AGENTS.md: new "Production-Grade Loop" section, Hard Rule #11, updated branch-protection + merge workflow. - project-workflow SKILL: roles table (CTO / implementer=minimax / 5 reviewers), updated cycle, Step 4/6/7/8 reflect the new protocol. - project-delegation SKILL: "The implementer is minimax" section, backup rotation order, new forbidden block wording. - project-second-opinions SKILL: the 5-reviewer set (glm, mimo, minimax, qwen, deepseek), PRODUCTION GRADE vote, P0/P1/P2/P3 stop conditions, claim-verification steps, deprecation note for codex/gemini on the production loop. This makes the protocol survive compactions: AGENTS.md is loaded at startup, and the three skills are the runtime enforcement. No production code changed. No SOW affected. Pure contract update.
Not up to standards ⛔🔴 Issues
|
| Category | Results |
|---|---|
| BestPractice | 4 medium |
| Comprehensibility | 5 minor |
🟢 Metrics 140 complexity · 41 duplication
Metric Results Complexity 140 Duplication 41
NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.
P0 (durable specs were stale): - specs/workflow.md: Roles section + External Review section rewritten to match the 5-reviewer Production-Grade Loop. - specs/second-opinions.md: TL;DR, "show the user the prompts" rule, reviewer table, and ad-hoc subset all updated to the new protocol. P1 (must fix): - Implementer model pinned to llm-netdata-cloud/minimax-m3-coder everywhere (AGENTS.md, project-delegation, project-workflow, specs/second-opinions, specs/workflow). The reviewer-minimax uses the same model. Reviewer minimax invocation path also pinned. - Hard Rule #4 now says "5/5 PRODUCTION GRADE" instead of the weaker "at least one round". - Hard Rule #3 now references minimax and the Production-Grade Loop (was the generic "spawned subagents"). - "What the operator sees" / Reporting to the Operator: removed "file paths with line numbers as evidence" from the operator report in project-workflow SKILL — that lives in SOW ## Reviews and the PR description only. P2 (fixed in this PR): - AGENTS.md stop conditions: P2 wording made explicit ("fix in the same PR, re-trigger the cycle; merge only when 5/5 PG or only P3 noise remains"); hard-stall language clarified to business-level escalation only. - "When the loop runs" gained a "Contract / process changes" bullet: AGENTS.md, project skills, specs, CI config all go through the 5-reviewer cycle. - project-workflow SKILL: model name in Roles table updated to match; cross-reference now specifies the "Production-Grade Loop" section. - project-delegation SKILL: implementer vs reviewer wording differentiated. - project-second-opinions SKILL: ad-hoc reviewer table no longer lists the 5 production reviewers (they're reserved for the production cycle); the contradictory "are-not-used-for-ad-hoc" note is gone. - project-second-opinions SKILL: deepseek invocation path fixed to llm-netdata-cloud/deepseek-v4-pro. - project-second-opinions SKILL: SOW review template gained the "Vote PRODUCTION GRADE / NEEDS WORK" instruction so reviewer responses are machine-parseable. - Required First Checks: project-second-opinions added to the minimum load list (it's the runtime enforcement of the loop). P3 (documented, no fix this round): - Spec → Test → Code Protocol updated to reference the 5-reviewer cycle. - "Delegation Protocol" updated to mention the 5-reviewer cycle is delegated too — but only the CTO runs it. - Loop step 8 uses → consistently.
…ementer P1 fixes (must fix before merge): - Stop conditions split: P2 and P3 are now distinct bullets. P2 explicitly requires re-triggering the full 5-reviewer cycle; P3 does not. The previous grouping "P2/P3 NEEDS WORK → fix in this PR; merge when 5/5 PG or only P3 noise remains" hid the re-trigger requirement for P2 and contradicted the P2 definition. Applied in AGENTS.md and project-second-opinions SKILL. - Backup implementer rotation: clarified that the rotated reviewer is REMOVED from the 5-reviewer cycle for that PR (so the implementer is not also reviewing their own work) and the 5-reviewer set is filled by substituting from the ad-hoc set (codex/gemini/claude/kimi). Applied in AGENTS.md and project-delegation SKILL. P2 fixes (fixed in this PR): - "Reviewer unavailability" paragraph added to AGENTS.md and project-second-opinions SKILL: a single unavailable reviewer is substituted from the ad-hoc set with a SOW ## Reviews log entry; two or more simultaneous unavailability escalates as a hard stall. - Recording Reviews section in spec/second-opinions.md rewritten to match the SKILL workflow: reviewer attribution + vote, claim verification verdict, fix applied, final PG count. No raw prompts or reviewer output in the SOW. - "the user" -> "the operator" in spec/second-opinions.md. - "Recommended" section in spec/second-opinions.md reframed as "signals that the change is non-trivial and therefore mandatory under the 5-reviewer cycle" (was blurring mandatory/optional). - Discipline checklist in spec/workflow.md updated to "5/5 PRODUCTION GRADE, or only P3 noise with documented disposition". - Ad-hoc reviewer set paragraph in project-second-opinions SKILL simplified (removed the self-contradictory "reserved but CTO may still invoke" wording). - qwen model bumped to llm-netdata-cloud/qwen3.7-plus (matches workstation litellm config; the "always pin to latest stable" policy applies). Applied in spec/second-opinions.md and project-second-opinions SKILL. P3 polish (fixed in this PR): - project-workflow SKILL roles table now includes the implementer model version and the reviewer model versions. - AGENTS.md Hard Rule #11 wording cleaned up: "minimax" no longer appears twice in the reviewer list; uses "minimax implementer" / "minimax reviewer" labels. No new regressions. No production code changed.
There was a problem hiding this comment.
2 issues found across 19 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="internal/ingest/sow55_characterization_writer_test.go">
<violation number="1" location="internal/ingest/sow55_characterization_writer_test.go:107">
P2: The JSON path used to assert removal of `attr.x` is incorrect, so the test can pass even if `attr.x` was not removed.</violation>
</file>
<file name="internal/ingest/sow55_characterization_resolver_test.go">
<violation number="1" location="internal/ingest/sow55_characterization_resolver_test.go:96">
P2: Rollback assertion for `root_session_id` is under-specified and can false-pass on incorrect values.</violation>
</file>
Tip: cubic can generate docs of your entire codebase and keep them up to date. Try it here.
Re-trigger cubic
| if gotCN != "parent:agent:xyz" { | ||
| t.Errorf("childNativeId stash = %q, want %q (must survive nil-extras re-emit)", gotCN, "parent:agent:xyz") | ||
| } | ||
| gotAttr := scanString(t, db, `SELECT IFNULL(json_extract(extras_json,'$.attr.x'),'') FROM ops WHERE id=?`, opID) |
There was a problem hiding this comment.
P2: The JSON path used to assert removal of attr.x is incorrect, so the test can pass even if attr.x was not removed.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At internal/ingest/sow55_characterization_writer_test.go, line 107:
<comment>The JSON path used to assert removal of `attr.x` is incorrect, so the test can pass even if `attr.x` was not removed.</comment>
<file context>
@@ -0,0 +1,153 @@
+ if gotCN != "parent:agent:xyz" {
+ t.Errorf("childNativeId stash = %q, want %q (must survive nil-extras re-emit)", gotCN, "parent:agent:xyz")
+ }
+ gotAttr := scanString(t, db, `SELECT IFNULL(json_extract(extras_json,'$.attr.x'),'') FROM ops WHERE id=?`, opID)
+ if gotAttr != "" {
+ t.Errorf("attr.x present after nil-extras re-emit = %q, want empty (graft must drop non-aiViewer keys)", gotAttr)
</file context>
| gotAttr := scanString(t, db, `SELECT IFNULL(json_extract(extras_json,'$.attr.x'),'') FROM ops WHERE id=?`, opID) | |
| gotAttr := scanString(t, db, `SELECT IFNULL(json_extract(extras_json,'$."attr.x"'),'') FROM ops WHERE id=?`, opID) |
| if got := scanString(t, db, `SELECT IFNULL(parent_session_id,'') FROM sessions WHERE id=?`, childID); got != "" { | ||
| t.Errorf("child parent_session_id = %q, want empty (linkage must roll back on notify failure)", got) | ||
| } | ||
| if got := scanString(t, db, `SELECT IFNULL(root_session_id,'') FROM sessions WHERE id=?`, childID); got == parentID { |
There was a problem hiding this comment.
P2: Rollback assertion for root_session_id is under-specified and can false-pass on incorrect values.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At internal/ingest/sow55_characterization_resolver_test.go, line 96:
<comment>Rollback assertion for `root_session_id` is under-specified and can false-pass on incorrect values.</comment>
<file context>
@@ -0,0 +1,130 @@
+ if got := scanString(t, db, `SELECT IFNULL(parent_session_id,'') FROM sessions WHERE id=?`, childID); got != "" {
+ t.Errorf("child parent_session_id = %q, want empty (linkage must roll back on notify failure)", got)
+ }
+ if got := scanString(t, db, `SELECT IFNULL(root_session_id,'') FROM sessions WHERE id=?`, childID); got == parentID {
+ t.Errorf("child root_session_id = %q (= parent), want still self (linkage rolled back)", got)
+ }
</file context>
| if got := scanString(t, db, `SELECT IFNULL(root_session_id,'') FROM sessions WHERE id=?`, childID); got == parentID { | |
| if got := scanString(t, db, `SELECT IFNULL(root_session_id,'') FROM sessions WHERE id=?`, childID); got != childID { |
Summary
Encode the operator's new operating model in the project contract:
Why
minimaxinstance that writes code is not the same instance that reviews it.AGENTS.mdand the runtime enforcement is in the three skills.Out of scope
How the new protocol works (CTO/integration check)
After this lands, every non-trivial SOW follows this loop:
minimax(implementer)minimaxwrites code, runs local gates, returns diffReviewer will be invoked on this PR per the new protocol.
Summary by cubic
Adopts the Production-Grade Loop (implementer
minimaxwith a 5-reviewer cycle), pins exact model paths, and completes the ingest write-model cleanup with behavior-preserving refactors, stronger protocol rules (P2 vs P3, backup implementer rotation, reviewer availability), and new characterization tests.Refactors
internal/ingest/writer.go(event dispatcher),catalog_migrate.go(table-scoped updates),resolver.go(named transactional steps),rollup_refresh.go(shared bucket pass), androllup_backfill.go(prep/run/log split) without changing behavior.done/; opened SOW-0060 to track splitting the large writer file. No schema changes.Migration
minimax; reviewers areglm,mimo,minimax(fresh),qwen,deepseek. CTO runs reviewers; merge only on 5/5 PRODUCTION GRADE with gates/CI green. Protocol lives inAGENTS.mdand is enforced byproject-workflow,project-delegation, andproject-second-opinions.codex,gemini,claude,kimi); single reviewer unavailability substitutes from ad-hoc (two+ is a hard stall); SOW review recording clarified; “user” → “operator”; discipline checklist and “Recommended” signals updated.minimax=llm-netdata-cloud/minimax-m3-coder;deepseek=llm-netdata-cloud/deepseek-v4-pro;qwen=llm-netdata-cloud/qwen3.7-plus. Roles table lists versions; Hard Rule SOW-0001 Chunk 10: pricing data + temporal tiers + refresh script #11 wording polished.project-second-opinionsis a required first check; contract/process changes also run the loop.Written for commit 3e74b2d. Summary will update on new commits.