plan(seed-sentinel-security-eval): seed Cisco foundry-security-spec as Sentinel capability #188

Open
jankneumann wants to merge 4 commits into main from claude/cisco-foundry-spec-integration-WV05K

@jankneumann
Owner

Adapt the Cisco foundry-security-spec (agentic AI security evaluation) into an
OpenSpec seed change. All ~35 foundry clarification markers resolved up front.

  • constitution.md: 11 principles + Deviation D-1 (multi-vendor exception to
    single-provider reproducibility) mitigated by verdict-provenance
  • specs/sentinel-security-eval: 8 roles, finding lifecycle, 3-leg evidence gate,
    structure-based fingerprint, Validator-only exploited flag, coverage+yield
    auto-stop, auto-block, sandbox-by-infrastructure, CVSS-v4/CWE/needs-review policy
  • design.md: role->existing-capability binding table (Approach A), seed/roadmap
    boundary, deviation analysis, deferred-extension preconditions
  • proposal/tasks/work-packages/contracts stub; seed-only, no role logic

https://claude.ai/code/session_01VMF1MX95ryHATWjUpa9QMt

claude added 4 commits May 26, 2026 11:49
…s Sentinel capability

Adapt the Cisco foundry-security-spec (agentic AI security evaluation) into an
OpenSpec seed change. All ~35 foundry clarification markers resolved up front.

- constitution.md: 11 principles + Deviation D-1 (multi-vendor exception to
  single-provider reproducibility) mitigated by verdict-provenance
- specs/sentinel-security-eval: 8 roles, finding lifecycle, 3-leg evidence gate,
  structure-based fingerprint, Validator-only exploited flag, coverage+yield
  auto-stop, auto-block, sandbox-by-infrastructure, CVSS-v4/CWE/needs-review policy
- design.md: role->existing-capability binding table (Approach A), seed/roadmap
  boundary, deviation analysis, deferred-extension preconditions
- proposal/tasks/work-packages/contracts stub; seed-only, no role logic

https://claude.ai/code/session_01VMF1MX95ryHATWjUpa9QMt
…om project.md

Wire the Sentinel constitution + capability into openspec/project.md Domain
Context, and mark Phase 1-3 seed tasks complete.

https://claude.ai/code/session_01VMF1MX95ryHATWjUpa9QMt
…ementation

Decompose the seed into 19 prioritized, dependency-ordered candidates:
14 scheduled across 5 phases (foundation -> knowledge -> detection/triage ->
validation/reporting/coverage -> operability) + 5 deferred extension roles
recorded as BLOCKED with adopt-when preconditions. Each candidate binds to its
mapped existing capability per the seed's design.md D1. DAG validated acyclic.

https://claude.ai/code/session_01VMF1MX95ryHATWjUpa9QMt
Replace the "tolerated reproducibility liability" framing of Deviation D-1 with
a multi-vendor consensus mechanism: within-vendor consistency -> cross-vendor
calibration -> principled synthesis (confirmed/unconfirmed/disagreement),
reusing parallel-infrastructure's ConsensusSynthesizer. Governing rule: never
mix raw cross-vendor outputs on one scale.

- constitution.md D-1: 5 binding mitigations + calibration-quality residual risk
- spec: new "Multi-Vendor Verdict Consensus and Calibration" requirement (4
  scenarios); consensus-aware provenance; calibrated (not averaged) severity
- design.md: ConsensusSynthesizer binding row + rewritten D3 analysis
- roadmap: new sentinel-verdict-consensus item (P4); Reporter now depends on it

https://claude.ai/code/session_01VMF1MX95ryHATWjUpa9QMt

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ffc351d8f7

Comment on lines +171 to +172
A Sentinel finding SHALL progress through the states `candidate → verdict-assigned → confirmed → [validated] → published`. The five verdicts and their surfacing rules SHALL be: `true-positive` (surfaced), `false-positive` (internal), `needs-review` (surfaced to humans), `not-applicable` (internal), `code-quality` (internal) (foundry §7.2, FR-085–FR-093; Constitution II).

[P1] Add a publication path for needs-review findings

The lifecycle here requires every published finding to pass through `confirmed`, but `confirmed` is only reachable from `true-positive` (see the later scenario), while `needs-review` is explicitly required to be surfaced to humans elsewhere in this spec. In runs where evidence is incomplete, implementers following this state machine will either be forced to suppress `needs-review` items (missing required human review) or violate the lifecycle contract, so the state transitions need an explicit `needs-review` publication path.
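
The gap can be made concrete with a small reachability check. This is only a sketch: the transition table and the `CONFIRMABLE` guard are assumptions reconstructed from the quoted requirement and the comment above, not the spec's actual scenarios, and all names are hypothetical.

```python
from collections import deque

# Hypothetical transition table reconstructed from the quoted lifecycle
# `candidate -> verdict-assigned -> confirmed -> [validated] -> published`.
TRANSITIONS = {
    "candidate": {"verdict-assigned"},
    "verdict-assigned": {"confirmed"},
    "confirmed": {"validated", "published"},  # `validated` is optional
    "validated": {"published"},
    "published": set(),
}

# Assumption taken from the review comment: only true-positive may be confirmed.
CONFIRMABLE = {"true-positive"}
# Verdicts the quoted requirement says must be surfaced.
SURFACED = {"true-positive", "needs-review"}


def can_publish(verdict: str) -> bool:
    """Breadth-first search from `verdict-assigned` to `published`,
    skipping the `confirmed` edge for verdicts that may not take it."""
    queue = deque(["verdict-assigned"])
    seen = {"verdict-assigned"}
    while queue:
        state = queue.popleft()
        if state == "published":
            return True
        for nxt in TRANSITIONS[state]:
            if nxt == "confirmed" and verdict not in CONFIRMABLE:
                continue
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def unpublishable_surfaced_verdicts() -> set:
    """Surfaced verdicts that have no path to `published`."""
    return {v for v in SURFACED if not can_publish(v)}
```

Under these assumptions `unpublishable_surfaced_verdicts()` contains `needs-review`, which is the contradiction the comment describes. Adding an explicit edge for `needs-review` (for example directly from `verdict-assigned` to `published`) would empty the set.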

### Requirement: Multi-Vendor Verdict Consensus and Calibration

Sentinel SHALL combine per-vendor results into a verdict through principled synthesis rather than by placing raw outputs from different vendors on a shared scale (Deviation D-1; Constitution I, V). Each vendor SHALL apply the rubric uniformly so its own scale is self-consistent (within-vendor consistency). Before results from different vendors are combined, their scales SHALL be calibrated to a common reference using owned, versioned calibration configuration (cross-vendor calibration). Calibrated per-vendor results SHALL then be synthesized into a consensus verdict classified as `confirmed`, `unconfirmed`, or `disagreement` with each vendor's disposition recorded, reusing the `parallel-infrastructure` consensus substrate (`ConsensusSynthesizer`). The synthesized consensus verdict — not a lone vendor's — SHALL be what the Reporter publishes. Cross-vendor `disagreement` SHALL be surfaced for human attention rather than silently averaged.
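
As a rough illustration of the calibrate-then-synthesize order this requirement describes, the sketch below maps each vendor's raw score onto a common scale before any comparison. It does not use or model the real `ConsensusSynthesizer` API. The linear `Calibration`, the `threshold`, the `flag`/`clear` dispositions, and the rule that unanimous flags mean `confirmed` are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Calibration:
    """Hypothetical versioned linear map onto the common reference scale."""
    scale: float
    offset: float
    version: str

    def apply(self, raw: float) -> float:
        return self.scale * raw + self.offset


def synthesize(raw_scores: dict, calibrations: dict, threshold: float = 0.5) -> dict:
    """Calibrate each vendor's score, then classify the combined result."""
    if not raw_scores:
        raise ValueError("at least one vendor result is required")
    dispositions = {}
    for vendor, raw in raw_scores.items():
        # A vendor without calibration raises KeyError here, so raw
        # outputs are never compared on a shared scale.
        calibrated = calibrations[vendor].apply(raw)
        dispositions[vendor] = "flag" if calibrated >= threshold else "clear"
    distinct = set(dispositions.values())
    if distinct == {"flag"}:
        consensus = "confirmed"
    elif distinct == {"clear"}:
        consensus = "unconfirmed"
    else:
        consensus = "disagreement"  # surfaced for humans, never averaged
    return {
        "consensus": consensus,
        "dispositions": dispositions,  # each vendor's disposition is recorded
        "calibration_versions": {v: calibrations[v].version for v in raw_scores},
    }
```

For instance, with one vendor scoring on a 0-10 scale (`Calibration(0.1, 0.0, "v1")`) and another on 0-1, raw scores of `8.0` and `0.3` are not averaged; they calibrate to opposite dispositions and yield `disagreement`.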

[P1] Reconcile consensus verdict taxonomy with triage verdicts

This requirement says cross-vendor synthesis produces the verdicts `confirmed`/`unconfirmed`/`disagreement` and that this synthesized verdict is what the Reporter publishes, but other requirements define publication behavior and labels around `true-positive`/`needs-review`/etc. Without a normative mapping between these two verdict taxonomies, different implementations can publish incompatible states for the same finding, breaking dedup/comparison and operator workflows across runs.
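
One possible shape for such a mapping is a total function over both taxonomies, so every (triage verdict, consensus) pair resolves to exactly one publication outcome. The rules below are purely illustrative assumptions, not taken from the spec; the only constraints borrowed from the quoted text are the five triage verdicts, the three consensus classes, and that `disagreement` must be surfaced to humans.

```python
from itertools import product
from typing import Optional

TRIAGE = ("true-positive", "false-positive", "needs-review",
          "not-applicable", "code-quality")
CONSENSUS = ("confirmed", "unconfirmed", "disagreement")


def published_label(triage: str, consensus: str) -> Optional[str]:
    """Hypothetical mapping: the label to publish, or None to keep internal."""
    if triage not in TRIAGE or consensus not in CONSENSUS:
        raise ValueError(f"unknown pair: {triage!r}, {consensus!r}")
    if consensus == "disagreement":
        return "needs-review"  # disagreement is surfaced for human attention
    if triage == "true-positive" and consensus == "confirmed":
        return "true-positive"
    if triage in ("true-positive", "needs-review"):
        return "needs-review"  # surfaced, but not yet agreed across vendors
    return None  # internal verdicts stay internal


def mapping_table() -> dict:
    """Enumerate every pair to show the mapping is total (15 entries)."""
    return {pair: published_label(*pair) for pair in product(TRIAGE, CONSENSUS)}
```

Whatever rules the spec ends up choosing, stating them as a table over all fifteen pairs would make it checkable that two implementations publish the same state for the same finding.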
