Add agentic fuzzer to CI as regression gate #11

Description

@PLNech

Context

The fuzzer (scripts/fuzz-rtk.py) is currently run manually. Since the agentic fuzzing lab week experiment, we have 139 static tests across 35 families that catch real regressions.

Proposal

Add a CI job that runs python3 scripts/fuzz-rtk.py --rounds 0 (static tests only, no LLM) on every PR. The job fails if the FAIL count exceeds a threshold (currently 22; all 22 failures are classified as by-design or known limitations).
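The gate itself could be a small wrapper around the fuzzer's summary. This is a minimal sketch: it assumes the summary can be read as text in the same shape as the baseline line in this issue (e.g. "95 PASS, 12 WARN, 22 FAIL, 10 SKIP"), which may not match what fuzz-rtk.py actually prints. The names parse_counts and gate are hypothetical.

```python
import re

# Assumed summary shape: "<n> PASS, <n> WARN, <n> FAIL, <n> SKIP".
# The real output of scripts/fuzz-rtk.py may differ; adjust the pattern accordingly.
SUMMARY_RE = re.compile(r"(\d+)\s+(PASS|WARN|FAIL|SKIP)")


def parse_counts(summary: str) -> dict[str, int]:
    """Extract per-status counts such as '22 FAIL' from a summary string."""
    return {status: int(n) for n, status in SUMMARY_RE.findall(summary)}


def gate(counts: dict[str, int], max_fail: int) -> int:
    """Return a process exit code: 0 if FAIL is within the threshold, 1 otherwise."""
    return 0 if counts.get("FAIL", 0) <= max_fail else 1
```

With the current baseline, gate(parse_counts("95 PASS, 12 WARN, 22 FAIL, 10 SKIP"), max_fail=22) returns 0, and a single new failure (23 FAIL) would return 1, so the CI step could pass that value to sys.exit.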

Requirements

  • Docker available in CI (for docker ps/images tests) or skip those families
  • Python 3.10+ for the fuzzer script
  • rtk binary built from PR branch
  • Threshold stored in a config file so it can be tightened over time
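Two of these requirements (the stored threshold and the Docker check) can be sketched with the standard library alone. The config file format, the max_fail key, and both function names below are assumptions for illustration; nothing in the repository defines them yet.

```python
import json
import shutil
from pathlib import Path


def load_max_fail(path, default=22):
    """Read the FAIL threshold from a JSON config file (hypothetical key: max_fail).

    Falls back to the current baseline of 22 if the file does not exist.
    """
    p = Path(path)
    if not p.exists():
        return default
    return int(json.loads(p.read_text()).get("max_fail", default))


def docker_available() -> bool:
    """Report whether a docker executable is on PATH.

    A CI job could use this to decide whether to skip the docker ps/images families.
    """
    return shutil.which("docker") is not None
```

Keeping the threshold in a file rather than in the CI definition means tightening it is a one-line change that shows up in review alongside the fix that made it possible.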

Current baseline

  • 139 tests, 95 PASS, 12 WARN, 22 FAIL, 10 SKIP
  • Failure rate: 15.8% (22 of 139 tests; all classified)
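The baseline figures are internally consistent, which can be checked directly. The dictionary below just restates the counts from this issue:

```python
# Counts copied from the baseline above.
baseline = {"PASS": 95, "WARN": 12, "FAIL": 22, "SKIP": 10}

total = sum(baseline.values())
fail_rate = 100 * baseline["FAIL"] / total

print(total, round(fail_rate, 1))  # 139 15.8
```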

Labels: enhancement (New feature or request)
