Providing sufficient baseline rules and identifying appropriate special skills #5

hannah-gorman-nycha · 2025-12-08T15:00:36Z


Good morning, I hope you are well. I am interested in applying a SpecOps approach at my agency. In reviewing the source files, I noticed that each instruction set included initial documentation of key rules and edge cases, almost like "seed" documentation to give the model sufficient information to interpret the code correctly.

A few questions about this:

How did you identify what level of seed documentation was necessary for the model to identify the relevant rules?
How did you determine which rule set types needed to be defined as special skills?
Do you have any concerns about the model "missing" content in code because it is not related to the initial rules you provided as direction? I'm wondering if we might need to think about a spec-ops layer that first identifies what kinds of business information are available in legacy code, and then confirms the proper skill to use to review.

Thanks for your time!

mheadd · 2026-04-06T14:16:16Z

Maintainer

Hannah, first let me apologize for the criminally late response to your post. These are excellent questions that get to the heart of some of the trickier design decisions in SpecOps. Let me address each one:

On identifying the right level of seed documentation

The short answer is: I probably didn't get it right up front in developing this approach, and you shouldn't expect to either. The methodology deliberately frames instruction sets as starting at Level 1 (basic guidance sufficient to begin) and evolving through practice. What guides the initial scope is a principle from empirical benchmarking: focused instruction sets with 2–3 targeted modules consistently outperform broad, comprehensive documentation packages. More content is not better content — overly large instruction sets can actually degrade agent performance compared to concise ones.

In practice, this means starting with enough guidance to get legible output, then iterating. When the agent consistently misses a pattern or produces malformed specs, that's a signal to add a specific rule. When an instruction is routinely ignored or leads to hallucinated structure, that's a signal to remove it. The path from rough to refined is as much about editing out what doesn't help as adding what does.
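One way to keep that iteration honest is to log review notes and tally them before editing the instruction set. The following is a minimal sketch, not part of SpecOps: the observation kinds, the `suggest_changes` function, and the `threshold` cutoff for "consistently" are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical observation kinds recorded by a reviewer reading agent output.
ADD_SIGNALS = {"missed_pattern", "malformed_spec"}
REMOVE_SIGNALS = {"ignored_instruction", "hallucinated_structure"}


def suggest_changes(observations, threshold=3):
    """Tally (kind, subject) review notes and suggest instruction-set edits.

    `threshold` is an assumed cutoff for "consistently"; tune it to the
    number of runs you actually review.
    """
    add_counts = Counter()
    remove_counts = Counter()
    for kind, subject in observations:
        if kind in ADD_SIGNALS:
            add_counts[subject] += 1
        elif kind in REMOVE_SIGNALS:
            remove_counts[subject] += 1
    return {
        # Subjects the agent keeps getting wrong: candidates for a new rule.
        "add_rule_for": sorted(s for s, n in add_counts.items() if n >= threshold),
        # Instructions that keep being ignored or misapplied: candidates to cut.
        "consider_removing": sorted(s for s, n in remove_counts.items() if n >= threshold),
    }


notes = (
    [("missed_pattern", "PERFORM THRU ranges")] * 3
    + [("ignored_instruction", "rule-7")] * 2
    + [("hallucinated_structure", "rule-7")]
    + [("malformed_spec", "date fields")]
)
suggestions = suggest_changes(notes)
```

The output is only a prompt for a human editor; a subject that appears once (like "date fields" here) stays below the cutoff and is not acted on.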

On determining which rule sets become dedicated skills vs. inline guidance

The test to apply: Is this procedural knowledge the model is unlikely to already have? General programming patterns (what a loop is, how functions work) don't need instruction sets — models know that. But something like how a state's categorical eligibility rules interact with federal SNAP guidelines, or how COBOL PERFORM THRU clauses should be documented — that's specialized procedural knowledge that won't be in training data. That's where a dedicated skill earns its overhead.

A secondary filter: Can the instruction set be scoped to a specific, repeatable class of tasks? "Understanding COBOL" is too broad. "Extracting business rules from COBOL conditional logic" is appropriately narrow. The narrower the task alignment, the more effective the skill.

On the concern about missing business information — this is a real gap

Your instinct here is well-founded, and it points to something I think of as a discovery-layer or routing problem. Phase 1 of the methodology (Discovery and Assessment) includes a knowledge assessment step specifically to surface what kinds of business logic and edge cases are likely present in a system before the AI agent starts generating specs. But you're right that this is largely a human-driven activity today — domain experts reviewing the system inventory, not an automated classification pass.

The idea you're describing, a SpecOps layer that first characterizes what types of business information are present in legacy code and then selects or confirms the appropriate skill, is a compelling direction. Think of it as a "skill router" or meta-instruction-set. I've thought about this as a future pattern, but it hasn't been prototyped in a reference implementation yet. If you or others at your agency are in a position to explore this, it would be a genuinely valuable contribution to the community. The underlying research (the SkillsBench study referenced in INSTRUCTION-SETS.md) supports the idea that skill selection matters significantly: models benefit from targeted, well-matched instruction sets, not broad coverage.
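To make the "skill router" idea concrete, here is a rough sketch of a first classification pass. Nothing in it comes from a SpecOps reference implementation: the skill names, the regular expressions, and the `route_skills` function are hypothetical, and a real router would likely need a proper parser or a model-based classifier instead of keyword matching. The point it illustrates is that lines no skill claims are surfaced for human review instead of being silently dropped.

```python
import re
from dataclasses import dataclass

# Hypothetical skill names mapped to crude detection patterns (assumptions).
SKILL_PATTERNS = {
    "cobol-conditional-rules": re.compile(r"\b(IF|EVALUATE)\b", re.IGNORECASE),
    "cobol-perform-thru-flow": re.compile(
        r"\bPERFORM\s+[\w-]+\s+(THRU|THROUGH)\b", re.IGNORECASE
    ),
    "cobol-file-io": re.compile(r"\b(READ|WRITE|REWRITE)\b", re.IGNORECASE),
}


@dataclass
class RoutingResult:
    matched: dict    # skill name -> number of source lines it matched
    unmatched: list  # non-blank lines that no skill pattern claimed


def route_skills(source: str) -> RoutingResult:
    """Characterize legacy source line by line and propose skills to apply."""
    matched = {}
    unmatched = []
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        hits = [name for name, pat in SKILL_PATTERNS.items() if pat.search(stripped)]
        for name in hits:
            matched[name] = matched.get(name, 0) + 1
        if not hits:
            # Nothing recognized this line: flag it so a person can decide
            # whether it hides business logic that needs a new skill.
            unmatched.append(stripped)
    return RoutingResult(matched, unmatched)


sample = """
IF WS-INCOME > WS-LIMIT
    PERFORM 2000-CALC THRU 2000-EXIT
COMPUTE WS-BENEFIT = WS-BASE * WS-RATE
"""
result = route_skills(sample)
```

In this sample the `COMPUTE` line lands in `result.unmatched`, which is the "missing content" case from the question: the router cannot name a skill for it, so it asks for a human look.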

Happy to discuss further if you like. What type of system are you looking to apply this to?
