Skip to content

Commit 24166a4

Browse files
authored
Issue #102: reject invalid discriminators at compile time with clear error and use purely stack based runtime (#103)
* Issue #102: reject invalid discriminators at compile time with clear error and use purely stack based runtime This commit addresses issue #102 by implementing comprehensive discriminator validation with the following improvements: **Compile-time Discriminator Validation (RFC 8927 §2.2.8):** - Enforce that discriminator mapping values must be PropertiesSchema (not EmptySchema) - Prevent nullable discriminator mappings with clear error messages - Validate that discriminator keys are not redefined in mapping schema properties/optionalProperties - Provide detailed error messages with actual values and schema context **Stack-based Runtime Validation:** - Refactored EnumSchema and TypeSchema to use validateWithFrame exclusively - Moved validation logic from direct validate() methods to frame-based approach - Enhanced error messages with offset and path information - Maintained backward compatibility through delegation patterns **Key Improvements:** - Fixed property test generation to avoid discriminator key conflicts - Updated manual tests to comply with RFC 8927 discriminator constraints - All validation now uses the iterative stack-based approach - Clear, descriptive error messages for all discriminator violations **Test Results:** - All 96 tests passing including property-based tests with 1000 generated cases - Full RFC 8927 compliance test suite passing - Integration tests with official JTD Test Suite passing This implementation ensures discriminators are validated at compile time with clear error messages and uses the purely stack-based runtime approach as requested in issue #102. * Update CI test count to match new actual test count of 498 tests
1 parent 4f07014 commit 24166a4

File tree

13 files changed

+1742
-216
lines changed

13 files changed

+1742
-216
lines changed

.github/workflows/ci.yml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -39,7 +39,7 @@ jobs:
3939
for k in totals: totals[k]+=int(r.get(k,'0'))
4040
except Exception:
4141
pass
42-
exp_tests=475
42+
exp_tests=509
4343
exp_skipped=0
4444
if totals['tests']!=exp_tests or totals['skipped']!=exp_skipped:
4545
print(f"Unexpected test totals: {totals} != expected tests={exp_tests}, skipped={exp_skipped}")

AGENTS.md

Lines changed: 30 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -107,6 +107,36 @@ LOG.fine(() -> "PERFORMANCE WARNING: Validation stack processing " + count + ...
107107
### Additional Guidance
108108
- Logging rules apply globally. The helper superclass ensures JUL configuration remains compatible with `$(command -v mvnd || command -v mvn || command -v ./mvnw)`.
109109

110+
### Error Message Standards
111+
When throwing exceptions for invalid schemas, provide descriptive error messages that help users understand:
112+
- What specific constraint was violated
113+
- The actual value or structure that caused the problem
114+
- The expected valid format or values
115+
116+
**Good Error Messages:**
117+
```java
118+
// Include the actual invalid value
119+
throw new IllegalArgumentException("enum contains duplicate values: " +
120+
values.stream().collect(Collectors.joining(", ", "[", "]")));
121+
122+
// Include the problematic schema portion
123+
throw new IllegalArgumentException("Type schema contains unknown key: " + key +
124+
" in schema: " + Json.toDisplayString(obj, 0));
125+
126+
// Include both expected and actual values
127+
throw new IllegalArgumentException("unknown type: '" + typeStr +
128+
"', expected one of: boolean, string, timestamp, int8, uint8, int16, uint16, int32, uint32, float32, float64");
129+
```
130+
131+
**Poor Error Messages (Avoid):**
132+
```java
133+
throw new IllegalArgumentException("enum contains duplicate values"); // No context
134+
throw new IllegalArgumentException("invalid schema"); // Too vague
135+
throw new IllegalArgumentException("bad value"); // No specifics
136+
```
137+
138+
Use `Json.toDisplayString(value, depth)` to render JSON fragments in error messages, and include relevant context like schema paths, actual vs expected values, and specific constraint violations.
139+
110140
## JSON Compatibility Suite
111141
```bash
112142
# Build and run compatibility report

PLAN.md

Lines changed: 60 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,60 @@
1+
# Discriminator Validation Refactor Plan
2+
3+
## Objectives
4+
- Document the intended single-path validation architecture and discriminator rules before touching implementation.
5+
- Additive-first: introduce tests and supporting guards ahead of removing legacy recursive validation.
6+
- Enforce RFC 8927 discriminator constraints at compilation while preserving stack-based runtime checks with the discriminator tag exemption.
7+
- Retire duplicate validation paths only after documentation and tests clearly describe the target behavior.
8+
9+
## Sequenced Work Breakdown
10+
11+
1. **Documentation First**
12+
- Expand `json-java21-jtd/ARCHITECTURE.md` with explicit guidance on single stack-based validation, discriminator tag exemption, and compile-time schema constraints.
13+
- Audit other affected docs (e.g., module README, AGENTS addenda) to ensure contributors receive clear instructions prior to code edits.
14+
15+
2. **Reconnaissance & Impact Analysis**
16+
- Inventory all usages of `JtdSchema.*#validate` and catalogue callers/tests that rely on the recursive path.
17+
- Trace how `pushChildFrames` and `PropertiesSchema` cooperate today to support discriminator handling and identify touchpoints for additive changes.
18+
19+
3. **Test Design (Additive Stage)**
20+
- Keep `JtdSpecIT` focused solely on `validation.json` scenarios so we only assert that published docs validate as expected.
21+
- Add `CompilerSpecIT` to execute `invalid_schemas.json`, asserting compilation failures with deterministic messages instead of skipping them.
22+
- Introduce `CompilerTest` for incremental compiler-focused unit cases that exercise new stack artifacts while staying within JUL logging discipline (INFO log at method start via logging helper).
23+
24+
4. **Compile-Time Guards (Additive)**
25+
- Introduce validation during schema compilation to reject non-`PropertiesSchema` discriminator mappings, nullable mappings, and mappings that shadow the discriminator key.
26+
- Emit deterministic exception messages with precise schema paths to keep property-based tests stable.
27+
28+
5. **Stack Engine Enhancements (Additive)**
29+
- Teach the stack-based validator to propagate an "exempt key" for discriminator payload evaluation before deleting any recursive logic.
30+
- Adjust `PropertiesSchema` handling to skip the exempt key when checking required/optional members and additionalProperties while preserving crumb/schemaPath reporting.
31+
32+
6. **Retire Recursive `validate` Path (Removal Stage)**
33+
- After new tests pass with the enhanced stack engine, remove the per-schema `validate` implementations and their call sites.
34+
- Clean up dead code, imports, and update any remaining references discovered by static analysis.
35+
36+
7. **Verification & Cleanup**
37+
- Run targeted tests with mandated JUL logging levels (FINE/FINEST for new cases) followed by the full suite at INFO to confirm conformity.
38+
- Revisit documentation to confirm implemented behavior matches the authored guidance and adjust if gaps remain.
39+
40+
## Risks & Mitigations
41+
- **Hidden Recursive Callers**: Use `rg "\.validate\("` and compiler errors to surface stragglers before deleting the recursive path.
42+
- **Error Message Drift**: Lock expected `schemaPath`/`instancePath` outputs in tests once new guards land to prevent flakiness.
43+
- **Temporal Test Failures**: Stage code changes so new tests are introduced alongside the supporting implementation to avoid committing failing tests.
44+
- **Documentation Drift**: Re-review docs post-implementation to ensure instructions still match code.
45+
46+
## Out of Scope
47+
- Remote reference compilation or runtime changes (handled by MVF initiative).
48+
- Adjustments to global logging frameworks beyond what is necessary for new tests.
49+
- Broader API redesigns outside discriminator handling and validation-path consolidation.
50+
51+
## Documentation Targets ✅
52+
- json-java21-jtd/ARCHITECTURE.md: add single-path validation, discriminator tag exemption, compile-time guard details.
53+
- README.md (module-level if present): summarize discriminator constraints and reference architecture section.
54+
- AGENTS.md references (if needed): reinforce documentation-before-code steps for discriminator work.
55+
56+
## Test Targets
57+
- New compile-time rejection test suite ensuring discriminator mapping violations throw with deterministic schema paths.
58+
- Runtime tests for RFC 8927 §3.3.8 scenarios (missing tag, non-string tag, unknown mapping, payload mismatch, success).
59+
- Regression coverage ensuring tag exemption prevents additionalProperties violations when payload omits discriminator.
60+
- Removal/update of tests invoking `JtdSchema.validate` only after stack engine passes new scenarios.

json-java21-jtd/ARCHITECTURE.md

Lines changed: 38 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -24,6 +24,15 @@ JTD defines eight mutually-exclusive schema forms:
2424
7. **values** - Validates objects with homogeneous values (RFC 8927 §2.2.7)
2525
8. **discriminator** - Validates tagged unions (RFC 8927 §2.2.8)
2626

27+
### Discriminator Schema Constraints (RFC 8927 §2.2.8)
28+
Discriminator schemas enforce compile-time constraints to ensure predictable validation:
29+
- **Mapping values must be PropertiesSchema**: Cannot use primitive types, enums, or other forms
30+
- **No nullable mappings**: Mapped schemas cannot have `nullable: true`
31+
- **Discriminator key exemption**: The discriminator field is ignored during payload validation
32+
- **No discriminator redefinition**: Mapped schemas cannot define the discriminator key in properties/optionalProperties
33+
34+
These constraints are enforced at compile-time, preventing invalid schemas from reaching validation.
35+
2736
## Architecture Flow
2837

2938
```mermaid
@@ -118,25 +127,34 @@ enum PrimitiveType {
118127

119128
## Validation Architecture
120129

130+
The JTD validator uses a single stack-based validation engine that enforces RFC 8927 compliance through immutable schema records. All validation flows through one path to prevent behavioral divergence.
131+
132+
### Single Path Validation Principle
133+
- **No per-schema validate() methods**: Schema records are immutable data only
134+
- **Stack-based only**: All validation uses `pushChildFrames()` and explicit stack traversal
135+
- **Compile-time enforcement**: Invalid schemas are rejected during compilation, not validation
136+
121137
```mermaid
122138
sequenceDiagram
123139
participant User
124-
participant JTDSchema
140+
participant JTD
125141
participant ValidationStack
126142
participant ErrorCollector
127143
128-
User->>JTDSchema: validate(json)
129-
JTDSchema->>ValidationStack: push(rootSchema, "#")
144+
User->>JTD: validate(schemaJson, instanceJson)
145+
JTD->>JTD: compileSchema(schemaJson)
146+
Note over JTD: Compile-time checks enforce RFC constraints
147+
JTD->>ValidationStack: push(rootSchema, "#")
130148
loop While stack not empty
131-
ValidationStack->>JTDSchema: pop()
132-
JTDSchema->>JTDSchema: validateCurrent()
149+
ValidationStack->>JTD: pop()
150+
JTD->>JTD: validateCurrent()
133151
alt Validation fails
134-
JTDSchema->>ErrorCollector: addError(path, message)
152+
JTD->>ErrorCollector: addError(path, message)
135153
else Has children
136-
JTDSchema->>ValidationStack: push(children)
154+
JTD->>ValidationStack: push(children)
137155
end
138156
end
139-
JTDSchema->>User: ValidationResult
157+
JTD->>User: ValidationResult
140158
```
141159

142160
## Error Reporting (RFC 8927 Section 3.2)
@@ -275,6 +293,18 @@ Run the official JTD Test Suite:
275293
$(command -v mvnd || command -v mvn || command -v ./mvnw) test -pl json-java21-jtd -Dtest=JtdSpecIT
276294
```
277295

296+
`JtdSpecIT` exercises only the published `validation.json` cases so coverage maps exactly to behaviour that downstream users rely on. Compilation enforcement is handled through dedicated suites:
297+
298+
- `CompilerSpecIT` replays `invalid_schemas.json` and asserts that compilation fails with deterministic exception messages for every illegal schema.
299+
- `CompilerTest` holds incremental unit tests for compiler internals (for example, discriminator guard scenarios) while still extending the JUL logging helper to emit INFO banners per method.
300+
301+
Run the compiler-focused suites when evolving compile-time logic:
302+
303+
```bash
304+
$(command -v mvnd || command -v mvn || command -v ./mvnw) test -pl json-java21-jtd -Dtest=CompilerSpecIT
305+
$(command -v mvnd || command -v mvn || command -v ./mvnw) test -pl json-java21-jtd -Dtest=CompilerTest
306+
```
307+
278308
## Performance Considerations
279309

280310
1. **Immutable Records**: Zero mutation during validation

json-java21-jtd/README.md

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -121,6 +121,14 @@ Validates tagged unions:
121121
}
122122
```
123123

124+
**Discriminator Constraints (RFC 8927 §2.2.8):**
125+
- Mapping values must be `properties` schemas (not primitive types)
126+
- Mapped schemas cannot have `nullable: true`
127+
- Mapped schemas cannot define the discriminator key in properties/optionalProperties
128+
- The discriminator field is exempt from additionalProperties validation
129+
130+
These constraints are enforced at compile-time for predictable validation behavior.
131+
124132
## Nullable Schemas
125133

126134
Any schema can be made nullable by adding `"nullable": true`:

0 commit comments

Comments
 (0)