@simbo1905 simbo1905 commented Sep 15, 2025

Summary

This PR implements comprehensive metrics collection in JsonSchemaCheckIT to provide defensible, repeatable compatibility statistics instead of estimated percentages.

Changes

  • Added SuiteMetrics class: Thread-safe metrics container with LongAdder counters
  • Enhanced test execution tracking: Now counts groups discovered, tests discovered, validations run, passed/failed
  • Categorized skip reasons:
    • unsupportedSchemaGroup: Whole groups skipped at compile time
    • testException: Individual tests that threw exceptions
    • lenientMismatch: Expected≠actual in lenient mode
  • Console summary: One-line output with detailed breakdown
  • Structured export: JSON/CSV output via -Djson.schema.metrics=json|csv
  • Per-file analysis: Detailed breakdown by test file
  • Updated documentation: Replaced estimated 71% with actual measured 63.3% compatibility
  • Fixed escaping bug: Proper JSON and CSV escaping for filenames with special characters
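As a rough illustration of the container described above, the sketch below pairs `LongAdder` counters with a concurrent per-file map. The `perFile` map and `FileCounters` type appear in the reviewed code further down; the individual counter field names and the `countersFor` helper are assumptions made for this example, not the PR's actual members.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Illustrative sketch only; counter field names are assumptions. */
final class SuiteMetricsSketch {
    // Suite-wide totals. LongAdder keeps contention low under parallel updates.
    final LongAdder groupsDiscovered = new LongAdder();
    final LongAdder testsDiscovered = new LongAdder();
    final LongAdder run = new LongAdder();
    final LongAdder passed = new LongAdder();
    final LongAdder failed = new LongAdder();

    // One counter per skip category named in the PR description.
    final LongAdder skippedUnsupported = new LongAdder();
    final LongAdder skippedException = new LongAdder();
    final LongAdder skippedLenientMismatch = new LongAdder();

    // Per-file breakdown keyed by file name.
    final ConcurrentHashMap<String, FileCounters> perFile = new ConcurrentHashMap<>();

    /** Counters tracked for a single test file. */
    static final class FileCounters {
        final LongAdder run = new LongAdder();
        final LongAdder passed = new LongAdder();
        final LongAdder failed = new LongAdder();
    }

    /** Returns the counters for a file, creating them atomically on first use. */
    FileCounters countersFor(String fileName) {
        return perFile.computeIfAbsent(fileName, k -> new FileCounters());
    }
}
```

Because `computeIfAbsent` is atomic on `ConcurrentHashMap`, two threads asking for the same file name receive the same `FileCounters` instance.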

Example Output

Console Summary (Lenient Mode)

JSON-SCHEMA SUITE (LENIENT): groups=420 testsScanned=1822 run=1657 passed=1153 failed=0 skipped={unsupported=70, exception=2, lenientMismatch=504}

Console Summary (Strict Mode)

JSON-SCHEMA SUITE (STRICT): groups=420 testsScanned=1822 run=1657 passed=1153 failed=504 skipped={unsupported=70, exception=2, lenientMismatch=0}
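The two summary lines differ only in where the 504 expected-versus-actual mismatches are counted: as `lenientMismatch` skips in lenient mode and as `failed` in strict mode. A minimal sketch of a formatter that reproduces that layout follows. The class name, method signature, and the routing rule are assumptions inferred from the two example lines, not the PR's code, and it assumes mismatches are the only source of failures.

```java
import java.util.Locale;

/** Illustrative sketch only; the routing rule is inferred from the example output. */
final class SummaryLineSketch {
    static String format(boolean strict, long groups, long testsScanned, long run,
                         long passed, long mismatches, long unsupported, long exceptions) {
        // Assumption: a mismatch is a failure in strict mode and a categorized skip otherwise.
        long failed = strict ? mismatches : 0;
        long lenientMismatch = strict ? 0 : mismatches;
        return String.format(Locale.ROOT,
            "JSON-SCHEMA SUITE (%s): groups=%d testsScanned=%d run=%d passed=%d failed=%d "
                + "skipped={unsupported=%d, exception=%d, lenientMismatch=%d}",
            strict ? "STRICT" : "LENIENT", groups, testsScanned, run, passed, failed,
            unsupported, exceptions, lenientMismatch);
    }
}
```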

Actual Measured Compatibility

63.3% (1,153 of 1,822 tests pass) - replacing the previous estimated claim of 71%

Test Coverage: 420 test groups, 1,657 validation attempts, 576 total skips categorized in lenient mode (70 unsupported + 2 exceptions + 504 lenient mismatches)
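The headline figures can be re-derived from the console summary counts. The sketch below is only an arithmetic check; the class and method names are hypothetical and do not come from the PR.

```java
import java.util.Locale;

/** Illustrative arithmetic check; names are hypothetical. */
final class CompatibilityMath {
    /** Pass rate as a percentage of all tests scanned, to one decimal place. */
    static String passRate(long passed, long testsScanned) {
        if (testsScanned <= 0) {
            throw new IllegalArgumentException("testsScanned must be positive");
        }
        return String.format(Locale.ROOT, "%.1f%%", 100.0 * passed / testsScanned);
    }

    /** Total skips across the three categories in the summary line. */
    static long totalSkips(long unsupported, long exceptions, long lenientMismatch) {
        return unsupported + exceptions + lenientMismatch;
    }
}
```

With the lenient-mode numbers, `passRate(1153, 1822)` gives `63.3%` and `totalSkips(70, 2, 504)` gives `576`.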

Bug Fix: Filename Escaping

Fixed a critical bug where filenames containing special characters (quotes, commas, newlines) would break the JSON and CSV output:

  • JSON escaping: Properly handles quotes, backslashes, control characters, and Unicode
  • CSV escaping: Correctly handles commas, quotes, and newlines by wrapping in quotes when needed
  • Prevents malformed output: Ensures downstream parsing works correctly
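The escaping rules listed above can be sketched as follows. This is a minimal illustration using the usual JSON string escapes and RFC 4180 style CSV quoting; the class and method names are assumptions, and the PR's actual implementation may differ in detail.

```java
/** Illustrative sketch only; not the PR's implementation. */
final class Escapes {
    /** Escapes a value for use inside a JSON string literal (without the surrounding quotes). */
    static String json(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 8);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters must be written as \\uXXXX escapes.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        // Non-ASCII characters are passed through unchanged.
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    /** Quotes a CSV field only when it contains a comma, quote, or line break. */
    static String csv(String s) {
        boolean needsQuotes = s.indexOf(',') >= 0 || s.indexOf('"') >= 0
            || s.indexOf('\n') >= 0 || s.indexOf('\r') >= 0;
        if (!needsQuotes) {
            return s;
        }
        // Embedded quotes are doubled inside a quoted field.
        return "\"" + s.replace("\"", "\"\"") + "\"";
    }
}
```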

Compatibility

  • Zero breaking changes: Existing behavior preserved
  • Thread-safe: Uses concurrent data structures for parallel execution
  • No new dependencies: Uses only JDK built-in classes
  • Backwards compatible: All existing test runs work exactly as before

Usage

# Default lenient mode with console metrics
mvn test -Dtest=JsonSchemaCheckIT

# Strict mode with console metrics
mvn test -Dtest=JsonSchemaCheckIT -Djson.schema.strict=true

# Export JSON metrics
mvn test -Dtest=JsonSchemaCheckIT -Djson.schema.metrics=json

# Export CSV metrics  
mvn test -Dtest=JsonSchemaCheckIT -Djson.schema.metrics=csv
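For orientation, the flags above are ordinary JVM system properties. A hedged sketch of how a test class might read them follows. The property names come from the commands above; the class name, the normalization, and treating an empty value as "no export" are assumptions for illustration (the last is consistent with the `METRICS_FMT.isEmpty()` check visible in the reviewed code).

```java
import java.util.Locale;

/** Illustrative sketch only; class name and normalization are assumptions. */
final class MetricsConfigSketch {
    /** True when -Djson.schema.strict=true is set; defaults to lenient mode. */
    static boolean strict() {
        return Boolean.getBoolean("json.schema.strict");
    }

    /** Returns "json", "csv", or another trimmed lower-case value; empty means no export. */
    static String metricsFormat() {
        return System.getProperty("json.schema.metrics", "").trim().toLowerCase(Locale.ROOT);
    }
}
```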

Documentation Updates

  • Updated all README files to reflect actual measured compatibility (63.3% vs estimated 71%)
  • Added comprehensive metrics reporting documentation
  • Documented JSON/CSV export functionality
  • Provided clear usage examples for metrics collection

Impact

This provides the defensible metrics needed to support compatibility claims:

  • Actual measurements: No more estimated percentages
  • Detailed categorization: Understand why tests are skipped
  • Reproducible results: Same metrics every run
  • Tool-friendly: JSON/CSV for CI integration and reporting
  • Honest reporting: Accurate 63.3% instead of optimistic 71% estimate
  • Robust output: Proper escaping prevents parsing failures

Fixes #31

…tibility statistics

- Add comprehensive SuiteMetrics class with thread-safe counters
- Track groups discovered, tests discovered, validations run, passed/failed
- Categorize skips: unsupportedSchemaGroup, testException, lenientMismatch
- Add console summary line with detailed metrics breakdown
- Support JSON/CSV export via -Djson.schema.metrics=json|csv
- Add per-file breakdown for detailed analysis
- Preserve existing strict/lenient behavior while adding metrics
- Zero additional dependencies, thread-safe implementation

Fixes #31
…maCheckIT

- Replace estimated 71% compatibility with actual measured 63.3% (1,153 of 1,822 tests)
- Add comprehensive metrics reporting documentation
- Document test coverage: 420 groups, 1,657 validations, 576 skips categorized
- Add usage examples for JSON/CSV metrics export
- Clarify distinction between lenient and strict mode results
- Provide defensible statistics based on actual test suite measurements

The documentation now reflects the accurate, measured compatibility
statistics provided by the new metrics system rather than estimates.
@simbo1905 simbo1905 requested a review from Copilot September 15, 2025 05:51

Copilot AI left a comment

Pull Request Overview

This PR implements comprehensive metrics collection in JsonSchemaCheckIT to provide defensible, repeatable compatibility statistics instead of estimated percentages.

  • Added SuiteMetrics class with thread-safe counters for groups, tests, validations, and skip reasons
  • Enhanced test execution tracking to count discovered vs run tests and categorize skips (unsupported groups, exceptions, lenient mismatches)
  • Implemented console summary and structured export functionality (JSON/CSV) with per-file breakdown
  • Updated documentation to replace estimated 71% compatibility with actual measured 63.3%

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| JsonSchemaCheckIT.java | Added comprehensive metrics collection, SuiteMetrics class, and export functionality |
| README.md | Updated compatibility claims from estimated 71% to measured 63.3% with metrics details |
| AGENTS.md | Added metrics reporting documentation and current compatibility statistics |
| README.md (root) | Updated JSON Schema compatibility section with measured statistics and metrics usage |

Comment on lines +143 to +145
private static SuiteMetrics.FileCounters perFile(Path file) {
    return METRICS.perFile.computeIfAbsent(file.getFileName().toString(), k -> new SuiteMetrics.FileCounters());
}

Copilot AI Sep 15, 2025

[nitpick] The method name 'perFile' is unclear and doesn't indicate it's a getter/accessor. Consider renaming to 'getOrCreateFileCounters' to better express its purpose.

);

if (!METRICS_FMT.isEmpty()) {
    var outDir = java.nio.file.Path.of("target");

Copilot AI Sep 15, 2025

[nitpick] Hardcoded 'target' directory path reduces flexibility. Consider using a system property or making it configurable, especially since other paths in the codebase may vary across different build environments.

Suggested change
- var outDir = java.nio.file.Path.of("target");
+ var outDir = java.nio.file.Path.of(System.getProperty("test.output.dir", "target"));

Comment on lines +179 to +181
private static String buildJsonSummary(boolean strict, String timestamp) {
    var totals = new StringBuilder();
    totals.append("{\n");

Copilot AI Sep 15, 2025

[nitpick] Manual JSON construction using StringBuilder is error-prone and hard to maintain. Consider using the existing ObjectMapper (MAPPER) to serialize a data structure to JSON, which would be more robust and less prone to formatting errors.

/**
 * Thread-safe metrics container for the JSON Schema Test Suite run.
 */
final class SuiteMetrics {

Copilot AI Sep 15, 2025

[nitpick] The SuiteMetrics class is missing comprehensive documentation. Add class-level Javadoc explaining its purpose, thread-safety guarantees, and the meaning of different counter categories to help future maintainers understand the metrics collection design.

@simbo1905 simbo1905 merged commit 65da474 into tests/openrpc-it Sep 15, 2025
3 checks passed