Description
This proposal introduces a class-based system for JSON schema generation and validation in the Context Generator. Rather than duplicating information in separate schema files, we'll use PHP class definitions as the single source of truth, extracting schema information directly from class structure, type hints, and targeted attributes.
2. Design Goals
- Establish a single source of truth for both the PHP objects and the JSON schema
- Leverage PHP 8.3's native features (type hints, constructor property promotion, etc.) whenever possible
- Use attributes strategically only where native PHP features are insufficient
- Generate a comprehensive JSON schema for validation and IDE integration
- Enable automatic mapping between JSON and PHP objects
- Provide a consistent approach for all schema components (documents, sources, modifiers)
3. Key Components
3.1 Class Hierarchy
The schema structure will be represented by a hierarchy of PHP classes:
JsonSchema (root)
├── Documents[]
│   ├── Sources[]
│   │   ├── FileSource
│   │   ├── GithubSource
│   │   ├── GitDiffSource
│   │   ├── UrlSource
│   │   └── TextSource
│   └── Modifiers[]
│       ├── PhpSignatureModifier
│       ├── PhpContentFilterModifier
│       ├── PhpDocsModifier
│       └── SanitizerModifier
└── Settings
    └── ModifierAliases{}
3.2 Core Attributes
We'll define a set of attributes to enhance the schema information that can't be expressed through PHP's type system alone:
// Marks classes that should be included in schema generation
#[Attribute(Attribute::TARGET_CLASS)]
class SchemaType {
    public function __construct(
        public readonly string $name,
        public readonly string $description = '',
        public readonly bool $isRoot = false,
    ) {}
}

// Skip properties from schema generation
#[Attribute(Attribute::TARGET_PROPERTY)]
class Skip {}

// Override property details for schema
#[Attribute(Attribute::TARGET_PROPERTY)]
class Property {
    public function __construct(
        public readonly ?string $name = null,
        public readonly ?string $description = null,
        public readonly ?string $format = null,
    ) {}
}

// Mark property as required in schema
#[Attribute(Attribute::TARGET_PROPERTY)]
class Required {}

// Add pattern validation for string properties
#[Attribute(Attribute::TARGET_PROPERTY)]
class Pattern {
    public function __construct(
        public readonly string $pattern,
        public readonly ?string $description = null,
    ) {}
}

// Define enumerated values
#[Attribute(Attribute::TARGET_PROPERTY)]
class Enum {
    public function __construct(public readonly array $values) {}
}

// Define the structure of array items
#[Attribute(Attribute::TARGET_PROPERTY)]
class Items {
    public function __construct(
        public readonly string $type,
        public readonly ?string $ref = null,
        public readonly ?array $enum = null,
    ) {}
}

// Define a property as reference to another schema type
#[Attribute(Attribute::TARGET_PROPERTY)]
class Reference {
    public function __construct(
        public readonly string $type,
    ) {}
}
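A minimal sketch of how these attributes could be read back at generation time, assuming promoted constructor properties are inspected through ReflectionProperty. The `Demo` class and the `readAttribute()` helper are hypothetical, and trimmed copies of `Property` and `Required` are redeclared only so the snippet runs on its own.

```php
<?php
declare(strict_types=1);

// Trimmed copies of two attributes from above so the sketch is self-contained.
#[Attribute(Attribute::TARGET_PROPERTY)]
final class Property
{
    public function __construct(
        public readonly ?string $name = null,
        public readonly ?string $description = null,
        public readonly ?string $format = null,
    ) {}
}

#[Attribute(Attribute::TARGET_PROPERTY)]
final class Required {}

// Hypothetical class used only for illustration.
final class Demo
{
    public function __construct(
        #[Required]
        #[Property(description: 'Target URL', format: 'uri')]
        public readonly string $url,
    ) {}
}

/**
 * Returns the first instance of $attributeClass found on the property, or null.
 */
function readAttribute(ReflectionProperty $property, string $attributeClass): ?object
{
    $attributes = $property->getAttributes($attributeClass);

    return $attributes === [] ? null : $attributes[0]->newInstance();
}

$property = new ReflectionProperty(Demo::class, 'url');
$meta = readAttribute($property, Property::class);
$isRequired = readAttribute($property, Required::class) !== null;
```

Attributes written on a promoted constructor parameter are attached to the generated property as well, which is why `TARGET_PROPERTY` is sufficient as long as the generator reads them from the property rather than from the parameter.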
4. Schema Generation Process
The schema generation process will involve:
- Finding all classes marked with the #[SchemaType] attribute
- Using reflection to analyze class properties, types, and attributes
- Mapping PHP types to JSON schema types
- Building a complete schema structure from the class hierarchy
- Serializing the schema to JSON
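The discovery step could be sketched as follows, assuming every candidate class is already loaded. With Composer autoloading, classes are not declared until first use, so a real implementation would more likely scan source directories or a class map. `RootExample`, `DocumentExample`, `UnrelatedExample`, and the schema names given to them are hypothetical.

```php
<?php
declare(strict_types=1);

// Copy of the SchemaType attribute so the sketch is self-contained.
#[Attribute(Attribute::TARGET_CLASS)]
final class SchemaType
{
    public function __construct(
        public readonly string $name,
        public readonly string $description = '',
        public readonly bool $isRoot = false,
    ) {}
}

// Hypothetical classes used only for illustration.
#[SchemaType(name: 'jsonSchema', isRoot: true)]
final class RootExample {}

#[SchemaType(name: 'document')]
final class DocumentExample {}

final class UnrelatedExample {}

/**
 * Returns a map of class name => SchemaType instance for every marked class.
 *
 * @param string[] $candidates
 * @return array<string, SchemaType>
 */
function findSchemaClasses(array $candidates): array
{
    $found = [];
    foreach ($candidates as $className) {
        $attributes = (new ReflectionClass($className))->getAttributes(SchemaType::class);
        if ($attributes !== []) {
            $found[$className] = $attributes[0]->newInstance();
        }
    }

    return $found;
}

/**
 * Picks the class whose attribute sets isRoot to true.
 *
 * @param array<string, SchemaType> $schemaClasses
 */
function findRootSchemaClass(array $schemaClasses): string
{
    foreach ($schemaClasses as $className => $schemaType) {
        if ($schemaType->isRoot) {
            return $className;
        }
    }

    throw new LogicException('No class is marked with #[SchemaType(isRoot: true)]');
}

$schemaClasses = findSchemaClasses(get_declared_classes());
$rootClass = findRootSchemaClass($schemaClasses);
```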
4.1 Type Mapping
PHP types will be mapped to JSON schema types:
- int, float → number
- string → string
- bool → boolean
- array → array
- object → object
- Class types → References to defined schemas
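The mapping above could be sketched with the reflection type API. This is one possible shape, not a settled design: `mapPhpType()`, `shortName()`, and `TypeSample` are hypothetical, union and nullable types are expressed with `anyOf` (the proposal does not say how they should be handled), and a class is assumed to map to the definition named after its lower-camel-cased short name.

```php
<?php
declare(strict_types=1);

/** Strips the namespace from a fully qualified class name. */
function shortName(string $className): string
{
    $pos = strrpos($className, '\\');

    return $pos === false ? $className : substr($className, $pos + 1);
}

/**
 * Maps a reflected PHP type to a JSON schema fragment (hypothetical helper).
 * Assumption: a class maps to the definition named lcfirst(short class name).
 */
function mapPhpType(?ReflectionType $type): array
{
    if ($type === null) {
        return []; // Untyped property: no constraint.
    }
    if ($type instanceof ReflectionUnionType) {
        // string|array becomes anyOf with one entry per member type.
        return ['anyOf' => array_map(mapPhpType(...), $type->getTypes())];
    }
    if (!$type instanceof ReflectionNamedType) {
        return []; // Intersection types are out of scope for this sketch.
    }

    $name = $type->getName();
    $schema = match ($name) {
        'int', 'float' => ['type' => 'number'],
        'string' => ['type' => 'string'],
        'bool' => ['type' => 'boolean'],
        'array' => ['type' => 'array'],
        'object' => ['type' => 'object'],
        'null' => ['type' => 'null'],
        default => $type->isBuiltin()
            ? []
            : ['$ref' => '#/definitions/' . lcfirst(shortName($name))],
    };

    // ?string allows null, so the fragment has to accept it as well.
    if ($type->allowsNull() && !in_array($name, ['null', 'mixed'], true)) {
        return ['anyOf' => [$schema, ['type' => 'null']]];
    }

    return $schema;
}

// Hypothetical class used only for illustration.
final class TypeSample
{
    public int $count = 0;
    public string|array $paths = [];
    public ?string $label = null;
    public TypeSample $child;
}

$fragments = [];
foreach ((new ReflectionClass(TypeSample::class))->getProperties() as $property) {
    $fragments[$property->getName()] = mapPhpType($property->getType());
}
```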
4.2 Schema Generator Implementation
class SchemaGenerator
{
    private array $definitions = [];
    private array $processedClasses = [];

    public function generateSchema(): array
    {
        // Find all classes with SchemaType attribute
        $schemaClasses = $this->findSchemaClasses();

        // Process root schema class first
        $rootClass = $this->findRootSchemaClass($schemaClasses);
        $this->processClass(new \ReflectionClass($rootClass));

        // Process all remaining classes
        foreach ($schemaClasses as $class) {
            if ($class !== $rootClass) {
                $this->processClass(new \ReflectionClass($class));
            }
        }

        return [
            '$schema' => 'http://json-schema.org/draft-07/schema#',
            'fileMatch' => ['context.json'],
            'title' => 'Context Generator Configuration',
            'description' => 'Configuration schema for Context Generator',
            'type' => 'object',
            'required' => ['documents'],
            'properties' => [
                'documents' => [
                    'type' => 'array',
                    'description' => 'List of documents to generate',
                    'items' => ['$ref' => '#/definitions/document'],
                ],
                // Global settings. Draft-07 validators ignore keywords placed next to
                // $ref, so the type and description belong in the referenced definition.
                'settings' => [
                    '$ref' => '#/definitions/settings',
                ],
            ],
            'definitions' => $this->definitions,
        ];
    }

    private function processClass(\ReflectionClass $class): void
    {
        // Implementation details...
    }

    private function processProperty(\ReflectionProperty $property, array &$schema): void
    {
        // Implementation details...
    }

    // Additional helper methods...
}
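The bodies of `processClass()` and `processProperty()` are left open above. A minimal sketch of what they might do, assuming only the `SchemaType`, `Required`, and `Skip` attributes and named (non-union) property types, could look like this. `buildDefinition()` and `NoteExample` are hypothetical names.

```php
<?php
declare(strict_types=1);

// Trimmed copies of the attributes so the sketch is self-contained.
#[Attribute(Attribute::TARGET_CLASS)]
final class SchemaType
{
    public function __construct(
        public readonly string $name,
        public readonly string $description = '',
    ) {}
}

#[Attribute(Attribute::TARGET_PROPERTY)]
final class Required {}

#[Attribute(Attribute::TARGET_PROPERTY)]
final class Skip {}

// Hypothetical schema class used only for illustration.
#[SchemaType(name: 'noteExample', description: 'Example note')]
final class NoteExample
{
    public function __construct(
        #[Required]
        public readonly string $text,
        public readonly string $title = '',
        #[Skip]
        public readonly int $internalCounter = 0,
    ) {}
}

/**
 * Builds a [definition name, definition] pair for one class marked with #[SchemaType].
 */
function buildDefinition(string $className): array
{
    $class = new ReflectionClass($className);
    $typeAttributes = $class->getAttributes(SchemaType::class);
    if ($typeAttributes === []) {
        throw new InvalidArgumentException("$className is not marked with #[SchemaType]");
    }
    $schemaType = $typeAttributes[0]->newInstance();

    $definition = ['type' => 'object', 'properties' => []];
    if ($schemaType->description !== '') {
        $definition['description'] = $schemaType->description;
    }

    $required = [];
    foreach ($class->getProperties(ReflectionProperty::IS_PUBLIC) as $property) {
        if ($property->getAttributes(Skip::class) !== []) {
            continue; // Excluded from the schema on purpose.
        }

        $entry = [];
        $type = $property->getType();
        if ($type instanceof ReflectionNamedType) {
            $entry['type'] = match ($type->getName()) {
                'int', 'float' => 'number',
                'string' => 'string',
                'bool' => 'boolean',
                'array' => 'array',
                default => 'object',
            };
        }
        $definition['properties'][$property->getName()] = $entry;

        if ($property->getAttributes(Required::class) !== []) {
            $required[] = $property->getName();
        }
    }

    if ($required !== []) {
        $definition['required'] = $required;
    }

    return [$schemaType->name, $definition];
}

[$name, $definition] = buildDefinition(NoteExample::class);
```

Returning the name alongside the definition lets the caller store the result under `definitions[$name]` without reading the attribute a second time.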
5. Implementation Example
5.1 Source Type Definition
Here's how a source type would be defined using this approach:
#[SchemaType(name: 'fileSource', description: 'File source - includes content from local filesystem')]
final class FileSource extends AbstractSourceWithModifiers implements FilterableSourceInterface
{
    public function __construct(
        #[Required]
        #[Property(description: 'Path(s) to directory or files to include')]
        public readonly string|array $sourcePaths,

        string $description = '',

        #[Property(description: 'File pattern(s) to match')]
        public readonly string|array $filePattern = '*.*',

        #[Property(description: 'Patterns to exclude files')]
        public readonly array $notPath = [],

        #[Property(description: 'Patterns to include only files in specific paths')]
        public readonly string|array $path = [],

        #[Property(description: 'Patterns to include only files containing specific content')]
        public readonly string|array $contains = [],

        #[Property(description: 'Patterns to exclude files containing specific content')]
        public readonly string|array $notContains = [],

        #[Pattern(
            pattern: '^[<>]=?\\s+\\d+[KMG]i?$|^since\\s+.+$',
            description: 'Size constraints (e.g., "> 10K", "< 1M")',
        )]
        #[Property(description: 'Size constraints for files')]
        public readonly string|array $size = [],

        #[Property(description: 'Whether to display a directory tree visualization')]
        public readonly bool $showTreeView = true,

        array $modifiers = [],
    ) {
        parent::__construct($description, $modifiers);
    }

    // Method implementations...
}
5.2 Modifier Definition
Similarly, here's how a modifier would be defined:
#[SchemaType(name: 'phpContentFilterModifier', description: 'PHP content filter modifier - filter PHP class elements based on criteria')]
final class PhpContentFilterModifier implements ModifierInterface
{
    public function __construct(
        #[Items(type: 'string')]
        #[Property(description: 'Method names to include (empty means include all unless exclude_methods is set)')]
        public readonly array $includeMethods = [],

        #[Items(type: 'string')]
        #[Property(description: 'Method names to exclude')]
        public readonly array $excludeMethods = [],

        #[Items(type: 'string', enum: ['public', 'protected', 'private'])]
        #[Property(description: 'Method visibilities to include')]
        public readonly array $methodVisibility = ['public', 'protected', 'private'],

        #[Property(description: 'Whether to keep method bodies or replace with placeholders')]
        public readonly bool $keepMethodBodies = false,

        // Additional properties...
    ) {}

    // Method implementations...
}
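To illustrate how an `#[Items]` attribute such as the one on `$methodVisibility` might end up in the generated schema, here is a small sketch. `itemsToSchema()` is a hypothetical helper, and it assumes that `ref`, when set, already holds a complete JSON pointer and takes precedence over `type` and `enum`.

```php
<?php
declare(strict_types=1);

// Copy of the Items attribute so the sketch is self-contained.
#[Attribute(Attribute::TARGET_PROPERTY)]
final class Items
{
    public function __construct(
        public readonly string $type,
        public readonly ?string $ref = null,
        public readonly ?array $enum = null,
    ) {}
}

/**
 * Converts an Items attribute into the schema of an array property.
 */
function itemsToSchema(Items $items): array
{
    if ($items->ref !== null) {
        // Assumption: a reference replaces inline constraints entirely.
        return ['type' => 'array', 'items' => ['$ref' => $items->ref]];
    }

    $itemSchema = ['type' => $items->type];
    if ($items->enum !== null) {
        $itemSchema['enum'] = $items->enum;
    }

    return ['type' => 'array', 'items' => $itemSchema];
}

$schema = itemsToSchema(new Items(type: 'string', enum: ['public', 'protected', 'private']));
```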
6. Integration with Existing Code
6.1 Loading Configuration
The existing configuration loader will be enhanced to use the new object-oriented model:
class JsonConfigDocumentsLoader implements DocumentsLoaderInterface
{
    // Existing code...

    public function load(): DocumentRegistry
    {
        // Read and parse JSON as before
        $jsonContent = $this->files->read($this->configPath);
        $config = \json_decode($jsonContent, true, flags: JSON_THROW_ON_ERROR);

        // Use new object mapper to create schema objects
        $objectMapper = new ObjectMapper();
        $jsonSchema = $objectMapper->mapToObject($config, JsonSchema::class);

        // Convert to DocumentRegistry for backward compatibility
        return $this->convertToDocumentRegistry($jsonSchema);
    }

    // Additional methods...
}
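The proposal does not specify how `ObjectMapper` works. One possible sketch, assuming JSON keys match constructor parameter names exactly, maps the decoded array onto constructor arguments and recurses into class-typed parameters. `ConfigExample` and `SettingsExample` are hypothetical, and polymorphic lists such as sources (which would need a type discriminator) are deliberately not handled here.

```php
<?php
declare(strict_types=1);

// Hypothetical target classes used only for illustration.
final class SettingsExample
{
    public function __construct(
        public readonly bool $verbose = false,
    ) {}
}

final class ConfigExample
{
    public function __construct(
        public readonly string $title,
        public readonly ?SettingsExample $settings = null,
        public readonly array $tags = [],
    ) {}
}

final class ObjectMapper
{
    /**
     * Builds an instance of $className from decoded JSON data.
     */
    public function mapToObject(array $data, string $className): object
    {
        $constructor = (new ReflectionClass($className))->getConstructor();
        if ($constructor === null) {
            return new $className();
        }

        $arguments = [];
        foreach ($constructor->getParameters() as $parameter) {
            $name = $parameter->getName();
            if (!array_key_exists($name, $data)) {
                if (!$parameter->isOptional()) {
                    throw new InvalidArgumentException("Missing required key '$name' for $className");
                }
                continue; // Let the constructor default apply.
            }

            $value = $data[$name];
            $type = $parameter->getType();
            // Nested object: recurse when the parameter is class-typed and the value is an array.
            if ($type instanceof ReflectionNamedType && !$type->isBuiltin() && is_array($value)) {
                $value = $this->mapToObject($value, $type->getName());
            }
            $arguments[$name] = $value;
        }

        // String keys are unpacked as named arguments.
        return new $className(...$arguments);
    }
}

$config = (new ObjectMapper())->mapToObject(
    ['title' => 'Docs', 'settings' => ['verbose' => true]],
    ConfigExample::class,
);
```

Passing the arguments by name keeps the mapper independent of parameter order, and PHP's own type checks still run when the constructor is invoked.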
6.2 Validation
The schema objects can be used for validation:
class JsonValidator
{
    public function __construct(
        private readonly SchemaGenerator $schemaGenerator,
    ) {}

    /**
     * @return array Validation errors; empty when $data is valid
     */
    public function validate(array $data): array
    {
        $schema = $this->schemaGenerator->generateSchema();

        // Use JSON Schema validation library. The \JsonSchema\Validator name matches the
        // justinrainbow/json-schema package (an assumption here); that validator works on
        // decoded objects, so array input may need converting or a type-cast check mode.
        $validator = new \JsonSchema\Validator();
        $validator->validate($data, $schema);

        return $validator->getErrors();
    }
}
7. Benefits of This Approach
- Single Source of Truth: PHP classes define both runtime behavior and schema constraints
- Self-documenting: Class structure and attributes make the schema requirements clear
- Type Safety: PHP's type system rejects mistyped values as soon as the configuration objects are constructed
- IDE Support: Class structure enables autocomplete and refactoring support
- Extensibility: Easy to add new source types or modifiers
- Maintainability: Changes to properties are reflected automatically in the schema