Templates

ehennestad edited this page Apr 8, 2025 · 3 revisions

Schema Processing System - createFormSchemas.js

Overview

The source schemas in the templates folder serve as templates that must be processed to produce the complete schemas representing each page in the Wizard. This transformation is performed by the assembleRJSFSchemas function in the createFormSchemas.js module.

This document provides an explanation of how the schema processing system works.

Main Function: assembleRJSFSchemas

Purpose

The assembleRJSFSchemas function orchestrates the schema processing pipeline by transforming modular schema templates into complete, self-contained schemas:

  1. Injects controlled terms and other metadata instances from the Knowledge Graph into the schemas
  2. Resolves references between schema files, because RJSF does not natively support external schema references (i.e., references to a schema in another file)
  3. Processes text modules and other dynamic content
  4. Outputs production-ready schemas for the React JSON Schema Form (RJSF) components

Workflow

The function executes the following steps in sequence:

  1. Copy Source Schemas:

    • Copies all schema files from templates/source_schemas to a temporary directory
    • Creates a working copy that can be modified without affecting the original templates
  2. Expand Definition Schemas:

    • Processes schema definition files in the definitions folder
    • Uses populateSchema to expand template variables and inject controlled terms
  3. Dereference Definition Schemas:

    • Resolves all $ref references in definition schemas
    • Creates standalone, self-contained schema definitions
    • Saves these to a dereferenced subdirectory
  4. Expand Form Schemas:

    • Processes the main form schemas (experiment.json, funding.json, etc.)
    • Applies the same template expansion as for the definition schemas
  5. Dereference Form Schemas:

    • Resolves all references in the main form schemas
    • Outputs the final schemas to src/modules/Wizard/Schemas for use by the application
  6. Post-Process Funding Schema:

    • Applies special processing to the funding schema
    • Handled by a separate module (postProcessFundingSchema)
  7. Resolve UI Schema:

    • Processes UI schema files that control the form layout and behavior
    • Resolves all references and saves to src/schemas
  8. Cleanup:

    • Removes the temporary directory and all intermediate files
    • Restores the original working directory
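
The overall shape of this workflow can be sketched as follows. This is a minimal, hypothetical outline and not the actual createFormSchemas.js: the names runPipeline and copyDir and the idea of passing the steps as callbacks are assumptions made for illustration, and the real function may well be asynchronous. The sketch only shows the pattern of working on a temporary copy and always cleaning up.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: recursively copy a directory tree.
function copyDir(source, target) {
  fs.mkdirSync(target, { recursive: true });
  for (const entry of fs.readdirSync(source, { withFileTypes: true })) {
    const from = path.join(source, entry.name);
    const to = path.join(target, entry.name);
    if (entry.isDirectory()) copyDir(from, to);
    else fs.copyFileSync(from, to);
  }
}

// Hypothetical outline of the workflow. Each step receives the temporary
// working directory and may read or write files inside it.
function runPipeline(sourceDir, steps) {
  const originalCwd = process.cwd();
  const tempDir = fs.mkdtempSync(path.join(os.tmpdir(), "schemas-"));
  try {
    copyDir(sourceDir, tempDir); // step 1: working copy of the templates
    for (const step of steps) step(tempDir); // steps 2-7, in order
  } finally {
    process.chdir(originalCwd); // step 8: restore the working directory
    fs.rmSync(tempDir, { recursive: true, force: true }); // and clean up
  }
  return tempDir;
}
```

Placing the cleanup in a finally block means the temporary directory is removed and the working directory restored even when one of the steps throws.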

Key Components

1. processSchemas.js

This utility function handles the batch processing of schema files:

  • Takes source and target directories, a processing function, and a title
  • Reads all files from the source directory
  • Applies the processing function to each file
  • Writes the results to the target directory
  • Provides logging and error handling

The function is used multiple times in the main workflow with different processing functions to handle different transformation steps.
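
A batch processor with this interface might look roughly like the sketch below. It is an illustration under assumptions, not the module's actual code: it is synchronous for brevity, it only picks up files ending in .json, and it assumes the processing function takes a parsed schema and returns the transformed schema.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical sketch: read every JSON file in sourceDir, apply processFn to
// the parsed schema, and write the result to targetDir under the same name.
function processSchemas(sourceDir, targetDir, processFn, title) {
  console.log(`${title}...`);
  fs.mkdirSync(targetDir, { recursive: true });
  const written = [];
  for (const fileName of fs.readdirSync(sourceDir)) {
    if (!fileName.endsWith(".json")) continue; // assumption: schemas are .json files
    try {
      const text = fs.readFileSync(path.join(sourceDir, fileName), "utf8");
      const result = processFn(JSON.parse(text));
      fs.writeFileSync(path.join(targetDir, fileName), JSON.stringify(result, null, 2));
      written.push(fileName);
    } catch (error) {
      // Log the failure and continue with the remaining files.
      console.error(`Failed to process ${fileName}: ${error.message}`);
    }
  }
  return written;
}
```

Because the transformation is passed in as a function, the same loop can serve the expansion and dereferencing steps alike.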

2. populateSchema.js

This is the core schema transformation engine that:

  • Recursively traverses schema objects
  • Processes special properties that indicate dynamic content:
    • controlledTermSet: Injects a set of controlled terms with their instances
    • instanceTypeSet: Injects instances of specific types (e.g., Person, Organization)
    • schemaTypeSet: Injects referenced schemas
    • controlledTerm: Injects instances of a specific controlled term
    • keywordSet: Injects keywords from controlled terms
    • textModule: Injects HTML content from text module files
    • openMindsType: Injects instances of OpenMINDS types that are not controlled terms

The function transforms template schemas into complete schemas with all dynamic content resolved.
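
To make the idea concrete, here is a minimal sketch of such a recursive traversal, handling only the controlledTerm property. The data shapes are assumptions: the sketch treats the property value as the name of a term and replaces it with an enum of instance names taken from a hypothetical in-memory lookup table. The real populateSchema handles more properties and may structure its output differently.

```javascript
// Hypothetical lookup table standing in for data loaded from the Knowledge Graph.
const controlledTerms = {
  Species: ["Homo sapiens", "Mus musculus"],
};

// Recursively walk a schema. Whenever an object carries the special
// `controlledTerm` property, replace it with an `enum` of instance names.
// The input is not mutated; a new object tree is returned.
function populateSchema(node, terms) {
  if (Array.isArray(node)) return node.map((item) => populateSchema(item, terms));
  if (node === null || typeof node !== "object") return node;

  const result = {};
  for (const [key, value] of Object.entries(node)) {
    if (key === "controlledTerm") {
      if (!(value in terms)) throw new Error(`Unknown controlled term: ${value}`);
      result.enum = [...terms[value]];
    } else {
      result[key] = populateSchema(value, terms);
    }
  }
  return result;
}

const template = {
  type: "object",
  properties: {
    species: { type: "string", controlledTerm: "Species" },
  },
};

const populated = populateSchema(template, controlledTerms);
console.log(JSON.stringify(populated.properties.species));
// prints {"type":"string","enum":["Homo sapiens","Mus musculus"]}
```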

Key Transformation Functions:

  • addControlledTermSetToSchema: Creates a two-level selection mechanism (term → instance)
  • addKeywordSetSetToSchema: Adds keyword suggestions from controlled terms
  • addReferencedSchemas: Incorporates other schemas based on selection
  • addControlledTermInstancesToSchema: Populates enums or examples with controlled term instances
  • addOpenMindsInstanceToSchema: Formats OpenMINDS instances for display
  • getHtmlString: Processes HTML text modules for inclusion in schemas
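
One common way to express a two-level selection (term → instance) in RJSF is a schema dependency with oneOf branches. The sketch below builds such a fragment. It is only an assumption about what addControlledTermSetToSchema might produce; the function name buildTwoLevelSelection, the property names term and instance, and the input shape are all hypothetical.

```javascript
// Hypothetical builder: the user first picks a term, and the chosen term
// determines which instances are offered in the second field.
// termSet maps each term name to an array of instance names.
function buildTwoLevelSelection(termSet) {
  const termNames = Object.keys(termSet);
  return {
    type: "object",
    properties: {
      term: { type: "string", title: "Term", enum: termNames },
    },
    dependencies: {
      term: {
        // One branch per term; the branch matching the selected term applies.
        oneOf: termNames.map((name) => ({
          properties: {
            term: { enum: [name] },
            instance: { type: "string", title: "Instance", enum: termSet[name] },
          },
        })),
      },
    },
  };
}

const selection = buildTwoLevelSelection({
  Species: ["Homo sapiens", "Mus musculus"],
  BiologicalSex: ["female", "male"],
});
console.log(selection.dependencies.term.oneOf.length); // prints 2
```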

3. getControlledTerms.js

This module handles the loading of controlled terms and instances:

  • importControlledTerms: Loads controlled term definitions from JSON files
  • importInstances: Loads instance data (e.g., Person, Organization) from JSON files

These functions provide the data that populateSchema injects into the schemas.
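
As a rough illustration of this kind of loader, the sketch below reads every JSON file in a directory into an object keyed by file name. The function name loadJsonDirectory and the one-file-per-type layout are assumptions; the actual importControlledTerms and importInstances functions may locate and structure their data differently.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical loader: returns { <file name without extension>: <parsed JSON> }
// for every .json file found directly inside `directory`.
function loadJsonDirectory(directory) {
  const result = {};
  for (const fileName of fs.readdirSync(directory)) {
    if (path.extname(fileName) !== ".json") continue;
    const name = path.basename(fileName, ".json");
    const text = fs.readFileSync(path.join(directory, fileName), "utf8");
    result[name] = JSON.parse(text);
  }
  return result;
}
```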

4. postProcessFundingSchema.js

This specialized module handles the complex funding schema:

  • Processes relationships between organizations and funding instances
  • Creates a dynamic schema structure based on available funding options
  • Customizes the form behavior based on whether an organization has existing funding

Technical Details

Schema Dereferencing

The system uses @apidevtools/json-schema-ref-parser to resolve $ref references in schemas. This process:

  1. Replaces references with the actual content they point to
  2. Creates self-contained schemas that don't rely on external files
  3. Requires careful management of working directories to resolve relative paths correctly
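
The effect of dereferencing can be illustrated with a small hand-rolled resolver. This is deliberately not the @apidevtools/json-schema-ref-parser API: it is a simplified stand-in that only handles local references of the form "#/..." within a single object and does not guard against circular references, whereas the library also resolves references to external files.

```javascript
// Look up a local JSON Pointer such as "#/definitions/person" in the root schema.
function resolvePointer(root, ref) {
  if (!ref.startsWith("#/")) {
    throw new Error(`This sketch only handles local references, got: ${ref}`);
  }
  let node = root;
  for (const rawPart of ref.slice(2).split("/")) {
    const key = rawPart.replace(/~1/g, "/").replace(/~0/g, "~"); // JSON Pointer escapes
    if (node === null || typeof node !== "object" || !(key in node)) {
      throw new Error(`Cannot resolve reference: ${ref}`);
    }
    node = node[key];
  }
  return node;
}

// Replace every { "$ref": ... } object with a copy of the content it points to.
function dereference(node, root = node) {
  if (Array.isArray(node)) return node.map((item) => dereference(item, root));
  if (node === null || typeof node !== "object") return node;
  if (typeof node.$ref === "string") {
    return dereference(resolvePointer(root, node.$ref), root);
  }
  return Object.fromEntries(
    Object.entries(node).map(([key, value]) => [key, dereference(value, root)])
  );
}

const schema = {
  definitions: {
    person: { type: "object", properties: { name: { type: "string" } } },
  },
  properties: { author: { $ref: "#/definitions/person" } },
};

const resolved = dereference(schema);
console.log(resolved.properties.author.type); // prints object
```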

Working Directory Management

The code changes the working directory with process.chdir() during dereferencing so that relative paths in schema references resolve against the correct folder, and restores the original directory afterwards. The reference parser depends on this to resolve those relative paths correctly.
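
A common way to keep such directory changes safe is to wrap them in a helper that always restores the previous directory. The helper below is a hypothetical sketch (the name withWorkingDirectory is not taken from the codebase), assuming the wrapped task is asynchronous, as dereferencing typically is.

```javascript
// Hypothetical helper: run an async task with `directory` as the current
// working directory, then restore the previous one even if the task throws.
async function withWorkingDirectory(directory, task) {
  const previous = process.cwd();
  process.chdir(directory);
  try {
    return await task();
  } finally {
    process.chdir(previous);
  }
}
```

A call such as withWorkingDirectory(schemaDir, () => dereferenceAll()) would then resolve relative references against schemaDir; both names in that call are placeholders. Note that the working directory is global to the process, so calls like this should run one after another rather than concurrently.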

File System Operations

The system uses:

  • fs-extra for enhanced file system operations
  • A custom mkdirIfNotExists utility for directory creation
  • Promise-based file operations for asynchronous processing
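
For illustration, a directory-creation utility of this kind can be written with Node's built-in promise API, as sketched below. This is an assumption about what mkdirIfNotExists does, not its actual source; the sketch uses only the standard fs module so that it runs without extra dependencies.

```javascript
const fsPromises = require("fs").promises;

// Hypothetical stand-in for the custom utility: create the directory (and any
// missing parents) and do nothing if it already exists.
async function mkdirIfNotExists(directory) {
  // With `recursive: true`, mkdir does not reject when the directory exists.
  await fsPromises.mkdir(directory, { recursive: true });
}
```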

Importance in the Application

This schema processing system is fundamental to the EBRAINS Metadata Wizard for several reasons:

  1. Modularity: Enables schema definitions to be reused across multiple forms
  2. Dynamic Content: Allows forms to include up-to-date controlled terms and instances
  3. Maintainability: Keeps source schemas clean and template-based
  4. Performance: Pre-processes schemas at build time rather than runtime
  5. Flexibility: Supports complex form behaviors through schema dependencies

The resulting schemas define the structure, validation rules, and UI behavior for all forms in the application, making this system essential to the metadata collection workflow.