Implement centralized EntityManager and schema-driven Identity Map #358

Merged
davideast merged 4 commits into main from jules/entity-lifecycle-management-163c9c6d-100d-4b27-bf80-e6d524ba7b9f
Jun 1, 2026
@google-labs-jules

Context & Rationale

The SDK currently has no formal concept of identity, so multiple instances of the same domain object can coexist without being referentially equal. This causes "stale data" bugs: an update to one instance is not reflected in the others. In addition, the logic for extracting identity from API responses was hardcoded in the generator, which made it brittle and difficult to maintain.

This PR introduces a centralized EntityManager to act as a single source of truth. By moving from direct class instantiation to a managed lifecycle, the SDK returns the same object reference for a given entity ID, using an $O(1)$ cache lookup. Applications then hold one shared instance per entity, so a state change is visible everywhere that entity is referenced.

Key Changes

1. EntityManager & Identity Map

  • Implemented EntityManager scoped to the StitchToolClient to prevent cross-account data leakage.
  • Added an Identity Map that caches domain entities by their canonical IDs, so repeated lookups of the same ID return the same instance (sdk.item('1') === sdk.item('1')).
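
For reviewers who want a concrete picture of the Identity Map, here is a minimal sketch. It is not the code in this PR: `IdentityMapSketch`, the cache-key format, and the `Project` class are assumptions made for illustration, and only the `resolve(EntityClass, keys, data)` call shape is taken from the description.

```typescript
// Hypothetical sketch of an identity map; names and key format are assumptions.
type EntityConstructor<T> = new () => T;

class IdentityMapSketch {
  // One cache per manager instance, so two clients never share entities.
  private readonly cache = new Map<string, object>();

  resolve<T extends object>(
    cls: EntityConstructor<T>,
    keys: Record<string, string>,
    data: Partial<T>,
  ): T {
    // Build a stable key from the class name plus the sorted key/value pairs.
    const id = Object.keys(keys)
      .sort()
      .map((k) => `${k}=${keys[k]}`)
      .join("&");
    const cacheKey = `${cls.name}:${id}`;

    let entity = this.cache.get(cacheKey) as T | undefined;
    if (entity === undefined) {
      entity = new cls();
      this.cache.set(cacheKey, entity);
    }
    // Fresh data is merged into the cached instance, never into a copy.
    Object.assign(entity, data);
    return entity;
  }
}

// Illustrative entity; the real generated classes may look different.
class Project {
  projectId!: string;
  title!: string;
}

const manager = new IdentityMapSketch();
const first = manager.resolve(Project, { projectId: "1" }, { projectId: "1", title: "Draft" });
const second = manager.resolve(Project, { projectId: "1" }, { title: "Final" });
```

In this sketch `first` and `second` are the same object, so the title written by the second call is visible through the first reference as well.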

2. Schema-Driven Entity Resolution

  • Introduced ReferenceSpec in the IR schema (e.g., reference: { keys: ["projectId"] }).
  • Replaced fragmented, hardcoded ID extraction logic in the generator with a unified, schema-driven approach.
  • Integrated parseResourceName to dynamically extract identifiers from structured Google-style API resource strings.
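
The schema-driven flow can be sketched as follows. This is an illustration under stated assumptions, not the SDK's implementation: `parseResourceNameSketch` is a stand-in for `parseResourceName` (whose real signature is not shown in this PR), the rule that turns a collection such as `projects` into the key `projectId` is invented for the example, and `pickKeys` is a hypothetical helper.

```typescript
// Shape taken from the PR description: reference: { keys: ["projectId"] }.
interface ReferenceSpec {
  keys: string[];
}

// Hypothetical stand-in for parseResourceName; the real function may differ.
// Parses "projects/p1/screens/s1" into { projectId: "p1", screenId: "s1" }.
function parseResourceNameSketch(name: string): Record<string, string> {
  const segments = name.split("/").filter((s) => s.length > 0);
  if (segments.length === 0 || segments.length % 2 !== 0) {
    throw new Error(`Malformed resource name: ${name}`);
  }
  const ids: Record<string, string> = {};
  for (let i = 0; i < segments.length; i += 2) {
    const collection = segments[i] as string;
    // Assumed naming rule: drop a trailing "s" and append "Id".
    const key = `${collection.replace(/s$/, "")}Id`;
    ids[key] = segments[i + 1] as string;
  }
  return ids;
}

// Keep only the identifiers the schema declares as identity keys.
function pickKeys(spec: ReferenceSpec, parsed: Record<string, string>): Record<string, string> {
  const keys: Record<string, string> = {};
  for (const key of spec.keys) {
    const value = parsed[key];
    if (value === undefined) {
      throw new Error(`Missing identity key: ${key}`);
    }
    keys[key] = value;
  }
  return keys;
}

const parsed = parseResourceNameSketch("projects/p1/screens/s1");
const projectKeys = pickKeys({ keys: ["projectId"] }, parsed);
```

The point of the design is that the set of identity keys lives in the schema, so the generator needs no per-entity string handling.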

3. Generator Refactoring

  • Updated scripts/generate-sdk.ts to remove direct new keyword usage for domain entities.
  • The generator now emits this.client.entities.resolve(EntityClass, keys, data), forcing all entity creation through the lifecycle manager to maintain cache integrity.
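
As a rough illustration of the generator change, the helper below builds the emitted call as a string from a schema reference. `emitResolveCall` and the way keys are read from the data expression are assumptions; the actual template code in scripts/generate-sdk.ts is not reproduced here.

```typescript
// Shape taken from the PR description: reference: { keys: ["projectId"] }.
interface ReferenceSpec {
  keys: string[];
}

// Hypothetical emitter: returns the source text of a resolve() call.
function emitResolveCall(
  entityClass: string,
  reference: ReferenceSpec,
  dataExpr: string,
): string {
  // Assumption: each identity key is read from a same-named field on the data.
  const keyObject = reference.keys
    .map((key) => `${key}: ${dataExpr}.${key}`)
    .join(", ");
  return `this.client.entities.resolve(${entityClass}, { ${keyObject} }, ${dataExpr})`;
}

const emitted = emitResolveCall("Project", { keys: ["projectId"] }, "data");
// emitted === "this.client.entities.resolve(Project, { projectId: data.projectId }, data)"
```

Because every generated method goes through the same emitter, no code path can construct an entity that bypasses the cache.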

4. Lifecycle Management

  • Added onCreate() and onDispose() hooks.
  • Provided methods for cache pruning (manager.dispose(entity)) and clearing (manager.clear()) to manage memory overhead in long-lived sessions.
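
The lifecycle hooks and the two pruning methods might fit together as in the sketch below. The hook names `onCreate()` and `onDispose()` and the method names `dispose(entity)` and `clear()` come from this PR; the `LifecycleManagerSketch` class, the simplified `resolve(id, create)` signature, and the order in which hooks run are assumptions.

```typescript
// Hooks are optional so that plain entities need not implement them.
interface Lifecycle {
  onCreate?(): void;
  onDispose?(): void;
}

// Hypothetical manager; the resolve signature is simplified for the example.
class LifecycleManagerSketch {
  private readonly cache = new Map<string, Lifecycle>();

  resolve<T extends Lifecycle>(id: string, create: () => T): T {
    const cached = this.cache.get(id);
    if (cached !== undefined) {
      return cached as T;
    }
    const entity = create();
    this.cache.set(id, entity);
    entity.onCreate?.(); // runs once, on first insertion only
    return entity;
  }

  // Prune a single entity from the cache.
  dispose(entity: Lifecycle): void {
    this.cache.forEach((cached, id) => {
      if (cached === entity) {
        this.cache.delete(id);
        entity.onDispose?.();
      }
    });
  }

  // Drop everything, for example at the end of a long-lived session.
  clear(): void {
    this.cache.forEach((entity) => entity.onDispose?.());
    this.cache.clear();
  }

  get size(): number {
    return this.cache.size;
  }
}

// Illustrative entity that records which hooks ran.
class TrackedEntity implements Lifecycle {
  events: string[] = [];
  onCreate(): void {
    this.events.push("create");
  }
  onDispose(): void {
    this.events.push("dispose");
  }
}

const lifecycle = new LifecycleManagerSketch();
const tracked = lifecycle.resolve("item:1", () => new TrackedEntity());
lifecycle.resolve("item:1", () => new TrackedEntity()); // cache hit, no second onCreate
lifecycle.dispose(tracked);
```

After `dispose`, a later `resolve` for the same ID creates a new instance, so callers that kept the old reference hold a detached object.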

Success Metrics & Validation

  • Referential Integrity: Confirmed via new unit tests that subsequent calls for the same resource return the exact same object reference.
  • Generator Cleanliness: Removed all splitOn, stripPrefix, and manual fieldMapping logic from the template code.
  • Performance: Verified $O(1)$ lookup times for cached entities.

Verification Plan

  • Run npm run generate to ensure the new resolve() pattern is correctly applied to all generated classes.
  • Execute test/unit/entity-manager.test.ts to validate caching and disposal logic.
  • Manual verification of "Global State Synchronization" scenario: updating a property on one reference now updates all observers automatically.

Comment thread scripts/generate-sdk.ts Dismissed
@davideast davideast marked this pull request as ready for review June 1, 2026 03:39
- Add definite assignment assertion (!) to generated domain class properties
  since EntityManager.resolve() assigns them immediately post-construction
- Regenerate SDK with the fix applied
- Run prettier --write on all files to fix 110 formatting issues

Fixes: typecheck and format:check CI failures on quality-gate
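
For context on the typecheck fix, here is a small sketch of the definite assignment assertion. The `Project` class and the use of `Object.assign` in place of `EntityManager.resolve()` are assumptions for illustration; the compiler behavior described in the comments applies when `strictPropertyInitialization` is enabled.

```typescript
// Without the "!" on each property, strict mode reports TS2564:
// "Property 'projectId' has no initializer and is not definitely assigned
// in the constructor."
class Project {
  // "!" tells the compiler that the value is assigned elsewhere.
  projectId!: string;
  title!: string;
}

// Stand-in for the manager: fields are assigned immediately after construction.
function hydrate(data: { projectId: string; title: string }): Project {
  return Object.assign(new Project(), data);
}

const hydrated = hydrate({ projectId: "p1", title: "Demo" });
```

The assertion is a promise to the compiler rather than a runtime check, so it is only safe while every construction path assigns the fields right away.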
Comment thread packages/sdk/test/helpers/stitch-html.ts Fixed

await fs.writeFile(
tempScreenshotPath,
Buffer.from(screenshotBuffer),
Fixed

const tempFullPath = path.join(resolvedTempDir, tempFilename);

await fs.writeFile(tempFullPath, Buffer.from(buffer), { flag: 'wx', mode: fileMode });
await fs.writeFile(tempFullPath, Buffer.from(buffer), {
Fixed

davideast added 2 commits June 1, 2026 03:52
- stitch-html.ts: Replace existsSync+readFileSync TOCTOU pattern with
  try/catch to eliminate file system race condition (CodeQL #30)
- download-handler.ts: Add response.ok validation before writing fetched
  network data to disk (CodeQL #31, #32). The download handler already has
  strong mitigations (wx flag, random temp names, sanitized filenames,
  atomic rename) but validating HTTP status makes intent explicit.
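
The TOCTOU change described for stitch-html.ts can be illustrated with a small sketch. The `readIfPresent` helper and the temp-file setup are hypothetical; the actual code in stitch-html.ts is not shown in this thread.

```typescript
import { mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Racy pattern being replaced (shown as a comment only):
//   if (existsSync(path)) { return readFileSync(path, "utf8"); }
// The file can disappear between the check and the read.

// Hypothetical helper: attempt the read and treat "not found" as a result.
function readIfPresent(path: string): string | undefined {
  try {
    return readFileSync(path, "utf8");
  } catch (err) {
    if ((err as { code?: string }).code === "ENOENT") {
      return undefined; // the file does not exist
    }
    throw err; // permission errors and the like still surface
  }
}

const dir = mkdtempSync(join(tmpdir(), "toctou-sketch-"));
const target = join(dir, "page.html");
const before = readIfPresent(target);
writeFileSync(target, "<p>hello</p>");
const after = readIfPresent(target);
```

Here a single read is the only file system operation, so there is no window between a check and a use.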
Update all mock fetch responses to include ok: true to match the new
response.ok validation added to download-handler.ts.
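
A minimal sketch of the status check and of why the mocks need `ok: true` follows. `ResponseLike`, `ensureOk`, and `downloadBytes` are hypothetical names; the real handler in download-handler.ts is not reproduced here.

```typescript
// Minimal slice of the fetch Response interface that the sketch relies on.
interface ResponseLike {
  ok: boolean;
  status: number;
  arrayBuffer(): Promise<ArrayBuffer>;
}

type FetchLike = (url: string) => Promise<ResponseLike>;

// Reject non-2xx responses before any bytes are written to disk.
function ensureOk(response: ResponseLike, url: string): void {
  if (!response.ok) {
    throw new Error(`Download of ${url} failed with HTTP ${response.status}`);
  }
}

async function downloadBytes(fetchImpl: FetchLike, url: string): Promise<Uint8Array> {
  const response = await fetchImpl(url);
  ensureOk(response, url);
  return new Uint8Array(await response.arrayBuffer());
}

// A mock without `ok: true` would now fail the check, which is why the
// test doubles had to be updated alongside the handler.
const okResponse: ResponseLike = {
  ok: true,
  status: 200,
  arrayBuffer: async () => new Uint8Array([1, 2, 3]).buffer,
};
const notFoundResponse: ResponseLike = {
  ok: false,
  status: 404,
  arrayBuffer: async () => new ArrayBuffer(0),
};
const mockFetch: FetchLike = async () => okResponse;
```

A test would call `downloadBytes(mockFetch, url)` and expect three bytes back, while a fetch that resolves to `notFoundResponse` should make the same call reject.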
@davideast davideast merged commit 389480d into main Jun 1, 2026
10 checks passed
@davideast davideast deleted the jules/entity-lifecycle-management-163c9c6d-100d-4b27-bf80-e6d524ba7b9f branch June 1, 2026 04:00