
Fix #1929: Bug: /api/v1/embeddings/maintenance causes 100% CPU and event loop starvation on #2038

Merged
syzsunshine219 merged 4 commits into MemTensor:dev-v2.0.22 from Memtensor-AI:bugfix/autodev-1929 on Jul 2, 2026

@Memtensor-AI

Collaborator

Description

Fixes issue #1929: GET /api/v1/embeddings/maintenance blocked the Node.js event loop for 4+ minutes at 100% CPU on databases with large traces tables. The root cause was computeEmbeddingMaintenanceStats() → collectEmbeddingSlots() paginating every trace/policy/world_model/skill row through repos.*.list(), which reads and decodes the BLOB vector columns (vec_summary / vec_action / vec) purely so the stats path could inspect each vector's length. On the reporter's DB that meant ~1.1 GB of pread64 traffic and ~270 MB of JS heap allocations per request, all on the synchronous better-sqlite3 path.

The fix introduces embeddingMaintenanceCounts() in apps/memos-local-plugin/core/storage/repos/embedding_maintenance.ts, which issues five SELECT COUNT(*) + SUM(CASE WHEN ...) queries, one per (table, vec column) slot, using LENGTH(vec) for the dimension check. For a BLOB, SQLite's LENGTH() returns the size in bytes, which SQLite can determine from the record header without copying the buffer, so the stats path never leaves SQLite. computeEmbeddingMaintenanceStats() in core/pipeline/memory-core.ts is rewired to the new helper; the pre-fix semantic filters (the shouldTraceHaveEmbeddings short-text skip and the isLightweightMemoryTrace action-vec carveout) are preserved verbatim in the SQL WHERE clauses so per-bucket counts do not shift for already-installed users. The public EmbeddingMaintenanceStats JSON shape is unchanged: the HTTP route, JSON-RPC bridge, viewer, and existing tests see the same response.
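
As a rough illustration of the per-slot query described above, the sketch below builds one COUNT(*) + SUM(CASE WHEN ...) statement and mirrors its bucketing rule in plain TypeScript. The helper names (buildSlotCountSql, classifySlot), the :expectedByteLen parameter, and the exact bucket rules are assumptions made for illustration; the PR's real SQL also carries the trace-specific WHERE filters, which are omitted here.

```typescript
// Hypothetical sketch: names and bucket rules are assumptions, not the PR's code.
type SlotBucket = "ready" | "missing" | "dimMismatch";

const FLOAT32_BYTES = 4; // bytes per float32 component

// Builds one aggregate statement for a (table, vector column) slot.
// Identifiers are interpolated into the SQL, so only plain snake_case names
// from a fixed list should ever be passed in.
function buildSlotCountSql(table: string, vecColumn: string): string {
  const ident = /^[a-z_]+$/;
  if (!ident.test(table) || !ident.test(vecColumn)) {
    throw new Error("unexpected identifier");
  }
  const len = `LENGTH(${vecColumn})`;
  return [
    "SELECT",
    "  COUNT(*) AS totalSlots,",
    // COALESCE keeps the sums at 0 (not NULL) when the table is empty.
    `  COALESCE(SUM(CASE WHEN ${vecColumn} IS NULL OR ${len} = 0 THEN 1 ELSE 0 END), 0) AS missing,`,
    `  COALESCE(SUM(CASE WHEN ${len} > 0 AND ${len} <> :expectedByteLen THEN 1 ELSE 0 END), 0) AS dimMismatch,`,
    `  COALESCE(SUM(CASE WHEN ${len} > 0 AND ${len} = :expectedByteLen THEN 1 ELSE 0 END), 0) AS ready`,
    `FROM ${table}`,
  ].join("\n");
}

// Plain-TypeScript mirror of the CASE expressions above, useful for reasoning
// about which bucket a row lands in. Assumes expectedByteLen > 0.
function classifySlot(byteLen: number | null, expectedByteLen: number): SlotBucket {
  if (byteLen === null || byteLen === 0) return "missing";
  return byteLen === expectedByteLen ? "ready" : "dimMismatch";
}

// Example: a 1536-dimension float32 vector occupies 1536 * 4 bytes.
const expectedByteLen = 1536 * FLOAT32_BYTES;
const sql = buildSlotCountSql("traces", "vec_summary");
```

The point of this shape is that only integers cross from SQLite into JavaScript: each slot yields a single row of four counts, regardless of how many vectors the table holds.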

The tier-2 scanAndTopK bounding the reporter flagged as an "Additional Fix" is intentionally out of scope for this PR; the title and OpenClaw event-loop-block log both point at the maintenance endpoint, and keeping the surface tight makes the fix easy to review and revert.

Verification: 4/4 new unit tests pass (tests/unit/storage/embedding-maintenance.test.ts) and 28/28 memory-core façade tests pass (tests/unit/pipeline/memory-core.test.ts, including the pre-existing regressions "repairs missing and wrong-dimension imported trace embeddings" and "does not require action vectors for lightweight memory traces"). npx vitest run across the whole suite shows 1048 passing / 3 pre-existing failures (e2e/v7-full-chain, migrator::namespace-visibility, traces-count::> 500); all three reproduce on the base branch dev-20260624-v2.0.22 after git stash, so they are unrelated to this change. tsc -p tsconfig.json --noEmit and tsc -p tsconfig.build.json both exit 0.

Related Issue (Required): Fixes #1929

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Refactor (does not change functionality, e.g. code style improvements, linting)
  • Documentation update

How Has This Been Tested?

Automated tests are pending.

  • Unit Test
  • Test Script Or Test Steps (please provide)
  • Pipeline Automated API Test (please provide)

Checklist

  • I have performed a self-review of my own code
  • I have commented my code in hard-to-understand areas
  • I have added tests that prove my fix is effective or that my feature works
  • I have created related documentation issue/PR in MemOS-Docs (if applicable)
  • I have linked the issue to this PR (if applicable)
  • I have mentioned the person who will review this PR

@MatthewZhuang, @CarltonXiang, @syzsunshine219, @World-controller please review this PR.

Reviewer Checklist

…sor#1929)

`GET /api/v1/embeddings/maintenance` used to paginate every trace / policy /
world_model / skill row through `repos.*.list()`, which reads full BLOB
vector columns and decodes them into Float32Array on the JS heap purely so
the maintenance path could inspect each vector's length. On a production
deployment, ~93K traces × 2 vector columns × 1536 dims × 4 bytes came to
≈ 1.1 GB of BLOB pread64 traffic and ~270 MB of JS heap allocations per
request, all on the synchronous better-sqlite3 path. The entire OpenClaw
gateway event loop was starved for 4+ minutes at 100% CPU while the stats
call ran, as strace (99.96% pread64) and the observed
`eventLoopDelayMaxMs=285883` / `durationMs=292731` confirmed.
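
The ~1.1 GB figure follows directly from the factors quoted above. A quick check, using only the numbers from this message (the constant names are illustrative):

```typescript
// Inputs quoted in the commit message; constant names are illustrative.
const traceRows = 93000;      // "~93K traces"
const vectorColumns = 2;      // vector columns per trace row
const dims = 1536;            // components per vector
const bytesPerFloat32 = 4;    // a float32 is 4 bytes

// Total bytes of vector BLOB data read for the traces table alone.
const blobBytes = traceRows * vectorColumns * dims * bytesPerFloat32;
const blobGB = blobBytes / 1e9; // decimal gigabytes

// blobBytes is 1,142,784,000, i.e. roughly 1.1 GB per request.
```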

The maintenance endpoint only needs counts, not the vector bodies. This
change adds `embeddingMaintenanceCounts()` to `core/storage/repos/`, which
issues five `SELECT COUNT(*) + SUM(CASE WHEN ...)` queries — one per
`(table, vec column)` slot — using `LENGTH(vec)` for the dimension check.
SQLite's `LENGTH()` on a BLOB column returns the header byte count without
copying the buffer, so the stats path never leaves SQLite. The two pre-fix
semantic filters (`shouldTraceHaveEmbeddings` and `isLightweightMemoryTrace`)
are preserved verbatim in the WHERE clauses so per-bucket counts do not
shift for already-installed users. The public `EmbeddingMaintenanceStats`
JSON shape is unchanged.

- Add `core/storage/repos/embedding_maintenance.ts` with SQL-only
  `embeddingMaintenanceCounts()` + `inferStoredEmbeddingByteLen()` helpers.
- Re-export them (and `FLOAT32_BYTES` / `EmbeddingCounts`) from
  `core/storage/repos/index.ts`.
- Rewire `core/pipeline/memory-core.ts::computeEmbeddingMaintenanceStats()`
  to the SQL fast path; drop the dead `inferStoredEmbeddingDimension(slots)`
  and `emptyEmbeddingStatsByKind()` helpers.
- New `tests/unit/storage/embedding-maintenance.test.ts` (4 cases) pins the
  bucket semantics, lightweight-memory carveout, short-text filter,
  dim-mismatch detection, empty-DB safety, `expectedByteLen=0` fallback,
  and the mode-based byte-length inference.
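
The "mode-based byte-length inference" in the last bullet can be pictured as picking the most frequent stored vector length. The sketch below is an assumption about the idea, not the PR's `inferStoredEmbeddingByteLen()`: the input shape (length/count pairs, such as a `GROUP BY LENGTH(vec)` query might return), the tie-break, and the name `inferModeByteLen` are all hypothetical.

```typescript
// Hypothetical sketch of mode-based inference; not the PR's implementation.
const FLOAT32_BYTES = 4;

interface LengthCount {
  byteLen: number; // LENGTH(vec) for a group of rows
  count: number;   // number of rows with that length
}

// Returns the most common plausible byte length, or 0 when nothing usable
// is stored (for example on an empty database).
function inferModeByteLen(rows: LengthCount[]): number {
  let best = 0;
  let bestCount = 0;
  for (const { byteLen, count } of rows) {
    // Skip empty blobs and lengths that cannot be a whole number of float32s.
    if (byteLen <= 0 || byteLen % FLOAT32_BYTES !== 0) continue;
    // Assumed tie-break: prefer the longer vector when counts are equal.
    if (count > bestCount || (count === bestCount && byteLen > best)) {
      best = byteLen;
      bestCount = count;
    }
  }
  return best;
}

// Example: most rows hold 1536-dim vectors, a few hold 768-dim ones.
const inferred = inferModeByteLen([
  { byteLen: 6144, count: 90 },
  { byteLen: 3072, count: 3 },
  { byteLen: 0, count: 10 },
]);
const inferredDims = inferred / FLOAT32_BYTES;
```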

The tier-2 `scanAndTopK` bounding the reporter flagged as an "Additional
Fix" is out of scope for this PR — the title and OpenClaw event-loop-block
log both point at the maintenance endpoint, and keeping the surface tight
makes the fix easy to review and revert.

Verification: 4/4 new unit tests pass, 28/28 memory-core façade tests pass,
`npx vitest run` shows 1048 passing / 3 pre-existing failures (v7 e2e /
namespace-visibility migrator regression / traces-count > 500) that all
reproduce on the base branch after `git stash` — unrelated to this change.
`tsc -p tsconfig.json --noEmit` and `tsc -p tsconfig.build.json` both clean.

Fixes MemTensor#1929

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Memtensor-AI

Collaborator Author

🤖 Open Code Review

Target: PR #2038
Task: 9c112d2b4811cbe6
Base: dev-20260624-v2.0.22
Head: bugfix/autodev-1929

🔍 OpenCodeReview found 2 issue(s) in this PR.


1. apps/memos-local-plugin/core/pipeline/memory-core.ts (L4082)

The return type EmbeddingMaintenanceStats["byKind"]["trace"] is reused for all four bucket kinds (trace, policy, world_model, skill). If the type definition of these four bucket slots ever diverges (e.g., world_model gets an extra field or a different shape), TypeScript will silently accept the wrong type for the other three buckets — since the function signature pins everything to the trace bucket shape.

Consider using the more precise shared type directly. Since EmbeddingCountsBucket (from the new embedding_maintenance.ts) plus needsRepair is the actual intended shape, either:

  • Add a NeedsRepair intersection type and use it, or
  • Use EmbeddingMaintenanceStats["byKind"][keyof EmbeddingMaintenanceStats["byKind"]] which is the union of all bucket types.

Suggested fix:

function addNeedsRepair(bucket: {
  totalSlots: number;
  ready: number;
  missing: number;
  dimMismatch: number;
}): EmbeddingCountsBucket & { needsRepair: number } {

2. apps/memos-local-plugin/core/pipeline/memory-core.ts (L4082)

The return type EmbeddingMaintenanceStats["byKind"]["trace"] is used to type the output for all four bucket kinds (trace, policy, world_model, skill). If the byKind bucket shapes ever diverge between kinds, TypeScript will silently accept a structurally compatible but semantically wrong type for the non-trace buckets, masking type errors.

Consider annotating the return type using the shared bucket shape directly — e.g., EmbeddingCountsBucket & { needsRepair: number } — which expresses intent more precisely and avoids an implicit coupling to the trace slot's type.

Generated by cloud-assistant via Open Code Review.

Memtensor-AI changed the base branch from dev-20260624-v2.0.22 to dev-v2.0.22 on July 2, 2026 07:28
…in addNeedsRepair

Return type was pinned to `EmbeddingMaintenanceStats["byKind"]["trace"]`,
but the helper is reused for all four bucket kinds (trace, policy,
world_model, skill). If any bucket shape ever diverged, TypeScript would
silently accept a structurally compatible but semantically wrong type on
the non-trace buckets.

Switch the return type to `EmbeddingCountsBucket & { needsRepair: number }`
— the shared bucket shape from `embedding_maintenance.ts` — which is what
the helper actually produces and does not implicitly couple to the trace
slot.
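
A minimal, self-contained sketch of the resulting shape, for readers without the diff open. The four count fields and `needsRepair` come from the review thread; the `RepairBucket` alias, the `Kind` union, and `buildByKind` are hypothetical names added for illustration.

```typescript
// Field names follow the review thread; alias and helper names are assumptions.
interface EmbeddingCountsBucket {
  totalSlots: number;
  ready: number;
  missing: number;
  dimMismatch: number;
}

// Shared output shape, independent of any single byKind entry.
type RepairBucket = EmbeddingCountsBucket & { needsRepair: number };

type Kind = "trace" | "policy" | "world_model" | "skill";

function addNeedsRepair(bucket: EmbeddingCountsBucket): RepairBucket {
  return { ...bucket, needsRepair: bucket.missing + bucket.dimMismatch };
}

// The same helper serves every kind, so its return type should not be
// borrowed from the trace slot alone.
function buildByKind(
  counts: Record<Kind, EmbeddingCountsBucket>,
): Record<Kind, RepairBucket> {
  return {
    trace: addNeedsRepair(counts.trace),
    policy: addNeedsRepair(counts.policy),
    world_model: addNeedsRepair(counts.world_model),
    skill: addNeedsRepair(counts.skill),
  };
}

const sample = addNeedsRepair({ totalSlots: 10, ready: 6, missing: 3, dimMismatch: 1 });
```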

Fixes Open Code Review finding on PR MemTensor#2038.
@Memtensor-AI

Collaborator Author

🤖 Open Code Review

Target: PR #2038
Task: 9c112d2b4811cbe6
Base: dev-20260624-v2.0.22
Head: bugfix/autodev-1929

🔍 OpenCodeReview found 14 issue(s) in this PR.


1. apps/memos-local-plugin/core/storage/repos/index.ts (L8-L14)

These symbols from ./embedding_maintenance.js are not used anywhere inside index.ts — they are only re-exported at the bottom of the file. The established convention throughout this barrel file (lines 77–93) is a direct export { … } from "…" form. Using an intermediate import here is redundant and breaks consistency.

Suggested fix: remove this import block entirely and replace the bottom export / export type blocks with direct re-exports:

export {
  embeddingMaintenanceCounts,
  inferStoredEmbeddingByteLen,
  FLOAT32_BYTES,
} from "./embedding_maintenance.js";
export type { EmbeddingCounts, EmbeddingCountsBucket } from "./embedding_maintenance.js";
💡 Suggested Change

Before:

import {
  embeddingMaintenanceCounts,
  inferStoredEmbeddingByteLen,
  FLOAT32_BYTES,
  type EmbeddingCounts,
  type EmbeddingCountsBucket,
} from "./embedding_maintenance.js";

After:

export {
  embeddingMaintenanceCounts,
  inferStoredEmbeddingByteLen,
  FLOAT32_BYTES,
} from "./embedding_maintenance.js";
export type { EmbeddingCounts, EmbeddingCountsBucket } from "./embedding_maintenance.js";

2. apps/memos-local-plugin/core/pipeline/memory-core.ts (L4078-L4083)

The inline structural type for bucket is byte-for-byte identical to the already-imported EmbeddingCountsBucket interface. Using the named type removes a silent structural duplicate and makes the contract self-documenting.

💡 Suggested Change

Before:

  function addNeedsRepair(bucket: {
    totalSlots: number;
    ready: number;
    missing: number;
    dimMismatch: number;
  }): EmbeddingCountsBucket & { needsRepair: number } {

After:

  function addNeedsRepair(
    bucket: EmbeddingCountsBucket,
  ): EmbeddingCountsBucket & { needsRepair: number } {

3. apps/memos-local-plugin/core/pipeline/memory-core.ts (L4078-L4083)

The inline structural type for bucket is byte-for-byte identical to the already-imported EmbeddingCountsBucket interface (totalSlots, ready, missing, dimMismatch). Using the named type removes this silent duplicate and keeps the structural contract in one place.

💡 Suggested Change

Before:

  function addNeedsRepair(bucket: {
    totalSlots: number;
    ready: number;
    missing: number;
    dimMismatch: number;
  }): EmbeddingCountsBucket & { needsRepair: number } {

After:

  function addNeedsRepair(
    bucket: EmbeddingCountsBucket,
  ): EmbeddingCountsBucket & { needsRepair: number } {

4. apps/memos-local-plugin/core/pipeline/memory-core.ts (L4078-L4083)

The inline structural type for bucket is byte-for-byte identical to the already-imported EmbeddingCountsBucket interface (totalSlots, ready, missing, dimMismatch). Replace it with the named type to avoid a silent duplicate and make the contract self-documenting:

function addNeedsRepair(
  bucket: EmbeddingCountsBucket,
): EmbeddingCountsBucket & { needsRepair: number } {

5. apps/memos-local-plugin/core/pipeline/memory-core.ts (L4078-L4083)

The inline structural type for bucket is byte-for-byte identical to the already-imported EmbeddingCountsBucket interface (totalSlots, ready, missing, dimMismatch). Replace it with the named type to eliminate this silent duplicate and make the contract self-documenting.

💡 Suggested Change

Before:

  function addNeedsRepair(bucket: {
    totalSlots: number;
    ready: number;
    missing: number;
    dimMismatch: number;
  }): EmbeddingCountsBucket & { needsRepair: number } {

After:

  function addNeedsRepair(
    bucket: EmbeddingCountsBucket,
  ): EmbeddingCountsBucket & { needsRepair: number } {

6. apps/memos-local-plugin/core/pipeline/memory-core.ts (L4078-L4083)

The inline structural type for bucket is byte-for-byte identical to the already-imported EmbeddingCountsBucket interface (totalSlots, ready, missing, dimMismatch). Reusing the named type eliminates this silent duplicate and keeps the structural contract in one place.

💡 Suggested Change

Before:

  function addNeedsRepair(bucket: {
    totalSlots: number;
    ready: number;
    missing: number;
    dimMismatch: number;
  }): EmbeddingCountsBucket & { needsRepair: number } {

After:

  function addNeedsRepair(
    bucket: EmbeddingCountsBucket,
  ): EmbeddingCountsBucket & { needsRepair: number } {

7. apps/memos-local-plugin/core/pipeline/memory-core.ts (L4078-L4083)

The inline structural type for bucket is byte-for-byte identical to the already-imported EmbeddingCountsBucket interface (totalSlots, ready, missing, dimMismatch). Reusing the named type eliminates this silent duplicate and keeps the structural contract in one place.

Suggested fix:

function addNeedsRepair(
  bucket: EmbeddingCountsBucket,
): EmbeddingCountsBucket & { needsRepair: number } {
  return { ...bucket, needsRepair: bucket.missing + bucket.dimMismatch };
}

8. apps/memos-local-plugin/core/pipeline/memory-core.ts (L4078-L4083)

The inline structural type for bucket is byte-for-byte identical to the already-imported EmbeddingCountsBucket interface (totalSlots, ready, missing, dimMismatch). Replace the inline literal with the named type to eliminate this silent duplicate and keep the structural contract in one place:

function addNeedsRepair(
  bucket: EmbeddingCountsBucket,
): EmbeddingCountsBucket & { needsRepair: number } {
  return { ...bucket, needsRepair: bucket.missing + bucket.dimMismatch };
}
💡 Suggested Change

Before:

  function addNeedsRepair(bucket: {
    totalSlots: number;
    ready: number;
    missing: number;
    dimMismatch: number;
  }): EmbeddingCountsBucket & { needsRepair: number } {

After:

  function addNeedsRepair(
    bucket: EmbeddingCountsBucket,
  ): EmbeddingCountsBucket & { needsRepair: number } {

9. apps/memos-local-plugin/core/pipeline/memory-core.ts (L4078-L4083)

The parameter type here is a 4-field inline structural literal (totalSlots, ready, missing, dimMismatch) that is byte-for-byte identical to the already-imported EmbeddingCountsBucket interface. Replace the inline literal with the named type to eliminate this silent duplicate and keep the structural contract in one place:

function addNeedsRepair(
  bucket: EmbeddingCountsBucket,
): EmbeddingCountsBucket & { needsRepair: number } {
  return { ...bucket, needsRepair: bucket.missing + bucket.dimMismatch };
}

10. apps/memos-local-plugin/core/pipeline/memory-core.ts (L4078-L4083)

The parameter type here is a 4-field inline structural literal (totalSlots, ready, missing, dimMismatch) that is byte-for-byte identical to the already-imported EmbeddingCountsBucket interface. Replace it with the named type to eliminate this silent duplicate and keep the structural contract in one place:

function addNeedsRepair(
  bucket: EmbeddingCountsBucket,
): EmbeddingCountsBucket & { needsRepair: number } {
  return { ...bucket, needsRepair: bucket.missing + bucket.dimMismatch };
}
💡 Suggested Change

Before:

  function addNeedsRepair(bucket: {
    totalSlots: number;
    ready: number;
    missing: number;
    dimMismatch: number;
  }): EmbeddingCountsBucket & { needsRepair: number } {

After:

  function addNeedsRepair(
    bucket: EmbeddingCountsBucket,
  ): EmbeddingCountsBucket & { needsRepair: number } {

11. apps/memos-local-plugin/core/pipeline/memory-core.ts (L4078-L4083)

The parameter type is a 4-field inline structural literal (totalSlots, ready, missing, dimMismatch) that is byte-for-byte identical to the already-imported EmbeddingCountsBucket interface. Replace the inline literal with the named type to eliminate this silent duplicate and keep the structural contract in one place.

Suggested fix:

function addNeedsRepair(
  bucket: EmbeddingCountsBucket,
): EmbeddingCountsBucket & { needsRepair: number } {
  return { ...bucket, needsRepair: bucket.missing + bucket.dimMismatch };
}
💡 Suggested Change

Before:

  function addNeedsRepair(bucket: {
    totalSlots: number;
    ready: number;
    missing: number;
    dimMismatch: number;
  }): EmbeddingCountsBucket & { needsRepair: number } {

After:

  function addNeedsRepair(
    bucket: EmbeddingCountsBucket,
  ): EmbeddingCountsBucket & { needsRepair: number } {

12. apps/memos-local-plugin/core/pipeline/memory-core.ts (L4078-L4083)

The parameter type for bucket is a 4-field inline structural literal (totalSlots, ready, missing, dimMismatch) that is byte-for-byte identical to the already-imported EmbeddingCountsBucket interface. Replace it with the named type to eliminate this silent duplicate and keep the structural contract in one place.

Suggested fix:

function addNeedsRepair(
  bucket: EmbeddingCountsBucket,
): EmbeddingCountsBucket & { needsRepair: number } {
  return { ...bucket, needsRepair: bucket.missing + bucket.dimMismatch };
}

13. apps/memos-local-plugin/core/pipeline/memory-core.ts (L4078-L4083)

The inline structural type for bucket ({ totalSlots: number; ready: number; missing: number; dimMismatch: number; }) is byte-for-byte identical to the already-imported EmbeddingCountsBucket interface. Replace the inline literal with the named type to eliminate this silent duplicate and keep the structural contract in one place.

Suggested fix:

function addNeedsRepair(
  bucket: EmbeddingCountsBucket,
): EmbeddingCountsBucket & { needsRepair: number } {
  return { ...bucket, needsRepair: bucket.missing + bucket.dimMismatch };
}
💡 Suggested Change

Before:

  function addNeedsRepair(bucket: {
    totalSlots: number;
    ready: number;
    missing: number;
    dimMismatch: number;
  }): EmbeddingCountsBucket & { needsRepair: number } {

After:

  function addNeedsRepair(
    bucket: EmbeddingCountsBucket,
  ): EmbeddingCountsBucket & { needsRepair: number } {

14. apps/memos-local-plugin/core/pipeline/memory-core.ts (L4078-L4083)

The inline structural type for bucket ({ totalSlots: number; ready: number; missing: number; dimMismatch: number; }) is byte-for-byte identical to the already-imported EmbeddingCountsBucket interface. Replace the inline literal with the named type to eliminate this silent duplicate and keep the structural contract in one place.

Suggested fix:

function addNeedsRepair(
  bucket: EmbeddingCountsBucket,
): EmbeddingCountsBucket & { needsRepair: number } {
  return { ...bucket, needsRepair: bucket.missing + bucket.dimMismatch };
}
💡 Suggested Change

Before:

  function addNeedsRepair(bucket: {
    totalSlots: number;
    ready: number;
    missing: number;
    dimMismatch: number;
  }): EmbeddingCountsBucket & { needsRepair: number } {

After:

  function addNeedsRepair(
    bucket: EmbeddingCountsBucket,
  ): EmbeddingCountsBucket & { needsRepair: number } {

Generated by cloud-assistant via Open Code Review.

autodev-bot and others added 2 commits July 2, 2026 15:58
…nd reuse EmbeddingCountsBucket

Follow-up to PR MemTensor#2038 Open Code Review:

1. `core/storage/repos/index.ts`: the `embedding_maintenance` symbols were
   imported and then re-exported in two separate statements, while every
   other barrel entry in this file uses the direct `export { … } from "…"`
   form (lines 77–93). Collapse to the same shape:
     export {
       embeddingMaintenanceCounts,
       inferStoredEmbeddingByteLen,
       FLOAT32_BYTES,
     } from "./embedding_maintenance.js";
     export type { EmbeddingCounts, EmbeddingCountsBucket }
       from "./embedding_maintenance.js";

2. `core/pipeline/memory-core.ts`: `addNeedsRepair()` declared its `bucket`
   parameter as an inline literal `{ totalSlots; ready; missing; dimMismatch }`
   that is byte-for-byte identical to the already-imported
   `EmbeddingCountsBucket`. Replace the inline literal with the named type
   so the structural contract lives in one place.

Behaviour unchanged — pure type / re-export tidy-up.
@Memtensor-AI

Collaborator Author

Automated Test Results: PASSED

Cloud test-engine rerun after resolving the dev-v2.0.22 merge conflict.

Run: tr-d0d77e53-48e
Scope: memos_local_plugin
Result: 34/34 tests passed
Command group: memos_local_plugin/unit
Duration: 29s

Local pre-push verification also passed: npm run build, plus focused vitest for embedding-maintenance and memory-core.

Status: merge conflict resolved; automated scope test passed. Manual code review is still required before merge.

@syzsunshine219 syzsunshine219 merged commit f1f31e5 into MemTensor:dev-v2.0.22 Jul 2, 2026
16 checks passed

Labels

ai-generated, bug (Something isn't working | functional anomaly)
