fix: batch missing search embeddings #2041

Draft
RerankerGuo wants to merge 1 commit into MemTensor:main from RerankerGuo:fix/batch-missing-search-embeddings

@RerankerGuo

Description

Related Issue (Required): Fixes #1482

Batches missing embedding backfills during API search MMR deduplication.

SearchHandler._extract_embeddings() previously sent every memory without a cached embedding to the embedder in a single call. Providers such as DashScope text-embedding-v4 reject batches larger than 10, so a search result with many missing embeddings could fail during deduplication. This change caps each missing-embedding call at 10 documents and keeps the returned embeddings in the original memory order.
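
For reviewers who want the idea without opening the diff, the batching described above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: `fill_missing_embeddings`, `MAX_EMBED_BATCH`, and the `embed` callable are hypothetical names, and the real `SearchHandler._extract_embeddings()` works on memory objects rather than plain strings.

```python
from typing import Callable, Optional, Sequence

# Assumed cap, taken from the provider batch limit described above.
MAX_EMBED_BATCH = 10

Vector = list[float]


def fill_missing_embeddings(
    texts: Sequence[str],
    cached: Sequence[Optional[Vector]],
    embed: Callable[[list[str]], list[Vector]],
    batch_size: int = MAX_EMBED_BATCH,
) -> list[Optional[Vector]]:
    """Return one embedding per text, embedding only uncached texts in capped batches."""
    if len(texts) != len(cached):
        raise ValueError("texts and cached must have the same length")

    result: list[Optional[Vector]] = list(cached)
    # Positions that still need an embedding, in original order.
    missing = [i for i, vec in enumerate(result) if vec is None]

    for start in range(0, len(missing), batch_size):
        chunk = missing[start:start + batch_size]
        vectors = embed([texts[i] for i in chunk])
        if len(vectors) != len(chunk):
            raise ValueError("embedder returned an unexpected number of vectors")
        # Write each vector back to the slot of the text it came from,
        # so the output order matches the input order.
        for i, vec in zip(chunk, vectors):
            result[i] = vec

    return result
```

Tracking the original indices of the missing items, rather than appending results as they arrive, is what keeps the output aligned with the memory order regardless of how many batches are issued.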

No public API, request/response model, OpenAPI, dependency, or database schema changes.

Type of change

  • Bug fix (non-breaking change which fixes an issue)

How Has This Been Tested?

  • Unit Test
  • Test Script Or Test Steps (please provide)
  • Pipeline Automated API Test (please provide)

Commands run:

python3 -m py_compile src/memos/api/handlers/search_handler.py tests/api/test_search_handler_embedding_batches.py

Result: passed.

Also attempted:

PYTHONPATH=src python3 -m pytest tests/api/test_search_handler_embedding_batches.py -q
make format

These are blocked in the local environment because pytest and poetry are not installed. Direct test-function execution is also blocked because pydantic is not installed.

Checklist

  • I have performed a self-review of my own code
  • I have commented my code in hard-to-understand areas
  • I have added tests that prove my fix is effective or that my feature works
  • I have created related documentation issue/PR in MemOS-Docs (if applicable)
  • I have linked the issue to this PR (if applicable)
  • I have mentioned the person who will review this PR

@Memtensor-AI Memtensor-AI changed the base branch from main to dev-v2.0.22 July 2, 2026 10:57
@Memtensor-AI

Collaborator

Automated Test Results: QUEUED

This PR was retargeted to dev-v2.0.22; cloud test-engine rerun has been queued.

  • Run: tr-7c2dae84-562 on cloud test-engine 10010
  • Branch tested: refs/pull/2041/head

I will post the final pass/fail result when the run completes.

@Memtensor-AI

Collaborator

Automated Test Results: PASSED

Cloud test-engine rerun against dev-v2.0.22 completed successfully.

  • Run: tr-7c2dae84-562 on cloud test-engine 10010
  • memos_python_core/changed-repo-python: 1 passed, 0 failed, 0 skipped

Manual code review is still required before merge.

@CarltonXiang CarltonXiang deleted the branch MemTensor:main July 3, 2026 07:25
@syzsunshine219 syzsunshine219 reopened this Jul 3, 2026
@syzsunshine219 syzsunshine219 added the needs-audit Requires manual audit before merge label Jul 3, 2026
@syzsunshine219 syzsunshine219 changed the base branch from dev-v2.0.22 to main July 3, 2026 08:20
Labels

needs-audit Requires manual audit before merge

Development

Successfully merging this pull request may close these issues.

fix: Extracting too many missing_documents embeddings cause embedder error

4 participants