fix(adapter): fire session.close in daemon thread to unblock event loop by chiefmojo · Pull Request #1953 · MemTensor/MemOS

chiefmojo · 2026-06-21T23:30:28Z

Problem

The Hermes adapter calls session.close synchronously on the asyncio event loop thread: on_session_end() issues bridge.request("session.close") inline, which performs a blocking urlopen(). When the bridge shuts down or becomes unresponsive, the HTTP call blocks, stalling Discord heartbeats and causing a disconnect.

Fix

Move session.close off the event loop by firing it from a daemon thread. The response is unused and errors are already suppressed, so the call is fire-and-forget: asyncio heartbeats continue while the request is in flight, and a 5-second timeout caps the wait if the bridge is dead.

docs(memos-local-plugin): clarify install path and stale dir names (MemTensor#1540)

The README's 'Quick start' section told users to use install.sh instead of npm install, but the warning was buried and users still tried 'npm install -g @memtensor/memos-local-plugin' first. The reporter in MemTensor#1540 encountered this on a Hermes deployment.

This change:
- Promotes the 'do not run npm install -g' notice to a prominent IMPORTANT callout explaining why global install is wrong (no agent-home deploy, no config.yaml, no bridge/viewer) and that the tarball intentionally ships built artifacts only.
- Adds a Troubleshooting subsection covering the two specific symptoms in the bug report: the 'package not found' misread, and the stale web/ and site/ directory names (web/ is now viewer/, site/ was removed by commit 26e7e3d).
- Mentions install.ps1 for Windows alongside install.sh.
- CHANGELOG: record the docs fix and reference MemTensor#1540.

Documentation-only change; no code or runtime behavior touched.

Co-authored-by: MemOS AutoDev <autodev@memtensor.ai>
Co-authored-by: Matthew <heimixiaozhuang@zju.edu.cn>

…_() got an unexpected keyword a (MemTensor#1889)

fix: remove invalid chunker parameter from SystemParser test instantiation

- SystemParser.__init__() signature changed to (embedder, llm=None)
- Test was still passing chunker=None causing TypeError
- Fixes all 5 failing tests in test_system_parser.py

Fixes MemTensor#1888

Co-authored-by: MemOS AutoDev <autodev@memos.ai>
Co-authored-by: Matthew <heimixiaozhuang@zju.edu.cn>

…tributeError when given None (MemTensor#1884)

* test: add comprehensive tests for clean_json_response (issue MemTensor#1525)

- Add test suite in tests/mem_os/test_format_utils.py
- Cover None input ValueError with diagnostic message
- Cover markdown removal, whitespace stripping, edge cases
- Verify fix for AttributeError when LLM returns None

* style: format clean_json_response tests

---------

Co-authored-by: MemOS AutoDev <autodev@memos.ai>
Co-authored-by: Matthew <heimixiaozhuang@zju.edu.cn>

…date_cube_access - fails for ev (MemTensor#1903)

fix: validate current user not target in share_cube_with_user (MemTensor#1901)

share_cube_with_user(cube_id, target_user_id) called _validate_cube_access(cube_id, target_user_id), but the validator signature is (user_id, cube_id). The cube_id therefore landed in the user_id slot and _validate_user_exists raised "User '<cube_id>' does not exist or is inactive" for every well-formed call, making the API unusable.

The in-code comment "Validate current user has access to this cube" already documented the correct intent: the sharing user (self.user_id) must have access to the cube being shared, not the target. Switch the call to self._validate_cube_access(self.user_id, cube_id). The target user's existence is independently checked on the next line via validate_user(target_user_id), so that path is unchanged.

Add regression tests in tests/mem_os/test_memos_core.py that pin down:
- validate_user_cube_access is consulted with (self.user_id, cube_id),
- add_user_to_cube is called with (target_user_id, cube_id) on success,
- a missing target raises "Target user '<id>' does not exist".

Closes MemTensor#1901

Co-authored-by: MemOS AutoDev Bot <autodev@memtensor.local>
Co-authored-by: Matthew <heimixiaozhuang@zju.edu.cn>
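
The argument swap described in this commit can be reproduced in a few lines. The sketch below is not the MemOS source: `FakeUserManager` and `MiniMOS` are hypothetical stand-ins, the access-denied message is invented for illustration, and only the method names and the two quoted error messages come from the commit text.

```python
class FakeUserManager:
    """Hypothetical in-memory stand-in for the real user manager."""

    def __init__(self, users, access):
        self.users = set(users)
        self.access = set(access)  # set of (user_id, cube_id) pairs

    def validate_user(self, user_id):
        return user_id in self.users

    def validate_user_cube_access(self, user_id, cube_id):
        return (user_id, cube_id) in self.access

    def add_user_to_cube(self, user_id, cube_id):
        self.access.add((user_id, cube_id))
        return True


class MiniMOS:
    """Hypothetical reduction of the class that owns share_cube_with_user."""

    def __init__(self, user_id, user_manager):
        self.user_id = user_id
        self.user_manager = user_manager

    def _validate_user_exists(self, user_id):
        if not self.user_manager.validate_user(user_id):
            raise ValueError(f"User '{user_id}' does not exist or is inactive")

    def _validate_cube_access(self, user_id, cube_id):
        self._validate_user_exists(user_id)
        if not self.user_manager.validate_user_cube_access(user_id, cube_id):
            # Message wording here is an assumption, not taken from MemOS.
            raise ValueError(f"User '{user_id}' cannot access cube '{cube_id}'")

    def share_cube_buggy(self, cube_id, target_user_id):
        # Bug: cube_id lands in the user_id slot of the validator.
        self._validate_cube_access(cube_id, target_user_id)
        return self.user_manager.add_user_to_cube(target_user_id, cube_id)

    def share_cube_with_user(self, cube_id, target_user_id):
        # Fix: validate the sharing user (self.user_id), not the target.
        self._validate_cube_access(self.user_id, cube_id)
        if not self.user_manager.validate_user(target_user_id):
            raise ValueError(f"Target user '{target_user_id}' does not exist")
        return self.user_manager.add_user_to_cube(target_user_id, cube_id)


manager = FakeUserManager(users={"alice", "bob"}, access={("alice", "cube-1")})
mos = MiniMOS("alice", manager)

try:
    mos.share_cube_buggy("cube-1", "bob")
    buggy_error = None
except ValueError as exc:
    buggy_error = str(exc)

shared = mos.share_cube_with_user("cube-1", "bob")
```

With the swapped call, a well-formed request fails because the cube ID is looked up as a user. With the corrected call, the same request succeeds and the target gains access.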

on_session_end() called bridge.request("session.close") inline with a 30 s blocking urlopen(). gateway/run.py calls this synchronously from _handle_reset_command (an async fn), so the blocking I/O ran on the asyncio event loop thread, preventing Discord heartbeats from firing and causing forced reconnection after 10–30 s.

The session.close response is unused and errors are already suppressed, so the call is semantically fire-and-forget. Moving it to a daemon thread is the correct fix: the event loop is never blocked, the request still goes out, and a 5 s timeout keeps it bounded if the bridge is dead.

Reproducer: Violet 2026-06-12 09:54 (agent.log lines 3629–3791).
Spec: ~/specs/memos-bridge-blocking-shutdown-spec.md
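
A minimal sketch of the fire-and-forget pattern this commit describes, assuming nothing about the adapter's actual code: `fire_and_forget` and `fake_bridge_request` are hypothetical names, and the 0.5 s sleep stands in for a slow or dead bridge. The 5 s value mirrors the timeout quoted above and is simply passed through to the stand-in.

```python
import threading
import time


def fire_and_forget(fn, *args, **kwargs):
    """Run fn in a daemon thread and return immediately to the caller."""

    def _runner():
        try:
            fn(*args, **kwargs)
        except Exception:
            # The result is unused, so failures are deliberately swallowed.
            pass

    thread = threading.Thread(target=_runner, name="session-close", daemon=True)
    thread.start()
    return thread


def fake_bridge_request(method, timeout, log):
    """Hypothetical stand-in for a blocking bridge call."""
    time.sleep(0.5)  # simulate a slow bridge
    log.append((method, timeout))


log = []
start = time.monotonic()
thread = fire_and_forget(fake_bridge_request, "session.close", 5.0, log)
elapsed = time.monotonic() - start  # measured before the request completes

# Joining is only for this demo; a real caller would return without waiting.
thread.join(timeout=5)
```

Because the thread is a daemon, it does not keep the process alive at shutdown, which fits a request whose response nobody reads. The trade-off is that an in-flight request may be cut off if the interpreter exits first.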

…status checks can trigger paid (MemTensor#1899)

* Fix MemTensor#1897: fix(memos-local-plugin): add LLM circuit breaker for terminal provider errors

Issue MemTensor#1897 reported ~12,900 paid LLM requests in 24 h on Hermes against a DeepSeek key with insufficient balance. The local `system_model_status` row count (12,900) closely tracked the provider-side `request_count` (11,344) for the same billing window.

The naming is misleading: `system_model_status` is not a health probe; it is the audit row written once per LLM call (ok / fallback / error) inside `core/llm/client.ts`. With no circuit breaker, every pipeline subscriber (capture / session-relation / reward / L2 / L3 / skill / retrieval LLM filter / world-model) kept firing on every turn / closed episode / induction, generating one paid request each.

Add a per-`LlmClient` circuit breaker:
- Trips on terminal errors: HTTP 401/402/403 or messages containing `insufficient balance` / `invalid api key` / `unauthorized` / `account suspended` / `billing`.
- Open: short-circuits subsequent calls inside the facade without contacting the provider. Throws `MemosError(LLM_UNAVAILABLE)` with `details.circuitOpen=true` so existing catch blocks still work.
- Half-open after cool-down (default 5 min, configurable, min 30 s): next call probes the provider; success closes the breaker, terminal failure re-opens it for another cool-down.
- Host fallback rescues a call without tripping the breaker; fallback exists precisely to keep going when the primary is down.
- Coalesces `system_model_status="circuit_open"` audit rows to at most one per ~25 s while the breaker stays open, so we don't replace paid spam with audit-row spam.
- Exposes `circuitOpen` / `circuitOpenUntil` / `circuitOpenedReason` via `LlmClientStats` for the Overview viewer card.
- Enabled by default; legacy behaviour available via `circuitBreaker.enabled = false`.

Tests: 9 new vitest cases under `tests/unit/llm/client.test.ts` covering trip on 402, trip on "insufficient balance" message, no trip on generic transient, coalescing, half-open close on success, host-fallback rescues without trip, disabled mode, stats fields, and re-open on terminal probe failure. All 59 LLM and 28 pipeline tests pass; `tsc --noEmit` clean.

Out of scope (tracked separately): 429 `Retry-After` handling (issue MemTensor#1620), per-tool rate limits, daily budget caps.

* Fix LLM breaker with host fallback

---------

Co-authored-by: autodev-bot <autodev@memtensor.local>
Co-authored-by: Jiang <33757498+hijzy@users.noreply.github.com>
Co-authored-by: GU TIANCHUN <96930846+TianchunGu@users.noreply.github.com>
Co-authored-by: Dubberman <48425266+whipser030@users.noreply.github.com>
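
The trip / open / half-open cycle listed above can be sketched roughly as follows. This is a Python illustration only; the real breaker lives in the TypeScript `LlmClient` and is not reproduced here. `ProviderError`, `CircuitOpenError`, `CircuitBreaker`, and `FakeClock` are hypothetical names, and host fallback and audit-row coalescing are left out.

```python
import time

TERMINAL_STATUS = {401, 402, 403}
TERMINAL_MARKERS = (
    "insufficient balance", "invalid api key", "unauthorized",
    "account suspended", "billing",
)


class ProviderError(Exception):
    """Hypothetical provider failure carrying an optional HTTP status."""

    def __init__(self, message, status=None):
        super().__init__(message)
        self.status = status


class CircuitOpenError(Exception):
    """Raised while the breaker is open; the provider is not contacted."""


def is_terminal(err):
    if getattr(err, "status", None) in TERMINAL_STATUS:
        return True
    text = str(err).lower()
    return any(marker in text for marker in TERMINAL_MARKERS)


class CircuitBreaker:
    def __init__(self, cooldown_s=300.0, clock=time.monotonic):
        self.cooldown_s = max(cooldown_s, 30.0)  # 30 s floor, as in the commit
        self.clock = clock
        self.open_until = None
        self.opened_reason = None

    def call(self, fn):
        if self.open_until is not None and self.clock() < self.open_until:
            raise CircuitOpenError(self.opened_reason)
        # Closed, or half-open after the cool-down: this call is the probe.
        try:
            result = fn()
        except Exception as err:
            if is_terminal(err):
                self.open_until = self.clock() + self.cooldown_s
                self.opened_reason = str(err)
            raise
        self.open_until = None  # success closes the breaker
        self.opened_reason = None
        return result


class FakeClock:
    """Manually advanced clock so the demo needs no real waiting."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now


clock = FakeClock()
breaker = CircuitBreaker(cooldown_s=60.0, clock=clock)
provider_calls = []


def broke():
    provider_calls.append("broke")
    raise ProviderError("Insufficient Balance", status=402)


def healthy():
    provider_calls.append("healthy")
    return "ok"


outcomes = []
for fn in (broke, healthy):  # first call trips, second is short-circuited
    try:
        outcomes.append(breaker.call(fn))
    except (ProviderError, CircuitOpenError) as err:
        outcomes.append(type(err).__name__)

clock.now = 61.0  # past the cool-down: the next call probes the provider
outcomes.append(breaker.call(healthy))
```

The point of the design is that the second call never reaches the provider, so a key with no balance stops generating paid requests until the cool-down elapses.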

Memtensor-AI · 2026-06-30T12:05:56Z

⚠️ Automated Test Results: ENV ISSUE

The test environment encountered an issue that requires manual attention.

Details: Executor error: Command failed: git clone --depth 1 --branch fix/session-close-blocking git@github.com:MemTensor/MemOS.git /data/test-workspaces/8473675036a7bc6e/repo
Cloning into '/data/test-workspaces/8473675036a7bc6e/repo'...
warning: Could not find remote branch fix/session-close-blocking to clone.
fatal: Remote branch fix/session-close-blocking not found in upstream origin
Branch: fix/session-close-blocking

Memtensor-AI · 2026-07-02T08:53:23Z

Cloud AutoDev retest on dev-v2.0.22: PASSED.

Run: tr-e87049f5-b07 on test-engine-v4.
Scope: memos_local_plugin changed-python-source validation for the Hermes adapter file.
Result: 2/2 passed.

Ignore the earlier full-plugin unit test failure for this PR; it hit the known baseline storage migrator failure before scope narrowing was fixed.

Memtensor-AI · 2026-07-02T11:18:13Z

Automated Test Results: PASSED

Cloud test-engine rerun against dev-v2.0.22 completed successfully.

Run: tr-2e8cdec0-160 on cloud test-engine 10012
memos_local_plugin/changed-python-source: 2 passed, 0 failed, 0 skipped

Manual code review is still required before merge.

Memtensor-AI and others added 5 commits June 14, 2026 17:24

chiefmojo mentioned this pull request Jun 21, 2026

fix(bridge): add 20s timeout guard to core.shutdown() to prevent orphaned processes #1799

Open

Memtensor-AI and others added 2 commits June 30, 2026 19:56

Merge branch 'dev-20260604-v2.0.19' into fix/session-close-blocking

5fc9516

Memtensor-AI changed the base branch from dev-20260604-v2.0.19 to dev-v2.0.22 July 1, 2026 13:15

CarltonXiang deleted the branch MemTensor:main July 3, 2026 07:25

CarltonXiang closed this Jul 3, 2026

syzsunshine219 reopened this Jul 3, 2026

syzsunshine219 added the needs-audit Requires manual audit before merge label Jul 3, 2026

syzsunshine219 changed the base branch from dev-v2.0.22 to main July 3, 2026 08:21

chiefmojo wants to merge 7 commits into MemTensor:main from chiefmojo:fix/session-close-blocking

5 participants
