feat(reward): sentinel-based exchange-count bypass for cron episodes #1848
Open
chiefmojo wants to merge 6 commits into
Cron jobs always produce exactly 1 user↔agent exchange (the task prompt plus one reply), so minExchangesForCompletion: 2 zero-scores every cron episode before content is even evaluated. This starves L2 induction of signal after the bridge stabilises.

Adds `cronSentinels` to RewardConfig (schema, defaults, types). When the first user turn starts with a sentinel prefix, check 1 (exchange count) is skipped; content/triviality checks still apply. Default sentinel covers the Hermes cron prompt.

The `snapshot.meta?.initialUserText` fallback handles episodes scored during recovery when turns aren't materialised. If the field is absent the episode falls back to the old skip behaviour, so there are no false positives.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
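A minimal sketch of the first-turn check with the recovery fallback described in this commit message. Only `cronSentinels` and `snapshot.meta?.initialUserText` come from the PR; the `Turn` and `Snapshot` shapes, the function names, and the `"[cron]"` sentinel used in the usage note are assumptions made for illustration and may not match the actual code.

```typescript
// Hypothetical shapes: the real snapshot type is not shown in this PR.
interface Turn {
  role: "user" | "agent";
  text: string;
}

interface Snapshot {
  turns?: Turn[];
  meta?: { initialUserText?: string };
}

// Prefer the materialised first user turn; fall back to the text recorded in
// meta when turns are not available (e.g. during recovery).
function firstUserText(snapshot: Snapshot): string | undefined {
  const fromTurns = snapshot.turns?.find((t) => t.role === "user")?.text;
  return fromTurns ?? snapshot.meta?.initialUserText;
}

// True only when a first user text exists and starts with a sentinel.
// With no text at all, the caller keeps the old exchange-count behaviour.
function startsWithCronSentinel(snapshot: Snapshot, sentinels: string[]): boolean {
  const text = firstUserText(snapshot);
  if (text === undefined) return false;
  // Assumption: empty sentinels are ignored, since "".startsWith("") is always true.
  return sentinels.some((s) => s.length > 0 && text.startsWith(s));
}
```

With a hypothetical sentinel `"[cron]"`, a snapshot carrying only `meta.initialUserText = "[cron] nightly report"` would match, while a snapshot with neither turns nor meta would not.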
docs(memos-local-plugin): clarify install path and stale dir names (MemTensor#1540)

The README's 'Quick start' section told users to use install.sh instead of npm install, but the warning was buried and users still tried 'npm install -g @memtensor/memos-local-plugin' first. The reporter in MemTensor#1540 encountered this on a Hermes deployment.

This change:
- Promotes the 'do not run npm install -g' notice to a prominent IMPORTANT callout explaining why global install is wrong (no agent-home deploy, no config.yaml, no bridge/viewer) and that the tarball intentionally ships built artifacts only.
- Adds a Troubleshooting subsection covering the two specific symptoms in the bug report: the 'package not found' misread, and the stale web/ and site/ directory names (web/ is now viewer/, site/ was removed by commit 26e7e3d).
- Mentions install.ps1 for Windows alongside install.sh.
- CHANGELOG: record the docs fix and reference MemTensor#1540.

Documentation-only change; no code or runtime behavior touched.

Co-authored-by: MemOS AutoDev <autodev@memtensor.ai>
Co-authored-by: Matthew <heimixiaozhuang@zju.edu.cn>
…_() got an unexpected keyword a (MemTensor#1889)

fix: remove invalid chunker parameter from SystemParser test instantiation

- SystemParser.__init__() signature changed to (embedder, llm=None)
- Test was still passing chunker=None causing TypeError
- Fixes all 5 failing tests in test_system_parser.py

Fixes MemTensor#1888

Co-authored-by: MemOS AutoDev <autodev@memos.ai>
Co-authored-by: Matthew <heimixiaozhuang@zju.edu.cn>
…tributeError when given None (MemTensor#1884)

* test: add comprehensive tests for clean_json_response (issue MemTensor#1525)

- Add test suite in tests/mem_os/test_format_utils.py
- Cover None input ValueError with diagnostic message
- Cover markdown removal, whitespace stripping, edge cases
- Verify fix for AttributeError when LLM returns None

* style: format clean_json_response tests

---------

Co-authored-by: MemOS AutoDev <autodev@memos.ai>
Co-authored-by: Matthew <heimixiaozhuang@zju.edu.cn>
…date_cube_access — fails for ev (MemTensor#1903)

fix: validate current user not target in share_cube_with_user (MemTensor#1901)

share_cube_with_user(cube_id, target_user_id) called _validate_cube_access(cube_id, target_user_id), but the validator signature is (user_id, cube_id). The cube_id therefore landed in the user_id slot and _validate_user_exists raised "User '<cube_id>' does not exist or is inactive" for every well-formed call, making the API unusable.

The in-code comment "Validate current user has access to this cube" already documented the correct intent: the sharing user (self.user_id) must have access to the cube being shared, not the target. Switch the call to self._validate_cube_access(self.user_id, cube_id). The target user's existence is independently checked on the next line via validate_user(target_user_id), so that path is unchanged.

Add regression tests in tests/mem_os/test_memos_core.py that pin down:
- validate_user_cube_access is consulted with (self.user_id, cube_id),
- add_user_to_cube is called with (target_user_id, cube_id) on success,
- a missing target raises "Target user '<id>' does not exist".

Closes MemTensor#1901

Co-authored-by: MemOS AutoDev Bot <autodev@memtensor.local>
Co-authored-by: Matthew <heimixiaozhuang@zju.edu.cn>
Collaborator
Automated Test Results: PASSED
Cloud test-engine rerun against
Manual code review is still required before merge.
Summary
Adds a config-driven bypass so cron-initiated episodes can be scored even when they fall below the `minExchangesForCompletion` threshold.

Problem
Scheduled agent jobs produce single-exchange episodes (the cron prompt + agent response). With the default `minExchangesForCompletion: 1` floor these score fine, but tighter settings, or multi-turn flows that cron fires as one composite turn, hit the "too few exchanges" skip condition. The cron session produces real work worth scoring, but the exchange-count gate silently discards it.

Solution
`RewardConfig.cronSentinels: string[]` is an array of prefix strings. If any user message in the episode starts with one of these strings, the exchange-count check (gate 1) is bypassed. All other quality gates (content length, triviality, tool-heavy ratio) remain active.

The default is an empty array (`[]`), so behavior is unchanged for existing deployments. Operators who run scheduled agent jobs add their sentinel string to `config.yaml`.

Design notes
…`message.startsWith(sentinel)`).

Test plan
- `cronSentinels` array: no change to existing behavior

🤖 Generated with Claude Code
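The bypass described under Solution can be sketched roughly as follows. This is an illustrative reading of the description, not the code in this PR: only `RewardConfig`, `cronSentinels`, and `minExchangesForCompletion` are named in the text, while the `Turn` shape, the helper names, and the way exchanges are counted are assumptions.

```typescript
// Only the two field names below appear in the PR; the rest is hypothetical.
interface RewardConfig {
  minExchangesForCompletion: number;
  cronSentinels: string[];
}

interface Turn {
  role: "user" | "agent";
  text: string;
}

// True when any user message starts with one of the configured sentinels.
function matchesCronSentinel(turns: Turn[], sentinels: string[]): boolean {
  return turns.some(
    (t) =>
      t.role === "user" &&
      // Assumption: skip empty sentinels, which would otherwise match everything.
      sentinels.some((s) => s.length > 0 && t.text.startsWith(s)),
  );
}

// Gate 1 only: true when the episode is skipped for having too few exchanges.
// Other gates (content length, triviality, tool-heavy ratio) would still run.
function skipsForTooFewExchanges(turns: Turn[], config: RewardConfig): boolean {
  if (matchesCronSentinel(turns, config.cronSentinels)) return false;
  // Assumption: one exchange is one user turn paired with one agent turn.
  const userTurns = turns.filter((t) => t.role === "user").length;
  const agentTurns = turns.filter((t) => t.role === "agent").length;
  return Math.min(userTurns, agentTurns) < config.minExchangesForCompletion;
}
```

With an empty `cronSentinels` array the function reduces to the plain exchange-count comparison, which matches the stated goal of leaving existing deployments unchanged.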