fix(memos-local-plugin): add maxTokens to LlmSchema & SkillEvolverSchema#1896
Open
chouti wants to merge 1 commit into
Reasoning models (deepseek-reasoner, o1*, gpt-5-thinking) consume hundreds of tokens on chain-of-thought. The hard-coded 1024 cap on LLM JSON reflection caused 73+ 'llm.json malformed' errors per episode, blocking episode closure and disconnecting the bridge.

- Add maxTokens: NumberInRange(4000, 1024, 32768) to LlmSchema
- Add maxTokens: NumberInRange(4000, 1024, 32768) to SkillEvolverSchema
- Add corresponding defaults in defaults.ts (backward compat)

TypeBox's Value.Default preserves user-supplied fields, so old configs without maxTokens get the 4000 default; new configs can override.
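The default-and-override behaviour described above can be illustrated without any dependencies. This is only a sketch of the intended semantics, not the plugin's code: `RangeField`, `MAX_TOKENS_FIELD`, and `resolveRangeField` are hypothetical names, and rejecting out-of-range values is an assumption, since the PR does not say whether `NumberInRange` rejects or clamps.

```typescript
// Hypothetical stand-in for a NumberInRange(default, min, max) field.
// Assumption: out-of-range values are rejected; the real helper may differ.
interface RangeField {
  readonly defaultValue: number;
  readonly min: number;
  readonly max: number;
}

// Values taken from the PR: default 4000, allowed range 1024..32768.
const MAX_TOKENS_FIELD: RangeField = { defaultValue: 4000, min: 1024, max: 32768 };

// Keep a user-supplied value; fall back to the default when the field is absent.
function resolveRangeField(field: RangeField, userValue?: number): number {
  if (userValue === undefined) {
    return field.defaultValue;
  }
  if (!Number.isFinite(userValue) || userValue < field.min || userValue > field.max) {
    throw new RangeError(`value ${userValue} is outside [${field.min}, ${field.max}]`);
  }
  return userValue;
}

// An old config without maxTokens picks up the default.
const legacyConfig: { maxTokens?: number } = {};
const legacyMaxTokens = resolveRangeField(MAX_TOKENS_FIELD, legacyConfig.maxTokens);

// A new config overrides it.
const tunedMaxTokens = resolveRangeField(MAX_TOKENS_FIELD, 8192);
```

Under these assumptions, `legacyMaxTokens` is 4000 and `tunedMaxTokens` is 8192, which is the backward-compatible behaviour the commit message relies on.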
Collaborator
Automated Test Results: FAILED Cloud test-engine rerun against
Failed cases:
Do not merge until this is fixed and cloud tests pass.
Problem
Reasoning models (e.g. `deepseek-reasoner`, `o1*`, `gpt-5-thinking`) consume hundreds of tokens on chain-of-thought before emitting JSON content. When `llm.json` reflection / `skillEvolver.crystallize` is called with the legacy hard-coded `max_tokens: 1024`, the response is truncated mid-JSON and the bridge logs `llm.json malformed` 73+ times in a single reflection cycle.

Result: episodes never close, `recoveryReason: "dirty_reward_rescore"` piles up, and the MemOS viewer reports `bridge.status: disconnected` even though the daemon is alive.

Reproduction
Start bridge with no override of `max_tokens`. After ~1 episode:

Direct curl confirms root cause: API returns `finish_reason: "length"` with truncated content.

Fix
Add `maxTokens` (default 4_000) to `LlmSchema` and `SkillEvolverSchema`. This is a reasonable budget for reasoning models while still clamping below OpenAI's 32k ceiling.

- `LlmSchema.maxTokens: NumberInRange(4000, 1024, 32768)`
- `SkillEvolverSchema.maxTokens: NumberInRange(4000, 1024, 32768)`
- `defaults.ts`: corresponding defaults for old configs

TypeBox's `Value.Default` already preserves user-supplied fields, so the change is backward compatible: old configs without `maxTokens` get the 4_000 default; new configs can override.

Verification
Before patch: `bridge.status: disconnected`
After patch (config unchanged, schema updated): `bridge.status: connected`

Files changed
- `apps/memos-local-plugin/core/config/schema.ts` (+15 lines, 2 field defs)
- `apps/memos-local-plugin/core/config/defaults.ts` (+2 lines, 2 default values)
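
The root cause reported under Reproduction (`finish_reason: "length"` with truncated content) can also be told apart from genuinely malformed JSON before parsing. The sketch below is not part of this PR. It assumes an OpenAI-compatible chat completion response shape, and `ChatCompletion`, `JsonResult`, `parseJsonCompletion`, and the `reward` payload are hypothetical names used only for illustration.

```typescript
// Minimal subset of an OpenAI-compatible chat completion response (assumed shape).
interface ChatCompletion {
  choices: Array<{
    finish_reason: string | null;
    message: { content: string | null };
  }>;
}

// Hypothetical result type separating truncation from other parse failures.
type JsonResult =
  | { kind: "ok"; value: unknown }
  | { kind: "truncated"; partial: string }
  | { kind: "malformed"; raw: string };

function parseJsonCompletion(response: ChatCompletion): JsonResult {
  const choice = response.choices[0];
  const raw = choice?.message.content ?? "";
  // finish_reason "length" means the token budget ran out before the model finished.
  if (choice?.finish_reason === "length") {
    return { kind: "truncated", partial: raw };
  }
  try {
    return { kind: "ok", value: JSON.parse(raw) };
  } catch {
    return { kind: "malformed", raw };
  }
}

// A response cut off mid-JSON by the token cap.
const truncated = parseJsonCompletion({
  choices: [{ finish_reason: "length", message: { content: '{"reward": 0.' } }],
});

// A response that finished normally.
const complete = parseJsonCompletion({
  choices: [{ finish_reason: "stop", message: { content: '{"reward": 0.5}' } }],
});
```

Reporting the `truncated` case separately would point at the token budget rather than at the model's JSON output, which is the distinction the curl check above makes by hand.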