fix(opencode): break auto-compact loop when compaction makes no progress #29150
ZehuaWang wants to merge 3 commits into
Closes anomalyco#28543
When a model's configured context window is smaller than what the provider actually serves (e.g. GitHub Copilot's claude-opus-4.7 mapped at 144K in models.dev when the real ceiling is higher), every successful turn keeps reporting "overflowing" token counts. Auto-compaction then fires before each new prompt and again inside the processor on each finish-step, and we never escape it.
Add a stall detector that compares the reported token count between consecutive auto-compaction triggers in a single run. If a second auto-compaction would fire with the token count not having dropped by at least 5%, throw a typed ContextOverflowError instead of recreating the compaction task forever.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The following comment was made by an LLM and may be inaccurate: Based on my search, I found several related PRs addressing auto-compaction and context overflow issues: Potentially Related PRs:
Why they might be related: These PRs all address aspects of the auto-compaction loop, token overflow detection, and preventing infinite compaction cycles. PR #27919 appears most similar as it also fixes an "infinite compaction loop." You should verify whether #27919 was already attempted, what its status is, and how this PR (#29150) differs in its approach (using a stall detector vs. other mechanisms).
Review found the boundary check used >= instead of >, so a reduction that hit exactly the threshold (e.g. 200K → 190K) was flagged as stalled and threw ContextOverflowError, contradicting the PR description, which requires "at least 5% reduction" to escape the stall. The boundary test name said "returns false right at the 5% boundary" but the assertion expected true, which made the inconsistency obvious.
Change the comparison to strict `>`. Now a reduction of exactly the threshold (current equal to previous * (1 - threshold)) counts as progress; only reductions strictly smaller than the threshold trip the guard. Update the boundary test name, comment, and assertions to match.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
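The boundary case described in this commit can be illustrated with a minimal sketch. The function names and the `MIN_DROP_PERCENT` constant are hypothetical and are not taken from the PR's code; the sketch also uses integer percent arithmetic rather than a fractional threshold, an assumption made here so the exact-boundary comparison does not depend on floating-point rounding.

```typescript
// Hypothetical names for illustration; not the identifiers used in the PR.
const MIN_DROP_PERCENT = 5;

// Old comparison (>=): a drop of exactly 5% is still reported as a stall.
function stalledInclusive(previous: number, current: number): boolean {
  return current * 100 >= previous * (100 - MIN_DROP_PERCENT);
}

// New comparison (>): a drop of exactly 5% counts as progress.
function stalledStrict(previous: number, current: number): boolean {
  return current * 100 > previous * (100 - MIN_DROP_PERCENT);
}

console.log(stalledInclusive(200_000, 190_000)); // true  (boundary false positive)
console.log(stalledStrict(200_000, 190_000));    // false (exactly 5% is progress)
console.log(stalledStrict(200_000, 190_001));    // true  (just short of 5%)
```

With the strict comparison, only a token count that stays above 95% of the previous one is treated as a stall.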
Switching the percentage check from >= to > closed the boundary false-positive (200K -> 190K now counts as progress) but opened a zero-token hole: when the provider directly throws ContextOverflowError, SessionProcessor.halt sets needsCompaction without running step-finish, so handle.message.tokens stays at the zero-initialized values. The percentage check then evaluates 0 > 0 * 0.95 = false on every fire and the loop keeps recreating compactions.
Add a one-line guard so two consecutive zero-token compactions trip the stall detector. Unit test added alongside the boundary cases.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
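A minimal sketch of the zero-token guard layered on the strict percentage check might look like the following. `isCompactionStalled` and `MIN_DROP_PERCENT` are hypothetical names, and the integer percent arithmetic is an assumption of this sketch rather than the PR's actual expression.

```typescript
// Hypothetical names for illustration; not the identifiers used in the PR.
const MIN_DROP_PERCENT = 5;

function isCompactionStalled(previous: number, current: number): boolean {
  // Zero-token guard: if both consecutive triggers report 0 tokens, the
  // percentage check below would compare 0 > 0 and never fire, so treat
  // this case as a stall explicitly.
  if (previous === 0 && current === 0) return true;

  // Strict percentage check: stalled only when the count stays above
  // (100 - MIN_DROP_PERCENT)% of the previous count.
  return current * 100 > previous * (100 - MIN_DROP_PERCENT);
}

console.log(isCompactionStalled(0, 0));             // true
console.log(isCompactionStalled(200_000, 190_000)); // false
console.log(isCompactionStalled(200_000, 199_000)); // true
```

Without the first branch, a run in which the token count is never populated would pass the percentage check on every trigger.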
Issue for this PR
Closes #28543
Type of change
What does this PR do?
When a model's context limit in models.dev is smaller than what the provider actually serves, every turn reports overflow tokens and auto-compaction fires forever: once from the pre-model-call isOverflow check, then again from the processor's own finish-step overflow check inside handle.process.
Track the token count from the last auto-compaction in this runLoop invocation. If we'd fire again without the count having dropped by at least 5%, throw ContextOverflowError instead. Reset the tracker when overflow is no longer detected so healthy long sessions aren't affected.
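The tracking and reset behaviour described above can be sketched as a small per-run tracker. Everything here is an assumption for illustration: `CompactionStallTracker` and `observe` are hypothetical names, the `ContextOverflowError` class is a local stand-in for the typed error the PR throws, and the real change lives inside the run loop rather than in a separate class.

```typescript
// Local stand-in for the typed error named in the PR (hypothetical shape).
class ContextOverflowError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "ContextOverflowError";
  }
}

// Hypothetical tracker; one instance per run-loop invocation.
class CompactionStallTracker {
  private lastTokens: number | undefined = undefined;

  // Returns true when an auto-compaction should run for this iteration.
  // Throws when a repeat compaction would fire without enough progress.
  observe(overflowing: boolean, tokens: number): boolean {
    if (!overflowing) {
      // Healthy turn: forget the last count so long sessions are unaffected.
      this.lastTokens = undefined;
      return false;
    }
    const previous = this.lastTokens;
    if (previous !== undefined) {
      const bothZero = previous === 0 && tokens === 0;
      const dropBelowFivePercent = tokens * 100 > previous * 95;
      if (bothZero || dropBelowFivePercent) {
        throw new ContextOverflowError(
          `auto-compaction made no progress (${previous} -> ${tokens} tokens)`,
        );
      }
    }
    this.lastTokens = tokens;
    return true;
  }
}

const tracker = new CompactionStallTracker();
tracker.observe(true, 200_000);  // first trigger: compaction runs
tracker.observe(true, 150_000);  // 25% drop: compaction runs again
tracker.observe(false, 100_000); // no overflow: tracker resets
```

A caller would invoke `observe` wherever an overflow check decides whether to compact, and let the thrown error end the run instead of scheduling another compaction.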
How did you verify your code works?
Ran bun test test/session/compaction.test.ts test/session/prompt.test.ts; all pass. Added unit tests around the 5% boundary and one integration test that seeds an overflow assistant message, calls prompt.loop, and asserts it exits with ContextOverflowError.
Also built the binary and pointed it at a stub openai-compatible server that always reports overflow usage. Original 1.15.10 made 20 LLM calls before timing out at 90s. Patched build exits in ~4s with the typed error.
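For readers who want to reproduce the manual check, the core of such a stub is a response whose usage always exceeds the configured limit. The sketch below is an assumption, not the stub used for this PR: `overflowCompletionBody` is a hypothetical helper, it builds a non-streaming chat-completion body in the OpenAI response shape, and the model name and limit in the usage line are placeholders. A real stub might need to stream instead.

```typescript
// Hypothetical helper; not the stub server used to verify the PR.
interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
}

function overflowCompletionBody(model: string, contextLimit: number) {
  // Always report more prompt tokens than the configured context limit,
  // so the client's overflow check trips on every turn.
  const promptTokens = contextLimit + 1_000;
  const completionTokens = 10;
  const usage: Usage = {
    prompt_tokens: promptTokens,
    completion_tokens: completionTokens,
    total_tokens: promptTokens + completionTokens,
  };
  return {
    id: "chatcmpl-stub",
    object: "chat.completion",
    created: 0,
    model,
    choices: [
      {
        index: 0,
        message: { role: "assistant", content: "ok" },
        finish_reason: "stop",
      },
    ],
    usage,
  };
}

// Placeholder model name and limit, for illustration only.
const body = overflowCompletionBody("stub-model", 144_000);
console.log(JSON.stringify(body.usage));
```

Serving this body from any HTTP handler on the chat-completions path would give a client that trusts reported usage a permanent overflow signal.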
Checklist