Cap event history scanned by StuckDetector by enyst · Pull Request #1829 · OpenHands/software-agent-sdk

enyst · 2026-01-26T15:27:49Z

Summary

Avoids materializing the full conversation event history when running stuck detection.

Changes

StuckDetector.is_stuck() now scans only a recent, fixed-size window of events instead of list(self.state.events).
Added a regression test ensuring stuck detection still triggers correctly even with a large backlog of older events.
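
As a rough illustration of the windowing idea described above, the sketch below reads only the tail of an event store instead of copying all of it. The helper name `recent_window` and the constant name `MAX_EVENTS_TO_SCAN` are hypothetical, not taken from the SDK; the size of 20 follows the commit title "Reduce stuck detector scan window to last 20 events". It assumes the event store supports `len()` and integer indexing.

```python
from collections.abc import Sequence
from typing import TypeVar

T = TypeVar("T")

# Hypothetical constant name; the PR only says the window became a module constant.
MAX_EVENTS_TO_SCAN = 20


def recent_window(events: Sequence[T], max_events: int = MAX_EVENTS_TO_SCAN) -> list[T]:
    """Return at most `max_events` trailing items without copying the full history."""
    total = len(events)
    start = max(0, total - max_events)
    # Index item by item so a lazily loaded store only reads the tail.
    return [events[i] for i in range(start, total)]
```

With this shape, a conversation with thousands of older events costs the same number of reads as one with exactly 20 events.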

Testing

uv run pytest -q tests/cross/test_stuck_detector.py
uv run pre-commit run -a

Addresses in part #1824


Agent Server images for this PR

• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

| Variant | Architectures | Base Image | Docs / Tags |
|---------|---------------|------------|-------------|
| java | amd64, arm64 | `eclipse-temurin:17-jdk` | Link |
| python | amd64, arm64 | `nikolaik/python-nodejs:python3.12-nodejs22` | Link |
| golang | amd64, arm64 | `golang:1.21-bookworm` | Link |

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:f5dda1e-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-f5dda1e-python \
  ghcr.io/openhands/agent-server:f5dda1e-python

All tags pushed for this build

ghcr.io/openhands/agent-server:f5dda1e-golang-amd64
ghcr.io/openhands/agent-server:f5dda1e-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:f5dda1e-golang-arm64
ghcr.io/openhands/agent-server:f5dda1e-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:f5dda1e-java-amd64
ghcr.io/openhands/agent-server:f5dda1e-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:f5dda1e-java-arm64
ghcr.io/openhands/agent-server:f5dda1e-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:f5dda1e-python-amd64
ghcr.io/openhands/agent-server:f5dda1e-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-amd64
ghcr.io/openhands/agent-server:f5dda1e-python-arm64
ghcr.io/openhands/agent-server:f5dda1e-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-arm64
ghcr.io/openhands/agent-server:f5dda1e-golang
ghcr.io/openhands/agent-server:f5dda1e-java
ghcr.io/openhands/agent-server:f5dda1e-python

About Multi-Architecture Support

Each variant tag (e.g., f5dda1e-python) is a multi-arch manifest supporting both amd64 and arm64
Docker automatically pulls the correct architecture for your platform
Individual architecture tags (e.g., f5dda1e-python-amd64) are also available if needed

Co-authored-by: openhands <openhands@all-hands.dev>

github-actions · 2026-01-26T15:31:03Z

Coverage Report

| File | Stmts | Miss | Cover | Missing |
|------|-------|------|-------|---------|
| openhands-sdk/openhands/sdk/conversation | | | | |
| stuck_detector.py | 125 | 20 | 84% | 131, 135–136, 207, 216–217, 220, 245, 249, 253, 258–260, 273, 282, 306–307, 313–314, 320 |
| TOTAL | 16429 | 4797 | 70% | |

Co-authored-by: openhands <openhands@all-hands.dev>

The reconcile() call after run completion was removed in PR #1820, but this caused a race condition where events emitted during the final moments of the run could be lost if the WebSocket didn't deliver them before run() returned.

This was observed in CI where test_events_not_lost_during_client_disconnection failed because the client only received 3 events while the REST API had 6 events - the ActionEvent(finish) and ObservationEvent(finish) were missing.

The fix restores the reconcile() call in _wait_for_run_completion() to ensure all events are captured after run completion. This is safe because reconcile() is idempotent and will only add events that are missing from the client's cache.

Fixes the flaky test failure in PR #1829.

Co-authored-by: openhands <openhands@all-hands.dev>

The reconcile() call after run completion was removed in PR #1820, but this caused a race condition where events emitted during the final moments of the run could be lost if the WebSocket didn't deliver them before run() returned.

This was observed in CI where test_events_not_lost_during_client_disconnection failed because the client only received 3-4 events while the REST API had 6 events - the ActionEvent(finish) and ObservationEvent(finish) were missing.

Reproduction:
- Inject a 3s delay in the WebSocket callback for finish events
- Run the conversation with a finish tool call
- Observe that without the reconcile() call, the client is missing events

The fix restores the reconcile() call in _wait_for_run_completion() to ensure all events are captured after run completion. This is safe because reconcile() is idempotent and will only add events that are missing from the client's cache.

Fixes the flaky test failure in PR #1829.

Co-authored-by: openhands <openhands@all-hands.dev>

This PR fixes the race condition where events emitted during the final moments of a run could be lost if the WebSocket didn't deliver them before run() returned.

## Root Cause

The race condition occurs when:
1. Server emits events (ActionEvent, ObservationEvent)
2. Client polls and sees 'finished' status
3. run() returns before WebSocket delivers those events

## Solution

Instead of using the expensive reconcile() which fetches ALL events, we introduce reconcile_recent() which only fetches events after the last known timestamp. This is much more efficient for long conversations.

The fix:
1. Added reconcile_recent() method to RemoteEventsList that uses the timestamp__gte filter to only fetch recent events
2. Call reconcile_recent() after run completion to catch any events that were missed by WebSocket

## Reproduction

Added test_event_loss_repro.py which reliably reproduces the race condition by injecting a 3s delay in the WebSocket callback for finish events. Without the fix, the test fails because the client is missing ActionEvent(finish) and ObservationEvent(finish).

## Testing

- All cross tests pass
- The reproduction test passes with the fix

Fixes the flaky test failure in PR #1829.

Co-authored-by: openhands <openhands@all-hands.dev>
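
The reconcile_recent() approach in the commit message above might look roughly like the following. This is a minimal sketch under stated assumptions, not the SDK's implementation: the `Event` dataclass, the standalone function signature, and the `fetch_since` callable (a stand-in for a REST call using a `timestamp__gte`-style filter) are all hypothetical. It also assumes timestamps are ISO 8601 strings that sort lexicographically and that the server returns events in ascending order.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Event:
    # Hypothetical stand-in for the SDK's event type.
    id: str
    timestamp: str  # ISO 8601, assumed to sort lexicographically


def reconcile_recent(
    cache: list[Event],
    fetch_since: Callable[[Optional[str]], list[Event]],
) -> int:
    """Append events the cache is missing, fetching only from the last known timestamp.

    `fetch_since(ts)` is assumed to return events with timestamp >= ts
    (or all events when ts is None). Returns the number of events added.
    """
    last_ts = cache[-1].timestamp if cache else None
    known_ids = {event.id for event in cache}
    added = 0
    for event in fetch_since(last_ts):
        # The >= filter returns the boundary event again, so de-duplicate by id.
        if event.id not in known_ids:
            cache.append(event)
            known_ids.add(event.id)
            added += 1
    return added
```

Because duplicates are skipped by id, calling the function a second time with no new server events adds nothing, which mirrors the idempotence the commit messages rely on.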

openhands-sdk/openhands/sdk/conversation/stuck_detector.py

Co-authored-by: openhands <openhands@all-hands.dev>

all-hands-bot

Overall, this is a solid optimization that prevents memory issues with large event histories. The implementation correctly limits the scan window, and the test coverage is thorough. However, there are a few important considerations around the behavioral change and documentation.

openhands-sdk/openhands/sdk/conversation/stuck_detector.py

all-hands-bot · 2026-01-26T17:21:31Z

openhands-sdk/openhands/sdk/conversation/stuck_detector.py

-
-        events = events[last_user_msg_index + 1 :]
+        if last_user_msg_index != -1:
+            events = events[last_user_msg_index + 1 :]


🟠 Important: This is a significant behavioral change from the original implementation. Previously, when no user message was found, the function would log a warning and return False (not stuck). Now it proceeds to check ALL events in the window for stuck patterns.

Implications:

Stuck detection can now trigger even without a recent user message in the 20-event window

The warning log is removed, reducing debugging visibility

If the actual last user message is beyond the 20-event window, the detector will analyze events that include previous user interactions

While test_is_stuck_without_recent_user_message_still_detects_loop validates the new behavior works, this change should be:

Explicitly mentioned in the PR description as a behavioral change

Documented in the docstring (e.g., "If no user message is found in the recent window, all events in the window are analyzed")

Consider whether removing the warning log is intentional or if it should be adjusted to log when operating without a recent user message boundary.

^^ This is mostly incorrect. The point of looking for a user message is to make sure the agent doesn't immediately stop because of an older stuck state. A user message is a cut-off point: we don't want to look before it.

If there is no user message, then there is no previous stuck state either. It's OK to look at all 20 events.

And if there was a previous stuck state, that's strange, but it's still OK to look now and trigger a stop.
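
To make the cut-off rule in this thread concrete, here is a small sketch of how a window could be trimmed at the last user message. The `Event` dataclass and the helper name `events_after_last_user_message` are hypothetical; the real detector works on SDK event types, and only the `last_user_msg_index != -1` guard comes from the diff above.

```python
from dataclasses import dataclass


@dataclass
class Event:
    # Hypothetical stand-in: the real SDK has richer event classes.
    source: str  # "user" or "agent"
    text: str


def events_after_last_user_message(window: list[Event]) -> list[Event]:
    """Keep only events after the most recent user message in the window.

    If the window contains no user message, the whole window is returned,
    matching the behavior discussed in the review thread.
    """
    last_user_msg_index = -1
    # Walk backwards so the first hit is the most recent user message.
    for i in range(len(window) - 1, -1, -1):
        if window[i].source == "user":
            last_user_msg_index = i
            break

    if last_user_msg_index != -1:
        return window[last_user_msg_index + 1 :]
    return window
```

The design choice is that a user message resets the analysis, while its absence simply means there is nothing to reset, so all events in the window are eligible.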

openhands-sdk/openhands/sdk/conversation/stuck_detector.py

tests/cross/test_stuck_detector.py

Co-authored-by: OpenHands Bot <contact@all-hands.dev>

Co-authored-by: openhands <openhands@all-hands.dev>

xingyaoww

LGTM, and it seems pretty straightforward to me!

openhands-ai · 2026-01-27T00:07:19Z

Looks like there are a few issues preventing this PR from being merged!

GitHub Actions are failing:
- Agent Server

If you'd like me to help, just leave a comment, like

@OpenHands please fix the failing actions on PR #1829 at branch `openhands/stuck-detector-cap-events`

Feel free to include any additional details that might help me get this PR into a better state.


Cap event history scanned by StuckDetector

4232375

Co-authored-by: openhands <openhands@all-hands.dev>

openhands-ai bot mentioned this pull request Jan 26, 2026

Proposal: don't use full events history in the OH ecosystem #1824

Open

Reduce stuck detector scan window to last 20 events

28f897a

Co-authored-by: openhands <openhands@all-hands.dev>

enyst mentioned this pull request Jan 26, 2026

refactor: remove redundant shutdown reconcile call #1820

Merged

enyst and others added 2 commits January 26, 2026 17:12

Merge branch 'main' into openhands/stuck-detector-cap-events

d62e157

Fix stuck detection when no user message in recent window

2d66223

Co-authored-by: openhands <openhands@all-hands.dev>

xingyaoww mentioned this pull request Jan 26, 2026

fix: wait for WebSocket terminal status to prevent event loss #1832

Merged

enyst commented Jan 26, 2026


openhands-sdk/openhands/sdk/conversation/stuck_detector.py

Make stuck detector scan window a module constant

ce4d514

Co-authored-by: openhands <openhands@all-hands.dev>

enyst marked this pull request as ready for review January 26, 2026 17:18

all-hands-bot reviewed Jan 26, 2026


enyst and others added 4 commits January 26, 2026 18:25

Update openhands-sdk/openhands/sdk/conversation/stuck_detector.py

31b8678

Co-authored-by: OpenHands Bot <contact@all-hands.dev>

Update openhands-sdk/openhands/sdk/conversation/stuck_detector.py

308352d

Co-authored-by: OpenHands Bot <contact@all-hands.dev>

test: cover stuck detection with <20 events

a96d7f4

Co-authored-by: openhands <openhands@all-hands.dev>

Merge branch 'main' into openhands/stuck-detector-cap-events

97d7835

xingyaoww approved these changes Jan 27, 2026


Merge branch 'main' into openhands/stuck-detector-cap-events

a847f56

enyst enabled auto-merge (squash) January 27, 2026 01:38

enyst merged commit 6c86d7d into main Jan 27, 2026
18 checks passed

enyst deleted the openhands/stuck-detector-cap-events branch January 27, 2026 01:40

enyst added the "invariants" label (the design invariants of the codebase) Jan 31, 2026

enyst merged 10 commits into main from openhands/stuck-detector-cap-events
