Save complete trajectory in presence of history truncation #6751
There is something I don't understand about this:
Cc: @csmith49 To elaborate a bit:
TBH this is fragile code, it doesn't feel great to rely on The main reason for Re: stuck
I'm still working through the control-flow of
🤔 That's a good question! 😅 ...Shouldn't change anything, because the B controller closes, and the B state.history may be there, but A controller doesn't read it. It doesn't seem to care about B's
A's controller has A's agent
Ah interesting... so the get_history() call is supposed to work? But it isn't... (there's no delegation involved in my experiment). FYI my commit was aae611a, which is built on top of e487008. Everything between id 0 and 77 is truncated/lost in the final history.
Oh wait I see the problem!
25739af to e07c89b
Yeah I agree, especially fragile if we want to export/save trajectory before closing the controller, or, say, in the middle of a conversation.
Yeah understood, it's hard to deal with legacy code, especially without enough test coverage.
@@ -193,6 +193,8 @@ def on_event(event: Event):
    # NOTE: the saved state does not include delegates events
    end_state.save_to_session(event_stream.sid, event_stream.file_store)

    await controller.close()
You are entirely correct
This seems to be messing with the event loop... causing test failures. Will investigate later.
I was just coming back to this PR after I saw nothing calls it - the agent_session does (but then that's UI only). Some recent-ish refactoring must have broken this. Weird though. Thank you for the investigation!
End-user friendly description of the problem this fixes or functionality that this introduces
Give a summary of what the PR does, explaining any non-trivial design decisions
#4977 introduced a truncation feature that handles long-context errors by cutting the history. Because trajectory saving depends on "controller.state.history", saved trajectories end up containing only part of the history.
The direct cause is that we do not close the controller properly in headless mode. Had we done so, the history would have been recovered and the trajectory would have been complete. This PR fixes that.
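The failure mode described above can be sketched with a toy model. Everything here is hypothetical: `ToyController`, `ToyState`, `truncate_history` and `run_headless` are illustrative names, not the OpenHands API, and the assumption that closing rebuilds the history from the full event stream is taken only from the description in this PR.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class ToyState:
    # Stand-in for the controller state; `history` is what a trajectory export reads.
    history: list[int] = field(default_factory=list)


class ToyController:
    """Hypothetical controller, not the real AgentController."""

    def __init__(self, event_stream: list[int]) -> None:
        self.event_stream = event_stream  # full record of event ids
        self.state = ToyState(history=list(event_stream))

    def truncate_history(self) -> None:
        # Mimic long-context handling: keep only the newer half of the history.
        keep_from = len(self.state.history) // 2
        self.state.history = self.state.history[keep_from:]

    async def close(self) -> None:
        # Assumed behaviour: closing restores the history from the full event stream.
        self.state.history = list(self.event_stream)


async def run_headless(close_controller: bool) -> list[int]:
    controller = ToyController(event_stream=list(range(100)))
    controller.truncate_history()
    if close_controller:
        await controller.close()
    return controller.state.history  # what would be saved as the trajectory


partial = asyncio.run(run_headless(close_controller=False))
complete = asyncio.run(run_headless(close_controller=True))
print(len(partial), len(complete))  # prints "50 100"
```

In this sketch, skipping `close()` leaves only the truncated half of the events in the saved history, which is the partial-trajectory symptom reported here.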
Note that state.history is used in many places (including, but not limited to, a few benchmark evaluation harnesses), and we might want to evaluate whether each of them needs the truncated history or the full history. Maybe we need better names for truncated/full history.
Also note that the stuck detector depends on state.history, so it could malfunction if a loop occurs around the point where the history is truncated.
Anyway, this PR fixes the partial trajectory issue and includes a unit test, which could serve as a testbed for any future renaming/refactoring.
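The stuck-detector concern can be illustrated with a minimal sketch. The detector below is a made-up stand-in (the names `trailing_repeats` and `is_stuck`, the repeat-counting rule, and the threshold of 4 are all assumptions for illustration), not the actual stuck detector in the codebase.

```python
def trailing_repeats(history: list[str]) -> int:
    """Count how many times the last action repeats at the end of the history."""
    if not history:
        return 0
    count = 0
    for action in reversed(history):
        if action != history[-1]:
            break
        count += 1
    return count


def is_stuck(history: list[str], threshold: int = 4) -> bool:
    # Hypothetical rule: the agent is "stuck" after `threshold` identical actions in a row.
    return trailing_repeats(history) >= threshold


full_history = ["read", "edit", "run", "run", "run", "run"]
# Truncation that cuts the history in half happens to land inside the loop.
truncated_history = full_history[len(full_history) // 2:]

print(is_stuck(full_history), is_stuck(truncated_history))  # prints "True False"
```

When the cut lands inside the repeated actions, the truncated view no longer contains enough repeats to trip the threshold, even though the full history does.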
Kudos to @adityasoni9998 for finding this issue after analyzing a few evaluation trajectories.
Link of any specific issues this addresses
This also includes a small fix for #6749 in openhands/server/routes/trajectory.py
To run this PR locally, use the following command: