feat: add support for ContextWindowCompressionConfig in RunConfig #2206
@hangfei, could you please take a look at this and review? Regarding the session resumption feature — I don’t think it's sufficient to rely solely on Based on my tests:
Suggestion: It might be better not to use session resumption via Also we need another way to handle session resumption in voice mode after 24 hours since the live api session resumption is only valid for that duration. |
@ac-machache good point. I have a fix here, PTAL: https://github.com/google/adk-python/pull/2270/files. Regarding the 24 hours, do you have use cases that span more than 24 hours? |
Thanks! I was wondering if there is any way to test whether this actually works. One challenge we found is that sometimes, due to bugs on our side or in other dependencies, the feature doesn't actually work. So it would be good to have a way to test whether the window actually gets compressed. |
Okay, regarding testing the feature window compression — I’ll try it tomorrow and see if it works in a real use case in addition to the unit tests. As for the 24-hour limit in the agentic workflow: a user might want to resume an old session even 3 days later. Since we can’t really predict this behavior, it might be better to support longer session resumption, even if implementing it is a bit more complex. |
@hangfei , Regarding
|
Turn | prompt_token_count | ECW (-581 FBT) | Observation |
---|---|---|---|
1 | 612 | 31 | Initial turn |
2 | 689 | 108 | Growing ECW |
3 | 758 | 177 | Growing ECW |
4 | 1009 | 428 | Nearing trigger |
5 | 1148 | 567 | ECW > trigger (512) |
6 | 813 | 232 | Compression activated |
This run confirmed that compression kicked in once ECW > trigger_tokens.
Test Run 2: trigger_tokens=1024, sliding_window(target_tokens=512)
Turn | prompt_token_count | ECW (-581 FBT) | Observation |
---|---|---|---|
1 | 600 | 19 | Start |
2 | 697 | 116 | Growing |
9 | 1484 | 903 | Still under trigger |
10 | 1618 | 1037 | Trigger Point 1 |
11 | 1738 | 1157 | Compression pending |
12 | 1102 | 521 | Compression Effect 1 |
15 | 1580 | 999 | ECW regrowth |
16 | 1826 | 1245 | Trigger Point 2 |
17 | 1065 | 484 | Compression Effect 2 |
Conclusion
These results demonstrate that:
- The context_window_compression mechanism works predictably, managing dynamic content in long conversations.
- Compression activates after the ECW (prompt minus FBT) crosses the trigger_tokens threshold.
- Once triggered, the ECW shrinks close to target_tokens, validating the sliding window mechanism.
Note: The Fixed Base Tokens (~581) ensure that prompt_token_count will always be higher than target_tokens, but this does not affect the effectiveness of compression.
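The ECW arithmetic used in both test runs can be reproduced with a short script. This is only a sketch: the helper names `effective_context_window` and `find_compression_turns` are hypothetical, the 581-token base is the approximate FBT reported above, and the rule "ECW dropped after having exceeded trigger_tokens" is an assumption about how to read the tables, not the Live API's internal logic.

```python
# Approximate Fixed Base Tokens (FBT) reported in the test runs above.
FIXED_BASE_TOKENS = 581


def effective_context_window(prompt_token_count, fixed_base_tokens=FIXED_BASE_TOKENS):
    """ECW = prompt_token_count minus the fixed base tokens."""
    return prompt_token_count - fixed_base_tokens


def find_compression_turns(turns, trigger_tokens):
    """Return the turns where ECW dropped after it had exceeded trigger_tokens.

    `turns` is a list of (turn_number, prompt_token_count) pairs.
    This is a heuristic for reading the logs, not the service's own rule.
    """
    events = []
    triggered = False
    previous_ecw = None
    for turn, prompt_token_count in turns:
        ecw = effective_context_window(prompt_token_count)
        # A drop in ECW after the trigger was crossed is counted as compression.
        if triggered and previous_ecw is not None and ecw < previous_ecw:
            events.append(turn)
            triggered = False
        if ecw > trigger_tokens:
            triggered = True
        previous_ecw = ecw
    return events


# (turn, prompt_token_count) pairs copied from the two tables above.
run_1 = [(1, 612), (2, 689), (3, 758), (4, 1009), (5, 1148), (6, 813)]
run_2 = [(1, 600), (2, 697), (9, 1484), (10, 1618), (11, 1738),
         (12, 1102), (15, 1580), (16, 1826), (17, 1065)]

print(find_compression_turns(run_1, trigger_tokens=512))
print(find_compression_turns(run_2, trigger_tokens=1024))
```

With the table data, the heuristic flags turn 6 in the first run and turns 12 and 17 in the second, matching the "Compression activated" and "Compression Effect" rows.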
@hangfei, Is there any blocking point holding it up? I’d be happy to help unblock or address any feedback if needed |
Summary
This PR adds support for ContextWindowCompressionConfig in RunConfig.
This enables context window compression using a trigger_tokens threshold and a sliding window with a target_tokens limit.
This feature is useful for managing long-running audio inputs.
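A configuration along these lines might look as follows. This is a minimal sketch, not the PR's actual diff: it assumes google-adk (with this change merged) and google-genai are installed, and that the new RunConfig field is named context_window_compression, as the test results in the conversation suggest. The token values are the ones used in Test Run 2.

```python
# Assumption: google-adk (including this PR's change) and google-genai are installed.
from google.adk.agents.run_config import RunConfig
from google.genai import types

# Assumption: the RunConfig field added by this PR is `context_window_compression`.
run_config = RunConfig(
    context_window_compression=types.ContextWindowCompressionConfig(
        # Compression is requested once the context grows past this many tokens.
        trigger_tokens=1024,
        # After compression, the sliding window keeps roughly this many tokens.
        sliding_window=types.SlidingWindow(target_tokens=512),
    ),
)
```

The resulting run_config would then be passed to the live streaming run in the same way as any other RunConfig.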
Related Issue
Closes #2188
Testing Plan
test_streaming_with_context_window_compression_config