refactor(event cache): consolidate logic around retrieving the latest gap #4733
This sits on top of #4731, which could be reviewed independently or as part of this PR (since it's only a single commit).
Before this PR, the back-pagination logic was spread across several major steps: roughly, first find the latest gap (and its prev-batch token) in the room's state, then run the network pagination with it and apply the results.

One can see that the gap is looked for in multiple places. Also, each of the major steps involves taking and releasing the state lock, which means that, in between, other things can happen to the room event cache's state. In particular, the linked chunk can be shrunk after the first step but before the second, confusing the code a lot. The test that got turned into a regression test by enabling the event cache's storage showed the following: `load_more_events_backwards()` returns `Gap`, meaning "wait for an initial prev-batch token" in this case. The key here is that the second step should've called `load_more_events_backwards()` again, to properly handle the consequences of the shrinking (either prepend a new chunk of events, or load the latest gap).
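For illustration, here is a deliberately simplified, hypothetical sketch of that hazard; none of these types exist in the SDK, and the shrink is done inline where, in reality, another task would do it while the lock is released:

```rust
// Hypothetical model of the pre-PR hazard. `CacheState` and `Gap` are made-up
// stand-ins for the room event cache's state and its latest in-memory gap.
use std::sync::{Arc, Mutex};

#[derive(Clone)]
struct Gap {
    prev_batch: String,
}

struct CacheState {
    /// The latest gap currently loaded in the in-memory linked chunk, if any.
    latest_gap: Option<Gap>,
}

fn main() {
    let state = Arc::new(Mutex::new(CacheState {
        latest_gap: Some(Gap { prev_batch: "t0_abc".to_owned() }),
    }));

    // Step 1: take the state lock, look for the latest gap, release the lock.
    let found = state.lock().unwrap().latest_gap.clone();

    // While the lock is released, something else shrinks the linked chunk and
    // drops the gap that step 1 just found (in reality, another task does this).
    state.lock().unwrap().latest_gap = None;

    // Step 2: take the lock again and paginate with the result of step 1. The
    // token now refers to a gap that is no longer loaded, so applying the
    // pagination results gets confusing.
    let _guard = state.lock().unwrap();
    if let Some(gap) = found {
        println!("paginating with a stale prev-batch token: {}", gap.prev_batch);
    }
}
```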
This PR consolidates all this logic, so the `load_more_events_backwards()` outcome gets more information:

- the prev-batch token, which is passed to the networking step;
- `conclude_load_more_for_fully_loaded_chunk` is sufficiently documented with the code comment;
- when the outcome is `WaitForInitialPrevToken`, we race waiting 3 seconds against getting a prev-batch token from sync, and restart by calling `load_more_events_backwards()` again. This properly handles a linked chunk that's shrunk while the state lock was released, fixing the regression test.
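To make the consolidated control flow concrete, here is a minimal, hypothetical sketch (Rust, using tokio), assuming a made-up `LoadMoreOutcome` enum and a `watch` channel standing in for "a prev-batch token arriving from sync". None of these names or signatures are the actual SDK API, and `conclude_load_more_for_fully_loaded_chunk` is not modelled:

```rust
// Hypothetical sketch of the consolidated pagination loop. `LoadMoreOutcome`,
// `EventCache`, and the watch channel are illustrative, not the SDK's API.
use std::time::Duration;
use tokio::sync::watch;
use tokio::time::timeout;

enum LoadMoreOutcome {
    /// A gap was found; its prev-batch token is handed to the networking step.
    Gap { prev_batch: String },
    /// No gap and no token yet: wait for an initial prev-batch token from sync.
    WaitForInitialPrevToken,
}

struct EventCache {
    /// Receives a prev-batch token whenever sync provides one.
    token_from_sync: watch::Receiver<Option<String>>,
}

impl EventCache {
    /// Stand-in for the real lookup: in the SDK this would inspect the linked
    /// chunk under the state lock and report what it finds *at that moment*.
    fn load_more_events_backwards(&self) -> LoadMoreOutcome {
        match self.token_from_sync.borrow().as_ref() {
            Some(prev_batch) => LoadMoreOutcome::Gap { prev_batch: prev_batch.clone() },
            None => LoadMoreOutcome::WaitForInitialPrevToken,
        }
    }

    async fn paginate_backwards(&mut self) {
        let mut already_waited = false;
        loop {
            match self.load_more_events_backwards() {
                LoadMoreOutcome::Gap { prev_batch } => {
                    // Networking step: run the pagination request with the
                    // token and apply its results (elided in this sketch).
                    println!("back-paginating from {prev_batch}");
                    return;
                }
                LoadMoreOutcome::WaitForInitialPrevToken if !already_waited => {
                    // Race: wait at most 3 seconds for sync to provide a token,
                    // then restart the loop so the lookup runs again, which
                    // also covers a linked chunk shrunk in the meantime.
                    already_waited = true;
                    let _ = timeout(Duration::from_secs(3), self.token_from_sync.changed()).await;
                }
                LoadMoreOutcome::WaitForInitialPrevToken => {
                    // Simplification for this sketch: give up after one wait.
                    return;
                }
            }
        }
    }
}

#[tokio::main]
async fn main() {
    let (tx, rx) = watch::channel(None);
    let mut cache = EventCache { token_from_sync: rx };

    // Simulate sync providing an initial prev-batch token a bit later.
    tokio::spawn(async move {
        tokio::time::sleep(Duration::from_millis(100)).await;
        let _ = tx.send(Some("t42_prev".to_owned()));
    });

    cache.paginate_backwards().await;
}
```

The point of this shape is that the gap lookup is the single source of truth and simply runs again after the wait, so a linked chunk that was shrunk while the state lock was released is picked up naturally on the next iteration.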
As a result, `get_or_wait_for_token` becomes useless: it was only used in tests, and its meaning wasn't quite correct in the presence of the event cache, because of lazy-loading. I've removed it. I first started to port the tests, but they didn't make much sense anymore (the waiting doesn't happen in `load_more_events_backwards()`, so I would've had to call `paginate_backwards_impl`, turning those "unit" tests into full integration tests, which I didn't want to do). All this code is heavily tested in the event cache integration tests, or indirectly by the timeline tests as well, so I'm confident it's safe to remove these tests.

Part of #3280.