feat(sdk): Implement `EventCache` lazy-loading #4632
Conversation
Thanks for the detailed summary. Could it be incorporated as part of internal documentation, somehow?
I don't think it's true: the timeline would recover once it hits the start of the timeline, by triggering back-paginations. Those back-paginations would then fill the gaps at the event cache layer, and those filled gaps would be propagated to the timeline via new vectordiffs. The result would still be confusing to users, though: you got to the top of the apparent timeline, and now the gaps that you hadn't seen below would be filled with new messages.
I think that in this case, instead of loading the gap and being satisfied with it (and letting a future call to back-pagination resolve it into events), we could block at this point and resolve such a gap right now. Then we don't have the problem with missing messages after an app has been backgrounded for a while. If the app is offline at this point, we would still need a way to display potential gaps.

There's also another caveat to add, if we wanted to represent gaps: we would have many of them that are spurious. In particular, when restarting the app, we don't reuse the SSS's previous sync position. (With heuristics, we could decide to show gaps only after the "initial gap", i.e. a gap that would be observed after an initial SSS response.)
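For readers less familiar with the propagation mechanism mentioned above: the event cache sends changes to its subscribers as vector diffs, and the timeline applies them to its own copy of the events. The sketch below is a toy stand-in for that idea only; the `Diff` enum and `apply` function are illustrative assumptions, not the real `eyeball_im::VectorDiff` API used by the SDK.

```rust
// Toy stand-in for the "vectordiff" propagation mentioned above: the event
// cache emits diffs, and the timeline applies them to its own copy of the
// events. `Diff` is a simplified stand-in, NOT the real `eyeball_im::VectorDiff`.
#[allow(dead_code)]
enum Diff<T> {
    /// New events appended at the end (e.g. coming from sync).
    Append(Vec<T>),
    /// Events inserted at a given position (e.g. a gap resolved by a
    /// back-pagination, filled "in the middle" of what is already loaded).
    Insert { index: usize, items: Vec<T> },
    /// Everything was reset (e.g. the cache was cleared).
    Reset(Vec<T>),
}

fn apply<T>(events: &mut Vec<T>, diff: Diff<T>) {
    match diff {
        Diff::Append(mut items) => events.append(&mut items),
        // `splice` with an empty range inserts the items at `index`.
        Diff::Insert { index, items } => {
            events.splice(index..index, items);
        }
        Diff::Reset(items) => *events = items,
    }
}

fn main() {
    // The timeline initially shows only the events received after the gap.
    let mut timeline = vec!["after-gap-1", "after-gap-2"];

    // A back-pagination resolves the gap: older events arrive as an insert.
    apply(
        &mut timeline,
        Diff::Insert { index: 0, items: vec!["older-1", "older-2"] },
    );

    assert_eq!(timeline, ["older-1", "older-2", "after-gap-1", "after-gap-2"]);
}
```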
Codecov Report

Additional details and impacted files:

@@            Coverage Diff             @@
##             main    #4632      +/-   ##
==========================================
+ Coverage   85.83%   85.89%    +0.05%
==========================================
  Files         292      292
  Lines       33758    33905      +147
==========================================
+ Hits        28977    29123      +146
- Misses       4781     4782        +1
Looks super nice at a first glance, well done! I like the thorough testing at each layer, and the nice commenting in the tests.
I have a few fundamental proposals below, and a few nits here and there. Let's discuss them :)
// This is a funny game. We are having fun, don't we? Let's paginate to load
// moaare!
Are you ok Ivan 🤣
/me is asking myself
…
Kind of.
This patch removes a `TODO` in `BackPaginationOutcome`. Events it contains are deduplicated by the `EventCache` (see `event_cache::Deduplicator`) when inserted inside `RoomEventCache`.
This patch updates `RoomEventCacheState::new` to load a single chunk of events instead of all events. It fixes bugs where all events were loaded while the gaps in between were removed, so the `Timeline` was unable to load the missing events to fill those gaps.
This patch adds the `RoomEventCacheState::load_more_events_backwards` method to load a new chunk and to insert it at the beginning of the `LinkedChunk`. It uses the new `EventCacheStore::load_previous_chunk` method, along with the new `LinkedChunkBuilder::insert_new_first_chunk` method.
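To give an intuition of what this lazy-loading step does conceptually, here is a self-contained toy model: a "store" of chunks linked by `previous` identifiers, a function mimicking `load_previous_chunk`, and one mimicking `insert_new_first_chunk`. The types and signatures are simplified stand-ins for illustration, not the actual `EventCacheStore` or `LinkedChunkBuilder` APIs.

```rust
// Toy model of the lazy-loading step described above. These are simplified
// stand-ins, not the real `EventCacheStore` / `LinkedChunkBuilder` APIs.
use std::collections::{HashMap, VecDeque};

#[derive(Clone)]
struct RawChunk {
    id: u64,
    /// Identifier of the chunk that comes before this one, if any.
    previous: Option<u64>,
    /// Either a batch of events or a gap; simplified to a string here.
    content: String,
}

/// A toy "store": chunks indexed by their identifier.
type Store = HashMap<u64, RawChunk>;

/// Mimics `load_previous_chunk`: return the chunk preceding `before`, if any.
fn load_previous_chunk(store: &Store, before: u64) -> Option<RawChunk> {
    let before_chunk = store.get(&before)?;
    before_chunk.previous.and_then(|id| store.get(&id).cloned())
}

/// Mimics `insert_new_first_chunk`: the freshly loaded chunk becomes the new
/// head of the in-memory linked chunk.
fn insert_new_first_chunk(in_memory: &mut VecDeque<RawChunk>, chunk: RawChunk) {
    in_memory.push_front(chunk);
}

fn main() {
    let mut store = Store::new();
    store.insert(0, RawChunk { id: 0, previous: None, content: "older events".into() });
    store.insert(1, RawChunk { id: 1, previous: Some(0), content: "latest events".into() });

    // Initially, only the last chunk is loaded in memory.
    let mut in_memory = VecDeque::from([store[&1].clone()]);

    // A backwards pagination asks for more: lazily load the previous chunk.
    if let Some(previous) = load_previous_chunk(&store, in_memory[0].id) {
        insert_new_first_chunk(&mut in_memory, previous);
    }

    assert_eq!(in_memory.front().unwrap().content, "older events");
}
```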
…event cache. This patch updates `Pagination::run_backwards_impl` to paginate in the event cache. The flow is now as follows:

- the backwards pagination tries to load and to insert a new previous chunk from the store,
- if the new chunk contains events, they are returned and the pagination is done,
- if the new chunk is a gap, the flow continues,
- (as previously) check for a prev-batch token (it exists in the newly inserted gap),
- (as previously) run a network request and replace the gap with the new events,
- etc.

The new part is loading and inserting a new previous chunk; the rest stays the same. Some code has been moved around to keep the lock releases happy and to clarify the code.
…gination. This patch changes the semantics regarding what to do in case of duplicated events received during a backwards pagination. Previously, the strategy for both sync and backwards pagination was the same: with the newly received events, when duplicated events are detected, the old events are removed and the new ones are kept. The strategy is now reversed for backwards pagination: the old events are always kept, and the duplicates are removed from the new events. The rest of the patch removes code that became dead because of this change.
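To illustrate the reversed strategy in isolation, here is a small self-contained sketch; it is not the actual `Deduplicator` code, just the rule it applies during a backwards pagination: events already known to the cache win, and duplicates are dropped from the newly paginated batch.

```rust
// Self-contained sketch of the reversed deduplication rule for backwards
// paginations. The `Event` type is a simplified stand-in.
use std::collections::HashSet;

struct Event {
    event_id: String,
    body: String,
}

/// Keep only the paginated events whose ID is not already known. This is the
/// opposite of the sync strategy, where new events replace the old ones.
fn dedup_backwards_pagination(old: &[Event], new: Vec<Event>) -> Vec<Event> {
    let known: HashSet<&str> = old.iter().map(|event| event.event_id.as_str()).collect();

    new.into_iter()
        .filter(|event| !known.contains(event.event_id.as_str()))
        .collect()
}

fn main() {
    let old = vec![Event { event_id: "$a".into(), body: "already cached".into() }];
    let new = vec![
        Event { event_id: "$a".into(), body: "duplicate, dropped".into() },
        Event { event_id: "$b".into(), body: "genuinely new".into() },
    ];

    let kept = dedup_backwards_pagination(&old, new);

    assert_eq!(kept.len(), 1);
    assert_eq!(kept[0].body, "genuinely new");
}
```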
…evious`. This patch installs `lazy_previous` in the `LinkedChunkBuilder`.
This patch moves `chunk_debug_string` from `rooms/mod.rs` to `rooms/event.rs`. In addition, it restores (and rewrites) a test, initially for `chunk_debug_string`, now for `RoomEvents::debug_string`, which is the public API.
Awesome!!! Thanks for your work there, only minor comments here so I'm approving this :-)
Great job, I'm really delighted with our design, and that the room event cache simulates back-paginations mostly like `/messages`. Very nice!
/// inserted, `Ok(None)` if a gap has been inserted or if the
/// store is disabled.
You do need to update the comment again 😇
// No events, hmmm…
assert!(pagination_outcome.events.is_empty());

// … that's because the start of the timeline is finally reached!
That's all, folks 🥳
What a ride!
Immediate follow up: #4684
Apollo e Dafne (Bernini)
This is a follow-up of #4594.
The situation
Lazy-loading is now implemented on the `Timeline`'s output. The `Timeline` outputs a certain maximum number of initial items. When a backwards pagination is run, the `Timeline` first tries to paginate through in-memory items. When they are all exhausted, the `Timeline` asks the `EventCache` to provide more events, which will be transformed into items.

Prior to this patch, the `EventCache` was reading the entire set of events of a room. This is wrong and it is the reason for some bugs:

- Gaps are correctly stored by the `EventCache` (it relies on the `LinkedChunk`, which has been designed to represent gaps). However, because the `Timeline` has no way to represent gaps, gaps are removed: only events are kept. The `Timeline` will do a pagination when it reaches one of its ends, e.g. “I've reached my top, please feed me with more events!”. The `Timeline` isn't aware that events may be missing inside its set of events, because gaps have been removed. That's why users were seeing missing messages in their `Timeline`! And it was impossible to recover, except by clearing the cache.
- Even if, with fix(ui): Fix performance of `TimelineEventHandler::deduplicate_local_timeline_item` #4601, fix(ui): Fix performance of `AllRemoteEvents::(in|de)crement_all_timeline_item_index_after` #4608, fix(ui): Fix performance of `ReadReceiptTimelineUpdate::apply` #4612, and fix(sdk): Improve performance of `RoomEvents::maybe_apply_new_redaction` #4616, the `Timeline` is able to handle 10'000 events 10 times faster, and even if the `Timeline` lazy-loading dramatically reduces the number of items broadcast to the apps/callers/consumers/subscribers, the `EventCache` still outputs too many events, and the `Timeline` hasn't been designed for that.

The `EventCache` relies on the `LinkedChunk` to hold the events. The capacity of a chunk in the `LinkedChunk` is 128 in the current implementation, which means a chunk can contain up to 128 events. Much better than thousands.

The solution
This patch is going to change the behaviour of the `EventCache` when its storage is enabled. Instead of loading all chunks, only the last one will be loaded. That's it.

When the `Timeline` triggers a pagination, the `EventCache` will first try to load the previous chunk if it exists; otherwise it means all events have been exhausted and a network pagination must be done. However, if the previous chunk exists, there are 2 cases:

- If the chunk is of kind `Items` (i.e. it contains events), then it's all good: the chunk is loaded, and updates will be broadcast to the `Timeline`. Happy easy-peasy path.
- If the chunk is of kind `Gap` (i.e. it contains a… gap), then a pagination is triggered over the network to fill this gap. The code stays unchanged here, we already have this mechanism: the `Gap` is replaced by `Items`, and updates will be broadcast to the `Timeline`. Happy slow path.

This will fix performance issues and bugs (c.f. missing messages), but… this isn't the end of the journey.
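To make the two cases above concrete, here is a minimal, self-contained sketch of that decision. All the names (`PreviousChunk`, `NextStep`, `decide_next_step`) are illustrative stand-ins rather than the real `EventCache` types; the snippet only shows the shape of the logic described above.

```rust
// Minimal sketch of the two-case decision described above. The names are
// illustrative stand-ins, not the actual `EventCache` types.

/// A simplified view of what a previously stored chunk may contain.
enum PreviousChunk {
    /// The chunk holds events (up to 128 of them).
    Items(Vec<String>),
    /// The chunk holds a gap, carrying a `prev_batch` token to resolve later.
    Gap { prev_batch: String },
}

/// What the event cache should do next.
enum NextStep {
    /// Broadcast the loaded events to the `Timeline`; no network needed.
    EmitLoadedEvents(Vec<String>),
    /// Run a network pagination to resolve this gap, then replace it.
    ResolveGapOverNetwork { prev_batch: String },
    /// Nothing left in the store: paginate over the network.
    NetworkPagination,
}

fn decide_next_step(previous: Option<PreviousChunk>) -> NextStep {
    match previous {
        // Happy easy-peasy path: the previous chunk contains events.
        Some(PreviousChunk::Items(events)) => NextStep::EmitLoadedEvents(events),

        // Happy slow path: the previous chunk is a gap, fill it via the network.
        Some(PreviousChunk::Gap { prev_batch }) => {
            NextStep::ResolveGapOverNetwork { prev_batch }
        }

        // The store is exhausted: a network pagination must be done.
        None => NextStep::NetworkPagination,
    }
}

fn main() {
    let step = decide_next_step(Some(PreviousChunk::Gap {
        prev_batch: "token-from-a-previous-sync".to_owned(),
    }));

    match step {
        NextStep::EmitLoadedEvents(events) => println!("emit {} events", events.len()),
        NextStep::ResolveGapOverNetwork { prev_batch } => {
            println!("run a /messages-like request with prev_batch = {prev_batch}")
        }
        NextStep::NetworkPagination => println!("paginate over the network"),
    }
}
```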
Apollo e Dafne
Imagine the following scenario: the app is re-opened after a while, the sync responds with the latest events and a gap (`limited` flag), and these latest events are stored in the `EventCache`, and so displayed by the `Timeline`; the events that were displayed before now sit behind the gap and are no longer shown.

This behaviour can feel absurd for many reasons: the events live in the `EventCache`, they are here (!), but we don't want to load them because they are before a gap, otherwise we end up in the current situation where we load all events, no matter the presence of gaps or not, and we end up in the missing message situation.

One solution to this problem is to add a way for the `Timeline` to represent gaps. I propose to introduce (later, in another patch) a new `VirtualTimelineItem` of kind `Gap`. It changes the behaviour of the `Timeline` greatly: when a `VirtualTimelineItem::Gap` enters the viewport of a `Timeline`, the app/caller has to trigger a pagination. Such a virtual timeline item can be rendered as a loader. The `Timeline` won't trigger a pagination when it reaches one of its ends anymore: this new `VirtualTimelineItem` of kind `Gap` will be entirely managed by the `Timeline`. A new method will allow filling/replacing gaps, which will trigger a pagination from the `EventCache`. The behaviour is still undefined and it raises many unknowns:

- When do we want to show a `Gap` in the `Timeline`? When we are offline only?
- What do we do with events that are sync'd while a `Gap` is displayed? It can be a bit disturbing for the user to see its messages, then on top of it a loader, then on top of it some events.
- When a gap is present in the `EventCache`, do we always load this gap and its previous chunk?

Well, it raises many many questions. We need to be extremely careful before digging into this.
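To make the proposal slightly more tangible, here is a purely hypothetical sketch of what such a variant could look like. A `Gap` variant does not exist in the SDK today; its shape below (a `prev_batch` token, a loader rendering) is an assumption for illustration only.

```rust
// Purely hypothetical sketch: a `Gap` variant does NOT exist in the SDK. It
// only illustrates how a gap could surface as a virtual timeline item that
// the app renders as a loader and resolves on demand.
#[allow(dead_code)]
enum SketchVirtualTimelineItem {
    /// Existing kinds of virtual items (date dividers, read marker, …) are
    /// collapsed into one placeholder variant here.
    Existing,
    /// Proposed: a gap in the room history. When it enters the viewport, the
    /// app asks the timeline to fill it, which would trigger an `EventCache`
    /// pagination replacing the gap with events.
    Gap { prev_batch: String },
}

fn render(item: &SketchVirtualTimelineItem) -> &'static str {
    match item {
        SketchVirtualTimelineItem::Existing => "divider / read marker",
        // Rendered as a spinner; scrolling into it asks for the gap to be filled.
        SketchVirtualTimelineItem::Gap { .. } => "loading spinner",
    }
}

fn main() {
    let gap = SketchVirtualTimelineItem::Gap { prev_batch: "token".to_owned() };
    assert_eq!(render(&gap), "loading spinner");
}
```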
An alternative exists though: automatic backwards pagination ✨. The SDK can automatically run backwards paginations to fill all the existing gaps, in parallel with the sync. A correct heuristic must be determined so as not to bloat the network and not to drain the battery of the user's device (e.g. auto-run for the top most-used rooms, up until n events, stuff like that; these are random ideas). It brings several advantages, in particular that no gap would need to be represented in the `Timeline`.

A note about Apollo e Dafne. First off, le Bernin is one of my favourite artists. Second, this sculpture is fantastic in many regards. The movements. The unique representation of this myth. The Greek inspirations. The details (oh, the sandals…). Third, this patch evokes, for me, the story of Apollo and Dafne. Apollo is in love with Dafne, and Dafne doesn't like him. Apollo runs after Dafne, and Dafne avoids him, escaping him as much as possible. This story is based on Cupido, who shot two arrows: one made of gold to create love, another one made of lead to extinguish love. Every time we get one step closer to perfect offline support, this goal slips away. #RomanticProgramming
Tasks

- Add `load_last_chunk` and `load_previous_chunk` in the `EventCacheStore` trait (a hedged sketch of these two methods follows this list)
- Implement `MemoryStore`'s `load_last_chunk` and `load_previous_chunk`
- Implement `SqliteEventCacheStore`'s `load_last_chunk` and `load_previous_chunk`
- Add `from_last_chunk` and `insert_new_first_chunk` in the `LinkedChunkBuilder` builder
- Test `from_last_chunk` and `insert_new_first_chunk` with all possible errors
- `RoomEventCacheState::new` only loads the last chunk
- Add `RoomEventCacheState::load_more_events_backwards`
- Update `Pagination::run_backwards` to use `RoomEventCacheState::load_more_events_backwards`
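For orientation, here is a toy sketch of the two new store methods named in the task list. The method names come from this PR, but the signatures, parameter types, and the `RawChunk` shape are simplified assumptions, not the actual `EventCacheStore` trait definition.

```rust
// Toy sketch only: the method names come from this PR, but the signatures and
// the `RawChunk` shape below are simplified assumptions, not the real
// `EventCacheStore` trait definition.
pub struct RoomId(pub String);

#[derive(Clone, Copy)]
pub struct ChunkIdentifier(pub u64);

/// A chunk as persisted by the store: either events or a gap, plus the link
/// to the chunk that precedes it.
pub struct RawChunk {
    pub identifier: ChunkIdentifier,
    pub previous: Option<ChunkIdentifier>,
    /// `true` if the chunk represents a gap, `false` if it contains events.
    pub is_gap: bool,
}

pub trait EventCacheStoreSketch {
    type Error;

    /// Load only the most recent chunk of a room; used when the event cache
    /// for that room is first initialised.
    fn load_last_chunk(&self, room_id: &RoomId) -> Result<Option<RawChunk>, Self::Error>;

    /// Load the chunk that precedes `before`; used when a backwards
    /// pagination has exhausted the in-memory chunks.
    fn load_previous_chunk(
        &self,
        room_id: &RoomId,
        before: ChunkIdentifier,
    ) -> Result<Option<RawChunk>, Self::Error>;
}
```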
This PR was starting to be quite big. Many patches have been extracted as smaller PRs:

- `LinkedChunk` can start by a gap #4653
- `RoomEventCache::subscribe` is now infallible #4660
- Add `Chunk::lazy_previous` in the `LinkedChunk` #4675
- Update `EventCacheStore` to add `load_last_chunk` and `load_previous_chunk` #4678

Immediate follow up

- Add `load_chunk_identifier_generator` and explain why it is useful
- Merge `load_last_chunk` and `load_previous_chunk` into `load_chunk(room_id: &RoomId, before: Option<ChunkIdentifier>)`
- Remove `LinkedChunkBuilderTest`, chore(common): Remove `LinkedChunkBuilderTest` #4695
- Rewrite `LinkedChunkBuilder` as a set of standalone functions instead of static methods, refactor(common): `builder::LinkedChunkBuilder::*` becomes `lazy_loader::*` #4686
- `remove_events_by_id` must not ignore the case where the event ID isn't found: it means `Deduplicator` has found a duplicated event not in-memory but in the store only. This bug isn't related to this PR, but it has been discovered during the testing process, fix: Duplicated events are always removed #4706
- Add an index on `events.event_id` in SQLite: feat(sqlite) Add an index on `events.event_id` and `.room_id` #4685

- `EventCache` storage #3280
- `EventCache` lazy-loading #4632