-
Notifications
You must be signed in to change notification settings - Fork 15
Implement lazy loading of thumbnails #665
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
In order to reduce memory usage, the new 'thumbnail.max_ahead' and 'thumbnail.max_behind' can be used to specify a limit to the number of thumbnails after the current selection that can be loaded. Thumbnails outside of this range will be unloaded, and will be loaded again when they enter the range. Setting either of these variables to 0 will remove the loading limit, to restore previous behavior.
The 'thumbnail.max_count' configuration setting prevents thumbnails from being unloaded if they are under this count, even if they fall outside the range specified by 'thumbnail.max_ahead' and 'thumbnail.max_behind'. If this value is 0, it is ignored. ignored.
max_behind -> load_behind max_ahead -> load_ahead max_count -> unload_threshold
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot, this is excellent work, with really nice test-cases!
I did leave a few nit-picking comments, the one part that once again requires a bit of thinking is the threshold
as it does seem to make the implementation a tad more clumsy. Would love to hear your thoughts 😊
Have a great week-end!
You're welcome! I'm glad you generally like how this is turning out. You have a good weekend, too! I most-certainly will investigate the bug. I am not immediately able to reproduce it, but as mentioned in the review for one of the tests, I think there's some kind of race condition that could be causing this. |
* Rename test_thumbnaillazy.feature -> thumbnaillazy.feature * Replace consecutive 'When' statements with 'Ands' in tests * More explicit description for thumbnail.unload_threshold * Rename range calculation functions - _first_index -> _first_rendered_index - _last_index -> _last_rendered_index - _index_range -> _rendered_index_range * Refactor code to return as soon as possible
Previously, icons were pruned after updating the position. Sometimes the thumbnail creation callback ran after this update finished, and some old thumbnails were thus not pruned. Fix this by pruning every time a thumbnail is created, instead of at position updates. The thumbnail pruning function optimized to avoid redundant prunes, to improve performance given the more numerous calls. With these fixes, the issue causing the tests to need to wait for thumbnails to load before changing settings is also gone.
Think I fixed our race condition and understand why it happened. Fixing this also removes the need for the redundant check for rendered thumbnails in the tests. See 15610c3 |
Hmmmn, now we have one of the tests passing on my machine, but failing on the remote. I think what's happening is that moving pruning inside the thumbnail creation callback is making the handling of the |
I haven't quite figured this out yet, but my findings thus far are:
My best guess is that this has something to do with the order in which thumbnail creation jobs are assigned to threads. Could possibly fix this by having some kind of signal that's sent when all the jobs are finished, to run the pruning function. |
Quick thanks for all the work you are putting in and investigations you are doing! As a heads-up: I will unfortunately not be able to take a deeper look until the week-end, as I am fully busy with a business trip during the week. |
Leaving my notes here: The number of images to be loaded being less than the number of threads set for the pool is what seems to cause this to happen. I believe Qt basically works by issuing a job to each thread and waiting for those jobs to finish before assigning new ones. When navigating the thumbnails, async thumbnail load is called with a new set of thumbnails when the position changes. At this time, the current behavior is to clear the pool to remove queued threads. So, when the number of threads is less than the number of jobs, this means there are threads in the queue that will never be started, and thus, they will never be loaded and checked against the threshold. I'm able to get deterministic behavior by not clearing the pool, but that will create a greater performance hit from always loading every queued thread. So, we have a variety of choices here, and I'm inclined to wonder whether this deterministic behavior is truly necessary. |
As thumbnails can be passed in arbitrary order to create_thumbnails_async, sort them such that those closest to the currently selected thumbnail are queued first.
So, I believe I finally understood what is happening, thanks for your work and explanations! I agree that it should not be deterministic which part is affected by the I however do think we should ensure that the two ahead and two behind are rendered. And this is perfectly the case. I would rewrite the tests along:
I am not sure why you decided to explicitly avoid the trailing zeros, as it makes the ordering quite confusing, at least for me. I.e., As I am still unsure I understand the pruning and so forth correctly everywhere, I may need another deep-dive into the code, but really feel like we are converging towards the expected behaviour 😊 |
Rewrote the tests as specified and ended up removing the test for a thumbnail at a specific index, because that's really the only part that's non-deterministic. There wasn't really a great reason for stripping the trailing zeroes, to be honest. Somewhere in my brain at the time, I may had conceived it would be more concise to remove them, and I was trying to match another test case I was using as reference. Anyway, they're not stripped now. Also added a little bit of code that allows the icon position update to happen if moving around by clicking, rather than by scrolling or using the keyboard. And hey, thanks! This has been a slog, so I appreciate your being patient with me. Is there anything I could help clarify concerning the pruning algorithm? |
Perfect, thank you so much! Definitely not a slog from your side, this is excellent, I have been a bit inconsistent in my reviews though 😄 The one thing I am currently not able to wrap my head around, is if we need to call the pruning once a single new thumbnail has been created. Probably, yes, but it does seem to cause some overhead, even when nothing is pruned. Couldn't pruning be done once scrolling? Sure, there may be short times with "too little" thumbnails, but those will be filled then again. |
Of course! I'm glad you're ultimately happy with the way this is turning out. If I recall correctly, I originally tried to do the pruning after scrolling, but there was a race condition in which the scroll function would look for thumbnails to prune and then return before the new thumbnail was actually loaded. Since the new thumbnail wasn't loaded yet, old ones that should have been pruned sometimes weren't. |
That's unfortunate. I also suppose it doesn't make sense to try to prune once the current set of thumbnails has been created? There doesn't seem to be something along |
I also just tested moving the That said, The more complicated optimization might be to look through rendered thumbnails, and check whether they are in range, instead of scanning through out of range thumbnails and seeing if they're loaded. As for the other option you just described, I tried something where I tried to only call |
Thanks for the hint! I like the idea of the low-hanging fruit(s). Another one would be to skip the call completely, if the settings are at default (i.e. But this is getting out-of-hand, and possibly too complicated alltogether. If you don't mind, I think I would merge this once all your commits are pushed (dict comprehension?) and we can continue thinking if we come up with something more efficient in another PR 😊 EDIT: FYI, will be offline for the rest of the day, hopefully back on this tomorrow |
By the time you sent your post, I honestly had already just about implemented the low-hanging fruit of checking if there were any thumbnails to be unloaded before scanning. Your idea to check if load_ahead and load_behind are not -1 is already sort of implicitly done, in that retrieving the first or last rendered index will produce the first or last index of the list. So, when put into a range, it's empty and the loop should just quit as a result. But I agree, this is starting to get into micro-optimization territory. I personally think the performance overhead should be fairly negligible like this. Anyway, the dict comprehension has been pushed too, and we can merge the pull request here. Though, there is a final bit of unrelated envelope pushing I feel compelled to mention: The way the pruning is done is from 'left to right'. When the unload threshold is exceeded there's a strong bias to pruning the thumbnails at the beginning of the list, before considering ones at the end of the list. It might be fairer to alternate between thumbnails at the beginning and end of the list, so either end is pruned roughly equally. UPDATE: I also realized the loops that check each index could probably be prematurely terminated if the unload threshold is no longer exceeded. Should still be fine though, as this is mostly relevant to situations where a high unload threshold is set. If it's low most/all entries would still need to be scanned. UPDATE II: The number of icons expected to be loaded based on a given range could also be checked. The first thumbnail being loaded should really clear what's outside of the range, making subsequent calls quit immediately and their overhead negligible under most circumstances. |
Sorry for the delay I finally got around to test the option of "only enabling pruning in case unload threshold > 0" in 1ecc088. Instead of having the check in each call, we only connect the
I see were you are coming from with the bias pruning thumbnails from the beginning. So instead of pruning ahead and behind, one would need to prune something along (there is probably a better way): def _prune_icons(self) -> None:
"""Prune indexes before and after the current index."""
if not self._prunable_thumbnails_exist():
return
idx_behind = range(0, self._first_rendered_index())
idx_ahead = reversed(range(self._last_rendered_index() + 1, len(self._paths)))
for idx in flatten(itertools.zip_longest(idx_behind, idx_ahead)):
if idx is not None: # None from zip_longest
self._prune_index(idx) With this general approach, one could probably also check UPDATE II: Not sure I got this, they would quit immediately as the first one has already cleared a sufficient amount, and thus |
I'm sorry if I pulled this PR out longer than intended. Again, we could totally just merge. We might be focusing too much on code that is already acceptably performant. Checking I did look at the code you proposed. It looks sensible (and more consistent, since I have no idea why I checked Here, there's record keeping done to both disconnect when necessary, and to avoid multiple consecutive connects (which can produce extra calls to def _maybe_connect_pruning(self, unload_threshold: int):
"""Only connect pruning in case unloading should be performed."""
if unload_threshold > -1:
if not self._pruning_connected:
self._manager.created.connect(self._prune_icons)
self._pruning_connected = True
elif self._pruning_connected:
self._manager.created.disconnect(self._prune_icons)
self._pruning_connected = False There's also another For fixing the bias, your code looks about correct to my initial observation and like it more or less produces the result I had in mind. I'd just want to test it. I probably don't have anything much better than that. Finally, as for UPDATE II, honestly, I don't entirely remember what I meant. It's something I'd probably have to write a prototype for to properly conceptualize (and test) again, but I think I meant something that followed UPDATE I, if the length of |
I have a photos folder with 1500+ pictures, and without this change my vimiv would always freeze pretty bad. Even after a few minutes it wouldn't respond, so I I tried merging this change with v0.9.0. And used this configuration:
This is what makes sense to me. No freeze. I'm able to navigate the thumbnails with no issue. Excellent work done here, good job!! 👍 I'm going to use this for now on! 😄 |
Now for one of the more ambitious changes discussed in #651 The general goal is to allow thumbnails to be displayed in a lazy-loaded manner, and unloaded when they fall out of a certain range. This can save memory and improve startup times at the cost of potentially higher disk IO and CPU usage when browsing thumbnails.
The number of thumbnails displayed at any given time is configurable to this end by the following options. The default values should maintain the original behavior.
thumbnail.load_behind
: Max thumbnails to render behind the selected one. -1 (default) is no limitthumbnail.load_ahead
: Max thumbnails to render ahead of the selected one. -1 (default) is no limitthumbnail.unload_threshold
: Don't unload thumbnails unless more are loaded than this. -1 (default) is no limit