Only load searchindex when needed #2553

Open: wants to merge 3 commits into base: master

GuillaumeGomez
Member

@GuillaumeGomez GuillaumeGomez commented Feb 20, 2025

This PR makes it so that the searchindex.js file is only loaded when the user arrives on a page with ?search= in the URL or if the user opens the search. It means that the book size will be only the text content until the user actually needs to run a search.
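
As a rough illustration of the behaviour described above (not the PR's actual code), the trigger logic could look like the following sketch. The names `urlRequestsSearch` and `createLazyIndex`, the example URL, and the fake loader are all hypothetical and only stand in for whatever mdBook really does.

```javascript
// Hypothetical sketch: these names are illustrative, not mdBook's actual code.

// Returns true when the page URL carries a `search` query parameter.
function urlRequestsSearch(href) {
    return new URL(href).searchParams.has("search");
}

// Wraps a loader so the (possibly large) index is requested at most once,
// and only when something first asks for it.
function createLazyIndex(loadIndex) {
    let pending = null;
    return function getIndex() {
        if (pending === null) {
            pending = loadIndex();
        }
        return pending;
    };
}

// Example wiring with a fake loader standing in for the network request.
let loads = 0;
const getIndex = createLazyIndex(() => {
    loads += 1;
    return Promise.resolve({ docs: ["intro", "search"] });
});

if (urlRequestsSearch("https://example.com/book/index.html?search=lazy")) {
    getIndex(); // page opened with ?search=, so start loading right away
}
// Opening the search UI later would call getIndex() again and reuse the same promise.
```

Caching the promise rather than the result means a second trigger during the download does not start a second request.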

cc @notriddle

@rustbot rustbot added the S-waiting-on-review Status: waiting on a review label Feb 20, 2025
@notriddle
Contributor

If you're going to mess with the search engine, can you also write some gui tests that exercise it?

@GuillaumeGomez
Member Author

Very good point.

@GuillaumeGomez
Member Author

Added the GUI test. :)

@@ -2,9 +2,6 @@
// an iframe (because of JS disabled).
// Regression test for <https://github.com/rust-lang/mdBook/issues/2528>.

// We disable the requests checks because `searchindex.json` will always fail
// locally.
fail-on-request-error: false
Contributor

Great! 👍

@GuillaumeGomez
Member Author

When the JS is being retrieved, there is now a spinner and the input is not disabled anymore:

[Screenshot: search input with the loading spinner]
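
One way to keep the input usable while the index downloads is to remember the latest query and run it once the index arrives. The sketch below is only an assumption about how that could be structured. `createSearchController`, the callback-style `loadIndex`, and the `ui` object are hypothetical names, not taken from the PR.

```javascript
// Hypothetical sketch; none of these names come from mdBook.
function createSearchController(loadIndex, runSearch, ui) {
    let index = null;
    let loading = false;
    let latestQuery = "";

    return {
        // Called on every keystroke; the input is never disabled.
        onInput(query) {
            latestQuery = query;
            if (index !== null) {
                runSearch(index, latestQuery);
                return;
            }
            if (!loading) {
                loading = true;
                ui.showSpinner();
                loadIndex((loaded) => {
                    index = loaded;
                    loading = false;
                    ui.hideSpinner();
                    // Run whatever the user has typed by the time the index arrives.
                    runSearch(index, latestQuery);
                });
            }
        },
    };
}

// Example with a fake loader that completes only when we call `finishLoading`.
const events = [];
let finishLoading = null;
const controller = createSearchController(
    (done) => { finishLoading = done; },
    (index, query) => events.push(`search:${query}`),
    {
        showSpinner: () => events.push("spinner:on"),
        hideSpinner: () => events.push("spinner:off"),
    }
);

controller.onInput("la");    // starts loading and shows the spinner
controller.onInput("lazy");  // still loading: only the query is remembered
finishLoading({ docs: [] }); // index arrives, the latest query runs
console.log(events);         // ["spinner:on", "spinner:off", "search:lazy"]
```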

@GuillaumeGomez GuillaumeGomez requested a review from ehuss February 21, 2025 22:14
@GuillaumeGomez GuillaumeGomez force-pushed the load-on-need branch 2 times, most recently from 38b4ea7 to c96b7e4 Compare March 10, 2025 11:13
@GuillaumeGomez
Member Author

Rebased and fixed merge conflict.

@notriddle
Contributor

It looks like you reverted the loading throbber, and now it disables the search input while it's loading again.

16cbfd6 added it, but the pull request doesn't have it any more.

@GuillaumeGomez
Member Author

Arg indeed. Rebase went wrong I guess. Great catch, thanks!

@notriddle
Contributor

Everything seems good here!

@GuillaumeGomez
Member Author

Rebased.

@GuillaumeGomez
Member Author

Re-rebased and also ran eslint.

@GuillaumeGomez
Member Author

Rebased. Could you take a look, @ehuss? If you need help understanding what this PR is doing, don't hesitate to ask!

@ehuss
Contributor

ehuss commented Mar 31, 2025

Can you say more about the reasoning behind this change? For me, it comes off as a worse experience, so we are somehow not seeing the same thing. For example, after opening a book and looking at it, I may want to search for something. Today this loads instantly (since it is eagerly loaded), but with this change I may need to wait 10+ seconds for anything to happen. Is this trying to save bandwidth? Or is it some performance issue?

I'm not completely against this kind of change, but the description here doesn't explain the motivation or acknowledge the potential downsides.

@GuillaumeGomez
Member Author

> Today this loads instantly (since it is eagerly loaded), but with this change I may need to wait 10+ seconds for anything to happen. Is this trying to save bandwidth? Or is it some performance issue?

I'm very surprised. With which book do you get this 10+ second wait?

The whole point of this PR is to speed up the page load and reduce the (page) size by default until you actually need the search. This becomes more and more useful as the book size (and its search index) grows. This is how we allow rustdoc to always load extremely quickly, whatever the size of the crate (and of its search index).

@ehuss
Contributor

ehuss commented Apr 2, 2025

The RFCs book has a 22MB index. I suppose 10s might be a bit of an exaggeration for many people, as I was just doing some throttling tests, though I think we should be sensitive to people without fast internet.

Can you help me understand how this speeds up page load? My understanding is that the index is loaded async in the background. My expectation is that since it is loading in the background, it shouldn't have any measurable effect on initial render time. What little page-load profiling I've done doesn't show a difference with this PR.
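
For a sense of scale on the 22MB figure, here is a back-of-the-envelope transfer-time estimate. The link speeds are assumptions chosen for illustration, not measurements from this thread, and the function ignores latency, compression, and caching, all of which change the real number.

```javascript
// Rough transfer time in seconds: megabytes -> megabits, divided by link speed.
// Ignores latency, HTTP compression, and browser caching.
function transferSeconds(sizeMegabytes, megabitsPerSecond) {
    return (sizeMegabytes * 8) / megabitsPerSecond;
}

console.log(transferSeconds(22, 16));  // 11   (assumed 16 Mbit/s link)
console.log(transferSeconds(22, 100)); // 1.76 (assumed 100 Mbit/s link)
```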

@GuillaumeGomez
Member Author

This PR mostly improves life for people with limited internet access (limited in data, not in bandwidth): it only loads extra content on demand.

> Can you help me understand how this speeds up page load? My understanding is that the index is loaded async in the background. My expectation is that since it is loading in the background, it shouldn't have any measurable effect on initial render time. What little page-load profiling I've done doesn't show a difference with this PR.

I wasn't clear, my bad. The load time should normally be close to unaffected. One exception: I forgot to convert the JS to JSON.parse(), so for big search indexes every page takes a hit while the JS is parsed, because JS parsing is performed on the main thread. That should be fixed in #2633. Here, the impact is reduced memory usage for the tab (until you actually need the search) and saved data.
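
To illustrate the JSON.parse() conversion mentioned above: rather than emitting the index as a large JS object literal, a generator can emit it as a string passed to `JSON.parse`, which is commonly reported to be cheaper for engines to process on large data. The sketch below is an assumption about the general technique, not the contents of #2633. `emitIndexScript` and `globalThis.demoIndex` are hypothetical names.

```javascript
// Hypothetical helper: emits JS source that assigns the index via JSON.parse.
function emitIndexScript(index, targetName) {
    const json = JSON.stringify(index);
    // Stringifying the JSON text again yields a quoted, escaped JS string literal.
    const literal = JSON.stringify(json);
    return `${targetName} = JSON.parse(${literal});`;
}

// Quotes inside the data survive the round trip.
const script = emitIndexScript(
    { docs: [{ title: "It's a \"test\"" }] },
    "globalThis.demoIndex"
);
console.log(script);
```

Evaluating the emitted source (for example by loading it as a script) makes the parsed object available under the assumed target name.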

@GuillaumeGomez
Member Author

Ah, I found a good example: with the RFC book, instead of loading the 22MB search index on all pages whether you need it or not, you will only load it when you want to perform a search.
