-
Notifications
You must be signed in to change notification settings - Fork 110
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance improvements for QCH preprocessing #64
Conversation
This greatly improves the speed of cssless processing by using regular expressions instead of a full-blown CSS parser to manipulate the `style` attribute where possible. The CSS used on the site is simple enough so this works just fine. The tests needed some massaging to get them to pass again, because the new handling caused the following minor changes to the output. * Floating point values are always formatted with the decimal point, which was omitted previously when the fractional part was zero. * Existing CSS declarations are not touched when other declarations are added or removed, so there are some differences where spaces were added previously but aren't now.
Premailer is used by the cssless preprocessing script to inline CSS styles, it hasn't been maintained in a while. This adds the source code of the premailer module so we can make performance improvements to cssless preprocessing. These are going to be changes specific to this project, which won't be of any use to upstream premailer.
This adds an option to premailer for dropping style tags even if there are leftover rules. The cssless preprocessing would drop those tags after premailer processing anyway, this way we don't even keep them.
This further improves the speed of cssless processing by using regular expressions in premailer instead of full parsing, if possible. Premailer is now also used with validation turned off, we just assume that the CSS we get from the site is valid. This takes the time for running unit tests from over 7 seconds before the improvements down to about 1.4 seconds, which should make building a release more bearable (keep in mind we're talking about 5k files to be processed, each taking about 0.7 seconds).
Add missing release files, add `doc_html` to `all` target, make formatting more consistent.
Thanks for shouting, I agree that the performance of cssless preprocessing is too slow. I don't think that we should compromise the correctness for speed though :). This PR improves the speed of the CSS processing on our side roughly twice just by caching the results as the majority of CSS rules are roughly similar. The improvements on premailer side look good to me. Ideally they should go to that repo, I'm sure we can work something out as I've contributed a PR there in recent past. Would you mind if I contributed your changes (with appropriate attribution of course) or would you like to do a PR yourself? |
I also had the idea of caching parsed CSS styles and tried to put the cssutils objects on their respective DOM elements, unfortunately there's no way to put non-string attributes there. About the premailer improvements, I don't think (apart from the |
So according to my benchmarks the following results are obtained by processing the first 30 pages sequentially (needs modification of the preprocess_qch.py file): Original (cppreference-doc @ 824afb2, peterbe/premailer@ebfd310): 71.2s Caching results on cppreference-doc side (cppreference-doc @ 49f023a, peterbe/premailer@ebfd310): 49.5s (1.4x improvement) Caching results on both cppreference-doc and premailer (cppreference-doc @ 0ab29ba, The 3.6x improvement is much less than what you achieved, but at least we don't compromise on correctness. It's maybe worth exploring other optimizations. See these PRs: |
I don't know if you had a chance to run the preprocessing step with the new cssless QCH preprocessor yet, but I can tell you it's incredibly slow (like, hours for the complete site on my Intel Core i5-7200U laptop). I tried to improve the performance of that, however, my changes might be somewhat controversial.
I profiled the cssless preprocessing script and noticed that most of the execution time was spent parsing CSS and dumping it to text again and again. So I decided to see whether this could be done using regular expressions instead, since the CSS used on the site is not too complicated. Turns out this works just fine, cutting the execution time of the unit tests in half.
A lot of the CSS parsing happens in premailer, not in our preprocessing script. So the second step I took was to include the premailer source code into the project (premailer hasn't been seriously maintained in several years), so we could modify it. I did the same thing again, using regular expressions instead of full CSS parsing where possible. This again worked great, the execution times for unit tests (which include full preprocessing of two files) is now down to about 1.4s from over 7s at the beginning.
Now the thing that might be controversial is that of course this could break at any time if the site started using CSS that is more complex than a simple regex can handle. I don't see that happening, but feel free to reject this PR if you're concerned about that.