Skip to content

Improve code quality for push_new_empty_item. #3341

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Nov 23, 2018

Conversation

jrmuizel
Copy link
Collaborator

@jrmuizel jrmuizel commented Nov 22, 2018

LLVM's ability to eliminate memcpy's across basic blocks is bad.
Because of this, we want to make sure that we avoid putting branches
in code where we want the memcpy elimination to happen.
rust-lang/rust#56172 has a reduced test case
of this happening.

This change lifts the branch caused by unwrap() above the creation of
the SpecificDisplayItem. It ends up saving a memcpy of 127 bytes along
with reducing pop_reference_frame by 18 instructions.


This change is Reviewable

LLVM's ability to eliminate memcpy's across basic blocks is bad.
Because of this, we want to make sure that we avoid putting branches
in code where we want the memcpy elimination to happen.
rust-lang/rust#56172 has a reduced test case
of this happening.

This change lifts the branch caused by unwrap() above the creation of
the SpecificDisplayItem. It ends up saving a memcpy of 127 bytes along
with reducing pop_reference_frame by 18 instructions.
@gw3583
Copy link
Contributor

gw3583 commented Nov 22, 2018

@bors-servo r+

@bors-servo
Copy link
Contributor

📌 Commit 97a7db1 has been approved by gw3583

@jrmuizel
Copy link
Collaborator Author

jrmuizel commented Nov 22, 2018

This is sort of an ugly change so I'm interested to know how people feel about it.

Here's the before and after asm:

after:

webrender_api::display_list::DisplayListBuilder::pop_reference_frame (src/display_list.rs:1281):
 push    rbp
 mov     rbp, rsp
 sub     rsp, 368
 mov     rax, qword, ptr, [rdi, +, 40]
 test    rax, rax
 je      LBB514_2
 mov     rcx, qword, ptr, [rdi, +, 24]
 lea     rax, [rax, +, 2*rax]
 shl     rax, 4
 mov     rdx, qword, ptr, [rcx, +, rax, -, 8]
 mov     qword, ptr, [rbp, -, 8], rdx
 mov     rdx, qword, ptr, [rcx, +, rax, -, 16]
 mov     qword, ptr, [rbp, -, 16], rdx
 mov     rdx, qword, ptr, [rcx, +, rax, -, 24]
 mov     qword, ptr, [rbp, -, 24], rdx
 mov     rdx, qword, ptr, [rcx, +, rax, -, 32]
 mov     qword, ptr, [rbp, -, 32], rdx
 mov     rdx, qword, ptr, [rcx, +, rax, -, 48]
 mov     rax, qword, ptr, [rcx, +, rax, -, 40]
 mov     qword, ptr, [rbp, -, 40], rax
 mov     qword, ptr, [rbp, -, 48], rdx
 mov     qword, ptr, [rbp, -, 56], 0
 mov     qword, ptr, [rbp, -, 64], 0
 mov     qword, ptr, [rbp, -, 368], 18
 mov     rax, qword, ptr, [rbp, -, 48]
 mov     rcx, qword, ptr, [rbp, -, 40]
 mov     qword, ptr, [rbp, -, 192], rax
 mov     qword, ptr, [rbp, -, 184], rcx
 mov     rax, qword, ptr, [rbp, -, 32]
 mov     qword, ptr, [rbp, -, 176], rax
 mov     rax, qword, ptr, [rbp, -, 24]
 mov     qword, ptr, [rbp, -, 168], rax
 mov     rax, qword, ptr, [rbp, -, 16]
 mov     qword, ptr, [rbp, -, 160], rax
 mov     rax, qword, ptr, [rbp, -, 8]
 mov     qword, ptr, [rbp, -, 152], rax
 mov     qword, ptr, [rbp, -, 144], 0
 mov     rax, qword, ptr, [rbp, -, 80]
 mov     rcx, qword, ptr, [rbp, -, 72]
 mov     qword, ptr, [rbp, -, 136], rax
 mov     qword, ptr, [rbp, -, 128], rcx
 mov     rax, qword, ptr, [rbp, -, 64]
 mov     qword, ptr, [rbp, -, 120], rax
 mov     rax, qword, ptr, [rbp, -, 56]
 mov     qword, ptr, [rbp, -, 112], rax
 mov     qword, ptr, [rbp, -, 96], 0
 mov     qword, ptr, [rbp, -, 104], 0
 mov     byte, ptr, [rbp, -, 88], 1
 lea     rsi, [rbp, -, 368]
 call    webrender_api::display_list::serialize_fast
 add     rsp, 368
 pop     rbp
 ret
LBB514_2:
 lea     rdi, [rip, +, l_anon.1f7bba913a8d8ed49380105111efd8e9.29]
 call    core::panicking::panic

before:

webrender_api::display_list::DisplayListBuilder::pop_reference_frame (src/display_list.rs:1281):
 push    rbp
 mov     rbp, rsp
 push    rbx
 sub     rsp, 568
 xorps   xmm0, xmm0
 movaps  xmmword, ptr, [rbp, -, 80], xmm0
 mov     qword, ptr, [rbp, -, 88], 0
 mov     qword, ptr, [rbp, -, 96], 0
 mov     rax, qword, ptr, [rdi, +, 40]
 test    rax, rax
 je      LBB514_2
 mov     rbx, rdi
 mov     rcx, qword, ptr, [rdi, +, 24]
 lea     rax, [rax, +, 2*rax]
 shl     rax, 4
 mov     rdx, qword, ptr, [rcx, +, rax, -, 8]
 mov     qword, ptr, [rbp, -, 24], rdx
 mov     rdx, qword, ptr, [rcx, +, rax, -, 16]
 mov     qword, ptr, [rbp, -, 32], rdx
 mov     rdx, qword, ptr, [rcx, +, rax, -, 24]
 mov     qword, ptr, [rbp, -, 40], rdx
 mov     rdx, qword, ptr, [rcx, +, rax, -, 32]
 mov     qword, ptr, [rbp, -, 48], rdx
 mov     rdx, qword, ptr, [rcx, +, rax, -, 48]
 mov     rax, qword, ptr, [rcx, +, rax, -, 40]
 mov     qword, ptr, [rbp, -, 56], rax
 mov     qword, ptr, [rbp, -, 64], rdx
 mov     qword, ptr, [rbp, -, 400], 18
 lea     rdi, [rbp, -, 392]
 lea     rsi, [rbp, -, 568]
 mov     edx, 168
 call    _memcpy
 mov     rax, qword, ptr, [rbp, -, 64]
 mov     rcx, qword, ptr, [rbp, -, 56]
 mov     qword, ptr, [rbp, -, 224], rax
 mov     qword, ptr, [rbp, -, 216], rcx
 mov     rax, qword, ptr, [rbp, -, 48]
 mov     qword, ptr, [rbp, -, 208], rax
 mov     rax, qword, ptr, [rbp, -, 40]
 mov     qword, ptr, [rbp, -, 200], rax
 mov     rax, qword, ptr, [rbp, -, 32]
 mov     qword, ptr, [rbp, -, 192], rax
 mov     rax, qword, ptr, [rbp, -, 24]
 mov     qword, ptr, [rbp, -, 184], rax
 mov     qword, ptr, [rbp, -, 176], 0
 mov     rax, qword, ptr, [rbp, -, 112]
 mov     rcx, qword, ptr, [rbp, -, 104]
 mov     qword, ptr, [rbp, -, 168], rax
 mov     qword, ptr, [rbp, -, 160], rcx
 mov     rax, qword, ptr, [rbp, -, 96]
 mov     qword, ptr, [rbp, -, 152], rax
 mov     rax, qword, ptr, [rbp, -, 88]
 mov     qword, ptr, [rbp, -, 144], rax
 mov     rax, qword, ptr, [rbp, -, 80]
 mov     rcx, qword, ptr, [rbp, -, 72]
 mov     qword, ptr, [rbp, -, 128], rcx
 mov     qword, ptr, [rbp, -, 136], rax
 mov     byte, ptr, [rbp, -, 120], 1
 mov     al, byte, ptr, [rbp, -, 9]
 mov     byte, ptr, [rbp, -, 113], al
 movzx   eax, word, ptr, [rbp, -, 11]
 mov     word, ptr, [rbp, -, 115], ax
 mov     eax, dword, ptr, [rbp, -, 15]
 mov     dword, ptr, [rbp, -, 119], eax
 lea     rsi, [rbp, -, 400]
 mov     rdi, rbx
 call    webrender_api::display_list::serialize_fast
 add     rsp, 568
 pop     rbx
 pop     rbp
 ret
LBB514_2:
 lea     rdi, [rip, +, l_anon.cb11e9af7d531ec043b1c95d23ca69be.29]
 call    core::panicking::panic

@jrmuizel
Copy link
Collaborator Author

Ha. That was a more aggressive r+ than I was expecting :)

@bors-servo
Copy link
Contributor

⌛ Testing commit 97a7db1 with merge b4f7e43...

bors-servo pushed a commit that referenced this pull request Nov 22, 2018
Improve code quality for push_new_empty_item.

LLVM's ability to eliminate memcpy's across basic blocks is bad.
Because of this, we want to make sure that we avoid putting branches
in code where we want the memcpy elimination to happen.
rust-lang/rust#56172 has a reduced test case
of this happening.

This change lifts the branch caused by unwrap() above the creation of
the SpecificDisplayItem. It ends up saving a memcpy of 127 bytes along
with reducing pop_reference_frame by 18 instructions.

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/webrender/3341)
<!-- Reviewable:end -->
@bors-servo
Copy link
Contributor

☀️ Test successful - status-appveyor, status-taskcluster
Approved by: gw3583
Pushing b4f7e43 to master...

@bors-servo bors-servo merged commit 97a7db1 into servo:master Nov 23, 2018
moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this pull request Nov 23, 2018
…1cee3dd7339b (WR PR #3341). r=kats

servo/webrender#3341

Differential Revision: https://phabricator.services.mozilla.com/D12753

--HG--
extra : moz-landing-system : lando
jankeromnes pushed a commit to jankeromnes/gecko that referenced this pull request Nov 24, 2018
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified that referenced this pull request Oct 3, 2019
…1cee3dd7339b (WR PR #3341). r=kats

servo/webrender#3341

Differential Revision: https://phabricator.services.mozilla.com/D12753

UltraBlame original commit: ab0838a6a61bbcba9b45a622ee16b4ea744ec712
gecko-dev-updater pushed a commit to marco-c/gecko-dev-comments-removed that referenced this pull request Oct 3, 2019
…1cee3dd7339b (WR PR #3341). r=kats

servo/webrender#3341

Differential Revision: https://phabricator.services.mozilla.com/D12753

UltraBlame original commit: ab0838a6a61bbcba9b45a622ee16b4ea744ec712
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified-and-comments-removed that referenced this pull request Oct 3, 2019
…1cee3dd7339b (WR PR #3341). r=kats

servo/webrender#3341

Differential Revision: https://phabricator.services.mozilla.com/D12753

UltraBlame original commit: ab0838a6a61bbcba9b45a622ee16b4ea744ec712
bhearsum pushed a commit to mozilla-releng/staging-firefox that referenced this pull request May 1, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants