Tweak SlicePartialEq to allow MIR-inlining the compare_bytes call
#150945
|
@bors try @rust-timer queue |
Tweak `SlicePartialEq` to allow MIR-inlining the `compare_bytes` call
|
Finished benchmarking commit (8018bcc): comparison URL.
Overall result: ❌✅ regressions and improvements - please read the text below
Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.
Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @bors rollup=never
Instruction count: Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage): Results (primary 1.0%, secondary -2.7%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Cycles: Results (primary 2.0%, secondary 2.7%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary size: Results (primary 0.0%, secondary -0.0%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Bootstrap: 473.812s -> 477.487s (0.78%) |
|
Hmm, so this recovered the syn loss from #150265 (comment), but isn't an obvious overall win. I do like removing the second cc @saethlin in case you have any thoughts here. |
|
rustbot has assigned @Mark-Simulacrum. Use |
|
I don't have any strong opinions on this. It would be neat if we had better ways to learn about the impacts of MIR opts than squinting at the perf result. |
|
r=me with rebase if we want to go ahead. |
#150265 disabled this because it was a net perf win, but let's see if we can tweak the structure of this to allow more inlining on this side while still not MIR-inlining the loop when it's not just `memcmp`. This should also allow MIR-inlining the length check, which was previously blocked.
60aecfc to 51de309
|
This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers. |
|
The job Click to see the possible cause of the failure (guessed by this bot) |
|
"Execution context was destroyed, most likely because of a navigation." |
|
The more I think about it the more I like removing the @bors r=Mark-Simulacrum |
|
Scheduling: Encourage a mixture of rollup and non-rollup PRs. @bors p=5 |
What is this? This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.
Comparing a234ae6 (parent) -> 1e5065a (this PR)
Test differences: No test diffs found
Test dashboard: Run
cargo run --manifest-path src/ci/citool/Cargo.toml -- \
test-dashboard 1e5065a4d99e0e3ccf1a1719055308e7a20e8f36 --output-dir test-dashboard
And then open
Job duration changes
How to interpret the job duration changes? Job durations can vary a lot, based on the actual runner instance |
|
Finished benchmarking commit (1e5065a): comparison URL.
Overall result: ❌✅ regressions and improvements - please read the text below
Our benchmarks found a performance regression caused by this PR.
Next Steps:
@rustbot label: +perf-regression
Instruction count: Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage): Results (primary -7.2%, secondary 2.0%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Cycles: Results (primary 8.6%, secondary 4.8%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary size: Results (primary -0.0%, secondary -0.0%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Bootstrap: 476.908s -> 474.687s (-0.47%) |
#150265 disabled this because it was a net perf win, but let's see if we can tweak the structure of this to allow more inlining on this side while still not MIR-inlining the loop when it's not just `memcmp` and thus hopefully preserving the perf win.

This should also allow MIR-inlining the length check, which was previously blocked, and thus might allow some obvious non-matches to optimize away as well.
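
For readers who want a picture of the shape being described, here is a rough sketch on stable Rust. It is not the code from this PR: `BytewiseEq`, `bytewise_slice_eq`, `generic_slice_eq`, and `elementwise_loop` are hypothetical names, two separate entry points stand in for the internal trait dispatch, and `#[inline(never)]` is only a stand-in for "don't inline the loop". The idea it tries to show is a small outer function holding the length check (cheap to inline), a byte-compare path that is also cheap to inline, and an element-by-element loop kept out of line.

```rust
use std::mem::size_of_val;

/// Hypothetical marker trait (an assumption for this sketch, not the real
/// `core` internals): implementors promise that `==` on `Self` agrees with
/// comparing the raw bytes, and that the type has no padding bytes.
unsafe trait BytewiseEq: PartialEq {}

unsafe impl BytewiseEq for u8 {}
unsafe impl BytewiseEq for u32 {}

/// Byte-comparable path: the length check and the byte comparison are both
/// small, so the whole function is a reasonable inlining candidate.
#[inline]
fn bytewise_slice_eq<T: BytewiseEq>(a: &[T], b: &[T]) -> bool {
    // Cheap early exit: slices of different lengths can never be equal.
    if a.len() != b.len() {
        return false;
    }
    // SAFETY: each pointer is valid for the byte size of its own slice, and
    // `BytewiseEq` promises there are no uninitialized padding bytes.
    let (a_bytes, b_bytes) = unsafe {
        (
            std::slice::from_raw_parts(a.as_ptr().cast::<u8>(), size_of_val(a)),
            std::slice::from_raw_parts(b.as_ptr().cast::<u8>(), size_of_val(b)),
        )
    };
    // Comparing two byte slices is the `memcmp`-style comparison.
    a_bytes == b_bytes
}

/// General path: keep the length check in the small outer function, but
/// leave the loop in a separate function that is not inlined.
#[inline]
fn generic_slice_eq<T: PartialEq>(a: &[T], b: &[T]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    elementwise_loop(a, b)
}

/// The element-by-element loop, deliberately kept out of line.
#[inline(never)]
fn elementwise_loop<T: PartialEq>(a: &[T], b: &[T]) -> bool {
    a.iter().zip(b).all(|(x, y)| x == y)
}
```

With this split, a caller that can see the two lengths differ at compile time only needs the outer function inlined for the comparison to fold to `false`, which is the "obvious non-matches" case mentioned above. Floats go through the general path in this sketch because `NaN != NaN` and `0.0 == -0.0` mean their equality is not a byte comparison.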