Inline bridge::Buffer methods. #97604

Merged
merged 1 commit into from
Jun 4, 2022

nnethercote
Contributor

This fixes a performance regression caused by making Buffer
non-generic in #97004.

r? @eddyb

@rust-highfive
Contributor

Hey! It looks like you've submitted a new PR for the library teams!

If this PR contains changes to any rust-lang/rust public library APIs then please comment with r? rust-lang/libs-api @rustbot label +T-libs-api -T-libs to request review from a libs-api team reviewer. If you're unsure where your change falls no worries, just leave it as is and the reviewer will take a look and make a decision to forward on if necessary.

Examples of T-libs-api changes:

  • Stabilizing library features
  • Introducing insta-stable changes such as new implementations of existing stable traits on existing stable types
  • Introducing new or changing existing unstable library APIs (excluding permanently unstable features / features without a tracking issue)
  • Changing public documentation in ways that create new stability guarantees
  • Changing observable runtime behavior of library APIs

@rust-highfive added the S-waiting-on-review label (Status: Awaiting review from the assignee but also interested parties.) on May 31, 2022
@nnethercote
Contributor Author

@bors try @rust-timer queue

@rust-timer
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot added the S-waiting-on-perf label (Status: Waiting on a perf run to be completed.) on May 31, 2022
@bors
Collaborator

bors commented May 31, 2022

⌛ Trying commit dee353d with merge 96f4496f4f75f1e98a276af608d7b4a9a06fc333...

@nnethercote
Contributor Author

This is an alternative to #97539.

@bors
Collaborator

bors commented Jun 1, 2022

☀️ Try build successful - checks-actions
Build commit: 96f4496f4f75f1e98a276af608d7b4a9a06fc333

@rust-timer
Collaborator

Queued 96f4496f4f75f1e98a276af608d7b4a9a06fc333 with parent e094492, future comparison URL.

@rust-timer
Collaborator

Finished benchmarking commit (96f4496f4f75f1e98a276af608d7b4a9a06fc333): comparison url.

Instruction count

  • Primary benchmarks: mixed results
  • Secondary benchmarks: 🎉 relevant improvements found

|  | mean [1] | max | count [2] |
|---|---|---|---|
| Regressions 😿 (primary) | 3.3% | 3.3% | 1 |
| Regressions 😿 (secondary) | N/A | N/A | 0 |
| Improvements 🎉 (primary) | -0.2% | -0.2% | 1 |
| Improvements 🎉 (secondary) | -1.7% | -1.7% | 3 |
| All 😿🎉 (primary) | 1.5% | 3.3% | 2 |

Max RSS (memory usage)

Results
  • Primary benchmarks: 🎉 relevant improvement found
  • Secondary benchmarks: mixed results

|  | mean [1] | max | count [2] |
|---|---|---|---|
| Regressions 😿 (primary) | N/A | N/A | 0 |
| Regressions 😿 (secondary) | 3.8% | 3.8% | 2 |
| Improvements 🎉 (primary) | -1.2% | -1.2% | 1 |
| Improvements 🎉 (secondary) | -2.2% | -2.2% | 1 |
| All 😿🎉 (primary) | -1.2% | -1.2% | 1 |

Cycles

Results
  • Primary benchmarks: 😿 relevant regression found
  • Secondary benchmarks: no relevant changes found

|  | mean [1] | max | count [2] |
|---|---|---|---|
| Regressions 😿 (primary) | 5.2% | 5.2% | 1 |
| Regressions 😿 (secondary) | N/A | N/A | 0 |
| Improvements 🎉 (primary) | N/A | N/A | 0 |
| Improvements 🎉 (secondary) | N/A | N/A | 0 |
| All 😿🎉 (primary) | 5.2% | 5.2% | 1 |

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf +perf-regression

Footnotes

  1. the arithmetic mean of the percent change

  2. number of relevant changes

@rustbot added the perf-regression label (Performance regression.) and removed the S-waiting-on-perf label (Status: Waiting on a perf run to be completed.) on Jun 1, 2022
@nnethercote
Contributor Author

The CI perf results are similar enough to those in #97539.

@eddyb
Member

eddyb commented Jun 4, 2022

@bors r+

@bors
Collaborator

bors commented Jun 4, 2022

📌 Commit dee353d has been approved by eddyb

@bors added the S-waiting-on-bors label (Status: Waiting on bors to run and complete tests. Bors will change the label on completion.) and removed the S-waiting-on-review label (Status: Awaiting review from the assignee but also interested parties.) on Jun 4, 2022
@bors
Collaborator

bors commented Jun 4, 2022

⌛ Testing commit dee353d with merge cb0584f...

@bors
Collaborator

bors commented Jun 4, 2022

☀️ Test successful - checks-actions
Approved by: eddyb
Pushing cb0584f to master...

@bors added the merged-by-bors label (This PR was explicitly merged by bors.) on Jun 4, 2022
@bors merged commit cb0584f into rust-lang:master on Jun 4, 2022
@rustbot added this to the 1.63.0 milestone on Jun 4, 2022
@rust-timer
Collaborator

Finished benchmarking commit (cb0584f): comparison url.

Instruction count

  • Primary benchmarks: mixed results
  • Secondary benchmarks: 🎉 relevant improvements found

|  | mean [1] | max | count [2] |
|---|---|---|---|
| Regressions 😿 (primary) | 3.8% | 3.8% | 1 |
| Regressions 😿 (secondary) | N/A | N/A | 0 |
| Improvements 🎉 (primary) | -0.3% | -0.3% | 1 |
| Improvements 🎉 (secondary) | -2.1% | -2.1% | 3 |
| All 😿🎉 (primary) | 1.8% | 3.8% | 2 |

Max RSS (memory usage)

Results
  • Primary benchmarks: no relevant changes found
  • Secondary benchmarks: mixed results

|  | mean [1] | max | count [2] |
|---|---|---|---|
| Regressions 😿 (primary) | N/A | N/A | 0 |
| Regressions 😿 (secondary) | 2.1% | 2.1% | 1 |
| Improvements 🎉 (primary) | N/A | N/A | 0 |
| Improvements 🎉 (secondary) | -2.4% | -2.4% | 1 |
| All 😿🎉 (primary) | N/A | N/A | 0 |

Cycles

Results
  • Primary benchmarks: 😿 relevant regression found
  • Secondary benchmarks: no relevant changes found

|  | mean [1] | max | count [2] |
|---|---|---|---|
| Regressions 😿 (primary) | 5.2% | 5.2% | 1 |
| Regressions 😿 (secondary) | N/A | N/A | 0 |
| Improvements 🎉 (primary) | N/A | N/A | 0 |
| Improvements 🎉 (secondary) | N/A | N/A | 0 |
| All 😿🎉 (primary) | 5.2% | 5.2% | 1 |

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please open an issue or create a new PR that fixes the regressions, add a comment linking to the newly created issue or PR, and then add the perf-regression-triaged label to this PR.

@rustbot label: +perf-regression

Footnotes

  1. the arithmetic mean of the percent change

  2. number of relevant changes

@nnethercote deleted the inline-bridge-Buffer-methods branch on June 5, 2022 22:49
@rylev
Member

rylev commented Jun 7, 2022

@nnethercote I'm a bit confused: this PR was opened to address a performance regression, but it seems to be a partial perf regression itself (at least in instruction counts). It causes a nearly 4% perf regression in serde_derive-1.0.136. Granted, that particular benchmark has been somewhat noisy, but not nearly to the level seen here.

Can you justify why this perf regression is acceptable and then label with the perf-regression-triaged label?

@nnethercote
Contributor Author

I'm confident the serde_derive regression is unrelated to this PR. It only happens on full opt builds, but this PR doesn't make any changes that affect code generation. If it was a genuine regression I would expect check and debug builds to also regress, and possibly the incremental scenarios.

We occasionally see this kind of random fluctuation with opt builds. It's probably caused by a slight change in codegen unit partitioning. You can see on the graphs at perf.rust-lang.org that there was a similar drop for serde_derive-1.0.136-opt check full on May 28, but again none of the other serde_derive runs were affected. It's quite possible that some lucky PR will get a perf "improvement" on that case, and then some unlucky PR will get a corresponding "regression" some time after that.

@rustbot label: +perf-regression-triaged

@rustbot added the perf-regression-triaged label (The performance regression has been triaged.) on Jun 8, 2022
@eddyb
Member

eddyb commented Jun 8, 2022

@rylev I'm not even sure it's random. Rather, anything that results in more intensive proc macro codegen for the benefit of runtime performance (i.e. actually using the proc macro) will cause a "shift" in benchmarks, where:

  • proc macro benchmarks (e.g. serde_derive) get slower to compile
  • benchmarks that use proc macros a lot (e.g. token-stream-stress, cargo) get faster

If you look at the regression results it looks like a clear-cut regression, but that's just because serde_derive got hidden - if you check "Show non-relevant results", you can see it's in the green, so it was faster to compile, but at the cost of being slower to use.

I was going to claim that the asymmetry of relative % makes this more confusing, but that's more noticeable the closer you get to -50% (which is undone by +100%, i.e. halving undone by doubling), and for -0.67% it's barely anything.
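To put numbers on that asymmetry, here is a minimal sketch of the arithmetic. The helper name `undo_pct` is made up for illustration and is not part of rustc-perf; it computes the percent increase needed to cancel a given percent decrease.

```rust
/// Percent increase needed to undo a percent decrease.
/// `decrease_pct` is given as a positive number, e.g. 50.0 for a -50% change.
/// (Hypothetical helper, for illustration only.)
fn undo_pct(decrease_pct: f64) -> f64 {
    // Fraction of the original value left after the decrease.
    let remaining = 1.0 - decrease_pct / 100.0;
    // Growing `remaining` back to 1.0 takes a factor of 1 / remaining.
    (1.0 / remaining - 1.0) * 100.0
}

fn main() {
    // The two cases from the comment above: a halving and a -0.67% change.
    for d in [50.0, 0.67] {
        println!("-{}% is undone by +{:.4}%", d, undo_pct(d));
    }
}
```

For -50% the undo is +100%, while for -0.67% it is only about +0.6745%, so at this scale the asymmetry is negligible.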

So maybe there is some noise, although this PR is clearly worse than #97539's perf run for serde_derive, while being better for token-stream-stress (so the same tradeoff but more of it).


> It only happens on full opt builds, but this PR doesn't make any changes that affect code generation. If it was a genuine regression I would expect check and debug builds to also regress, and possibly the incremental scenarios.

This PR adds #[inline] to a bunch of functions, which:

  • just like #97539 ("Make bridge::Buffer generic again"), those functions are now codegen'd in the crate using them, i.e. serde_derive in the case of the relevant benchmarks
  • unlike just having them generic, #[inline] also hints to LLVM to more aggressively try to inline, and that only affects opt builds, not debug ones
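A rough sketch of what adding `#[inline]` to non-generic methods looks like. The `Buffer` type below is a hypothetical stand-in backed by a `Vec<u8>`; it is not the real `proc_macro::bridge::Buffer`, and its fields and method set are assumptions for illustration only.

```rust
// Hypothetical stand-in for a non-generic byte buffer. This is NOT the
// real `proc_macro::bridge::Buffer`; layout and methods are assumed.
pub struct Buffer {
    data: Vec<u8>,
}

impl Buffer {
    // Without `#[inline]`, a non-generic method like this is normally
    // compiled once, in the crate that defines it. With the attribute,
    // downstream crates can also codegen and inline it themselves.
    #[inline]
    pub fn new() -> Self {
        Buffer { data: Vec::new() }
    }

    #[inline]
    pub fn len(&self) -> usize {
        self.data.len()
    }

    #[inline]
    pub fn push(&mut self, byte: u8) {
        self.data.push(byte);
    }

    #[inline]
    pub fn extend_from_slice(&mut self, xs: &[u8]) {
        self.data.extend_from_slice(xs);
    }
}

fn main() {
    let mut buf = Buffer::new();
    buf.push(1);
    buf.extend_from_slice(&[2, 3]);
    assert_eq!(buf.len(), 3);
}
```

The attribute is only a hint: whether any given call is actually inlined is still up to the optimizer.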

Also, just remembered that Cargo doesn't actually optimize proc macros so... how are we getting improvements from changes that should only matter if proc macros were getting optimized?!

@nnethercote mentioned this pull request on Aug 3, 2022
Labels

  • merged-by-bors: This PR was explicitly merged by bors.
  • perf-regression: Performance regression.
  • perf-regression-triaged: The performance regression has been triaged.
  • S-waiting-on-bors: Status: Waiting on bors to run and complete tests. Bors will change the label on completion.