Inline `bridge::Buffer` methods. #97604

nnethercote · 2022-05-31T23:32:30Z

This fixes a performance regression caused by making Buffer
non-generic in #97004.

rust-highfive · 2022-05-31T23:32:32Z

Hey! It looks like you've submitted a new PR for the library teams!

If this PR contains changes to any rust-lang/rust public library APIs then please comment with `r? rust-lang/libs-api @rustbot label +T-libs-api -T-libs` to request review from a libs-api team reviewer. If you're unsure where your change falls, no worries: just leave it as is and the reviewer will take a look and decide whether to forward it on.

Examples of T-libs-api changes:

- Stabilizing library features
- Introducing insta-stable changes such as new implementations of existing stable traits on existing stable types
- Introducing new or changing existing unstable library APIs (excluding permanently unstable features / features without a tracking issue)
- Changing public documentation in ways that create new stability guarantees
- Changing observable runtime behavior of library APIs

nnethercote · 2022-05-31T23:33:43Z

@bors try @rust-timer queue

rust-timer · 2022-05-31T23:33:44Z

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

bors · 2022-05-31T23:33:52Z

⌛ Trying commit dee353d with merge 96f4496f4f75f1e98a276af608d7b4a9a06fc333...

nnethercote · 2022-05-31T23:34:08Z

This is an alternative to #97539.

bors · 2022-06-01T01:05:14Z

☀️ Try build successful - checks-actions
Build commit: 96f4496f4f75f1e98a276af608d7b4a9a06fc333

rust-timer · 2022-06-01T01:05:16Z

Queued 96f4496f4f75f1e98a276af608d7b4a9a06fc333 with parent e094492, future comparison URL.

rust-timer · 2022-06-01T02:43:24Z

Finished benchmarking commit (96f4496f4f75f1e98a276af608d7b4a9a06fc333): comparison url.

Instruction count

Primary benchmarks: mixed results
Secondary benchmarks: 🎉 relevant improvements found

| | mean¹ | max | count² |
|---|---|---|---|
| Regressions 😿 (primary) | 3.3% | 3.3% | 1 |
| Regressions 😿 (secondary) | N/A | N/A | 0 |
| Improvements 🎉 (primary) | -0.2% | -0.2% | 1 |
| Improvements 🎉 (secondary) | -1.7% | -1.7% | 3 |
| All 😿🎉 (primary) | 1.5% | 3.3% | 2 |

Max RSS (memory usage)

Results

Primary benchmarks: 🎉 relevant improvement found
Secondary benchmarks: mixed results

| | mean¹ | max | count² |
|---|---|---|---|
| Regressions 😿 (primary) | N/A | N/A | 0 |
| Regressions 😿 (secondary) | 3.8% | 3.8% | 2 |
| Improvements 🎉 (primary) | -1.2% | -1.2% | 1 |
| Improvements 🎉 (secondary) | -2.2% | -2.2% | 1 |
| All 😿🎉 (primary) | -1.2% | -1.2% | 1 |

Cycles

Results

Primary benchmarks: 😿 relevant regression found
Secondary benchmarks: no relevant changes found

| | mean¹ | max | count² |
|---|---|---|---|
| Regressions 😿 (primary) | 5.2% | 5.2% | 1 |
| Regressions 😿 (secondary) | N/A | N/A | 0 |
| Improvements 🎉 (primary) | N/A | N/A | 0 |
| Improvements 🎉 (secondary) | N/A | N/A | 0 |
| All 😿🎉 (primary) | 5.2% | 5.2% | 1 |

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf +perf-regression

¹ the arithmetic mean of the percent change
² number of relevant changes

nnethercote · 2022-06-01T03:07:39Z

The CI perf results are similar enough to those in #97539.

eddyb · 2022-06-04T00:47:04Z

@bors r+

bors · 2022-06-04T00:47:05Z

📌 Commit dee353d has been approved by eddyb

bors · 2022-06-04T04:51:33Z

⌛ Testing commit dee353d with merge cb0584f...

bors · 2022-06-04T07:32:06Z

☀️ Test successful - checks-actions
Approved by: eddyb
Pushing cb0584f to master...

rust-timer · 2022-06-04T08:51:15Z

Finished benchmarking commit (cb0584f): comparison url.

Instruction count

Primary benchmarks: mixed results
Secondary benchmarks: 🎉 relevant improvements found

| | mean¹ | max | count² |
|---|---|---|---|
| Regressions 😿 (primary) | 3.8% | 3.8% | 1 |
| Regressions 😿 (secondary) | N/A | N/A | 0 |
| Improvements 🎉 (primary) | -0.3% | -0.3% | 1 |
| Improvements 🎉 (secondary) | -2.1% | -2.1% | 3 |
| All 😿🎉 (primary) | 1.8% | 3.8% | 2 |

Max RSS (memory usage)

Results

Primary benchmarks: no relevant changes found
Secondary benchmarks: mixed results

| | mean¹ | max | count² |
|---|---|---|---|
| Regressions 😿 (primary) | N/A | N/A | 0 |
| Regressions 😿 (secondary) | 2.1% | 2.1% | 1 |
| Improvements 🎉 (primary) | N/A | N/A | 0 |
| Improvements 🎉 (secondary) | -2.4% | -2.4% | 1 |
| All 😿🎉 (primary) | N/A | N/A | 0 |

Cycles

Results

Primary benchmarks: 😿 relevant regression found
Secondary benchmarks: no relevant changes found

| | mean¹ | max | count² |
|---|---|---|---|
| Regressions 😿 (primary) | 5.2% | 5.2% | 1 |
| Regressions 😿 (secondary) | N/A | N/A | 0 |
| Improvements 🎉 (primary) | N/A | N/A | 0 |
| Improvements 🎉 (secondary) | N/A | N/A | 0 |
| All 😿🎉 (primary) | 5.2% | 5.2% | 1 |

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please open an issue or create a new PR that fixes the regressions, add a comment linking to the newly created issue or PR, and then add the perf-regression-triaged label to this PR.

@rustbot label: +perf-regression

¹ the arithmetic mean of the percent change
² number of relevant changes

rylev · 2022-06-07T14:04:45Z

@nnethercote I'm a bit confused, as this PR was opened to address a performance regression, but it seems to be itself a partial perf regression (at least in instruction counts). This causes a nearly 4% perf regression in serde_derive-1.0.136. Granted, that particular benchmark has been somewhat noisy, but not nearly to the level seen here.

Can you justify why this perf regression is acceptable and then label with the perf-regression-triaged label?

nnethercote · 2022-06-08T00:39:37Z

I'm confident the serde_derive regression is unrelated to this PR. It only happens on full opt builds, but this PR doesn't make any changes that affect code generation. If it was a genuine regression I would expect check and debug builds to also regress, and possibly the incremental scenarios.

We occasionally see this kind of random fluctuation with opt builds. It's probably caused by a slight change in codegen unit partitioning. You can see on the graphs at perf.rust-lang.org that there was a similar drop for serde_derive-1.0.136-opt check full on May 28, but again none of the other serde_derive runs were affected. It's quite possible that some lucky PR will get a perf "improvement" on that case, and then some unlucky PR will get a corresponding "regression" some time after that.

@rustbot label: +perf-regression-triaged

eddyb · 2022-06-08T04:37:20Z

@rylev I'm not even sure it's random, but rather anything that results in more intensive proc macro codegen for the benefit of the runtime performance (i.e. actually using the proc macro) will cause a "shift" in benchmarks, where:

- proc macro benchmarks (e.g. serde_derive) get slower to compile
- benchmarks that use proc macros a lot (e.g. token-stream-stress, cargo) get faster

If you look at the regression results it looks like a clear-cut regression, but that's just because serde_derive got hidden - if you check "Show non-relevant results", you can see it's in the green, so it was faster to compile, but at the cost of being slower to use.

I was going to claim that the asymmetry of relative % makes this more confusing, but that's more noticeable the closer you get to -50% (which is undone by +100%, i.e. halving undone by doubling), and for -0.67% it's barely anything.
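To make the asymmetry concrete, here is a minimal sketch of the arithmetic. The helper names `undo_percent` and `asymmetry` are made up for illustration and are not part of rustc-perf. A change of `p` percent multiplies a measurement by `1 + p/100`, so undoing it requires multiplying by the reciprocal.

```rust
/// Percent change needed to get back to the original value after a
/// change of `p` percent (hypothetical helper, for illustration only).
fn undo_percent(p: f64) -> f64 {
    let factor = 1.0 + p / 100.0;
    (1.0 / factor - 1.0) * 100.0
}

/// Difference in magnitude, in percentage points, between a change and
/// the change that undoes it.
fn asymmetry(p: f64) -> f64 {
    undo_percent(p).abs() - p.abs()
}
```

With these definitions, `undo_percent(-50.0)` is `100.0` (halving is undone by doubling), while `undo_percent(-0.67)` is roughly `0.6745`, so the asymmetry for a change that small is well under a hundredth of a percentage point.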

So maybe there is some noise, although this PR is clearly worse than #97539's perf run for serde_derive, while being better for token-stream-stress (so the same tradeoff but more of it).

> It only happens on full opt builds, but this PR doesn't make any changes that affect code generation. If it was a genuine regression I would expect check and debug builds to also regress, and possibly the incremental scenarios.

This PR adds #[inline] to a bunch of functions, which:

- just like #97539 (Make `bridge::Buffer` generic again), those functions are now codegen'd in the crate using them, i.e. serde_derive in the case of the relevant benchmarks
- unlike just having them generic, `#[inline]` also hints to LLVM to more aggressively try to inline, and that only affects opt builds, not debug ones
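As a rough illustration of the two approaches being compared, the sketch below shows a non-generic type with `#[inline]` methods next to a generic one. `ByteBuf`, `GenericBuf` and `fill_both` are hypothetical stand-ins and are not the real `proc_macro::bridge::Buffer`. In a single file the cross-crate effect is not observable, so the comments only describe what would typically happen if the types lived in an upstream crate.

```rust
// Hypothetical stand-in for a non-generic buffer; NOT the real
// `proc_macro::bridge::Buffer`.
pub struct ByteBuf {
    data: Vec<u8>,
}

impl ByteBuf {
    // Without `#[inline]`, a non-generic method like this is typically
    // compiled once, in the crate that defines it. With `#[inline]`, its
    // body is made available to downstream crates, which can generate
    // (and inline) their own copy.
    #[inline]
    pub fn new() -> Self {
        ByteBuf { data: Vec::new() }
    }

    #[inline]
    pub fn push(&mut self, byte: u8) {
        self.data.push(byte);
    }

    #[inline]
    pub fn len(&self) -> usize {
        self.data.len()
    }
}

// Generic alternative: methods are instantiated in whichever crate
// uses them with a concrete `T`, even without `#[inline]`.
pub struct GenericBuf<T> {
    data: Vec<T>,
}

impl<T> GenericBuf<T> {
    pub fn new() -> Self {
        GenericBuf { data: Vec::new() }
    }

    pub fn push(&mut self, item: T) {
        self.data.push(item);
    }

    pub fn len(&self) -> usize {
        self.data.len()
    }
}

/// Pushes the same three bytes into both buffers and returns their lengths.
pub fn fill_both() -> (usize, usize) {
    let mut plain = ByteBuf::new();
    let mut generic = GenericBuf::<u8>::new();
    for byte in 1u8..=3 {
        plain.push(byte);
        generic.push(byte);
    }
    (plain.len(), generic.len())
}
```

Both versions behave identically at runtime. The difference under discussion is only where the machine code is generated and how strongly the optimizer is encouraged to inline it.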

Also, just remembered that Cargo doesn't actually optimize proc macros so... how are we getting improvements from changes that should only matter if proc macros were getting optimized?!

Inline bridge::Buffer methods.

dee353d

This fixes a performance regression caused by making `Buffer` non-generic in rust-lang#97004.

rust-highfive assigned eddyb May 31, 2022

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 31, 2022

nnethercote mentioned this pull request May 31, 2022

Make bridge::Buffer generic again. #97539

Closed

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 31, 2022

rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Jun 1, 2022

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 4, 2022

bors added the merged-by-bors This PR was explicitly merged by bors. label Jun 4, 2022

bors merged commit cb0584f into rust-lang:master Jun 4, 2022

rustbot added this to the 1.63.0 milestone Jun 4, 2022

nnethercote deleted the inline-bridge-Buffer-methods branch June 5, 2022 22:49

rustbot added the perf-regression-triaged The performance regression has been triaged. label Jun 8, 2022

nnethercote mentioned this pull request Aug 3, 2022

Proc macro tweaks #97004

Merged
