@scottmcm
Member

@scottmcm scottmcm commented Jan 10, 2026

#150265 disabled this because it was a net perf win, but let's see if we can tweak the structure of this to allow more inlining on this side while still not MIR-inlining the loop when it's not just memcmp and thus hopefully preserving the perf win.

This should also allow MIR-inlining the length check, which was previously blocked, and thus might allow some obvious non-matches to optimize away as well.
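To make the intended shape concrete, here is a minimal sketch of the idea, not the actual `core` implementation. The trait `SameLenEq` and the function `slice_eq` are hypothetical names, and `#[inline(never)]` on stable Rust only stands in for the compiler-internal inlining controls discussed in this PR. The length check sits in a tiny wrapper that is cheap to inline, the byte case is a single memcmp-style call, and the general loop stays out of line.

```rust
// Illustrative sketch only: the trait and function names are hypothetical,
// and `#[inline(never)]` stands in for rustc's internal inlining controls.

/// Compares two slices that are already known to have the same length.
trait SameLenEq: Sized {
    fn same_len_eq(a: &[Self], b: &[Self]) -> bool;
}

impl SameLenEq for u8 {
    // Bytes compare bitwise, so this is typically one memcmp-style call;
    // it is small enough that inlining it into callers is cheap.
    #[inline]
    fn same_len_eq(a: &[u8], b: &[u8]) -> bool {
        a == b
    }
}

impl SameLenEq for f32 {
    // Floats need element-by-element `==` (NaN != NaN), so the loop
    // is kept out of line rather than copied into every caller.
    #[inline(never)]
    fn same_len_eq(a: &[f32], b: &[f32]) -> bool {
        a.iter().zip(b).all(|(x, y)| x == y)
    }
}

/// Small wrapper: the length check lives here so it can be inlined
/// and obvious mismatches can be resolved at the call site.
#[inline]
fn slice_eq<T: SameLenEq>(a: &[T], b: &[T]) -> bool {
    a.len() == b.len() && T::same_len_eq(a, b)
}

fn main() {
    let x: &[u8] = b"abc";
    let y: &[u8] = b"abd";
    assert!(slice_eq(x, x));
    assert!(!slice_eq(x, y));
    assert!(!slice_eq(x, &x[..2])); // rejected by the length check alone
    let nan: &[f32] = &[f32::NAN];
    assert!(!slice_eq(nan, nan)); // not a bytewise comparison
}
```

Whether the optimizer actually inlines any of these is up to the compiler; the sketch only shows where the boundaries are drawn.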

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jan 10, 2026
@scottmcm
Member Author

@bors try @rust-timer queue

rust-bors bot added a commit that referenced this pull request Jan 10, 2026
Tweak `SlicePartialEq` to allow MIR-inlining the `compare_bytes` call
@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jan 10, 2026
@rust-bors
Contributor

rust-bors bot commented Jan 11, 2026

☀️ Try build successful (CI)
Build commit: 8018bcc (8018bcc8c2cb8bbdb7e2eee7163156d48c0bcc85, parent: f57eac1bf98cb5d578e3364b64365ec398c137df)

@rust-timer
Collaborator

Finished benchmarking commit (8018bcc): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @rustbot label: +perf-regression-triaged. If not, please fix the regressions and do another perf run. If its results are neutral or positive, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.5% | [0.3%, 1.1%] | 7 |
| Regressions ❌ (secondary) | 0.4% | [0.4%, 0.4%] | 1 |
| Improvements ✅ (primary) | -0.8% | [-2.8%, -0.1%] | 5 |
| Improvements ✅ (secondary) | -0.9% | [-2.7%, -0.1%] | 3 |
| All ❌✅ (primary) | -0.0% | [-2.8%, 1.1%] | 12 |
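
As a quick consistency check on the primary rows above, the "All" mean can be recovered as a count-weighted mean of the regression and improvement means. This is only a sketch: the helper `combined_mean` is a hypothetical name, and because the inputs are the rounded percentages shown in the report, the result is approximate.

```rust
/// Count-weighted mean of per-group means, in percent.
/// Each entry is (group mean in percent, number of benchmarks).
fn combined_mean(groups: &[(f64, u32)]) -> f64 {
    let total: u32 = groups.iter().map(|&(_, n)| n).sum();
    let weighted: f64 = groups
        .iter()
        .map(|&(mean, n)| mean * f64::from(n))
        .sum();
    weighted / f64::from(total)
}

fn main() {
    // Primary regressions (0.5%, 7 benchmarks) and improvements (-0.8%, 5).
    let try_run = [(0.5, 7), (-0.8, 5)];
    let m = combined_mean(&try_run);
    // (3.5 - 4.0) / 12 is about -0.04, which rounds to the -0.0% shown.
    assert!((m - (-0.5 / 12.0)).abs() < 1e-9);
    println!("combined primary mean: {m:.2}%");
}
```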

Max RSS (memory usage)

Results (primary 1.0%, secondary -2.7%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 3.9% | [2.0%, 10.8%] | 6 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -4.9% | [-6.7%, -3.1%] | 3 |
| Improvements ✅ (secondary) | -2.7% | [-2.7%, -2.7%] | 1 |
| All ❌✅ (primary) | 1.0% | [-6.7%, 10.8%] | 9 |

Cycles

Results (primary 2.0%, secondary 2.7%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 2.0% | [2.0%, 2.0%] | 1 |
| Regressions ❌ (secondary) | 2.7% | [2.0%, 3.2%] | 3 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 2.0% | [2.0%, 2.0%] | 1 |

Binary size

Results (primary 0.0%, secondary -0.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.4% | [0.1%, 0.7%] | 4 |
| Regressions ❌ (secondary) | 0.3% | [0.1%, 0.5%] | 4 |
| Improvements ✅ (primary) | -0.3% | [-1.2%, -0.0%] | 4 |
| Improvements ✅ (secondary) | -1.2% | [-1.2%, -1.2%] | 1 |
| All ❌✅ (primary) | 0.0% | [-1.2%, 0.7%] | 8 |

Bootstrap: 473.812s -> 477.487s (0.78%)
Artifact size: 391.34 MiB -> 391.34 MiB (0.00%)
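
The percentages in the two summary lines above are plain relative changes. A minimal sketch of that arithmetic, where `percent_change` is a hypothetical helper name and the inputs are the figures quoted in the report:

```rust
/// Relative change from `before` to `after`, in percent.
fn percent_change(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}

fn main() {
    // Bootstrap times from the report above, in seconds.
    let bootstrap = percent_change(473.812, 477.487);
    assert_eq!(format!("{bootstrap:.2}"), "0.78");

    // Artifact size did not change, so the relative change is zero.
    assert_eq!(percent_change(391.34, 391.34), 0.0);
}
```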

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Jan 11, 2026
@scottmcm
Member Author

Hmm, so this recovered the syn loss from #150265 (comment), but isn't an obvious overall win.

I do like removing the second `#[rustc_no_mir_inline]` that #150265 had added, though, so maybe it makes sense regardless.

cc @saethlin in case you have any thoughts here.

@scottmcm scottmcm marked this pull request as ready for review January 11, 2026 18:07
@rustbot rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jan 11, 2026
@rustbot rustbot removed the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Jan 11, 2026
@rustbot
Collaborator

rustbot commented Jan 11, 2026

r? @Mark-Simulacrum

rustbot has assigned @Mark-Simulacrum.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@saethlin
Member

I don't have any strong opinions on this. It would be neat if we had better ways to learn about the impacts of MIR opts than squinting at the perf result.

@Mark-Simulacrum
Member

r=me with rebase if we want to go ahead.

@Mark-Simulacrum Mark-Simulacrum added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 25, 2026
150265 disabled this because it was a net perf win, but let's see if we can tweak the structure of this to allow more inlining on this side while still not MIR-inlining the loop when it's not just `memcmp`.

This should also allow MIR-inlining the length check, which was previously blocked.
@scottmcm scottmcm force-pushed the tweak-slice-partial-eq branch from 60aecfc to 51de309 Compare January 27, 2026 08:10
@rustbot
Collaborator

rustbot commented Jan 27, 2026

This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed.

Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers.

@rust-log-analyzer
Collaborator

The job x86_64-gnu-tools failed! Check out the build log: (web) (plain enhanced) (plain)

Click to see the possible cause of the failure (guessed by this bot)
REPOSITORY                                   TAG       IMAGE ID       CREATED      SIZE
ghcr.io/dependabot/dependabot-updater-core   latest    354d02aa29ac   7 days ago   783MB
=> Removing docker images...
Deleted Images:
untagged: ghcr.io/dependabot/dependabot-updater-core:latest
untagged: ghcr.io/dependabot/dependabot-updater-core@sha256:596da3f22bcbdff2c96fd7126001278022c834c1621c5efa2ad1a7794590636c
deleted: sha256:354d02aa29acf525570c732b6e006ecf138de6d63ca525d552eb4b24880ddc6c
deleted: sha256:8b7af0e426bc2cbeeacfd96b8354d3b80016991520977197e62090e47abaede8
deleted: sha256:cadf11ef1de7fdd5eab563757942353684047f09b212dc99d6ed48e8acf34d62
deleted: sha256:569b0caf9d5285db44ccd2629a3470139eea755be423a33a54d8a24cb3926bfa
deleted: sha256:f9dc5feb048d8f9fd43137e3998f59e9acfbd76c47a4e14984d109654119e282
---
tests/ui/double_parens.rs ... ok
tests/ui/drain_collect.fixed ... ok
tests/ui/duplicate_underscore_argument.rs ... ok
tests/ui/duplicated_attributes.rs ... ok
tests/ui/duration_suboptimal_units.rs ... ok
tests/ui/duration_suboptimal_units_days_weeks.rs ... ok
tests/ui/duration_subsec.rs ... ok
tests/ui/double_parens.fixed ... ok
tests/ui/duration_suboptimal_units_days_weeks.fixed ... ok
tests/ui/duration_suboptimal_units.fixed ... ok
tests/ui/duration_subsec.fixed ... ok
tests/ui/empty_docs.rs ... ok
tests/ui/elidable_lifetime_names.rs ... ok
tests/ui/else_if_without_else.rs ... ok
tests/ui/eager_transmute.rs ... ok
---
[WARNING] line 39: Delta is 0 for "x", maybe try to use `compare-elements-position` instead?

======== tests/rustdoc-gui/setting-go-to-only-result.goml ========

[ERROR] setting-go-to-only-result output:
Execution context was destroyed, most likely because of a navigation.
stack: Error: Execution context was destroyed, most likely because of a navigation.
    at rewriteError (/checkout/obj/build/x86_64-unknown-linux-gnu/test/rustdoc-gui/node_modules/puppeteer-core/lib/cjs/puppeteer/cdp/ExecutionContext.js:457:15)
    at async #evaluate (/checkout/obj/build/x86_64-unknown-linux-gnu/test/rustdoc-gui/node_modules/puppeteer-core/lib/cjs/puppeteer/cdp/ExecutionContext.js:389:60)
    at async ExecutionContext.evaluate (/checkout/obj/build/x86_64-unknown-linux-gnu/test/rustdoc-gui/node_modules/puppeteer-core/lib/cjs/puppeteer/cdp/ExecutionContext.js:277:16)
    at async IsolatedWorld.evaluate (/checkout/obj/build/x86_64-unknown-linux-gnu/test/rustdoc-gui/node_modules/puppeteer-core/lib/cjs/puppeteer/cdp/IsolatedWorld.js:100:16)
    at async CdpFrame.evaluate (/checkout/obj/build/x86_64-unknown-linux-gnu/test/rustdoc-gui/node_modules/puppeteer-core/lib/cjs/puppeteer/api/Frame.js:362:20)
    at async CdpPage.evaluate (/checkout/obj/build/x86_64-unknown-linux-gnu/test/rustdoc-gui/node_modules/puppeteer-core/lib/cjs/puppeteer/api/Page.js:826:20)
    at async /checkout/obj/build/x86_64-unknown-linux-gnu/test/rustdoc-gui/node_modules/browser-ui-test/src/index.js:432:28
    at async waitForConditionTrue (/checkout/obj/build/x86_64-unknown-linux-gnu/test/rustdoc-gui/node_modules/browser-ui-test/src/utils.js:209:13)
    at async runAllCommands (/checkout/obj/build/x86_64-unknown-linux-gnu/test/rustdoc-gui/node_modules/browser-ui-test/src/index.js:431:22)
    at async innerRunTestCode (/checkout/obj/build/x86_64-unknown-linux-gnu/test/rustdoc-gui/node_modules/browser-ui-test/src/index.js:714:21)



<= doc-ui tests done: 146 succeeded, 1 failed, 0 filtered out

@scottmcm
Member Author

"Execution context was destroyed, most likely because of a navigation."

@scottmcm scottmcm closed this Jan 27, 2026
@rustbot rustbot removed the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Jan 27, 2026
@scottmcm scottmcm reopened this Jan 27, 2026
@rustbot rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jan 27, 2026
@scottmcm
Member Author

The more I think about it the more I like removing the rustc_no_mir_inline that needed a giant comment, so let's do it.

@bors r=Mark-Simulacrum

@rust-bors
Contributor

rust-bors bot commented Jan 27, 2026

📌 Commit 51de309 has been approved by Mark-Simulacrum

It is now in the queue for this repository.

@rust-bors rust-bors bot added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 27, 2026
@Zalathar
Member

Scheduling: Encourage a mixture of rollup and non-rollup PRs.

@bors p=5

@rust-bors rust-bors bot added merged-by-bors This PR was explicitly merged by bors. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Jan 28, 2026
@rust-bors
Contributor

rust-bors bot commented Jan 28, 2026

☀️ Test successful - CI
Approved by: Mark-Simulacrum
Duration: 3h 10m 31s
Pushing 1e5065a to main...

@rust-bors rust-bors bot merged commit 1e5065a into rust-lang:main Jan 28, 2026
19 of 23 checks passed
@rustbot rustbot added this to the 1.95.0 milestone Jan 28, 2026
@github-actions
Contributor

What is this? This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.

Comparing a234ae6 (parent) -> 1e5065a (this PR)

Test differences

No test diffs found

Test dashboard

Run

cargo run --manifest-path src/ci/citool/Cargo.toml -- \
    test-dashboard 1e5065a4d99e0e3ccf1a1719055308e7a20e8f36 --output-dir test-dashboard

And then open test-dashboard/index.html in your browser to see an overview of all executed tests.

Job duration changes

  1. pr-check-1: 1921.4s -> 1644.9s (-14.4%)
  2. aarch64-gnu-debug: 4603.4s -> 3957.1s (-14.0%)
  3. x86_64-gnu-llvm-21-3: 7011.9s -> 6049.7s (-13.7%)
  4. dist-apple-various: 5134.1s -> 4444.9s (-13.4%)
  5. aarch64-gnu-llvm-20-1: 3908.8s -> 3403.2s (-12.9%)
  6. armhf-gnu: 5435.4s -> 4744.4s (-12.7%)
  7. i686-gnu-2: 6104.1s -> 5387.0s (-11.7%)
  8. i686-gnu-1: 8456.9s -> 7489.8s (-11.4%)
  9. x86_64-gnu-llvm-20: 4644.9s -> 4125.3s (-11.2%)
  10. x86_64-gnu-gcc: 4038.9s -> 3663.1s (-9.3%)
How to interpret the job duration changes?

Job durations can vary a lot, based on the actual runner instance
that executed the job, system noise, invalidated caches, etc. The table above is provided
mostly for t-infra members, for simpler debugging of potential CI slow-downs.

@rust-timer
Collaborator

Finished benchmarking commit (1e5065a): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Our benchmarks found a performance regression caused by this PR.
This might be an actual regression, but it can also be just noise.

Next Steps:

  • If the regression was expected or you think it can be justified,
    please write a comment with sufficient written justification, and add
    @rustbot label: +perf-regression-triaged to it, to mark the regression as triaged.
  • If you think that you know of a way to resolve the regression, try to create
    a new PR with a fix for the regression.
  • If you do not understand the regression or you think that it is just noise,
    you can ask the @rust-lang/wg-compiler-performance working group for help (members of this group
    were already notified of this PR).

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.4% | [0.3%, 0.6%] | 6 |
| Regressions ❌ (secondary) | 0.7% | [0.7%, 0.7%] | 1 |
| Improvements ✅ (primary) | -0.9% | [-2.3%, -0.1%] | 7 |
| Improvements ✅ (secondary) | -2.3% | [-2.3%, -2.3%] | 1 |
| All ❌✅ (primary) | -0.3% | [-2.3%, 0.6%] | 13 |

Max RSS (memory usage)

Results (primary -7.2%, secondary 2.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 5.4% | [4.6%, 5.7%] | 5 |
| Improvements ✅ (primary) | -7.2% | [-10.8%, -3.6%] | 2 |
| Improvements ✅ (secondary) | -3.6% | [-6.9%, -2.0%] | 3 |
| All ❌✅ (primary) | -7.2% | [-10.8%, -3.6%] | 2 |

Cycles

Results (primary 8.6%, secondary 4.8%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 9.5% | [3.2%, 17.3%] | 12 |
| Regressions ❌ (secondary) | 4.8% | [2.5%, 6.3%] | 4 |
| Improvements ✅ (primary) | -2.0% | [-2.0%, -2.0%] | 1 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 8.6% | [-2.0%, 17.3%] | 13 |

Binary size

Results (primary -0.0%, secondary -0.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.1% | [0.0%, 0.6%] | 25 |
| Regressions ❌ (secondary) | 0.1% | [0.0%, 0.2%] | 9 |
| Improvements ✅ (primary) | -0.3% | [-1.0%, -0.0%] | 12 |
| Improvements ✅ (secondary) | -1.0% | [-1.0%, -1.0%] | 1 |
| All ❌✅ (primary) | -0.0% | [-1.0%, 0.6%] | 37 |

Bootstrap: 476.908s -> 474.687s (-0.47%)
Artifact size: 397.92 MiB -> 397.87 MiB (-0.01%)

@scottmcm
Member Author

The cycle regression looks like html5ever bimodality -- it went back to the previous level in #151550.

(It appears that #151646 and #151674 had similar spikes.)

@scottmcm scottmcm deleted the tweak-slice-partial-eq branch January 29, 2026 09:39

Labels

merged-by-bors This PR was explicitly merged by bors. perf-regression Performance regression. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue.

7 participants