Match on two-variant enum optimizes poorly #122734
Comments
We actually do:
%0 = alloca ...
%1 = load i8, ptr %0, align 1, !range <range info>
Somewhere in SROA that range information seems to get lost.
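For orientation, here is a minimal Rust sketch of the kind of source being discussed (the enum and function names are illustrative, not taken from the issue): a two-variant enum taken by value, where the unoptimized IR stores the argument to an alloca and re-loads its discriminant with `!range` metadata, as in the snippet above.

```rust
// Illustrative only: a fieldless two-variant enum passed by value. Before
// optimization, rustc's IR spills `f` to an alloca and loads the discriminant
// back as an i8 carrying `!range` metadata (roughly the `load i8 ... !range`
// quoted above); the question in this thread is where that range info goes.
pub enum Flag {
    Off,
    On,
}

#[no_mangle]
pub fn is_on(f: Flag) -> bool {
    match f {
        Flag::Off => false,
        Flag::On => true,
    }
}
```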
Looks like this is because of the range metadata on the load: https://godbolt.org/z/bPfezGhW6. So this may be fixed after LLVM 19, since we'll be able to put range metadata on the by-value arguments.
#122726 should be a similar issue.
@erikdesjardins I'm curious why we don't put the range metadata on non-load instructions.
Reference: llvm/llvm-project#83171.
Yeah, that's the original PR that added support, and there have been a few follow ups so far to have different passes check for range attributes. Since there's still quite a while until LLVM 19, I imagine that support will be added to every pass or analysis that currently uses range metadata by that time.
LLVM generally only checks for such metadata on loads and calls, so I think adding it to other instructions won't have much of an effect. The only usage that allows any kind of instruction appears to be in InstSimplify, and it's used only to optimize comparisons that are always true or always false.
IIUC, we could add range metadata to other instructions; we just don't do that now. Since we can quickly derive the range of arbitrary values once the source value provides range metadata, adding range metadata to other instructions doesn't seem necessary, if that's correct.
I'm pretty sure that's because we store to the alloca and then load from it again. So a very early pass replaces the load-store with just using the argument, but in doing so doesn't preserve the range metadata. It'd need to insert an assume (or something similar) to keep that information.
I've tried it on #122728. I got a huge change in the perf result. 🫠
I'd been thinking about the possibility of LLVM doing the ...
@rustbot claim
Proof: https://alive2.llvm.org/ce/z/U-G7yV
Helps: rust-lang/rust#72646 and rust-lang/rust#122734
Rust compiler's current output: https://godbolt.org/z/7E3fET6Md

IPSCCP can do this transform, but it does not help the motivating issue since it runs only once, early in the optimization pipeline. Reimplementing this in CVP folds the motivating issue into a simple `icmp eq` instruction.

Fixes #130100
@rustbot label llvm-fixed-upstream
But with llvm/llvm-project#133344 fixed upstream, we should revisit this after the LLVM 21 version upgrade and see whether we can enable this test: rust/tests/codegen/option-niche-eq.rs, lines 65 to 73 (at 2a06022).
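For context, a hedged sketch of what such a codegen test checks (this is not the actual content of those lines; the function name and signature are assumptions): with the niche optimization, comparing two `Option<NonZeroU32>` values should compile down to a single integer comparison with no branches, which the test would assert via FileCheck patterns on the emitted IR.

```rust
// Hypothetical sketch, not the real test body: Option<NonZeroU32> occupies a
// single u32 thanks to the niche, so `l == r` should lower to one compare.
use std::num::NonZeroU32;

#[no_mangle]
pub fn non_zero_eq(l: Option<NonZeroU32>, r: Option<NonZeroU32>) -> bool {
    l == r
}
```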
Looks like it'll be enough! With trunk LLVM and the current no-prepopulate-passes IR from rustc, it all collapses: https://llvm.godbolt.org/z/h7TW1qnzT

Still can't go back to the derive, though, since with the derived ...
I looked into this some more and found another LLVM bug: llvm/llvm-project#134024. Looks like it needs at least three variants before it impacts us, so it's not the solution to fixing the derive either :(

EDIT: ...and another one for the >2 variants case, llvm/llvm-project#134028
https://godbolt.org/z/f3TMe3rcW

`good` gets optimal codegen since the LLVM 18 bump, but `bad` has had poor codegen for some time. Codegen also becomes optimal if `a` and `b` are references rather than passed by value.
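For readers without the godbolt link handy, here is a hedged sketch of the shape of the example (the enum and function names are hypothetical, and the exact bodies of `good` and `bad` are assumptions rather than copies from the link): the same equality predicate on a by-value two-variant enum, written once via the derived `PartialEq` and once as an explicit match on the pair.

```rust
// Illustrative sketch only; the real `good`/`bad` functions are behind the
// godbolt link above.
#[derive(Clone, Copy, PartialEq)]
pub enum E {
    A,
    B,
}

// Equality via the derived `PartialEq`.
#[no_mangle]
pub fn eq_via_derive(a: E, b: E) -> bool {
    a == b
}

// The same predicate as an explicit match on the pair. Per the issue text, one
// formulation of this comparison optimizes well since the LLVM 18 bump while
// the other does not, and passing `&E` instead of `E` also yields optimal code.
#[no_mangle]
pub fn eq_via_match(a: E, b: E) -> bool {
    matches!((a, b), (E::A, E::A) | (E::B, E::B))
}
```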