i32.clamp() suggested by Clippy produces worse code than i32.min().max() #141915

Description

Shnatsel (Member)

On this code in image-webp, following the Clippy lint to replace .max(0).min(255) with .clamp(0,255) on an i32 value causes a performance regression:

https://github.com/image-rs/image-webp/blob/93baf7de7df50977a1fcb3a0bb53036d4780bff3/src/vp8.rs#L994-L999

It's unfortunate that .min().max() and .clamp() are not equivalent, and doubly so when Clippy nags us to rewrite the code in a way that makes it slower.

I've posted a self-contained sample that reproduces the issue on godbolt:

Generated assembly for .min().max(): https://rust.godbolt.org/z/zr7PK8vz3
Generated assembly for .clamp(): https://rust.godbolt.org/z/b898M45vo

You can see that the .clamp() version results in far more assembly; the vectorized loop has roughly twice as many instructions.

I've confirmed that the issue exists in rustc 1.75, 1.82, and 1.87, the latest as of this writing.
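For readers who can't open the godbolt links, here is a minimal sketch of the two forms being compared. It is not the image-webp code or the linked sample; the function names and the slice-based loop are assumptions made for illustration.

```rust
/// Clamp every value to 0..=255 using chained max/min,
/// the form the original code used.
pub fn clip_min_max(values: &mut [i32]) {
    for v in values.iter_mut() {
        *v = (*v).max(0).min(255);
    }
}

/// The same operation written with `clamp`, the form Clippy suggests.
pub fn clip_clamp(values: &mut [i32]) {
    for v in values.iter_mut() {
        *v = (*v).clamp(0, 255);
    }
}
```

Both functions return the same results for every input; the report is only about the machine code generated for loops like these, which this sketch does not measure.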

Activity

added needs-triage (This issue may need triage. Remove it if it has been sufficiently triaged.) on Jun 2, 2025
added the labels below and removed needs-triage on Jun 2, 2025:
A-LLVM: Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.
I-slow: Issue: Problems and improvements with respect to performance of generated code.
T-compiler: Relevant to the compiler team, which will review and decide on the PR/issue.
C-optimization: Category: An issue highlighting optimization opportunities or PRs implementing such
folkertdev (Contributor) commented on Jun 2, 2025

The issue here is the assert!(min <= max). When it is removed, both versions generate the same code.

    fn clamp(self, min: Self, max: Self) -> Self
    where
        Self: Sized,
    {
        assert!(min <= max);
        if self < min {
            min
        } else if self > max {
            max
        } else {
            self
        }
    }

Unfortunately, the panic is documented behavior (https://doc.rust-lang.org/std/cmp/trait.Ord.html#method.clamp), so I'm not sure what can actually be done here.
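A small sketch of the behavioral difference that the assertion introduces: with inverted bounds, clamp panics, while the min/max chain quietly returns a value. The helper names here are hypothetical and exist only for illustration.

```rust
use std::panic;

/// Returns true if `value.clamp(min, max)` panics.
/// `Ord::clamp` is documented to panic when `min > max`.
pub fn clamp_panics(value: i32, min: i32, max: i32) -> bool {
    panic::catch_unwind(|| value.clamp(min, max)).is_err()
}

/// The chained form has no such check and never panics,
/// even when the bounds are inverted.
pub fn min_max_chain(value: i32, min: i32, max: i32) -> i32 {
    value.max(min).min(max)
}
```

Calling clamp_panics(5, 10, 0) prints a panic message to stderr and returns true, while min_max_chain(5, 10, 0) returns a value without complaint. This is why the two forms are not strictly interchangeable in general, even though they agree whenever min <= max.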

Shnatsel (Member, Author) commented on Jun 2, 2025

Oh, I can confirm that's the case on godbolt. That's weird; I expect the panic to be removed by constant propagation, and it is, but it messes with optimization anyway.

JonathanBrouwer (Contributor) commented on Jun 2, 2025

This is a simplified example:
https://rust.godbolt.org/z/avTqGq4Kj
The assert actually does not matter in this simple code (uncommenting it does not change the generated code).
I've also discovered that although the same number of instructions is generated, the instructions themselves are different, and possibly optimize worse.

samueltardieu (Member) commented on Jun 2, 2025

@JonathanBrouwer Inverting the order in which min and max are compared makes clamp_impl() generate the same code as clip1(), even with the assertion enabled.
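One possible reading of "inverting the order" is sketched below: the same clamp logic with the upper bound tested before the lower bound. This is not the code from the linked godbolt sample (clamp_impl() and clip1() are defined there); both function names below are hypothetical, and the sketch makes no claim about the assembly either version produces.

```rust
/// Comparison order used by the standard library implementation
/// quoted earlier in the thread: lower bound first, then upper bound.
pub fn clamp_min_first(x: i32, min: i32, max: i32) -> i32 {
    assert!(min <= max);
    if x < min {
        min
    } else if x > max {
        max
    } else {
        x
    }
}

/// Same logic with the comparisons swapped: upper bound first.
/// (Assumed interpretation of the comment above.)
pub fn clamp_max_first(x: i32, min: i32, max: i32) -> i32 {
    assert!(min <= max);
    if x > max {
        max
    } else if x < min {
        min
    } else {
        x
    }
}
```

Because the assertion guarantees min <= max, at most one of the two conditions can hold for any x, so the two orderings always return the same value. Any difference would show up only in the generated code.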

okaneco (Contributor) commented on Jun 2, 2025

Related to, or a duplicate of, #125738.

I believe I reported the lint to Clippy back when it was in beta.
It was decided to close that issue and open an issue in this repo instead:
rust-lang/rust-clippy#12826

LLVM upstream issue: llvm/llvm-project#104875
The PR there appears to have stalled.

hkBst (Member) commented on Jun 5, 2025

@Shnatsel if the issue is the order in which clamp does the comparisons, does manually exchanging the max and min calls in the manual clamp also result in a performance difference?

Shnatsel (Member, Author) commented on Jun 5, 2025

There is no performance difference between .min().max() and .max().min(), and the assembly differences are minimal as well: https://rust.godbolt.org/z/1v857sa1K
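As a sketch of the two orderings being compared here (hypothetical function names, fixed bounds of 0 and 255 assumed to match the original use case):

```rust
/// Apply the upper bound first, then the lower bound.
pub fn min_then_max(v: i32) -> i32 {
    v.min(255).max(0)
}

/// Apply the lower bound first, then the upper bound.
pub fn max_then_min(v: i32) -> i32 {
    v.max(0).min(255)
}
```

With a lower bound that does not exceed the upper bound, both orderings return the same value for every input. They only diverge when the bounds are inverted: for bounds 10 and 0, v.max(10).min(0) yields 0 while v.min(0).max(10) yields 10.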


          i32.clamp() suggested by Clippy produces worse code than i32.min().max() · Issue #141915 · rust-lang/rust