Possible missed optimization: move const array initialized in place to an unnamed_addr constant #94315

Open
@tesuji

Description

This happens in real-world Rust code: rust-lang/rust#73825
Clang and Zig generate better IR, so LLVM can move constant, never-mutated arrays to .rodata.
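For context, here is a rough Rust sketch of the kind of source that leads to this IR. It is a hypothetical reduction, not the code from the linked issue: the function names are made up, and the table is shortened to the first eight values visible in the IR below (the real function has 64 entries and masks with 63). The second function shows a possible source-level workaround, assuming the table can be hoisted into a `static`.

```rust
// Hypothetical reduction: eight entries instead of 64, mask 7 instead of 63.

/// Table initialized in place inside the function. For the 64-entry
/// version in this issue, the IR contains an alloca filled by stores.
pub fn lookup_local(x: usize) -> i32 {
    let base: [i32; 8] = [67, 754, 860, 559, 368, 870, 548, 972];
    base[x & 7] // the mask keeps the index in bounds
}

/// Same table hoisted into an immutable `static`, which is typically
/// emitted as a constant global in read-only data.
static BASE: [i32; 8] = [67, 754, 860, 559, 368, 870, 548, 972];

pub fn lookup_static(x: usize) -> i32 {
    BASE[x & 7]
}
```

Both functions return the same values; the open question in this issue is whether LLVM can turn the first form into the second on its own.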

Godbolt link of LLVM-IR: https://godbolt.org/z/PGdYYx54r
Consider this:

define i32 @square(i64 noundef %x) local_unnamed_addr #0 {
  %base = alloca [256 x i8], align 16
  call void @llvm.lifetime.start.p0(i64 256, ptr nonnull %base)
  store <4 x i32> <i32 67, i32 754, i32 860, i32 559>, ptr %base, align 16
  %1 = getelementptr inbounds [64 x i32], ptr %base, i64 0, i64 4
  store <4 x i32> <i32 368, i32 870, i32 548, i32 972>, ptr %1, align 16
...
  %_3 = and i64 %x, 63
  %16 = getelementptr inbounds [64 x i32], ptr %base, i64 0, i64 %_3
  %_0 = load i32, ptr %16, align 4
  call void @llvm.lifetime.end.p0(i64 256, ptr nonnull %base)
  ret i32 %_0
}

I expect the above IR to be optimized to:

@square.base = private unnamed_addr constant <{ [256 x i8] }> <{ [256 x i8] c"C\00\00\00\F2\02\00\00\\\03\00\00/\02\00\00p\01\00\00f\03\00\00$\02\00\00\CC\03\00\00\8D\00\00\00\DB\02\00\00_\01\00\00\98\02\00\00 \00\00\00\04\00\00\00\E4\03\00\00\E5\02\00\00\CB\00\00\00$\01\00\00\ED\00\00\00\E0\01\00\00\97\00\00\00\AC\03\00\00\09\03\00\00\1C\02\00\00\8F\00\00\00K\02\00\00\EB\02\00\00A\00\00\00\98\00\00\00\05\02\00\00r\03\00\00p\03\00\00\C8\02\00\00S\02\00\00r\01\00\00\85\03\00\00\ED\00\00\005\00\00\00\15\03\00\00\11\03\00\00\90\03\00\00\8A\02\00\00\80\03\00\00o\01\00\00<\01\00\00\88\01\00\00>\00\00\00\D9\01\00\00\A3\02\00\00\B3\02\00\00\19\01\00\00\C0\00\00\00\BD\01\00\00\CA\03\00\00\E1\00\00\00\A9\01\00\00t\02\00\00D\01\00\00B\01\00\00\CE\00\00\00\90\03\00\00c\03\00\00\CE\01\00\00\\\00\00\00" }>, align 4

define i32 @square(i64 noundef %x) unnamed_addr #0 {
  %_3 = and i64 %x, 63
  %0 = getelementptr inbounds [64 x i32], ptr @square.base, i64 0, i64 %_3
  %_0 = load i32, ptr %0, align 4, !noundef !11
  ret i32 %_0
}
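The byte string in @square.base is just the i32 table laid out in memory. A small Rust sketch, assuming a little-endian target such as x86-64 (the helper name `encode_le` is made up for illustration), shows how the first stored values map to the leading bytes `C\00\00\00\F2\02\00\00`:

```rust
/// Serialize i32 values as consecutive little-endian bytes, the layout
/// a constant [N x i32] has on a little-endian target.
pub fn encode_le(values: &[i32]) -> Vec<u8> {
    values.iter().flat_map(|v| v.to_le_bytes()).collect()
}
```

For example, `encode_le(&[67, 754])` yields the bytes 0x43 ('C'), 0x00, 0x00, 0x00, 0xF2, 0x02, 0x00, 0x00.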

I know the IR that rustc generates is poor. I am trying to improve it, but some people wonder whether LLVM could do its marvelous job on the current IR as well.
