Open
Labels
SVE (ARM Scalable Vector Extensions), llvm-tools (All llvm tools that do not have corresponding tag)
Description
- According to llvm-opt-report.html, V marks a vectorized loop and is followed by the vector length:
V: The loop is vectorized. The following numbers indicate the vector length and the interleave factor.
- demo.c:
__attribute__((noinline))
void demo (double * a, int N){
for (int i = 0; i < N; i++)
a[i] = 2.0 * a[i];
}
- Steps to reproduce: after running the following commands, the report in demo.opt.lst shows a vector length of 1.
step1: clang -Ofast -march=armv8.5-a+sve -mllvm -force-vector-interleave=1 -S demo.c -fsave-optimization-record
step2: llvm-opt-report demo.opt.yaml -o demo.opt.lst
[root@localhost 2.1]# cat demo.opt.lst
< demo.c
1 | __attribute__((noinline))
2 | void demo (double * a, int N){
3 V1,1 | for (int i = 0; i < N; i++)
4 | a[i] = 2.0 * a[i];
5 | }
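To see what the optimization record itself contains, independent of llvm-opt-report, one could scan demo.opt.yaml for the vectorization factor that was recorded. The following is a minimal Python sketch, not part of any LLVM tool. It assumes the remark arguments appear as lines of the form "- VectorizationFactor: <value>"; that key name and layout are assumptions about the YAML record, and the helper name find_vectorization_factors is hypothetical.

```python
import re

# Assumed layout of a remark argument line in demo.opt.yaml, for example:
#   - VectorizationFactor: 'vscale x 2'
# The key name is an assumption about the record format.
_VF_LINE = re.compile(r"^\s*-\s*VectorizationFactor:\s*(.+?)\s*$", re.MULTILINE)


def find_vectorization_factors(yaml_text):
    """Return every recorded VectorizationFactor value as a plain string."""
    # Strip optional YAML quoting so "'2'" and "2" compare equal.
    return [m.group(1).strip("'\"") for m in _VF_LINE.finditer(yaml_text)]
```

Calling find_vectorization_factors(open("demo.opt.yaml").read()) would list the raw values, which makes it easy to tell whether the record and the report disagree.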
- According to the final assembly output in demo.s, the vector length is 2, which differs from the report generated by llvm-opt-report above.
.LBB0_7: // =>This Inner Loop Header: Depth=1
.loc 1 4 17 // demo.c:4:17
ld1d { z0.d }, p0/z, [x0, x11, lsl 3]
.loc 1 4 15 is_stmt 0 // demo.c:4:15
fadd z0.d, z0.d, z0.d
.loc 1 4 9 // demo.c:4:9
st1d { z0.d }, p0, [x0, x11, lsl 3]
.loc 1 3 26 is_stmt 1 // demo.c:3:26
add x11, x11, x10
cmp x9, x11
b.ne .LBB0_7
- BTW: opt-viewer.py reports the vector length correctly:
step1: clang -Ofast -march=armv8.5-a+sve -mllvm -force-vector-interleave=1 -S demo.c -fsave-optimization-record
step2: python3 /path-to-clang/share/opt-viewer/opt-viewer.py --output-dir
Activity
fhahn commented on Jul 18, 2024
@vfdff do you know if this works as expected for fixed-width vector factors (i.e. when dropping +sve)? The vector factor for the loop with +sve should be vscale x 2, so it is possible that the tool doesn't parse scalable VFs properly.
vfdff commented on Jul 18, 2024
Yes, it works fine for fixed-width vector factors, so it looks like your guess is correct.
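The suspected failure mode can be illustrated with a small Python sketch. It is only a hypothesis about the behaviour, not the tool's actual code: naive_vf mimics an integer-only parse that falls back to a default of 1, while parse_vf also accepts the scalable form vscale x N. All names here (VectorFactor, parse_vf, naive_vf) are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class VectorFactor(NamedTuple):
    min_lanes: int   # known minimum number of lanes
    scalable: bool   # True when the real width is vscale * min_lanes


# Accepts "2" (fixed-width) and "vscale x 2" (scalable).
_VF = re.compile(r"^(vscale x )?(\d+)$")


def parse_vf(text: str) -> Optional[VectorFactor]:
    """Parse a fixed or scalable vectorization factor; None if malformed."""
    m = _VF.match(text.strip())
    if m is None:
        return None
    return VectorFactor(int(m.group(2)), m.group(1) is not None)


def naive_vf(text: str, default: int = 1) -> int:
    """Integer-only parse: a scalable VF silently falls back to the default."""
    try:
        return int(text)
    except ValueError:
        return default
```

With these helpers, naive_vf("vscale x 2") yields 1, matching the V1,1 seen in the report, whereas parse_vf("vscale x 2") keeps both the minimum lane count 2 and the scalable flag.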