Description
The small reproducers below are inspired by the resid routine from CPU2000/172.mgrid. LLVM is not able to fully optimize redundant loads in loops that access adjacent elements, e.g. a(i-1)+a(i)+a(i+1).
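To make the pattern concrete, here is a minimal C sketch of it (not the reproducer behind the links; the function names and the use of `restrict` to mimic non-aliasing arguments are my assumptions). The first function re-reads two elements that the previous iteration already loaded; the second shows the hand-written form that the optimization would ideally produce, with one new load per iteration.

```c
#include <stddef.h>

/* Naive form: three loads per iteration. a[i - 1] and a[i] were already
   loaded as a[i] and a[i + 1] by the previous iteration. */
void sum3_naive(const double *restrict a, double *restrict r, size_t n) {
    for (size_t i = 1; i + 1 < n; ++i)
        r[i] = a[i - 1] + a[i] + a[i + 1];
}

/* Hand-written equivalent: values are carried across iterations in
   scalars, so only a[i + 1] is loaded inside the loop. */
void sum3_carried(const double *restrict a, double *restrict r, size_t n) {
    if (n < 3)
        return; /* the naive loop body never runs in this case */
    double prev = a[0];
    double cur = a[1];
    for (size_t i = 1; i + 1 < n; ++i) {
        double next = a[i + 1];
        r[i] = prev + cur + next; /* same evaluation order as the naive form */
        prev = cur;
        cur = next;
    }
}
```

Both functions write the same values to r[1] .. r[n-2]; the difference is only in how many loads the loop body performs.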
Here is the simplest example, for a 1D array: https://godbolt.org/z/cacGdbhnr
Here, GVN is able to optimize only one load, but fails to optimize the second one.
A more complex example with a 2D array: https://godbolt.org/z/EboxdG5ae
Here, GVN is not able to optimize any of the loads.
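For illustration only, a 2D loop of this kind might look like the C sketch below. This is a hypothetical shape, not the code behind the link: the column-major indexing, the runtime leading dimension `ld`, and the function names are all assumptions. The point is that each address now contains a sum of non-constant terms (`j * ld + i`), and the second function shows the load reuse one would hope for.

```c
#include <stddef.h>

/* Hypothetical 2D analogue: m rows, n columns, column-major storage with
   a runtime leading dimension ld (ld >= m). Three loads per inner iteration. */
void sum3_2d_naive(const double *restrict a, double *restrict r,
                   size_t ld, size_t m, size_t n) {
    for (size_t j = 0; j < n; ++j)
        for (size_t i = 1; i + 1 < m; ++i)
            r[j * ld + i] = a[j * ld + i - 1] + a[j * ld + i] + a[j * ld + i + 1];
}

/* Hand-written form with values carried across inner iterations:
   one new load per inner iteration. */
void sum3_2d_carried(const double *restrict a, double *restrict r,
                     size_t ld, size_t m, size_t n) {
    if (m < 3)
        return; /* the naive inner loop never runs in this case */
    for (size_t j = 0; j < n; ++j) {
        const double *col = a + j * ld;
        double *out = r + j * ld;
        double prev = col[0];
        double cur = col[1];
        for (size_t i = 1; i + 1 < m; ++i) {
            double next = col[i + 1];
            out[i] = prev + cur + next;
            prev = cur;
            cur = next;
        }
    }
}
```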
The actual 172.mgrid code uses 3D arrays.
I tried to allow Add operations with non-constant operands in canPHITrans, PHITransAddr::translateSubExpr and PHITransAddr::insertTranslatedSubExpr, but GVN still bails out early during the memory dependency computation in the 2D case.
I would appreciate any suggestions on how to tackle this problem, and any hints!