In JuliaGPU/CUDA.jl#1895, I made the size tuple of `CuDeviceArray` 32 bits so that we can emit better code (lowering register pressure, making it possible to execute compute and indexing instructions in parallel, etc.). However, the NVPTX back-end defaults to 64-bit pointer indexing, so 64-bit GEPs get introduced both by the front-end and during optimization. I tried to change that by specifying a 32-bit pointer index size in the data layout, #444, but that breaks 64-bit indices, which can still get reintroduced by optimization (see e.g. #461).
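For context, the pointer index size is the optional fourth field of a `p` specification in LLVM's data layout string. A rough sketch of the difference (the exact layout strings below are illustrative, not necessarily what GPUCompiler emits):

```llvm
; default nvptx64-style data layout: 64-bit pointers, 64-bit index type
target datalayout = "e-i64:64-i128:128-v16:16-v32:32-n16:32:64"

; with an explicit pointer spec "p:64:64:64:32": pointers stay 64 bits wide,
; but address computations use a 32-bit index type (the approach from #444)
target datalayout = "e-p:64:64:64:32-i64:64-i128:128-v16:16-v32:32-n16:32:64"
```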
Either we try this again on LLVM 17 (where a bug that was introducing 64-bit GEP offsets has been fixed), or we instead create an optimization pass that demotes GEP indices to 32 bits where possible (e.g., when they are constants, or come from the size field of a device array); see the sketch below.
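A minimal before/after sketch in LLVM IR of what such a demotion could look like, assuming the index is known to fit in 32 bits (e.g. because it came from the 32-bit size tuple):

```llvm
; before: the 32-bit index gets widened, producing a 64-bit GEP offset
%idx64 = sext i32 %idx to i64
%ptr   = getelementptr inbounds float, ptr addrspace(1) %base, i64 %idx64

; after demotion: feed the 32-bit index to the GEP directly, keeping the
; offset computation narrow in the IR
%ptr2  = getelementptr inbounds float, ptr addrspace(1) %base, i32 %idx
```

Whether that then translates into 32-bit address arithmetic in the generated PTX also depends on the pointer index size in the data layout and on the back-end, so this only shows the shape of the transformation, not a guaranteed codegen improvement.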