[SYCL][Fusion] Kernel Fusion support for CUDA backend (#8747)
Extend kernel fusion for the CUDA backend.
In contrast to the existing SPIR-V based backends, the default binary
format for the CUDA backend (PTX or CUBIN) is not suitable as input for
the kernel fusion JIT compiler.
This PR therefore extends the driver to **additionally** embed LLVM IR
in the fat binary if the user specifies the `-fsycl-embed-ir` flag
during compilation, by taking the output of the `sycl-post-link` step
for the CUDA backend.
The JIT compiler has been extended to handle LLVM IR as an input format
and PTX assembly as an output format (including translation via the
NVPTX backend). Target-specific parts of the fusion process have been
refactored into `TargetFusionInformation`.
The logic in the SYCL runtime that connects to the JIT compiler has been
extended to produce valid PI device binaries for the CUDA backend.
Heterogeneous ND ranges are not yet supported for the CUDA backend.
---------
Signed-off-by: Lukas Sommer <[email protected]>