Description
When I build an example that uses the joint_matrix extension (for example sycl/test/matrix/matrix-int8-test.cpp) with "clang++ -fsycl -Xsycl-target-backend --cuda-gpu-arch=sm_75 -O2 -DSYCL_EXT_ONEAPI_MATRIX_VERSION=4 matrix-int8-test.cpp -o matri.out", it compiles, but running the resulting binary fails with:
"terminate called after throwing an instance of 'sycl::_V1::runtime_error'
what(): Native API failed. Native API returns: -42 (PI_ERROR_INVALID_BINARY) -42 (PI_ERROR_INVALID_BINARY)
Aborted"
I have tried many combinations of compiler commands and flags to build this for CUDA, but most of them produce this error when I run "./matri.out".
After building SYCL with the CUDA backend, I can compile "simple-sycl-app.cpp" without issues using "clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda simple-sycl-app.cpp -o simple-sycl-app-cuda.exe" and run it with "ONEAPI_DEVICE_SELECTOR=cuda:* ./simple-sycl-app-cuda.exe". However, I cannot get any example code that uses the joint_matrix extension to run.
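One difference between the two commands above is that the failing one does not pass -fsycl-targets, while the working simple-sycl-app.cpp one does. The following is a sketch of the same invocation with the CUDA target added. It is an assumption about the intended command, not a confirmed fix, and every other flag is copied unchanged from the failing command:

```shell
# Assumption: same source file and flags as the failing command,
# plus the CUDA target triple used for simple-sycl-app.cpp.
clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda \
  -Xsycl-target-backend --cuda-gpu-arch=sm_75 -O2 \
  -DSYCL_EXT_ONEAPI_MATRIX_VERSION=4 \
  matrix-int8-test.cpp -o matri.out

# Run on the CUDA device, the same way as simple-sycl-app-cuda.exe.
ONEAPI_DEVICE_SELECTOR=cuda:* ./matri.out
```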
Environment
- OS: Linux
- Target device and vendor: NVIDIA T4
- DPC++ version: clang version 18.0.0 (https://github.com/intel/llvm 2f90192)
- CUDA: Driver Version 525.105.17, CUDA Version 12.0
Thanks in advance for any help.