Unable to compile Rust-CUDA because of missing cuDNN includes. #204

jmhal opened this issue Apr 24, 2025 · 5 comments

@jmhal

jmhal commented Apr 24, 2025

Hello everyone,

I am trying to compile Rust-CUDA on Ubuntu 24.04 with CUDA Toolkit 12.8 and driver 550.120. The Cargo version is cargo 1.86.0 (adf9b6ad1 2025-02-28). I have just cloned the repository from the main branch.

First, I had problems with the OptiX SDK: cargo build reported that it could not find the SDK. After digging a bit, I found this line in the optix-sys crate:

const OPTIX_ROOT_ENVS: &[&str] = &["OPTIX_ROOT", "OPTIX_ROOT_DIR"];

So I set OPTIX_ROOT to the directory where I placed the SDK, and it worked.
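In case it is useful to anyone else, the fix was roughly this (the unpack location below is just an example):

export OPTIX_ROOT="$HOME/NVIDIA-OptiX-SDK"   # wherever the OptiX SDK was unpacked
cargo build

But then I hit another problem: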

error: failed to run custom build command for `cudnn-sys v0.1.0 (/home/npbrust/Rust-CUDA/crates/cudnn-sys)`

Caused by:
  process didn't exit successfully: `/home/npbrust/Rust-CUDA/target/debug/build/cudnn-sys-640a99c44abeadd2/build-script-main` (exit status: 101)
  --- stderr

  thread 'main' panicked at crates/cudnn-sys/build/main.rs:7:42:
  Cannot create cuDNN SDK instance.: "Cannot find cuDNN include directory."
  note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

So I downloaded cuDNN (cudnn-linux-x86_64-9.8.0.87_cuda12-archive). After extracting the archive, I copied the contents of its include and lib directories into /usr/local/cuda/include and /usr/local/cuda/lib64, but cargo build gave me the same error. Then I tried setting the variables CUDNN_INCLUDE_DIR and CUDNN_LIBRARY to the include and lib directories, but that got me nowhere. The question is: how should I point cargo at the cuDNN includes? Looking at the code for the cudnn-sys crate gave me no direction.
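For reference, this is roughly what I tried (paths assume the archive was extracted in the home directory):

# first attempt: copy headers and libraries into the CUDA toolkit directories
sudo cp ~/cudnn-linux-x86_64-9.8.0.87_cuda12-archive/include/* /usr/local/cuda/include/
sudo cp ~/cudnn-linux-x86_64-9.8.0.87_cuda12-archive/lib/* /usr/local/cuda/lib64/

# second attempt: point the build at the extracted archive directly
export CUDNN_INCLUDE_DIR=~/cudnn-linux-x86_64-9.8.0.87_cuda12-archive/include
export CUDNN_LIBRARY=~/cudnn-linux-x86_64-9.8.0.87_cuda12-archive/lib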

Best Regards.

@jorge-ortega
Collaborator

Hi @jmhal,

You can ignore those build failures if you won't be using either OptiX or cuDNN. The same goes for trying to run the OptiX examples.
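For example, you can build just the crates you actually need instead of the full workspace (the crate name here is only an example):

# build only the cust crate and its dependencies, skipping optix-sys/cudnn-sys
cargo build -p cust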

cudnn-sys is new, and it seems we didn't define an environment variable to let you configure the location of the cuDNN SDK/headers. On Linux, we expect to find the cuDNN headers in either /usr/include or /usr/local/include. In our dev container, which is based on Nvidia's cuDNN development image, /usr/include contains symlinks to the headers:

root@088ab2700488:/workspaces/rust-cuda# ls -la /usr/include/cudnn*
lrwxrwxrwx 1 root root 26 Feb 28 05:53 /usr/include/cudnn.h -> /etc/alternatives/libcudnn
lrwxrwxrwx 1 root root 29 Feb 28 05:53 /usr/include/cudnn_adv.h -> /etc/alternatives/cudnn_adv_h
lrwxrwxrwx 1 root root 33 Feb 28 05:53 /usr/include/cudnn_backend.h -> /etc/alternatives/cudnn_backend_h
lrwxrwxrwx 1 root root 29 Feb 28 05:53 /usr/include/cudnn_cnn.h -> /etc/alternatives/cudnn_cnn_h
lrwxrwxrwx 1 root root 31 Feb 28 05:53 /usr/include/cudnn_graph.h -> /etc/alternatives/cudnn_graph_h
lrwxrwxrwx 1 root root 29 Feb 28 05:53 /usr/include/cudnn_ops.h -> /etc/alternatives/cudnn_ops_h
lrwxrwxrwx 1 root root 33 Feb 28 05:53 /usr/include/cudnn_version.h -> /etc/alternatives/cudnn_version_h

You can try doing the same in your environment until we add an env var to configure this. You can also try our containers: build one from the included Dockerfiles in the container directory, run a devcontainer with the .devcontainer.json (which you can customize), or use our prebuilt containers.
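For example, something along these lines, assuming the archive from your message was extracted under your home directory:

# link the extracted cuDNN headers into /usr/include so the cudnn-sys build script can find them
sudo ln -s ~/cudnn-linux-x86_64-9.8.0.87_cuda12-archive/include/cudnn*.h /usr/include/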

cc @adamcavendish

@jmhal
Author

jmhal commented Apr 24, 2025

Hi @jorge-ortega,

Thank you for your quick reply. Creating the links in /usr/include let cargo build get a little further. Now I have several other OptiX-related errors like this:

error[E0433]: failed to resolve: could not find `OptixAccelRelocationInfo` in `optix_sys`
   --> crates/optix/src/acceleration.rs:224:36
    |
224 |         let mut inner = optix_sys::OptixAccelRelocationInfo::default();
    |                                    ^^^^^^^^^^^^^^^^^^^^^^^^
    |                                    |
    |                                    could not find `OptixAccelRelocationInfo` in `optix_sys`
    |                                    help: a struct with a similar name exists: `OptixRelocationInfo`

Name mismatches. But since I don't believe we need OptiX for the moment, I will ignore them, as you advised. Instead, I tried to compile one of the examples:

npbrust@pargohpc01:~/Rust-CUDA/examples/cuda/vecadd$ cargo build
   Compiling cust_raw v0.11.3 (/home/npbrust/Rust-CUDA/crates/cust_raw)
   Compiling rustc_codegen_nvvm v0.3.0 (/home/npbrust/Rust-CUDA/crates/rustc_codegen_nvvm)
warning: rustc_codegen_nvvm@0.3.0: Downloading prebuilt LLVM
error: failed to run custom build command for `rustc_codegen_nvvm v0.3.0 (/home/npbrust/Rust-CUDA/crates/rustc_codegen_nvvm)`

Caused by:
  process didn't exit successfully: `/home/npbrust/Rust-CUDA/target/debug/build/rustc_codegen_nvvm-234b64fbe03128a1/build-script-build` (exit status: 101)
  --- stdout
  cargo:rerun-if-env-changed=LLVM_CONFIG
  cargo:rerun-if-env-changed=USE_PREBUILT_LLVM
  cargo:warning=Downloading prebuilt LLVM
  cargo:rerun-if-env-changed=PREBUILT_LLVM_URL

  --- stderr

  thread 'main' panicked at crates/rustc_codegen_nvvm/build.rs:58:14:
  Unsupported target with no matching prebuilt LLVM: `x86_64-unknown-linux-gnu`, install LLVM and set LLVM_CONFIG
  note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
warning: build failed, waiting for other jobs to finish...

It complains about LLVM, but I have LLVM installed, and I even set the variable with export LLVM_CONFIG="llvm-config-7", with no luck.

I'll try the containers now.

Best Regards.

@jorge-ortega
Collaborator

Odd. Do you have USE_PREBUILT_LLVM set? It should have tried to run $LLVM_CONFIG --version and printed additional warnings if the command failed or did not return version 7, but I don't see those warnings in your output.
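You can check what the build script would see with something like:

# confirm how the build script will resolve things
echo "USE_PREBUILT_LLVM=$USE_PREBUILT_LLVM"
echo "LLVM_CONFIG=$LLVM_CONFIG"
"$LLVM_CONFIG" --version   # should print 7.x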

@jorge-ortega
Collaborator

The OptiX name mismatch might have to do with the version of OptiX you have installed. I believe we currently support 7.3(?), and it will likely fail to compile with newer versions.
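One way to double-check which version you have is to look for the version define in the SDK headers (I believe the SDK ships one; this assumes OPTIX_ROOT still points at your SDK):

grep -rn "#define OPTIX_VERSION" "$OPTIX_ROOT/include/"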

Also note that we require the very specific nightly rustc pinned in our rust-toolchain.toml. If you've cloned our repo, you should be good to go, but if you're using our crates from a different project root, you'll need to use the same nightly version as we do. The easiest way to do that is to copy the rust-toolchain.toml to your project root.
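For example (the Rust-CUDA checkout path is just a placeholder):

# copy the pinned toolchain file next to your own Cargo.toml
cp /path/to/Rust-CUDA/rust-toolchain.toml .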

@adamcavendish
Contributor

> The OptiX name mismatch might have to do with the version of OptiX you have installed. I believe we currently support 7.3(?), and it will likely fail to compile with newer versions.

Yeah, we don't support newer OptiX versions yet.
