@nnethercote
Collaborator

With CUDA 11.x support dropped (#312) and CUDA 13.0 support added (#299), the supported compilation targets need updating, with some additions and removals. There are three different types that encode compilation targets.

  • NvvmArch: this is the main one. The first commit updates it.
  • ComputeCapability: this is an auxiliary one used only for the CUDA_ARCH environment variable, which exists for conditional compilation. The second commit removes it because cfg provides a better way of doing conditional compilation.
  • JitTarget: this is an auxiliary one used only for the obscure ModuleJitOption. The third commit removes it because it's identical to a generated type in cust_raw.

Details in the individual commit messages.

Remove and add some `NvvmArch` variants.
- Remove `compute_35` and `compute_37`, which are no longer
  needed/supported now that CUDA 11.x support is gone (Rust-GPU#312).
- Add `compute_73`, which is supported in CUDA 12.0-12.8 but was never
  added.
- Add `compute_88` and `compute_110{,f,a}`, which are new in CUDA 13.0.

CUDA C++ has the `__CUDA_ARCH__` macro for conditional compilation.
rust-cuda has a `CUDA_ARCH` environment variable that is similar, and
the `from_cuda_arch_env` method parses the environment variable's value
to produce a value of type `ComputeCapability`, which can be queried for
conditional compilation.

But `ComputeCapability` has a big problem. It's missing all the
capabilities after 80, including the 'a' and 'f' suffix ones. We could
just add them, but it implements `PartialOrd`/`Ord` and uses ordering to
determine feature availability. This was valid before the 'a' and 'f'
suffixes were added but is no longer, because some pairs of values are
incomparable. E.g. `100a` and `101a` -- each one has some features the
other doesn't, so neither is clearly larger than the other, and they're
also not equal.
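
To make the incomparability concrete, here is a minimal sketch of a partial order over targets. The `Target` and `Suffix` types are hypothetical and are not the removed `ComputeCapability` type, and the rule used (any comparison involving a suffixed target is incomparable unless both sides are identical) is a deliberate simplification of the real compatibility rules. The point is only that `partial_cmp` has to return `None` for some pairs, which a total `Ord` implementation cannot express.

```rust
use std::cmp::Ordering;

/// Hypothetical suffix of a compilation target (illustration only).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Suffix {
    None,
    F,
    A,
}

/// Hypothetical target: a base number such as 100, plus an optional suffix.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Target {
    pub base: u32,
    pub suffix: Suffix,
}

impl PartialOrd for Target {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        match (self.suffix, other.suffix) {
            // Unsuffixed targets are ordered by their base number.
            (Suffix::None, Suffix::None) => Some(self.base.cmp(&other.base)),
            // Identical suffixed targets are equal.
            _ if self == other => Some(Ordering::Equal),
            // Simplifying assumption: every other pair is incomparable.
            _ => None,
        }
    }
}

/// Convenience constructor for the examples in the text.
pub fn target(base: u32, suffix: Suffix) -> Target {
    Target { base, suffix }
}
```

Under this model, `target(100, Suffix::A)` and `target(101, Suffix::A)` are neither less than, greater than, nor equal to each other, so a feature check written as `arch >= some_threshold` silently evaluates to `false` in both directions.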

So, what to do? Well, `CUDA_ARCH` was added in 2022. More recently,
another mechanism for conditional compilation was added:
`target_feature`, in Rust-GPU#239. This does work with the 'a' and 'f' suffix
targets, and it's more Rust-y.
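
For readers unfamiliar with the mechanism, the following is a rough sketch of `target_feature`-based conditional compilation. The feature name `compute_70` and the function `atomic_strategy` are assumptions made for illustration; this is not the code in `cuda_std/src/atomic/mid.rs`, and the exact feature names exposed by rust-cuda should be checked against its documentation.

```rust
/// Selected when the (assumed) `compute_70` target feature is enabled.
#[cfg(target_feature = "compute_70")]
#[allow(unexpected_cfgs)] // the feature name is unknown to upstream rustc
pub fn atomic_strategy() -> &'static str {
    "native"
}

/// Selected on every other target, including an ordinary host build.
#[cfg(not(target_feature = "compute_70"))]
#[allow(unexpected_cfgs)] // the feature name is unknown to upstream rustc
pub fn atomic_strategy() -> &'static str {
    "fallback"
}
```

Only one of the two definitions is compiled for a given target. Because each feature is tested by name rather than by numeric ordering, this style has no need for a total order over targets.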

So this commit just removes `CUDA_ARCH` and `ComputeCapability`
(removing two more places where the default compilation target is
specified) and changes the only uses (in `cuda_std/src/atomic/mid.rs`)
to use `target_feature` instead. We don't have any tests exercising
conditional compilation, alas, but I did some manual checking locally to
verify that it works the same.

`JitTarget` includes some now-unsupported targets and is also missing some new
targets. The obvious thing to do is update it, but it's simpler and
better to recognize that it's identical to the generated
`driver_sys::CUjit_target` type and instead use that generated type
directly, avoiding the need for manual updating in the future. This
matters all the more because targets with 'a' and 'f' suffixes have a
non-trivial encoding (which involves adding 2^16 and 2^17, respectively,
to the base number).

This seems fine because this `ModuleJitOption` type is obscure and has
no existing uses in the codebase.
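
As a worked illustration of the suffix encoding mentioned above, here is a small sketch. The names `TargetSuffix`, `encode_target` and `decode_target` are hypothetical and are not part of `cust` or `cust_raw`. It assumes the base number is the numeric part of the target name (e.g. 100) and that base numbers stay below 2^16.

```rust
/// Hypothetical suffix of a JIT target (illustration only).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum TargetSuffix {
    None,
    A,
    F,
}

/// Offset added for 'a' suffix targets: 2^16.
pub const A_OFFSET: u32 = 1 << 16;
/// Offset added for 'f' suffix targets: 2^17.
pub const F_OFFSET: u32 = 1 << 17;

/// Encodes a base number and suffix into a single numeric value.
pub fn encode_target(base: u32, suffix: TargetSuffix) -> u32 {
    match suffix {
        TargetSuffix::None => base,
        TargetSuffix::A => base + A_OFFSET,
        TargetSuffix::F => base + F_OFFSET,
    }
}

/// Splits an encoded value back into (base, suffix).
/// Assumes the base is below 2^16 and at most one offset was added.
pub fn decode_target(value: u32) -> (u32, TargetSuffix) {
    if value & F_OFFSET != 0 {
        (value - F_OFFSET, TargetSuffix::F)
    } else if value & A_OFFSET != 0 {
        (value - A_OFFSET, TargetSuffix::A)
    } else {
        (value, TargetSuffix::None)
    }
}
```

For a base of 100, the 'a' variant encodes to 100 + 65536 = 65636 and the 'f' variant to 100 + 131072 = 131172. Keeping such arithmetic in sync by hand is the kind of maintenance that using the generated type avoids.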
@nnethercote force-pushed the update-compilation-targets branch from bba43c3 to 9cd54eb on November 25, 2025 01:05