cuda_fp16.h is not able to include from nvrtcCompileProgram #845

tigert1998 · 2025-08-17T05:33:18Z

tigert1998
Aug 17, 2025

Is this a duplicate?

I confirmed there appear to be no duplicate issues for this bug and that I agree to the Code of Conduct

Type of Bug

Runtime Error

Component

cuda.bindings

Describe the bug

I am trying to create an fp16 kernel program with nvrtcCompileProgram, but it fails to compile with the cuda-python wrapper.
The error log is as follows:

RuntimeError: Cuda compile error: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.h(5096): error: identifier "NV_IS_DEVICE" is undefined

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.h(5097): error: expected a ")"

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.h(5096): error: identifier "NV_IF_ELSE_TARGET" is undefined    

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.h(5098): error: expected an expression

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.h(5101): error: expected an expression

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.h(5102): error: expected a ";"

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.h(5122): error: identifier "NV_IS_DEVICE" is undefined

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.h(5123): error: expected a ")"

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.h(5122): error: identifier "NV_IF_ELSE_TARGET" is undefined    

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.h(5124): error: expected an expression

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.h(5127): error: expected an expression

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.h(5128): error: expected a ";"

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.h(5140): error: identifier "NV_IS_DEVICE" is undefined

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.h(5141): error: expected a ")"

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.h(5140): error: identifier "NV_IF_ELSE_TARGET" is undefined    

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.h(5142): error: expected an expression

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.h(5144): error: identifier "tr" is undefined

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.h(5148): error: expected an expression

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.h(5149): error: expected a ";"

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(264): error: identifier "NV_IS_DEVICE" is undefined        

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(265): error: expected a ")"

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(264): error: identifier "NV_IF_ELSE_TARGET" is undefined   

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(266): error: expected an expression

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(269): error: expected an expression

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(271): warning #940-D: missing return statement at end of non-void function "__half2::operator=(const __half2 &&)"

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(274): error: identifier "NV_IS_DEVICE" is undefined        

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(275): error: expected a ")"

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(274): error: identifier "NV_IF_ELSE_TARGET" is undefined   

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(276): error: expected an expression

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(279): error: expected an expression

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(281): warning #940-D: missing return statement at end of non-void function "__half2::operator=(const __half2 &)"

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(283): error: identifier "NV_IS_DEVICE" is undefined        

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(284): error: expected a ")"

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(283): error: identifier "NV_IF_ELSE_TARGET" is undefined   

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(285): error: expected an expression

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(287): error: identifier "tr" is undefined

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(291): error: expected an expression

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(293): warning #940-D: missing return statement at end of non-void function "__half2::operator=(const __half2_raw &)"

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(296): error: identifier "NV_IS_DEVICE" is undefined        

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(297): error: expected a ")"

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(296): error: identifier "NV_IF_ELSE_TARGET" is undefined   

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(300): error: expected an expression

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(303): error: expected an expression

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(305): warning #940-D: missing return statement at end of non-void function "__half2::operator __half2_raw"

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(408): error: identifier "NV_IS_DEVICE" is undefined        

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(408): error: type name is not allowed

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(408): error: expected a ")"

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(408): error: identifier "NV_IF_ELSE_TARGET" is undefined   

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(408): error: identifier "val" is undefined

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(408): error: expression must be a modifiable lvalue        

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(408): error: an asm operand must have scalar type

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(408): error: expected an expression

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(408): error: identifier "result" is undefined

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(408): error: identifier "result" is undefined

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(408): error: identifier "result" is undefined

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(408): error: expected an expression

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(545): warning #12-D: parsing restarts here after previous syntax error

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(545): error: expected a ";"

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(549): error: identifier "NV_IS_DEVICE" is undefined        

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(550): error: expected an expression

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(550): error: expected a ")"

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(549): error: identifier "NV_IF_ELSE_TARGET" is undefined   

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(551): error: expected an expression

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(555): error: identifier "r" is undefined

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(555): error: identifier "__internal_float2half" is undefined

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(560): error: expected an expression

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(562): warning #940-D: missing return statement at end of non-void function "__float2half"

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(566): error: identifier "NV_IS_DEVICE" is undefined        

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(567): error: expected an expression

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(567): error: expected a ")"

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(566): error: identifier "NV_IF_ELSE_TARGET" is undefined   

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(568): error: expected an expression

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(572): error: identifier "r" is undefined

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(572): error: identifier "__internal_float2half" is undefined

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(577): error: expected an expression

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(579): warning #940-D: missing return statement at end of non-void function "__float2half_rn"

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(583): error: identifier "NV_IS_DEVICE" is undefined        

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(584): error: expected an expression

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(584): error: expected a ")"

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(583): error: identifier "NV_IF_ELSE_TARGET" is undefined   

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(585): error: expected an expression

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(589): error: identifier "r" is undefined

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(589): error: identifier "__internal_float2half" is undefined

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(591): error: expected an expression

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(593): warning #940-D: missing return statement at end of non-void function "__float2half_rz"

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(597): error: identifier "NV_IS_DEVICE" is undefined        

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(598): error: expected an expression

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(598): error: expected a ")"

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(597): error: identifier "NV_IF_ELSE_TARGET" is undefined   

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(599): error: expected an expression

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(603): error: identifier "r" is undefined

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(603): error: identifier "__internal_float2half" is undefined

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(608): error: expected an expression

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(610): warning #940-D: missing return statement at end of non-void function "__float2half_rd"

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(614): error: identifier "NV_IS_DEVICE" is undefined        

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(615): error: expected an expression

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(615): error: expected a ")"

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(614): error: identifier "NV_IF_ELSE_TARGET" is undefined   

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(616): error: expected an expression

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(620): error: identifier "r" is undefined

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(620): error: identifier "__internal_float2half" is undefined

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(625): error: expected an expression

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(627): warning #940-D: missing return statement at end of non-void function "__float2half_ru"

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(631): error: identifier "NV_IS_DEVICE" is undefined        

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(632): error: expected an expression

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(631): error: identifier "NV_IF_ELSE_TARGET" is undefined

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(635): error: expected an expression

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(637): error: expected an expression

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(639): warning #940-D: missing return statement at end of non-void function "__float2half2_rn"

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(630): warning #177-D: variable "val" was declared but never referenced

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.hpp(644): error: identifier "NV_PROVIDES_SM_80" is undefined

Error limit reached.
100 errors detected in the compilation of "cuda_kernels\conv2d.cu".
Compilation terminated.

I also tried compiling the same program through the C++ API, and it works, so I believe the bug lies in the cuda-python wrapper.

#include <cuda.h>
#include <nvrtc.h>

#include <iostream>
#include <vector>

#define NVRTC_CHECK(x)                                            \
  do {                                                            \
    nvrtcResult result = x;                                       \
    if (result != NVRTC_SUCCESS) {                                \
      std::cerr << "NVRTC error: " << nvrtcGetErrorString(result) \
                << std::endl;                                     \
    }                                                             \
  } while (0)

#define CUDA_CHECK(x)                                  \
  do {                                                 \
    CUresult result = x;                               \
    if (result != CUDA_SUCCESS) {                      \
      const char* msg;                                 \
      cuGetErrorString(result, &msg);                  \
      std::cerr << "CUDA error: " << msg << std::endl; \
    }                                                  \
  } while (0)

int main() {
  const char* cuda_source = R"(
        #include <cuda_fp16.h>
        
        extern "C" __global__ void half_add_kernel(half* a, half* b, half* c, int n) {
            int idx = blockIdx.x * blockDim.x + threadIdx.x;
            if (idx < n) {
                // Half-precision addition
                c[idx] = a[idx] + b[idx];
            }
        }
    )";

  nvrtcProgram program;
  NVRTC_CHECK(nvrtcCreateProgram(&program, cuda_source, "half_kernel.cu", 0,
                                 nullptr, nullptr));

  const char* opts[] = {
      "--gpu-architecture=compute_89", "--fmad=false",
      "-IC:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v13.0/include"};

  std::cout << "Compiling CUDA code with half precision support..."
            << std::endl;
  NVRTC_CHECK(nvrtcCompileProgram(program, 3, opts));
  size_t log_size;
  NVRTC_CHECK(nvrtcGetProgramLogSize(program, &log_size));
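  // Size the buffer from the reported log size so a long log cannot overflow it.
  std::vector<char> log(log_size);
  NVRTC_CHECK(nvrtcGetProgramLog(program, log.data()));
  std::cout << std::string(log.begin(), log.end()) << std::endl;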

  size_t ptx_size;
  NVRTC_CHECK(nvrtcGetPTXSize(program, &ptx_size));
  std::vector<char> ptx(ptx_size);
  NVRTC_CHECK(nvrtcGetPTX(program, ptx.data()));

  NVRTC_CHECK(nvrtcDestroyProgram(&program));

  CUdevice device;
  CUcontext context;
  CUmodule module;
  CUfunction kernel;

  CUDA_CHECK(cuInit(0));
  CUDA_CHECK(cuDeviceGet(&device, 0));
  CUDA_CHECK(cuCtxCreate(&context, 0, 0, device));

  CUDA_CHECK(cuModuleLoadDataEx(&module, ptx.data(), 0, nullptr, nullptr));
  CUDA_CHECK(cuModuleGetFunction(&kernel, module, "half_add_kernel"));

  std::cout << "Compilation successful. PTX code generated." << std::endl;

  CUDA_CHECK(cuModuleUnload(module));
  CUDA_CHECK(cuCtxDestroy(context));

  return 0;
}

How to Reproduce

My code repo: https://github.com/tigert1998/mytorch/blob/main/cuda_utils.py#L120
The kernel is located at: https://github.com/tigert1998/mytorch/blob/main/cuda_kernels/conv2d.cu

Expected behavior

"cuda_fp16.h" should be correctly included with no errors.

Operating System

Windows 11

nvidia-smi output

+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|

leofang · 2025-08-17T09:05:12Z

leofang
Aug 17, 2025
Maintainer

Your C++ code is not equivalent to your Python code. Could you make sure that:

- You pass the include path consistently (via the env var CUDA_PATH, as done in Python)
- You don't set -std=c++11 (matching the C++ code, which sets no -std flag); the NVRTC default, which is C++17, will then kick in

4 replies

tigert1998 Aug 17, 2025
Author

I only have one CUDA toolkit installed on my Windows 11 machine. Check out the first line of the Python error log: RuntimeError: Cuda compile error: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cuda_fp16.h(5096): error: identifier "NV_IS_DEVICE" is undefined. The toolkit path is exactly the same as the CUDA path I used in the C++ code.
It seems that NVRTC is not using C++17 by default. Without -std=c++11, the kernel compiler cannot even recognize nullptr. I then changed the Python options to -std=c++17, but the bug remains the same.

leofang Aug 17, 2025
Maintainer

I meant you should either pass -std=c++17 to NVRTC or not pass anything (as done in your C++ reproducer). NVRTC is inherently a C++ compiler; nullptr should be recognized in any std mode.

Another thing to check: you should also pass -I$CUDA_PATH/include/cccl to NVRTC. Starting with CUDA 13.0, all CCCL headers, including nv/target, are moved under the cccl folder:

cuda-python/cuda_core/tests/helpers.py

Lines 6 to 16 in 8f1dd40

    
CUDA_PATH = os.environ.get("CUDA_PATH")
CUDA_INCLUDE_PATH = None
CCCL_INCLUDE_PATHS = None
if CUDA_PATH is not None:
    path = os.path.join(CUDA_PATH, "include")
    if os.path.isdir(path):
        CUDA_INCLUDE_PATH = path
        CCCL_INCLUDE_PATHS = (path,)
        path = os.path.join(path, "cccl")
        if os.path.isdir(path):
            CCCL_INCLUDE_PATHS = (path,) + CCCL_INCLUDE_PATHS

https://developer.nvidia.com/blog/whats-new-and-important-in-cuda-toolkit-13-0/#cccl_headers_have_moved_in_cuda_130
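
As a rough illustration of the include-path advice above, the sketch below builds the -I options for NVRTC from a toolkit root. The helper name nvrtc_include_options is hypothetical (it is not part of cuda-python), and the sketch assumes the options are passed as a list of bytes, which is how cuda.bindings' nvrtcCompileProgram is commonly called. The cccl folder is added only when it exists, so the same code should also work with pre-13.0 toolkits.

```python
import os


def nvrtc_include_options(cuda_path):
    """Return NVRTC -I options (as bytes) for a CUDA toolkit root.

    Hypothetical helper: adds <root>/include/cccl first when it exists
    (the CUDA 13.0 layout), then <root>/include.
    """
    options = []
    include_dir = os.path.join(cuda_path, "include")
    if not os.path.isdir(include_dir):
        return options  # nothing to add if the toolkit root looks wrong
    cccl_dir = os.path.join(include_dir, "cccl")
    if os.path.isdir(cccl_dir):
        options.append(("-I" + cccl_dir).encode())
    options.append(("-I" + include_dir).encode())
    return options


if __name__ == "__main__":
    # CUDA_PATH is the env var mentioned in the thread; it may be unset.
    root = os.environ.get("CUDA_PATH")
    print(nvrtc_include_options(root) if root else "CUDA_PATH is not set")
```

The returned list can be appended to the other compile options (for example the architecture flag) before calling nvrtcCompileProgram.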

tigert1998 Aug 17, 2025
Author

It's my bad. Sorry for wasting your time. I had an import torch.nn as nn somewhere in my code, and that caused NVRTC to behave strangely.

Answer selected by tigert1998

leofang Aug 17, 2025
Maintainer

Check your pip list. It's possible that PyTorch loaded the NVRTC 12.x installed in your Python environment, instead of using the locally installed 13.0.
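
One way to run this check from Python is sketched below. It is a minimal sketch, not an official diagnostic: both function names are made up for illustration. The first lists installed distributions whose name contains "nvrtc" (pip wheels such as nvidia-cuda-nvrtc-cu12 would show up here), and the second asks cuda.bindings which NVRTC version it actually loaded, returning None if cuda-python is missing or the call fails.

```python
from importlib import metadata


def find_nvrtc_packages():
    """List (name, version) of installed distributions that mention NVRTC."""
    hits = set()
    for dist in metadata.distributions():
        meta = dist.metadata
        name = (meta.get("Name") if meta is not None else None) or ""
        if "nvrtc" in name.lower():
            hits.add((name, dist.version))
    return sorted(hits)


def loaded_nvrtc_version():
    """Return (major, minor) of the NVRTC seen by cuda.bindings, or None."""
    try:
        from cuda.bindings import nvrtc
        err, major, minor = nvrtc.nvrtcVersion()
    except Exception:
        # cuda-python not installed, or the NVRTC library could not be loaded
        return None
    if err != nvrtc.nvrtcResult.NVRTC_SUCCESS:
        return None
    return (major, minor)


if __name__ == "__main__":
    print("NVRTC wheels in this environment:", find_nvrtc_packages())
    print("NVRTC version loaded by cuda.bindings:", loaded_nvrtc_version())
```

Running it once on its own and once after import torch may show whether a different NVRTC ends up being used; a 12.x version against a 13.0 toolkit include path would be consistent with the errors reported in this thread.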
