cuda_fp16.h is not able to include from nvrtcCompileProgram #845
Type of Bug: Runtime Error
Component: cuda.bindings
Describe the bug: I am trying to compile an fp16 kernel with nvrtcCompileProgram, but compilation fails when I go through the cuda-python wrapper.
I also compiled the same program from C++, and that works, so I believe the bug lies in the cuda-python wrapper.
How to Reproduce: My code repo: https://github.com/tigert1998/mytorch/blob/main/cuda_utils.py#L120
Expected behavior: "cuda_fp16.h" should be included correctly with no errors.
Operating System: Windows 11
nvidia-smi output: Sun Aug 17 13:32:58 2025 +-----------------------------------------------------------------------------------------+ |
Replies: 1 comment 4 replies
Your C++ code is not equivalent to your Python code. Could you make sure:
It's my bad. Sorry for wasting your time. I had put an
import torch.nn as nn
somewhere in my code, and that was causing nvrtc to behave strangely.
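
For anyone who lands here with the same symptom, a standalone script with no other imports (in particular no torch) can help show whether something else in the process is interfering. What follows is only a rough sketch, not the code from the linked repo. It assumes a recent cuda-python where NVRTC is exposed as cuda.bindings.nvrtc, and it assumes the toolkit location is available in the CUDA_PATH or CUDA_HOME environment variable. The kernel and the helper names nvrtc_include_options and compile_to_ptx are made up for illustration.

```python
import os

# Hypothetical fp16 kernel used only for this illustration.
KERNEL_SOURCE = b"""
#include <cuda_fp16.h>
extern "C" __global__ void scale(__half *x, float s, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] = __float2half(__half2float(x[i]) * s);
}
"""


def nvrtc_include_options(cuda_home=None):
    """Build an NVRTC include-path option pointing at the toolkit headers.

    Assumption: the toolkit root is passed in or found via CUDA_PATH / CUDA_HOME.
    Returns an empty list when no toolkit root is known.
    """
    cuda_home = cuda_home or os.environ.get("CUDA_PATH") or os.environ.get("CUDA_HOME")
    if not cuda_home:
        return []
    return [b"--include-path=" + os.path.join(cuda_home, "include").encode()]


def compile_to_ptx(source, options):
    """Compile CUDA source to PTX, raising with the NVRTC log on failure."""
    # Imported lazily so the option helper above works without cuda-python.
    # Assumption: newer cuda-python layout; older releases use `from cuda import nvrtc`.
    from cuda.bindings import nvrtc

    err, prog = nvrtc.nvrtcCreateProgram(source, b"kernel.cu", 0, [], [])
    if err != nvrtc.nvrtcResult.NVRTC_SUCCESS:
        raise RuntimeError(f"nvrtcCreateProgram failed: {err}")

    (err,) = nvrtc.nvrtcCompileProgram(prog, len(options), options)
    if err != nvrtc.nvrtcResult.NVRTC_SUCCESS:
        # The compile log is what shows a message such as a missing cuda_fp16.h.
        _, log_size = nvrtc.nvrtcGetProgramLogSize(prog)
        log = b" " * log_size
        nvrtc.nvrtcGetProgramLog(prog, log)
        raise RuntimeError(log.decode(errors="replace"))

    _, ptx_size = nvrtc.nvrtcGetPTXSize(prog)
    ptx = b" " * ptx_size
    nvrtc.nvrtcGetPTX(prog, ptx)
    return ptx
```

Calling compile_to_ptx(KERNEL_SOURCE, nvrtc_include_options()) in a fresh interpreter, and then again after adding the suspect import at the top of the script, should show whether that import changes the result. Printing the NVRTC log on failure usually says which header could not be found.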