CUDA extension error #33

Closed
wytcsuch opened this issue May 13, 2022 · 4 comments
Labels
help wanted Extra attention is needed

Comments

@wytcsuch

wytcsuch commented May 13, 2022

Thank you for the great work. However, I get an error when I build the CUDA extension.
torch = 1.11.0
python = 3.7
cuda = 10.1

Traceback (most recent call last):
  File "/home/yckj3822/anaconda3/envs/unsup3d/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1746, in _run_ninja_build
    env=env)
  File "/home/yckj3822/anaconda3/envs/unsup3d/lib/python3.7/subprocess.py", line 512, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/yckj3822/GAN/Neighborhood-Attention-Transformer-main/natten/nattencuda.py", line 20, in <module>
    'nattenav_cuda', [f'{this_dir}/src/nattenav_cuda.cpp', f'{this_dir}/src/nattenav_cuda_kernel.cu'], verbose=False)
  File "/home/yckj3822/anaconda3/envs/unsup3d/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1156, in load
    keep_intermediates=keep_intermediates)
  File "/home/yckj3822/anaconda3/envs/unsup3d/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1367, in _jit_compile
    is_standalone=is_standalone)
  File "/home/yckj3822/anaconda3/envs/unsup3d/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1472, in _write_ninja_file_and_build_library
    error_prefix=f"Error building extension '{name}'")
  File "/home/yckj3822/anaconda3/envs/unsup3d/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1756, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension 'nattenav_cuda': [1/3] /home/yckj3822/anaconda3/envs/unsup3d/bin/x86_64-conda_cos6-linux-gnu-c++ -MMD -MF nattenav_cuda.o.d -DTORCH_EXTENSION_NAME=nattenav_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/yckj3822/anaconda3/envs/unsup3d/lib/python3.7/site-packages/torch/include -isystem /home/yckj3822/anaconda3/envs/unsup3d/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/yckj3822/anaconda3/envs/unsup3d/lib/python3.7/site-packages/torch/include/TH -isystem /home/yckj3822/anaconda3/envs/unsup3d/lib/python3.7/site-packages/torch/include/THC -isystem /usr/local/cuda-10.1/include -isystem /home/yckj3822/anaconda3/envs/unsup3d/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -c /home/yckj3822/GAN/Neighborhood-Attention-Transformer-main/natten/src/nattenav_cuda.cpp -o nattenav_cuda.o
[2/3] /usr/local/cuda-10.1/bin/nvcc  -ccbin /home/yckj3822/anaconda3/envs/unsup3d/bin/x86_64-conda_cos6-linux-gnu-cc -DTORCH_EXTENSION_NAME=nattenav_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/yckj3822/anaconda3/envs/unsup3d/lib/python3.7/site-packages/torch/include -isystem /home/yckj3822/anaconda3/envs/unsup3d/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/yckj3822/anaconda3/envs/unsup3d/lib/python3.7/site-packages/torch/include/TH -isystem /home/yckj3822/anaconda3/envs/unsup3d/lib/python3.7/site-packages/torch/include/THC -isystem /usr/local/cuda-10.1/include -isystem /home/yckj3822/anaconda3/envs/unsup3d/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_70,code=compute_70 -gencode=arch=compute_70,code=sm_70 --compiler-options '-fPIC' -std=c++14 -c /home/yckj3822/GAN/Neighborhood-Attention-Transformer-main/natten/src/nattenav_cuda_kernel.cu -o nattenav_cuda_kernel.cuda.o
FAILED: nattenav_cuda_kernel.cuda.o
/usr/local/cuda-10.1/bin/nvcc  -ccbin /home/yckj3822/anaconda3/envs/unsup3d/bin/x86_64-conda_cos6-linux-gnu-cc -DTORCH_EXTENSION_NAME=nattenav_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/yckj3822/anaconda3/envs/unsup3d/lib/python3.7/site-packages/torch/include -isystem /home/yckj3822/anaconda3/envs/unsup3d/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/yckj3822/anaconda3/envs/unsup3d/lib/python3.7/site-packages/torch/include/TH -isystem /home/yckj3822/anaconda3/envs/unsup3d/lib/python3.7/site-packages/torch/include/THC -isystem /usr/local/cuda-10.1/include -isystem /home/yckj3822/anaconda3/envs/unsup3d/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_70,code=compute_70 -gencode=arch=compute_70,code=sm_70 --compiler-options '-fPIC' -std=c++14 -c /home/yckj3822/GAN/Neighborhood-Attention-Transformer-main/natten/src/nattenav_cuda_kernel.cu -o nattenav_cuda_kernel.cuda.o
/home/yckj3822/anaconda3/envs/unsup3d/x86_64-conda_cos6-linux-gnu/include/c++/7.3.0/bits/basic_string.tcc: In instantiation of 'static std::basic_string<_CharT, _Traits, _Alloc>::_Rep* std::basic_string<_CharT, _Traits, _Alloc>::_Rep::_S_create(std::basic_string<_CharT, _Traits, _Alloc>::size_type, std::basic_string<_CharT, _Traits, _Alloc>::size_type, const _Alloc&) [with _CharT = char16_t; _Traits = std::char_traits<char16_t>; _Alloc = std::allocator<char16_t>; std::basic_string<_CharT, _Traits, _Alloc>::size_type = long unsigned int]':
/home/yckj3822/anaconda3/envs/unsup3d/x86_64-conda_cos6-linux-gnu/include/c++/7.3.0/bits/basic_string.tcc:578:28:   required from 'static _CharT* std::basic_string<_CharT, _Traits, _Alloc>::_S_construct(_InIterator, _InIterator, const _Alloc&, std::forward_iterator_tag) [with _FwdIterator = const char16_t*; _CharT = char16_t; _Traits = std::char_traits<char16_t>; _Alloc = std::allocator<char16_t>]'
/home/yckj3822/anaconda3/envs/unsup3d/x86_64-conda_cos6-linux-gnu/include/c++/7.3.0/bits/basic_string.h:5033:20:   required from 'static _CharT* std::basic_string<_CharT, _Traits, _Alloc>::_S_construct_aux(_InIterator, _InIterator, const _Alloc&, std::__false_type) [with _InIterator = const char16_t*; _CharT = char16_t; _Traits = std::char_traits<char16_t>; _Alloc = std::allocator<char16_t>]'
/home/yckj3822/anaconda3/envs/unsup3d/x86_64-conda_cos6-linux-gnu/include/c++/7.3.0/bits/basic_string.h:5054:24:   required from 'static _CharT* std::basic_string<_CharT, _Traits, _Alloc>::_S_construct(_InIterator, _InIterator, const _Alloc&) [with _InIterator = const char16_t*; _CharT = char16_t; _Traits = std::char_traits<char16_t>; _Alloc = std::allocator<char16_t>]'
/home/yckj3822/anaconda3/envs/unsup3d/x86_64-conda_cos6-linux-gnu/include/c++/7.3.0/bits/basic_string.tcc:656:134:   required from 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const _CharT*, std::basic_string<_CharT, _Traits, _Alloc>::size_type, const _Alloc&) [with _CharT = char16_t; _Traits = std::char_traits<char16_t>; _Alloc = std::allocator<char16_t>; std::basic_string<_CharT, _Traits, _Alloc>::size_type = long unsigned int]'
/home/yckj3822/anaconda3/envs/unsup3d/x86_64-conda_cos6-linux-gnu/include/c++/7.3.0/bits/basic_string.h:6676:95:   required from here
/home/yckj3822/anaconda3/envs/unsup3d/x86_64-conda_cos6-linux-gnu/include/c++/7.3.0/bits/basic_string.tcc:1067:16: error: cannot call member function 'void std::basic_string<_CharT, _Traits, _Alloc>::_Rep::_M_set_sharable() [with _CharT = char16_t; _Traits = std::char_traits<char16_t>; _Alloc = std::allocator<char16_t>]' without object
       __p->_M_set_sharable();
       ~~~~~~~~~^~
/home/yckj3822/anaconda3/envs/unsup3d/x86_64-conda_cos6-linux-gnu/include/c++/7.3.0/bits/basic_string.tcc: In instantiation of 'static std::basic_string<_CharT, _Traits, _Alloc>::_Rep* std::basic_string<_CharT, _Traits, _Alloc>::_Rep::_S_create(std::basic_string<_CharT, _Traits, _Alloc>::size_type, std::basic_string<_CharT, _Traits, _Alloc>::size_type, const _Alloc&) [with _CharT = char32_t; _Traits = std::char_traits<char32_t>; _Alloc = std::allocator<char32_t>; std::basic_string<_CharT, _Traits, _Alloc>::size_type = long unsigned int]':
/home/yckj3822/anaconda3/envs/unsup3d/x86_64-conda_cos6-linux-gnu/include/c++/7.3.0/bits/basic_string.tcc:578:28:   required from 'static _CharT* std::basic_string<_CharT, _Traits, _Alloc>::_S_construct(_InIterator, _InIterator, const _Alloc&, std::forward_iterator_tag) [with _FwdIterator = const char32_t*; _CharT = char32_t; _Traits = std::char_traits<char32_t>; _Alloc = std::allocator<char32_t>]'
/home/yckj3822/anaconda3/envs/unsup3d/x86_64-conda_cos6-linux-gnu/include/c++/7.3.0/bits/basic_string.h:5033:20:   required from 'static _CharT* std::basic_string<_CharT, _Traits, _Alloc>::_S_construct_aux(_InIterator, _InIterator, const _Alloc&, std::__false_type) [with _InIterator = const char32_t*; _CharT = char32_t; _Traits = std::char_traits<char32_t>; _Alloc = std::allocator<char32_t>]'
/home/yckj3822/anaconda3/envs/unsup3d/x86_64-conda_cos6-linux-gnu/include/c++/7.3.0/bits/basic_string.h:5054:24:   required from 'static _CharT* std::basic_string<_CharT, _Traits, _Alloc>::_S_construct(_InIterator, _InIterator, const _Alloc&) [with _InIterator = const char32_t*; _CharT = char32_t; _Traits = std::char_traits<char32_t>; _Alloc = std::allocator<char32_t>]'
/home/yckj3822/anaconda3/envs/unsup3d/x86_64-conda_cos6-linux-gnu/include/c++/7.3.0/bits/basic_string.tcc:656:134:   required from 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const _CharT*, std::basic_string<_CharT, _Traits, _Alloc>::size_type, const _Alloc&) [with _CharT = char32_t; _Traits = std::char_traits<char32_t>; _Alloc = std::allocator<char32_t>; std::basic_string<_CharT, _Traits, _Alloc>::size_type = long unsigned int]'
/home/yckj3822/anaconda3/envs/unsup3d/x86_64-conda_cos6-linux-gnu/include/c++/7.3.0/bits/basic_string.h:6681:95:   required from here
/home/yckj3822/anaconda3/envs/unsup3d/x86_64-conda_cos6-linux-gnu/include/c++/7.3.0/bits/basic_string.tcc:1067:16: error: cannot call member function 'void std::basic_string<_CharT, _Traits, _Alloc>::_Rep::_M_set_sharable() [with _CharT = char32_t; _Traits = std::char_traits<char32_t>; _Alloc = std::allocator<char32_t>]' without object
ninja: build stopped: subcommand failed.


During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/yckj3822/GAN/Neighborhood-Attention-Transformer-main/natten/nattencuda.py", line 27, in <module>
    import nattenav_cuda
ModuleNotFoundError: No module named 'nattenav_cuda'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "natten/gradcheck.py", line 11, in <module>
    from nattencuda import NATTENAVFunction, NATTENQKRPBFunction
  File "/home/yckj3822/GAN/Neighborhood-Attention-Transformer-main/natten/nattencuda.py", line 30, in <module>
    raise RuntimeError("Could not load NATTEN CUDA extension. " +
RuntimeError: Could not load NATTEN CUDA extension. Please make sure your device has CUDA, the CUDA toolkit for PyTorch is installed, and that you've compiled NATTEN correctly.
@alihassanijr
Member

Hello and thank you for your interest.

This is strange: it is failing to compile, but I don't see exactly which part of the extension is causing that.
I believe we have tried both Python 3.7 and 3.8 with torch 1.11, and in both cases NATTEN compiles.
You did state that you're on CUDA v10.1, and as far as I'm aware the latest torch version built with that CUDA toolkit version was 1.8.1, so this might be the root cause.
Can you confirm your torch version and the CUDA toolkit it was built with? That would help us try to reproduce the issue on our end so we can debug it. As far as I can see, torch 1.11 appears to have been built for three toolkits only: v10.2, 11.3, and 11.5. I just want to confirm which one you're on.

To get those, you can simply run:

python3 -c "import torch; print(torch.__version__); print(torch._C._cuda_getCompiledVersion())"

and this to get the version of the locally installed CUDA toolkit (the nvcc compiler that builds the extension):

nvcc --version

I can confirm that I tried a Docker image with CUDA v10.1, installed PyTorch and all the other requirements directly from the requirements.txt file, and it built successfully.
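
The two checks above can also be run together from Python. The following is a minimal sketch rather than part of NATTEN: the helper names (`parse_nvcc_release`, `parse_torch_cuda`, `local_nvcc_release`) are hypothetical, and it assumes `nvcc --version` prints a line of the form `release 10.1, V10.1.105`. It uses the public `torch.version.cuda` attribute to read the toolkit version PyTorch was built with.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(nvcc_output):
    """Return (major, minor) from `nvcc --version` text, or None if not found."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def parse_torch_cuda(version_string):
    """Return (major, minor) from a string such as '10.2'; None for CPU-only builds."""
    if not version_string:
        return None
    major, minor = version_string.split(".")[:2]
    return int(major), int(minor)


def local_nvcc_release():
    """Run `nvcc --version` if nvcc is on PATH; return (major, minor) or None."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None  # torch is optional here so the script still reports nvcc
    built_with = parse_torch_cuda(torch.version.cuda) if torch else None
    print("torch built with CUDA:", built_with)
    print("local nvcc release:   ", local_nvcc_release())
```

If the two tuples differ, the JIT build compiles the extension with a different toolkit than the one the installed PyTorch wheel was built against, which is the mismatch being asked about here.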

@alihassanijr added the "help wanted" label May 13, 2022
@wytcsuch
Author

wytcsuch commented May 14, 2022

python3 -c "import torch; print(torch.__version__); print(torch._C._cuda_getCompiledVersion())"

Thank you very much for your reply!
[Image: WeChat screenshot 20220514110304]
[Image: WeChat screenshot 20220514110601]

I have also changed the torch version to 1.8.0, and the same error still occurs. This is indeed a very strange error.

@alihassanijr
Member

I would recommend staying on torch 1.11, since we've found it to yield the best performance.
As far as this issue goes, I believe it's a known issue that PyTorch users have had when using ninja on CUDA v10.1.105, which is what you happen to be on.
Can you try some of the fixes reported in PyTorch issue # 1893, specifically this suggestion?

@wytcsuch
Author

wytcsuch commented May 21, 2022

I would recommend staying on torch 1.11, since we've found it to yield the best performance. As far as this issue goes, I believe it's a known issue that PyTorch users have had when using ninja on CUDA v10.1.105, which is what you happen to be on. Can you try some of the fixes reported in PyTorch issue # 1893, specifically this suggestion?

Hi, I changed CUDA to version 10.2, which seems to have solved the problem.
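
For anyone who lands here with the same `_M_set_sharable` error, a pre-flight check on the nvcc build can flag the toolkit version discussed in this thread before a JIT build is attempted. This is only a sketch under stated assumptions: `KNOWN_BAD_BUILDS`, `parse_nvcc_build`, and `is_known_bad` are hypothetical names, and the only build listed as affected is 10.1.105, the one named above; other builds may or may not be affected.

```python
import re

# Assumption: only the build named in this thread is treated as affected.
KNOWN_BAD_BUILDS = {(10, 1, 105)}


def parse_nvcc_build(nvcc_output):
    """Return (major, minor, patch) from the 'V10.1.105'-style token, or None."""
    match = re.search(r"\bV(\d+)\.(\d+)\.(\d+)", nvcc_output)
    return tuple(int(part) for part in match.groups()) if match else None


def is_known_bad(nvcc_output):
    """True if the `nvcc --version` text reports a build listed in KNOWN_BAD_BUILDS."""
    return parse_nvcc_build(nvcc_output) in KNOWN_BAD_BUILDS
```

Passing the text printed by `nvcc --version` to `is_known_bad` returns `True` for the 10.1.105 build reported in this issue and `False` for a 10.2 build, which matches the outcome reported above.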
