Segmentation fault (core dumped) #39
Comments
Could you please provide the full log? |
OK, the log is as follows: |
The demo process log is as follows. It stays in this state the whole time. Thanks for your help with the project. |
From the log, I can only tell that the problem shows up after this line: AlphAction/alphaction/engine/trainer.py, line 30 in 99acc16.
You can add some |
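One way to narrow down where a segmentation fault happens, in addition to adding print statements, is Python's standard faulthandler module, which dumps the Python traceback when the process receives a fatal signal such as SIGSEGV. The sketch below is only an illustration; the helper name enable_crash_traces is hypothetical and not part of AlphAction.

```python
import faulthandler
import sys


def enable_crash_traces(stream=sys.stderr):
    """Dump Python tracebacks of all threads on fatal signals (e.g. SIGSEGV).

    Hypothetical helper: call it once at the very top of the entry script,
    before the suspect imports are executed.
    """
    if not faulthandler.is_enabled():
        faulthandler.enable(file=stream, all_threads=True)
    return faulthandler.is_enabled()
```

Calling enable_crash_traces() at the start of the training script, or launching the interpreter with the -X faulthandler option, should print the Python frame that was active when the crash occurred.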
Could you provide the video so that I could locate the problem more easily? |
Yes, I added print statements and traced the problem to roi_align_3d.py, at the line import alphaction._custom_cuda_ext as _C. When I import the package in a shell, it reports 'undefined symbol: _ZN6caffe26detail37_typeMetaDataInstance_preallocated_32E'. However, the project installed without any errors. |
OK. The video is bNP8Q_8u89A.webm from the AVA 2.2 test set. I have tried several other videos, and they all show the same behavior. |
It means that you have not successfully installed this project yet. The demo could be stuck because of the same problem. Please refer to |
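A build that finishes without errors can still produce an extension that fails to load, so it can help to test the import directly. The following is a minimal sketch using only the standard library; try_import is a hypothetical helper name, and the module names in the usage note are the ones mentioned in this thread.

```python
import importlib


def try_import(module_name):
    """Try to import a module by name and return (ok, detail).

    A compiled extension with an unresolved symbol normally surfaces here
    as an ImportError whose message contains 'undefined symbol: ...'.
    """
    try:
        importlib.import_module(module_name)
    except Exception as exc:  # report any load failure instead of raising
        return False, f"{type(exc).__name__}: {exc}"
    return True, "imported"
```

For example, try_import("torch") followed by try_import("alphaction._custom_cuda_ext") shows whether PyTorch itself loads and whether the failure is isolated to the compiled extension.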
But the installation process did not report any errors. |
Just try reinstalling it. Attach the installation log if you believe no error is reported. |
OK, I will try again. Thanks a lot! |
Obtaining file:///AlphAction-master
The above is the installation log. It still shows the errors. |
Could you please remove |
OK, I will try. Thanks. |
python setup.py build develop
/anaconda3/lib/python3.7/site-packages/torch/include/c10/core/TensorTypeSet.h(44): warning: integer conversion resulted in a change of sign
/usr/local/cuda/bin/nvcc -DWITH_CUDA -I/Codes/AlphAction/alphaction/csrc -I/anaconda3/lib/python3.7/site-packages/torch/include -I/anaconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/anaconda3/lib/python3.7/site-packages/torch/include/TH -I/anaconda3/lib/python3.7/site-packages/torch/include/THC -I/usr/local/cuda/include -I/anaconda3/include/python3.7m -c /Codes/AlphAction/alphaction/csrc/cuda/SigmoidFocalLoss_cuda.cu -o build/temp.linux-x86_64-3.7/Codes/AlphAction/alphaction/csrc/cuda/SigmoidFocalLoss_cuda.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options '-fPIC' -O3 -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_custom_cuda_ext -D_GLIBCXX_USE_CXX11_ABI=1 -gencode=arch=compute_61,code=sm_61 -std=c++11
/anaconda3/lib/python3.7/site-packages/torch/include/c10/core/TensorTypeSet.h(44): warning: integer conversion resulted in a change of sign
/usr/local/cuda/bin/nvcc -DWITH_CUDA -I/Codes/AlphAction/alphaction/csrc -I/anaconda3/lib/python3.7/site-packages/torch/include -I/anaconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/anaconda3/lib/python3.7/site-packages/torch/include/TH -I/anaconda3/lib/python3.7/site-packages/torch/include/THC -I/usr/local/cuda/include -I/anaconda3/include/python3.7m -c /Codes/AlphAction/alphaction/csrc/cuda/ROIPool3d_cuda.cu -o build/temp.linux-x86_64-3.7/Codes/AlphAction/alphaction/csrc/cuda/ROIPool3d_cuda.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options '-fPIC' -O3 -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_custom_cuda_ext -D_GLIBCXX_USE_CXX11_ABI=1 -gencode=arch=compute_61,code=sm_61 -std=c++11
/anaconda3/lib/python3.7/site-packages/torch/include/c10/core/TensorTypeSet.h(44): warning: integer conversion resulted in a change of sign
/usr/local/cuda/bin/nvcc -DWITH_CUDA -I/Codes/AlphAction/alphaction/csrc -I/anaconda3/lib/python3.7/site-packages/torch/include -I/anaconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/anaconda3/lib/python3.7/site-packages/torch/include/TH -I/anaconda3/lib/python3.7/site-packages/torch/include/THC -I/usr/local/cuda/include -I/anaconda3/include/python3.7m -c /Codes/AlphAction/alphaction/csrc/cuda/SoftmaxFocalLoss_cuda.cu -o build/temp.linux-x86_64-3.7/Codes/AlphAction/alphaction/csrc/cuda/SoftmaxFocalLoss_cuda.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options '-fPIC' -O3 -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_custom_cuda_ext -D_GLIBCXX_USE_CXX11_ABI=1 -gencode=arch=compute_61,code=sm_61 -std=c++11
/anaconda3/lib/python3.7/site-packages/torch/include/c10/core/TensorTypeSet.h(44): warning: integer conversion resulted in a change of sign
g++ -pthread -shared -B /anaconda3/compiler_compat -L/anaconda3/lib -Wl,-rpath=/anaconda3/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-3.7/Codes/AlphAction/alphaction/csrc/vision.o build/temp.linux-x86_64-3.7/Codes/AlphAction/alphaction/csrc/cuda/ROIAlign3d_cuda.o build/temp.linux-x86_64-3.7/Codes/AlphAction/alphaction/csrc/cuda/SigmoidFocalLoss_cuda.o build/temp.linux-x86_64-3.7/Codes/AlphAction/alphaction/csrc/cuda/ROIPool3d_cuda.o build/temp.linux-x86_64-3.7/Codes/AlphAction/alphaction/csrc/cuda/SoftmaxFocalLoss_cuda.o -L/usr/local/cuda/lib64 -lcudart -o build/lib.linux-x86_64-3.7/alphaction/custom_cuda_ext.cpython-37m-x86_64-linux-gnu.so
/anaconda3/lib/python3.7/site-packages/torch/include/c10/core/TensorTypeSet.h(44): warning: integer conversion resulted in a change of sign
g++ -pthread -shared -B /anaconda3/compiler_compat -L/anaconda3/lib -Wl,-rpath=/anaconda3/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-3.7/detector/nms/src/nms_cuda.o build/temp.linux-x86_64-3.7/detector/nms/src/nms_kernel.o -L/usr/local/cuda/lib64 -lcudart -o build/lib.linux-x86_64-3.7/detector/nms/nms_cuda.cpython-37m-x86_64-linux-gnu.so
Installed /Codes/AlphAction
Using /anaconda3/lib/python3.7/site-packages
Using /anaconda3/lib/python3.7/site-packages
Using /anaconda3/lib/python3.7/site-packages
Using /anaconda3/lib/python3.7/site-packages
Using /anaconda3/lib/python3.7/site-packages
Using /anaconda3/lib/python3.7/site-packages
Using /anaconda3/lib/python3.7/site-packages
Using /anaconda3/lib/python3.7/site-packages
Using /anaconda3/lib/python3.7/site-packages
Using /anaconda3/lib/python3.7/site-packages
Using /anaconda3/lib/python3.7/site-packages
Using /anaconda3/lib/python3.7/site-packages
Using /anaconda3/lib/python3.7/site-packages
Using /anaconda3/lib/python3.7/site-packages
Using /anaconda3/lib/python3.7/site-packages
Using /anaconda3/lib/python3.7/site-packages
Using /anaconda3/lib/python3.7/site-packages
The above is the installation log. However, when I run the training script, it shows '_custom_cuda_ext.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZN3c105ErrorC1ENS_14SourceLocationERKSs'. |
What is your gcc version? And the output of |
The gcc version is 5.3.1, and the output of python -c "import torch;print(torch.__file__)" is '/anaconda3/lib/python3.7/site-packages/torch/__init__.py'. |
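A possible reading of the undefined symbol reported above, offered as an assumption rather than a confirmed diagnosis: in the Itanium C++ name mangling, the token Ss abbreviates the pre-C++11 libstdc++ std::string, while the newer ABI spells the type out with __cxx11. The symbol _ZN3c105ErrorC1ENS_14SourceLocationERKSs ends in Ss even though the build passes -D_GLIBCXX_USE_CXX11_ABI=1, which would be consistent with a compiler whose libstdc++ ignores that macro (some distribution builds of gcc 5.x are reported to behave this way). The heuristic below, with the hypothetical name guess_string_abi, only classifies a mangled name by these two markers and can misfire on names that contain "Ss" by coincidence.

```python
def guess_string_abi(mangled_symbol):
    """Heuristically guess which libstdc++ std::string ABI a symbol uses.

    Hypothetical helper for illustration; not a real demangler.
    """
    if "__cxx11" in mangled_symbol:
        # e.g. ...NSt7__cxx1112basic_stringIc... (new ABI, spelled out)
        return "cxx11"
    if "Ss" in mangled_symbol:
        # 'Ss' is the Itanium abbreviation for the old std::string
        return "pre-cxx11"
    return "unknown"
```

If the extension asks for a "pre-cxx11" symbol while the installed PyTorch library exports the "cxx11" variant (or the reverse), the loader cannot resolve it, and rebuilding with a compiler whose ABI matches the PyTorch binary is one thing to try.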
emm..., no idea. How about |
I have tried this. The CUDA version that PyTorch was compiled with matches the installed CUDA 10.0. |
This does not seem to be a problem specific to this project. You may find helpful information in other repositories. |
Thanks a lot again for helping to solve the problems! By the way, could you please provide the UCF dataset? Is the process the same as for AVA? |
Sorry, we do not plan to provide a model trained on the UCF dataset. Most of our experiments are conducted on the AVA dataset, which is much larger and more challenging. To download the UCF dataset, you could check https://github.com/gurkirt/realtime-action-detection. |
That's OK! Thanks a lot! |
The official download link is invalid. Could you please provide an alternative link for downloading the dataset and annotations? |
You can find the frames and corrected annotations in the repository https://github.com/gurkirt/realtime-action-detection. |
Thanks a lot! |
Hi, have you solved the problem? I ran into the same problem. I have tried many times, and the build never reported any errors. Traceback (most recent call last): |
It looks like a CUDA version problem. Check that your CUDA and PyTorch versions match, and make sure that you have built the project successfully. |
same problem |
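Hello, did you manage to resolve this error in the end? |
I ran into it in a different project. In the end, switching to torch 1.10 and CUDA 11.1 solved it. |
This error is caused by a mismatch between the CUDA and PyTorch versions. Check the software versions, then try recompiling and reinstalling. |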
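The version check suggested in this thread can be scripted. The sketch below compares the CUDA release reported by nvcc --version with the CUDA version a PyTorch build reports in torch.version.cuda. The function names are hypothetical, and the parsing assumes the usual "release X.Y" wording in the nvcc output.

```python
import re


def parse_nvcc_release(nvcc_output):
    """Extract (major, minor) from the text printed by `nvcc --version`."""
    match = re.search(r"release\s+(\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def parse_torch_cuda(version_string):
    """Convert a torch.version.cuda value such as '10.0' to (major, minor).

    torch.version.cuda is None for CPU-only builds, so None is passed through.
    """
    if not version_string:
        return None
    major, minor = version_string.split(".")[:2]
    return int(major), int(minor)


def versions_match(nvcc_output, torch_cuda):
    """Return True only when both versions are known and identical."""
    toolkit = parse_nvcc_release(nvcc_output)
    built_for = parse_torch_cuda(torch_cuda)
    return toolkit is not None and toolkit == built_for
```

In practice, the first argument could come from subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout and the second from torch.version.cuda. A match on these two numbers does not rule out other causes, such as a compiler ABI difference.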
Hi, when I run the training command on a single GPU, it fails with Segmentation fault (core dumped). I have changed the multiprocess setting to 0, but the result is the same. Could you please help me address the problem?
When I run the training command on multiple GPUs, it shows a different error:
subprocess.CalledProcessError: Command '***(the command)' died with <Signals.SIGSEGV: 11>.
My environment is:
Python 3.7.6
PyTorch 1.3.1 built for CUDA 10.0
CUDA runtime version 10.0
Thanks.
Besides, when I run demo.py, it stays at 'Tracker Progress: 1004 frame [02:42, 6.14 frame/s]' for a long time. Is this normal?