cudaCheckError() failed: unspecified launch failure #28

eneserdo · 2022-07-07T15:40:32Z

Hi, when I run ./experiments/scripts/demo.sh, I am getting the following error:

...
object 0, class 019_pitcher_base, z 0.5498372912406921, z new 0.6564772725105286
object 1, class 008_pudding_box, z 0.6858921647071838, z new 0.7096845507621765
object 2, class 002_master_chef_can, z 0.5724276304244995, z new 0.6095401048660278
object 3, class 052_extra_large_clamp, z 0.6460237503051758, z new 0.6376593708992004
object 4, class 011_banana, z 0.6999950408935547, z new 0.7429261207580566
/opt/conda/conda-bld/pytorch_1591914742272/work/torch/csrc/utils/python_arg_parser.cpp:756: UserWarning: This overload of nonzero is deprecated:
nonzero(Tensor input, *, Tensor out)
Consider using one of the following signatures instead:
nonzero(Tensor input, *, bool as_tuple)
sdf 27338 points for object 0, class 10 019_pitcher_base
sdf 6888 points for object 1, class 6 008_pudding_box
sdf 16038 points for object 2, class 0 002_master_chef_can
sdf 11314 points for object 3, class 18 052_extra_large_clamp
sdf 3385 points for object 4, class 9 011_banana
sdf with 64963 points
cudaCheckError() failed: unspecified launch failure

I tried on multiple machines. Here is the full error log

My setup:
Ubuntu 20.04
CUDA 10.1
PyTorch 1.4

Any help will be appreciated.

tsrobcvai · 2023-02-28T08:05:33Z

I also hit this problem, on Ubuntu 20.04, CUDA 11.1, PyTorch 1.8. Have you solved it?

eneserdo · 2023-03-12T17:51:47Z

Nope

wetoo-cando · 2023-08-09T06:58:45Z

Same problem here when I run ./experiments/scripts/dex_ycb_test_s0.sh 0 with
Ubuntu 20.04
CUDA 11.1
torch 1.10.1+cu111.

I am in a python-venv inside a docker container based on https://hub.docker.com/r/nvidia/cudagl.

@eneserdo @mcgilltaosun were you able to solve this?

eneserdo · 2023-08-09T18:40:25Z

I dropped my job because of this error. Please do not tag me anymore. Why, NVIDIA, why are you not reproducible?

wetoo-cando · 2023-08-13T06:05:35Z

A little more print debugging shows the exact location of the error:

object 0, class 025_mug, z 0.7601078152656555, z new 0.8151350021362305
object 1, class 003_cracker_box, z 0.9675762057304382, z new 1.0729904174804688
object 2, class 002_master_chef_can, z 0.7824445962905884, z new 0.7986501455307007
sdf 5599 points for object 0, class 13 025_mug
sdf 10896 points for object 1, class 1 003_cracker_box
sdf 8007 points for object 2, class 0 002_master_chef_can
sdf with 24502 points
sdf_matching_loss_kernel.cu: cudaCheckError() failed (cudaDeviceSynchronize): unspecified launch failure

It happens inside the function sdf_loss_cuda_forward() at line 276 in the sdf_matching_loss_kernel.cu file.

No idea what to look for / how to debug further though. Any help would be appreciated.
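
One possible next step, sketched here as an assumption rather than a confirmed fix: rerun the same script with `CUDA_LAUNCH_BLOCKING=1` so kernel launches are synchronous, and wrap it in NVIDIA's memory checker (`compute-sanitizer` on newer toolkits, `cuda-memcheck` on older ones) if one is installed. An "unspecified launch failure" is often caused by an out-of-bounds access inside the kernel, and the checker can usually name the offending access. The helper names below are hypothetical and not part of the repository.

```python
import os
import shutil


def build_debug_env(base_env=None):
    """Return a copy of the environment with synchronous kernel launches enabled."""
    env = dict(os.environ if base_env is None else base_env)
    # CUDA_LAUNCH_BLOCKING=1 makes each kernel launch block until it finishes,
    # so an error is reported at the launch that caused it.
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def build_debug_command(script_args, which=shutil.which):
    """Prefix the command with a CUDA memory checker if one is on PATH."""
    for tool in ("compute-sanitizer", "cuda-memcheck"):
        if which(tool):
            return [tool] + list(script_args)
    # No checker found: run the command unchanged.
    return list(script_args)


if __name__ == "__main__":
    cmd = build_debug_command(["./experiments/scripts/dex_ycb_test_s0.sh", "0"])
    print("CUDA_LAUNCH_BLOCKING=1", " ".join(cmd))
```

The printed command can be run by hand, or passed to `subprocess.run(cmd, env=build_debug_env())`. Because the script is a shell wrapper, the checker may need to be applied to the Python command inside it instead of to the wrapper itself.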

namGGG · 2024-01-10T01:59:48Z

I'm stuck in the middle...
GPU RTX 3090
Ubuntu 20.04
CUDA 11.1
PyTorch 1.8.2 LTS

/usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:3454: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
  warnings.warn(
cudaGraphicsGLRegisterImage failed: 304
cudaGraphicsMapResources failed: 400
cudaGraphicsSubResourceGetMappedArray failed: 400
cudaMemcpy2DFromArray failed: 709
cudaGraphicsUnmapResources failed: 400
cudaGraphicsGLRegisterImage failed: 304
cudaGraphicsMapResources failed: 400
cudaGraphicsSubResourceGetMappedArray failed: 400
cudaMemcpy2DFromArray failed: 709
cudaGraphicsUnmapResources failed: 400
cudaGraphicsGLRegisterImage failed: 304
cudaGraphicsMapResources failed: 400
cudaGraphicsSubResourceGetMappedArray failed: 400
cudaMemcpy2DFromArray failed: 709
cudaGraphicsUnmapResources failed: 400
object 0, class 019_pitcher_base, z 0.5521460175514221, z new -0.2002505660057068
object 1, class 008_pudding_box, z 0.6852722764015198, z new -0.021693646907806396
object 2, class 002_master_chef_can, z 0.5711051225662231, z new -0.11909496784210205
object 3, class 052_extra_large_clamp, z 0.6500653028488159, z new 0.3591681122779846
object 4, class 011_banana, z 0.7000908255577087, z new 0.8758471608161926
sdf 0 points for object 0, class 10 019_pitcher_base, no refinement
sdf 0 points for object 1, class 6 008_pudding_box, no refinement
sdf 0 points for object 2, class 0 002_master_chef_can, no refinement
sdf 0 points for object 3, class 18 052_extra_large_clamp, no refinement
sdf 499 points for object 4, class 9 011_banana
sdf with 499 points
cudaCheckError() failed: unspecified launch failure
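
The numeric codes in the log above can be decoded with a small lookup, shown here as a sketch. The mapping assumes the `cudaError_t` numbering used by CUDA 10.1 and later, and the function name is hypothetical. Read this way, the OpenGL interop calls fail before the SDF step runs, which would be consistent with the negative "z new" values and the "sdf 0 points" lines. That reading is an interpretation of the log, not a confirmed diagnosis.

```python
import re

# Subset of cudaError_t values (CUDA 10.1+ numbering) that appear in this thread.
CUDA_ERROR_NAMES = {
    304: "cudaErrorOperatingSystem",
    400: "cudaErrorInvalidResourceHandle",
    709: "cudaErrorContextIsDestroyed",
    719: "cudaErrorLaunchFailure",  # reported as "unspecified launch failure"
}

# Matches lines such as "cudaGraphicsMapResources failed: 400".
_FAILED = re.compile(r"^(\w+) failed: (\d+)$")


def decode_failures(log_text):
    """Return (call, code, name) for every '<call> failed: <code>' line."""
    decoded = []
    for line in log_text.splitlines():
        match = _FAILED.match(line.strip())
        if match:
            code = int(match.group(2))
            decoded.append((match.group(1), code, CUDA_ERROR_NAMES.get(code, "unknown")))
    return decoded


if __name__ == "__main__":
    sample = "cudaGraphicsGLRegisterImage failed: 304\ncudaMemcpy2DFromArray failed: 709"
    for call, code, name in decode_failures(sample):
        print(f"{call}: {code} ({name})")
```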
