matlab\src\bits\impl\pooling_gpu.cu

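This came up while setting up an old version of MatConvNet, the code from May 2016, which should be beta20. Under CUDA 8, with -gencode=arch=compute_61,code=\"sm_61,compute_61\" set, the following problem occurred.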

Error using vl_compilenn>nvcc_compile (line 522)
Command "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\bin\nvcc" -c "C:\Program Files\MATLAB\R2017b\ResNet-Matconvnet\dependencies\matconvnet\matlab\src\bits\impl\pooling_gpu.cu" -DNDEBUG -DENABLE_GPU -DENABLE_DOUBLE -D__SSSE3__ -gencode=arch=compute_61,code=\"sm_61,compute_61\"  -I"C:\Program Files\MATLAB\R2017b\extern\include" -I"C:\Program Files\MATLAB\R2017b\toolbox\distcomp\gpu\extern\include"  -gencode=arch=compute_61,code=\"sm_61,compute_61\"  -Xcompiler /MD -o "C:\Program Files\MATLAB\R2017b\ResNet-Matconvnet\dependencies\matconvnet\matlab\mex\.build\bits\impl\pooling_gpu.obj" failed.

Error in vl_compilenn (line 467)
      nvcc_compile(opts, srcs{i}, objfile, flags.nvcc) ;

 

I found a solution in the issue thread "Compiling with CUDA 8.0":

I don't know if it's a proper fix or not, but I ended up doing the following:

pooling_gpu.cu, line 163
(commented out atomicAdd)

bilinearsampler_gpu.cu, line 25
(commented out atomicAdd)

This made the compilation succeed, and vl_testnn ran without any problems afterwards. I am posting this here in case anyone else has the same problem, and in case anyone has further insight into what caused it and what issues may arise from commenting out the lines above.

Thanks!
-Justin

 
 

missilzolair commented on 12 Jun 2016

Hi,

I just encountered the same issue, except that I am using a GTX 970, with compute capability 5.2. Apparently, it is related to the fact that Pascal (compute capability 6.0) added a native double-precision version of atomicAdd.

In my case, if I simply comment out the overloaded definition of atomicAdd, I still get an error, since a compute capability 5.2 device has no native double-precision atomicAdd to fall back on. The right solution (http://stackoverflow.com/questions/37566987/cuda-atomicadd-for-doubles-definition-error/37569519) is to use the following macro:

#if !defined(__CUDA_ARCH__) || __CUDA_ARCH__ >= 600
#else
<... place here your own pre-pascal atomicAdd definition ...>
#endif

I added the aforementioned macro at the two locations that Justin mentioned, and the code compiles just fine with CUDA 8.0 for GPU architectures below 6.0.
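For reference, here is a minimal sketch of what the guarded definition can look like once the placeholder is filled in. It is not the MatConvNet source: the fallback body is the usual atomicCAS loop for doubles from the CUDA programming guide, and the kernel `accumulate` and the host helper `sum_on_device` are hypothetical names added only to exercise it.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// The fallback is compiled only for device code targeting compute
// capability below 6.0. For sm_60 and newer, and in the host pass where
// __CUDA_ARCH__ is undefined, the declaration shipped with CUDA 8 is used.
#if !defined(__CUDA_ARCH__) || __CUDA_ARCH__ >= 600
#else
__device__ double atomicAdd(double* address, double val)
{
  unsigned long long int* address_as_ull = (unsigned long long int*)address;
  unsigned long long int old = *address_as_ull, assumed;
  do {
    assumed = old;
    // Try to swap in (assumed + val); retry if another thread got there first.
    old = atomicCAS(address_as_ull, assumed,
                    __double_as_longlong(val + __longlong_as_double(assumed)));
  } while (assumed != old);
  return __longlong_as_double(old);
}
#endif

// Hypothetical kernel: each of the first n threads adds 1.0 to *sum.
__global__ void accumulate(double* sum, int n)
{
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) atomicAdd(sum, 1.0);
}

// Hypothetical host helper: runs the kernel and returns the accumulated sum.
// Error checking is omitted to keep the sketch short.
double sum_on_device(int n)
{
  double* d_sum = NULL;
  double h_sum = 0.0;
  cudaMalloc(&d_sum, sizeof(double));
  cudaMemcpy(d_sum, &h_sum, sizeof(double), cudaMemcpyHostToDevice);
  accumulate<<<(n + 255) / 256, 256>>>(d_sum, n);
  cudaMemcpy(&h_sum, d_sum, sizeof(double), cudaMemcpyDeviceToHost);
  cudaFree(d_sum);
  return h_sum;
}
```

With this guard the same file should build for both sm_52 and sm_61. Commenting the definition out, by contrast, only helps when every target architecture is 6.0 or newer.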

Cheers,
Ben

posted @ 2018-06-06 09:23 by 菜鸡一枚