1. PyTorch version problem
THCudaCheck FAIL file=/pytorch/aten/src/THC/THCGeneral.cpp line=383 error=11 : invalid argument
Traceback (most recent call last):
  File "capsulenet.py", line 254, in <module>
    train(model, train_loader, test_loader, args)
  File "capsulenet.py", line 160, in train
    y_pred, x_recon = model(x, y) # forward
  File "/home/deeplab/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "capsulenet.py", line 63, in forward
    x = self.digitcaps(x)
  File "/home/deeplab/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/deeplab/Tracy/pytorchcode/CapsNet-Pytorch-master/capsulelayers.py", line 54, in forward
    x_hat = torch.squeeze(torch.matmul(self.weight, x[:, None, :, :, None]), dim=-1)
RuntimeError: cublas runtime error : the GPU program failed to execute at /pytorch/aten/src/THC/THCBlas.cu:450
The fix is to reinstall PyTorch, installing it directly from a wheel file (the "offline install" approach):
pip3 install https://download.pytorch.org/whl/cu100/torch-1.0.0-cp36-cp36m-linux_x86_64.whl
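After reinstalling, it can help to confirm which PyTorch build Python actually picks up. The following is a minimal sketch, not part of the original fix; the helper name describe_torch is made up for illustration. It only reads torch.__version__, torch.version.cuda and torch.cuda.is_available(), and it reports an import failure instead of crashing.

```python
import importlib.util


def describe_torch():
    """Return a small dict describing the installed torch build, if any.

    describe_torch is a hypothetical helper name used only for this sketch.
    """
    if importlib.util.find_spec("torch") is None:
        return {"installed": False}
    try:
        import torch
    except (ImportError, OSError) as exc:
        # For example, a missing shared library such as libcublas.
        return {"installed": True, "importable": False, "error": str(exc)}
    return {
        "installed": True,
        "importable": True,
        "version": torch.__version__,
        "built_for_cuda": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(describe_torch())
```

If "built_for_cuda" does not match the CUDA toolkit installed on the machine, a version mismatch like the one in this section is a plausible suspect.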
2. After reinstalling, running the script produced the following error
ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory
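The fix is as follows.
1. Modify the library search path
CUDA 9 had been installed on this machine earlier. I am not sure whether that is the cause, but PyTorch went looking for the CUDA 9 library.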
[root@localhost ~]# locate libcublas.so.9.0
/home/cuda_9/lib64/libcublas.so.9.0
/home/cuda_9/lib64/libcublas.so.9.0.176
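The locate output shows the file exists on disk, so the likely issue is that its directory is not on the loader's search path. As a rough check, assuming the library is expected to be found through LD_LIBRARY_PATH, the sketch below lists which directories on that variable contain a given file. The function name dirs_containing is hypothetical. The real dynamic loader also consults the ldconfig cache and default system directories, which this sketch does not cover.

```python
import os


def dirs_containing(lib_name, search_path=None):
    """Return the directories on a colon-separated path that contain lib_name.

    dirs_containing is a hypothetical helper; by default it inspects
    LD_LIBRARY_PATH only, not the ldconfig cache or system defaults.
    """
    if search_path is None:
        search_path = os.environ.get("LD_LIBRARY_PATH", "")
    hits = []
    for directory in search_path.split(":"):
        # Empty entries appear when the variable starts or ends with ':'.
        if directory and os.path.isfile(os.path.join(directory, lib_name)):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    print(dirs_containing("libcublas.so.9.0"))
```

An empty list here, combined with a hit from locate, suggests the directory simply is not on LD_LIBRARY_PATH.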
So my current workaround is to edit the LD_LIBRARY_PATH environment variable directly, so that the loader can find the library.
Try it out:
vim ~/.bashrc
# Add the following lines
export CUDA_HOME=/usr/local/cuda-10.0
export PATH=${PATH}:${CUDA_HOME}/bin
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${CUDA_HOME}/lib64:/home/cuda_9/lib64/
# End of additions
source ~/.bashrc
The script now runs.
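To check the result without launching the full training script, one option is to ask the dynamic loader to open the library directly. This is a minimal sketch using the standard ctypes module; can_load is a made-up helper name. LD_LIBRARY_PATH is read when a process starts, so run it from a shell where the updated ~/.bashrc has been sourced.

```python
import ctypes


def can_load(lib_name):
    """Return True if the dynamic loader can open lib_name in this process.

    can_load is a hypothetical helper name used only for this sketch.
    """
    try:
        ctypes.CDLL(lib_name)
    except OSError:
        # Raised when the shared object cannot be found or opened.
        return False
    return True


if __name__ == "__main__":
    print(can_load("libcublas.so.9.0"))
```

If this prints False in a fresh shell, the path change has probably not taken effect for that shell.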