Troubleshooting: torchvision 0.3 + CUDA 10.0 + PyTorch 1.2 + Ubuntu 18.04 raises ImportError: libcudart.so.9.0: cannot open shared object file: No such file or directory

Running

from torchvision import transforms

raises the following error:

---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-1-1519560278f3> in <module>
      4 import matplotlib.pyplot as plt
      5 import shutil
----> 6 from torchvision import transforms
      7 from torchvision import models
      8 import torch

~/venv/pytorch/lib/python3.6/site-packages/torchvision/__init__.py in <module>
----> 1 from torchvision import models
      2 from torchvision import datasets
      3 from torchvision import ops
      4 from torchvision import transforms
      5 from torchvision import utils

~/venv/pytorch/lib/python3.6/site-packages/torchvision/models/__init__.py in <module>
      9 from .shufflenetv2 import *
     10 from . import segmentation
---> 11 from . import detection

~/venv/pytorch/lib/python3.6/site-packages/torchvision/models/detection/__init__.py in <module>
----> 1 from .faster_rcnn import *
      2 from .mask_rcnn import *
      3 from .keypoint_rcnn import *

~/venv/pytorch/lib/python3.6/site-packages/torchvision/models/detection/faster_rcnn.py in <module>
      5 import torch.nn.functional as F
      6 
----> 7 from torchvision.ops import misc as misc_nn_ops
      8 from torchvision.ops import MultiScaleRoIAlign
      9 

~/venv/pytorch/lib/python3.6/site-packages/torchvision/ops/__init__.py in <module>
----> 1 from .boxes import nms, box_iou
      2 from .roi_align import roi_align, RoIAlign
      3 from .roi_pool import roi_pool, RoIPool
      4 from .poolers import MultiScaleRoIAlign
      5 from .feature_pyramid_network import FeaturePyramidNetwork

~/venv/pytorch/lib/python3.6/site-packages/torchvision/ops/boxes.py in <module>
      1 import torch
----> 2 from torchvision import _C
      3 
      4 
      5 def nms(boxes, scores, iou_threshold):

ImportError: libcudart.so.9.0: cannot open shared object file: No such file or directory
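
The last line of the traceback names the exact shared library the loader could not find. As a small sketch (the helper name `missing_shared_library` is hypothetical, not part of any library), the library name and its version can be extracted from the message to see which CUDA runtime the failing extension expects:

```python
import re

# Matches loader messages such as
# "libcudart.so.9.0: cannot open shared object file: No such file or directory"
_MISSING_LIB = re.compile(
    r"(lib\w+)\.so\.(\d+(?:\.\d+)*): cannot open shared object file"
)


def missing_shared_library(message):
    """Return (library, version) named in a loader error message, or None."""
    match = _MISSING_LIB.search(message)
    if match is None:
        return None
    return match.group(1), match.group(2)


msg = "libcudart.so.9.0: cannot open shared object file: No such file or directory"
print(missing_shared_library(msg))
```

For the message above this yields libcudart and 9.0, which suggests the extension was built for CUDA 9.0 even though CUDA 10.0 is what is installed.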

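Observations:

import torch works fine.

import tensorflow as tf also works fine. (I happened to hit the same torchvision error inside my TensorFlow virtual environment, where PyTorch is installed as well.)

Both frameworks can run computations on the GPU without problems.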
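
Since the other imports work while torchvision fails, it can help to ask the dynamic loader directly which runtime versions it is able to open. A minimal sketch using the standard `ctypes` module; `can_load` is a hypothetical helper, and the two sonames are simply the versions relevant to this post:

```python
import ctypes


def can_load(soname):
    """Return True if the dynamic loader can open the given shared library."""
    try:
        ctypes.CDLL(soname)
        return True
    except OSError:
        # Raised when the library is missing or cannot be loaded.
        return False


for soname in ("libcudart.so.9.0", "libcudart.so.10.0"):
    print(soname, "loadable" if can_load(soname) else "NOT loadable")
```

On a machine with only CUDA 10.0 installed, one would expect the 9.0 entry to be reported as not loadable, which is consistent with the ImportError.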

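Directly related reports I found:

[1] libcudart.so.9.0: cannot open shared object file: No such file or directory. Someone else hit the same problem, but nobody has answered yet.

[2] libcudart.so.9.0: cannot open shared object file: No such file or directory. Essentially the same situation; the fix there was to downgrade torchvision to 0.2.2. I did not want to downgrade, and that did not feel like the real cause.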

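Root cause:

The torchvision 0.3 build installed here targets CUDA 9 (hence the request for libcudart.so.9.0) and does not work with CUDA 10. Upgrading to torchvision 0.4 fixes it.

A plain upgrade also pulls in a newer PyTorch, so pin the version with the command below.

Solution: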

pip install torchvision==0.4.0 -f https://download.pytorch.org/whl/torch_stable.html
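
Pinning matters because a torchvision wheel appears to be tied to one specific torch release. The sketch below records only the pairings that show up in this post (0.4.0 with torch 1.2.0 in the install command, 0.4.1 with torch 1.3.0 in the pip log). The table and the helpers `torchvision_for` and `pip_command` are illustrative assumptions, not an official API; the PyTorch previous-versions page remains the authoritative source.

```python
# torch version -> matching torchvision version, taken from the commands
# and pip output quoted in this post (not an exhaustive or official table).
TORCH_TO_TORCHVISION = {
    "1.2.0": "0.4.0",
    "1.3.0": "0.4.1",
}


def torchvision_for(torch_version):
    """Return the torchvision version paired with torch_version, or None."""
    return TORCH_TO_TORCHVISION.get(torch_version)


def pip_command(torch_version):
    """Build a pinned pip command, or raise if the pairing is unknown."""
    tv = torchvision_for(torch_version)
    if tv is None:
        raise ValueError("no known torchvision pairing for torch " + torch_version)
    return (
        "pip install torchvision==" + tv
        + " -f https://download.pytorch.org/whl/torch_stable.html"
    )


print(pip_command("1.2.0"))
```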

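How I got there:

  1. Could the installed PyTorch 1.2 be incompatible with CUDA 10.0?

    No: the failing package is torchvision, and I have seen even older PyTorch versions work without trouble.

  2. Could it be a bug in torchvision itself?

    Try upgrading torchvision.

    Run pip install -U torchvision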

    (pytorch) nicken@lll:~$ pip install -U torchvision
    Collecting torchvision
      Downloading https://files.pythonhosted.org/packages/fc/23/d418c9102d4054d19d57ccf0aca18b7c1c1f34cc0a136760b493f78ddb06/torchvision-0.4.1-cp36-cp36m-manylinux1_x86_64.whl (10.1MB)
         |████████████████████████████████| 10.1MB 270kB/s 
    Requirement already satisfied, skipping upgrade: six in ./venv/pytorch/lib/python3.6/site-packages (from torchvision) (1.12.0)
    Collecting torch==1.3.0
      Downloading https://files.pythonhosted.org/packages/ae/05/50a05de5337f7a924bb8bd70c6936230642233e424d6a9747ef1cfbde353/torch-1.3.0-cp36-cp36m-manylinux1_x86_64.whl (773.1MB)
         |█████                           | 121.6MB 8.2kB/s eta 22:11:25ERROR: Exception:
    
    

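    This would have upgraded PyTorch as well, so I interrupted it.

    Next, try uninstalling 0.3 and reinstalling 0.3 explicitly:

    pip uninstall torchvision
    pip install torchvision==0.3.0
    

    The problem persisted.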

  3. Could the CUDA environment be misconfigured?

    Checked the CUDA setup following reference [1]; CUDA itself looks fine.

    cd /usr/local/cuda/samples/1_Utilities/deviceQuery  # adjust to the path on your machine
    sudo make
    sudo ./deviceQuery
    

    Check the installed versions:

    cat /usr/local/cuda/version.txt
    cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2 
    

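    While checking cuDNN, I discovered it had never been set up (oops).

    Set up cuDNN (details omitted).

    Even after that, the original problem remained.

  4. Inspired by reference [2]

    Run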

sudo cp /usr/local/cuda/lib64/libcudart.so.10.0 /usr/local/lib/libcudart.so.10.0 && sudo ldconfig
sudo cp /usr/local/cuda/lib64/libcublas.so.10.0 /usr/local/lib/libcublas.so.10.0 && sudo ldconfig
sudo cp /usr/local/cuda/lib64/libcurand.so.10.0 /usr/local/lib/libcurand.so.10.0 && sudo ldconfig

    Still, the problem was not solved.

  5. Inspired by reference [3]

    Run

    sudo vim ~/.bashrc
    

    Add

    export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}
    export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
    
    export CUDA_HOME=/usr/local/cuda
    

    Reload the file

    source ~/.bashrc
    

    Check that /usr/local/cuda-10.0/lib64 contains libcublas.so.10.0, then run

    sudo ldconfig /usr/local/cuda-10.0/lib64
    

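    Still, the original problem remained.

  6. Inspired by reference [6], install cudatoolkit-10.0

It turns out pip cannot install cudatoolkit; only conda can. Shelved for now.

  7. Consider reinstalling PyTorch

On the official site's page for previous versions, the conda commands for 1.2 all install cudatoolkit,

whereas the pip command installs PyTorch like this: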

pip install torch==1.2.0 torchvision==0.4.0 -f https://download.pytorch.org/whl/torch_stable.html

In the end I tried installing torchvision on its own:

pip install torchvision==0.4.0 -f https://download.pytorch.org/whl/torch_stable.html

Problem solved.
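
To confirm the fix, the imports can be retried and the versions reported in one go. A small sketch that relies only on the standard library; `report_versions` is a hypothetical helper. A missing shared library such as libcudart.so.9.0 surfaces as an `ImportError`, which is caught and recorded instead of aborting:

```python
import importlib


def report_versions(names):
    """Try to import each module; map its name to a version or error string."""
    report = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError as exc:
            # Covers both "module not installed" and loader failures
            # like "libcudart.so.9.0: cannot open shared object file".
            report[name] = "import failed: " + str(exc)
        else:
            report[name] = getattr(module, "__version__", "unknown")
    return report


for name, status in report_versions(["torch", "torchvision"]).items():
    print(name, "->", status)
```

After the pinned install, both entries should show a version string rather than an import failure.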

References:

[1] How to check that CUDA is installed on Ubuntu

[2] libcudart.so.8.0: cannot open shared object file: No such file or directory

[3] ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory

[4] https://blog.csdn.net/weixin_33910460/article/details/91722002

[5] https://blog.csdn.net/zqun817/article/details/88750321

[6] [Installing pytorch1.0 + cuda10.1] Problem: ImportError: /usr/lib/libcudart.so.10.0: version ‘libcudart.so.10.0’ not…

[7] https://pytorch.org/get-started/previous-versions/

cuDNN:

[1] Installing CUDA 10 and cuDNN on Ubuntu 18.04

[2] https://developer.nvidia.com/rdp/cudnn-download
