cuda10.1安装tensorflow-gpu1.x版本时ImportError: DLL load failed: 找不到指定的模块问题解决

5 篇文章 3 订阅
5 篇文章 0 订阅

问题描述

最近在跑语义分割的经典网络,SegNet,PSPNet等,在github上发现2017年左右的论文代码都是基于tensorflow1.x的,于是使用anaconda新配置了一个虚拟环境,预期使用keras2.1.5+tensorflow1.13.2,前期的keras等基本包使用pip均可以正常安装,可是tensorflow-gpu在安装后无法导入,提示ImportError: DLL load failed: Failed to load the native TensorFlow runtime根本无法使用,尝试重装cuda,降级cuda到9.2(10.1直接降级到9.2这个真的是败笔,当时如果试一下10.0也许就没那么多事儿了),更换tensorflow1.x版本均失败。

问题定位

在tensorflow官网上看到,tensorflow1.3版本要求的cuda版本是10.0,而我的cuda版本是10.1,理论上高版本应该兼容低版本,可是这个出了问题,因此解决这个问题,有两个方案:

  1. 直接重装cuda为10.0版本,并重装原来的cudnn到相应的匹配版本,这个方法比较麻烦,重装cuda是个技术活儿,并且如果你的显卡最多只支持到cuda9.x的版本,那么此方法是不可行的。
  2. 安装cuda辅助工具cudatoolkit10.0.130,这个工具可以看作是一个精简版本的cuda,tensorflow在调用cuda包时优先找当前环境下的cudatoolkit,再去调用系统中的cuda模块。(也有一说实际上系统的cuda最终还是通过cudatoolkit的方式供虚拟环境调用)

问题解决

定位到了问题,那么解决就变得容易了,一下为使用cudatoolkit的解决方案:
首先使用如下命令在terminal中查询cudatoolkit的版本:

conda search cudatoolkit

回车后看到查询结果为:
cudatoolkit版本
为了和tensorflow1.13.2版本对应,我们需要安装cudatoolkit10.0.130版本,使用以下命令安装:

conda install cudatoolkit==10.0.130

回车后即开始安装,安装完毕后而已正常使用了。

  • 2
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
自编译tensorflow: 1.python3.5,tensorflow1.12; 2.支持cuda10.0,cudnn7.3.1,TensorRT-5.0.2.6-cuda10.0-cudnn7.3; 3.支持mkl,无MPI; 软硬件硬件环境:Ubuntu16.04,GeForce GTX 1080 配置信息: hp@dla:~/work/ts_compile/tensorflow$ ./configure WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown". You have bazel 0.19.1 installed. Please specify the location of python. [Default is /usr/bin/python]: /usr/bin/python3 Found possible Python library paths: /usr/local/lib/python3.5/dist-packages /usr/lib/python3/dist-packages Please input the desired Python library path to use. Default is [/usr/local/lib/python3.5/dist-packages] Do you wish to build TensorFlow with XLA JIT support? [Y/n]: XLA JIT support will be enabled for TensorFlow. Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: No OpenCL SYCL support will be enabled for TensorFlow. Do you wish to build TensorFlow with ROCm support? [y/N]: No ROCm support will be enabled for TensorFlow. Do you wish to build TensorFlow with CUDA support? [y/N]: y CUDA support will be enabled for TensorFlow. Please specify the CUDA SDK version you want to use. [Leave empty to default to CUDA 10.0]: Please specify the location where CUDA 10.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: /usr/local/cuda-10.0 Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7]: 7.3.1 Please specify the location where cuDNN 7 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda-10.0]: Do you wish to build TensorFlow with TensorRT support? [y/N]: y TensorRT support will be enabled for TensorFlow. Please specify the location where TensorRT is installed. [Default is /usr/lib/x86_64-linux-gnu]:/home/hp/bin/TensorRT-5.0.2.6-cuda10.0-cudnn7.3/targets/x86_64-linux-gnu Please specify the locally installed NCCL version you want to use. [Default is to use https://github.com/nvidia/nccl]: Please specify a list of comma-separated Cuda compute capabilities you want to build with. You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus. Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 6.1,6.1,6.1]: Do you want to use clang as CUDA compiler? [y/N]: nvcc will be used as CUDA compiler. Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]: Do you wish to build TensorFlow with MPI support? [y/N]: No MPI support will be enabled for TensorFlow. Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native -Wno-sign-compare]: Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: Not configuring the WORKSPACE for Android builds. Preconfigured Bazel build configs. You can use any of the below by adding "--config=" to your build command. See .bazelrc for more details. --config=mkl # Build with MKL support. --config=monolithic # Config for mostly static monolithic build. --config=gdr # Build with GDR support. --config=verbs # Build with libverbs support. --config=ngraph # Build with Intel nGraph support. --config=dynamic_kernels # (Experimental) Build kernels into separate shared objects. Preconfigured Bazel build configs to DISABLE default on features: --config=noaws # Disable AWS S3 filesystem support. --config=nogcp # Disable GCP support. --config=nohdfs # Disable HDFS support. --config=noignite # Disable Apacha Ignite support. --config=nokafka # Disable Apache Kafka support. --config=nonccl # Disable NVIDIA NCCL support. Configuration finished 编译: hp@dla:~/work/ts_compile/tensorflow$ bazel build --config=opt --config=mkl --verbose_failures //tensorflow/tools/pip_package:build_pip_package 卸载已有tensorflow: hp@dla:~/temp$ sudo pip3 uninstall tensorflow 安装自己编译的成果: hp@dla:~/temp$ sudo pip3 install tensorflow-1.12.0-cp35-cp35m-linux_x86_64.whl

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值