Installing CUDA and cuDNN for TensorFlow 2.0 and getting it to run


1. First check whether the NVIDIA driver is installed

nvidia-smi

If this fails with -bash: nvidia-smi: command not found

then go to NVIDIA's official driver download page.

Find the version that matches your GPU, download it, and run:

sudo chmod +x NVIDIA-Linux-x86_64-470.63.01.run
sudo sh NVIDIA-Linux-x86_64-470.63.01.run

Error:

ERROR: The Nouveau kernel driver is currently in use by your system. This driver is incompatible with the NVIDIA driver, and must be disabled before proceeding. Please consult the NVIDIA driver README and your Linux distribution’s documentation for details on how to correctly disable the Nouveau kernel driver.

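The usual fix is to blacklist the nouveau module under /etc/modprobe.d/, rebuild the initramfs, and reboot; see the referenced blog post for the exact steps.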
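Before rerunning the installer, it can help to confirm that Nouveau is really gone. The following is a minimal sketch, assuming a Linux system that exposes /proc/modules; the helper names `loaded_modules` and `nouveau_is_loaded` are made up for this example.

```python
from pathlib import Path


def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text.

    Each line starts with the module name, followed by size, use count, etc.
    """
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])
    return names


def nouveau_is_loaded(proc_modules_text):
    """True if the nouveau kernel module appears as a loaded module."""
    return "nouveau" in loaded_modules(proc_modules_text)


if __name__ == "__main__":
    modules_file = Path("/proc/modules")
    if modules_file.exists():
        loaded = nouveau_is_loaded(modules_file.read_text())
        print("nouveau loaded:", loaded)
```

If this still prints True after a reboot, the blacklist did not take effect and the NVIDIA installer will stop with the same error.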

Next error:

Unable to find the kernel source tree for the currently running kernel. Please make sure you have installed the kernel source files for your kernel and that they are properly configured; on Red Hat Linux systems, for example, be sure you have the 'kernel-source' or 'kernel-devel' RPM installed. If you know the correct kernel source files are installed, you may specify the kernel source path with the '--kernel-source-path' command line option.

See: NVIDIA driver problems on CentOS 7.5
References:
1. CentOS7–manually upgrade the kernel to the specified version
2. kernel-devel downloads
3. ERROR: Unable to find the kernel source tree
4. Installing and removing kernel-devel

rpm -qa | grep -E "kernel-devel|kernel-headers"

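The kernel-devel and kernel-headers versions did not match. The kernel-devel package has to match the running kernel (uname -r), so use references 1 and 2 to download the matching kernel-devel package and copy it to the server: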

yum localinstall kernel-devel-3.10.0-957.27.2.el7.x86_64.rpm

Remove the extra kernel-devel package:

yum remove kernel-devel-3.10.0-1160.41.1.el7.x86_64
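The match between kernel-devel and the running kernel can be checked mechanically. This is a small sketch, assuming the default `rpm -qa` naming (name-version-release.arch) and that `uname -r` on CentOS 7 returns the same version-release.arch string; `devel_matches_kernel` is a hypothetical helper, not part of any tool.

```python
import platform


def devel_matches_kernel(rpm_names, running_kernel):
    """True if a kernel-devel package exists for exactly the running kernel.

    rpm_names: package names as printed by `rpm -qa`
    running_kernel: the string printed by `uname -r`
    """
    return ("kernel-devel-" + running_kernel) in rpm_names


if __name__ == "__main__":
    # platform.release() returns the same value as `uname -r`.
    print("running kernel:", platform.release())
```

With the package names from this article, a list containing only kernel-devel-3.10.0-1160.41.1.el7.x86_64 would not match a 3.10.0-957.27.2.el7.x86_64 kernel, which is the situation the installer complains about.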

Run the installer again:

sudo sh NVIDIA-Linux-x86_64-470.63.01.run

This produces a warning:

WARNING: nvidia-installer was forced to guess the X library path ‘/usr/lib64’ and X module path ‘/usr/lib64/xorg/modules’; these paths were not queryable from the system. If X fails to find the NVIDIA X driver module, please install the pkg-config utility and the X.Org SDK/development package for your distribution and reinstall the driver.

This can be ignored; answer yes to the remaining prompts.

Finally, run nvidia-smi to check the result.

Success!

2. Check the CUDA version:

cat /usr/local/cuda/version.txt

Or:

nvcc -V

Check the cuDNN version:

cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
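The grep above prints three #define lines. As a rough sketch, the same information can be pulled out in Python; `parse_cudnn_version` is a name invented for this example, and the header path is the one used in this article (cuDNN 7 keeps these macros in cudnn.h).

```python
import re
from pathlib import Path


def parse_cudnn_version(header_text):
    """Extract (major, minor, patch) from the text of cudnn.h.

    Returns None if any of the three version macros is missing.
    """
    parts = []
    for key in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(r"^\s*#define\s+" + key + r"\s+(\d+)",
                          header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


if __name__ == "__main__":
    header = Path("/usr/local/cuda/include/cudnn.h")
    if header.is_file():
        print(parse_cudnn_version(header.read_text()))
```

A None result usually means the header is missing or cuDNN was copied into a different CUDA directory than the one /usr/local/cuda points to.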

Check whether the GPU can be used.

In Jupyter, run:

import tensorflow as tf
print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU'))) 

If the versions do not match, the output is:

Num GPUs Available: 0

The TensorFlow site lists which CUDA and cuDNN versions go with each TF release:

https://tensorflow.google.cn/install/source

According to that table:

TensorFlow 2.0.0 needs CUDA 10.0 and cuDNN 7.6.
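The version requirement can be written down as a small lookup and checked against what is installed. This is only a sketch: the `REQUIRED` table contains the single pairing stated in this article, and `same_series` and `mismatches` are hypothetical helper names.

```python
# Pairing taken from this article; extend it from the TensorFlow build table.
REQUIRED = {"2.0.0": {"CUDA": "10.0", "cuDNN": "7.6"}}


def same_series(installed, required):
    """Compare only as many dotted components as `required` specifies."""
    want = required.split(".")
    have = installed.split(".")
    return have[:len(want)] == want


def mismatches(tf_version, installed):
    """Return human-readable problems for the given TensorFlow version.

    installed: dict such as {"CUDA": "10.0.130", "cuDNN": "7.6.5"}
    """
    problems = []
    for name, required in REQUIRED[tf_version].items():
        found = installed.get(name)
        if found is None or not same_series(found, required):
            problems.append(f"{name}: found {found}, need {required}.x")
    return problems


if __name__ == "__main__":
    print(mismatches("2.0.0", {"CUDA": "10.1.105", "cuDNN": "7.6.5"}))
```

An empty list means the installed toolkit and cuDNN fall in the required series; anything else is a likely reason for "Num GPUs Available: 0".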

2. Install CUDA 10.0

Uninstall the previous CUDA version (optional):

cd /usr/local/cuda-XX/bin
sudo ./uninstall_cuda_toolkit_XX.pl

Download CUDA 10.0.

Download page: https://developer.nvidia.com/cuda-toolkit-archive

First check the Linux distribution version:

cat /etc/redhat-release

The output is: CentOS Linux release 7.7

Then check the architecture:

uname -a

The output includes: x86_64
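The two values needed to pick the right installer (distribution release and architecture) can also be read from Python. A minimal sketch, assuming a Red Hat style /etc/redhat-release file; `parse_redhat_release` is a name made up for this example.

```python
import platform
import re
from pathlib import Path


def parse_redhat_release(text):
    """Split a line like 'CentOS Linux release 7.7' into (name, version).

    Returns None if the text does not follow the '<name> release <version>' form.
    """
    match = re.match(r"\s*(.+?)\s+release\s+(\d+(?:\.\d+)*)", text)
    if match is None:
        return None
    return match.group(1), match.group(2)


if __name__ == "__main__":
    release_file = Path("/etc/redhat-release")
    if release_file.is_file():
        print(parse_redhat_release(release_file.read_text()))
    # platform.machine() reports the architecture, e.g. x86_64.
    print(platform.machine())
```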

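Download the matching installer, copy it to the server, and install it. Note that the filename below names the 10.1.105 runfile; for CUDA 10.0, substitute the name of the 10.0.130 runfile you downloaded.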

sudo chmod +x cuda_10.1.105_418.39_linux.run
sudo sh cuda_10.1.105_418.39_linux.run

For the installer options, see:
https://www.freesion.com/article/6641492348/

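Error:
An NVIDIA kernel module 'nvidia-drm' appears to already be loaded in your kernel…

This error is raised by the driver step of the installer, because the NVIDIA driver module is already loaded.

Solution:

Stop the display manager (lightdm here):
sudo service lightdm stop
Switch away from the graphical target:
sudo systemctl isolate multi-user.target
Unload the nvidia-drm kernel module:
sudo modprobe -r nvidia-drm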

After the installation finishes, check:

cat /usr/local/cuda/version.txt

Output: CUDA Version 10.0.130

Add CUDA to the environment variables (editing your own ~/.bashrc does not need sudo):

vim ~/.bashrc

Add:

export PATH="/usr/local/cuda-10.0/bin:$PATH" 
export LD_LIBRARY_PATH="/usr/local/cuda-10.0/lib64:$LD_LIBRARY_PATH"

source ~/.bashrc
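When several toolkits are installed, the first CUDA bin directory on PATH decides which nvcc runs. The sketch below checks that ordering; `first_cuda_bin` is a hypothetical helper, and it assumes the usual /usr/local/cuda-X.Y/bin layout and the ":" separator used on Linux.

```python
import os


def first_cuda_bin(path_value):
    """Return the first PATH entry that looks like a CUDA bin directory.

    An entry qualifies if it ends in .../cuda*/bin. Returns None if no
    such entry exists.
    """
    for entry in path_value.split(":"):
        parts = entry.rstrip("/").split("/")
        if len(parts) >= 2 and parts[-1] == "bin" and parts[-2].startswith("cuda"):
            return entry
    return None


if __name__ == "__main__":
    print(first_cuda_bin(os.environ.get("PATH", "")))
```

For the setup in this article the expected result is /usr/local/cuda-10.0/bin; a different directory means another toolkit shadows it.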

Check:

nvcc -V

Output:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Fri_Feb__8_19:08:17_PST_2019
Cuda compilation tools, release 10.1, V10.1.105
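The two version checks used in this article (version.txt and nvcc -V) can disagree when more than one toolkit is installed. As a sketch, the following compares them; the function names are invented for this example, and the regular expressions assume the output formats shown in this article.

```python
import re


def parse_nvcc_release(nvcc_output):
    """Return the 'major.minor' release from `nvcc -V` output, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def parse_version_txt(text):
    """Return 'major.minor' from a 'CUDA Version X.Y.Z' line, or None."""
    match = re.search(r"CUDA Version\s+(\d+)\.(\d+)", text)
    return f"{match.group(1)}.{match.group(2)}" if match else None


def toolkits_agree(version_txt, nvcc_output):
    """True only if both sources parse and report the same major.minor."""
    from_file = parse_version_txt(version_txt)
    from_nvcc = parse_nvcc_release(nvcc_output)
    return from_file is not None and from_file == from_nvcc


if __name__ == "__main__":
    print(toolkits_agree("CUDA Version 10.0.130",
                         "Cuda compilation tools, release 10.1, V10.1.105"))
```

With the two outputs quoted in this article the check returns False (10.0 versus 10.1), which points at a second toolkit being first on PATH.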

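CUDA is now installed. Note that the nvcc output above reports release 10.1 while version.txt reported 10.0.130; for TensorFlow 2.0.0, nvcc should report release 10.0, so if it does not, check which toolkit directory comes first on PATH.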

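3. Install cuDNN for CUDA 10.0

Download it from the official site: https://developer.nvidia.com/rdp/cudnn-archive

Choose the version that matches your CUDA.

I chose: cudnn-10.0-linux-x64-v7.6.5.32.tgz

Copy it to the server and extract it: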

tar -xvf cudnn-10.0-linux-x64-v7.6.5.32.tgz

Extracting the archive produces a cuda directory.

Copy the files into the CUDA installation:

sudo cp cuda/include/cudnn.h /usr/local/cuda-10.0/include # use the path of your CUDA version
sudo cp cuda/lib64/libcudnn* /usr/local/cuda-10.0/lib64 # use the path of your CUDA version
sudo chmod a+r /usr/local/cuda-10.0/include/cudnn.h /usr/local/cuda-10.0/lib64/libcudnn*
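A quick way to confirm that the copy landed in the intended toolkit directory is to look for the header and the libraries there. This is a minimal sketch; `cudnn_files` is a hypothetical helper, and /usr/local/cuda-10.0 is the install path used in this article.

```python
from pathlib import Path


def cudnn_files(cuda_root):
    """Report whether cudnn.h exists and which libcudnn* files are present.

    Returns (header_found, sorted list of library file names).
    """
    root = Path(cuda_root)
    header_found = (root / "include" / "cudnn.h").is_file()
    libraries = sorted(p.name for p in (root / "lib64").glob("libcudnn*"))
    return header_found, libraries


if __name__ == "__main__":
    print(cudnn_files("/usr/local/cuda-10.0"))
```

A result of (False, []) means the files went somewhere else, typically into a different cuda-X.Y directory.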

Check: cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2

cuDNN is now installed!

4. Check in Jupyter whether the GPU is available

import tensorflow as tf
print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices("GPU")))

4. Using the GPU in TensorFlow 2.0

import tensorflow as tf
sess = tf.compat.v1.Session(config=tf.compat.v1.ConfigProto(log_device_placement=True))

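If the printed device list contains both GPU and CPU devices, the GPU is in use.
If only CPU devices appear, the GPU has not been enabled.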
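The session-based check above goes through the TF1 compatibility layer. As an alternative sketch in TF 2.x eager style, `tf.debugging.set_log_device_placement` logs where each op runs, and an eager tensor's `device` attribute shows where the result lives. The wrapper function `matmul_device` is made up for this example and returns None when TensorFlow is not installed.

```python
def matmul_device():
    """Run a tiny matmul and return the device string of the result.

    Returns None if TensorFlow cannot be imported.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # Log the device chosen for every op executed from here on.
    tf.debugging.set_log_device_placement(True)
    a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
    b = tf.constant([[1.0, 0.0], [0.0, 1.0]])
    c = tf.matmul(a, b)
    return c.device


if __name__ == "__main__":
    print(matmul_device())
```

A device string ending in GPU:0 indicates the op ran on the GPU; one ending in CPU:0 means TensorFlow fell back to the CPU.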

5. Check the Keras/TensorFlow version mapping and the current version

Look up the version mapping online.
TensorFlow 2.0 corresponds to Keras 2.3.1.

import keras 

print(keras.__version__)

The installed version is 2.2.5.

Reinstall:

cd xxx/xxx/anaconda3/bin 

./pip install keras==2.3.1 
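Changing into the Anaconda bin directory makes sure the right pip is used. A sketch of the same idea from inside Python: building the command around `sys.executable -m pip` targets the interpreter that is actually running (for example the Jupyter kernel). The helper names `needs_pin` and `pip_pin_command` are invented for this example.

```python
import sys


def needs_pin(installed, required):
    """True if the installed version string differs from the required one."""
    return installed != required


def pip_pin_command(package, version):
    """Build a pip command that installs an exact version into the
    environment of the currently running interpreter."""
    return [sys.executable, "-m", "pip", "install", f"{package}=={version}"]


if __name__ == "__main__":
    # 2.2.5 and 2.3.1 are the versions mentioned in this article.
    if needs_pin("2.2.5", "2.3.1"):
        print(" ".join(pip_pin_command("keras", "2.3.1")))
```

The printed command can be run in a terminal, or passed to `subprocess.run` if the installation should happen from the script itself.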

Installation complete!

References:
https://www.freesion.com/article/9245510937/
https://blog.csdn.net/sinat_23619409/article/details/84202651
https://blog.csdn.net/kingfoulin/article/details/98872965
