Installing the GPU driver, CUDA, and cuDNN on Ubuntu 16.04

1. Install the NVIDIA graphics driver via runfile

Reference: https://gist.github.com/wangruohui/df039f0dc434d6486f5d4d098aa52d07#install-nvidia-graphics-driver-via-runfile

1.1 Remove the previously installed driver version:

zutnlp@YQ2:/opt/nvidia$ sudo apt-get purge 'nvidia*'

1.2 Download the NVIDIA driver

https://www.nvidia.com/Download/index.aspx?lang=en-us
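Find the version that matches your GPU.
Because another machine had already downloaded the file, it only needs to be copied over with scp: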

zutnlp@YQ1:~/.ssh$ sudo scp zutnlp@10.63.3.31:~/Downloads/NVIDIA-Linux-x86_64-418.87.01.run zutnlp@10.63.3.32:/opt/nvidia

If scp fails with a permission error, change the permissions of the directory named in the error message to 777:

zutnlp@YQ2:/opt$ sudo chmod  777 nvidia/
[sudo] password for zutnlp: 

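1.3 Disable the Nouveau driver

This machine had already been configured, so almost nothing was needed here. If your file does not look like the following, edit it with vim and add these lines.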

zutnlp@YQ2:~$ cat  /etc/modprobe.d/blacklist-nouveau.conf 
blacklist nouveau
options nouveau modeset=0
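
A quick way to confirm that the file carries both required lines is a small check like the one below. This is only a sketch: the function name nouveau_blacklisted is hypothetical, and it inspects the file contents only, not whether the module is currently loaded. Note that the kernel option is spelled modeset.

```shell
# Hypothetical helper: succeed only if the given file contains both
# lines needed to keep the nouveau module from loading.
nouveau_blacklisted() {
  conf=$1
  grep -Eq '^[[:space:]]*blacklist[[:space:]]+nouveau[[:space:]]*$' "$conf" &&
    grep -Eq '^[[:space:]]*options[[:space:]]+nouveau[[:space:]]+modeset=0[[:space:]]*$' "$conf"
}

# Example call against the path used in this post.
if nouveau_blacklisted /etc/modprobe.d/blacklist-nouveau.conf 2>/dev/null; then
  echo "nouveau blacklist looks complete"
else
  echo "nouveau blacklist missing or incomplete"
fi
```

Once the blacklist has taken effect, lsmod | grep nouveau should print nothing.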

1.4 Stop the X server services

sudo service lightdm stop

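1.5 Run the runfile

The first attempt failed with the --dkms errors shown below. Dropping --dkms is not a real fix, because the driver would then stop working after every kernel update and have to be reinstalled. The cause turned out to be missing dependencies: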

zutnlp@YQ2:/opt/nvidia$ sudo sh  NVIDIA-Linux-x86_64-418.87.01.run -s --dkms   --no-opengl-files
Verifying archive integrity... OK
Uncompressing NVIDIA Accelerated Graphics Driver for Linux-x86_64 418.87.01..................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................

ERROR: Failed to find dkms on the system!
ERROR: Failed to install the kernel module through DKMS. No kernel module was
       installed; please try installing again without DKMS, or check the DKMS
       logs for more information.


ERROR: Installation has failed.  Please see the file
       '/var/log/nvidia-installer.log' for details.  You may find suggestions
       on fixing installation problems in the README available on the Linux
       driver download page at www.nvidia.com.

The failure came from missing dependencies, so to be safe install the following packages:

 sudo apt-get update 
 sudo apt-get install dkms build-essential linux-headers-generic
 sudo apt-get install gcc-multilib xorg-dev
 sudo apt-get install freeglut3-dev libx11-dev libxmu-dev libxi-dev libgl1-mesa-glx libglu1-mesa libglu1-mesa-dev
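
Before launching the installer with --dkms, it can help to confirm that the build tools are actually on PATH. The sketch below assumes that dkms, gcc and make are the commands that matter for the module build; the helper name missing_commands is hypothetical, and it does not check for the kernel headers.

```shell
# Hypothetical helper: print each named command that is not on PATH.
missing_commands() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}

# Assumption: these are the tools the DKMS module build relies on.
missing=$(missing_commands dkms gcc make)
if [ -n "$missing" ]; then
  echo "install these before running the driver installer:" $missing
else
  echo "dkms, gcc and make are all available"
fi
```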

Running the installer again then succeeds:

zutnlp@YQ2:/opt/nvidia$ sudo sh  NVIDIA-Linux-x86_64-418.87.01.run -s --dkms   --no-opengl-files
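Option reference:
--no-opengl-files: install only the driver files, not the OpenGL files. Do not omit this option; without it the machine can get stuck in a login loop (also described as "stuck in login"). Why it is required: the NVIDIA driver installs its own OpenGL libraries by default, while Ubuntu already ships OpenGL libraries that the GUI depends on. If the NVIDIA installer overwrites them, the GUI fails when it dynamically links against OpenGL.
--no-x-check: skip the check for a running X server. Optional here, because the graphical interface has already been stopped.
--no-nouveau-check: skip the check for nouveau. Optional here, because nouveau has already been disabled.
-Z, --disable-nouveau: disable nouveau. Optional here, because nouveau was already disabled by hand.
-A: show the advanced options.
--dkms (recommended): registers the driver's kernel module with DKMS, so that a kernel update triggers a rebuild of the module for the new kernel and the driver does not have to be reinstalled by hand.
-s: silent installation, intended for batch installs. When installing on a single machine, leave this option off to get more installation output.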

The following output shows that the driver was installed successfully:

zutnlp@YQ2:/opt/nvidia$ nvidia-smi
Sun Nov 17 18:47:07 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.87.01    Driver Version: 418.87.01    CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla P100-PCIE...  Off  | 00000000:04:00.0 Off |                    0 |
| N/A   34C    P0    29W / 250W |      0MiB / 12198MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla P100-PCIE...  Off  | 00000000:82:00.0 Off |                    0 |
| N/A   35C    P0    24W / 250W |      0MiB / 12198MiB |      6%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
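
To use the installed driver version in a script, the header line of this table can be parsed. The sketch below is one way to do it; the helper name driver_version is hypothetical, and it assumes the header keeps the "Driver Version:" label shown above.

```shell
# Hypothetical helper: read nvidia-smi's table on stdin and print the
# value that follows "Driver Version:".
driver_version() {
  sed -n 's/.*Driver Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Sample header line copied from the output above.
header='| NVIDIA-SMI 418.87.01    Driver Version: 418.87.01    CUDA Version: 10.1     |'
printf '%s\n' "$header" | driver_version
```

On a live machine the same helper can be fed with nvidia-smi | driver_version. nvidia-smi can also report the value directly with --query-gpu=driver_version --format=csv,noheader.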

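2. Install CUDA

2.1 Installation steps

https://developer.nvidia.com/cuda-downloads
Installing CUDA is straightforward: just run the runfile.
Select the installer type on the download page, then download and run the installer: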

wget http://developer.download.nvidia.com/compute/cuda/10.1/Prod/local_installers/cuda_10.1.243_418.87.00_linux.run
sudo sh cuda_10.1.243_418.87.00_linux.run

Because the file is large, copying it from YQ1 to this machine is faster than downloading it again:

zutnlp@YQ1:~/.ssh$ scp /usr/local/cuda_10.1.243_418.87.00_linux.run  zutnlp@10.63.3.32:/opt/nvidia

The installer proceeds as follows:

────────────────────────────────────────────────────────────────────────────┐
│  End User License Agreement                                                  │
│  --------------------------                                                  │
│                                                                              │
│                                                                              │
│  Preface                                                                     │
│  -------                                                                     │
│                                                                              │
│  The Software License Agreement in Chapter 1 and the Supplement              │
│  in Chapter 2 contain license terms and conditions that govern               │
│  the use of NVIDIA software. By accepting this agreement, you                │
│  agree to comply with all the terms and conditions applicable                │
│  to the product(s) included herein.                                          │
│                                                                              │
│                                                                              │
│  NVIDIA Driver                                                               │
│                                                                              │
│                                                                              │
│  Description                                                                 │
│                                                                              │
│  This package contains the operating system driver and                       │
│──────────────────────────────────────────────────────────────────────────────│
│ Do you accept the above EULA? (accept/decline/quit):                         │
│ accept                                                                       │

Press Space to toggle an item; an X in front of it means it is selected. Select only the CUDA Toolkit here:

│ CUDA Installer                                                               │
│ - [ ] Driver                                                                 │
│      [ ] 418.87.00                                                           │
│ + [X] CUDA Toolkit 10.1                                                      │
│   [ ] CUDA Samples 10.1                                                      │
│   [ ] CUDA Demo Suite 10.1                                                   │
│   [ ] CUDA Documentation 10.1                                                │
│   Options                                                                    │
│   Install                                                                                                                                                                                        

Select Yes:

──────────────────────────────────────────────────────────────────────────────┐
│ A symlink already exists at /usr/local/cuda. Update to this installation?    │
│ Yes                                                                          │
│ No                                                                          

After a short wait, the following summary indicates that the installation is complete:

===========
= Summary =
===========

Driver:   Not Selected
Toolkit:  Installed in /usr/local/cuda-10.1/
Samples:  Not Selected

Please make sure that
 -   PATH includes /usr/local/cuda-10.1/bin
 -   LD_LIBRARY_PATH includes /usr/local/cuda-10.1/lib64, or, add /usr/local/cuda-10.1/lib64 to /etc/ld.so.conf and run ldconfig as root

To uninstall the CUDA Toolkit, run cuda-uninstaller in /usr/local/cuda-10.1/bin

Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-10.1/doc/pdf for detailed information on setting up CUDA.
***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 418.00 is required for CUDA 10.1 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
    sudo <CudaInstaller>.run --silent --driver

Logfile is /var/log/cuda-installer.log
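
The warning above says CUDA 10.1 needs a driver of at least version 418.00. A comparison like that can be scripted as sketched below. The helper name version_ge is hypothetical, and the sketch relies on the -V (version sort) option of GNU sort.

```shell
# Hypothetical helper: succeed if version $1 is at least version $2.
# Relies on GNU sort's -V (version sort) option.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# 418.87.01 is the driver installed in section 1.
if version_ge 418.87.01 418.00; then
  echo "driver is new enough for CUDA 10.1"
else
  echo "driver is too old for CUDA 10.1"
fi
```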

Run the following command to verify the installation:

zutnlp@YQ2:/opt/nvidia$ cat /usr/local/cuda/version.txt 
CUDA Version 10.1.243

2.2 Configure the runtime libraries

sudo bash -c "echo /usr/local/cuda/lib64/ > /etc/ld.so.conf.d/cuda.conf"
sudo ldconfig
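
To confirm that the linker cache picked up the CUDA libraries, the listing printed by ldconfig -p can be searched. The sketch below uses a hypothetical helper, cache_has_lib, and takes libcudart (the CUDA runtime library) as the example name.

```shell
# Hypothetical helper: read "ldconfig -p" output on stdin and succeed
# if any listed entry contains the given library name.
cache_has_lib() {
  grep -Fq "$1"
}

if ldconfig -p 2>/dev/null | cache_has_lib libcudart; then
  echo "libcudart is in the linker cache"
else
  echo "libcudart not found; check /etc/ld.so.conf.d/cuda.conf and rerun ldconfig"
fi
```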

Add /usr/local/cuda/bin to the system PATH environment variable with the following commands:

zutnlp@YQ2:/opt/nvidia$ vim /etc/environment 
zutnlp@YQ2:/opt/nvidia$ sudo vim /etc/environment 
zutnlp@YQ2:/opt/nvidia$ source /etc/environment 

2.3 Fix nvcc -V still reporting the old version

CUDA has been upgraded, but nvcc still reports the old CUDA 9.0:

zutnlp@YQ2:~$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176

The fix is to append the following lines to the end of /etc/profile:

export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
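
The ${PATH:+:${PATH}} form in these lines adds the ":" separator only when the variable is already set and non-empty. That avoids a trailing colon, which the shell would treat as an empty entry meaning the current directory. The sketch below shows the same expansion on ordinary variables; the helper name prepend_dir is hypothetical.

```shell
# Hypothetical helper: print dir prepended to an existing search path,
# adding the ":" separator only when the existing value is non-empty.
prepend_dir() {
  dir=$1
  current=$2
  printf '%s\n' "${dir}${current:+:${current}}"
}

prepend_dir /usr/local/cuda/bin ""          # prints /usr/local/cuda/bin
prepend_dir /usr/local/cuda/bin "/usr/bin"  # prints /usr/local/cuda/bin:/usr/bin
```

Putting /usr/local/cuda/bin first is what lets the new nvcc shadow an older one that appears later in PATH.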

After the edit, the file looks like this:

zutnlp@YQ2:~$ cat  /etc/profile
# /etc/profile: system-wide .profile file for the Bourne shell (sh(1))
# and Bourne compatible shells (bash(1), ksh(1), ash(1), ...).

if [ "$PS1" ]; then
  if [ "$BASH" ] && [ "$BASH" != "/bin/sh" ]; then
    # The file bash.bashrc already sets the default PS1.
    # PS1='\h:\w\$ '
    if [ -f /etc/bash.bashrc ]; then
      . /etc/bash.bashrc
    fi
  else
    if [ "`id -u`" -eq 0 ]; then
      PS1='# '
    else
      PS1='$ '
    fi
  fi
fi

if [ -d /etc/profile.d ]; then
  for i in /etc/profile.d/*.sh; do
    if [ -r $i ]; then
      . $i
    fi
  done
  unset i
fi

xrandr --newmode 1920x1080 173.00  1920 2048 2248 2576  1080 1083 1088 1120 -hsync +vsync
xrandr --addmode VGA-1 1920x1080
xrandr --output VGA-1 --mode 1920x1080

export PYTHONPATH=/home/yuqing/data/tf/models:/home/zutnlp/quanyou.chang/models/tf/slim
export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

The steps are as follows:

zutnlp@YQ2:~$ vim /etc/profile
zutnlp@YQ2:~$ sudo vim /etc/profile
[sudo] password for zutnlp: 
zutnlp@YQ2:~$ source /etc/profile
Can't open display 
Can't open display 
Can't open display 
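
The three "Can't open display" messages most likely come from the three xrandr lines in /etc/profile, which run even when the session has no X display (for example over SSH or after lightdm has been stopped). They do not affect the CUDA settings. One possible way to silence them, sketched below under that assumption, is to guard the xrandr calls with a check on DISPLAY. The helper names has_display and setup_display are hypothetical.

```shell
# Hypothetical helper: succeed only when an X display is available.
has_display() {
  [ -n "${DISPLAY:-}" ]
}

# Guarded form of the display setup from /etc/profile: the xrandr calls
# are skipped when DISPLAY is unset or empty.
setup_display() {
  has_display || return 0
  xrandr --newmode 1920x1080 173.00  1920 2048 2248 2576  1080 1083 1088 1120 -hsync +vsync
  xrandr --addmode VGA-1 1920x1080
  xrandr --output VGA-1 --mode 1920x1080
}
```

In /etc/profile the three bare xrandr lines would then be replaced by the two function definitions followed by a single setup_display call.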

Now check whether the configuration took effect:

zutnlp@YQ2:~$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
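
For a scripted check, the release number can be pulled out of this output. The sketch below assumes the "release X.Y," wording shown above; the helper name nvcc_release is hypothetical.

```shell
# Hypothetical helper: read "nvcc -V" output on stdin and print the
# release number from the "Cuda compilation tools" line.
nvcc_release() {
  sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

# Sample line copied from the output above.
printf '%s\n' 'Cuda compilation tools, release 10.1, V10.1.243' | nvcc_release
```

If the old release still shows up, command -v nvcc reveals which binary the shell resolves; a path outside /usr/local/cuda/bin means an earlier PATH entry is still winning.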

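3. Install cuDNN

This follows the install cuDNN section of this article: https://medium.com/repro-repo/install-cuda-and-cudnn-for-tensorflow-gpu-on-ubuntu-79306e4ac04e

3.1 Download the matching version from the official site; note that there are three packages: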

https://developer.nvidia.com/rdp/cudnn-download

zutnlp@YQ1:~/wueryong/nvidia/cuDNNN$ ll
total 336524
drwxrwxr-x 2 zutnlp zutnlp      4096 11月 17 15:50 ./
drwxrwxr-x 3 zutnlp zutnlp      4096 11月 17 15:50 ../
-rw-r--r-- 1 zutnlp zutnlp 180962466 11月 16 21:19 libcudnn7_7.6.5.32-1+cuda10.1_amd64.deb
-rw-r--r-- 1 zutnlp zutnlp 159185716 11月 16 21:18 libcudnn7-dev_7.6.5.32-1+cuda10.1_amd64.deb
-rw-r--r-- 1 zutnlp zutnlp   4428908 11月 16 21:03 libcudnn7-doc_7.6.5.32-1+cuda10.1_amd64.deb

3.2 Run the install commands

zutnlp@YQ1:~/wueryong/nvidia/cuDNNN$ sudo dpkg -i libcudnn7_7.6.5.32-1+cuda10.1_amd64.deb 
Selecting previously unselected package libcudnn7.
(Reading database ... 351236 files and directories currently installed.)
Preparing to unpack libcudnn7_7.6.5.32-1+cuda10.1_amd64.deb ...
Unpacking libcudnn7 (7.6.5.32-1+cuda10.1) ...
Setting up libcudnn7 (7.6.5.32-1+cuda10.1) ...
Processing triggers for libc-bin (2.23-0ubuntu11) ...
zutnlp@YQ1:~/wueryong/nvidia/cuDNNN$ sudo dpkg -i libcudnn7-dev_7.6.5.32-1+cuda10.1_amd64.deb 
Selecting previously unselected package libcudnn7-dev.
(Reading database ... 351242 files and directories currently installed.)
Preparing to unpack libcudnn7-dev_7.6.5.32-1+cuda10.1_amd64.deb ...
Unpacking libcudnn7-dev (7.6.5.32-1+cuda10.1) ...
Setting up libcudnn7-dev (7.6.5.32-1+cuda10.1) ...
update-alternatives: using /usr/include/x86_64-linux-gnu/cudnn_v7.h to provide /usr/include/cudnn.h (libcudnn) in auto mode
zutnlp@YQ1:~/wueryong/nvidia/cuDNNN$ sudo dpkg -i libcudnn7-doc_7.6.5.32-1+cuda10.1_amd64.deb 
Selecting previously unselected package libcudnn7-doc.
(Reading database ... 351248 files and directories currently installed.)
Preparing to unpack libcudnn7-doc_7.6.5.32-1+cuda10.1_amd64.deb ...
Unpacking libcudnn7-doc (7.6.5.32-1+cuda10.1) ...
Setting up libcudnn7-doc (7.6.5.32-1+cuda10.1) ...

3.3 Build and run the test sample:

zutnlp@YQ1:/usr/src/nvidia-418.87.01$ cp -r /usr/src/cudnn_samples_v7/ ~
zutnlp@YQ1:/usr/src/nvidia-418.87.01$ cd ~/cudnn_samples_v7/mnistCUDNN
zutnlp@YQ1:~/cudnn_samples_v7/mnistCUDNN$ make clean && make
rm -rf *o
rm -rf mnistCUDNN
Linking agains cublasLt = true
CUDA VERSION: 10010
TARGET ARCH: x86_64
HOST_ARCH: x86_64
TARGET OS: linux
SMS: 30 35 50 53 60 61 62 70 72 75
/usr/local/cuda/bin/nvcc -ccbin g++ -I/usr/local/cuda/include -I/usr/local/cuda/include -IFreeImage/include  -m64    -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_53,code=sm_53 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_62,code=sm_62 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_72,code=sm_72 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_75,code=compute_75 -o fp16_dev.o -c fp16_dev.cu
g++ -I/usr/local/cuda/include -I/usr/local/cuda/include -IFreeImage/include   -o fp16_emu.o -c fp16_emu.cpp
g++ -I/usr/local/cuda/include -I/usr/local/cuda/include -IFreeImage/include   -o mnistCUDNN.o -c mnistCUDNN.cpp
/usr/local/cuda/bin/nvcc -ccbin g++   -m64      -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_53,code=sm_53 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_62,code=sm_62 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_72,code=sm_72 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_75,code=compute_75 -o mnistCUDNN fp16_dev.o fp16_emu.o mnistCUDNN.o -I/usr/local/cuda/include -I/usr/local/cuda/include -IFreeImage/include  -L/usr/local/cuda/lib64 -L/usr/local/cuda/lib64 -lcublasLt -LFreeImage/lib/linux/x86_64 -LFreeImage/lib/linux -lcudart -lcublas -lcudnn -lfreeimage -lstdc++ -lm

zutnlp@YQ1:~/cudnn_samples_v7/mnistCUDNN$ ./mnistCUDNN
cudnnGetVersion() : 7605 , CUDNN_VERSION from cudnn.h : 7605 (7.6.5)
Host compiler version : GCC 4.9.4
There are 2 CUDA capable devices on your machine :
device 0 : sms 56  Capabilities 6.0, SmClock 1328.5 Mhz, MemSize (Mb) 12198, MemClock 715.0 Mhz, Ecc=1, boardGroupID=0
device 1 : sms 56  Capabilities 6.0, SmClock 1328.5 Mhz, MemSize (Mb) 12198, MemClock 715.0 Mhz, Ecc=1, boardGroupID=1
Using device 0

Testing single precision
Loading image data/one_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm ...
Fastest algorithm is Algo 1
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.031264 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.068768 time requiring 3464 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.079424 time requiring 57600 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.095104 time requiring 203008 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.101568 time requiring 2057744 memory
Resulting weights from Softmax:
0.0000000 0.9999399 0.0000000 0.0000000 0.0000561 0.0000000 0.0000012 0.0000017 0.0000010 0.0000000 
Loading image data/three_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000000 0.0000000 0.9999288 0.0000000 0.0000711 0.0000000 0.0000000 0.0000000 0.0000000 
Loading image data/five_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 0.9999820 0.0000154 0.0000000 0.0000012 0.0000006 

Result of classification: 1 3 5

Test passed!

Testing half precision (math in single precision)
Loading image data/one_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm ...
Fastest algorithm is Algo 1
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.024544 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.059488 time requiring 3464 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.068256 time requiring 28800 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.076192 time requiring 203008 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.090816 time requiring 2057744 memory
Resulting weights from Softmax:
0.0000001 1.0000000 0.0000001 0.0000000 0.0000563 0.0000001 0.0000012 0.0000017 0.0000010 0.0000001 
Loading image data/three_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000000 0.0000000 1.0000000 0.0000000 0.0000714 0.0000000 0.0000000 0.0000000 0.0000000 
Loading image data/five_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 1.0000000 0.0000154 0.0000000 0.0000012 0.0000006 

Result of classification: 1 3 5

Test passed!
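
The sample prints the cuDNN version as the integer 7605. In cuDNN 7, cudnn.h builds this number as CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL, so it can be decoded back into 7.6.5. The sketch below assumes that encoding; the helper name decode_cudnn_version is hypothetical.

```shell
# Hypothetical helper: turn a cuDNN 7 style version integer
# (major*1000 + minor*100 + patch) back into major.minor.patch.
decode_cudnn_version() {
  v=$1
  printf '%d.%d.%d\n' $((v / 1000)) $((v % 1000 / 100)) $((v % 100))
}

decode_cudnn_version 7605   # prints 7.6.5
```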

4. Official reference documentation

Official documentation: https://docs.nvidia.com/deeplearning/sdk/cudnn-install/
