Installing the NVIDIA Driver, CUDA, and cuDNN on CentOS

1. Install the NVIDIA Driver

1.1 Query the GPU model

  Install the lshw utility and query the graphics hardware information.

yum install -y lshw
lshw -numeric -C display

  The output looks like this:

  *-display                 
       description: 3D controller
       product: GK180GL [Tesla K40c] [10DE:1024]
       vendor: NVIDIA Corporation [10DE]
       physical id: 0
       bus info: pci@0000:02:00.0
       logical name: /dev/fb0
       version: a1
       width: 64 bits
       clock: 33MHz
       capabilities: pm msi pciexpress bus_master cap_list fb
       configuration: depth=32 driver=nouveau latency=0 mode=1024x768 visual=truecolor xres=1024 yres=768
       resources: iomemory:383f0-383ef iomemory:383f0-383ef irq:99 memory:d2000000-d2ffffff memory:383fe0000000-383fefffffff memory:383ff0000000-383ff1ffffff
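
  If lshw is not available, a quick alternative (a sketch, assuming the pciutils package is installed) is to list NVIDIA devices with lspci:

## Alternative GPU check via lspci
lspci | grep -i nvidia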

1.2 Download the driver

  Download the matching driver from the NVIDIA website.

  URL: https://www.nvidia.com/Download/index.aspx?lang=en-us

  My card is reported as [Tesla K40c], so I downloaded the driver NVIDIA-Linux-x86_64-460.91.03.run.

1.3 Blacklist the built-in nouveau driver

  Edit the dist-blacklist.conf file:

vim /lib/modprobe.d/dist-blacklist.conf

## Comment out the existing line
#blacklist nvidiafb

## Add the following
blacklist nouveau
options nouveau modeset=0

  Save the file, then reboot the system.

1.4 Rebuild the initramfs image

mv /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.bak
dracut /boot/initramfs-$(uname -r).img $(uname -r)

# Set the default target to text (multi-user) mode
systemctl set-default multi-user.target

  After completing the steps above, reboot the system.
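
  Before installing the NVIDIA driver, it is worth confirming that nouveau is really disabled; a minimal check (empty output from both commands is what you want) is:

## Should print nothing if the blacklist and rebuilt initramfs took effect
lsmod | grep nouveau
lsinitrd /boot/initramfs-$(uname -r).img | grep nouveau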

1.5 Install the driver

  First install the build dependencies and make the installer executable:

yum install -y gcc gcc-c++ make kernel-devel kernel-headers
chmod a+x NVIDIA-Linux-x86_64-460.91.03.run

  Run the installer:

./NVIDIA-Linux-x86_64-460.91.03.run --kernel-source-path=/usr/src/kernels/3.10.0-1160.42.2.el7.x86_64 -k $(uname -r)
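
  The --kernel-source-path above must match the running kernel. If your kernel version differs from the one shown here, check which kernel-devel tree is installed first (a quick sketch):

## Confirm the running kernel and the matching kernel-devel directory
uname -r
ls /usr/src/kernels/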

1.6 Verify the driver

  After installation completes, check the driver with the nvidia-smi command; the output looks like this:

Tue Oct  5 22:20:52 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.91.03    Driver Version: 460.91.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla K40c          Off  | 00000000:02:00.0 Off |                    0 |
| 23%   36C    P0    66W / 235W |      0MiB / 11441MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K40c          Off  | 00000000:03:00.0 Off |                    0 |
| 23%   35C    P0    66W / 235W |      0MiB / 11441MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  Tesla K40c          Off  | 00000000:83:00.0 Off |                    0 |
| 23%   34C    P0    64W / 235W |      0MiB / 11441MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   3  Tesla K40c          Off  | 00000000:84:00.0 Off |                    0 |
| 23%   37C    P0    68W / 235W |      0MiB / 11441MiB |     39%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
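
  For scripted health checks, nvidia-smi can also print selected fields as CSV (standard nvidia-smi query options; the field list here is just an example):

## Compact per-GPU summary
nvidia-smi --query-gpu=index,name,driver_version,memory.total --format=csv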

2. Install CUDA

2.1 Download CUDA

  CUDA Toolkit download page: https://developer.nvidia.com/cuda-toolkit-archive

  Since nvidia-smi reports CUDA Version: 11.2 (the highest CUDA version the installed driver supports), I installed that version directly.

  The download command is as follows:

## Download the installer
wget https://developer.download.nvidia.com/compute/cuda/11.2.2/local_installers/cuda_11.2.2_460.32.03_linux.run

2.2 Install CUDA

## Make the installer executable
chmod a+x cuda_11.2.2_460.32.03_linux.run

## Run the installer
sudo sh cuda_11.2.2_460.32.03_linux.run

## Installation summary
===========
= Summary =
===========

Driver:   Not Selected
Toolkit:  Installed in /usr/local/cuda-11.2/
Samples:  Installed in /root/, but missing recommended libraries

Please make sure that
 -   PATH includes /usr/local/cuda-11.2/bin
 -   LD_LIBRARY_PATH includes /usr/local/cuda-11.2/lib64, or, add /usr/local/cuda-11.2/lib64 to /etc/ld.so.conf and run ldconfig as root

To uninstall the CUDA Toolkit, run cuda-uninstaller in /usr/local/cuda-11.2/bin
***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 460.00 is required for CUDA 11.2 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
    sudo <CudaInstaller>.run --silent --driver

Logfile is /var/log/cuda-installer.log
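
  Since the driver from section 1 is already installed, deselect "Driver" in the installer menu and install only the toolkit. For an unattended install, the runfile also supports a silent mode (a sketch using the installer's documented flags):

## Toolkit-only, non-interactive install
sudo sh cuda_11.2.2_460.32.03_linux.run --silent --toolkit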

2.3 Configure environment variables

vim ~/.bashrc

## Append the following to the end of the file
export CUDA_HOME=/usr/local/cuda-11.2
export PATH=$CUDA_HOME/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=$CUDA_HOME/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

## Apply immediately
source ~/.bashrc
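
  As the installer summary suggests, registering the library directory with the dynamic linker is an alternative to exporting LD_LIBRARY_PATH (the .conf file name below is arbitrary; any file under /etc/ld.so.conf.d/ works):

## System-wide alternative to LD_LIBRARY_PATH
echo "/usr/local/cuda-11.2/lib64" | sudo tee /etc/ld.so.conf.d/cuda-11-2.conf
sudo ldconfig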

2.4 Verify CUDA

nvcc -V
## Version output
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Feb_14_21:12:58_PST_2021
Cuda compilation tools, release 11.2, V11.2.152
Build cuda_11.2.r11.2/compiler.29618528_0

2.5 Test CUDA

  Build and run the deviceQuery sample:

cd /usr/local/cuda-11.2/samples/1_Utilities/deviceQuery
sudo make
./deviceQuery

  The output looks like this:

./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 4 CUDA Capable device(s)

Device 0: "Tesla K40c"
  CUDA Driver Version / Runtime Version          11.2 / 11.2
  CUDA Capability Major/Minor version number:    3.5
  Total amount of global memory:                 11441 MBytes (11996954624 bytes)
  (15) Multiprocessors, (192) CUDA Cores/MP:     2880 CUDA Cores
  GPU Max Clock rate:                            745 MHz (0.75 GHz)
  Memory Clock rate:                             3004 Mhz
  Memory Bus Width:                              384-bit
...
...
...
Device PCI Domain ID / Bus ID / location ID:   0 / 132 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
> Peer access from Tesla K40c (GPU0) -> Tesla K40c (GPU1) : Yes
> Peer access from Tesla K40c (GPU0) -> Tesla K40c (GPU2) : No
> Peer access from Tesla K40c (GPU0) -> Tesla K40c (GPU3) : No
> Peer access from Tesla K40c (GPU1) -> Tesla K40c (GPU0) : Yes
> Peer access from Tesla K40c (GPU1) -> Tesla K40c (GPU2) : No
> Peer access from Tesla K40c (GPU1) -> Tesla K40c (GPU3) : No
> Peer access from Tesla K40c (GPU2) -> Tesla K40c (GPU0) : No
> Peer access from Tesla K40c (GPU2) -> Tesla K40c (GPU1) : No
> Peer access from Tesla K40c (GPU2) -> Tesla K40c (GPU3) : Yes
> Peer access from Tesla K40c (GPU3) -> Tesla K40c (GPU0) : No
> Peer access from Tesla K40c (GPU3) -> Tesla K40c (GPU1) : No
> Peer access from Tesla K40c (GPU3) -> Tesla K40c (GPU2) : Yes

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.2, CUDA Runtime Version = 11.2, NumDevs = 4
Result = PASS

  Build and run the bandwidthTest sample:

cd ../bandwidthTest
sudo make
./bandwidthTest

  The output looks like this:

[CUDA Bandwidth Test] - Starting...
Running on...

 Device 0: Tesla K40c
 Quick Mode

 Host to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(GB/s)
   32000000			7.3

 Device to Host Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(GB/s)
   32000000			6.5

 Device to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(GB/s)
   32000000			184.9

Result = PASS

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

  If both tests end with Result = PASS, CUDA has been installed successfully.

3. Install cuDNN

  Official download page: https://developer.nvidia.com/rdp/cudnn-archive#a-collapse810-111

  Download the build that matches your CUDA version and operating system, then copy the files into the CUDA installation and set read permissions.

## Extract
tar -xzvf cudnn-11.2-linux-x64-v8.1.0.77.tgz

## Copy headers and libraries (cuDNN 8.x ships several cudnn*.h headers, so copy them all)
cp cuda/include/cudnn*.h /usr/local/cuda-11.2/include/
cp cuda/lib64/libcudnn* /usr/local/cuda-11.2/lib64/

## Grant read permission
sudo chmod a+r /usr/local/cuda-11.2/include/cudnn*.h /usr/local/cuda-11.2/lib64/libcudnn*
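
  To confirm the headers are in place, check the version macros; with cuDNN 8.x they live in cudnn_version.h (this assumes all cudnn*.h headers were copied in the step above):

## Print the installed cuDNN version
grep -A 2 CUDNN_MAJOR /usr/local/cuda-11.2/include/cudnn_version.h

  With the driver, CUDA, and cuDNN all installed and verified, the environment is ready for GPU workloads.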