
1. Install the NVIDIA driver

1.1 Check the GPU model and current driver


yum install -y lshw
lshw -numeric -C display


       description: 3D controller
       product: GK180GL [Tesla K40c] [10DE:1024]
       vendor: NVIDIA Corporation [10DE]
       physical id: 0
       bus info: pci@0000:02:00.0
       logical name: /dev/fb0
       version: a1
       width: 64 bits
       clock: 33MHz
       capabilities: pm msi pciexpress bus_master cap_list fb
       configuration: depth=32 driver=nouveau latency=0 mode=1024x768 visual=truecolor xres=1024 yres=768
       resources: iomemory:383f0-383ef iomemory:383f0-383ef irq:99 memory:d2000000-d2ffffff memory:383fe0000000-383fefffffff memory:383ff0000000-383ff1ffffff
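In the output above, the configuration line is the one that matters for the next steps: driver=nouveau means the open-source driver currently owns the card. A small sketch for pulling that field out in a script (the function name gpu_driver_from_lshw is my own, not part of lshw):

```shell
# Print the value of the driver= field from lshw output read on stdin.
gpu_driver_from_lshw() {
  sed -n 's/.*driver=\([^ ]*\).*/\1/p'
}

# Usage (run as root so lshw can report everything):
#   lshw -numeric -C display | gpu_driver_from_lshw
```

If this prints nouveau, the blacklist step in 1.3 is needed before the NVIDIA installer will run cleanly.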

1.2 Download the driver

  lshw reports the card as a [Tesla K40c], so I downloaded the matching driver, NVIDIA-Linux-x86_64-460.91.03.run.

1.3 Blacklist the built-in nouveau driver


vim /lib/modprobe.d/dist-blacklist.conf

## Comment out
#blacklist nvidiafb

## Add
blacklist nouveau
options nouveau modeset=0
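After editing the file, it is worth confirming that the blacklist line is really active and not commented out. A minimal sketch, assuming the standard modprobe.d syntax of one "blacklist <module>" directive per line; is_blacklisted is a hypothetical helper:

```shell
# Return 0 when the module named in $2 has an active (uncommented)
# "blacklist" line in the modprobe config file $1.
is_blacklisted() {
  grep -Eq "^[[:space:]]*blacklist[[:space:]]+$2([[:space:]]|\$)" "$1"
}

# Usage:
#   is_blacklisted /lib/modprobe.d/dist-blacklist.conf nouveau && echo "nouveau is blacklisted"
```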


1.4 Rebuild the initramfs image

mv /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.bak
dracut /boot/initramfs-$(uname -r).img $(uname -r)

# Switch the default target to text mode
systemctl set-default multi-user.target
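The blacklist and the rebuilt initramfs only take effect on the next boot, so reboot before installing the driver. After the reboot, nouveau should no longer be loaded. A minimal check, assuming the usual /proc/modules layout with the module name in the first column; module_loaded is a helper name of my own:

```shell
# Return 0 when module $1 appears in the module list file $2
# (defaults to /proc/modules, where the first column is the module name).
module_loaded() {
  awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

# Usage after rebooting:
#   module_loaded nouveau && echo "nouveau is still loaded" || echo "nouveau is gone"
```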


1.5 Install the driver


yum install -y gcc gcc-c++ make kernel-devel kernel-headers
chmod a+x NVIDIA-Linux-x86_64-460.91.03.run


./NVIDIA-Linux-x86_64-460.91.03.run --kernel-source-path=/usr/src/kernels/3.10.0-1160.42.2.el7.x86_64 -k $(uname -r)
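The installer builds a kernel module, so --kernel-source-path has to point at the sources for the kernel that is actually running. If yum pulled in a kernel-devel newer than the running kernel, the directory for uname -r will be missing. A sketch that derives the path instead of hard-coding it (kernel_src_path is a hypothetical helper; /usr/src/kernels is where kernel-devel places its files on CentOS 7):

```shell
# Print the kernel-devel source directory for kernel release $1.
kernel_src_path() {
  printf '/usr/src/kernels/%s\n' "$1"
}

# Usage:
#   src=$(kernel_src_path "$(uname -r)")
#   [ -d "$src" ] || echo "missing $src: install kernel-devel-$(uname -r)"
#   ./NVIDIA-Linux-x86_64-460.91.03.run --kernel-source-path="$src" -k "$(uname -r)"
```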

1.6 Verify the driver

nvidia-smi


Tue Oct  5 22:20:52 2021       
| NVIDIA-SMI 460.91.03    Driver Version: 460.91.03    CUDA Version: 11.2     |
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla K40c          Off  | 00000000:02:00.0 Off |                    0 |
| 23%   36C    P0    66W / 235W |      0MiB / 11441MiB |      0%      Default |
|                               |                      |                  N/A |
|   1  Tesla K40c          Off  | 00000000:03:00.0 Off |                    0 |
| 23%   35C    P0    66W / 235W |      0MiB / 11441MiB |      0%      Default |
|                               |                      |                  N/A |
|   2  Tesla K40c          Off  | 00000000:83:00.0 Off |                    0 |
| 23%   34C    P0    64W / 235W |      0MiB / 11441MiB |      0%      Default |
|                               |                      |                  N/A |
|   3  Tesla K40c          Off  | 00000000:84:00.0 Off |                    0 |
| 23%   37C    P0    68W / 235W |      0MiB / 11441MiB |     39%      Default |
|                               |                      |                  N/A |
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|  No running processes found                                                 |

2. Install CUDA

2.1 Download the CUDA toolkit


  The nvidia-smi output shows CUDA Version: 11.2, so I installed that version directly.
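The CUDA Version field in the nvidia-smi header is the highest CUDA version the installed driver supports, not a toolkit that is already installed. To read it in a script, something like the following should work; smi_cuda_version is a name I made up, and the pattern assumes the header format shown in section 1.6:

```shell
# Print the "CUDA Version" value found in nvidia-smi output read on stdin.
smi_cuda_version() {
  sed -n 's/.*CUDA Version: *\([0-9.]*\).*/\1/p'
}

# Usage:
#   nvidia-smi | smi_cuda_version
```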


## Download the installer
wget https://developer.download.nvidia.com/compute/cuda/11.2.2/local_installers/cuda_11.2.2_460.32.03_linux.run

2.2 Install the toolkit

## Make the installer executable
chmod a+x cuda_11.2.2_460.32.03_linux.run

## Run the installer
sudo sh cuda_11.2.2_460.32.03_linux.run

## Installer summary
= Summary =

Driver:   Not Selected
Toolkit:  Installed in /usr/local/cuda-11.2/
Samples:  Installed in /root/, but missing recommended libraries

Please make sure that
 -   PATH includes /usr/local/cuda-11.2/bin
 -   LD_LIBRARY_PATH includes /usr/local/cuda-11.2/lib64, or, add /usr/local/cuda-11.2/lib64 to /etc/ld.so.conf and run ldconfig as root

To uninstall the CUDA Toolkit, run cuda-uninstaller in /usr/local/cuda-11.2/bin
***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 460.00 is required for CUDA 11.2 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
    sudo <CudaInstaller>.run --silent --driver

Logfile is /var/log/cuda-installer.log
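The warning in the summary is expected here, because the driver was installed separately in section 1 and deselected in the CUDA installer. What matters is that the installed driver is at least 460.00. A small sketch of that comparison, assuming GNU sort with the -V option (present on CentOS 7); version_ge is a hypothetical helper:

```shell
# Return 0 when version $1 is greater than or equal to version $2.
# Relies on GNU sort -V to order dotted version strings.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Usage:
#   drv=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
#   version_ge "$drv" 460.00 && echo "driver $drv is new enough for CUDA 11.2"
```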

2.3 Configure environment variables

vim ~/.bashrc

## Append the following lines to the end of the file
export CUDA_HOME=/usr/local/cuda-11.2
export PATH=$CUDA_HOME/bin${PATH:+:${PATH}}

## Apply the change immediately
source ~/.bashrc
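The ${PATH:+:${PATH}} expansion adds a colon and the old value only when PATH is already set, which avoids a stray trailing colon. The installer summary also asks for /usr/local/cuda-11.2/lib64 in LD_LIBRARY_PATH, which the lines above do not set. A sketch of the same idiom as a reusable helper (prepend_path is my own name):

```shell
# Print directory $1 followed by the existing colon-separated list $2;
# the colon is added only when $2 is non-empty.
prepend_path() {
  printf '%s%s\n' "$1" "${2:+:$2}"
}

# Usage in ~/.bashrc, after CUDA_HOME is set:
#   export LD_LIBRARY_PATH=$(prepend_path "$CUDA_HOME/lib64" "$LD_LIBRARY_PATH")
```

The alternative named in the summary, adding the directory to /etc/ld.so.conf and running ldconfig as root, works system-wide instead of per user.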

2.4 Verify the installation

nvcc -V
## Version information shown
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Feb_14_21:12:58_PST_2021
Cuda compilation tools, release 11.2, V11.2.152
Build cuda_11.2.r11.2/compiler.29618528_0

2.5 Test CUDA


cd /usr/local/cuda-11.2/samples/1_Utilities/deviceQuery
sudo make
./deviceQuery


./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 4 CUDA Capable device(s)

Device 0: "Tesla K40c"
  CUDA Driver Version / Runtime Version          11.2 / 11.2
  CUDA Capability Major/Minor version number:    3.5
  Total amount of global memory:                 11441 MBytes (11996954624 bytes)
  (15) Multiprocessors, (192) CUDA Cores/MP:     2880 CUDA Cores
  GPU Max Clock rate:                            745 MHz (0.75 GHz)
  Memory Clock rate:                             3004 Mhz
  Memory Bus Width:                              384-bit
  Device PCI Domain ID / Bus ID / location ID:   0 / 132 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
> Peer access from Tesla K40c (GPU0) -> Tesla K40c (GPU1) : Yes
> Peer access from Tesla K40c (GPU0) -> Tesla K40c (GPU2) : No
> Peer access from Tesla K40c (GPU0) -> Tesla K40c (GPU3) : No
> Peer access from Tesla K40c (GPU1) -> Tesla K40c (GPU0) : Yes
> Peer access from Tesla K40c (GPU1) -> Tesla K40c (GPU2) : No
> Peer access from Tesla K40c (GPU1) -> Tesla K40c (GPU3) : No
> Peer access from Tesla K40c (GPU2) -> Tesla K40c (GPU0) : No
> Peer access from Tesla K40c (GPU2) -> Tesla K40c (GPU1) : No
> Peer access from Tesla K40c (GPU2) -> Tesla K40c (GPU3) : Yes
> Peer access from Tesla K40c (GPU3) -> Tesla K40c (GPU0) : No
> Peer access from Tesla K40c (GPU3) -> Tesla K40c (GPU1) : No
> Peer access from Tesla K40c (GPU3) -> Tesla K40c (GPU2) : Yes

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.2, CUDA Runtime Version = 11.2, NumDevs = 4
Result = PASS


cd ../bandwidthTest
sudo make
./bandwidthTest


[CUDA Bandwidth Test] - Starting...
Running on...

 Device 0: Tesla K40c
 Quick Mode

 Host to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(GB/s)
   32000000			7.3

 Device to Host Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(GB/s)
   32000000			6.5

 Device to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(GB/s)
   32000000			184.9

Result = PASS

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

  If both tests end with Result = PASS, CUDA is installed correctly.
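To script this check, the output of each sample can be searched for the Result = PASS line. A minimal sketch; sample_passed is a hypothetical helper, and it assumes the sample prints the result line exactly as in the output above:

```shell
# Return 0 when the text on stdin contains a line that is exactly "Result = PASS".
sample_passed() {
  grep -qx 'Result = PASS'
}

# Usage, from the directory of a built sample:
#   ./deviceQuery | sample_passed && echo "deviceQuery OK"
#   ./bandwidthTest | sample_passed && echo "bandwidthTest OK"
```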

3. Install cuDNN



## Extract the archive
tar -xzvf cudnn-11.2-linux-x64-v8.1.0.77.tgz

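## Copy (cuDNN 8 splits its headers into several cudnn*.h files; -P preserves the library symlinks)
cp cuda/include/cudnn*.h /usr/local/cuda-11.2/include/
cp -P cuda/lib64/libcudnn* /usr/local/cuda-11.2/lib64/

## Set permissions
sudo chmod a+r /usr/local/cuda-11.2/include/cudnn*.h /usr/local/cuda-11.2/lib64/libcudnn*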
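cuDNN 8 keeps its version macros in cudnn_version.h rather than cudnn.h, so reading that header is a quick way to confirm which release was unpacked. A sketch, assuming the CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL defines each sit on their own line; cudnn_version is a helper name of my own:

```shell
# Print the cuDNN version as MAJOR.MINOR.PATCHLEVEL, read from the
# #define lines in the cudnn_version.h file given as $1.
cudnn_version() {
  awk '/^#define CUDNN_MAJOR /      { major = $3 }
       /^#define CUDNN_MINOR /      { minor = $3 }
       /^#define CUDNN_PATCHLEVEL / { patch = $3 }
       END { print major "." minor "." patch }' "$1"
}

# Usage, from the directory where the archive was extracted:
#   cudnn_version cuda/include/cudnn_version.h
```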
