Installing the NVIDIA Driver on CentOS 7 and Using the GPU in Docker Containers
Install NVIDIA Driver
1. Check whether nouveau is disabled
lsmod | grep nouveau
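If the command prints anything, nouveau is still loaded. Disable it by editing the blacklist file:
vim /etc/modprobe.d/blacklist.conf
Add the following two lines, then reboot so the change takes effect: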
blacklist nouveau
options nouveau modeset=0
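For scripted setups, the manual vim edit above can be replaced by a small helper. The following is a minimal sketch, not a definitive implementation: blacklist_nouveau is a hypothetical function name, and the demonstration writes to a scratch file rather than the real /etc/modprobe.d/blacklist.conf.

```shell
#!/bin/sh
# Sketch: append the nouveau blacklist entries to a modprobe config file,
# skipping any entry that is already present so the helper is safe to re-run.
# blacklist_nouveau is a hypothetical helper name, not part of any tool.
blacklist_nouveau() {
    conf="$1"
    touch "$conf"
    for line in "blacklist nouveau" "options nouveau modeset=0"; do
        # -x matches the whole line, -F treats the pattern as a fixed string
        grep -qxF "$line" "$conf" || printf '%s\n' "$line" >> "$conf"
    done
}

# Demonstration on a scratch file. On a real host the target would be
# /etc/modprobe.d/blacklist.conf, written as root.
scratch="$(mktemp)"
blacklist_nouveau "$scratch"
blacklist_nouveau "$scratch"   # second call changes nothing
cat "$scratch"
rm -f "$scratch"
```

On CentOS 7 nouveau is often included in the initramfs, so many guides also rebuild it with dracut --force before rebooting. Treat that step as an assumption to verify on your own system. After the reboot, lsmod | grep nouveau should print nothing.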
2. Install the ELRepo repository
yum -y install https://www.elrepo.org/elrepo-release-7.el7.elrepo.noarch.rpm
3. Install nvidia-detect (identifies the appropriate driver version)
yum -y install nvidia-detect
Check the recommended driver version:
nvidia-detect -v
4. Install the GPU driver
yum -y install kmod-nvidia
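Steps 3 and 4 can be chained so that the package reported by nvidia-detect is the one that gets installed. The sketch below validates the reported name first. driver_package is a hypothetical helper, and the accepted pattern (kmod-nvidia, optionally with a legacy suffix such as -390xx) is an assumption about what nvidia-detect prints when run without options.

```shell
#!/bin/sh
# Sketch: validate the package name reported by nvidia-detect before handing
# it to yum. The accepted pattern is an assumption about nvidia-detect's
# output, not a documented contract.
driver_package() {
    if printf '%s\n' "$1" | grep -Eqx 'kmod-nvidia(-[0-9]+xx)?'; then
        printf '%s\n' "$1"
    else
        echo "unexpected nvidia-detect output: $1" >&2
        return 1
    fi
}

# Demonstration with sample strings instead of a live nvidia-detect call.
driver_package "kmod-nvidia"
driver_package "kmod-nvidia-390xx"

# On a real host (as root) this could become:
#   pkg="$(driver_package "$(nvidia-detect)")" && yum -y install "$pkg"
```

Validating the name before installing avoids passing an error message or an empty string to yum when no supported GPU is found.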
5. Check that the installed version matches the detected version
6. Check that the driver works
nvidia-smi
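Steps 5 and 6 can be combined into one check. The sketch below compares two version strings. same_version is a hypothetical helper. Note that it compares the installed package against the loaded driver, which is a slightly different check from comparing against the nvidia-detect result. The demonstration uses sample values, and the commands that would collect the real values are shown as comments.

```shell
#!/bin/sh
# Sketch: compare the packaged driver version with the version the loaded
# driver reports. same_version is a hypothetical helper name.
same_version() {
    if [ "$1" = "$2" ]; then
        echo "OK: driver $1 is installed and loaded"
    else
        echo "MISMATCH: package is $1 but loaded driver reports $2" >&2
        return 1
    fi
}

# Demonstration with sample version strings.
same_version "450.51.06" "450.51.06"

# On a real host the two values could be collected like this:
#   pkg_ver="$(rpm -q --queryformat '%{VERSION}' kmod-nvidia)"
#   drv_ver="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
#   same_version "$pkg_ver" "$drv_ver"
```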
Setting up NVIDIA Container Toolkit
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
&& curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.repo | sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
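The distribution variable in the command above is built by sourcing /etc/os-release and concatenating ID and VERSION_ID. The sketch below reproduces that derivation against a sample file so the result can be inspected without touching the system. distribution_id is a hypothetical helper, and the sample file only mimics the relevant fields of a CentOS 7 os-release.

```shell
#!/bin/sh
# Sketch: show how $ID$VERSION_ID is derived from an os-release file.
# distribution_id is a hypothetical helper; the subshell body keeps the
# sourced variables out of the calling shell.
distribution_id() (
    . "$1"
    echo "$ID$VERSION_ID"
)

# Demonstration with a sample file that mimics CentOS 7's /etc/os-release.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
VERSION_ID="7"
EOF
distribution_id "$sample"   # prints centos7
rm -f "$sample"
```

With that value, the repository URL in the curl command resolves to https://nvidia.github.io/libnvidia-container/centos7/libnvidia-container.repo.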
sudo yum-config-manager --enable libnvidia-container-experimental
sudo yum clean expire-cache
sudo yum install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
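The nvidia-ctk runtime configure command registers the NVIDIA runtime in Docker's daemon configuration, by default /etc/docker/daemon.json. A quick way to confirm that the entry exists is sketched below. has_nvidia_runtime is a hypothetical helper, the sample JSON is an assumption about what the configured file may look like, and the plain-text grep is a rough check rather than a JSON parser.

```shell
#!/bin/sh
# Sketch: check that a Docker daemon.json registers a runtime named "nvidia".
# has_nvidia_runtime is a hypothetical helper, and the grep below is a rough
# text match, not a JSON parser.
has_nvidia_runtime() {
    grep -q '"nvidia"[[:space:]]*:' "$1"
}

# Demonstration with a sample file; the content is an assumption about what
# the configured file may look like.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
{
    "runtimes": {
        "nvidia": {
            "args": [],
            "path": "nvidia-container-runtime"
        }
    }
}
EOF
if has_nvidia_runtime "$sample"; then
    echo "nvidia runtime is registered"
fi
rm -f "$sample"

# On a real host: has_nvidia_runtime /etc/docker/daemon.json
```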
sudo docker run --rm --runtime=nvidia --gpus all nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.51.06    Driver Version: 450.51.06    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            On   | 00000000:00:1E.0 Off |                    0 |
| N/A   34C    P8     9W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
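Reading the table by eye works for a one-off check. For an automated check, nvidia-smi -L prints one line per device, which is easier to parse. The sketch below counts those lines. count_gpus is a hypothetical helper that reads from standard input, and the UUID in the sample line is a placeholder.

```shell
#!/bin/sh
# Sketch: count GPUs in the text printed by `nvidia-smi -L`, which lists one
# line per device. count_gpus is a hypothetical helper that reads stdin.
count_gpus() {
    grep -c '^GPU [0-9][0-9]*:'
}

# Demonstration with a sample line; the UUID is a placeholder.
printf 'GPU 0: Tesla T4 (UUID: GPU-00000000-0000-0000-0000-000000000000)\n' | count_gpus

# On a real host the same container invocation could feed the helper:
#   sudo docker run --rm --runtime=nvidia --gpus all \
#       nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi -L | count_gpus
```

A count of zero, or an error from docker run, suggests that the container cannot see the GPU and that the driver and runtime configuration steps should be rechecked.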