- [Linux][Docker] Using GPUs in Docker on CentOS
Official documentation: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker
- Installation and usage steps:
1. Install docker-ce (yum-config-manager comes from the yum-utils package). Add one of the two repositories:
Official repo: sudo yum-config-manager --add-repo=https://download.docker.com/linux/centos/docker-ce.repo
Aliyun mirror: sudo yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
sudo yum install docker-ce -y
sudo systemctl start docker
sudo systemctl enable docker
sudo systemctl status docker    # check the service status
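To check the result of step 1 from a script instead of reading the status output by eye, a minimal sketch could look like this. The helper name docker_service_active is hypothetical; it relies only on systemctl is-active, which prints "active" for a running unit.

```shell
# Hypothetical helper: succeed only if systemd reports the docker service as active.
docker_service_active() {
    [ "$(systemctl is-active docker 2>/dev/null)" = "active" ]
}
```

Example: docker_service_active && echo "docker is running"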
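2. There are three ways to let Docker containers use the host GPUs:
Method 1: install nvidia-docker. NVIDIA has deprecated this approach.
Method 2: install nvidia-container-toolkit (the package that nvidia-docker2 is built on), then pass the --gpus flag to docker run. This requires Docker 19.03 or later.
Method 3: install nvidia-container-runtime and pass --runtime=nvidia on the first docker run. Later docker start and docker stop calls for that container do not need the flag again.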
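Method 3 can be sketched as the lifecycle below: the runtime is chosen once at docker run, and the later stop and start take only the container name. This is an illustration, not an official procedure. The function name gpu_container_lifecycle, the container name gpu-demo, and the sleep infinity placeholder command are assumptions; the image tag is the one used elsewhere in this note.

```shell
# Hypothetical helper: create a container with the nvidia runtime, then stop
# and start it again to show that the flag is only needed at creation time.
gpu_container_lifecycle() {
    local name="${1:-gpu-demo}"
    # The runtime is recorded in the container configuration here.
    docker run -d --name "$name" --runtime=nvidia \
        nvidia/cuda:10.1-cudnn7-devel-centos7 sleep infinity
    docker stop "$name"
    # No --runtime flag: docker start reuses the stored configuration.
    docker start "$name"
}
```

Example: gpu_container_lifecycle my-gpu-box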
Official installation:
# Using nvidia-container-toolkit
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
&& curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.repo | sudo tee /etc/yum.repos.d/nvidia-docker.repo
sudo yum install -y nvidia-container-toolkit
# Using nvidia-container-runtime (same repository as above)
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
&& curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.repo | sudo tee /etc/yum.repos.d/nvidia-docker.repo
sudo yum install -y nvidia-container-runtime
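The distribution variable in the commands above is just ID and VERSION_ID from /etc/os-release joined together, so on CentOS 7 it should come out as centos7 and select the centos7 repo file. A small sketch to inspect the value before fetching the repo; the function name repo_distribution is hypothetical, and the optional file argument exists only to make it easy to try on a copy.

```shell
# Hypothetical helper: compute the same "$ID$VERSION_ID" string as the install
# commands, from any os-release style file (defaults to /etc/os-release).
repo_distribution() {
    local os_release="${1:-/etc/os-release}"
    # Source the file in a subshell so ID and VERSION_ID do not leak out.
    ( . "$os_release" && echo "${ID}${VERSION_ID}" )
}
```

Example: echo "https://nvidia.github.io/nvidia-docker/$(repo_distribution)/nvidia-docker.repo"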
Usage:
docker pull nvidia/cuda:10.1-cudnn7-devel-centos7
docker run -it --gpus all nvidia/cuda:10.1-cudnn7-devel-centos7 /bin/bash
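A quick way to confirm that the GPUs are actually visible, without opening an interactive shell, is to run nvidia-smi in a throwaway container. This is a sketch that assumes the host NVIDIA driver is already installed; the function name gpu_smoke_test is hypothetical, and the default image is the one pulled above.

```shell
# Hypothetical helper: run nvidia-smi in a temporary container (removed on
# exit by --rm) to check that the GPUs are exposed through --gpus all.
gpu_smoke_test() {
    local image="${1:-nvidia/cuda:10.1-cudnn7-devel-centos7}"
    docker run --rm --gpus all "$image" nvidia-smi
}
```

Example: gpu_smoke_test && echo "GPU visible inside the container"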
Troubleshooting:
# docker run -it --gpus all nvidia/cuda:10.1-cudnn8-devel-centos7 /bin/bash
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].
Solution:
1. Reinstall nvidia-container-toolkit and confirm that the packages are present
yum install -y nvidia-container-toolkit
yum list installed | grep nvidia
# libnvidia-container-tools.x86_64 1.5.0-1 @libnvidia-container
# libnvidia-container1.x86_64 1.5.0-1 @libnvidia-container
# nvidia-container-toolkit.x86_64 1.5.1-2 @nvidia-container-runtime
2. Restart Docker
systemctl restart docker
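The two steps above can be wrapped in one function. The sketch below is a variation, not the exact procedure from this note: it uses rpm -q to skip the install when the package is already there, and it always restarts Docker. The function name fix_gpu_driver_error is hypothetical.

```shell
# Hypothetical helper: install nvidia-container-toolkit only if rpm does not
# already report it, then restart Docker so the daemon picks it up.
fix_gpu_driver_error() {
    if ! rpm -q nvidia-container-toolkit >/dev/null 2>&1; then
        sudo yum install -y nvidia-container-toolkit || return 1
    fi
    sudo systemctl restart docker
}
```

After it returns, rerun the docker run --gpus all command that failed.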