1. Install Docker
Remove old versions:
sudo apt-get remove docker docker-engine docker.io
Update the package index:
sudo apt-get update
Install apt-transport-https and related packages so apt can use repositories over HTTPS:
sudo apt-get install apt-transport-https ca-certificates curl software-properties-common
Add the repository's GPG key:
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
Add the official Docker repository:
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
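To see what the command above actually registers, here is a minimal sketch that prints the apt source line for a given Ubuntu codename. The function name docker_repo_line is hypothetical and only used for illustration. It writes nothing to the system.

```shell
# docker_repo_line CODENAME: print the apt source line that the
# add-apt-repository command above would add for that Ubuntu codename.
# (hypothetical helper, not part of Docker or apt)
docker_repo_line() {
    printf 'deb [arch=amd64] https://download.docker.com/linux/ubuntu %s stable\n' "$1"
}

# Example: docker_repo_line "$(lsb_release -cs)"
```

If the printed codename is not one Docker publishes packages for, the apt-get update that follows will likely report an error for this repository.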
Refresh the package cache again:
sudo apt-get update
Install docker-ce:
sudo apt-get install docker-ce
Check the Docker version:
sudo docker version
Use Docker without sudo:
sudo groupadd docker
sudo gpasswd -a ${USER} docker
sudo service docker restart
newgrp - docker
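To confirm that the group change took effect in the current shell, a small check like the following can help. This is only a sketch: the function name has_group is hypothetical, and it only inspects a space-separated list of group names such as the one printed by `id -nG`.

```shell
# has_group GROUP "LIST": succeed if GROUP appears as a whole word
# in the space-separated LIST (hypothetical helper for illustration).
has_group() {
    for g in $2; do
        if [ "$g" = "$1" ]; then
            return 0
        fi
    done
    return 1
}

# Example: has_group docker "$(id -nG)" && echo "docker group is active"
```

If the check fails in an existing terminal, logging out and back in is usually enough for the new group membership to be picked up.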
2. Install nvidia-docker:
Official installation guide:
https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html
Remove nvidia-docker 1.0:
docker volume ls -q -f driver=nvidia-docker | xargs -r -I{} -n1 docker ps -q -a -f volume={} | xargs -r docker rm -f
sudo apt-get purge nvidia-docker
Set up the repository and GPG key:
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
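The repository URL depends on the value of $distribution, so it can be worth checking that value before fetching the list file. The sketch below reproduces the same ID plus VERSION_ID concatenation for any os-release style file. The function name distro_id is hypothetical.

```shell
# distro_id FILE: source an os-release style file in a subshell and
# print ID immediately followed by VERSION_ID, e.g. ubuntu20.04.
# (hypothetical helper; the subshell keeps ID/VERSION_ID out of the caller)
distro_id() {
    ( . "$1" && printf '%s%s\n' "$ID" "$VERSION_ID" )
}

# Example: distro_id /etc/os-release
```

On Ubuntu 20.04 this prints ubuntu20.04. If the value printed on your system has no matching directory under nvidia.github.io/nvidia-docker, the curl command above will not fetch a usable list file.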
(Optional) If you have an A100 (out of reach for those of us on a budget):
curl -s -L https://nvidia.github.io/nvidia-container-runtime/experimental/$distribution/nvidia-container-runtime.list | sudo tee /etc/apt/sources.list.d/nvidia-container-runtime.list
Install nvidia-docker2 and its dependencies:
sudo apt-get update
sudo apt-get install -y nvidia-docker2
Restart Docker:
sudo systemctl restart docker
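After the restart, one way to check that the NVIDIA runtime was registered is to look at the "Runtimes:" line of `docker info`. The following is a minimal sketch under that assumption. The function name has_nvidia_runtime is hypothetical, and it only parses text given on standard input.

```shell
# has_nvidia_runtime: read `docker info` output on stdin and succeed
# if the "Runtimes:" line lists nvidia as a whole word.
# (hypothetical helper for illustration)
has_nvidia_runtime() {
    grep -E '^[[:space:]]*Runtimes:' | grep -qw nvidia
}

# Example: docker info 2>/dev/null | has_nvidia_runtime && echo "nvidia runtime registered"
```

If the runtime is not listed, check that nvidia-docker2 installed without errors and that the Docker daemon was actually restarted.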