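I ran into a few pitfalls while installing docker and nvidia-docker, so I am recording them here.
The biggest source of trouble: installing the wrong version of nvidia-docker
The commands below install nvidia-docker1 by default, and the second command failed with an error.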
wget -P /tmp https://github.com/NVIDIA/nvidia-docker/releases/download/v1.0.1/nvidia-docker_1.0.1-1_amd64.deb
sudo dpkg -i /tmp/nvidia-docker*.deb && rm /tmp/nvidia-docker*.deb
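Before reinstalling anything, it can help to confirm which nvidia-docker generation is actually on the machine. This is a minimal sketch, assuming a Debian/Ubuntu system with dpkg-query available; the package names nvidia-docker (v1) and nvidia-docker2 (v2) are taken from the commands in this post, and major_version and report_nvidia_docker are hypothetical helper names.

```shell
#!/bin/sh
# Print the major version from a package version string such as "1.0.1-1".
major_version() {
    printf '%s\n' "${1%%.*}"
}

# Report which nvidia-docker package is installed, if any.
# Assumption: v1 is packaged as "nvidia-docker" and v2 as "nvidia-docker2".
report_nvidia_docker() {
    for pkg in nvidia-docker nvidia-docker2; do
        # dpkg-query fails for unknown packages; skip those quietly.
        ver=$(dpkg-query -W -f='${Version}' "$pkg" 2>/dev/null) || continue
        if [ -n "$ver" ]; then
            echo "$pkg $ver (major $(major_version "$ver"))"
        fi
    done
}

report_nvidia_docker
```

If this prints a line for nvidia-docker with major 1, the old version is installed and should be removed before installing nvidia-docker2.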
The fix: uninstall docker and nvidia-docker, then reinstall both.
Uninstall:
1. Uninstall docker
sudo apt-get remove docker docker-engine docker.io
2. Uninstall nvidia-docker
docker volume ls -q -f driver=nvidia-docker | xargs -r -I{} -n1 docker ps -q -a -f volume={} | xargs -r docker rm -f
sudo apt-get purge nvidia-docker
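A quick way to check that the removal worked is to see whether the old commands are still on PATH. This is only a sketch: still_installed is a hypothetical helper, and it assumes both packages installed their executables under the names docker and nvidia-docker.

```shell
#!/bin/sh
# Print each given command name that is still found on PATH.
still_installed() {
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd"
        fi
    done
}

leftover=$(still_installed docker nvidia-docker)
if [ -n "$leftover" ]; then
    # Left unquoted on purpose so the names print on one line.
    echo "still on PATH:" $leftover
else
    echo "docker and nvidia-docker are no longer on PATH"
fi
```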
Install:
1. Install docker
Reference: https://yeasy.gitbooks.io/docker_practice/install/ubuntu.html
$ sudo apt-get update
$ sudo apt-get install \
apt-transport-https \
ca-certificates \
curl \
software-properties-common
$ curl -fsSL https://mirrors.ustc.edu.cn/docker-ce/linux/ubuntu/gpg | sudo apt-key add -
$ sudo add-apt-repository \
"deb [arch=amd64] https://mirrors.ustc.edu.cn/docker-ce/linux/ubuntu \
$(lsb_release -cs) \
stable"
$ sudo apt-get update
$ sudo apt-get install docker-ce
Start docker
$ sudo systemctl enable docker
$ sudo systemctl start docker
Test docker
$ docker run hello-world
2. Install nvidia-docker2
References: https://github.com/NVIDIA/nvidia-docker and https://blog.csdn.net/zh_jessica/article/details/79644544
# Add the package repositories
$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
$ curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
$ curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
$ sudo apt-get update
$ sudo apt-get install -y nvidia-docker2
$ sudo pkill -SIGHUP dockerd
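After the daemon reloads, it may be worth checking that docker actually knows about the nvidia runtime before running a GPU container. The sketch below assumes that docker info prints a "Runtimes:" line listing the registered runtimes and that nvidia-docker2 registers its runtime in /etc/docker/daemon.json; has_nvidia_runtime is a hypothetical helper name.

```shell
#!/bin/sh
# Read `docker info` output on stdin; succeed if the "Runtimes:" line lists nvidia.
has_nvidia_runtime() {
    grep -Eq '^[[:space:]]*Runtimes:.*nvidia'
}

if docker info 2>/dev/null | has_nvidia_runtime; then
    echo "nvidia runtime is registered"
else
    echo "nvidia runtime not found; check /etc/docker/daemon.json"
fi
```

If docker info needs root on your machine, run the check with sudo.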
Test nvidia-docker
docker run --runtime=nvidia --rm nvidia/cuda:8.0-devel nvidia-smi
On success, this prints the driver information reported by nvidia-smi.