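Before you install:
In production, Docker images and containers accumulate over time and take up more and more disk space, so it is worth dedicating a separate disk to Docker's data. Mount that disk first, before you change Docker's storage path and before you pull any images. For disk mounting, see https://blog.csdn.net/wuqingshan2010/article/details/111663416
nvidia-docker installation reference: https://nvidia.github.io/nvidia-docker/
1. Uninstall any existing Docker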
sudo apt-get remove docker docker-engine
If that does not work, uninstall with the following commands:
sudo apt-get remove docker docker-engine docker-ce docker.io
sudo apt-get purge docker
sudo apt-get autoremove docker
sudo rm -rf /var/lib/docker
2. Install the required dependencies
sudo apt-get install apt-transport-https ca-certificates curl gnupg-agent software-properties-common
3. Add Docker's official GPG key
curl -fsSL https://mirrors.ustc.edu.cn/docker-ce/linux/ubuntu/gpg | sudo apt-key add -
or
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo apt-key fingerprint 0EBFCD88
Expected output:
pub rsa4096 2017-02-22 [SCEA]
9DC8 5822 9FC7 DD38 854A E2D8 8D81 803C 0EBF CD88
uid [ unknown] Docker Release (CE deb) <docker@docker.com>
sub rsa4096 2017-02-22 [S]
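The fingerprint check above can also be scripted. The following is a minimal sketch, not part of Docker's tooling: the helper name has_docker_fingerprint is hypothetical, and it only strips whitespace from the text it reads on standard input and looks for the full fingerprint shown in the expected output.

```shell
#!/usr/bin/env bash
# Full fingerprint from the expected output above, with spaces removed.
EXPECTED_FPR="9DC858229FC7DD38854AE2D88D81803C0EBFCD88"

# has_docker_fingerprint (hypothetical helper): reads text on stdin and
# succeeds if it contains the expected fingerprint, ignoring whitespace.
has_docker_fingerprint() {
    tr -d '[:space:]' | grep -q "$EXPECTED_FPR"
}

# Usage (needs apt-key and the key from step 3 already added):
#   sudo apt-key fingerprint 0EBFCD88 | has_docker_fingerprint && echo "key OK"
```

A non-zero exit status means the fingerprint was not found, in which case the key should not be trusted.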
4. Add the official docker-ce repository
sudo add-apt-repository "deb [arch=amd64] https://mirrors.ustc.edu.cn/docker-ce/linux/ubuntu/ $(lsb_release -cs) stable"
sudo apt-get update
or
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
sudo apt-get update
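Note:
To load an image such as nvidia/cuda:10.2-devel-ubuntu18.04 with docker load -i on Ubuntu 20.04, you must install the docker-ce build for Ubuntu 18.04 (bionic). To do so, replace $(lsb_release -cs) in the repository line with bionic:
sudo add-apt-repository "deb [arch=amd64] https://mirrors.ustc.edu.cn/docker-ce/linux/ubuntu/ bionic stable"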
or edit the sources list by hand and add the repository line:
sudo gedit /etc/apt/sources.list
deb [arch=amd64] https://mirrors.ustc.edu.cn/docker-ce/linux/ubuntu bionic stable
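The repository lines in this step differ only in the mirror URL and the release codename. As a minimal sketch, the line can be generated instead of typed; docker_repo_line is a hypothetical helper name, not something shipped with Docker or apt.

```shell
#!/usr/bin/env bash
# docker_repo_line (hypothetical helper): print the apt source line for
# docker-ce given a base URL and an Ubuntu codename.
docker_repo_line() {
    local base_url="${1%/}"    # drop one trailing slash, if present
    local codename="$2"
    printf 'deb [arch=amd64] %s %s stable\n' "$base_url" "$codename"
}

# Default: use the codename of the running system.
#   sudo add-apt-repository "$(docker_repo_line https://mirrors.ustc.edu.cn/docker-ce/linux/ubuntu "$(lsb_release -cs)")"
# Override: force the Ubuntu 18.04 (bionic) packages, as in the note above.
#   sudo add-apt-repository "$(docker_repo_line https://mirrors.ustc.edu.cn/docker-ce/linux/ubuntu bionic)"
```

Keeping the URL and the codename as separate arguments avoids the easy mistake of fusing them into a single path.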
5. List the available docker-ce versions and install the one you need
sudo apt-cache madison docker-ce
sudo apt install docker-ce
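To pin a specific version rather than the default candidate, the version string has to be copied from the second column of the madison listing. A minimal sketch of extracting that column follows; madison_versions is a hypothetical helper name, and it assumes the usual pipe-separated layout of apt-cache madison output.

```shell
#!/usr/bin/env bash
# madison_versions (hypothetical helper): read `apt-cache madison` output on
# stdin and print only the version column, one version per line.
madison_versions() {
    awk -F'|' 'NF >= 2 { gsub(/^[ \t]+|[ \t]+$/, "", $2); print $2 }'
}

# Usage (needs the docker-ce repository from step 4):
#   apt-cache madison docker-ce | madison_versions
#   sudo apt install docker-ce=<one of the printed versions>
```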
6. Check the installed version
sudo docker version
7. Add the nvidia-docker2 GPG key and repository
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
That is, /etc/apt/sources.list.d/nvidia-docker.list now contains the following (for Ubuntu 20.04):
deb https://nvidia.github.io/libnvidia-container/stable/ubuntu20.04/$(ARCH) /
#deb https://nvidia.github.io/libnvidia-container/experimental/ubuntu20.04/$(ARCH) /
deb https://nvidia.github.io/nvidia-container-runtime/stable/ubuntu20.04/$(ARCH) /
#deb https://nvidia.github.io/nvidia-container-runtime/experimental/ubuntu20.04/$(ARCH) /
deb https://nvidia.github.io/nvidia-docker/ubuntu20.04/$(ARCH) /
or, for Ubuntu 18.04:
deb https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/$(ARCH) /
#deb https://nvidia.github.io/libnvidia-container/experimental/ubuntu18.04/$(ARCH) /
deb https://nvidia.github.io/nvidia-container-runtime/stable/ubuntu18.04/$(ARCH) /
#deb https://nvidia.github.io/nvidia-container-runtime/experimental/ubuntu18.04/$(ARCH) /
deb https://nvidia.github.io/nvidia-docker/ubuntu18.04/$(ARCH) /
sudo apt-get update
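The distribution variable in this step is built from two fields of /etc/os-release. The sketch below isolates that logic so it can be inspected before any repository file is written; distribution_id is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# distribution_id (hypothetical helper): print $ID$VERSION_ID from an
# os-release style file (default /etc/os-release). The file is sourced in a
# subshell so its variables do not leak into the calling shell.
distribution_id() {
    local file="${1:-/etc/os-release}"
    ( . "$file" && printf '%s%s\n' "$ID" "$VERSION_ID" )
}

# Usage:
#   distribution=$(distribution_id)
#   echo "$distribution"
```

If the printed value is not one of the directories published under https://nvidia.github.io/nvidia-docker/, the curl command in this step will not return a usable list file.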
8. List the available versions
sudo apt-cache madison nvidia-docker2
sudo apt-cache madison nvidia-container-runtime
9. Install a specific version of nvidia-docker2 and nvidia-container-runtime
sudo apt-get install nvidia-docker2=2.0.3+docker18.06.2-1
sudo apt-get install nvidia-container-runtime=2.0.0+docker18.06.2-1
or, to install the latest available version instead:
sudo apt install nvidia-docker2
If you want to load images based on Ubuntu 18.04, see the note in step 4 and pin matching versions, for example:
sudo apt install docker-ce=5:18.09.7~3-0~ubuntu-bionic nvidia-container-runtime=2.0.0+docker18.09.7-3 nvidia-docker2=2.0.3+docker18.09.7-3
10. Verify with nvidia-smi
sudo docker run --runtime=nvidia --rm nvidia/cuda:10.0-base nvidia-smi
11. Change Docker's storage path
- View detailed information:
sudo docker info
- Check the configuration
By default, Docker stores its data in /var/lib/docker.
- First, stop the Docker service:
sudo systemctl stop docker
or
sudo service docker stop
- Move the root directory and create a symbolic link
sudo mv /var/lib/docker /home/work/docker_root
sudo ln -s /home/work/docker_root /var/lib/docker
- Restart the service
sudo systemctl daemon-reload
sudo systemctl restart docker
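After the restart it is worth confirming that the symbolic link really resolves to the new location. A minimal sketch of such a check follows; link_points_to is a hypothetical helper name, and it relies on readlink -f as provided by GNU coreutils on Ubuntu.

```shell
#!/usr/bin/env bash
# link_points_to (hypothetical helper): succeed if LINK is a symbolic link
# that resolves to the same location as TARGET.
link_points_to() {
    local link="$1" target="$2"
    [ -L "$link" ] && [ "$(readlink -f "$link")" = "$(readlink -f "$target")" ]
}

# Usage, with the paths from this step:
#   link_points_to /var/lib/docker /home/work/docker_root && echo "symlink OK"
```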
12. Pull the official CUDA 10.0 image based on Ubuntu 18.04
sudo docker pull nvidia/cuda:10.0-devel-ubuntu18.04
Available images and tags: https://hub.docker.com/r/nvidia/cuda
13. Modify the image
- Start a container from the nvidia/cuda:10.0-devel-ubuntu18.04 image:
sudo docker run -it nvidia/cuda:10.0-devel-ubuntu18.04 /bin/bash
- Install openssh-server and switch to the Tsinghua mirror
apt update
apt install openssh-server
apt install vim
vim /etc/apt/sources.list
apt update
apt install libopencv-dev
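The mirror switch can also be done without an interactive editor. The following is a minimal sketch under two assumptions that are not stated in the original: the image's sources.list uses the default archive.ubuntu.com and security.ubuntu.com hosts, and the Tsinghua mirror is reached at mirrors.tuna.tsinghua.edu.cn. use_tuna_mirror is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# use_tuna_mirror (hypothetical helper): rewrite the Ubuntu archive hosts in
# an apt sources file to the Tsinghua (TUNA) mirror. A .bak copy is kept.
use_tuna_mirror() {
    local file="${1:-/etc/apt/sources.list}"
    sed -i.bak \
        -e 's|http://archive.ubuntu.com|http://mirrors.tuna.tsinghua.edu.cn|g' \
        -e 's|http://security.ubuntu.com|http://mirrors.tuna.tsinghua.edu.cn|g' \
        "$file"
}

# Usage inside the container (already root), followed by a refresh:
#   use_tuna_mirror && apt update
```

Plain http is used here because a fresh image may not have ca-certificates installed yet.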
14. Configure SSH so that CLion can connect
vi /etc/ssh/sshd_config
Uncomment the Port 22 line.
Set PermitRootLogin to yes.
Set PermitEmptyPasswords to yes.
apt install gdb
apt install rsync
passwd root
/etc/init.d/ssh restart
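The three sshd_config edits in this step can be applied non-interactively, which helps when the image has to be rebuilt. This is a minimal sketch, assuming GNU sed and that each directive already appears in the file, either active or commented out with a leading #; configure_sshd is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# configure_sshd (hypothetical helper): apply the three settings from this
# step to an sshd_config file, whether or not the lines are commented out.
# Directives that are missing from the file entirely are NOT added.
configure_sshd() {
    local file="${1:-/etc/ssh/sshd_config}"
    sed -i -E \
        -e 's/^#?Port .*/Port 22/' \
        -e 's/^#?PermitRootLogin .*/PermitRootLogin yes/' \
        -e 's/^#?PermitEmptyPasswords .*/PermitEmptyPasswords yes/' \
        "$file"
}

# Usage inside the container (already root):
#   configure_sshd && /etc/init.d/ssh restart
```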
15. Save the image
exit
sudo docker commit <container ID> nvidia/cuda:10.0-devel-ubuntu18.04
16. Create the container
sudo docker container run --gpus all --restart=always --privileged -d --name PC_cuda100_20100 -p 20100:22 -it --ipc=host -v /home:/home nvidia/cuda:10.0-devel-ubuntu18.04 /usr/sbin/sshd -D
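The container name in this command appears to encode the CUDA version and the host port mapped to SSH. That reading is inferred from the example and is not stated in the original. Under that assumption, a minimal sketch for generating consistent names when several containers run on one host could look like this; container_name is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# container_name (hypothetical helper): build a name of the form
# PC_cuda<version without dots>_<host ssh port>.
container_name() {
    local cuda_version="$1" ssh_port="$2"
    printf 'PC_cuda%s_%s\n' "$(printf '%s' "$cuda_version" | tr -d '.')" "$ssh_port"
}

# Usage:
#   name=$(container_name 10.0 20100)
#   sudo docker container run --gpus all -d --name "$name" -p 20100:22 ...
# Then connect from the host, or from CLion, on the mapped port:
#   ssh -p 20100 root@<host address>
```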