Setting up nvidia-docker and remote debugging with CLion

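Before you install:
In production use, Docker images and containers keep accumulating and take up more and more disk space, so it is worth dedicating a separate disk to Docker's data. Mount that disk first, before changing Docker's storage path and before pulling any images. For disk mounting, see https://blog.csdn.net/wuqingshan2010/article/details/111663416
nvidia-docker installation reference: https://nvidia.github.io/nvidia-docker/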

1. Remove any existing Docker installation

sudo apt-get remove docker docker-engine

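If that fails, remove Docker with the following commands (note that the last one deletes all existing images, containers and volumes under /var/lib/docker):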

sudo apt-get remove docker docker-engine docker-ce docker.io
sudo apt-get purge docker
sudo apt-get autoremove docker
sudo rm -rf /var/lib/docker

2. Install the prerequisite packages

sudo apt-get install apt-transport-https ca-certificates curl gnupg-agent software-properties-common

3. Add Docker's official GPG key

sudo curl -fsSL https://mirrors.ustc.edu.cn/docker-ce/linux/ubuntu/gpg | sudo apt-key add -

or

sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo apt-key fingerprint 0EBFCD88

Expected output:

pub   rsa4096 2017-02-22 [SCEA]
      9DC8 5822 9FC7 DD38 854A  E2D8 8D81 803C 0EBF CD88
uid           [ unknown] Docker Release (CE deb) <docker@docker.com>
sub   rsa4096 2017-02-22 [S]

4. Add the official docker-ce repository

sudo add-apt-repository  "deb [arch=amd64] https://mirrors.ustc.edu.cn/docker-ce/linux/ubuntu/ $(lsb_release -cs)  stable"
sudo apt-get update

or

sudo add-apt-repository  "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs)  stable"

Note:
On Ubuntu 20.04, loading the nvidia/cuda:10.2-devel-ubuntu18.04 image with docker load -i requires the docker-ce build for Ubuntu 18.04 (bionic), so replace $(lsb_release -cs) with bionic:

sudo add-apt-repository "deb [arch=amd64] https://mirrors.ustc.edu.cn/docker-ce/linux/ubuntu/ bionic stable"

or add the repository line to /etc/apt/sources.list by hand:

sudo gedit /etc/apt/sources.list
deb [arch=amd64] https://mirrors.ustc.edu.cn/docker-ce/linux/ubuntu bionic stable
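
The codename substitution described in the note can be sketched as a small shell helper. This is only an illustration: the function name docker_repo_line and its arguments are hypothetical, and the function just prints the repository line without changing the system.

```shell
#!/bin/sh
# Hypothetical helper: print the docker-ce apt repository line for a given
# Ubuntu codename (e.g. bionic, focal). It does not modify the system.
docker_repo_line() {
    codename="$1"
    # Mirror base URL; defaults to the USTC mirror used in this article.
    base="${2:-https://mirrors.ustc.edu.cn/docker-ce/linux/ubuntu}"
    printf 'deb [arch=amd64] %s %s stable\n' "$base" "$codename"
}

# Example: request the bionic (18.04) repository regardless of the host release.
docker_repo_line bionic
```

The printed line could then be passed on, for example as sudo add-apt-repository "$(docker_repo_line bionic)".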

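5. List the available docker-ce versions and install one

sudo apt-cache madison docker-ce
sudo apt install docker-ce

The second command installs the latest version. To install a specific version instead, append =<version> to the package name, using a version string from the madison output (step 9 shows an example).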

6. Check the installed version

sudo docker version

7. Add the nvidia-docker2 GPG key and repository

curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
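
The distribution variable is simply ID and VERSION_ID from /etc/os-release joined together, for example ubuntu20.04. A minimal sketch for checking that value before it is used in the URL follows. The function name nvidia_distribution is hypothetical, and the optional file argument exists only so the function can be tried on a sample file.

```shell
#!/bin/sh
# Hypothetical helper: print $ID$VERSION_ID from an os-release style file.
nvidia_distribution() {
    file="${1:-/etc/os-release}"
    # Source the file in a subshell so its variables do not leak out.
    ( . "$file" && printf '%s%s\n' "$ID" "$VERSION_ID" )
}

# Try it on a sample file that mimics Ubuntu 20.04.
sample="$(mktemp)"
printf 'ID=ubuntu\nVERSION_ID="20.04"\n' > "$sample"
nvidia_distribution "$sample"   # prints ubuntu20.04
rm -f "$sample"
```

Called without an argument, the function reads the host's own /etc/os-release.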

Make sure /etc/apt/sources.list.d/nvidia-docker.list contains the following (for Ubuntu 20.04):

deb https://nvidia.github.io/libnvidia-container/stable/ubuntu20.04/$(ARCH) /
#deb https://nvidia.github.io/libnvidia-container/experimental/ubuntu20.04/$(ARCH) /
deb https://nvidia.github.io/nvidia-container-runtime/stable/ubuntu20.04/$(ARCH) /
#deb https://nvidia.github.io/nvidia-container-runtime/experimental/ubuntu20.04/$(ARCH) /
deb https://nvidia.github.io/nvidia-docker/ubuntu20.04/$(ARCH) /

or, for Ubuntu 18.04:

deb https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/$(ARCH) /
#deb https://nvidia.github.io/libnvidia-container/experimental/ubuntu18.04/$(ARCH) /
deb https://nvidia.github.io/nvidia-container-runtime/stable/ubuntu18.04/$(ARCH) /
#deb https://nvidia.github.io/nvidia-container-runtime/experimental/ubuntu18.04/$(ARCH) /
deb https://nvidia.github.io/nvidia-docker/ubuntu18.04/$(ARCH) /

Then refresh the package index:

sudo apt-get update

8. List the available versions

sudo apt-cache madison nvidia-docker2
sudo apt-cache madison nvidia-container-runtime

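9. Install specific versions of nvidia-docker2 and nvidia-container-runtime

Pin both packages to matching versions from the step 8 output, for example:

sudo apt-get install nvidia-docker2=2.0.3+docker18.06.2-1 nvidia-container-runtime=2.0.0+docker18.06.2-1

or simply install the latest version:

sudo apt install nvidia-docker2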

To load Ubuntu 18.04 based images (see the note in step 4), pin matching versions of all three packages, for example:

sudo apt install docker-ce=5:18.09.7~3-0~ubuntu-bionic nvidia-container-runtime=2.0.0+docker18.09.7-3 nvidia-docker2=2.0.3+docker18.09.7-3

10. Verify the setup with nvidia-smi

sudo docker run --runtime=nvidia --rm nvidia/cuda:10.0-base nvidia-smi

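11. Change Docker's storage path

  • Show detailed information (the current location is listed as Docker Root Dir):
sudo docker info
  • Default location:
    By default Docker stores its data in /var/lib/docker
  • Stop the Docker service first (use either command):
sudo systemctl stop docker
sudo service docker stop
  • Move the data directory and create a symbolic link in its place:
sudo mv /var/lib/docker  /home/work/docker_root
sudo ln -s /home/work/docker_root  /var/lib/docker
  • Restart the service:
sudo systemctl daemon-reload
sudo systemctl restart docker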
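
The move-and-symlink step can be wrapped in a small function so that it can be rehearsed on a scratch directory first. This is a sketch and not part of the original procedure: relocate_dir is a hypothetical name, and the function deliberately refuses to run if the source is already a symlink or the destination already exists.

```shell
#!/bin/sh
# Hypothetical helper: move directory $1 to $2 and leave a symlink at $1.
# Use absolute paths so the symlink resolves from anywhere.
relocate_dir() {
    src="$1"
    dst="$2"
    # Refuse to continue if the source is missing or already a symlink,
    # or if the destination already exists.
    [ -d "$src" ] && [ ! -L "$src" ] || { echo "bad source: $src" >&2; return 1; }
    [ ! -e "$dst" ] || { echo "destination exists: $dst" >&2; return 1; }
    mv "$src" "$dst" && ln -s "$dst" "$src"
}

# Rehearsal on a scratch directory (no root needed, Docker untouched).
scratch="$(mktemp -d)"
mkdir "$scratch/docker" && touch "$scratch/docker/marker"
relocate_dir "$scratch/docker" "$scratch/docker_root"
ls -l "$scratch"
```

With Docker stopped, the real call would be relocate_dir /var/lib/docker /home/work/docker_root, run as root.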

12. Pull the official CUDA 10.0 image based on Ubuntu 18.04

sudo docker pull nvidia/cuda:10.0-devel-ubuntu18.04

Available images and tags are listed at https://hub.docker.com/r/nvidia/cuda

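13. Modify the image

  • Start a container from the nvidia/cuda:10.0-devel-ubuntu18.04 image:

sudo docker run -it nvidia/cuda:10.0-devel-ubuntu18.04 /bin/bash

  • Install openssh-server and switch to the Tsinghua mirror (run apt update first, because the base image ships without package lists, and again after editing sources.list):
apt update
apt install openssh-server
apt install vim
vim /etc/apt/sources.list
apt update
apt install libopencv-dev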

14. Configure ssh so that CLion can connect

vi /etc/ssh/sshd_config

Uncomment the Port 22 line
Set PermitRootLogin to yes
Set PermitEmptyPasswords to yes
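
The three edits can also be applied non-interactively with sed. The following is a sketch under a few assumptions: the function name configure_sshd is hypothetical, the directives are assumed to already appear in the file (commented out or not), and the file path is a parameter so the edits can be tried on a copy first.

```shell
#!/bin/sh
# Hypothetical helper: apply the three sshd_config changes listed above.
# Directives that are absent from the file are NOT added.
configure_sshd() {
    conf="${1:-/etc/ssh/sshd_config}"
    tmp="$(mktemp)" || return 1
    # Uncomment "Port 22" and force the two Permit* directives to "yes",
    # whether or not they are currently commented out.
    sed -E \
        -e 's/^#[[:space:]]*Port[[:space:]]+22[[:space:]]*$/Port 22/' \
        -e 's/^#?[[:space:]]*PermitRootLogin[[:space:]]+.*/PermitRootLogin yes/' \
        -e 's/^#?[[:space:]]*PermitEmptyPasswords[[:space:]]+.*/PermitEmptyPasswords yes/' \
        "$conf" > "$tmp" && cat "$tmp" > "$conf"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Try it on a sample file that mimics the stock configuration.
sample="$(mktemp)"
printf '#Port 22\n#PermitRootLogin prohibit-password\n#PermitEmptyPasswords no\n' > "$sample"
configure_sshd "$sample"
cat "$sample"
rm -f "$sample"
```

Writing the result back with cat rather than mv keeps the original file's owner and permissions.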

apt install gdb
apt install rsync
passwd root
/etc/init.d/ssh restart

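15. Save the image

Leave the container, then commit it under the image name (the container ID is shown by sudo docker ps -a):

exit
sudo docker commit <container ID> nvidia/cuda:10.0-devel-ubuntu18.04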

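16. Create the container

sudo docker container run --gpus all --restart=always --privileged -d --name PC_cuda100_20100 -p 20100:22 -it --ipc=host -v /home:/home nvidia/cuda:10.0-devel-ubuntu18.04 /usr/sbin/sshd -D

The container's sshd is published on host port 20100, so CLion can connect over SSH to the host on port 20100 as root. The --gpus option requires Docker 19.03 or later. With the older pinned versions from step 9, use --runtime=nvidia as in step 10.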