Docker Tutorial (Part 2): Docker Installation, GPU Acceleration, and Usage

Author: 努力发paper

Last modified: 2024-08-27 15:31:13

First published: 2024-04-19 10:51:37

Article link: https://blog.csdn.net/lizjiwei/article/details/137957160

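Docker Installation, GPU Acceleration, and Usage

1. Installing Docker

Docker is quick and easy to install and is available on all major platforms. See the official documentation for details. The installation steps and an install script are also given below.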

1.1 Docker and UFW

On Ubuntu, if you use UFW, one change is needed before Docker networking works: edit /etc/default/ufw and switch the forward policy from DROP to ACCEPT. That is, change
DEFAULT_FORWARD_POLICY="DROP"
to
DEFAULT_FORWARD_POLICY="ACCEPT"
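
The edit described above can also be scripted. The following is only a sketch: the function name set_forward_policy_accept is made up for illustration, it assumes GNU sed (the default on Ubuntu), and it is demonstrated on a scratch file rather than the real /etc/default/ufw.

```shell
#!/bin/sh
# Hypothetical helper: set DEFAULT_FORWARD_POLICY to ACCEPT in a UFW defaults file.
# The file path is an argument so the edit can be tried on a copy first.
set_forward_policy_accept() {
    ufw_file="$1"
    # Replace the whole line, whatever value it currently holds (GNU sed -i).
    sed -i 's/^DEFAULT_FORWARD_POLICY=.*/DEFAULT_FORWARD_POLICY="ACCEPT"/' "$ufw_file"
}

# Demonstration on a scratch file standing in for /etc/default/ufw.
demo_file="$(mktemp)"
printf '%s\n' 'IPV6=yes' 'DEFAULT_FORWARD_POLICY="DROP"' > "$demo_file"
set_forward_policy_accept "$demo_file"
cat "$demo_file"
```

To apply it for real, the function would have to run as root against /etc/default/ufw, followed by sudo ufw reload so that UFW picks up the new policy.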

1.2 System requirements

Docker supports the following Ubuntu releases:

Ubuntu Jammy 22.04 (LTS)
Ubuntu Groovy 20.10
Ubuntu Focal 20.04 (LTS)
Ubuntu Bionic 18.04 (LTS)

1.3 Removing old versions

Older versions of Docker were named docker or docker-engine. Remove them with:

sudo apt-get remove docker docker-engine docker.io

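1.4 Installing with APT

The apt sources are served over HTTPS so that packages cannot be tampered with in transit, so we first install the packages needed for HTTPS transport, along with the CA certificates.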

sudo apt-get update

sudo apt-get install apt-transport-https \
    ca-certificates \
    curl \
    gnupg \
    lsb-release

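Because of network conditions in mainland China, using a domestic mirror is strongly recommended. The commands below use the Aliyun mirror in place of the official source.

To verify that downloaded packages are legitimate, add the repository's GPG key.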

curl -fsSL https://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg

Then add the Docker repository to the apt sources:

echo "deb [arch=amd64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://mirrors.aliyun.com/docker-ce/linux/ubuntu \
  $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

Update the apt package cache and install docker-ce:

sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io

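1.5 Installing automatically with the script

For test and development environments, Docker provides a convenience script that simplifies installation. It works on Ubuntu, and its --mirror option selects a mirror in China: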

$ curl -fsSL get.docker.com -o get-docker.sh
$ sudo sh get-docker.sh --mirror Aliyun

After these commands run, the script takes care of all the preparation and installs the stable release of Docker on the system.

1.6 Starting Docker

sudo systemctl enable docker
sudo systemctl start docker

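1.7 Creating the docker group

By default the docker command talks to the Docker engine over a Unix socket, and only root and members of the docker group can access that socket. For security reasons, Linux systems are generally not used as root directly, so the better practice is to add the users who need docker to the docker group.

Create the docker group:

sudo groupadd docker

Add the current user to the docker group:

sudo usermod -aG docker $USER

Log out of the current terminal session and log back in, then run the test in the next section.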
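
Group membership only takes effect in a new login session, so it can be worth checking before continuing. A minimal sketch follows; has_group is a hypothetical helper name, and it simply scans the group list reported by id -nG.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a space-separated group list contains the wanted group.
has_group() {
    group_list="$1"
    wanted="$2"
    # Word splitting on $group_list is intentional: one word per group name.
    for g in $group_list; do
        [ "$g" = "$wanted" ] && return 0
    done
    return 1
}

# id -nG prints the names of all groups of the current session's user.
if has_group "$(id -nG)" docker; then
    echo "current session is in the docker group"
else
    echo "current session is not in the docker group yet; log out and back in"
fi
```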

1.8 Testing that Docker is installed correctly

$ docker run --rm hello-world

Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world
b8dfde127a29: Pull complete
Digest: sha256:308866a43596e83578c7dfa15e27a73011bdd402185a84c5cd7f32a88b501a24
Status: Downloaded newer image for hello-world:latest

Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
    (amd64)
 3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
 $ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker ID:
 https://hub.docker.com/

For more examples and ideas, visit:
 https://docs.docker.com/get-started/

2. Using Docker

2.1 Making sure Docker is ready

First check that Docker is installed and working.

$ sudo docker info
Client: Docker Engine - Community
 Version:    24.0.5
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.11.2
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.13.0
    Path:     /usr/lib/docker/cli-plugins/docker-compose
  dev: Docker Dev Environments (Docker Inc.)
    Version:  v0.0.5
    Path:     /usr/lib/docker/cli-plugins/docker-dev
  extension: Manages Docker extensions (Docker Inc.)
    Version:  v0.2.16
    Path:     /usr/lib/docker/cli-plugins/docker-extension
  sbom: View the packaged-based Software Bill Of Materials (SBOM) for an image (Anchore Inc.)
    Version:  0.6.0
    Path:     /usr/lib/docker/cli-plugins/docker-sbom
  scan: Docker Scan (Docker Inc.)
    Version:  v0.22.0
    Path:     /usr/lib/docker/cli-plugins/docker-scan

Server:
 Containers: 1
  Running: 0
  Paused: 0
  Stopped: 1
···

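2.2 Working with Docker

2.2.1 Container commands

Once a container has been created, there is no need to run sudo docker run again. You can start, stop, restart, or attach to the existing container directly.

Create and start a container  - sudo docker run
List all containers           - sudo docker ps -a
Start a container             - sudo docker start
Stop a container              - sudo docker stop
Restart a container           - sudo docker restart
Enter a container             - sudo docker attach
                              - sudo docker exec
Export a container            - sudo docker export
Import a container snapshot   - sudo docker import
Remove a container            - sudo docker rm
View logs                     - sudo docker logs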

2.2.2 Image commands

Search for images             - sudo docker search
Pull an image                 - sudo docker pull
List images                   - sudo docker images
                              - sudo docker image ls
Remove an image               - sudo docker rmi
                              - sudo docker image rm
Save an image to a file       - sudo docker save
Load an image from a file     - sudo docker load
Commit a container as image   - sudo docker commit

2.2.3 The Docker system service

Show detailed version info    - sudo docker version
Show brief version info       - sudo docker -v
Start Docker                  - sudo systemctl start docker
Stop Docker                   - sudo systemctl stop docker
Enable start on boot          - sudo systemctl enable docker
Restart the Docker service    - sudo service docker restart
Stop the Docker service       - sudo service docker stop

2.3 Running your first container

2.3.1 The docker run command

Use docker run to start your first container.

$ sudo docker run --help

Usage:  docker run [OPTIONS] IMAGE [COMMAND] [ARG...]

Create and run a new container from an image

Aliases:
  docker container run, docker run

Options:
      --add-host list                  Add a custom host-to-IP mapping
                                       (host:ip)
      --annotation map                 Add an annotation to the container
                                       (passed through to the OCI
                                       runtime) (default map[])
  -a, --attach list                    Attach to STDIN, STDOUT or STDERR
      --blkio-weight uint16            Block IO (relative weight),
                                       between 10 and 1000, or 0 to
                                       disable (default 0)
      --blkio-weight-device list       Block IO weight (relative device
                                       weight) (default [])
      --cap-add list                   Add Linux capabilities
      --cap-drop list                  Drop Linux capabilities
      --cgroup-parent string           Optional parent cgroup for the
                                       container
      --cgroupns string                Cgroup namespace to use
                                       (host|private)
                                       'host':    Run the container in
                                       the Docker host's cgroup namespace
                                       'private': Run the container in
                                       its own private cgroup namespace
                                       '':        Use the cgroup
                                       namespace as configured by the
                                                  default-cgroupns-mode
                                       option on the daemon (default)
      --cidfile string                 Write the container ID to the file
      --cpu-period int                 Limit CPU CFS (Completely Fair
                                       Scheduler) period
      --cpu-quota int                  Limit CPU CFS (Completely Fair
                                       Scheduler) quota
      --cpu-rt-period int              Limit CPU real-time period in
                                       microseconds
      --cpu-rt-runtime int             Limit CPU real-time runtime in
                                       microseconds
  -c, --cpu-shares int                 CPU shares (relative weight)
      --cpus decimal                   Number of CPUs
      --cpuset-cpus string             CPUs in which to allow execution
                                       (0-3, 0,1)
      --cpuset-mems string             MEMs in which to allow execution
                                       (0-3, 0,1)
  -d, --detach                         Run container in background and
                                       print container ID
      --detach-keys string             Override the key sequence for
                                       detaching a container
      --device list                    Add a host device to the container
      --device-cgroup-rule list        Add a rule to the cgroup allowed
                                       devices list
      --device-read-bps list           Limit read rate (bytes per second)
                                       from a device (default [])
      --device-read-iops list          Limit read rate (IO per second)
                                       from a device (default [])
      --device-write-bps list          Limit write rate (bytes per
                                       second) to a device (default [])
      --device-write-iops list         Limit write rate (IO per second)
                                       to a device (default [])
      --disable-content-trust          Skip image verification (default true)
      --dns list                       Set custom DNS servers
      --dns-option list                Set DNS options
      --dns-search list                Set custom DNS search domains
      --domainname string              Container NIS domain name
      --entrypoint string              Overwrite the default ENTRYPOINT
                                       of the image
  -e, --env list                       Set environment variables
      --env-file list                  Read in a file of environment variables
      --expose list                    Expose a port or a range of ports
      --gpus gpu-request               GPU devices to add to the
                                       container ('all' to pass all GPUs)
      --group-add list                 Add additional groups to join
      --health-cmd string              Command to run to check health
      --health-interval duration       Time between running the check
                                       (ms|s|m|h) (default 0s)
      --health-retries int             Consecutive failures needed to
                                       report unhealthy
      --health-start-period duration   Start period for the container to
                                       initialize before starting
                                       health-retries countdown
                                       (ms|s|m|h) (default 0s)
      --health-timeout duration        Maximum time to allow one check to
                                       run (ms|s|m|h) (default 0s)
      --help                           Print usage
  -h, --hostname string                Container host name
      --init                           Run an init inside the container
                                       that forwards signals and reaps
                                       processes
  -i, --interactive                    Keep STDIN open even if not attached
      --ip string                      IPv4 address (e.g., 172.30.100.104)
      --ip6 string                     IPv6 address (e.g., 2001:db8::33)
      --ipc string                     IPC mode to use
      --isolation string               Container isolation technology
      --kernel-memory bytes            Kernel memory limit
  -l, --label list                     Set meta data on a container
      --label-file list                Read in a line delimited file of labels
      --link list                      Add link to another container
      --link-local-ip list             Container IPv4/IPv6 link-local
                                       addresses
      --log-driver string              Logging driver for the container
      --log-opt list                   Log driver options
      --mac-address string             Container MAC address (e.g.,
                                       92:d0:c6:0a:29:33)
  -m, --memory bytes                   Memory limit
      --memory-reservation bytes       Memory soft limit
      --memory-swap bytes              Swap limit equal to memory plus
                                       swap: '-1' to enable unlimited swap
      --memory-swappiness int          Tune container memory swappiness
                                       (0 to 100) (default -1)
      --mount mount                    Attach a filesystem mount to the
                                       container
      --name string                    Assign a name to the container
      --network network                Connect a container to a network
      --network-alias list             Add network-scoped alias for the
                                       container
      --no-healthcheck                 Disable any container-specified
                                       HEALTHCHECK
      --oom-kill-disable               Disable OOM Killer
      --oom-score-adj int              Tune host's OOM preferences (-1000
                                       to 1000)
      --pid string                     PID namespace to use
      --pids-limit int                 Tune container pids limit (set -1
                                       for unlimited)
      --platform string                Set platform if server is
                                       multi-platform capable
      --privileged                     Give extended privileges to this
                                       container
  -p, --publish list                   Publish a container's port(s) to
                                       the host
  -P, --publish-all                    Publish all exposed ports to
                                       random ports
      --pull string                    Pull image before running
                                       ("always", "missing", "never")
                                       (default "missing")
  -q, --quiet                          Suppress the pull output
      --read-only                      Mount the container's root
                                       filesystem as read only
      --restart string                 Restart policy to apply when a
                                       container exits (default "no")
      --rm                             Automatically remove the container
                                       when it exits
      --runtime string                 Runtime to use for this container
      --security-opt list              Security Options
      --shm-size bytes                 Size of /dev/shm
      --sig-proxy                      Proxy received signals to the
                                       process (default true)
      --stop-signal string             Signal to stop the container
      --stop-timeout int               Timeout (in seconds) to stop a
                                       container
      --storage-opt list               Storage driver options for the
                                       container
      --sysctl map                     Sysctl options (default map[])
      --tmpfs list                     Mount a tmpfs directory
  -t, --tty                            Allocate a pseudo-TTY
      --ulimit ulimit                  Ulimit options (default [])
  -u, --user string                    Username or UID (format:
                                       <name|uid>[:<group|gid>])
      --userns string                  User namespace to use
      --uts string                     UTS namespace to use
  -v, --volume list                    Bind mount a volume
      --volume-driver string           Optional volume driver for the
                                       container
      --volumes-from list              Mount volumes from the specified
                                       container(s)
  -w, --workdir string                 Working directory inside the container

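Some commonly used options are explained below.

-a stdin: attach to a standard stream; one of STDIN, STDOUT, or STDERR;
-d: run the container in the background and print the container ID;
-i: keep STDIN open (interactive mode), usually combined with -t;
-P: publish all exposed container ports to random host ports;
-p: publish a specific port, in the form host-port:container-port;
-t: allocate a pseudo-TTY for the container, usually combined with -i;
--name="nginx-lb": assign a name to the container;
--dns 8.8.8.8: set the DNS server used by the container (defaults to the host's);
--dns-search example.com: set the container's DNS search domain (defaults to the host's);
-h "mars": set the container's hostname;
-e username="ritchie": set an environment variable;
--env-file=[]: read environment variables from a file;
--cpuset-cpus="0-2" or --cpuset-cpus="0,1,2": pin the container to the given CPUs;
-m: set the container's memory limit;
--network="bridge": set the container's network mode; bridge, host, none, and container:<name|id> are supported;
--link=[]: add a link to another container;
--expose=[]: expose a port or a range of ports;
--volume, -v: bind-mount a volume;
--user: the user (name or UID) the container runs as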
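
To see how several of these options combine, here is a small sketch. The run_or_print wrapper is a hypothetical helper that only prints the command by default, so the example can be read and tried without touching Docker; the container name web, the port mapping, and the TZ variable are illustrative assumptions, not values from this tutorial.

```shell
#!/bin/sh
# Hypothetical wrapper: print the command when DRY_RUN=1 (the default here),
# execute it for real when DRY_RUN=0.
run_or_print() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# -d             run in the background and print the container ID
# --name web     illustrative container name
# -p 8080:80     host port 8080 maps to container port 80
# -e TZ=...      environment variable set inside the container
run_or_print docker run -d --name web -p 8080:80 -e TZ=Asia/Shanghai nginx
```

Setting DRY_RUN=0 before the last line would execute the command instead of printing it.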

2.3.2 Pulling an image

The ros-melodic-desktop-full image, which the author uses regularly, serves as the example here. First, a look at sudo docker pull.

$ sudo docker pull --help

Usage:  docker pull [OPTIONS] NAME[:TAG|@DIGEST]

Download an image from a registry

Aliases:
  docker image pull, docker pull

Options:
  -a, --all-tags                Download all tagged images in the repository
      --disable-content-trust   Skip image verification (default true)
      --platform string         Set platform if server is multi-platform
                                capable
  -q, --quiet                   Suppress verbose output

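In short:

-a: pull all tagged images in the repository
--disable-content-trust: skip image verification (enabled by default)

sudo docker pull is fairly simple: it only pulls images and takes few options.
First pull the image we need, ros-melodic-desktop-full: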

sudo docker pull osrf/ros:melodic-desktop-full

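2.3.3 Creating a container

If you need visualization tools such as Rviz or Gazebo inside Docker, run xhost + in a terminal before creating the container. This disables X server access control so that the container can open windows on the host display. On success it prints: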

xhost +
access control disabled, clients can connect from any host

Then create the container with the following command:

sudo docker run -it -v [/home/xxx/Projects:/home/xxx/Projects] --device=/dev/dri --group-add video --volume=/tmp/.X11-unix:/tmp/.X11-unix  --env="DISPLAY=$DISPLAY"   -e GDK_SCALE -e GDK_DPI_SCALE --name=[ros-display] [IMAGE_ID]  /bin/bash

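Here --device=/dev/dri --group-add video --volume=/tmp/.X11-unix:/tmp/.X11-unix --env="DISPLAY=$DISPLAY" -e GDK_SCALE -e GDK_DPI_SCALE are the device, volume, and environment settings that Rviz and other GUI tools need. Adjust the arguments to your own machine, mainly the values shown in [ ]: replace each bracketed placeholder, brackets included, with your own path, container name, or image ID.

After creation you are normally dropped straight into the container. If not, start and enter it with: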

sudo docker start [ros-display] 
sudo docker exec -it [ros-display] /bin/bash

Congratulations, your first Docker container is up and running.

2.3.4 Working with containers

Start: docker start
Run detached: pass -d to docker run to run in the background
Stop: docker stop
Enter a container:

 docker attach 243c
 docker exec -it 69d1 bash

Export and import:

 docker export 7691a814370e > ubuntu.tar
 cat ubuntu.tar | docker import - test/ubuntu:v1.0

Remove:

 $ docker rm trusting_newton
 trusting_newton
 $ docker container prune

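3. Installing the GPU driver on the host

See the linked tutorial for the detailed installation steps.

4. Installing NVIDIA Container Toolkit and nvidia-docker2

NVIDIA Container Toolkit lets users build and run GPU-accelerated containers. It includes a container runtime library and utilities that automatically configure containers to use NVIDIA GPUs.

4.1 Configure the repository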

curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

Optionally, configure the repository to use experimental packages:

sudo sed -i -e '/experimental/ s/^#//g' /etc/apt/sources.list.d/nvidia-container-toolkit.list

4.2 Update the package list

sudo apt-get update

4.3 Install NVIDIA Container Toolkit

sudo apt-get install -y nvidia-container-toolkit

4.4 Configure the container runtime with `nvidia-ctk`

sudo nvidia-ctk runtime configure --runtime=docker

4.5 Restart Docker

sudo systemctl restart docker

4.6 Run an NVIDIA CUDA container as a test

On Ubuntu 20.04 you can test with the command below. Docker automatically pulls the 11.0.3-base-ubuntu20.04 image from nvidia/cuda and creates a container that is removed after a single run.

sudo docker run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi

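5. Using NVIDIA rendering on machines with hybrid graphics

# Configure Ubuntu to use only the discrete GPU
sudo apt install nvidia-settings
sudo apt install nvidia-prime
sudo prime-select nvidia
# Reboot for the change to take effect

# To save power in everyday use, switch back to the Intel integrated GPU
sudo prime-select intel

# To choose hybrid mode
nvidia-settings
# In the PRIME settings, select the middle option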

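6. Missing shadows and dim rendering on Intel Xe graphics

# Install the X server inside the container
sudo apt-get install xserver-xorg-core

# If the problem persists, set the sun's cast_shadows to 0 in the world file; this fixes it
<light name='sun' type='directional'>
      <cast_shadows>0</cast_shadows>

# The scene may still occasionally go dark
# Workaround: in the Scene panel at the top left, toggle the shadows setting once by hand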

7. Creating a new container

sudo docker run -it -v [/home/xxx/Projects:/home/xxx/Projects] --device=/dev/dri --group-add video --volume=/tmp/.X11-unix:/tmp/.X11-unix  --env="DISPLAY=$DISPLAY"   -e GDK_SCALE -e GDK_DPI_SCALE --privileged=true --network=host -e NVIDIA_VISIBLE_DEVICES=all -e NVIDIA_DRIVER_CAPABILITIES=all --env="QT_X11_NO_MITSHM=1" --gpus="all" --name=[ros-display] [IMAGE_ID] /bin/bash

8. Changing where Docker stores images

8.1 Check the current path

First check where Docker currently stores its data:

docker info | grep "Docker Root Dir"

Unless it has been changed, the default path is

/var/lib/docker
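
If the path is needed in a script, the same line can be parsed rather than read by eye. This is a sketch only: parse_root_dir is a hypothetical helper, and the sample input below stands in for real docker info output so the snippet runs without Docker.

```shell
#!/bin/sh
# Hypothetical helper: extract the path from the "Docker Root Dir" line
# of `docker info` output read on standard input.
parse_root_dir() {
    sed -n 's/^[[:space:]]*Docker Root Dir:[[:space:]]*//p'
}

# Sample lines standing in for real `docker info` output.
printf '%s\n' ' Storage Driver: overlay2' ' Docker Root Dir: /var/lib/docker' | parse_root_dir
```

On a machine with Docker installed, docker info | parse_root_dir would print the live value; docker info --format '{{.DockerRootDir}}' should give the same result without any parsing.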

8.2 Stop all running containers

docker ps -q | xargs -r docker stop

8.3 Stop the Docker service

systemctl stop docker

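8.4 Create the new path

Create a directory on the newly added disk and copy all existing Docker containers and images into it. For example, if the new disk is mounted at /data/, run:

mkdir -p /data/var/lib/docker/
cd /data/var/lib/docker/
# -a preserves ownership, permissions, and symlinks, which a plain -r copy would lose
cp -a /var/lib/docker/. /data/var/lib/docker/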

8.5 Edit the Docker configuration file

Edit the Docker daemon configuration file to set the storage path; create the file if it does not exist:

vi /etc/docker/daemon.json

Add the following:

{
	"data-root": "/data/var/lib/docker",
	"registry-mirrors": ["https://ooe7wn09.mirror.aliyuncs.com"]
}

8.6 Restart the Docker service

systemctl daemon-reload
systemctl start docker

This completes the change of the default storage path for Docker containers and images.

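9. Exporting and importing containers

Beginners may not be familiar with Dockerfiles yet, so instead of building an image they often export a container by hand, either as a new black-box image or as a local file.

There are two ways to export a container: commit it as a new black-box image, or write it to an archive file that can be copied between machines.

9.1 Exporting a container as an image

This means turning a container you have set up into an image by hand, for example to push it to Docker Hub for others to use. The command is simple:

docker commit [OPTIONS] <container ID or name> [<repository>[:<tag>]]

For example: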

$ docker commit \
    --author "Xxx <xxxxxxx@gmail.com>" \
    --message "XXXXXXXXXXX" \
    webserver \
    nginx:v2

Here --author records who made the change and --message describes it, much like a git commit. Both can be omitted.

9.2 Exporting a container to an archive file

To export a local container to an archive file, use docker export:

docker export [OPTIONS] CONTAINER

In practice:

$ docker container ls -a
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS                    PORTS               NAMES
7691a814370e        ubuntu:18.04        "/bin/bash"         36 hours ago        Exited (0) 21 hours ago                       test
$ docker export 7691a814370e > ubuntu.tar

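Here 7691a814370e is the container ID, ubuntu.tar is the exported tar archive, and > redirects the command's output into that file.

9.3 Importing an archive as an image

docker import turns a container snapshot file back into an image, for example: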

$ cat ubuntu.tar | docker import - test/ubuntu:v1.0
$ docker image ls
REPOSITORY          TAG                 IMAGE ID            CREATED              VIRTUAL SIZE
test/ubuntu         v1.0                9d37a6082e97        About a minute ago   171.3 MB

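Here ubuntu.tar is the container snapshot file and test/ubuntu:v1.0 is the [<repository>[:<tag>]] name given to the new image.

10. Exporting and importing images

Docker provides docker save and docker load to save an image to a file, move it elsewhere, and load it again. This was the approach before Docker Registry existed and is no longer recommended. To migrate images, use a registry directly, whether Docker Hub or a private registry on your internal network.

10.1 Exporting an image

The command to save an image is:

docker save [image name or ID] -o [output file]

For example: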

docker save alpine -o filename.tar

10.2 Importing an image

The command to load an image is:

docker load [OPTIONS]
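
As a counterpart to the docker save example in 10.1, loading might look like the sketch below. load_image_archive is a hypothetical wrapper that checks the archive exists before handing it to docker load -i; filename.tar is the file name used in 10.1.

```shell
#!/bin/sh
# Hypothetical wrapper: load an image archive produced by `docker save`,
# failing early with a clear message if the file is missing.
load_image_archive() {
    archive="$1"
    if [ ! -f "$archive" ]; then
        echo "archive not found: $archive" >&2
        return 1
    fi
    # -i reads the archive from a file instead of standard input.
    docker load -i "$archive"
}

# Usage on a machine with Docker installed:
#   load_image_archive filename.tar
```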