Installation
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
To avoid permission errors later (so that docker can be run without sudo), add your user to the docker group:
sudo groupadd docker
sudo gpasswd -a $USER docker
newgrp docker
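A quick way to confirm the step above is to look for docker in the user's group list. This is a minimal sketch, assuming a Linux host with the standard id, tr and grep utilities; in_docker_group is a name made up for this example. Passing a user name makes id read the group database, so this confirms that the gpasswd step was recorded; the current shell only picks the group up after newgrp or a fresh login.

```shell
# in_docker_group is a hypothetical helper, not part of Docker.
# It returns 0 if the given user is a member of the docker group.
in_docker_group() {
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qx docker
}

if in_docker_group "$(id -un)"; then
    echo "docker group: ok"
else
    echo "docker group: not a member yet"
fi
```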
Basic operations
Pull an image
docker pull hello-world
Run it
docker run hello-world
List local images
docker image ls
List local containers, including stopped ones
docker container ls -a
Run interactively
docker run -ti --rm python:3.7
Run interactively with a different command (bash instead of the default Python interpreter)
docker run -ti --rm python:3.7 bash
Show running containers
docker ps
Show all containers, including stopped ones
docker ps -a
Stop all containers
docker stop $(docker ps -a -q)
Remove all containers (to remove a single container, replace the trailing $(docker ps -a -q) with its CONTAINER ID)
docker rm $(docker ps -a -q)
Remove an image (xxx is the IMAGE ID)
docker rmi xxx
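When no containers exist, docker ps -a -q prints nothing, and docker stop and docker rm then exit with a usage error because they receive no arguments. A minimal sketch that guards against the empty case, using only the docker subcommands already shown; remove_all_containers is a hypothetical helper name, and POSIX sh or bash is assumed:

```shell
# remove_all_containers is a hypothetical helper, not a Docker command.
# It stops and removes every container, and does nothing when none exist.
remove_all_containers() {
    ids=$(docker ps -a -q)
    if [ -z "$ids" ]; then
        echo "no containers to remove"
        return 0
    fi
    # $ids is left unquoted on purpose so each ID becomes its own argument.
    docker stop $ids
    docker rm $ids
}
```

The block only defines the function. Calling remove_all_containers is destructive, so it is worth checking docker ps -a first.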
nvidia-docker
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
&& curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
Then test that a container can see the GPU:
docker run --rm --runtime=nvidia --gpus all nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi
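If the test fails, one thing worth checking is whether the nvidia runtime was actually registered with the Docker daemon. A minimal sketch, assuming docker info reports the configured runtimes under its Runtimes field; has_nvidia_runtime is a hypothetical helper name:

```shell
# has_nvidia_runtime is a hypothetical helper.
# It returns 0 if "nvidia" appears among the runtimes reported by the daemon.
has_nvidia_runtime() {
    docker info --format '{{json .Runtimes}}' 2>/dev/null | grep -q '"nvidia"'
}

if has_nvidia_runtime; then
    echo "nvidia runtime: registered"
else
    echo "nvidia runtime: not found"
fi
```

If the runtime is not found, re-running the nvidia-ctk and systemctl restart steps is a reasonable first thing to try.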
Building an image
Write a Dockerfile
FROM nvidia/cuda:11.6.0-base-ubuntu20.04
# FROM pytorch/pytorch:1.6.0-cuda10.1-cudnn7-runtime
COPY . /
RUN sed -i 's/security.ubuntu.com/mirrors.ustc.edu.cn/g' /etc/apt/sources.list
RUN sed -i 's/archive.ubuntu.com/mirrors.ustc.edu.cn/g' /etc/apt/sources.list
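# Update and install in a single layer so a cached "apt-get update" is never paired with a stale package list
RUN apt-get update && apt-get install -y python3 python3-pip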
RUN pip3 install --upgrade pip
RUN pip3 config set global.index-url https://mirrors.bfsu.edu.cn/pypi/web/simple
RUN pip3 install torch==1.13.1+cu116 torchvision==0.14.1+cu116 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu116
RUN pip3 install -r requirements.txt
CMD ["sh", "predict.sh"]
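Because COPY . copies every file in the build context into the image, write a .dockerignore to exclude what is not needed; its syntax is similar to .gitignore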
.idea
*.pth
output
workspace
test_set
Next, build the image.
Change into the directory that contains the Dockerfile and run the following
(the argument after -t is the image name)
docker build . -t nightmare4214
Pushing the image to Docker Hub
Log in
docker login
Look up the image ID
docker image ls
Here 148ec92d55ab is the image ID, nightmare4214 is the Docker Hub account, and icml-hqu is the repository name
docker tag 148ec92d55ab nightmare4214/icml-hqu
docker push nightmare4214/icml-hqu
The general syntax is shown below; if tagname is omitted, as in the commands above, it defaults to latest
docker tag local-image:tagname new-repo:tagname
docker push new-repo:tagname
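For Docker Hub, the reference passed to docker tag and docker push has the form account/repository:tag. A small sketch of composing it, with the tag falling back to latest when omitted; image_ref is a hypothetical helper name and the v1 tag is made up for illustration:

```shell
# image_ref is a hypothetical helper that prints account/repository:tag.
# $1 = Docker Hub account, $2 = repository name, $3 = optional tag.
image_ref() {
    printf '%s/%s:%s\n' "$1" "$2" "${3:-latest}"
}

image_ref nightmare4214 icml-hqu        # prints nightmare4214/icml-hqu:latest
image_ref nightmare4214 icml-hqu v1     # prints nightmare4214/icml-hqu:v1 (v1 is a made-up tag)
```

It could then be used as docker tag 148ec92d55ab "$(image_ref nightmare4214 icml-hqu)".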
Pull the image and run it
docker pull nightmare4214/icml-hqu
docker container run --gpus "device=0" -m 28g --name icml-hqu --rm -v $PWD/test_set/:/workspace/inputs/ -v $PWD/nightmare4214_outputs/:/workspace/outputs/ nightmare4214/icml-hqu:latest /bin/bash -c "sh predict.sh"
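The run command above bind-mounts two host directories into the container. If the output directory does not exist yet, Docker creates it itself, typically owned by root, so it can help to create it beforehand. A minimal sketch wrapping the same flags; run_prediction is a hypothetical helper name, and both host paths are assumed to be absolute:

```shell
# run_prediction is a hypothetical wrapper around the command shown above.
# $1 = image reference, $2 = host input directory, $3 = host output directory.
# Both directories should be absolute paths, otherwise -v treats them as volume names.
run_prediction() {
    # Create the output directory as the current user before Docker mounts it.
    mkdir -p "$3" || return 1
    docker container run --gpus "device=0" -m 28g --name icml-hqu --rm \
        -v "$2":/workspace/inputs/ \
        -v "$3":/workspace/outputs/ \
        "$1" /bin/bash -c "sh predict.sh"
}
```

A call matching the original command would be run_prediction nightmare4214/icml-hqu:latest "$PWD/test_set" "$PWD/nightmare4214_outputs".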
References
https://nbviewer.org/github/ericspod/ContainersForCollaboration/blob/master/ContainersForCollaboration.ipynb
https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html
https://docs.docker.com/develop/develop-images/dockerfile_best-practices/
https://docs.docker.com/engine/reference/commandline/container_run/