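Starting a GPU Container with Docker
For reference, see the NVIDIA Container Toolkit installation guide.
The steps below use a common setup as the example: Ubuntu 20.04 with an NVIDIA A10 GPU.
- Install the NVIDIA driver and CUDA driver. Look up the package that matches your GPU at https://www.nvidia.cn/drivers/lookup/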
wget https://cn.download.nvidia.cn/tesla/550.54.15/nvidia-driver-local-repo-ubuntu2004-550.54.15_1.0-1_amd64.deb
sudo apt install ./nvidia-driver-local-repo-ubuntu2004-550.54.15_1.0-1_amd64.deb -y
sudo apt update
sudo apt install nvidia-driver-550 -y
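Before moving on, it can help to confirm that the driver is usable. The following is a minimal sketch, not part of the original steps: `require_cmd` is a hypothetical helper name, and the hint about rebooting is an assumption based on the kernel module usually not being loaded until the machine restarts.

```shell
# Hypothetical helper: report whether a command is available on PATH.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Print the GPU name and driver version.
# nvidia-smi itself fails if the kernel module is not loaded yet.
if require_cmd nvidia-smi; then
    nvidia-smi --query-gpu=name,driver_version --format=csv,noheader \
        || echo "nvidia-smi failed; a reboot may be needed to load the new driver"
fi
```

On the example machine the query should list the A10 together with driver version 550.54.15.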
- Install Docker
# Step 1: install the required system tools
sudo apt-get update
sudo apt-get -y install apt-transport-https ca-certificates curl software-properties-common
# Step 2: add the GPG key
curl -fsSL https://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | sudo apt-key add -
# Step 3: add the package repository
sudo add-apt-repository "deb [arch=amd64] https://mirrors.aliyun.com/docker-ce/linux/ubuntu $(lsb_release -cs) stable"
# Step 4: update the package index and install Docker CE
sudo apt-get -y update
sudo apt-get -y install docker-ce
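A quick way to confirm the Docker installation is to read back the installed version. This is a sketch under assumptions: `parse_docker_version` is a hypothetical helper, and it assumes the usual `Docker version X.Y.Z, build ...` output format of `docker --version`.

```shell
# Hypothetical helper: extract the version number from `docker --version` output.
parse_docker_version() {
    echo "$1" | sed -n 's/^Docker version \([0-9][0-9.]*\).*/\1/p'
}

if command -v docker >/dev/null 2>&1; then
    echo "Docker CE version: $(parse_docker_version "$(docker --version)")"
    # Optional end-to-end check; pulls a small test image from Docker Hub.
    # sudo docker run --rm hello-world
else
    echo "docker not found on PATH"
fi
```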
- Install and configure the NVIDIA container runtime
# Configure the production repository
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
# Update the package list from the repository
sudo apt-get update
# Install the NVIDIA Container Toolkit packages
sudo apt-get install -y nvidia-container-toolkit
# Configure the container runtime with the nvidia-ctk command
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
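The `nvidia-ctk runtime configure` step edits the Docker daemon configuration, so one way to check that it took effect is to look for the runtime entry in that file. The sketch below assumes the default location `/etc/docker/daemon.json`; `has_nvidia_runtime` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the given daemon.json mentions the NVIDIA runtime.
# Defaults to /etc/docker/daemon.json (assumed default location).
has_nvidia_runtime() {
    local config="${1:-/etc/docker/daemon.json}"
    [ -r "$config" ] && grep -q 'nvidia-container-runtime' "$config"
}

if has_nvidia_runtime; then
    echo "nvidia runtime registered in daemon.json"
else
    echo "nvidia runtime not found in daemon.json"
fi
```

If the entry is missing, rerunning the `nvidia-ctk` command and restarting Docker is the first thing to try.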
- Start a GPU container
docker run --net=host --gpus all --rm pytorch/pytorch:2.0.1-cuda11.7-cudnn8-devel nvidia-smi
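Beyond `nvidia-smi`, it can be useful to check that PyTorch inside the container sees the GPU, and to expose only one device instead of all of them. The following is a sketch, not the original author's method: `gpus_arg`, `torch_cuda_check`, and the `RUN_GPU_CHECK` guard variable are hypothetical names introduced here, and the image is the same one used above.

```shell
IMAGE="pytorch/pytorch:2.0.1-cuda11.7-cudnn8-devel"

# Hypothetical helper: build the value for --gpus ("all" or a single device index).
gpus_arg() {
    case "${1:-all}" in
        all) echo "all" ;;
        *)   echo "device=$1" ;;
    esac
}

# Hypothetical helper: ask PyTorch inside the container whether CUDA is available.
torch_cuda_check() {
    docker run --rm --gpus "$(gpus_arg "${1:-all}")" "$IMAGE" \
        python -c "import torch; print(torch.cuda.is_available())"
}

# Only run the container when explicitly requested (guard variable is an assumption).
if [ "${RUN_GPU_CHECK:-0}" = "1" ]; then
    torch_cuda_check all   # all GPUs
    torch_cuda_check 0     # only GPU 0
fi
```

Run it with `RUN_GPU_CHECK=1` set in the environment; each call prints whatever `torch.cuda.is_available()` reports inside the container.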