nvidia-docker2 Upgrade Guide
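Note: All current application versions must support CUDA 10, and in a Docker environment only nvidia-docker2 supports CUDA 10. The upgrade steps and installation packages are described below.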
Docker environment requirements
- **Docker version**
docker-ce 18.06 or later
- **nvidia-docker version**
nvidia-docker2 or later
Check the nvidia-docker version
$ nvidia-docker version
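The Docker requirement above can also be checked from a script. The following is a minimal sketch, not part of the original procedure: the helper name `version_ge` is hypothetical, it assumes GNU `sort -V` is available, and it assumes `docker version --format '{{.Server.Version}}'` can reach a running daemon.

```shell
#!/bin/sh
# version_ge A B: succeeds when version A >= version B (hypothetical helper).
# Assumes GNU sort with -V (version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Compare the installed Docker server version with the 18.06 minimum.
if command -v docker >/dev/null 2>&1; then
    docker_ver=$(docker version --format '{{.Server.Version}}' 2>/dev/null || true)
    if [ -n "$docker_ver" ] && version_ge "$docker_ver" "18.06"; then
        echo "docker $docker_ver meets the 18.06+ requirement"
    else
        echo "docker version '${docker_ver:-unknown}' is below 18.06 or could not be read"
    fi
else
    echo "docker is not installed or not on PATH"
fi
```

The comparison is done on the sorted pair rather than with string or integer tests, so multi-part versions such as 18.09.2 are handled without extra parsing.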
Upgrade Docker and nvidia-docker
- **Uninstall nvidia-docker**
$ yum remove nvidia-docker
- **Download the installation package**
- **Copy the installation package nvidia-docker2.zip to /opt/, then unzip it**
$ cd /opt
$ unzip nvidia-docker2.zip
- **Install**
# Enter the package directory
$ cd nvidia-docker2
$ rpm -i libnvidia-container1-1.0.5-1.x86_64.rpm
$ rpm -i libnvidia-container-tools-1.0.5-1.x86_64.rpm
$ rpm -i nvidia-container-runtime-3.1.4-1.x86_64.rpm
$ rpm -i nvidia-container-toolkit-1.0.5-2.x86_64.rpm
$ rpm -i nvidia-docker2-2.2.2-1.noarch.rpm
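The install step expects all five package files to be present in the unpacked directory. A minimal preflight sketch, assuming the archive unpacks to a directory holding exactly the five files named above; the function name `missing_rpms` is hypothetical:

```shell
#!/bin/sh
# missing_rpms DIR: print each expected package file that is absent from DIR
# and return non-zero if any is missing (hypothetical helper).
missing_rpms() {
    dir=$1
    rc=0
    for f in \
        libnvidia-container1-1.0.5-1.x86_64.rpm \
        libnvidia-container-tools-1.0.5-1.x86_64.rpm \
        nvidia-container-runtime-3.1.4-1.x86_64.rpm \
        nvidia-container-toolkit-1.0.5-2.x86_64.rpm \
        nvidia-docker2-2.2.2-1.noarch.rpm
    do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $f"
            rc=1
        fi
    done
    return $rc
}

# Example: check the unpacked directory before running the rpm commands.
# The path follows the steps above; adjust it if the archive unpacks elsewhere.
# missing_rpms /opt/nvidia-docker2 && echo "all packages present"
```

If `rpm` reports a failed dependency between these packages, passing all five files to a single `rpm -i` invocation lets rpm order them within one transaction.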
Reference: internal documentation
Error when running an image
docker: Error response from daemon: Unknown runtime specified nvidia.
Solution
The nvidia runtime has not been registered with Docker. Specifically:
To register the nvidia runtime, use the method below that is best suited to your environment.
You might need to merge the new argument with your existing configuration.
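Before proceeding, check whether the corresponding configuration file already exists locally and review its current values, so that existing settings are not overwritten by mistake. The two methods that follow (a systemd drop-in file and /etc/docker/daemon.json) are alternatives; apply only one of them.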
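One way to follow the precaution above is to keep a timestamped copy of any existing file before overwriting it. This is a minimal sketch, not part of the original procedure; the helper name `backup_if_exists` is hypothetical.

```shell
#!/bin/sh
# backup_if_exists FILE: if FILE exists, copy it to FILE.bak.<timestamp>
# and print the backup path; otherwise report on stderr that there is
# nothing to back up (hypothetical helper).
backup_if_exists() {
    if [ -f "$1" ]; then
        backup="$1.bak.$(date +%Y%m%d%H%M%S)"
        cp -p "$1" "$backup" && echo "$backup"
    else
        echo "no existing file: $1" >&2
    fi
}
```

Run it as root against whichever file the chosen method writes, that is /etc/systemd/system/docker.service.d/override.conf or /etc/docker/daemon.json, and inspect the copy before merging in the new runtime entry.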
Systemd drop-in file:
sudo mkdir -p /etc/systemd/system/docker.service.d
sudo tee /etc/systemd/system/docker.service.d/override.conf <<EOF
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd --host=fd:// --add-runtime=nvidia=/usr/bin/nvidia-container-runtime
EOF
sudo systemctl daemon-reload
sudo systemctl restart docker
Daemon configuration file:
sudo tee /etc/docker/daemon.json <<EOF
{
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
EOF
sudo pkill -SIGHUP dockerd
sudo systemctl restart docker
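After the restart, it is worth confirming that Docker now lists the nvidia runtime. The following is a minimal sketch under assumptions: the helper name `has_nvidia_runtime` is hypothetical, and it relies on `docker info --format '{{json .Runtimes}}'` printing the registered runtimes as a JSON object keyed by runtime name.

```shell
#!/bin/sh
# has_nvidia_runtime: read the JSON runtime list on stdin and succeed
# if a runtime keyed "nvidia" is present (hypothetical helper).
has_nvidia_runtime() {
    grep -q '"nvidia"'
}

if command -v docker >/dev/null 2>&1; then
    if docker info --format '{{json .Runtimes}}' 2>/dev/null | has_nvidia_runtime; then
        echo "nvidia runtime is registered"
    else
        echo "nvidia runtime is NOT registered (or the daemon is not reachable)"
    fi
fi

# End-to-end check. The image tag is an assumption; use any CUDA 10 image
# that is available locally.
# docker run --rm --runtime=nvidia nvidia/cuda:10.0-base nvidia-smi
```

If the runtime is still missing, the "Unknown runtime specified nvidia" error will reappear on the next `docker run`, which usually means the configuration file was not picked up by the daemon.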
Reference: https://blog.csdn.net/weixin_32820767/article/details/80538510