Installing nvidia-container-toolkit on Ubuntu so that Docker containers can use NVIDIA GPUs

Installing nvidia-container-toolkit

Prerequisites

  1. GNU/Linux x86_64 with kernel version > 3.10

  2. Docker >= 19.03 (recommended, but some distributions may include older versions of Docker. The minimum supported version is 1.12)

  3. NVIDIA GPU with Architecture >= Kepler (or compute capability 3.0)

  4. NVIDIA Linux drivers >= 418.81.07 (Note that older driver releases or branches are unsupported.)
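
The prerequisites above can be checked on the host before installing anything. The following is a minimal sketch, not an official tool: the helper names `version_ge` and `report` are made up for this example, the comparison relies on GNU `sort -V`, and the kernel requirement (> 3.10) is approximated as >= 3.10.

```shell
#!/usr/bin/env bash
# Sketch: report whether this host appears to meet the prerequisites.

# version_ge A B: succeeds when version A >= version B (uses GNU `sort -V`).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# report NAME FOUND REQUIRED: print one OK / TOO OLD line for a component.
report() {
  if version_ge "$2" "$3"; then
    echo "$1 $2: OK (need >= $3)"
  else
    echo "$1 $2: TOO OLD (need >= $3)"
  fi
}

# Kernel: strip the "-91-generic" style suffix from `uname -r`.
kernel="$(uname -r)"
report "kernel" "${kernel%%-*}" "3.10"

# Docker: parse the "Docker version X.Y.Z, build ..." line printed by `docker -v`.
if command -v docker >/dev/null 2>&1; then
  docker_ver="$(docker -v | sed -E 's/^Docker version ([0-9.]+).*/\1/')"
  report "docker" "$docker_ver" "19.03"
else
  echo "docker: not found"
fi

# Driver: ask nvidia-smi for the installed driver version.
if command -v nvidia-smi >/dev/null 2>&1; then
  driver_ver="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
  report "driver" "$driver_ver" "418.81.07"
else
  echo "nvidia-smi: not found (is the NVIDIA driver installed?)"
fi
```

The GPU architecture requirement (Kepler or newer) is not checked here; it has to be confirmed from the card's model.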

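Online installation of nvidia-container-toolkit on Ubuntu

1. Log in to the server as root (for example, with Xshell)

# Check the Docker version
docker -v
# Add the repository GPG key; prints "OK" on success
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -

# Set up the package repository and GPG key: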
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
      && curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
      && curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
            sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
            sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
# Refresh the package index
sudo apt-get update
# Install nvidia-container-toolkit
sudo apt-get install -y nvidia-container-toolkit
# Restart Docker
sudo systemctl restart docker
  • Note: run these commands on the host, not inside a container
  • Error 1:
    Running:
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list

Output:

Unsupported distribution!

Check https://nvidia.github.io/nvidia-docker

Fix:

distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
      && curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
      && curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
            sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
            sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
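
When this error shows up, it can help to see what `$distribution` actually expands to and to confirm that the downloaded file is an apt source list before writing it under /etc/apt. One plausible cause is that `$distribution` was never set in the current shell, or names a release the repository does not serve. The sketch below is illustrative only; `distribution_id`, `looks_like_apt_list` and `check_repo_list` are hypothetical helper names, not part of any NVIDIA tooling.

```shell
#!/usr/bin/env bash
# distribution_id [FILE]: print $ID$VERSION_ID from an os-release file
# (default /etc/os-release), for example "ubuntu18.04".
distribution_id() {
  ( . "${1:-/etc/os-release}" && printf '%s%s\n' "$ID" "$VERSION_ID" )
}

# looks_like_apt_list: read a downloaded .list file on stdin and succeed
# only if at least one line starts with "deb ".
looks_like_apt_list() {
  grep -q '^deb '
}

# check_repo_list URL: download URL and report whether the body is an apt
# source list or something else (such as an error page).
check_repo_list() {
  if curl -fsSL "$1" | looks_like_apt_list; then
    echo "usable apt list: $1"
  else
    echo "not an apt list (unsupported distribution or download failure): $1"
    return 1
  fi
}

# Show the value the repository URLs would be built from on this machine.
if [ -r /etc/os-release ]; then
  echo "distribution=$(distribution_id)"
fi
```

For example, `check_repo_list "https://nvidia.github.io/libnvidia-container/$(distribution_id)/libnvidia-container.list"` tests the URL used in the fix without touching /etc/apt.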

For Docker versions older than 19.03

distribution=$(. /etc/os-release;echo $ID$VERSION_ID)

curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -

curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list

sudo apt-get update
# Install nvidia-docker
sudo apt-get install nvidia-docker

# Restart Docker
sudo systemctl restart docker
  • Note: run these commands on the host, not inside a container
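
After either installation path, a quick way to confirm that containers can see the GPU is to run `nvidia-smi` in a throwaway container. The `--gpus` option exists only from Docker 19.03 onward; older versions select the NVIDIA runtime with `--runtime=nvidia` instead. The sketch below picks the option from the reported Docker version. The helper names are hypothetical, the `ubuntu` image is an assumption (any small image should do), and it calls `docker` without `sudo` because step 1 logs in as root.

```shell
#!/usr/bin/env bash
# version_ge A B: succeeds when version A >= version B (uses GNU `sort -V`).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# gpu_run_flag DOCKER_VERSION: print the `docker run` options that request GPUs.
gpu_run_flag() {
  if version_ge "$1" "19.03"; then
    echo "--gpus all"
  else
    # Older Docker: select the NVIDIA runtime and expose all GPUs explicitly.
    echo "--runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=all"
  fi
}

# verify_gpu: run nvidia-smi inside a temporary container (call it on the host).
verify_gpu() (
  ver="$(docker -v | sed -E 's/^Docker version ([0-9.]+).*/\1/')"
  flag="$(gpu_run_flag "$ver")"
  # $flag is intentionally unquoted so that it splits into separate arguments.
  docker run --rm $flag ubuntu nvidia-smi
)
```

Calling `verify_gpu` should print the usual `nvidia-smi` table if the toolkit is wired into Docker correctly; an error about a missing runtime or hook points back at the installation steps.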

Offline installation of nvidia-container-toolkit on Ubuntu

Package dependencies:

├─ nvidia-container-toolkit (version)
│ ├─ libnvidia-container-tools (>= version)
│ └─ nvidia-container-toolkit-base (version)

├─ libnvidia-container-tools (version)
│ └─ libnvidia-container1 (>= version)
└─ libnvidia-container1 (version)

  • Installation order:
  1. libnvidia-container1
  2. libnvidia-container-tools
  3. nvidia-container-toolkit
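
The installation order follows from the dependency tree: a package can only be installed after everything it depends on. As a small illustration, the same order can be derived with the coreutils `tsort` utility. The pairs below cover only the three packages in the installation order above; `install_order` is a name chosen for this example.

```shell
#!/usr/bin/env bash
# install_order: print the packages so that every dependency comes first.
# Each input line is "DEPENDENCY DEPENDENT": the left package must be
# installed before the right one.
install_order() {
  tsort <<'EOF'
libnvidia-container1 libnvidia-container-tools
libnvidia-container-tools nvidia-container-toolkit
EOF
}

install_order
```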

Obtaining the packages

  1. Download the following packages
libnvidia-container1_1.9.0-1_amd64.deb
libnvidia-container-tools_1.9.0-1_amd64.deb
nvidia-container-toolkit_1.9.0-1_amd64.deb
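
One way to obtain these files is to fetch them with `apt-get download` on a machine that has internet access, then copy them over. The sketch below assumes that machine already has the NVIDIA repository configured as in the online section and that the repository still offers version 1.9.0-1; the helper names are hypothetical.

```shell
#!/usr/bin/env bash
version="1.9.0-1"
arch="amd64"
packages="libnvidia-container1 libnvidia-container-tools nvidia-container-toolkit"

# deb_file PACKAGE: print the file name expected for PACKAGE.
deb_file() {
  printf '%s_%s_%s.deb\n' "$1" "$version" "$arch"
}

# download_debs: fetch the packages into the current directory.
# Assumption: the NVIDIA apt repository is configured on this machine.
download_debs() {
  for pkg in $packages; do
    apt-get download "${pkg}=${version}" || return 1
  done
}

# missing_debs DIR: list the expected files that are absent from DIR.
missing_debs() {
  for pkg in $packages; do
    f="$(deb_file "$pkg")"
    [ -f "$1/$f" ] || echo "$f"
  done
}
```

Running `missing_debs .` after copying the files to the offline server gives a quick check that nothing was left behind.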

Installing the packages

  1. Upload the packages to the server
  2. cd into the directory that contains them
  3. Install them with dpkg (mind the order)
# Install libnvidia-container1 first
sudo dpkg -i ./libnvidia-container1_1.9.0-1_amd64.deb
# Then install libnvidia-container-tools
sudo dpkg -i ./libnvidia-container-tools_1.9.0-1_amd64.deb
# Finally install nvidia-container-toolkit
sudo dpkg -i ./nvidia-container-toolkit_1.9.0-1_amd64.deb
  4. Restart the Docker service
# Restart Docker
sudo systemctl restart docker
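
The same steps can be wrapped in a small function that enforces the order and stops at the first missing file or failed install. This is a sketch under the assumption that the three 1.9.0-1 files sit in one directory; `install_offline` and the `DRY_RUN` switch are invented for this example.

```shell
#!/usr/bin/env bash
# install_offline DIR: install the three 1.9.0-1 packages from DIR in
# dependency order. With DRY_RUN=1 the dpkg commands are only printed.
install_offline() {
  for pkg in libnvidia-container1 libnvidia-container-tools nvidia-container-toolkit; do
    deb="$1/${pkg}_1.9.0-1_amd64.deb"
    if [ ! -f "$deb" ]; then
      echo "missing: $deb" >&2
      return 1
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "sudo dpkg -i $deb"
    else
      sudo dpkg -i "$deb" || return 1
    fi
  done
}
```

A dry run such as `DRY_RUN=1 install_offline .` shows what would be executed; `install_offline . && sudo systemctl restart docker` performs the installation and restarts Docker only if every package installed.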

Uninstalling nvidia-container-toolkit

  1. Uninstall commands
sudo apt remove nvidia-container-toolkit
sudo apt remove libnvidia-container-tools
sudo apt remove libnvidia-container1
  2. Error reported when nvidia-container-toolkit is not installed
docker: Error response from daemon: exec: "nvidia-container-runtime-hook": executable file not found in $PATH.
ERRO[0000] error waiting for container: context canceled 
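
The message says that Docker could not find `nvidia-container-runtime-hook` on `$PATH`. A minimal check for that condition, run on the host, might look like the sketch below; `hook_status` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# hook_status: report whether the hook binary named in the error is on $PATH.
hook_status() {
  if command -v nvidia-container-runtime-hook >/dev/null 2>&1; then
    echo "found: $(command -v nvidia-container-runtime-hook)"
  else
    echo "missing: nvidia-container-runtime-hook is not on PATH"
    return 1
  fi
}

hook_status || echo "install (or reinstall) nvidia-container-toolkit, then restart docker"
```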

Other errors

E: Conflicting values set for option Signed-By regarding source https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/amd64/ /: /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg !=
E: The list of sources could not be read.
# Fix: back up the apt sources first
sudo cp /etc/apt/sources.list /etc/apt/sources.list.backup
sudo cp -r /etc/apt/sources.list.d/ /etc/apt/sources.list.d.backup
# Remove the NVIDIA-related source list files and signing key
sudo rm /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
sudo rm /etc/apt/sources.list.d/*nvidia*
# Recreate the NVIDIA source list file
echo "deb https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/amd64/ /" | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
# Refresh the package index
sudo apt update
# Reinstall
sudo apt-get install nvidia-docker
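
This error generally means that apt sees the same repository declared more than once with different Signed-By settings, for instance one list file with `[signed-by=...]` and another without. Before deleting anything, it can be useful to list every entry that points at nvidia.github.io. The sketch below is a rough diagnostic with hypothetical helper names. It only detects the with/without case and does not skip commented-out lines.

```shell
#!/usr/bin/env bash
# nvidia_sources [PATH...]: print every apt source line that mentions
# nvidia.github.io, prefixed with file name and line number.
# Defaults to the standard apt source locations.
nvidia_sources() {
  if [ "$#" -eq 0 ]; then
    set -- /etc/apt/sources.list /etc/apt/sources.list.d
  fi
  grep -rnH 'nvidia\.github\.io' "$@" 2>/dev/null
}

# has_signed_by_mismatch [PATH...]: succeed when some matching lines carry a
# signed-by option and others do not.
has_signed_by_mismatch() {
  with="$(nvidia_sources "$@" | grep -c 'signed-by=' || true)"
  total="$(nvidia_sources "$@" | grep -c '' || true)"
  [ "$with" -gt 0 ] && [ "$with" -lt "$total" ]
}
```

Running `nvidia_sources` with no arguments shows which files declare the repository; if `has_signed_by_mismatch` succeeds, removing the duplicate entry is likely enough to clear the error.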

