Fixing Docker containers that cannot use the GPU:
1. Check that the GPU works on the host
nvidia-smi
Output:
Wed Jul 10 17:46:53 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.01             Driver Version: 535.183.01   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 4060 Ti     Off | 00000000:01:00.0  On |                  N/A |
|  0%   53C    P8               2W / 160W |    536MiB /  8188MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      1925      G   /usr/lib/xorg/Xorg                          238MiB |
|    0   N/A  N/A      2119      G   /usr/bin/gnome-shell                         30MiB |
|    0   N/A  N/A      2824      G   ...99,262144 --variations-seed-version      197MiB |
|    0   N/A  N/A      5932      G   ...erProcess --variations-seed-version       60MiB |
+---------------------------------------------------------------------------------------+
Output like the above means the host driver and GPU are working.
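The same check can be scripted so that later steps stop early when the driver is missing. This is only a minimal sketch: check_host_gpu is a hypothetical helper name, and it assumes nvidia-smi is on the PATH whenever the driver is installed.

```shell
#!/bin/sh
# check_host_gpu: hypothetical helper, not part of any NVIDIA tool.
# Prints one "name, driver_version" line per GPU, or fails with a message.
check_host_gpu() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found: install the NVIDIA driver first" >&2
        return 1
    fi
    # --query-gpu and --format are standard nvidia-smi options.
    nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
}
```

Call it as `check_host_gpu && echo "host GPU OK"` before moving on to the Docker setup.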
2. Install the NVIDIA Container Toolkit package
sudo apt-get install -y nvidia-container-toolkit
3. Configure Docker's GPU support
sudo nvidia-ctk runtime configure --runtime=docker
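Restart Docker so the change takes effect (on systemd hosts: sudo systemctl restart docker), then check that the configuration succeeded. The Docker daemon configuration file (/etc/docker/daemon.json by default) should now contain an nvidia runtime entry: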
{
    "registry-mirrors": [
        "https://xxxx.mirror.aliyuncs.com"
    ],
    "runtimes": {
        "nvidia": {
            "args": [],
            "path": "nvidia-container-runtime"
        }
    }
}
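This file was first created when setting up Docker image acceleration (a registry mirror, usually Alibaba Cloud's), which is why it also contains the registry-mirrors entry.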
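The configuration check can also be done from the command line instead of reading the file by eye. A minimal sketch, assuming the daemon configuration lives at /etc/docker/daemon.json (the default on most installs); has_nvidia_runtime is a hypothetical helper name, and the grep is a simple text match, not a real JSON parse.

```shell
#!/bin/sh
# has_nvidia_runtime: hypothetical helper. Succeeds when the given Docker
# daemon config (default path is an assumption) points a runtime at
# nvidia-container-runtime.
has_nvidia_runtime() {
    config="${1:-/etc/docker/daemon.json}"
    if [ ! -r "$config" ]; then
        echo "cannot read $config" >&2
        return 1
    fi
    # Text match on the "path" key; accepts a bare name or a full path.
    grep -q '"path"[[:space:]]*:[[:space:]]*"[^"]*nvidia-container-runtime"' "$config"
}
```

For example, `has_nvidia_runtime && echo "nvidia runtime configured"`. After Docker has been restarted, `sudo docker info` should also list nvidia among its runtimes.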
4. Create the Docker container
Full command:
sudo docker run -dit \
--gpus all \
-e NVIDIA_DRIVER_CAPABILITIES=all \
--name=melodic_docker2 \
-v /home/yearner:/home/yearner \
-v /tmp/.X11-unix:/tmp/.X11-unix \
-v /dev/dri:/dev/dri \
--device=/dev/snd \
--device=/dev/dri/renderD128 \
-e DISPLAY=unix$DISPLAY \
-w /home/yearner fishros2/ros:melodic-desktop-full
Where:
--gpus all \
-e NVIDIA_DRIVER_CAPABILITIES=all
give the container access to the GPU.
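To confirm that these two options took effect, nvidia-smi can be run inside the running container. A minimal sketch: container_sees_gpu is a hypothetical helper name, and it assumes the current user may call docker directly (otherwise prefix the call with sudo, as in the command above).

```shell
#!/bin/sh
# container_sees_gpu: hypothetical helper. Lists the GPUs visible inside a
# running container; fails if the container has no GPU access.
container_sees_gpu() {
    name="$1"
    if [ -z "$name" ]; then
        echo "usage: container_sees_gpu CONTAINER_NAME" >&2
        return 1
    fi
    # nvidia-smi -L prints one line per visible GPU.
    docker exec "$name" nvidia-smi -L
}
```

With the container from this guide, that would be `container_sees_gpu melodic_docker2`.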
-v /tmp/.X11-unix:/tmp/.X11-unix -e DISPLAY=unix$DISPLAY --device=/dev/dri/renderD128 -v /dev/dri:/dev/dri
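let the container display GUI windows through the host's X server; access to that server is controlled on the host with xhost.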
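If GUI programs in the container fail to open the display, the host's X server may be rejecting the connection. A minimal sketch of granting access on the host, assuming an X11 session; allow_local_x is a hypothetical helper name, and `xhost +local:` is a deliberately broad setting that allows every local (non-network) client.

```shell
#!/bin/sh
# allow_local_x: hypothetical helper. Run on the host, not in the container.
allow_local_x() {
    if [ -z "${DISPLAY:-}" ]; then
        echo "DISPLAY is not set; run this from a graphical session" >&2
        return 1
    fi
    # "+local:" permits connections from local (non-network) clients only.
    xhost +local:
}
```

Run `allow_local_x` once per login session, then start the GUI program inside the container. `xhost -local:` revokes the permission again.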