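On Linux, I installed the GPU build of Paddle inside a conda environment. After installation I ran the official check program, python -c "import paddle; paddle.utils.run_check()",
to verify that the GPU works, and got the following error: The third-party dynamic library (libcudnn.so) that Paddle depends on is not configured correctly.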
W0322 10:44:47.675211 290774 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 11.7, Runtime API Version: 11.6
W0322 10:44:47.675591 290774 dynamic_loader.cc:307] The third-party dynamic library (libcudnn.so) that Paddle depends on is not configured correctly. (error code is /usr/local/cuda/lib64/libcudnn.so: cannot open shared object file: No such file or directory)
Suggestions:
1. Check if the third-party dynamic library (e.g. CUDA, CUDNN) is installed correctly and its version is matched with paddlepaddle you installed.
2. Configure third-party dynamic library environment variables as follows:
- Linux: set LD_LIBRARY_PATH by `export LD_LIBRARY_PATH=...`
- Windows: set PATH by `set PATH=XXX;
……
PreconditionNotMetError: Cannot load cudnn shared library. Cannot invoke method cudnnGetVersion.
[Hint: cudnn_dso_handle should not be null.] (at /paddle/paddle/phi/backends/dynload/cudnn.cc:60)
[operator < fill_constant > error]
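Reading the error: a third-party dynamic library that Paddle depends on (libcudnn.so) is not configured correctly. The loader looked for /usr/local/cuda/lib64/libcudnn.so and did not find it. Following suggestion 2, setting the LD_LIBRARY_PATH environment variable to the directory that actually contains the library fixes the problem.
Solution:
Locate the path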
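I created the environment as a non-root user and named it paddle, so the environment path is ~/.conda/envs/paddle and the corresponding third-party dynamic libraries are in ~/.conda/envs/paddle/lib. For an environment with a different name, the path is ~/.conda/envs/[environment name]/lib.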
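If you are not sure where Paddle is installed, activate the environment with conda activate [environment name],
then run python -c "import paddle; print(paddle.__file__)"
to print the install location. My output was: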
/home/ubuntu/.conda/envs/paddle/lib/python3.8/site-packages/paddle/__init__.py
The corresponding library path is
/home/ubuntu/.conda/envs/paddle/lib
or
~/.conda/envs/paddle/lib #recommended: the form relative to the home directory
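The lookup above can also be done from Python. The following is a minimal sketch, assuming it is run with the interpreter of the activated conda environment and that the cuDNN files sit in the environment's lib directory as in my setup. The helper name find_cudnn_libs is made up for illustration and is not part of Paddle or conda.

```python
import glob
import os
import sys


def find_cudnn_libs(lib_dir):
    """Return the sorted libcudnn shared-object files found in lib_dir."""
    # Matches libcudnn.so as well as versioned names such as libcudnn.so.8
    pattern = os.path.join(lib_dir, "libcudnn*.so*")
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    # sys.prefix is the root of the active environment,
    # e.g. /home/ubuntu/.conda/envs/paddle in the setup described above.
    env_lib = os.path.join(sys.prefix, "lib")
    print("environment lib directory:", env_lib)
    for path in find_cudnn_libs(env_lib):
        print("found:", path)
```

If nothing is listed, the library is probably not in this directory, and pointing LD_LIBRARY_PATH at it would not help.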
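Add the environment variable
- Temporary fix
Set the environment variable before running the program. This lasts only for the current shell session:
export LD_LIBRARY_PATH=~/.conda/envs/paddle/lib
python xxx.py
- Permanent fix
Append the environment variable to the ~/.bashrc file:
echo "export LD_LIBRARY_PATH=~/.conda/envs/paddle/lib">>~/.bashrc
After adding it, close and reopen the terminal, or log in again, for the change to take effect.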
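If you would rather not touch the shell configuration, the same setting can be applied when launching a script from Python. This is only a sketch: env_with_lib_dir and run_script are hypothetical helper names, and xxx.py stands for your own script. The variable is passed to a child process because, as far as I know, the dynamic loader reads LD_LIBRARY_PATH when a process starts, so changing os.environ inside an already running interpreter would not affect library lookup. Unlike the export commands in this post, the sketch prepends to any existing value instead of replacing it.

```python
import os
import subprocess
import sys


def env_with_lib_dir(lib_dir, base_env=None):
    """Return a copy of base_env with lib_dir prepended to LD_LIBRARY_PATH."""
    env = dict(os.environ if base_env is None else base_env)
    lib_dir = os.path.expanduser(lib_dir)  # turn a leading ~ into the home directory
    existing = env.get("LD_LIBRARY_PATH", "")
    parts = [p for p in existing.split(":") if p]
    if lib_dir not in parts:
        parts.insert(0, lib_dir)  # keep earlier entries, search lib_dir first
    env["LD_LIBRARY_PATH"] = ":".join(parts)
    return env


def run_script(script, lib_dir):
    """Run script in a child Python process that sees the extended path."""
    return subprocess.run([sys.executable, script], env=env_with_lib_dir(lib_dir))
```

For example, run_script("xxx.py", "~/.conda/envs/paddle/lib") would correspond to the temporary fix.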
Run the check again and the error is gone.
(paddle) ubuntu@ThinkStation:~$ python -c "import paddle; paddle.utils.run_check()"
Running verify PaddlePaddle program ...
W0322 11:10:42.037217 309060 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 11.7, Runtime API Version: 11.6
W0322 11:10:42.042984 309060 gpu_resources.cc:91] device: 0, cuDNN Version: 8.4.
PaddlePaddle works well on 1 GPU.
W0322 11:10:47.978128 309060 fuse_all_reduce_op_pass.cc:79] Find all_reduce operators: 2. To make the speed faster, some all_reduce ops are fused during training, after fusion, the number of all_reduce ops is 2.
PaddlePaddle works well on 2 GPUs.
PaddlePaddle is installed successfully! Let's start deep learning with PaddlePaddle now.
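As a quick check that does not involve Paddle at all, you can ask the dynamic loader directly whether it can open libcudnn.so. This is a rough sketch using ctypes from the standard library. The helper name can_load is made up for illustration, and the result only says whether the file can be opened, not whether its version matches the installed Paddle.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can open library_name."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        # Raised when the shared object cannot be found or opened
        return False
    return True


if __name__ == "__main__":
    print("libcudnn.so loadable:", can_load("libcudnn.so"))
```

Run it in a fresh terminal with the environment activated. If it prints False, LD_LIBRARY_PATH most likely does not yet include the directory that holds libcudnn.so.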