The full error output is as follows:
Could not load library libcudnn_ops_infer.so.8. Error: libcudnn_ops_infer.so.8: cannot open shared object file: No such file or directory
--------------------------------------
C++ Traceback (most recent call last):
--------------------------------------
0 paddle::pybind::eager_api_gaussian(_object*, _object*, _object*)
1 gaussian_ad_func(paddle::experimental::IntArrayBase<paddle::Tensor>, float, float, int, phi::DataType, phi::Place)
2 paddle::experimental::gaussian(paddle::experimental::IntArrayBase<paddle::Tensor> const&, float, float, int, phi::DataType, phi::Place const&)
3 paddle::experimental::GetDeviceContextByBackend(paddle::experimental::Backend)
4 paddle::experimental::DeviceContextPool::Get(phi::Place const&)
5 phi::DeviceContextPool::Get(phi::Place const&)
6 std::__future_base::_Deferred_state<std::thread::_Invoker<std::tuple<std::unique_ptr<phi::DeviceContext, std::default_delete<phi::DeviceContext> > (*)(phi::Place const&, bool, int), phi::Place, bool, int> >, std::unique_ptr<phi::DeviceContext, std::default_delete<phi::DeviceContext> > >::_M_complete_async()
7 std::__future_base::_State_baseV2::_M_do_set(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*)
8 std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<std::unique_ptr<phi::DeviceContext, std::default_delete<phi::DeviceContext> > >, std::__future_base::_Result_base::_Deleter>, std::thread::_Invoker<std::tuple<std::unique_ptr<phi::DeviceContext, std::default_delete<phi::DeviceContext> > (*)(phi::Place const&, bool, int), phi::Place, bool, int> >, std::unique_ptr<phi::DeviceContext, std::default_delete<phi::DeviceContext> > > >::_M_invoke(std::_Any_data const&)
9 std::unique_ptr<phi::DeviceContext, std::default_delete<phi::DeviceContext> > paddle::platform::CreateDeviceContext<phi::GPUContext>(phi::Place const&, bool, int)
10 std::enable_if<std::is_same<phi::GPUContext, phi::GPUContext>::value, phi::GPUContext*>::type paddle::platform::ConstructDevCtx<phi::GPUContext>(phi::Place const&, int)
11 phi::GPUContext::GPUContext(phi::GPUPlace const&, bool, int)
12 phi::InitGpuProperties(phi::Place, int*, int*, int*, int*, int*, int*, std::array<int, 3ul>*)
13 cudnnGetVersion
----------------------
Error Message Summary:
----------------------
FatalError: `Process abort signal` is detected by the operating system.
[TimeInfo: *** Aborted at 1715825536 (unix time) try "date -d @1715825536" if you are using GNU date ***]
[SignalInfo: *** SIGABRT (@0x3e800007280) received by PID 29312 (TID 0x7f09e90703c0) from PID 29312 ***]
LAUNCH INFO 2024-05-16 10:12:17,545 Exit code -6
Solution:
1. First locate libcudnn_ops_infer.so.8 by running the following command:
sudo find / -name libcudnn_ops_infer.so.8
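If your virtual environment contains libcudnn_ops_infer.so.8, it should be under a path like /home/lab/anaconda3/envs/xxx/lib/python3.8/site-packages/nvidia/cudnn/lib.
2. Once you have found it, add its directory to the LD_LIBRARY_PATH environment variable by running the following command: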
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/lab/anaconda3/envs/xxx/lib/python3.8/site-packages/nvidia/cudnn/lib
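Replace /home/lab/anaconda3/envs/xxx/lib/python3.8/site-packages/nvidia/cudnn/lib with the path found on your own machine.
Then rerun the program and the problem should be resolved. Note that export only affects the current shell session; to make the setting persistent, add the export line to your shell startup file (for example ~/.bashrc).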
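The two steps above can also be combined into a small script. What follows is only a minimal sketch, not an official fix: the helper names find_lib_dir and add_lib_dir are made up for illustration, and the search root in the usage comment ($HOME/anaconda3/envs) is an assumption that you should change to match your own setup.

```shell
#!/usr/bin/env bash
# Sketch: locate a shared library and append its directory to LD_LIBRARY_PATH.
# find_lib_dir and add_lib_dir are hypothetical helper names, not part of any tool.

# Print the directory containing the first file named $1 under search root $2.
# Returns non-zero if nothing is found.
find_lib_dir() {
    local name="$1" root="$2" hit
    hit="$(find "$root" -name "$name" 2>/dev/null | head -n 1)"
    [ -n "$hit" ] || return 1
    dirname "$hit"
}

# Append directory $1 to LD_LIBRARY_PATH unless it is already listed there.
add_lib_dir() {
    local dir="$1"
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$dir:"*) ;;  # already present, nothing to do
        *) export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$dir" ;;
    esac
}

# Example usage (assumed search root; adjust to your environment):
#   if dir="$(find_lib_dir libcudnn_ops_infer.so.8 "$HOME/anaconda3/envs")"; then
#       add_lib_dir "$dir"
#   fi
```

Load the file with source (for example `source fix_cudnn_path.sh`, a hypothetical file name) rather than executing it, because a variable exported in a child process does not reach the shell from which you launch the program.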