PyTorch: terminate called after throwing an instance of 'c10::HIPError'

农民小飞侠

Article link: https://blog.csdn.net/w5688414/article/details/124572084

While running a PPO program today, I hit the following error:

terminate called after throwing an instance of 'c10::HIPError'
  what():  HIP error: hipErrorNoDevice
HIP kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing HIP_LAUNCH_BLOCKING=1.
Exception raised from deviceCount at /pytorch/aten/src/ATen/hip/impl/HIPGuardImplMasqueradingAsCUDA.h:102 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x42 (0x7f839e38ec72 in /root/anaconda3/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x4b3eaa (0x7f82bb4c2eaa in /root/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch_hip.so)
frame #2: torch::autograd::Engine::start_device_threads() + 0x21a (0x7f83082beafa in /root/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #3: <unknown function> + 0xf907 (0x7f83c61fe907 in /lib/x86_64-linux-gnu/libpthread.so.0)
......
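
The log itself suggests passing HIP_LAUNCH_BLOCKING=1 for debugging. Below is a minimal sketch of one way to do that from Python: start the program in a child process whose environment has the variable set, so it is in place before the GPU runtime initializes. The helper name `blocking_env` and the stand-in child command are hypothetical and not part of the original post. For a no-device error like this one the flag may not add much, since the failure happens when counting devices rather than in a kernel launch.

```python
import os
import subprocess
import sys


def blocking_env(base=None):
    """Return a copy of the environment with HIP_LAUNCH_BLOCKING=1 set."""
    env = dict(os.environ if base is None else base)
    env["HIP_LAUNCH_BLOCKING"] = "1"
    return env


# Stand-in child process (hypothetical); replace the -c snippet with the
# real training script, e.g. [sys.executable, "your_script.py"].
child = [sys.executable, "-c", "import os; print(os.environ['HIP_LAUNCH_BLOCKING'])"]
result = subprocess.run(child, env=blocking_env(), capture_output=True, text=True)
print(result.stdout.strip())  # prints 1
```

Building a copy of the environment leaves the parent process untouched, which keeps the debugging flag from leaking into other runs.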

My torch version is:

torch                              1.11.0+rocm4.5.2
cuda 10.2
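
The `+rocm4.5.2` suffix in the version string above suggests the installed wheel is a ROCm (HIP) build, while the machine is set up for CUDA 10.2. If no AMD GPU is present, that mismatch would plausibly explain `hipErrorNoDevice`, though the post does not confirm it. The sketch below shows one way to read the backend from a version string; `classify_torch_build` and `report_installed_torch` are hypothetical helpers written for this illustration, and the suffix patterns they recognize (`rocm`, `cu`, `cpu`) are an assumption about how wheels are usually tagged.

```python
import importlib.util
import re


def classify_torch_build(version):
    """Guess the backend a wheel targets from its local version tag."""
    # Text after '+' is the local version label, e.g. 'rocm4.5.2' or 'cu102'.
    _, _, local = version.partition("+")
    match = re.fullmatch(r"(rocm|cu|cpu)([\d.]*)", local)
    if match is None:
        return ("unknown", "")
    kind, number = match.groups()
    backend = {"rocm": "rocm", "cu": "cuda", "cpu": "cpu"}[kind]
    return (backend, number)


def report_installed_torch():
    """Print build details of the installed torch, if it can be imported."""
    if importlib.util.find_spec("torch") is None:
        print("torch is not installed in this environment")
        return
    import torch
    print(torch.__version__, classify_torch_build(torch.__version__))
    # getattr guards against older releases that lack these attributes.
    print("hip:", getattr(torch.version, "hip", None))
    print("cuda:", getattr(torch.version, "cuda", None))
    print("GPU visible:", torch.cuda.is_available())


# Version string taken from this post.
print(classify_torch_build("1.11.0+rocm4.5.2"))  # ('rocm', '4.5.2')

if __name__ == "__main__":
    report_installed_torch()
```

On a ROCm build `torch.version.hip` is a version string and on a CUDA build it is `None`, so comparing the two attributes is a quick way to see which backend the installed package was compiled for.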

Solution

Roll torch back to version 1.5:

pip install torch==1.5

After that, the program ran.
