Running the program
python3 mynet.py
Program output
net: torch.Size([8, 20, 128, 128])
Segmentation fault
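A bare "Segmentation fault" gives no hint about where the crash happened. As a first step, Python's standard faulthandler module can print the Python-level traceback when the process receives a fatal signal. The following is a minimal sketch, assuming the lines are placed at the very top of mynet.py (the script itself is not shown in this post, so the placement is an assumption):

```python
import faulthandler

# Ask the interpreter to dump the Python traceback of all threads to
# stderr if the process receives a fatal signal such as SIGSEGV.
faulthandler.enable()

# ... the rest of the script (imports, model definition, forward pass)
# would follow here.
```

The same behaviour should be available without editing the script by running python3 -X faulthandler mynet.py. This only shows the Python stack; to see which native library the crash is in, GDB is still needed.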
Debugging with GDB
1 Run the command
$ gdb python3
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.5) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from python3...done.
(gdb) r mynet.py
.....
[Thread 0x7fffc3b4a700 (LWP 32239) exited]
[Thread 0x7fffbf349700 (LWP 32238) exited]
[Thread 0x7fffbcb48700 (LWP 32237) exited]
[New Thread 0x7ffe2a0c2700 (LWP 32290)]
[New Thread 0x7ffe2c8c3700 (LWP 32299)]
[New Thread 0x7ffe2d0c4700 (LWP 32300)]
net: torch.Size([8, 20, 128, 128])
Thread 1 "python3" received signal SIGSEGV, Segmentation fault.
0x00007fffee906cd2 in ?? () from /home/anaconda3/envs/pytorch1.7/lib/python3.7/site-packages/torch/lib/../../../../libcudart.so.11.0
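3 Root cause analysis
The key line is this one: 0x00007fffee906cd2 in ?? () from /home/anaconda3/envs/pytorch1.7/lib/python3.7/site-packages/torch/lib/../../../../libcudart.so.11.0
The crash happens inside libcudart.so.11.0, the CUDA 11.0 runtime library that ships with this PyTorch installation.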
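To make the reading of that frame line concrete, here is a small sketch that pulls the shared library name and the CUDA runtime version out of it. The helper names crashing_library and cuda_runtime_version are hypothetical, and the regular expressions assume the "... from <path>" layout seen in the GDB output above:

```python
import re
from typing import Optional

# Frame line copied from the GDB session above.
FRAME = (
    "0x00007fffee906cd2 in ?? () from /home/anaconda3/envs/pytorch1.7/lib/"
    "python3.7/site-packages/torch/lib/../../../../libcudart.so.11.0"
)


def crashing_library(frame_line: str) -> Optional[str]:
    """Return the file name of the shared object named in a GDB frame line."""
    match = re.search(r"\bfrom\s+(\S+)\s*$", frame_line)
    if match is None:
        return None
    # Keep only the last path component.
    return match.group(1).rsplit("/", 1)[-1]


def cuda_runtime_version(library_name: str) -> Optional[str]:
    """Return the version suffix of a libcudart file name, if it is one."""
    match = re.fullmatch(r"libcudart\.so\.(\d+(?:\.\d+)*)", library_name)
    return match.group(1) if match else None


library = crashing_library(FRAME)
if library is not None:
    print(library)
    print(cuda_runtime_version(library))
```

For the frame above this yields libcudart.so.11.0 and 11.0, which points at the CUDA 11.0 runtime rather than at the network code.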
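This machine has several CUDA versions installed (CUDA 10.0 and CUDA 11.0). It turned out that the GPU driver supports CUDA 10.0, while the PyTorch package had been installed as a CUDA 11.0 build, and this mismatch caused the crash. The fix is to reinstall PyTorch and its dependencies so that the versions match.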
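The mismatch can be expressed as a simple version comparison. This is a rough sketch, assuming the usual rule that the CUDA runtime a package was built against must not be newer than the highest CUDA version the installed driver supports. The function names are hypothetical, and the check deliberately ignores NVIDIA's forward-compatibility packages and minor-version compatibility:

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as '11.0' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def runtime_supported(driver_cuda: str, runtime_cuda: str) -> bool:
    """Return True if the runtime is not newer than what the driver supports.

    Only the major and minor components are compared.
    """
    return parse_version(runtime_cuda)[:2] <= parse_version(driver_cuda)[:2]


# The situation described in this post: driver at 10.0, runtime at 11.0.
print(runtime_supported("10.0", "11.0"))  # False
# The opposite case: an older runtime on a newer driver.
print(runtime_supported("11.0", "10.0"))  # True
```

Under this assumption, a CUDA 11.0 build of PyTorch on a driver limited to CUDA 10.0 is exactly the unsupported combination.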
Check the installed CUDA toolkit version and the packages in the conda environment:
nvcc -V
conda list
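The view from inside Python can be checked as well. The sketch below prints the CUDA version that the installed PyTorch build reports; the function name pytorch_cuda_report is hypothetical. Note that nvcc -V reports the toolkit's compiler version, whereas the driver version and the highest CUDA version it supports are shown by nvidia-smi. On a badly mismatched setup the torch.cuda.is_available() call might itself fail, so treat this as a diagnostic aid rather than a guaranteed check:

```python
import importlib.util


def pytorch_cuda_report() -> dict:
    """Collect the versions PyTorch reports, or note that it is missing."""
    if importlib.util.find_spec("torch") is None:
        # PyTorch is not installed in this environment.
        return {"torch": None}
    import torch

    return {
        "torch": torch.__version__,
        # CUDA version the PyTorch binaries were built with;
        # None for CPU-only builds.
        "built_with_cuda": torch.version.cuda,
        # Whether PyTorch can actually use a GPU with the current driver.
        "cuda_available": torch.cuda.is_available(),
    }


for key, value in pytorch_cuda_report().items():
    print(f"{key}: {value}")
```

If built_with_cuda is newer than the CUDA version listed by nvidia-smi, that would match the situation described here.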
After reinstalling the packages
python3 mynet.py
Problem solved!