Errors you may run into with MPI

1. p1_xxxxx:  p4_error: interrupt SIGSEGV: 11

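This error usually means that one of the processes hit a segmentation fault.
Mistakes in my own code that have caused it:
1. Allocating memory for a pointer in only one process and not in the others, so the broadcast fails on the processes that have no buffer to receive into.
2. Connecting to the MySQL database in only one process, but closing the connection in all processes.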

Someone online explained it well:
"There are 2 things to check.
  * Run one of the test programs like pi3.f or cpi.c to see whether your cluster's OK.
  * If it is, the fault is in your code. See if you're exceeding array bounds or accessing memory which you haven't allocated. There's a SIGSEGV error - that's a segmentation violation. That might explain stuff like
                bm_list_21829:  p4_error: interrupt SIGINT: 2
Once you have a seg. violation, all the 4 processors are sent a signal to interrupt the process (SIGINT). Signals are defined in /usr/include/sys/signal.h (at least on the SGIs; might be different on other systems)."

2. p1_10401:  p4_error: : 14
1 - MPI_BCAST : Message truncated
[1]  Aborting program !
[1] Aborting program!

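This one is also an MPI_Bcast problem: the receiving processes pass a buffer (and count) that is smaller than the message the root sends. Allocate a large enough buffer before calling MPI_Bcast and the message will no longer be truncated.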

3. p4_error: alloc_p4_msg failed:

p0_6773: (7.828703) xx_shmalloc: returning NULL; requested 1048616 bytes
p0_6773: (7.828762) p4_shmalloc returning NULL; request = 1048616 bytes
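Not enough memory is available for the allocation (p4_shmalloc returned NULL). Set the environment variable P4_GLOBMEMSIZE (in bytes) to increase the memory available to the program: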
export P4_GLOBMEMSIZE=32000000 (for bash users) 
setenv P4_GLOBMEMSIZE 32000000 (for csh or tcsh users)
 
4. libcprts.so.5: cannot open shared object file: No such file or directory
 
/home/jbrandt/tests/test.exe: error while loading shared libraries: libcprts.so.5: cannot open shared object file: No such file or directory
p0_792: p4_error: Child process exited while making connection to remote process on compute-0-0.local: 0
/opt/mpich/intel/bin/mpirun: line 1: 792 Broken pipe /home/jbrandt/tests/test.exe -p4pg /home/jbrandt/tests/PI646 -p4wd /home/jbrandt/tes
 
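The executable was linked dynamically, and the shared library libcprts.so.5 could not be found when it was started on the compute node. Recompiling with -static so that the program is linked statically fixed it.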
 