While debugging my first MPI program, a call to the Send & Recv functions failed with the following error:
$ mpirun -n 4 -f /home/mpi_config_file ./mpi_SendRecv
Abort(702667023) on node 2 (rank 2 in comm 0): Fatal error in PMPI_Send: Other MPI error, error stack:
PMPI_Send(157).......: MPI_Send(buf=0x7fff9fa6ac84, count=1, MPI_INT, dest=3, tag=0, MPI_COMM_WORLD) failed
MPID_Send(467).......:
MPIDI_send_unsafe(39):
(unknown)(): Other MPI error
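The cause was that the firewall on the client node was still running. The firewall has to be turned off on every node in the cluster before the nodes can communicate with each other.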
1. Disable the firewall
[root@centos7 ~]# systemctl stop firewalld.service
[root@centos7 ~]# systemctl disable firewalld.service
[root@centos7 ~]# firewall-cmd --state
not running
2. Disable SELinux
Temporarily:
[root@centos7 ~]# setenforce 0
Permanently:
[root@centos7 ~]# sed -i 's/^ *SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
The permanent setting takes effect after a reboot:
[root@centos7 ~]# sestatus
SELinux status: disabled
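To see what the sed substitution above does without touching the real system, the sketch below applies the same expression to a scratch file. The file contents are a hypothetical stand-in for /etc/selinux/config, and the result is captured in a variable instead of being edited in place with -i.

```shell
#!/bin/sh
# Scratch file standing in for /etc/selinux/config (contents are hypothetical).
tmp_cfg=$(mktemp)
printf '%s\n' '# sample config' 'SELINUX=enforcing' 'SELINUXTYPE=targeted' > "$tmp_cfg"

# Same substitution as in the command above, but printed instead of edited in place.
new_cfg=$(sed 's/^ *SELINUX=enforcing/SELINUX=disabled/g' "$tmp_cfg")

# Pick out the SELINUX= line to confirm the change.
selinux_line=$(printf '%s\n' "$new_cfg" | grep '^SELINUX=')
echo "$selinux_line"

rm -f "$tmp_cfg"
```

Note that the pattern only matches a line currently set to enforcing. A config file already set to permissive would be left unchanged by this command.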
Remember: the commands above must be run on every node!
Run the program again and it now completes normally:
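Repeating the commands by hand on each machine is easy to get wrong, so a small loop over ssh can help. This is only a sketch under assumptions: the node names, root ssh access to each node, and the DRY_RUN variable and run_on_all helper are all hypothetical and not part of the original setup. By default it only prints what it would run.

```shell
#!/bin/sh
# Hypothetical node list; replace with the hosts from your own machine file.
NODES="node1 node2 node3 node4"
# DRY_RUN=1 (the default here) only prints the ssh commands instead of running them.
DRY_RUN=${DRY_RUN:-1}

# Run the given command on every node in NODES (assumes root ssh access).
run_on_all() {
    for node in $NODES; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "ssh root@$node $*"
        else
            ssh "root@$node" "$@" </dev/null
        fi
    done
}

# The same commands as in steps 1 and 2.
run_on_all systemctl stop firewalld.service
run_on_all systemctl disable firewalld.service
run_on_all setenforce 0
```

Once the printed commands look right, running the script with DRY_RUN=0 executes them on each node in turn.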
[centos@node1 mpi_share]$ mpirun -n 4 -f /home/mpi_config_file ./mpi_SendRecv
Process 1 received number -1 from process 0
Process 3 received number 100 from process 2