While running a multi-node job with OpenMPI, I ran into the following error:
ORTE_ERROR_LOG: Data unpack had inadequate space in file base/plm_base_launch_support.c at line 1200
ORTE_ERROR_LOG: Data unpack had inadequate space in file base/plm_base_launch_support.c at line 1200
--------------------------------------------------------------------------
ORTE was unable to reliably start one or more daemons.
This usually is caused by:
* not finding the required libraries and/or binaries on
  one or more nodes. Please check your PATH and LD_LIBRARY_PATH
  settings, or configure OMPI with --enable-orterun-prefix-by-default
* lack of authority to execute on one or more specified nodes.
  Please verify your allocation and authorities.
* the inability to write startup files into /tmp (--tmpdir/orte_tmpdir_base).
  Please check with your sys admin to determine the correct location to use.
* compilation of the orted with dynamic libraries when static are required
  (e.g., on Cray). Please check your configure cmd line and consider using
  one of the contrib/platform definitions for your system type.
* an inability to create a connection back to mpirun due to a
  lack of common network interfaces and/or no route found between
  them. Please check network connectivity (including firewalls
  and network routing requirements).
--------------------------------------------------------------------------
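The cause was that I had previously installed an extra copy of openmpi inside conda, and it conflicted with the openmpi that was already installed on the system.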
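One way to check for this kind of conflict is to list every copy of the launcher that is reachable through PATH. The following is only a minimal sketch: the helper name find_all_on_path is made up for this illustration, and it assumes the conflict shows up as more than one mpirun (or orted) on PATH, for example one under the conda environment and one under the system prefix. Run it on each node involved in the job.

```python
import os


def find_all_on_path(name, path=None):
    """Return every executable called `name` on PATH, in search order.

    Hypothetical helper for this note; not part of OpenMPI or conda.
    """
    if path is None:
        path = os.environ.get("PATH", "")
    hits = []
    seen = set()
    for directory in path.split(os.pathsep):
        if not directory:
            continue
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            # Skip entries that resolve to a file we have already reported
            # (symlinks or a directory listed twice in PATH).
            real = os.path.realpath(candidate)
            if real not in seen:
                seen.add(real)
                hits.append(candidate)
    return hits


if __name__ == "__main__":
    for tool in ("mpirun", "orted"):
        found = find_all_on_path(tool)
        print(tool, "->", found if found else "not found on PATH")
        if len(found) > 1:
            print("  more than one copy found; the first one listed is the one the shell runs")
```

If the first mpirun listed lives under a conda directory while the other nodes use the system installation, the two sides are probably not running the same OpenMPI build, which would be consistent with the error above.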
The fix is to uninstall the openmpi that was installed in the conda environment:
conda uninstall openmpi
After that, the problem was gone.