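After installing the pacemaker RPM package, the service failed to start. The cause turned out to be related to how shared libraries are loaded. Details follow.
Problem
I built RPM packages for pacemaker 1.1.15 and installed them on another machine, where pacemaker then failed to start.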
[root@srdsdevapp73 ~]# service pacemaker start
Starting Pacemaker Cluster Manager [FAILED]
Environment
CentOS 6.3 64bit
Cause
strace shows that pacemaker fails to start because the shared library libcoroipcc.so.4 cannot be loaded.
[root@srdsdevapp73 ~]# strace -f service pacemaker start
...
[pid 19960] writev(2, [{"pacemakerd", 10}, {": ", 2}, {"error while loading shared libra"..., 36}, {": ", 2}, {"libcoroipcc.so.4", 16}, {": ", 2}, {"cannot open shared object file", 30}, {":
...
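The full strace output is long, so it can help to keep only the failed attempts to open shared objects. The following is a minimal sketch, not part of the original investigation: the helper name failed_lib_opens is hypothetical, and it assumes strace prints open()/openat() calls in its usual one-line format.

```shell
# Keep only the lines where an open()/openat() of a shared object failed
# with ENOENT. Reads strace output on stdin.
failed_lib_opens() {
    grep -E 'open(at)?\(.*\.so[^"]*".*= -1 ENOENT'
}
```

Because strace writes to stderr, a typical call would look like "strace -f -e trace=file service pacemaker start 2>&1 | failed_lib_opens".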
Checking pacemakerd with ldd then shows that three libraries in total cannot be found.
[root@srdsdevapp73 ~]# ldd /usr/sbin/pacemakerd
linux-vdso.so.1 => (0x00007fffc4c9f000)
libcrmcluster.so.4 => /usr/lib/libcrmcluster.so.4 (0x0000003cbac00000)
libstonithd.so.2 => /usr/lib/libstonithd.so.2 (0x0000003cba400000)
libcrmcommon.so.3 => /usr/lib/libcrmcommon.so.3 (0x0000003cb4c00000)
libm.so.6 => /lib64/libm.so.6 (0x0000003cb3c00000)
libcpg.so.4 => /usr/lib64/libcpg.so.4 (0x00007f3f72199000)
libcfg.so.6 => /usr/lib64/libcfg.so.6 (0x00007f3f71f95000)
libcmap.so.4 => /usr/lib64/libcmap.so.4 (0x00007f3f71d8f000)
libquorum.so.5 => /usr/lib64/libquorum.so.5 (0x00007f3f71b8b000)
libgnutls.so.26 => /usr/lib64/libgnutls.so.26 (0x0000003cb8800000)
libcorosync_common.so.4 => /usr/lib64/libcorosync_common.so.4 (0x00007f3f71988000)
libplumb.so.2 => /usr/lib64/libplumb.so.2 (0x00007f3f71754000)
libpils.so.2 => /usr/lib64/libpils.so.2 (0x00007f3f7154b000)
libqb.so.0 => /usr/lib64/libqb.so.0 (0x00007f3f712e6000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003cb2c00000)
libbz2.so.1 => /lib64/libbz2.so.1 (0x0000003cb7000000)
libxslt.so.1 => /usr/lib64/libxslt.so.1 (0x0000003cb4800000)
libxml2.so.2 => /usr/lib64/libxml2.so.2 (0x0000003cb6000000)
libc.so.6 => /lib64/libc.so.6 (0x0000003cb2400000)
libuuid.so.1 => /lib64/libuuid.so.1 (0x0000003cb5000000)
libpam.so.0 => /lib64/libpam.so.0 (0x0000003cb6c00000)
librt.so.1 => /lib64/librt.so.1 (0x0000003cb3000000)
libdl.so.2 => /lib64/libdl.so.2 (0x0000003cb2800000)
libglib-2.0.so.0 => /lib64/libglib-2.0.so.0 (0x0000003cb3800000)
libltdl.so.7 => /usr/lib64/libltdl.so.7 (0x0000003cb8400000)
libcoroipcc.so.4 => not found
libcfg.so.4 => not found
libconfdb.so.4 => not found
libtasn1.so.3 => /usr/lib64/libtasn1.so.3 (0x0000003cb7800000)
libz.so.1 => /lib64/libz.so.1 (0x0000003cb3400000)
libgcrypt.so.11 => /lib64/libgcrypt.so.11 (0x0000003cb7400000)
/lib64/ld-linux-x86-64.so.2 (0x0000003cb2000000)
libaudit.so.1 => /lib64/libaudit.so.1 (0x0000003cb6400000)
libcrypt.so.1 => /lib64/libcrypt.so.1 (0x0000003cb5c00000)
libgpg-error.so.0 => /lib64/libgpg-error.so.0 (0x0000003cb6800000)
libfreebl3.so => /lib64/libfreebl3.so (0x0000003cb5800000)
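With this many dependencies, the unresolved ones are easy to miss. A small filter can pull them out of the ldd output. This is only a sketch; the function name missing_libs is hypothetical, and it assumes the "name => not found" layout shown in the ldd output here.

```shell
# Print the soname of every dependency that ldd reports as "not found".
# Reads ldd output on stdin, so it also works on saved output.
missing_libs() {
    awk '$2 == "=>" && $3 == "not" && $4 == "found" { print $1 }'
}
```

Usage would be "ldd /usr/sbin/pacemakerd | missing_libs", which for the output here should list libcoroipcc.so.4, libcfg.so.4 and libconfdb.so.4.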
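In the first ldd output, the path "/usr/lib/libcrmcluster.so.4" looks odd. On inspection the file turned out to be the wrong one: it belongs to a previously installed version (I no longer remember how it was installed). The correct location is "/usr/lib64/libcrmcluster.so.4". The three missing libraries were evidently dependencies of those stale libraries, since they no longer appear in the ldd output once the stale files are removed. After deleting the old pacemaker libraries, everything works.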
[root@srdsdevapp73 ~]# rm -f /usr/lib/libcrm*
[root@srdsdevapp73 ~]# rm -f /usr/lib/libstonithd.*
[root@srdsdevapp73 ~]# ldd /usr/sbin/pacemakerd
linux-vdso.so.1 => (0x00007fff9a3ff000)
libcrmcluster.so.4 => /usr/lib64/libcrmcluster.so.4 (0x00007f849a1fc000)
libstonithd.so.2 => /usr/lib64/libstonithd.so.2 (0x00007f8499fea000)
libcrmcommon.so.3 => /usr/lib64/libcrmcommon.so.3 (0x00007f8499d93000)
libm.so.6 => /lib64/libm.so.6 (0x0000003cb3c00000)
libcpg.so.4 => /usr/lib64/libcpg.so.4 (0x00007f8499b8c000)
libcfg.so.6 => /usr/lib64/libcfg.so.6 (0x00007f8499988000)
libcmap.so.4 => /usr/lib64/libcmap.so.4 (0x00007f8499782000)
libquorum.so.5 => /usr/lib64/libquorum.so.5 (0x00007f849957e000)
libgnutls.so.26 => /usr/lib64/libgnutls.so.26 (0x0000003cb8800000)
libcorosync_common.so.4 => /usr/lib64/libcorosync_common.so.4 (0x00007f849937b000)
libplumb.so.2 => /usr/lib64/libplumb.so.2 (0x00007f8499147000)
libpils.so.2 => /usr/lib64/libpils.so.2 (0x00007f8498f3e000)
libqb.so.0 => /usr/lib64/libqb.so.0 (0x00007f8498cd9000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003cb2c00000)
libbz2.so.1 => /lib64/libbz2.so.1 (0x0000003cb7000000)
libxslt.so.1 => /usr/lib64/libxslt.so.1 (0x0000003cb4800000)
libxml2.so.2 => /usr/lib64/libxml2.so.2 (0x0000003cb6000000)
libc.so.6 => /lib64/libc.so.6 (0x0000003cb2400000)
libuuid.so.1 => /lib64/libuuid.so.1 (0x0000003cb5000000)
libpam.so.0 => /lib64/libpam.so.0 (0x0000003cb6c00000)
librt.so.1 => /lib64/librt.so.1 (0x0000003cb3000000)
libdl.so.2 => /lib64/libdl.so.2 (0x0000003cb2800000)
libglib-2.0.so.0 => /lib64/libglib-2.0.so.0 (0x0000003cb3800000)
libltdl.so.7 => /usr/lib64/libltdl.so.7 (0x0000003cb8400000)
libtasn1.so.3 => /usr/lib64/libtasn1.so.3 (0x0000003cb7800000)
libz.so.1 => /lib64/libz.so.1 (0x0000003cb3400000)
libgcrypt.so.11 => /lib64/libgcrypt.so.11 (0x0000003cb7400000)
/lib64/ld-linux-x86-64.so.2 (0x0000003cb2000000)
libaudit.so.1 => /lib64/libaudit.so.1 (0x0000003cb6400000)
libcrypt.so.1 => /lib64/libcrypt.so.1 (0x0000003cb5c00000)
libgpg-error.so.0 => /lib64/libgpg-error.so.0 (0x0000003cb6800000)
libfreebl3.so => /lib64/libfreebl3.so (0x0000003cb5800000)
[root@srdsdevapp73 ~]# service pacemaker start
Starting Pacemaker Cluster Manager [ OK ]
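The telltale sign in this case was a 64-bit binary resolving some of its dependencies from /usr/lib instead of /usr/lib64. A check along the following lines can flag such entries. It is a sketch under that assumption only; the function name non_lib64_deps is hypothetical, and on other distributions /usr/lib may be a perfectly normal location.

```shell
# Print "soname path" for every dependency that ldd resolved from /lib or
# /usr/lib rather than /lib64 or /usr/lib64. Reads ldd output on stdin.
non_lib64_deps() {
    awk '$2 == "=>" && ($3 ~ /^\/lib\// || $3 ~ /^\/usr\/lib\//) { print $1, $3 }'
}
```

After "ldd /usr/sbin/pacemakerd | non_lib64_deps", running "rpm -qf" on each reported path shows whether an installed package owns the file. Files that no package owns are likely leftovers from a manual install, as the stale libraries were here.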
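Summary
When loading a shared library, the dynamic loader on Linux first consults /etc/ld.so.cache. For the default directories (those not configured in /etc/ld.so.conf), the search order on this system is as follows. If a library file with the same name sits in a directory earlier in the order, it is picked up first, which can cause shared library loading to fail.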
/lib
/usr/lib
/lib64
/usr/lib64
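To spot this kind of shadowing before it causes a failure, the loader cache can be checked for sonames registered under more than one path. The sketch below assumes the "name (flags) => path" layout printed by "ldconfig -p"; the function name duplicate_sonames is hypothetical. On multilib systems a 32-bit and a 64-bit copy of the same soname is normal, so the result is only a list of candidates to inspect.

```shell
# Read `ldconfig -p`-style lines on stdin and print every soname that is
# cached under more than one path, followed by those paths.
duplicate_sonames() {
    awk '$2 ~ /^\(/ && / => / {
        count[$1]++
        paths[$1] = paths[$1] " " $NF
    }
    END {
        for (name in count)
            if (count[name] > 1) print name ":" paths[name]
    }' | sort
}
```

Usage would be "ldconfig -p | duplicate_sonames". In the situation described here, libcrmcluster.so.4 would presumably have shown up with both its /usr/lib and /usr/lib64 paths.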