Symptom (Heartbeat kills itself some time after starting, and its processes disappear):
heartbeat[4864]: 2010/04/09_17:58:40 CRIT: Emergency Shutdown: Attempting to kill everything ourselves
heartbeat[4864]: 2010/04/09_17:58:40 info: killing HBREAD process 5764 with signal 9
heartbeat[4864]: 2010/04/09_17:58:40 info: killing HBWRITE process 5765 with signal 9
heartbeat[4864]: 2010/04/09_17:58:40 info: killing HBREAD process 5766 with signal 9
heartbeat[4864]: 2010/04/09_17:58:40 info: killing HBFIFO process 5757 with signal 9
heartbeat[4864]: 2010/04/09_17:58:40 info: killing HBWRITE process 5761 with signal 9
heartbeat[4864]: 2010/04/09_17:58:40 info: killing HBREAD process 5762 with signal 9
heartbeat[4864]: 2010/04/09_17:58:40 info: killing HBWRITE process 5763 with signal 9
The Heartbeat log shows the following errors:
heartbeat[7341]: 2010/04/09_20:49:45 debug: displaying uuid table
heartbeat[7341]: 2010/04/09_20:49:45 debug: uuid=f33f33a8-80ff-459a-9e56-5f706e2e0f9b, name=ha-04
heartbeat[7341]: 2010/04/09_20:49:47 WARN: nodename ha-03 uuid changed to ha-04
heartbeat[7341]: 2010/04/09_20:49:47 debug: displaying uuid table
heartbeat[7341]: 2010/04/09_20:49:47 debug: uuid=f33f33a8-80ff-459a-9e56-5f706e2e0f9b, name=ha-03
heartbeat[7341]: 2010/04/09_20:49:47 ERROR: should_drop_message: attempted replay attack [ha-04]? [gen = 1270099870, curgen = 1270099871]
heartbeat[7341]: 2010/04/09_20:49:47 ERROR: should_drop_message: attempted replay attack [ha-04]? [gen = 1270099870, curgen = 1270099871]
heartbeat[7341]: 2010/04/09_20:49:47 ERROR: should_drop_message: attempted replay attack [ha-04]? [gen = 1270099870, curgen = 1270099871]
heartbeat[7341]: 2010/04/09_20:49:47 WARN: nodename ha-04 uuid changed to ha-03
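The two message types in this log ("uuid changed to" warnings and "attempted replay attack" errors) can be counted with a quick scan of the log file. The following is a minimal sketch; the function name count_uuid_conflicts is hypothetical, and the log path must be passed in because it depends on the logfile setting in ha.cf.

```shell
#!/bin/sh
# Sketch: count log lines that suggest two nodes are using the same UUID.
# count_uuid_conflicts is a hypothetical helper name, not part of Heartbeat.
# $1 is the path to the Heartbeat log (wherever ha.cf sends it).
count_uuid_conflicts() {
    # grep -c prints the number of matching lines (0 when none match).
    grep -c -E 'uuid changed to|attempted replay attack' "$1"
}
```

For example, `count_uuid_conflicts /var/log/ha-log` (the path is an assumption) prints a non-zero count when the log contains messages like the ones above.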
Cause: the virtual machine was copied with VMware, so both nodes ended up with the same UUID.
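Fix: delete the UUID file on the copied node and restart Heartbeat, which should then generate a new UUID:
rm /var/lib/heartbeat/hb_uuid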
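The fix can be wrapped in a small function so that Heartbeat is stopped before the file is touched. This is only a sketch under stated assumptions: the init script path /etc/init.d/heartbeat and the function name reset_hb_uuid are not from the note above, and the old file is moved aside rather than deleted so it can be restored.

```shell
#!/bin/sh
# Sketch: reset the Heartbeat UUID on the copied node.
# HB_UUID_FILE is the path from the note; HB_INIT is an ASSUMED init script path.
HB_UUID_FILE="${HB_UUID_FILE:-/var/lib/heartbeat/hb_uuid}"
HB_INIT="${HB_INIT:-/etc/init.d/heartbeat}"

# reset_hb_uuid is a hypothetical helper name.
reset_hb_uuid() {
    # Stop Heartbeat so the file is not in use while it is moved.
    "$HB_INIT" stop || return 1
    # Keep a backup instead of removing the file outright.
    if [ -f "$HB_UUID_FILE" ]; then
        mv "$HB_UUID_FILE" "$HB_UUID_FILE.bak" || return 1
    fi
    # On start, Heartbeat is expected to create a fresh UUID file.
    "$HB_INIT" start
}
```

Run reset_hb_uuid as root on only one of the two nodes (the copy), so that the original node keeps its UUID.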