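On a two-node SUSE HA cluster that uses SBD fencing, both nodes were restarted after an unexpected power loss. The sbd service on one node failed to start, which in turn prevented the cluster service (pacemaker) from starting. The following errors were reported: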
# systemctl status pacemaker
○ pacemaker.service - Pacemaker High Availability Cluster Manager
Loaded: loaded (/usr/lib/systemd/system/pacemaker.service; enabled; vendor preset: disabled)
Active: inactive (dead)
Docs: man:pacemakerd
https://clusterlabs.org/pacemaker/doc/
May 20 17:29:31 hanadb01 systemd[1]: Dependency failed for Pacemaker High Availability Cluster Manager.
May 20 17:29:31 hanadb01 systemd[1]: pacemaker.service: Job pacemaker.service/start failed with result 'dependency'.
# systemctl status sbd
x sbd.service - Shared-storage based fencing daemon
Loaded: loaded (/usr/lib/systemd/system/sbd.service; enabled; vendor preset: disabled)
Active: activating (start) since Mon 2024-05-20 17:28:01 CST; 49s ago
Docs: man:sbd(8)
Cntrl PID: 1124 (sbd)
Tasks: 1
CGroup: /system.slice/sbd.service
└─ 1124 /usr/sbin/sbd -p /run/sbd.pid watch
May 20 17:28:50 hanadb01 sbd[1124]: error: valid_header: Header magic does not match.
May 20 17:28:50 hanadb01 sbd[1124]: error: header_get: header on device 3 is not valid.
May 20 17:28:50 hanadb01 sbd[1124]: error: valid_header: Header magic does not match.
May 20 17:28:50 hanadb01 sbd[1124]: error: header_get: header on device 3 is not valid.
May 20 17:28:50 hanadb01 sbd[1124]: error: valid_header: Header magic does not match.
May 20 17:28:50 hanadb01 sbd[1124]: error: header_get: header on device 3 is not valid.
May 20 17:28:50 hanadb01 sbd[1124]: error: valid_header: Header magic does not match.
May 20 17:28:50 hanadb01 sbd[1124]: error: header_get: header on device 3 is not valid.
May 20 17:28:50 hanadb01 sbd[1124]: error: valid_header: Header magic does not match.
May 20 17:28:50 hanadb01 sbd[1124]: error: header_get: header on device 3 is not valid.
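Reformatting the SBD partitions and then rebooting the node resolved the problem: the sbd service started normally, and the cluster service started along with it.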
1) Check which SBD partitions are currently configured:
# cat /etc/sysconfig/sbd
SBD_DEVICE="/dev/sdc;/dev/sdd"
2) Reformat the SBD partitions:
# sbd -d /dev/sdc -d /dev/sdd create
3) Reboot the node.
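
The device list passed to sbd in step 2 has to match the SBD_DEVICE value shown in step 1. As a minimal sketch, assuming SBD_DEVICE is a single semicolon-separated assignment and that device paths contain no spaces, the "-d" arguments could be derived from the configuration file instead of being typed by hand. The helper name sbd_device_args is hypothetical and is not part of sbd.

```shell
#!/bin/sh
# Sketch only: build the "-d <device>" arguments for sbd from SBD_DEVICE.
# sbd_device_args is a hypothetical helper name, not an sbd command.
sbd_device_args() {
    conf="${1:-/etc/sysconfig/sbd}"
    # Take the last uncommented SBD_DEVICE= assignment and drop double quotes.
    value=$(sed -n 's/^[[:space:]]*SBD_DEVICE=//p' "$conf" | tail -n 1 | tr -d '"')
    args=""
    old_ifs=$IFS
    IFS=';'                      # devices are separated by semicolons
    for dev in $value; do
        if [ -n "$dev" ]; then
            args="${args:+$args }-d $dev"
        fi
    done
    IFS=$old_ifs
    printf '%s\n' "$args"
}

# Possible usage on a cluster node, as root:
#   sbd $(sbd_device_args) dump     # inspect the header on each device
#   sbd $(sbd_device_args) create   # rewrite the headers, as in step 2
```

Running "dump" before "create" gives a way to confirm which device actually has the invalid header. Note that "create" reinitializes the header and message slots on every device listed, so it should only be run against the devices that are really meant for SBD.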