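Scenario: After the motherboard of Oracle database node 2 was replaced, the server's network interface names changed, and CRS failed to start once the server was powered on.
Error:
CRS-4640: Oracle High Availability Services is already active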
CRS-4000: Command Start failed, or completed with errors.
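Initial checks:
As the grid user (su - grid), run: kfod disk=all
Compared the disk lists on node 1 and node 2; both looked normal.
Pinged the heartbeat IPs; the two nodes could reach each other.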
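The disk comparison above can be sketched as a small shell helper. It assumes the kfod output of each node has already been saved to a text file; the function name and file paths are hypothetical, and the helper only reports lines that appear on one side, so judging whether a difference matters is left to the reader.

```shell
# Hypothetical helper: show lines that appear in only one of two saved
# "kfod disk=all" outputs. It makes no assumption about the kfod format;
# lines are compared as whole strings.
show_disk_list_diff() {
    node1_file="$1"
    node2_file="$2"
    echo "only in node 1:"
    # -F fixed strings, -x whole-line match, -v invert: lines of node 1
    # that do not occur anywhere in the node 2 file.
    grep -Fxv -f "$node2_file" "$node1_file"
    echo "only in node 2:"
    grep -Fxv -f "$node1_file" "$node2_file"
    return 0
}
```

For example, after saving the output on each node to /tmp/kfod_node1.txt and /tmp/kfod_node2.txt (example paths), call show_disk_list_diff /tmp/kfod_node1.txt /tmp/kfod_node2.txt.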
Log review:
CRS alert log: as the grid user (su - grid), cd $ORACLE_HOME/log/<hostname>/
2022-04-24 21:25:08.433:
[/oracle/grid_home/product/11.2.0/bin/cssdagent(15577)]CRS-5818:Aborted command 'start' for resource 'ora.cssd'. Details at (:CRSAGF00113:) {0:0:2} in /oracle/grid_home/product/11.2.0/log/ocrm2/agent/ohasd/oracssdagent_root//oracssdagent_root.log.
2022-04-24 21:25:08.433:
[cssd(15604)]CRS-1656:The CSS daemon is terminating due to a fatal error; Details at (:CSSSC00012:) in /oracle/grid_home/product/11.2.0/log/ocrm2/cssd/ocssd.log
Check /oracle/grid_home/product/11.2.0/log/ocrm2/agent/ohasd/oracssdagent_root//oracssdagent_root.log
and /oracle/grid_home/product/11.2.0/log/ocrm2/cssd/ocssd.log
Contents of /oracle/grid_home/product/11.2.0/log/ocrm2/cssd/ocssd.log:
2022-04-24 21:34:28.510: [ CSSD][1946154752]clssnmvDHBValidateNcopy: node 1, ocrm1, has a disk HB, but no network HB, DHB has rcfg 341498679, wrtcnt, 1041086062, LATS 314244, lastSeqNo 1041086057, uniqueness 1639190279, timestamp 1650807268/2015106330
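The log shows that node 2 sees the disk heartbeat of node 1 (ocrm1) but receives no network heartbeat from it, so the heartbeat network is not getting through.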
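To pick this message out of a long ocssd.log, a filter along the following lines may help. This is a minimal sketch: the function name is hypothetical, and the pattern is taken only from the single log line quoted above, so other releases may word the message differently.

```shell
# Hypothetical helper: list the distinct "node number / node name" pairs that
# CSSD reports as having a disk heartbeat but no network heartbeat.
# $1: path to ocssd.log
nodes_without_network_hb() {
    # Capture the node number and name that precede the message text,
    # then collapse repeated reports of the same node.
    sed -n 's/.*node \([0-9][0-9]*\), \([^,]*\), has a disk HB, but no network HB.*/\1 \2/p' "$1" | sort -u
}
```

Run against the log quoted above, it would be called as nodes_without_network_hb /oracle/grid_home/product/11.2.0/log/ocrm2/cssd/ocssd.log.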
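Resolution:
Because the interface names changed, the cluster treats the heartbeat network as unreachable.
Using node 1 as the reference, rename the interfaces on node 2 so that they match (use ifconfig to identify each interface by its MAC address).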
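As an alternative to reading ifconfig output by eye, the name-to-MAC mapping can be printed in one pass. The sketch below reads /sys/class/net rather than calling ifconfig, which is a substitution made here and not the method used in this note; the function name and the optional directory argument are hypothetical.

```shell
# Hypothetical helper: print "<interface> <mac>" for every network interface.
# $1 (optional): sysfs directory to scan, defaults to /sys/class/net.
list_interface_macs() {
    sysfs_net="${1:-/sys/class/net}"
    for dev in "$sysfs_net"/*; do
        # Skip entries that are not interfaces (no readable address file).
        [ -r "$dev/address" ] || continue
        printf '%s %s\n' "$(basename "$dev")" "$(cat "$dev/address")"
    done
}
```

Running list_interface_macs on both nodes gives two short lists that can be compared side by side before any rule is edited.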
1. Edit the interface names in /etc/udev/rules.d/70-persistent-net.rules. Before the change:
# PCI device 0x8086:0x1521 (igb)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="6c:92:bf:10:51:ff", ATTR{type}=="1", KERNEL=="eth*", NAME="eth1"
# PCI device 0x8086:0x1521 (igb)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="6c:92:bf:10:52:01", ATTR{type}=="1", KERNEL=="eth*", NAME="eth3"
# PCI device 0x8086:0x1521 (igb)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="6c:92:bf:11:2e:3d", ATTR{type}=="1", KERNEL=="eth*", NAME="eth8"
# PCI device 0x8086:0x1521 (igb)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="6c:92:bf:11:2e:3f", ATTR{type}=="1", KERNEL=="eth*", NAME="eth9"
After the change (the two rules for the old MAC addresses are commented out, and the new MAC addresses take over the names eth1 and eth3):
# PCI device 0x8086:0x1521 (igb)
#SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="6c:92:bf:10:51:ff", ATTR{type}=="1", KERNEL=="eth*", NAME="eth1"
# PCI device 0x8086:0x1521 (igb)
#SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="6c:92:bf:10:52:01", ATTR{type}=="1", KERNEL=="eth*", NAME="eth3"
# PCI device 0x8086:0x1521 (igb)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="6c:92:bf:11:2e:3d", ATTR{type}=="1", KERNEL=="eth*", NAME="eth1"
# PCI device 0x8086:0x1521 (igb)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="6c:92:bf:11:2e:3f", ATTR{type}=="1", KERNEL=="eth*", NAME="eth3"
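2. Run ethtool -i eth2 to find the name of the NIC driver module.
3. Run modprobe -r e1000 to unload the NIC module (substitute the driver name reported in step 2).
4. Run modprobe e1000 to load the NIC module again.
5. Rename the interface configuration files in /etc/sysconfig/network-scripts so that each one is named ifcfg-<interface name>.
Both the DEVICE value inside the file and the file name itself must be changed, for example:
ifcfg-eth9 -> ifcfg-eth3
6. Restart the network service with /etc/init.d/network restart (or reboot the server, in which case steps 2 to 4 can be skipped).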
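Steps 1 and 5 are manual edits; scripted, they might look like the sketch below. All three function names are hypothetical, the rule format is assumed to be the standard 70-persistent-net.rules layout, and the files should be backed up before trying anything like this on a real node.

```shell
# Hypothetical helpers for steps 1 and 5. Back up the files first.

# Comment out the active udev rule that matches the given MAC address.
# $1: rules file, $2: MAC address
comment_rule_for_mac() {
    rules_file="$1"; mac="$2"
    tmp_file=$(mktemp) || return 1
    sed "/^SUBSYSTEM.*ATTR{address}==\"$mac\"/ s/^/#/" "$rules_file" > "$tmp_file" &&
        cat "$tmp_file" > "$rules_file"
    rm -f "$tmp_file"
}

# Set NAME="<new name>" on the active udev rule that matches the given MAC.
# Rules already commented out (starting with #) are left alone.
# $1: rules file, $2: MAC address, $3: new interface name
set_rule_name() {
    rules_file="$1"; mac="$2"; new_name="$3"
    tmp_file=$(mktemp) || return 1
    sed "/^SUBSYSTEM.*ATTR{address}==\"$mac\"/ s/NAME=\"[^\"]*\"/NAME=\"$new_name\"/" \
        "$rules_file" > "$tmp_file" && cat "$tmp_file" > "$rules_file"
    rm -f "$tmp_file"
}

# Rename ifcfg-<old> to ifcfg-<new> and update its DEVICE line.
# Any HWADDR line is left untouched.
# $1: directory holding the ifcfg files, $2: old name, $3: new name
rename_ifcfg() {
    old_file="$1/ifcfg-$2"
    new_file="$1/ifcfg-$3"
    if [ ! -f "$old_file" ]; then
        echo "missing $old_file" >&2
        return 1
    fi
    if [ -e "$new_file" ]; then
        echo "$new_file already exists" >&2
        return 1
    fi
    sed "s/^DEVICE=.*/DEVICE=$3/" "$old_file" > "$new_file" && rm -f "$old_file"
}
```

With the values from this note, the calls would be comment_rule_for_mac for 6c:92:bf:10:51:ff and 6c:92:bf:10:52:01, set_rule_name for 6c:92:bf:11:2e:3d (eth1) and 6c:92:bf:11:2e:3f (eth3), and then rename_ifcfg /etc/sysconfig/network-scripts eth9 eth3. Commenting out the old rules first avoids two active rules claiming the same name.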