An account of an HP host shutdown that ended in a system crash

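     Today, after changing some HP parameters, the host had to be rebooted, so I ran cmhaltnode. I did not read its output carefully (the filesystem umount had not succeeded) and went straight on to shutdown -ry now. The machine then stayed unresponsive for a long time, and on inspection the system turned out to be taking a crash dump. The incident is traced back below: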

1: Check syslog

Apr 24 12:12:38 kdyy1 syslog: cmhaltnode -f
Apr 24 12:12:39 kdyy1 cmcld[7718]: Request from root on node kdyy1 to halt the cluster on this node
Apr 24 12:11:04 kdyy1 xntpd[4458]: synchronisation lost
Apr 24 12:12:39 kdyy1 cmcld[7718]: Request from node kdyy1 to disable node switching for package pkg_db1 on node kdyy1.
Apr 24 12:12:39 kdyy1 cmcld[7718]: Disabled package pkg_db1 on node kdyy1.
Apr 24 12:12:39 kdyy1 cmcld[7718]: Request from node kdyy1 to disable global switching for package pkg_db1.
Apr 24 12:12:39 kdyy1 cmcld[7718]: Disabled switching for package pkg_db1.
Apr 24 12:12:39 kdyy1 cmcld[7718]: Halting package pkg_db1 on node kdyy1 as requested by user.
Apr 24 12:12:39 kdyy1 cmcld[7718]: Request from node kdyy1 to begin the halting process for package pkg_db1 on node kdyy1.
Apr 24 12:12:39 kdyy1 cmcld[7718]: Request from node kdyy1 to halt package pkg_db1 on node kdyy1.
Apr 24 12:12:39 kdyy1 cmcld[7718]: Request from root on node kdyy1 to halt the cluster on this node
Apr 24 12:12:39 kdyy1  above message repeats 2 times
Apr 24 12:12:39 kdyy1 cmcld[7718]: Executing '/etc/cmcluster/pkg_db1/pkg_db1.cntl  stop' for package pkg_db1, as service PKG*43265.
Apr 24 12:13:01 kdyy1 cmclconfd[3763]: Client provided hostname kdyy2 which does not match resolved name kdyy1.
Apr 24 12:16:06 kdyy1 LVM[7624]: vgchange -a n /dev/ora_yyvg1
Apr 24 12:16:06 kdyy1 LVM[7648]: vgchange -a n /dev/ora_yyvg2

Apr 24 12:17:31 kdyy1  above message repeats 5 times
Apr 24 12:17:31 kdyy1 syslog: /usr/sbin/cmhaltnode -vf
Apr 24 12:17:31 kdyy1 cmcld[7718]: Request from root on node kdyy1 to halt the cluster on this node
Apr 24 12:17:31 kdyy1 cmlvmd[7737]: Volume group fsvg_yy is still active.
Apr 24 12:18:03 kdyy1 HP-PRM: [10469]: prmconfig: configuration reset
Apr 24 12:17:49 kdyy1 syslog: /usr/sbin/cmhaltnode -vf
Apr 24 12:18:45 kdyy1  above message repeats 3 times
Apr 24 12:18:45 kdyy1 /usr/sbin/envd[4592]: 1;7{:E 15 VP6O
Apr 24 12:17:49 kdyy1 cmcld[7718]: Request from root on node kdyy1 to halt the cluster on this node
Apr 24 12:18:45 kdyy1  above message repeats 3 times
Apr 24 12:18:45 kdyy1 diagmond[4580]: Exit due to user requested abort
Apr 24 12:17:49 kdyy1 cmlvmd[7737]: Volume group fsvg_yy is still active.
Apr 24 12:18:46 kdyy1  above message repeats 3 times
Apr 24 12:18:46 kdyy1 sshd[3426]: Received signal 15; terminating.
Apr 24 12:19:17 kdyy1 cimserver[11538]: PGS10013:  SHUTDOWN TIME-OUT EXPIRED.  A FORCED SHUTDOWN OF THE CIM SERVER IS INITIATED.
Apr 24 12:19:26 kdyy1 inetd[3628]: Going down on signal 15
Apr 24 12:19:27 kdyy1 rpcbind: terminate: rpcbind terminating on signal. Restart with "rpcbind -w"
Apr 24 12:19:28 kdyy1 su: + tty?? root-sfmdb
Apr 24 12:19:32 kdyy1 syslogd: going down on signal 15

2: Check pkg_db1.cntl.log

        ########### Node "kdyy1": Package start completed at Friday, April 11, 2008, 18:38:33 ###########

        ########### Node "kdyy1": Halting package at Thursday, April 24, 2008, 12:12:39 ###########
Apr 24 12:12:39 - Node "kdyy1": Deactivating volume group /dev/fsvg_yy
vgchange: Failed to notify clvm daemon about volume group deactivation - Device busy
Couldn't deactivate volume group "/dev/fsvg_yy":
Device busy
        Apr 24 12:13:47 - vgchange -a n /dev/fsvg_yy failed, trying again.
vgchange: Failed to notify clvm daemon about volume group deactivation - Device busy
Couldn't deactivate volume group "/dev/fsvg_yy":
Device busy
        Apr 24 12:14:57 - vgchange -a n /dev/fsvg_yy failed, trying again.
vgchange: Failed to notify clvm daemon about volume group deactivation - Device busy
Couldn't deactivate volume group "/dev/fsvg_yy":
Device busy
        ERROR:  Function deactivate_volume_group
        ERROR:  Failed to deactivate /dev/fsvg_yy
Apr 24 12:16:06 - Node "kdyy1": Deactivating volume group /dev/ora_yyvg1
Deactivated volume group in Shared Mode.
Volume group "/dev/ora_yyvg1" has been changed.
Apr 24 12:16:06 - Node "kdyy1": Deactivating volume group /dev/ora_yyvg2
Deactivated volume group in Shared Mode.
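
A failure like this is easy to miss when the log is only skimmed. The following is a minimal sketch of one way to pull out the volume groups that the most recent package halt could not deactivate. It assumes the "Halting package at" and "ERROR:  Failed to deactivate" wording seen in the excerpt above; failed_vgs is a hypothetical helper name, not part of Serviceguard.

```shell
#!/bin/sh
# Hypothetical helper (not a Serviceguard command).
# Prints the volume groups that the LAST halt attempt recorded in a package
# control log failed to deactivate, one per line.
# Assumes the log wording shown in the excerpt above.
failed_vgs() {
    # $1: path to the package control log
    awk '
        # A new halt attempt starts: forget errors from older attempts.
        /Halting package at/ { n = 0 }
        # Remember the last field of the error line, the volume group path.
        /ERROR: +Failed to deactivate/ { vg[++n] = $NF }
        END { for (i = 1; i <= n; i++) print vg[i] }
    ' "$1"
}
```

It could be called as failed_vgs /etc/cmcluster/pkg_db1/pkg_db1.cntl.log, where the log location next to the control script is an assumption. Any output means a volume group is still active and the host should not be shut down yet.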

In summary: from now on, before running shutdown, make sure that cmhaltnode has actually succeeded. Checking the package log (pkg.log) is recommended.
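
The rule above can be sketched as a small wrapper that refuses to reboot when the cluster halt fails. This is only an illustration under stated assumptions: it assumes cmhaltnode exits with a non-zero status when the halt does not complete, and HALT_CMD, REBOOT_CMD and halt_then_reboot are hypothetical names introduced here so that the logic can be dry-run with harmless commands.

```shell
#!/bin/sh
# Sketch only: reboot the host only if the cluster halt command succeeds.
# HALT_CMD and REBOOT_CMD are hypothetical override variables; the defaults
# are the two commands used in this incident.
HALT_CMD=${HALT_CMD:-"cmhaltnode -f"}
REBOOT_CMD=${REBOOT_CMD:-"shutdown -ry now"}

halt_then_reboot() {
    # Assumption: the halt command returns non-zero when it fails.
    # The variables are left unquoted on purpose so that the command
    # and its options are split into separate words.
    if $HALT_CMD; then
        $REBOOT_CMD
    else
        echo "cluster halt failed; not rebooting, check the package log" >&2
        return 1
    fi
}
```

For a dry run, HALT_CMD=false and REBOOT_CMD="echo rebooting" can be set before calling halt_then_reboot. The exit status alone may not tell the whole story, so reading the package log is still worthwhile.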

Source: "ITPUB Blog", link: http://blog.itpub.net/13132547/viewspace-254572/. If you repost this article, please credit the source; otherwise legal action may be taken.
