[AIX] Experience sharing: repairing a machine that would not boot (LED: 0518)

Today one of our AIX p550 machines in an HA pair showed LED 0518. I found the following write-up of a fix online:

 

AIX 4.3.3 maintenance work from a few years ago. A lonely one-person job!

 

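Environment:
Hardware: two F85 servers + 7133 T40
Software: Oracle 8.1.7 OPS + HACMP 4.4 ES
System: AIX 4.3.3 ML09, two concurrent VGs
A very old system; no system backup had been taken for a long time.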

 

What happened a few days ago

HACMP 4.4 patch level on both machines:
lslpp -l | grep cluster shows the highest level as 4.4.1.16
System patch level:
instfix -i | grep ML shows up to ML09 (an upgrade is needed)

HACMP on the two machines is configured with three resource groups:
sj85srv1
sj85srv2
sj85zcls

Of these, sj85srv1:
node relationship cascading
participating node names/default node priority sj85_1 sj85_2
service ip label sj85_1srv

application servers appsrv1

sj85srv2
node relationship cascading
participating node names/default node priority sj85_2 sj85_1
service ip label sj85_2srv

application servers appsrv2

sj85zcls
node relationship concurrent
participating node names/default node priority sj85_1 sj85_2
service ip label (empty)

application servers (empty)

As you can see, two of the resource groups handle IP address takeover and the third handles the volume groups.


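A few days ago I needed to add data files. My idea was to add the LV directly through HACMP (smitty hacmp -> Cluster System Management), so that both nodes would be synchronized automatically. Afterwards the LV looked normal on one node, but on the other node lsvg -l zdatavg1 showed ?? in the TYPE column, where it should have shown jfs. So I synchronized by hand with synclvodm -v zdatavg1 lvname, after which the state looked normal and the database could see the data files from both nodes. No data had been written to the new data files at that point.

A few days later a colleague reported that the data files could not be used from both nodes at the same time. I suspected that this way of adding the LV was the problem and that, if all else failed, we would have to fall back on the old method: create the LV on one node, then importvg on the other. I was about to leave on a business trip, so I asked a colleague to look into it and left.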
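The check described above (a ?? in the TYPE column of lsvg -l on the second node) can be scripted. What follows is only a minimal sketch, not what the author actually ran: find_unknown_type_lvs is a hypothetical helper name, and it assumes the usual lsvg -l layout of one line with the volume group name, one header line, and then one row per LV with the type in the second column.

```shell
#!/bin/sh
# Hypothetical helper: read `lsvg -l <vg>` output on stdin and print the
# name of every logical volume whose TYPE column is "??".
# Assumes two leading lines (the "<vg>:" line and the column header).
find_unknown_type_lvs() {
    awk 'NR > 2 && $2 == "??" { print $1 }'
}

# Example for an AIX node (left commented out so the sketch runs anywhere):
#   lsvg -l zdatavg1 | find_unknown_type_lvs |
#   while read lv; do
#       synclvodm -v zdatavg1 "$lv"
#   done
```

Keeping the detection step separate from synclvodm lets you review the list of affected LVs before touching the ODM.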

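When I came back some days later, things were still the same. I suspected the system patch level was too low and that a later upgrade would sort it out, but we had no installation media at hand, so the priority was to get the database working first. The plan:
1. Shut down the database
2. Stop HA
3. varyonvg (not in -c concurrent mode)
4. Create the LV
5. varyoffvg
6. importvg on the other node, then varyon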
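Steps 3 to 6 of this plan could look roughly like the sketch below. It is an illustration under stated assumptions, not the author's exact commands: the LV name newlv, the size of 10 logical partitions, and the disk hdisk2 are hypothetical placeholders, and -t jfs is taken from the type the author expected lsvg -l to show. The script defaults to a dry run that only prints each command.

```shell
#!/bin/sh
# Sketch of plan steps 3-6. DRY_RUN defaults to 1, so commands are only
# printed; set DRY_RUN=0 on a real AIX node to execute them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Steps 3-5 on the first node: activate the VG in normal (non-concurrent)
# mode, create the LV, then deactivate the VG again.
add_lv_offline() {
    vg=$1; lv=$2; lps=$3
    run varyonvg "$vg"
    run mklv -y "$lv" -t jfs "$vg" "$lps"
    run varyoffvg "$vg"
}

# Step 6 on the other node: import the VG definition from one of its disks
# and activate it. If that node still holds an old definition of the VG,
# it has to be removed first (exportvg) before importvg accepts the name.
import_on_peer() {
    vg=$1; disk=$2
    run importvg -y "$vg" "$disk"
    run varyonvg "$vg"
}
```

For example, add_lv_offline zdatavg1 newlv 10 prints the three commands for the first node, and import_on_peer zdatavg1 hdisk2 prints the two for the other node.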

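Following this plan, step 1 went fine. At step 2, HA would not stop on one of the machines. I ran smitty clstop on each node separately, in this order:
1. clstop graceful
2. clstop force
3. clstop force twice more in a row, which normally does the trick

After all three, HA still would not stop. I could not think of anything better, so I rebooted the machines. One came up; the other would not boot. It got as far as device detection, and after the white screen appeared it stopped at LED 0518,
hanging while mounting the file systems.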

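Idea:
Is this related to the other machine? Would a few more reboots get past it?

Tried both; neither worked.

Idea:
Boot from CD to repair the file systems and replace the /etc/filesystems file.

The detailed steps are in the appendix. Still no good! I was quite depressed at this point: the machine would not come up and there was no earlier backup. Then a colleague suggested that restoring the other machine's system image onto it should work. That sounded feasible, so we did it!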

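Took a mksysb on the other machine.
Restored it from tape onto this machine. Before that, to avoid an IP address conflict, I first moved the machine's IP address out of the way with smitty chinet.
The restore went OK. The plan:
1. Change the hostname
2. Change the IP address
3. Synchronize HA
4. Test whether Oracle works
5. Stop HA
6. varyonvg in normal mode
7. Create the LV
8. varyoffvg
9. importvg
10. Start the database and create a table to test
11. lsnrctl start

Following this plan solved the immediate problem, but not the root cause, because going through this every time is not realistic. In a few days I will upgrade the system patches and see whether that helps!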


Appendix:
1. /etc/hosts
10.64.60.3 sj85_2srv sj85_2
10.10.10.2 sj85_2std
10.64.60.5 sj85_2boot

10.64.60.2 sj85_1srv sj85_1
10.10.10.1 sj85_1std
10.64.60.4 sj85_1boot

2. Steps for resolving LED code 0518


Repairing File Systems with fsck in AIX V4 and V5 (LED 517 or 518)
This document covers the use of the fsck (file system check) command in
Maintenance mode to repair inconsistencies in file systems. The procedure
described is useful when file system corruption in the primary root file
systems is suspected or, in many cases, to correct an IPL hang at LED value
517, 518, or LED value 555.

This document applies to AIX V4 and V5.

--------------------------------------------------------------------------------

Recovery procedure
Boot your system into a limited function maintenance shell (Service, or
Maintenance mode) from AIX bootable media to perform file system checks on your
root file systems.
Please refer to your system user's or installation and service guide for
specific IPL procedures related to your type and model of hardware. You can also refer to the document titled "Booting in Service Mode," available at
http://techsupport.services.ibm.com...r/aix.techTips.

With bootable media of the same version and level as the system, boot the
system. The bootable media can be any ONE of the following:
  Bootable CD-ROM
  NON_AUTOINSTALL mksysb
  Bootable Install Tape
Follow the screen prompts to the following menu:

Welcome to Base Operating System
Installation and Maintenance

Choose Start Maintenance Mode for System Recovery (Option 3).
The next screen displays the Maintenance menu.


Choose Access a Root Volume Group (Option 1).
The next screen displays a warning that indicates you will not be able to
return to the Base OS menu without rebooting.


Choose 0 to continue.
The next screen displays information about all volume groups on the system.


Select the root volume group by number.

Choose Access this volume group and start a shell before mounting file systems
(Option 2).
If you get errors from the preceding option, do not continue with the rest of
this procedure. Correct the problem causing the error. If you need assistance
correcting the problem causing the error, contact one of the following:

  local branch office
  your point of sale
  your AIX support center
If no errors occur, proceed with the following steps.


Run the following commands to check and repair file systems.
NOTE: The -y option gives fsck permission to repair file system corruption when necessary. This flag can be used to avoid having to manually answer multiple confirmation prompts; however, use of this flag can cause permanent, unnecessary data loss in some situations.

fsck /dev/hd4
fsck /dev/hd2
fsck /dev/hd3
fsck /dev/hd9var
fsck /dev/hd1
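The five fsck calls above can also be written as a loop. This is a minimal sketch, not part of the IBM procedure: check_root_filesystems, run, DRY_RUN, and FSCK_FLAGS are names introduced here for illustration. By default it only prints the commands, and it passes no flags to fsck unless the caller sets FSCK_FLAGS (for example to -y, subject to the data-loss caveat in the NOTE above).

```shell
#!/bin/sh
# Sketch: run fsck over the five rootvg logical volumes listed above.
# DRY_RUN defaults to 1 (print only); set DRY_RUN=0 to execute.
# FSCK_FLAGS is empty by default, so fsck asks before each repair.
DRY_RUN=${DRY_RUN:-1}
FSCK_FLAGS=${FSCK_FLAGS:-}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

check_root_filesystems() {
    for lv in hd4 hd2 hd3 hd9var hd1; do
        # $FSCK_FLAGS is deliberately unquoted: an empty value adds no argument.
        run fsck $FSCK_FLAGS "/dev/$lv"
    done
}
```

Calling check_root_filesystems in dry-run mode lists the five fsck commands in the same order as the document.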

To format the default jfslog for the rootvg Journaled File System (JFS) file
systems, run the following command:
/usr/sbin/logform /dev/hd8

Answer yes when asked if you want to destroy the log.

If your system is hanging at LED 517 or 518 during a Normal mode boot, it is
possible the /etc/filesystems file is corrupt or missing. To temporarily
replace the disk-based /etc/filesystems file, run the following commands:
mount /dev/hd4 /mnt
mv /mnt/etc/filesystems /mnt/etc/filesystems.[MMDDYY]
cp /etc/filesystems /mnt/etc/filesystems
umount /mnt

MMDDYY represents the current two-digit representation of the Month, Day and Year, respectively.
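The MMDDYY suffix does not have to be typed by hand. The sketch below builds it with date +%m%d%y and prints the four commands from the step above. It assumes the date command is available in the maintenance shell; backup_suffix, replace_disk_filesystems, run, and DRY_RUN are hypothetical names used only for this illustration.

```shell
#!/bin/sh
# Sketch: temporarily replace the disk-based /etc/filesystems, keeping the
# old copy under a MMDDYY suffix. DRY_RUN defaults to 1 (print only).
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Two-digit month, day and year, e.g. 031507 for 15 March 2007.
backup_suffix() {
    date +%m%d%y
}

replace_disk_filesystems() {
    mnt=${1:-/mnt}              # mount point used in the document
    suffix=$(backup_suffix)
    run mount /dev/hd4 "$mnt"
    run mv "$mnt/etc/filesystems" "$mnt/etc/filesystems.$suffix"
    run cp /etc/filesystems "$mnt/etc/filesystems"
    run umount "$mnt"
}
```

Running replace_disk_filesystems with no argument uses /mnt, matching the commands shown in the document.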

Type exit to exit from the shell. The file systems should automatically mount after you type exit. If you receive error messages, reboot into a limited function maintenance shell again to attempt to address the failure causes.

If you have user-created file systems in the rootvg volume group, run fsck on them now. Enter:
fsck /dev/[LVname]

LVname is the name of your user-defined logical volume.

If you used the preceding procedure to temporarily replace the /etc/filesystems
file, and you have user-created file systems in the rootvg volume group, you
must also run the following command:
imfs -l /dev/[LVname]

If you have file systems in a volume group other than rootvg, run fsck on them now. Enter:
varyonvg [VGname]
fsck /dev/[LVname]

VGname is the name of your user-defined volume group.

If you used the preceding procedure to temporarily replace the /etc/filesystems file, also run the following command:
imfs [VGname]

The preceding commands can be repeated for each user-defined volume group on the system.

If your system was hanging at LED 517 or 518 and you are unable to activate non-rootvg volume groups in Service mode, you can manually edit the /etc/filesystems file and add the appropriate entries.
The file /etc/filesystems.MMDDYY saved in the preceding steps may be used as a reference if it is readable. However, the imfs method is preferred since it uses information stored in the logical volume control block to re-populate the /etc/filesystems file.

If your system has a mode select key, turn it to the Normal position.

Reboot the system into Normal mode using the following command:
sync;sync;sync;reboot

If your system still halts at the LED 517 or 518 display, in many cases, it is
faster and more cost-effective to reinstall from a recent system backup.
Attempting to isolate the cause of the problem can be very time-consuming and often results in the determination that a reinstall is required to correct the problem anyway.
