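Environment: a two-node RAC cluster. Yesterday the HBA on one node failed; we replaced it with a new HBA and redid the LUN mapping on the storage array.
Problem: after rebooting the server, I found the following: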
[grid@glddb2 ~]$ /u01/app/11.2.0/grid/bin/crsctl stat res -t -init
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
1 ONLINE INTERMEDIATE glddb2 Startup Initiated
ora.crsd
1 ONLINE OFFLINE
ora.cssd
1 ONLINE ONLINE glddb2
ora.cssdmonitor
1 ONLINE ONLINE glddb2
ora.ctssd
1 ONLINE ONLINE glddb2 ACTIVE:0
ora.diskmon
1 ONLINE ONLINE glddb2
ora.drivers.acfs
1 ONLINE OFFLINE
ora.evmd
1 ONLINE ONLINE glddb2
ora.gipcd
1 ONLINE ONLINE glddb2
ora.gpnpd
1 ONLINE ONLINE glddb2
ora.mdnsd
1 ONLINE ONLINE glddb2
[grid@glddb2 ~]$
[grid@glddb2 ~]$ asmcmd
Connected to an idle instance.
ASMCMD>
Running startup in asmcmd fails with the following error:
ORA-00600: internal error code, arguments: [kmgs_component_init_3], [3], [6], [17], [], [], [], [], [], [], [], []
I checked the disks with fdisk:
[root@glddb2 mpath]# fdisk -l
Disk /dev/cciss/c0d0: 146.7 GB, 146778685440 bytes
255 heads, 32 sectors/track, 35132 cylinders
Units = cylinders of 8160 * 512 = 4177920 bytes
Device Boot Start End Blocks Id System
/dev/cciss/c0d0p1 * 1 25 101984 83 Linux
/dev/cciss/c0d0p2 26 3856 15630480 82 Linux swap / Solaris
/dev/cciss/c0d0p3 3857 35132 127606080 83 Linux
Disk /dev/sda: 999 MB, 999997440 bytes
31 heads, 62 sectors/track, 1016 cylinders
Units = cylinders of 1922 * 512 = 984064 bytes
Device Boot Start End Blocks Id System
/dev/sda1 1 1016 976345 83 Linux
Disk /dev/sdb: 999 MB, 999997440 bytes
31 heads, 62 sectors/track, 1016 cylinders
Units = cylinders of 1922 * 512 = 984064 bytes
Device Boot Start End Blocks Id System
/dev/sdb1 1 1016 976345 83 Linux
Disk /dev/sdc: 597.3 GB, 597366439936 bytes
255 heads, 63 sectors/track, 72625 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sdc1 1 72625 583360281 83 Linux
Disk /dev/sdd: 599.3 GB, 599366434816 bytes
255 heads, 63 sectors/track, 72868 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sdd1 1 72868 585312178+ 83 Linux
Disk /dev/dm-0: 999 MB, 999997440 bytes
31 heads, 62 sectors/track, 1016 cylinders
Units = cylinders of 1922 * 512 = 984064 bytes
Device Boot Start End Blocks Id System
/dev/dm-0p1 1 1016 976345 83 Linux
Disk /dev/dm-1: 999 MB, 999997440 bytes
31 heads, 62 sectors/track, 1016 cylinders
Units = cylinders of 1922 * 512 = 984064 bytes
Device Boot Start End Blocks Id System
/dev/dm-1p1 1 1016 976345 83 Linux
Disk /dev/dm-2: 597.3 GB, 597366439936 bytes
255 heads, 63 sectors/track, 72625 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/dm-2p1 1 72625 583360281 83 Linux
Disk /dev/dm-3: 599.3 GB, 599366434816 bytes
255 heads, 63 sectors/track, 72868 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/dm-3p1 1 72868 585312178+ 83 Linux
Disk /dev/dm-4: 999 MB, 999777280 bytes
255 heads, 63 sectors/track, 121 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk /dev/dm-4 doesn't contain a valid partition table
Disk /dev/dm-5: 599.3 GB, 599359670784 bytes
255 heads, 63 sectors/track, 72867 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk /dev/dm-5 doesn't contain a valid partition table
Disk /dev/dm-6: 597.3 GB, 597360927744 bytes
255 heads, 63 sectors/track, 72624 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk /dev/dm-6 doesn't contain a valid partition table
Disk /dev/dm-7: 999 MB, 999777280 bytes
255 heads, 63 sectors/track, 121 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk /dev/dm-7 doesn't contain a valid partition table
[root@glddb2 mpath]# ls -l
total 0
lrwxrwxrwx 1 root root 7 May 30 15:32 mpath0 -> ../dm-0
lrwxrwxrwx 1 root root 7 May 30 15:32 mpath0p1 -> ../dm-7
lrwxrwxrwx 1 root root 7 May 30 15:32 mpath1 -> ../dm-1
lrwxrwxrwx 1 root root 7 May 30 15:32 mpath1p1 -> ../dm-4
lrwxrwxrwx 1 root root 7 May 30 15:32 mpath2 -> ../dm-2
lrwxrwxrwx 1 root root 7 May 30 15:32 mpath2p1 -> ../dm-6
lrwxrwxrwx 1 root root 7 May 30 15:32 mpath3 -> ../dm-3
lrwxrwxrwx 1 root root 7 May 30 15:32 mpath3p1 -> ../dm-5
[root@glddb2 mpath]#
[root@glddb2 mpath]#
[root@glddb2 mpath]#
[root@glddb2 mpath]# multipath -ll
mpath2 (3600c0ff000106a4ef906c34c01000000) dm-2 HP,MSA2312fc
[size=556G][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=50][active]
\_ 2:0:0:2 sdc 8:32 [active][ready]
mpath1 (3600c0ff000106a4edc06c34c01000000) dm-1 HP,MSA2312fc
[size=954M][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=50][active]
\_ 2:0:0:1 sdb 8:16 [active][ready]
mpath0 (3600c0ff000106a4ec906c34c01000000) dm-0 HP,MSA2312fc
[size=954M][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=50][active]
\_ 2:0:0:0 sda 8:0 [active][ready]
mpath3 (3600c0ff000106a4e1e07c34c01000000) dm-3 HP,MSA2312fc
[size=558G][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=50][active]
\_ 2:0:0:3 sdd 8:48 [active][ready]
For comparison, here is the same output from the other, healthy node:
[root@glddb1 mpath]# ls -l
total 0
lrwxrwxrwx 1 root root 7 Jan 20 18:16 mpath0 -> ../dm-0
lrwxrwxrwx 1 root root 7 Jan 20 18:16 mpath0p1 -> ../dm-7
lrwxrwxrwx 1 root root 7 Jan 20 18:16 mpath1 -> ../dm-1
lrwxrwxrwx 1 root root 7 Jan 20 18:16 mpath1p1 -> ../dm-5
lrwxrwxrwx 1 root root 7 Jan 20 18:16 mpath2 -> ../dm-2
lrwxrwxrwx 1 root root 7 Jan 20 18:16 mpath2p1 -> ../dm-4
lrwxrwxrwx 1 root root 7 Jan 20 18:16 mpath3 -> ../dm-3
lrwxrwxrwx 1 root root 7 Jan 20 18:16 mpath3p1 -> ../dm-6
[root@glddb1 mpath]#
[root@glddb1 mpath]#
[root@glddb1 mpath]# multipath -ll
mpath2 (3600c0ff000106a4ef906c34c01000000) dm-2 HP,MSA2312fc
[size=556G][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=50][active]
\_ 2:0:0:2 sdc 8:32 [active][ready]
mpath1 (3600c0ff000106a4edc06c34c01000000) dm-1 HP,MSA2312fc
[size=954M][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=50][active]
\_ 2:0:0:1 sdb 8:16 [active][ready]
mpath0 (3600c0ff000106a4ec906c34c01000000) dm-0 HP,MSA2312fc
[size=954M][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=50][active]
\_ 2:0:0:0 sda 8:0 [active][ready]
mpath3 (3600c0ff000106a4e1e07c34c01000000) dm-3 HP,MSA2312fc
[size=558G][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=50][active]
\_ 2:0:0:3 sdd 8:48 [active][ready]
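The two `multipath -ll` outputs can also be compared mechanically instead of by eye. The following is only a minimal sketch: the function name `parse_multipath` and the variable names are hypothetical, and the sample text is just the header lines copied from the outputs above. It checks whether each alias (mpath0 to mpath3) carries the same WWID and the same dm device on both nodes.

```python
import re
from typing import Dict, Tuple

# Matches a map header line such as:
#   mpath2 (3600c0ff000106a4ef906c34c01000000) dm-2 HP,MSA2312fc
HEADER = re.compile(r"^(mpath\d+) \(([0-9a-f]+)\) (dm-\d+)")


def parse_multipath(output: str) -> Dict[str, Tuple[str, str]]:
    """Return {alias: (wwid, dm_device)} for every map header in the text."""
    maps = {}
    for line in output.splitlines():
        m = HEADER.match(line.strip())
        if m:
            maps[m.group(1)] = (m.group(2), m.group(3))
    return maps


# Header lines copied from `multipath -ll` on glddb1 (healthy node).
GLDDB1_MULTIPATH = """\
mpath2 (3600c0ff000106a4ef906c34c01000000) dm-2 HP,MSA2312fc
mpath1 (3600c0ff000106a4edc06c34c01000000) dm-1 HP,MSA2312fc
mpath0 (3600c0ff000106a4ec906c34c01000000) dm-0 HP,MSA2312fc
mpath3 (3600c0ff000106a4e1e07c34c01000000) dm-3 HP,MSA2312fc
"""

# Header lines copied from `multipath -ll` on glddb2 (node with the new HBA).
GLDDB2_MULTIPATH = """\
mpath2 (3600c0ff000106a4ef906c34c01000000) dm-2 HP,MSA2312fc
mpath1 (3600c0ff000106a4edc06c34c01000000) dm-1 HP,MSA2312fc
mpath0 (3600c0ff000106a4ec906c34c01000000) dm-0 HP,MSA2312fc
mpath3 (3600c0ff000106a4e1e07c34c01000000) dm-3 HP,MSA2312fc
"""

node1_maps = parse_multipath(GLDDB1_MULTIPATH)
node2_maps = parse_multipath(GLDDB2_MULTIPATH)

# Aliases whose WWID or dm device differs between the two nodes.
mismatched_aliases = [
    alias for alias in sorted(node1_maps)
    if node1_maps[alias] != node2_maps.get(alias)
]
print("aliases that differ:", mismatched_aliases)
```

With the outputs pasted above, the whole-LUN maps agree on both nodes, so any difference has to be in the partition maps (mpathNp1).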
I believe the cause of the error is the following.
Normal mapping (as on glddb1):
sdc->dm-2->mpath2p1->dm-4
sdb->dm-1->mpath1p1->dm-5
sda->dm-0->mpath0p1->dm-7
sdd->dm-3->mpath3p1->dm-6
After the HBA replacement (on glddb2), the mapping is wrong:
sdc->dm-2->mpath2p1->dm-6
sdb->dm-1->mpath1p1->dm-4
sda->dm-0->mpath0p1->dm-7
sdd->dm-3->mpath3p1->dm-5
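The comparison above was done by hand; as a rough sketch, it could be reproduced from the two `ls -l` listings like this. The helper names `parse_mpath_links` and `diff_links` are made up for illustration, and the listings are copied from the output earlier in this post. The sketch only reports which names point at different dm devices; it does not show whether that difference is what prevents ASM from starting.

```python
import re
from typing import Dict, Optional, Tuple


def parse_mpath_links(listing: str) -> Dict[str, str]:
    """Map each symlink name (e.g. 'mpath1p1') to its target (e.g. 'dm-4')."""
    links = {}
    for line in listing.splitlines():
        m = re.search(r"(\S+) -> \.\./(dm-\d+)\s*$", line)
        if m:
            links[m.group(1)] = m.group(2)
    return links


def diff_links(good: Dict[str, str],
               bad: Dict[str, str]) -> Dict[str, Tuple[str, Optional[str]]]:
    """Return {name: (target_on_good, target_on_bad)} where the targets differ."""
    return {
        name: (good[name], bad.get(name))
        for name in sorted(good)
        if good[name] != bad.get(name)
    }


# `ls -l` listing from glddb1 (healthy node).
GLDDB1_LS = """\
lrwxrwxrwx 1 root root 7 Jan 20 18:16 mpath0 -> ../dm-0
lrwxrwxrwx 1 root root 7 Jan 20 18:16 mpath0p1 -> ../dm-7
lrwxrwxrwx 1 root root 7 Jan 20 18:16 mpath1 -> ../dm-1
lrwxrwxrwx 1 root root 7 Jan 20 18:16 mpath1p1 -> ../dm-5
lrwxrwxrwx 1 root root 7 Jan 20 18:16 mpath2 -> ../dm-2
lrwxrwxrwx 1 root root 7 Jan 20 18:16 mpath2p1 -> ../dm-4
lrwxrwxrwx 1 root root 7 Jan 20 18:16 mpath3 -> ../dm-3
lrwxrwxrwx 1 root root 7 Jan 20 18:16 mpath3p1 -> ../dm-6
"""

# `ls -l` listing from glddb2 (node with the replaced HBA).
GLDDB2_LS = """\
lrwxrwxrwx 1 root root 7 May 30 15:32 mpath0 -> ../dm-0
lrwxrwxrwx 1 root root 7 May 30 15:32 mpath0p1 -> ../dm-7
lrwxrwxrwx 1 root root 7 May 30 15:32 mpath1 -> ../dm-1
lrwxrwxrwx 1 root root 7 May 30 15:32 mpath1p1 -> ../dm-4
lrwxrwxrwx 1 root root 7 May 30 15:32 mpath2 -> ../dm-2
lrwxrwxrwx 1 root root 7 May 30 15:32 mpath2p1 -> ../dm-6
lrwxrwxrwx 1 root root 7 May 30 15:32 mpath3 -> ../dm-3
lrwxrwxrwx 1 root root 7 May 30 15:32 mpath3p1 -> ../dm-5
"""

link_diff = diff_links(parse_mpath_links(GLDDB1_LS), parse_mpath_links(GLDDB2_LS))
for name, (on_glddb1, on_glddb2) in link_diff.items():
    print(f"{name}: glddb1 -> {on_glddb1}, glddb2 -> {on_glddb2}")
```

For these two listings it flags mpath1p1, mpath2p1 and mpath3p1, which matches the three rows that differ in the lists above; mpath0p1 points at dm-7 on both nodes.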
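Could any of the experts here tell me whether this mapping can be adjusted? If it can, please describe the corresponding procedure.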