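Our failing machine is a 12-bay Dell EMC server with a 10-disk RAID 10 array plus 1 disk intended as a hot spare. For installing MegaCli64, see this article: "Installing MegaCli64 on Proxmox (Debian) to manage a hardware RAID controller". First, check the state of the virtual drive:
root@JS-2002:~/megacli/Linux# MegaCli64 -LDInfo -Lall -aALL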
Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name :
RAID Level : Primary-1, Secondary-0, RAID Level Qualifier-0
Size : 9.093 TB
Sector Size : 512
Mirror Data : 9.093 TB
State : Degraded
Strip Size : 64 KB
Number Of Drives per span:2
Span Depth : 5
Default Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy : Disk's Default
Encryption Type : None
Default Power Savings Policy: Controller Defined
Current Power Savings Policy: None
Can spin up in 1 minute: Yes
LD has drives that support T10 power conditions: Yes
LD's IO profile supports MAX power savings with cached writes: No
Bad Blocks Exist: No
Is VD Cached: No
Exit Code: 0x00
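When this check has to be repeated, the fields that matter (State, drives per span, span depth) can be pulled out of the text programmatically. The following is a minimal sketch that parses a shortened copy of the output above. The function name `parse_ldinfo` and the `SAMPLE` string are hypothetical and not part of MegaCli.

```python
import re


def parse_ldinfo(text):
    """Parse `MegaCli64 -LDInfo` text into one dict per virtual drive.

    Hypothetical helper: it only splits each line on the first colon
    and assumes the layout shown in the transcript above.
    """
    drives = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        match = re.match(r"Virtual Drive:\s*(\d+)", line)
        if match:
            # A new "Virtual Drive:" line starts a new record.
            current = {"id": int(match.group(1))}
            drives.append(current)
            continue
        if current is None or ":" not in line:
            continue
        key, _, value = line.partition(":")
        current[key.strip()] = value.strip()
    return drives


# Shortened copy of the transcript above, used only as sample input.
SAMPLE = """\
Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
RAID Level : Primary-1, Secondary-0, RAID Level Qualifier-0
Size : 9.093 TB
State : Degraded
Number Of Drives per span:2
Span Depth : 5
"""

vd = parse_ldinfo(SAMPLE)[0]
# Drives per span times span depth gives the number of member disks.
total_drives = int(vd["Number Of Drives per span"]) * int(vd["Span Depth"])
print(vd["State"], total_drives)
```

With 2 drives per span and a span depth of 5, the product matches the 10 member disks of this array.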
root@JS-2002:~/megacli/Linux# MegaCli64 -pdinfo -physdrv[:3] -a0
Enclosure Device ID: N/A
Slot Number: 3
Drive's position: DiskGroup: 0, Span: 1, Arm: 1
Enclosure position: N/A
Device Id: 3
WWN: 5000C500260EACC4
Sequence Number: 2
Media Error Count: 0
Other Error Count: 5
Predictive Failure Count: 3
Last Predictive Failure Event Seq Number: 30255
PD Type: SAS
Raw Size: 1.819 TB [0xe8e088b0 Sectors]
Non Coerced Size: 1.818 TB [0xe8d088b0 Sectors]
Coerced Size: 1.818 TB [0xe8d00000 Sectors]
Sector Size: 0
Firmware state: Online, Spun Up
Device Firmware Level: 0008
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x5000c500260eacc5
SAS Address(1): 0x0
Connected Port Number: 0(path0)
Inquiry Data: SEAGATE ST32000444SS 00089WM3PSCZ
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive: Not Certified
Drive Temperature :29C (84.20 F)
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Port-1 :
Port status: Active
Port's Linkspeed: Unknown
Drive has flagged a S.M.A.R.T alert : Yes
Exit Code: 0x00
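The signs that single out this disk are the non-zero Predictive Failure Count and the S.M.A.R.T alert flag. A minimal sketch of that check follows. It assumes the `-pdinfo` text has been captured into a string, and the helper names `parse_pdinfo` and `looks_failing` are hypothetical.

```python
def parse_pdinfo(text):
    """Split `MegaCli64 -pdinfo` output for ONE drive into a dict.

    Hypothetical helper: repeated keys (for example "Port status")
    simply keep their last value, which is fine for this check.
    """
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def looks_failing(info):
    """Return True if the drive reports predictive failures or a SMART alert."""
    predictive = int(info.get("Predictive Failure Count", "0"))
    smart_alert = info.get("Drive has flagged a S.M.A.R.T alert", "No")
    return predictive > 0 or smart_alert == "Yes"


# Shortened copy of the transcript above, used only as sample input.
SAMPLE = """\
Slot Number: 3
Media Error Count: 0
Other Error Count: 5
Predictive Failure Count: 3
Firmware state: Online, Spun Up
Drive has flagged a S.M.A.R.T alert : Yes
"""

info = parse_pdinfo(SAMPLE)
print(info["Slot Number"], looks_failing(info))
```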
Next, take this disk offline and mark it as missing:
root@JS-2002:~/megacli/Linux# MegaCli64 -PDOffline -PhysDrv [:3] -a0
Adapter: 0: EnclId-N/A SlotId-3 state changed to OffLine.
Exit Code: 0x00
root@JS-2002:~/megacli/Linux# MegaCli64 -pdmarkmissing -physdrv[:3] -aAll
EnclId-N/A SlotId-3 is marked Missing.
Exit Code: 0x00
Then mark the disk as prepared for removal:
root@JS-2002:~/megacli/Linux# MegaCli64 -pdprprmv -physdrv[:3] -a0
Prepare for removal Success
Exit Code: 0x00
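The three steps above (offline, mark missing, prepare for removal) always run in the same order against the same slot, so they can be generated from the slot number. The sketch below only builds the command strings as they appear in this transcript and does not execute anything. The function name `removal_commands` is hypothetical. The enclosure id is left empty because the controller reports "Enclosure Device ID: N/A" here, and `-a0` is used for all three steps, whereas the transcript used `-aAll` for the mark-missing step.

```python
def removal_commands(slot, adapter=0, enclosure=""):
    """Build the offline / mark-missing / prepare-removal commands.

    Hypothetical helper: it returns strings for review and never
    runs MegaCli64 itself.
    """
    drive = f"[{enclosure}:{slot}]"
    adp = f"-a{adapter}"
    return [
        f"MegaCli64 -PDOffline -PhysDrv {drive} {adp}",
        f"MegaCli64 -pdmarkmissing -physdrv{drive} {adp}",
        f"MegaCli64 -pdprprmv -physdrv{drive} {adp}",
    ]


cmds = removal_commands(3)
for cmd in cmds:
    print(cmd)
```

Printing the commands first and pasting them by hand keeps a human in the loop, which seems prudent for operations that take a disk out of a live array.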
Checking the array state again at this point, it shows Degraded:
root@JS-2002:~/megacli/Linux# MegaCli64 -LDInfo -Lall -aALL
Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name :
RAID Level : Primary-1, Secondary-0, RAID Level Qualifier-0
Size : 9.093 TB
Sector Size : 512
Mirror Data : 9.093 TB
State : Degraded
Strip Size : 64 KB
Number Of Drives per span:2
Span Depth : 5
Default Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy : Disk's Default
Encryption Type : None
Default Power Savings Policy: Controller Defined
Current Power Savings Policy: None
Can spin up in 1 minute: Yes
LD has drives that support T10 power conditions: Yes
LD's IO profile supports MAX power savings with cached writes: No
Bad Blocks Exist: No
Is VD Cached: No
Exit Code: 0x00
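Now bring the "hot spare" into the array. It had never actually been configured as a hot spare; it was only plugged in. The most important part here is working out what the Array and row parameters should be, which took a long time to find.
RAID 10 is really several RAID 1 pairs striped together as RAID 0, so our 10-disk RAID 10 is made up of 5 RAID 1 groups. The number after -Array selects one of those groups, and -row is the disk's index (0 or 1) inside that RAID 1 pair. Seen this way it is easy to understand: all we need is the failed disk's Span number (and its Arm) from the -pdinfo output: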
Enclosure Device ID: N/A
Slot Number: 3
Drive's position: DiskGroup: 0, Span: 1, Arm: 1
Enclosure position: N/A
Device Id: 3
That gives Array 1, row 1, so:
root@JS-2002:~/megacli/Linux# MegaCli64 -PdReplaceMissing -PhysDrv[:10] -Array1 -row1 -a0
Adapter: 0: Failed to replace Missing PD at Array 1, Row 1.
FW error description:
The specified device is in a state that doesn't support the requested command.
Exit Code: 0x32
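The mapping described above (Span gives -Array, Arm gives -row) can be sketched as a small helper that reads the "Drive's position" line and builds the -PdReplaceMissing command. This mapping is the article's reading of the parameters. The command failed here because of the drive's state, so the transcript does not confirm it by a successful run. The function name `replace_missing_command` is hypothetical.

```python
import re


def replace_missing_command(position_line, new_slot, adapter=0):
    """Build a -PdReplaceMissing command from a "Drive's position" line.

    Assumption taken from the article: Span maps to -Array and
    Arm maps to -row. The command is returned, not executed.
    """
    match = re.search(r"Span:\s*(\d+),\s*Arm:\s*(\d+)", position_line)
    if match is None:
        raise ValueError("no Span/Arm found in: " + position_line)
    span, arm = int(match.group(1)), int(match.group(2))
    return (
        f"MegaCli64 -PdReplaceMissing -PhysDrv[:{new_slot}] "
        f"-Array{span} -row{arm} -a{adapter}"
    )


# Position line of the failed disk in slot 3; the spare sits in slot 10.
line = "Drive's position: DiskGroup: 0, Span: 1, Arm: 1"
cmd = replace_missing_command(line, new_slot=10)
print(cmd)
```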
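The replacement failed because the new disk was present as an ordinary non-RAID disk. So we simply pulled it out and plugged it into slot 3, and, almost magically, the rebuild started on its own: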
Coerced Size: 1.818 TB [0xe8d00000 Sectors]
Sector Size: 0
Firmware state: Rebuild
Device Firmware Level: HPD7
This article is under CC BY-NC-SA 4.0 license.
Please quote the original link:https://www.liujason.com/article/391.html