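After a disk replacement, one server was left in an abnormal state. Rebooting into the RAID management interface showed the new disk as Blocked, and it could not be configured as a hot spare. The log contained the following: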
Copyback cannot be started on PD 03(e0x20/s3) from PD 05(e0x20/s5), as SAS/SATA is not supported in an array
Check the disk state:
[root@d10045101 ~]# /opt/MegaRAID/MegaCli/MegaCli64 -pdlist -aall |grep "Firmware st"
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Unconfigured(good), Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
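With more than a handful of disks, it can be easier to look at a count per firmware state than at the raw list. The following is a minimal sketch, not part of the original procedure: the helper name `summarize_pd_states` is made up for illustration, and the MegaCli path is assumed to be the one used in the transcript above.

```shell
#!/usr/bin/env bash
# Path copied from the transcript above; adjust if MegaCli is installed elsewhere.
MEGACLI=/opt/MegaRAID/MegaCli/MegaCli64

# Hypothetical helper: reads "-pdlist" output on stdin and prints
# one "<count> <state>" line per distinct firmware state.
summarize_pd_states() {
    grep 'Firmware state:' \
        | sed 's/.*Firmware state: *//' \
        | sort \
        | uniq -c \
        | sed 's/^ *//'
}

# Only query the controller when MegaCli is actually present.
if [ -x "$MEGACLI" ]; then
    "$MEGACLI" -pdlist -aall | summarize_pd_states
fi
```

Any state other than Online or Hotspare in the summary is a hint that a disk needs attention.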
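A colleague said this was a RAID firmware issue and that the firmware needed to be upgraded.
Preparing for the upgrade
Check the current RAID controller firmware information. The output shows that the controller model is PERC H700 Integrated.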
[root@d10045101 hean]# lspci |grep -i raid
03:00.0 RAID bus controller: LSI Logic / Symbios Logic LSI MegaSAS 9260 (rev 04)
[root@d10045101 ~]# /opt/MegaRAID/MegaCli/MegaCli64 -adpallinfo -a0
Adapter #0
==============================================================================
Versions
================
Product Name : PERC H700 Integrated
Serial No : 9CO00AV
FW Package Build: 12.0.1-0091
Mfg. Data
================
Mfg. Date : 12/25/09
Rework Date : 12/25/09
Revision No : A00
Battery FRU : N/A
Image Versions in Flash:
================
BIOS Version : 3.09.00
FW Version : 2.0.03-0772
Preboot CLI Version: 02.00-013:#%00008
Ctrl-R Version : 2.00-0024
NVDATA Version : 2.02.0037
Boot Block Version : 2.00.00.00-0018
BOOT Version : 01.250.04.219
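To record the firmware version before and after the upgrade, the FW Package Build line can be pulled out of the adapter info. This is a rough sketch under the assumption that the field is printed as `FW Package Build: <version>`, as in the output above; the function name `fw_package_build` is hypothetical.

```shell
#!/usr/bin/env bash
# Path copied from the transcript above; adjust if MegaCli is installed elsewhere.
MEGACLI=/opt/MegaRAID/MegaCli/MegaCli64

# Hypothetical helper: reads "-adpallinfo" output on stdin and prints
# the value of the first "FW Package Build" line.
fw_package_build() {
    awk -F': *' '/FW Package Build/ { print $2; exit }'
}

# Only query the controller when MegaCli is actually present.
if [ -x "$MEGACLI" ]; then
    echo "current build: $("$MEGACLI" -adpallinfo -a0 | fw_package_build)"
fi
```

Saving this value before the upgrade makes it easy to confirm afterwards that the package version really changed.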
Check the machine model and serial number (SN).
[root@d10045101 ~]# dmidecode -t system
# dmidecode 2.9
SMBIOS 2.6 present.
Handle 0x0100, DMI type 1, 27 bytes
System Information
Manufacturer: Dell Inc.
Product Name: PowerEdge R710
Version: Not Specified
Serial Number: 1XXXXXX
Wake-up Type: Power Switch
SKU Number: Not Specified
Family: Not Specified
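Only two fields from this output are needed for the download page: the product name and the serial number. A small sketch for extracting them follows; the helper name `system_field` is an assumption for illustration, and the parsing assumes the `Label: value` layout shown above.

```shell
#!/usr/bin/env bash

# Hypothetical helper: reads "dmidecode -t system" output on stdin and
# prints the value of the field whose label is given as $1.
system_field() {
    awk -F': *' -v key="$1" '
        { sub(/^[ \t]+/, "", $1) }        # drop the leading indentation
        $1 == key { print $2; exit }
    '
}

# dmidecode needs root, so only run it when both conditions hold.
if command -v dmidecode >/dev/null 2>&1 && [ "$(id -u)" -eq 0 ]; then
    info=$(dmidecode -t system 2>/dev/null || true)
    echo "model:  $(printf '%s\n' "$info" | system_field 'Product Name')"
    echo "serial: $(printf '%s\n' "$info" | system_field 'Serial Number')"
fi
```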
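Go to the Dell website and download the matching firmware package, using the SN and the RAID controller model to find it.
Upgrade process
Make the firmware package executable and run it directly. The upgrade went as follows.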
[root@d10045101 ~]# ./SAS-RAID_Firmware_C3X7D_LN_12.10.6-0001_A12.BIN
Collecting inventory...
.......
Running validation...
PERC H700 Integrated Controller 0
The version of this Update Package is newer than the currently installed version.
Software application name: PERC H700 Integrated Controller 0 Firmware
Package version: 12.10.6-0001
Installed version: 12.0.1-0091
Continue? Y/N:y
Executing update...
WARNING: DO NOT STOP THIS PROCESS OR INSTALL OTHER DELL PRODUCTS WHILE UPDATE IS IN PROGRESS.
THESE ACTIONS MAY CAUSE YOUR SYSTEM TO BECOME UNSTABLE!
................................................................................................................................................................................................................
Device: PERC H700 Integrated Controller 0
Application: PERC H700 Integrated Controller 0 Firmware
The operation was successful.
Would you like to reboot your system now?
Continue? Y/N:y
Broadcast message from root (pts/0) (Wed Nov 5 15:40:44 2014):
The system is going down for reboot NOW!
[root@d10045101 ~]#
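The transcript starts after the package was already made executable. A minimal sketch of that preparation step is below; the file name is the one from the transcript, while the helper name `make_executable` is made up for illustration. The sketch deliberately does not launch the update itself, since that step is interactive and ends in a reboot.

```shell
#!/usr/bin/env bash
# File name copied from the transcript above.
PKG=./SAS-RAID_Firmware_C3X7D_LN_12.10.6-0001_A12.BIN

# Hypothetical helper: verify the package exists, then add the execute bit.
make_executable() {
    [ -f "$1" ] || { echo "package not found: $1" >&2; return 1; }
    chmod +x "$1"
}

# Prepare the package if it is in the current directory; run it by hand afterwards.
if make_executable "$PKG" 2>/dev/null; then
    echo "ready: run $PKG as root to start the interactive update"
fi
```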
Upgrade result
After the upgrade, check the RAID controller firmware version again. FW Package Build has changed from 12.0.1-0091 to 12.10.6-0001.
[root@d10045101 ~]# lspci |grep -i raid
03:00.0 RAID bus controller: LSI Logic / Symbios Logic LSI MegaSAS 9260 (rev 04)
[root@d10045101 ~]# /opt/Mega
MegaCli MegaRAID/
[root@d10045101 ~]# /opt/MegaRAID/MegaCli/MegaCli64 -adpallinfo -a0 |more
OSSpecificInitialize: Failed to load libsysfs.so.2.0.2 Please ensure that libsfs is present in the system.
The dependent library libsysfs.so.2.0.1 not available. Please contact LSI for distribution of the package
Adapter #0
==============================================================================
Versions
================
Product Name : PERC H700 Integrated
Serial No : 9CO00AV
FW Package Build: 12.10.6-0001
Mfg. Data
================
Mfg. Date : 12/25/09
Rework Date : 12/25/09
Revision No : A00
Battery FRU : N/A
Image Versions in Flash:
================
BIOS Version : 3.18.00_4.09.05.00_0x0416A000
FW Version : 2.100.03-2514
Preboot CLI Version: 04.04-010:#%00008
Ctrl-R Version : 2.02-0025.1
NVDATA Version : 2.07.03-0003
Boot Block Version : 2.02.00.00-0000
BOOT Version : 01.250.04.219
Check the disk state.
[root@d10045101 ~]# /opt/MegaRAID/MegaCli/MegaCli64 -pdlist -aall |grep "Firmware st"
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Copyback
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
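Copyback runs in the background and can take a while, so rather than rerunning the command by hand it can be polled. The loop below is a sketch only: it reuses the same `-pdlist` query as above, the helper name `copyback_in_progress` is hypothetical, and the 60-second interval is an arbitrary choice.

```shell
#!/usr/bin/env bash
# Path copied from the transcript above; adjust if MegaCli is installed elsewhere.
MEGACLI=/opt/MegaRAID/MegaCli/MegaCli64

# Hypothetical helper: reads "-pdlist" output on stdin and succeeds
# if any disk reports the Copyback firmware state.
copyback_in_progress() {
    grep -q 'Firmware state: Copyback'
}

# Poll once a minute until no disk is in the Copyback state any more.
if [ -x "$MEGACLI" ]; then
    while "$MEGACLI" -pdlist -aall | copyback_in_progress; do
        sleep 60
    done
    echo "no disk is in Copyback state"
fi
```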
Disk state after the copyback finished:
[root@d10045101 ~]# /opt/MegaRAID/MegaCli/MegaCli64 -pdlist -aall |grep "Firmware st"
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Hotspare, Spun Up
About Copyback
Typically, a drive fails or is expected to fail, and the data is rebuilt on a hot spare.
The failed drive is replaced with a new drive. Then the data is copied from the hot spare to the new drive, and the hot spare reverts from a rebuild drive to its original hot spare status. The copyback operation runs as a background activity, and the virtual drive is still available online to the host.
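In other words, copyback only applies to RAID arrays that have a hot spare (an idle standby disk).
Suppose an array has a hot spare and the disk in slot 5 fails. The hot spare disk in slot 12 then starts rebuilding.
Once the rebuild completes, the array no longer has a hot spare. The member disk positions have also changed, and so has the disk group (DG).
With copyback enabled, when a new, healthy disk is inserted into slot 5, the former hot spare in slot 12 syncs its data to slot 5. The important point is that any data written to the VD during this sync is applied to both slot 5 and slot 12. After the copyback completes, the disk in slot 12 goes back to serving as the hot spare, and the original array layout and DG are unchanged.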
References
[1]. LSI RAID controller Copyback feature documentation http://www.0li0.com/html/20121116/548.html