Implementation plan:
1. Check the ASM status
SQL> select dg.name,a.value from v$asm_diskgroup dg, v$asm_attribute a where dg.group_number=a.group_number and a.name='disk_repair_time';
Confirm that header_status is MEMBER, mount_status is CACHED, and mode_status is ONLINE for every disk:
SQL> select group_number,path,header_status,mount_status,mode_status,name from v$asm_disk;
Confirm that asmmodestatus is ONLINE and asmdeactivationoutcome is Yes for every grid disk:
# cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome
Confirm that no other rebalance operation is currently running:
SQL> select * from gv$asm_operation;
2. Set the RAID volume cache to writethrough
# cellcli -e alter cell bbu drop for replacement
# cellcli -e list cell attributes bbustatus
3. Take the grid disks offline
# cellcli -e alter griddisk all inactive
# cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome
4. Shut down the system
# shutdown -hP now
5. Physically replace the battery
Remove the old BBU and insert the new one.
6. Power the system back on and check the battery status
# /opt/MegaRAID/MegaCli/MegaCli64 -adpbbucmd -a0
# cellcli -e alter cell bbu reenable
# cellcli -e list cell attributes bbustatus
7. Bring the grid disks online and wait for data synchronization to complete
# cellcli -e alter griddisk all active
# cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome
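The check in step 1 (every grid disk ONLINE with asmdeactivationoutcome Yes) can be turned into a gate before step 3. The following is a minimal sketch, not part of the original procedure: the function name griddisks_safe_to_deactivate is hypothetical, and it assumes the cellcli listing arrives on stdin as whitespace-separated "name asmmodestatus asmdeactivationoutcome" rows, as in the transcript further down.

```shell
#!/bin/sh
# Hypothetical helper (not a cellcli feature): reads rows of
#   name asmmodestatus asmdeactivationoutcome
# on stdin and succeeds only if every row reports ONLINE and Yes.
griddisks_safe_to_deactivate() {
    awk '
        NF == 0 { next }                          # ignore blank lines
        { rows++ }                                # count grid disks seen
        $2 != "ONLINE" || $3 != "Yes" {           # any other state blocks the change
            bad++
            print "not ready: " $0 > "/dev/stderr"
        }
        END { exit (rows == 0 || bad > 0) ? 1 : 0 }   # empty input is treated as a failure
    '
}

# Intended use on the storage cell (left commented out so the sketch has no side effects):
# cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome \
#     | griddisks_safe_to_deactivate && cellcli -e alter griddisk all inactive
```

Treating empty input as a failure is a deliberate choice here: if the listing command itself fails, the gate should not let the deactivation proceed.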
Execution:
Replacing the battery on one storage server:
[root@erpdb01 ~]# imageinfo -ver
12.1.1.1.2.150411
[root@erpdb01 ~]#
[root@erpdb01 ~]# ssh root@10.19.3.131
Last login: Wed Aug 12 21:04:29 2015 from erpdb01
[root@erpcel01 ~]# cellcli -e list cell attributes releaseVersion
12.1.1.1.2
[root@erpcel01 ~]#
NAME VALUE
--------------------------------------------------------------------------------
DATA_ERP 3.6h
SQL> /
GROUP_NUMBER PATH                                  HEADER_STATU MOUNT_S MODE_ST NAME
------------ ------------------------------------- ------------ ------- ------- ----------------------
           2 o/192.168.10.5/DBFS_DG_CD_04_erpcel03 MEMBER       CACHED  ONLINE  DBFS_DG_CD_04_ERPCEL03
48 rows selected.
SQL> l
  1* select group_number,path,header_status,mount_status,mode_status,name from v$asm_disk
SQL>
[root@erpcel01 ~]# cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome
DATA_ERP_CD_00_erpcel01 ONLINE Yes
[root@erpcel01 ~]#
SQL> select * from gv$asm_operation;
no rows selected
SQL>
2. Set the RAID volume cache to writethrough
[root@erpcel01 ~]# cellcli -e alter cell bbu drop for replacement
HDD disk controller battery has been dropped for replacement
[root@erpcel01 ~]# cellcli -e liat cell attributes bbustatus
CELL-01504: Invalid command syntax.
[root@erpcel01 ~]# cellcli -e list cell attributes bbustatus
dropped for replacement
[root@erpcel01 ~]#
[root@erpcel01 ~]# cellcli -e alter grdidisk all inactive
CELL-01504: Invalid command syntax.
[root@erpcel01 ~]# cellcli -e alter grididisk all inactive
CELL-01504: Invalid command syntax.
[root@erpcel01 ~]# cellcli -e alter griddisk all inactive
GridDisk DATA_ERP_CD_00_erpcel01 successfully altered
[root@erpcel01 ~]# cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome
DATA_ERP_CD_00_erpcel01 OFFLINE Yes
[root@erpcel01 ~]# ipmitool sunoem cli "show /SYS"
Connected. Use ^D to exit.
-> show /SYS
/SYS
Targets:
MB
SP
PS0
PS1
DBP0
DBP1
DBP2
PWRBS
INTSW
T_AMB
VPS_CPUS
VPS_MEMORY
VPS_FANS
VPS
OK
SERVICE
LOCATE
PS_FAULT
TEMP_FAULT
FAN_FAULT
Properties:
type = Host System
ipmi_name = /SYS
product_name = SUN FIRE X4270 M3
product_part_number = 7067081
product_serial_number = 1344NM504T
product_manufacturer = Oracle Corporation
fault_state = OK
clear_fault_action = (none)
power_state = On
Commands:
cd
reset
set
show
start
stop
-> Session closed
Disconnected
[root@erpcel01 ~]# ipmitool sunoem cli "show /SP"
Connected. Use ^D to exit.
-> show /SP
/SP
Targets:
alertmgmt
cli
clients
clock
config
diag
faultmgmt
firmware
logs
network
policy
powermgmt
preferences
serial
services
sessions
users
Properties:
check_physical_presence = false
hostname = erpcel01-ilom
reset_to_defaults = none
system_contact = (none)
system_description = SUN FIRE X4270 M3, ILOM v3.1.2.44, r88415
system_identifier = Exadata Database Machine X3-2 AK00158124
system_location = (none)
Commands:
cd
reset
set
show
version
-> Session closed
Disconnected
[root@erpcel01 ~]#
[root@erpcel01 ~]# shutdown -hP now
Broadcast message from root (pts/0) (Sat Dec 5 11:20:17 2015):
The system is going down for system halt NOW!
[root@erpcel01 ~]#
[root@erpcel01 ~]#
[root@erpcel01 ~]# Connection to 10.19.3.131 closed by remote host.
Connection to 10.19.3.131 closed.
[root@erpdb01 ~]# ssh root@10.19.3.131
Last login: Sat Dec 5 10:26:29 2015 from erpdb01
[root@erpcel01 ~]# uptime
11:31:21 up 4 min, 1 user, load average: 3.01, 1.65, 0.68
[root@erpcel01 ~]#
[root@erpcel01 ~]# /opt/MegaRAID/MegaCli/MegaCli64 -adpbbucmd -a0
BBU status for Adapter: 0
BatteryType: iBBU08
Voltage: 3868 mV
Current: 528 mA
Temperature: 30 C
Battery State: Optimal
Design Mode : 48+ Hrs retention with a non-transparent learn cycle and moderate service life.
BBU Firmware Status:
Charging Status : Charging
Voltage : OK
Temperature : OK
Learn Cycle Requested : Yes
Learn Cycle Active : No
Learn Cycle Status : OK
Learn Cycle Timeout : No
I2cErrors Detected : No
Battery Pack Missing : No
Battery Replacement required : No
Remaining Capacity Low : No
Periodic Learn Required : No
Transparent Learn : No
No space to cache offload : No
Pack is about to fail & should be replaced : No
Cache Offload premium feature required : No
Module microcode update required : No
BBU GasGauge Status: 0x0180
Relative State of Charge: 30 %
Charger System State: 1
Charger System Ctrl: 0
Charging current: 528 mA
Absolute state of charge: 25 %
MaxError: 0 %
Battery backup charge time : 27 hours
BBU Capacity Info for Adapter: 0
Relative State of Charge: 30 %
Absolute State of charge: 26 %
Remaining Capacity: 392 mAh
Full Charge Capacity: 1320 mAh
Runtime to empty: Battery is not being charged.
Average time to empty: 47 Min.
Estimated Time to full recharge: 2 Hour, 43 Min.
Cycle Count: 0
BBU Design Info for Adapter: 0
Date of Manufacture: 09/16, 2014
Design Capacity: 1500 mAh
Design Voltage: 4100 mV
Specification Info: 0
Serial Number: 4778
Pack Stat Configuration: 0x0000
Manufacture Name: LS36691
Firmware Version :
Device Name: bq27541
Device Chemistry: LION
Battery FRU: N/A
Transparent Learn = 0
AppData = 0
BBU Properties for Adapter: 0
Auto Learn Period: 28 Days
Next Learn time: None
Learn Delay Interval:1 Hours
Auto-Learn Mode: Disabled
BBUMode = 7
Exit Code: 0x00
[root@erpcel01 ~]#
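The MegaCli report above is long, and only a few fields decide whether the new battery was accepted. As a rough sketch (the function name bbu_looks_healthy is hypothetical, and the choice of fields is an assumption based on the labels printed in this transcript), the output can be reduced to a pass/fail check:

```shell
#!/bin/sh
# Hypothetical helper: reads "MegaCli64 -adpbbucmd -a0" output on stdin and
# succeeds only if the three fields below carry the values seen in the
# healthy report above. Field labels are taken from that report.
bbu_looks_healthy() {
    awk -F':' '
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        { key = trim($1); val = trim($2) }
        key == "Battery State"                { state = val }
        key == "Battery Pack Missing"         { missing = val }
        key == "Battery Replacement required" { needs_replace = val }
        END {
            ok = (state == "Optimal" && missing == "No" && needs_replace == "No")
            exit ok ? 0 : 1
        }
    '
}

# Intended use on the storage cell (commented out so the sketch has no side effects):
# /opt/MegaRAID/MegaCli/MegaCli64 -adpbbucmd -a0 | bbu_looks_healthy && echo "BBU OK"
```

A low state of charge right after the swap (30 % in this run) is not treated as a failure by this sketch, since the report shows the battery still charging.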
[root@erpcel01 ~]# cellcli -e alter cell bbu reenable
CELL-02831: The BBU REENABLE command cannot reenable HDD disk controller battery because the battery was not dropped using the BBU DROP FOR REPLACEMENT command.
[root@erpcel01 ~]# cellcli -e liat cell attributes bbustatus
CELL-01504: Invalid command syntax.
[root@erpcel01 ~]# cellcli -e list cell attributes bbustatus
normal
[root@erpcel01 ~]#
[root@erpcel01 ~]#
[root@erpcel01 ~]# cellcli -e alter grdidisk all active
CELL-01504: Invalid command syntax.
[root@erpcel01 ~]#
[root@erpcel01 ~]# cellcli -e alter griddisk all active
GridDisk DATA_ERP_CD_00_erpcel01 successfully altered
[root@erpcel01 ~]# cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome
DATA_ERP_CD_00_erpcel01 SYNCING Yes
[root@erpcel01 ~]# cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome
DATA_ERP_CD_00_erpcel01 ONLINE Yes
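In the run above the listing was repeated by hand until the grid disks moved from SYNCING to ONLINE. A minimal polling sketch of that wait is shown below. The function names, the default interval, and the poll limit are all assumptions made for illustration; list_griddisk_status is kept as a separate wrapper so it can be replaced when trying the loop away from a cell.

```shell
#!/bin/sh
# Wrapper around the listing command used in the transcript.
list_griddisk_status() {
    cellcli -e list griddisk attributes name,asmmodestatus
}

# Hypothetical helper: polls until every grid disk reports ONLINE.
#   $1 = seconds between polls (assumed default: 60)
#   $2 = maximum number of polls (assumed default: 120)
# Returns 0 once all disks are ONLINE, 1 if the limit is reached first.
wait_for_griddisks_online() {
    wait_interval="${1:-60}"
    wait_max_polls="${2:-120}"
    wait_i=0
    while [ "$wait_i" -lt "$wait_max_polls" ]; do
        # Succeeds only when at least one row was listed and none is still pending.
        if list_griddisk_status | awk '
            NF { rows++ }
            NF && $2 != "ONLINE" { pending++ }
            END { exit (rows > 0 && pending == 0) ? 0 : 1 }
        '; then
            return 0
        fi
        sleep "$wait_interval"
        wait_i=$((wait_i + 1))
    done
    return 1
}

# Intended use after "alter griddisk all active" (commented out here):
# wait_for_griddisks_online 60 120 && echo "resync complete"
```

The poll limit matters because the next storage server should not be touched until the resync on this one has finished; a timeout makes a stalled resync visible instead of waiting forever.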