Environment: Oracle Linux 6.7 + Oracle 11.2.0.4 RAC
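A mistaken operation on the storage array destroyed all of the voting disks, which crashed the RAC database and left it unable to start.
1. Symptoms
Neither fdisk nor multipath can find the disks that held the OCR any longer.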
tail -100 alertdgdb1.log
The cluster alert log shows:
[cssd(18729)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/11.2.0/grid/log/dgdb1/cssd/ocssd.log
Checking the CRS status returns:
[grid@dgdb1 ~]$ crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
CRS-4534: Cannot communicate with Event Manager
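As a quick way to confirm the symptom above, the alert log can be scanned for the CRS-1714 message. The following is only a minimal sketch; the function name count_votefile_errors is made up for illustration, and the log path is whatever your own alert log is.

```shell
#!/bin/sh
# Count CRS-1714 ("Unable to discover any voting files") lines in a log file.
# count_votefile_errors is a hypothetical helper name, not an Oracle tool.
# Usage: count_votefile_errors /path/to/alertdgdb1.log
count_votefile_errors() {
    # grep -c prints the number of matching lines. It exits non-zero when
    # nothing matches, so "|| true" keeps the function from reporting failure.
    grep -c 'CRS-1714' "$1" || true
}
```

A non-zero count that keeps growing every 15 seconds matches the retry message shown in the log excerpt.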
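Troubleshooting approach:
First re-add the disks, then create the ASM disks with oracleasm, then create a disk group in ASM, then restore the OCR into that disk group from the automatic OCR backup kept on local disk, then replace the voting disk, and finally restart.
1. Add the disks
(1) Add three 10 GB disks on the storage array
(2) Configure multipathing in multipath.conf
(3) Create the ASM disks
Run ll /dev/mapper/* to see the device names assigned after multipathing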
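The sequence described above can be laid out as a dry-run outline that only prints the commands and executes nothing. This is a sketch under assumptions: the disk group name OCRVOTE and the <backup_file> placeholder are not from the original notes. The real backup file is whichever one ocrconfig -showbackup lists, and the disk group name would normally have to match the OCR location the cluster already has on record.

```shell
#!/bin/sh
# Dry-run outline of the recovery sequence: prints commands, runs nothing.
# ASSUMPTIONS (not from the original notes): the default disk group name
# OCRVOTE and the <backup_file> placeholder are hypothetical.
print_recovery_plan() {
    dg="${1:-OCRVOTE}"    # hypothetical disk group name
    cat <<EOF
crsctl stop has -f                 # on every node, as root
crsctl start crs -excl -nocrs      # on one node only, as root
# as the grid user: create disk group ${dg} on the new ASM disks
ocrconfig -showbackup              # list the automatic OCR backups
ocrconfig -restore <backup_file>   # restore the OCR from the chosen backup
crsctl replace votedisk +${dg}     # recreate the voting files in the disk group
crsctl stop crs -f                 # leave exclusive mode
crsctl start crs                   # normal start, then start the other nodes
EOF
}
```

Calling print_recovery_plan with your own disk group name gives a checklist to compare against each step as it is carried out.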
oracleasm createdisk VOTO1 /dev/dm-11
oracleasm createdisk VOTO2 /dev/dm-10
oracleasm createdisk VOTO3 /dev/dm-9
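After createdisk on one node, the other nodes usually need oracleasm scandisks before the new labels appear in oracleasm listdisks. The helper below is a minimal sketch for checking that every expected label is visible; missing_labels is a made-up name, and it only reads the listing from standard input, so it does not touch any disk.

```shell
#!/bin/sh
# Print every expected ASMLib label that is absent from the list on stdin.
# missing_labels is a hypothetical helper name, not part of oracleasm.
# Usage: oracleasm listdisks | missing_labels VOTO1 VOTO2 VOTO3
missing_labels() {
    found=$(cat)    # labels as reported by "oracleasm listdisks"
    for label in "$@"; do
        # -x matches the whole line, -F treats the label as a fixed string
        printf '%s\n' "$found" | grep -qxF "$label" || echo "$label"
    done
}
```

Empty output means all three labels are visible on that node; any label printed still needs a rescan or further checking.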
2. Restore the OCR
(1) Stop the CRS services on all RAC nodes
[root@dgdb1 bin]# ./crsctl stop has -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'dgdb1'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'dgdb1'
CRS-2673: Attempting to stop 'ora.crf' on 'dgdb1'
CRS-2673: Attempting to stop 'ora.cssdmonitor' on 'dgdb1'
CRS-2677: Stop of 'ora.cssdmonitor' on 'dgdb1' succeeded
CRS-2677: Stop of 'ora.crf' on 'dgdb1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'dgdb1'
CRS-2677: Stop of 'ora.mdnsd' on 'dgdb1' succeeded
CRS-2677: Stop of 'ora.gipcd' on 'dgdb1' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'dgdb1'
CRS-2677: Stop of 'ora.gpnpd' on 'dgdb1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'dgdb1' has completed
CRS-4133: Oracle High Availability Services has been stopped.
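Because the stop has to be repeated on every node, it can help to generate the per-node commands first. This is only a sketch: the Grid home is taken from the log path shown earlier, while root SSH access and any node name other than dgdb1 are assumptions. The function prints the commands rather than running them.

```shell
#!/bin/sh
# Print (not run) the stop command for each node passed as an argument.
# ASSUMPTIONS: root SSH access between nodes, and node names other than
# dgdb1, are hypothetical. GRID_HOME defaults to the path seen in the logs.
print_stop_commands() {
    gh="${GRID_HOME:-/u01/app/11.2.0/grid}"
    for node in "$@"; do
        echo "ssh root@${node} ${gh}/bin/crsctl stop has -f"
    done
}
```

For example, print_stop_commands dgdb1 dgdb2 prints one line per node, which can be reviewed and then run by hand.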
(2) On one node, start CRS in exclusive mode with the -nocrs option; this also starts the ASM instance
[root@dgdb1 bin]# ./crsctl start crs -excl -nocrs
CRS-4123: Oracle High Availability Services has been started.
CRS-2672: Attempting to start 'ora.mdnsd' on 'dgdb1'
CRS-2676: Start of 'ora.mdnsd' on 'dgdb1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'dgdb1'
CRS-2676: Start of 'ora.gpnpd' on 'dgdb1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'dgdb1'
CRS-2672: Attempting to start 'ora.gipcd' on 'dgdb1'
CRS-2676: Start of 'ora.cssdmonitor' on 'dgdb1' succeeded
CRS-2676: Start of 'ora.gipcd' on 'dgdb1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'dgdb1'
CRS-2672: Attempting to start 'ora.diskmon' on 'dgdb1'
CRS-2676: Start of 'ora.diskmon' on 'dgdb1' succeeded
CRS-2676: Start of 'ora.cssd' on 'dgdb1' succeeded
CRS-2679: Attempting to clean 'ora.cluster_interconnect.haip' on 'dgdb1'
CRS-26