RAC: Backing Up and Restoring the OCR

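        Oracle Clusterware keeps the configuration of the entire cluster on shared storage. This includes the list of cluster nodes, the mapping of cluster database instances to nodes, and the CRS application resource information. All of it is stored on the OCR disk (or in an OCFS file), so the importance of the OCR hardly needs stating. Before and after any operation that changes the OCR configuration, it is advisable to back up the OCR immediately. This article describes OCR backup and recovery, mainly in an Oracle 10g RAC environment.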
 
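I. OCR backup and recovery concepts
        As with Oracle database backup and recovery, OCR backups are either physical or logical, so there are two backup methods and two matching recovery methods.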
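        Physical backup and recovery:
                By default, Oracle backs up the OCR every 4 hours and keeps the last 3 copies, plus the last backup from the previous day and the last backup from the previous week.
                Users cannot customize the backup frequency or the number of copies kept.
                The backup is performed by the CRSD process on the master node. The default location is the $CRS_HOME/cdata/<cluster_name> directory.
                Backup files are renamed automatically to reflect their age; the most recent one is named backup00.ocr.
                Because the backup runs on the master node, the backup files exist only on that node.
                If the master node crashes, one of the remaining nodes takes over the backup task.
                The backup directory can be changed with the ocrconfig -backuploc <directory_name> command.
                There can be at most two OCR locations, a primary OCR and a mirror OCR. They mirror each other to avoid a single point of failure.
                Physical backup and recovery cannot be done by simply copying the OCR with operating-system copy commands (when the OCR is a file); doing so leaves the OCR unusable. Use ocrconfig -restore instead.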
                
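        Logical backup and recovery:
                A backup produced with ocrconfig -export is called a logical backup.
                A logical backup is recommended before and after any major change to the OCR configuration, such as adding or removing nodes, modifying cluster resources, or creating a database.
                When a configuration mistake has corrupted the OCR, it can be recovered with ocrconfig -import.
                The same logical method can also restore a lost or damaged OCR disk (or file).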
        
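        Backup recommendations:
                Copy the files produced by Oracle's automatic backup to shared storage or another available storage device.
                Export the OCR configuration at least once a day.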
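The two recommendations above could be combined into one daily job. The following is only a sketch: it prints the commands instead of running them, and the directory layout under `SHARED_DIR` (`auto/<node>` and `exp/<node>`), the dated file name, and the function names are assumptions made for illustration. The CRS home and shared path are the ones used in this article's examples.

```shell
#!/bin/sh
# Sketch only: every path below is an assumption based on this article's examples.
CRS_HOME="${CRS_HOME:-/u01/oracle/crs}"
AUTO_BACKUP_DIR="${AUTO_BACKUP_DIR:-$CRS_HOME/cdata/crs}"
SHARED_DIR="${SHARED_DIR:-/u02/crs_bak/ocr_bak}"

# Build a dated export file name, one directory per node (hypothetical layout).
export_path() {
    # $1 = node name, $2 = date stamp (YYYYMMDD)
    printf '%s/exp/%s/ocr_%s.dmp\n' "$SHARED_DIR" "$1" "$2"
}

# Print (do not run) the commands a daily job would execute as root.
daily_backup_plan() {
    node="$1"; day="$2"
    echo "mkdir -p $SHARED_DIR/auto/$node $SHARED_DIR/exp/$node"
    echo "cp -p $AUTO_BACKUP_DIR/*.ocr $SHARED_DIR/auto/$node/"
    echo "$CRS_HOME/bin/ocrconfig -export $(export_path "$node" "$day")"
}

daily_backup_plan "$(uname -n)" "$(date +%Y%m%d)"
```

Printing the plan first makes it easy to review the paths before wiring the commands into cron under the root account.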

II. Backing up the OCR
  1. Automatic OCR backup
  # Use ocrconfig -showbackup to see which node holds the OCR backups and where
  oracle@bo2dbp:~> ocrconfig -showbackup
  bo2dbp 2013/02/25 06:23:15 /u01/oracle/crs/cdata/crs
  bo2dbp 2013/02/25 02:23:13 /u01/oracle/crs/cdata/crs
  bo2dbp 2013/02/24 22:23:13 /u01/oracle/crs/cdata/crs
  bo2dbp 2013/02/24 02:23:09 /u01/oracle/crs/cdata/crs
  bo2dbp 2013/02/22 18:23:04 /u01/oracle/crs/cdata/crs
  oracle@bo2dbp:~> ls -hltr /u01/oracle/crs/cdata/crs
  total 40M
  -rw-r--r-- 1 root root 6.7M 2013-02-22 18:23 week.ocr
  -rw-r--r-- 1 root root 6.7M 2013-02-24 02:23 day.ocr
  -rw-r--r-- 1 root root 6.7M 2013-02-24 22:23 backup02.ocr
  -rw-r--r-- 1 root root 6.7M 2013-02-25 02:23 backup01.ocr
  -rw-r--r-- 1 root root 6.7M 2013-02-25 02:23 day_.ocr
  -rw-r--r-- 1 root root 6.7M 2013-02-25 06:23 backup00.ocr
  # Change the physical backup location
  ocrconfig -backuploc <new_dirname>
  # Restore the OCR from a physical backup
  ocrconfig -restore <backup_file_name>
  A physical backup can only be restored with -restore; -import is not supported for it.
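Since the automatic backups live only on the master node, it helps to know which node and directory hold the newest one before attempting a restore. A minimal sketch, assuming the 10g line format of node, date, time and directory (later releases print a different layout); `latest_backup` is a hypothetical helper name:

```shell
# Pick the newest entry from `ocrconfig -showbackup` output read on stdin.
# Assumed line format: <node> <YYYY/MM/DD> <HH:MM:SS> <directory>
latest_backup() {
    awk 'NF >= 4 { print $2, $3, $1, $4 }' | LC_ALL=C sort | tail -n 1 |
        awk '{ print $3, $1, $2, $4 }'
}

# Example with sample lines; on a live node one would run
#   ocrconfig -showbackup | latest_backup
printf '%s\n' \
    'bo2dbp 2013/02/24 22:23:13 /u01/oracle/crs/cdata/crs' \
    'bo2dbp 2013/02/25 06:23:15 /u01/oracle/crs/cdata/crs' | latest_backup
```

The function sorts on the date and time columns rather than trusting the order of the lines, so it does not depend on the newest entry being printed first.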
  2. Manual OCR backup
  A manual OCR backup is a logical backup, taken with -export
  ocrconfig -export <backup_file_name>
  # Backup example
  # Export the OCR on different nodes and, where possible, store the export on a shared disk so that any node can recover from it
  oracle@bo2dbp:~> sudo -s /u01/oracle/crs/bin/ocrconfig -export /u02/crs_bak/ocr_bak/exp/bo2dbp/ocr_bak.dmp
  root's password:
  oracle@bo2dbp:/u02/crs_bak/ocr_bak/exp/bo2dbp> ls -hltr /u02/crs_bak/ocr_bak/exp/bo2dbp/ocr_bak.dmp
  -rw-r--r-- 1 root root 144K 2013-02-25 10:10 /u02/crs_bak/ocr_bak/exp/bo2dbp/ocr_bak.dmp
  oracle@bo2dbs:~> sudo -s /u01/oracle/crs/bin/ocrconfig -export /u02/crs_bak/ocr_bak/exp/bo2dbs/ocr_bak.dmp
  root's password:
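After an export it is worth confirming that the dump file really landed on disk. The sketch below only checks that the file exists and is non-empty, which does not prove that the export is usable; `check_export` is a hypothetical helper, not part of the Oracle tooling.

```shell
# Report whether an export file exists and is non-empty.
check_export() {
    f="$1"
    if [ ! -s "$f" ]; then
        echo "MISSING_OR_EMPTY $f"
        return 1
    fi
    size=$(wc -c < "$f" | tr -d ' ')
    echo "OK $f $size bytes"
}

# Demonstration on a throwaway file rather than a real export.
demo=$(mktemp)
printf 'sample' > "$demo"
check_export "$demo"
rm -f "$demo"
```

In a backup job the non-zero return code can be used to raise an alert, for example `check_export /u02/crs_bak/ocr_bak/exp/bo2dbp/ocr_bak.dmp || echo "export failed"`.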
III. Recovering the OCR
  1. Recovering a damaged OCR from the surviving OCR mirror
  a. First, simulate OCR corruption
  oracle@bo2dbp:~> dd if=/dev/zero of=/dev/raw/raw1 bs=1024k count=10
  10+0 records in
  10+0 records out
  10485760 bytes (10 MB) copied, 0.24662 seconds, 42.5 MB/s
  oracle@bo2dbp:~> ocrcheck
  Status of Oracle Cluster Registry is as follows :
  Version : 2
  Total space (kbytes) : 204560
  Used space (kbytes) : 6184
  Available space (kbytes) : 198376
  ID : 1512159503
  Device/File Name : /dev/raw/raw1
  Device/File integrity check failed
  Device/File Name : /dev/raw/raw11
  Device/File integrity check succeeded
  Cluster registry integrity check succeeded
  oracle@bo2dbp:~> ocrcheck
  Status of Oracle Cluster Registry is as follows :
  Version : 2
  Total space (kbytes) : 204560
  Used space (kbytes) : 6184
  Available space (kbytes) : 198376
  ID : 1512159503
  Device/File Name : /dev/raw/raw1
  Device/File needs to be synchronized with the other device
  Device/File Name : /dev/raw/raw11
  Device/File integrity check succeeded
  Cluster registry integrity check succeeded
  # Although the OCR is now damaged, the whole cluster remains online; the output is not listed here and readers can verify this themselves
  # Next, repair the OCR
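When checking several nodes, it is handy to condense the ocrcheck report to one line per OCR device. A minimal sketch, assuming the output layout shown in this article (a "Device/File Name" line followed by a "Device/File ..." status line); `ocr_device_status` is a hypothetical helper:

```shell
# Condense ocrcheck output (stdin) to "<device>: <status>" lines.
ocr_device_status() {
    awk '
        /Device\/File Name/ { sub(/^.*: */, ""); dev = $0; next }
        /Device\/File / && dev != "" {
            sub(/^.*Device\/File */, "")
            print dev ": " $0
            dev = ""
        }
    '
}

# Sample input modeled on the damaged-primary case; on a live node one would
# run: ocrcheck | ocr_device_status
ocr_device_status <<'EOF'
Device/File Name : /dev/raw/raw1
Device/File integrity check failed
Device/File Name : /dev/raw/raw11
Device/File integrity check succeeded
Cluster registry integrity check succeeded
EOF
```

For the sample input this prints one line for /dev/raw/raw1 with the failed check and one line for /dev/raw/raw11 with the successful one.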
  b. Verify that the raw device is available
  oracle@bo2dbp:~> sudo -s rcraw status | grep raw1
  root's password:
  /dev/raw/raw1: bound to major 8, minor 33
  /dev/raw/raw11: bound to major 8, minor 113
  c. Verify the raw device permissions
  oracle@bo2dbp:~> ls -hltr /dev/raw/raw1
  crw-rw---- 1 oracle dba 162, 1 2013-02-05 16:00 /dev/raw/raw1
  oracle@bo2dbp:~> ssh bo2dbs ls -hltr /dev/raw/raw1
  crw-rw---- 1 oracle dba 162, 1 2013-02-05 10:28 /dev/raw/raw1
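The ownership check can be scripted so that it runs the same way against every node. This is a sketch under the assumption that the expected owner and group are `oracle` and `dba`, as in the listing from this environment; other installations may use different values. `owner_group_ok` is a hypothetical helper that inspects one line of `ls -l` output.

```shell
# Succeeds when an `ls -l` line shows the expected owner and group.
# $1 = ls -l output line, $2 = expected owner, $3 = expected group
owner_group_ok() {
    printf '%s\n' "$1" |
        awk -v o="$2" -v g="$3" '{ exit !($3 == o && $4 == g) }'
}

# Sample line from this environment; on a live system the line would come
# from `ls -l /dev/raw/raw1` or `ssh bo2dbs ls -l /dev/raw/raw1`.
line='crw-rw---- 1 oracle dba 162, 1 2013-02-05 16:00 /dev/raw/raw1'
if owner_group_ok "$line" oracle dba; then
    echo "ownership ok"
else
    echo "ownership differs"
fi
```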
  d. Re-initialize the raw device
  oracle@bo2dbp:~> dd if=/dev/zero of=/dev/raw/raw1 bs=1024k count=200
  dd: writing `/dev/raw/raw1': No space left on device
  200+0 records in
  199+0 records out
  209698816 bytes (210 MB) copied, 4.84775 seconds, 43.3 MB/s
  e. Recover the primary OCR from the mirror OCR
  # This is effectively the same as adding a new OCR: the primary OCR copies its contents from the mirror OCR.
  # A damaged mirror OCR can be repaired the same way, using -replace ocrmirror.
  oracle@bo2dbp:~> sudo -s /u01/oracle/crs/bin/ocrconfig -replace ocr /dev/raw/raw1
  root's password:
  oracle@bo2dbp:~> ocrcheck
  Status of Oracle Cluster Registry is as follows :
  Version : 2
  Total space (kbytes) : 204560
  Used space (kbytes) : 6184
  Available space (kbytes) : 198376
  ID : 1512159503
  Device/File Name : /dev/raw/raw1
  Device/File integrity check succeeded
  Device/File Name : /dev/raw/raw11
  Device/File integrity check succeeded
  Cluster registry integrity check succeeded
  f. Verify the repair
  oracle@bo2dbp:~> cluvfy comp ocr -n all
  Verifying OCR integrity
  Checking OCR integrity...
  Checking the absence of a non-clustered configuration...
  All nodes free of non-clustered, local-only configurations.
  Uniqueness check for OCR device passed.
  Checking the version of OCR...
  OCR of correct Version "2" exists.
  Checking data integrity of OCR...
  Data integrity check for OCR passed.
  OCR integrity check passed.
  Verification of OCR integrity was successful.
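Which form of `ocrconfig -replace` applies depends on whether the damaged copy is the primary or the mirror. The following sketch only prints the matching command; `replace_command` is a hypothetical helper, and the device names are the ones from this environment.

```shell
# Print the ocrconfig command that would re-create a damaged OCR copy.
# $1 = damaged device, $2 = primary OCR device, $3 = mirror OCR device
replace_command() {
    case "$1" in
        "$2") echo "ocrconfig -replace ocr $1" ;;
        "$3") echo "ocrconfig -replace ocrmirror $1" ;;
        *)    echo "unknown device $1"; return 1 ;;
    esac
}

# The primary was the damaged copy in the walkthrough above.
replace_command /dev/raw/raw1 /dev/raw/raw1 /dev/raw/raw11
```

Printing the command rather than running it leaves the final decision, and the root privileges, with the operator.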
  2. Recovering the OCR from a logical backup (an exported file)
  a. First, check the OCR locations
  oracle@bo2dbp:~> more /etc/oracle/ocr.loc
  #Device/file /dev/raw/raw1 getting replaced by device /dev/raw/raw1
  ocrconfig_loc=/dev/raw/raw1
  ocrmirrorconfig_loc=/dev/raw/raw11
  local_only=false
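The two device names can also be pulled out of ocr.loc by a script, which is useful when later steps need to know which raw device is the primary and which is the mirror. A minimal sketch, assuming the key=value layout shown above; `ocr_locations` is a hypothetical helper:

```shell
# Print the primary and mirror OCR locations recorded in an ocr.loc file.
# $1 = path to the file (defaults to /etc/oracle/ocr.loc)
ocr_locations() {
    awk -F= '
        $1 == "ocrconfig_loc"       { print "primary " $2 }
        $1 == "ocrmirrorconfig_loc" { print "mirror "  $2 }
    ' "${1:-/etc/oracle/ocr.loc}"
}

# Demonstration on a temporary copy of the settings shown in this article.
sample=$(mktemp)
cat > "$sample" <<'EOF'
ocrconfig_loc=/dev/raw/raw1
ocrmirrorconfig_loc=/dev/raw/raw11
local_only=false
EOF
ocr_locations "$sample"
rm -f "$sample"
```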
  b. Stop CRS on both nodes
  oracle@bo2dbp:~> sudo -s /u01/oracle/crs/bin/crsctl stop crs
  root's password:
  Stopping resources. This could take several minutes.
  Successfully stopped CRS resources.
  Stopping CSSD.
  Shutting down CSS daemon.
  Shutdown request successfully issued.
  oracle@bo2dbp:~> ps -ef | grep d.bin | grep -v grep
  oracle@bo2dbs:~> sudo -s /u01/oracle/crs/bin/crsctl stop crs
  root's password:
  Stopping resources. This could take several minutes.
  Successfully stopped CRS resources.
  Stopping CSSD.
  Shutting down CSS daemon.
  Shutdown request successfully issued.
  # Author : Robinson
  # Blog : http://blog.csdn.net/robinson_0612
  oracle@bo2dbs:~> ps -ef | grep d.bin | grep -v grep
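The `ps -ef | grep d.bin` check can be made a little more explicit by naming the three daemons that appear in this article (crsd.bin, evmd.bin and ocssd.bin). The sketch below reads `ps -ef` output on stdin and lists which of them are still running; an empty result means the stack is down. `running_crs_daemons` is a hypothetical helper.

```shell
# List which CRS daemons appear in `ps -ef` output read on stdin.
running_crs_daemons() {
    awk '
        match($0, /(crsd|evmd|ocssd)\.bin/) {
            seen[substr($0, RSTART, RLENGTH)] = 1
        }
        END {
            out = ""
            if ("crsd.bin" in seen)  out = out " crsd.bin"
            if ("evmd.bin" in seen)  out = out " evmd.bin"
            if ("ocssd.bin" in seen) out = out " ocssd.bin"
            print substr(out, 2)
        }
    '
}

# Sample ps lines; on a live node one would run: ps -ef | running_crs_daemons
ps_sample='oracle 5844 5189 0 10:09 ? 00:00:00 /u01/oracle/crs/bin/evmd.bin
oracle 6357 5883 0 10:09 ? 00:00:04 /u01/oracle/crs/bin/ocssd.bin'
printf '%s\n' "$ps_sample" | running_crs_daemons
```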
  c. Deliberately corrupt the OCR
  oracle@bo2dbp:~> dd if=/dev/zero of=/dev/raw/raw1 bs=1024k count=10
  10+0 records in
  10+0 records out
  10485760 bytes (10 MB) copied, 0.1811 seconds, 57.9 MB/s
  oracle@bo2dbp:~> dd if=/dev/zero of=/dev/raw/raw11 bs=1024k count=10
  10+0 records in
  10+0 records out
  10485760 bytes (10 MB) copied, 0.167224 seconds, 62.7 MB/s
  oracle@bo2dbp:~> sudo -s /u01/oracle/crs/bin/crsctl start crs
  Attempting to start CRS stack
  The CRS stack will be started shortly
  oracle@bo2dbp:~> ps -ef | grep d.bin | grep -v grep
  oracle@bo2dbp:~> ./crs_stat.sh # this check can no longer communicate with CRS
  Resource name Target State
  -------------- ------ -----
  error connecting to CRSD at [(ADDRESS=(PROTOCOL=ipc)(KEY=ora_crsqs))] clsccon 184
  oracle@bo2dbp:~> crs_stat -t
  CRS-0184: Cannot communicate with the CRS daemon.
  d. Recover the OCR from the exported backup file
  oracle@bo2dbp:~> sudo -s /u01/oracle/crs/bin/ocrconfig -import /u02/crs_bak/ocr_bak/exp/bo2dbp/ocr_bak.dmp
  oracle@bo2dbp:~> sudo -s /u01/oracle/crs/bin/crsctl start crs
  Attempting to start CRS stack
  The CRS stack will be started shortly
  oracle@bo2dbp:~> ps -ef | grep d.bin | grep -v grep
  oracle 27209 23220 0 10:32 ? 00:00:00 /u01/oracle/crs/bin/evmd.bin
  root 27307 23392 0 10:32 ? 00:00:01 /u01/oracle/crs/bin/crsd.bin reboot
  oracle 27613 27153 0 10:32 ? 00:00:00 /u01/oracle/crs/bin/ocssd.bin
  # Try to start CRS on the second node
  oracle@bo2dbs:~> sudo -s /u01/oracle/crs/bin/crsctl start crs
  root's password:
  Attempting to start CRS stack
  The CRS stack will be started shortly
  e. Run ocrcheck on the second node; it now succeeds
  oracle@bo2dbs:~> ocrcheck
  Status of Oracle Cluster Registry is as follows :
  Version : 2
  Total space (kbytes) : 204560
  Used space (kbytes) : 6184
  Available space (kbytes) : 198376
  ID : 1325424958
  Device/File Name : /dev/raw/raw1
  Device/File integrity check succeeded
  Device/File Name : /dev/raw/raw11
  Device/File integrity check succeeded
  Cluster registry integrity check succeeded
  oracle@bo2dbs:~> cluvfy comp ocr -n all # verify with the cluvfy tool
  Verifying OCR integrity
  Checking OCR integrity...
  Checking the absence of a non-clustered configuration...
  All nodes free of non-clustered, local-only configurations.
  Uniqueness check for OCR device passed.
  Checking the version of OCR...
  OCR of correct Version "2" exists.
  Checking data integrity of OCR...
  Data integrity check for OCR passed.
  OCR integrity check passed.
  Verification of OCR integrity was successful.
  3. Recovering the OCR from a physical backup
  a. Check the OCR backup information
  oracle@bo2dbp:~> ocrconfig -showbackup
  bo2dbp 2013/02/25 06:23:15 /u01/oracle/crs/cdata/crs
  bo2dbp 2013/02/25 02:23:13 /u01/oracle/crs/cdata/crs
  bo2dbp 2013/02/24 22:23:13 /u01/oracle/crs/cdata/crs
  bo2dbp 2013/02/24 02:23:09 /u01/oracle/crs/cdata/crs
  bo2dbp 2013/02/22 18:23:04 /u01/oracle/crs/cdata/crs
  oracle@bo2dbp:~> ls -hltr /u01/oracle/crs/cdata/crs # the OCR backups are currently on node 1
  total 40M
  -rw-r--r-- 1 root root 6.7M 2013-02-22 18:23 week.ocr
  -rw-r--r-- 1 root root 6.7M 2013-02-24 02:23 day.ocr
  -rw-r--r-- 1 root root 6.7M 2013-02-24 22:23 backup02.ocr
  -rw-r--r-- 1 root root 6.7M 2013-02-25 02:23 backup01.ocr
  -rw-r--r-- 1 root root 6.7M 2013-02-25 02:23 day_.ocr
  -rw-r--r-- 1 root root 6.7M 2013-02-25 06:23 backup00.ocr
  b. Deliberately corrupt the OCR
  oracle@bo2dbp:~> dd if=/dev/zero of=/dev/raw/raw1 bs=1024k count=10
  10+0 records in
  10+0 records out
  10485760 bytes (10 MB) copied, 0.279904 seconds, 37.5 MB/s
  oracle@bo2dbp:~> dd if=/dev/zero of=/dev/raw/raw11 bs=1024k count=10
  10+0 records in
  10+0 records out
  10485760 bytes (10 MB) copied, 0.145885 seconds, 71.9 MB/s
  # At this point every OCR-related operation fails
  oracle@bo2dbp:~> ocrcheck
  Segmentation fault (core dumped)
  oracle@bo2dbp:~> ocrconfig -showbackup
  Segmentation fault (core dumped)
  oracle@bo2dbp:~> crs_stat -t
  Segmentation fault (core dumped)
  # The ASM instance and the RAC instance are still online
  oracle@bo2dbp:~> ps -ef | grep pmon
  oracle 7915 1 0 10:09 ? 00:00:00 asm_pmon_+ASM1
  oracle 9234 1 0 10:10 ? 00:00:00 ora_pmon_ora10g1
  oracle 31704 11229 0 10:26 pts/0 00:00:00 grep pmon
  c. Shut down CRS, the cluster database and ASM
  oracle@bo2dbp:~> export ORACLE_SID=ora10g1
  oracle@bo2dbp:~> sqlplus / as sysdba
  SQL> show parameter db_name
  NAME TYPE VALUE
  ------------------------------------ ----------- ------------
  db_name string ora10g
  # Check the OCR locations now, so the matching raw devices are known at recovery time
  oracle@bo2dbp:~> more /etc/oracle/ocr.loc
  #Device/file /dev/raw/raw1 getting replaced by device /dev/raw/raw1
  ocrconfig_loc=/dev/raw/raw1
  ocrmirrorconfig_loc=/dev/raw/raw11
  local_only=false
  # Stopping CRS returns an error
  oracle@bo2dbp:~> sudo -s /u01/oracle/crs/bin/crsctl stop crs
  root's password:
  Segmentation fault
  oracle@bo2dbs:~> sudo -s /u01/oracle/crs/bin/crsctl stop crs
  root's password:
  Segmentation fault
  # The process list below shows that the crsd process has already crashed
  oracle@bo2dbp:~> ps -ef | grep d.bin | grep -v grep
  oracle 5844 5189 0 10:09 ? 00:00:00 /u01/oracle/crs/bin/evmd.bin
  oracle 6357 5883 0 10:09 ? 00:00:04 /u01/oracle/crs/bin/ocssd.bin
  # Shut down the cluster database
  oracle@bo2dbp:~> export ORACLE_SID=ora10g1
  oracle@bo2dbp:~> sqlplus / as sysdba
  SQL> shutdown immediate;
  oracle@bo2dbs:~> export ORACLE_SID=ora10g2
  oracle@bo2dbs:~> sqlplus / as sysdba
  SQL> shutdown immediate;
  d. Reboot the nodes
  bo2dbp:~ # reboot
  bo2dbs:~ # reboot
  e. Verify the raw devices that hold the OCR and their permissions
  # Verify that the raw devices are available
  oracle@bo2dbp:~> sudo -s rcraw status | grep raw1
  root's password:
  /dev/raw/raw1: bound to major 8, minor 33
  /dev/raw/raw11: bound to major 8, minor 113
  # Verify the raw device permissions
  oracle@bo2dbp:~> ls -hltr /dev/raw/raw1
  crw-rw---- 1 oracle dba 162, 1 2013-02-05 16:00 /dev/raw/raw1
  oracle@bo2dbp:~> ssh bo2dbs ls -hltr /dev/raw/raw1
  crw-rw---- 1 oracle dba 162, 1 2013-02-05 10:28 /dev/raw/raw1
  # Wipe the raw devices
  oracle@bo2dbp:~> dd if=/dev/zero of=/dev/raw/raw1 bs=1024k count=200
  dd: writing `/dev/raw/raw1': No space left on device
  200+0 records in
  199+0 records out
  209698816 bytes (210 MB) copied, 4.84775 seconds, 43.3 MB/s
  oracle@bo2dbp:~> dd if=/dev/zero of=/dev/raw/raw11 bs=1024k count=200
  dd: writing `/dev/raw/raw11': No space left on device
  200+0 records in
  199+0 records out
  209698816 bytes (210 MB) copied, 2.30847 seconds, 90.8 MB/s
  f. Restore the OCR from the physical backup
  oracle@bo2dbp:~> sudo -s /u01/oracle/crs/bin/ocrconfig -restore /u01/oracle/crs/cdata/crs/backup00.ocr
  root's password:
  oracle@bo2dbp:~> sudo -s /u01/oracle/crs/bin/crsctl start crs
  Attempting to start CRS stack
  The CRS stack will be started shortly
  oracle@bo2dbs:~> sudo -s /u01/oracle/crs/bin/crsctl start crs
  root's password:
  Attempting to start CRS stack
  The CRS stack will be started shortly
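Both recovery paths in this article follow the same outline: stop CRS everywhere, run `ocrconfig -restore` or `ocrconfig -import` on one node, start CRS everywhere, then check. The sketch below prints that outline as a reviewable plan and runs nothing. It is a simplification: it does not cover the raw-device checks or the reboot that was needed here when `crsctl stop crs` itself failed. `recovery_plan` and the `CRS_BIN` default are assumptions based on this environment.

```shell
# Print (do not run) the command sequence for an OCR recovery.
# $1 = mode: "restore" (physical backup) or "import" (logical export)
# $2 = backup file; remaining arguments = cluster nodes
CRS_BIN="${CRS_BIN:-/u01/oracle/crs/bin}"   # assumed CRS home, as in this article

recovery_plan() {
    if [ "$#" -lt 3 ]; then
        echo "usage: recovery_plan restore|import <file> <node>..."
        return 1
    fi
    mode="$1"; file="$2"; shift 2
    case "$mode" in
        restore|import) ;;
        *) echo "mode must be restore or import"; return 1 ;;
    esac
    for n in "$@"; do echo "[$n] $CRS_BIN/crsctl stop crs"; done
    # The recovery itself runs on the first node listed.
    echo "[$1] $CRS_BIN/ocrconfig -$mode $file"
    for n in "$@"; do echo "[$n] $CRS_BIN/crsctl start crs"; done
    echo "[$1] $CRS_BIN/ocrcheck"
}

recovery_plan restore /u01/oracle/crs/cdata/crs/backup00.ocr bo2dbp bo2dbs
```

Each printed line is prefixed with the node it is meant for, so the plan can be checked against the cluster layout before anyone runs the commands as root.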
  g. Verify the recovery
  oracle@bo2dbp:~> ocrcheck
  Status of Oracle Cluster Registry is as follows :
  Version : 2
  Total space (kbytes) : 204560
  Used space (kbytes) : 6184
  Available space (kbytes) : 198376
  ID : 1512159503
  Device/File Name : /dev/raw/raw1
  Device/File integrity check succeeded
  Device/File Name : /dev/raw/raw11
  Device/File integrity check succeeded
  Cluster registry integrity check succeeded
  oracle@bo2dbp:~> cluvfy comp ocr -n all
  Verifying OCR integrity
  Checking OCR integrity...
  Checking the absence of a non-clustered configuration...
  All nodes free of non-clustered, local-only configurations.
  Uniqueness check for OCR device passed.
  Checking the version of OCR...
  OCR of correct Version "2" exists.
  Checking data integrity of OCR...
  Data integrity check for OCR passed.
  OCR integrity check passed.
  Verification of OCR integrity was successful.
  # Verify the applications
  oracle@bo2dbp:~> ./crs_stat.sh | grep bo2dbp
  ora.bo2dbp.ASM1.asm ONLINE ONLINE on bo2dbp
  ora.bo2dbp.LISTENER_BO2DBP.lsnr ONLINE ONLINE on bo2dbp
  ora.bo2dbp.LISTENER_ORA10G_BO2DBP.lsnr ONLINE ONLINE on bo2dbp
  ora.bo2dbp.gsd ONLINE ONLINE on bo2dbp
  ora.bo2dbp.ons ONLINE ONLINE on bo2dbp
  ora.bo2dbp.vip ONLINE ONLINE on bo2dbp
  ora.ora10g.ora10g1.inst