Original title: Caution: Oracle 11g ADG RAC is not highly available in this scenario
Editor's digest: In a two-node 11gR2 (11.2.0.3) RAC ADG environment, node 1 (the redo apply node) crashed abnormally due to a hardware fault, and at the same time all application connections on instance 2, which had been OPEN, were dropped as well.
Today, in a two-node 11gR2 (11.2.0.3) RAC ADG environment, node 1 (the redo apply node) crashed abnormally because of a hardware problem, and at the same time the applications on instance 2 were all disconnected (both instances had been OPEN). The ADG was serving some read-only workload, yet the database alert log on node 2 showed no obvious sign of the instance being closed manually; it had switched to the MOUNT state automatically. Isn't RAC supposed to be highly available? Why does the death of one node drag the other node down with it?
If you check the state of instance 2 at this point, it is actually MOUNT. Not everyone knows that the database has an ALTER DATABASE CLOSE command; once an instance has been manually closed, it cannot be opened again within the same instance lifetime and must be restarted (PDBs excepted). And, as noted above, the alert log of instance 2 showed no trace of a close being issued. An excerpt follows.
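As a quick illustration of that close behavior, here is a hypothetical SQL*Plus session (not from the incident itself; the error shown is what 11g typically raises when you try to re-open a database that was closed in the same instance lifetime):

```sql
SQL> alter database close;
Database altered.
-- The instance drops from OPEN to MOUNT.

SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-16196: database has been previously opened and closed
-- The instance must be restarted before the database can be opened again.
```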
Alert log excerpt from DG instance 2:
2020-09-03 00:10:39.601000 +08:00
ORA-01555 caused by SQL statement below (SQL ID: 4snkhx5vxrmv2, Query Duration=7340 sec, SCN: 0x0f46.c04d2402):
select....
2020-09-04 11:35:29.504000 +08:00
Archived Log entry 105312 added for thread 2 sequence 200520 ID 0x1fcb56a7 dest 1:
2020-09-08 14:16:17.954000 +08:00
Reconfiguration started (old inc 22, new inc 24)
List of instances:
 2 (myinst: 2)
 Global Resource Directory frozen
 * dead instance detected - domain 0 invalid = TRUE
 Communication channels reestablished
 Master broadcasted resource hash value bitmaps
 Non-local Process blocks cleaned out
 LMS 0: 23 GCS shadows cancelled, 0 closed, 0 Xw survived
 LMS 2: 34 GCS shadows cancelled, 0 closed, 0 Xw survived
 LMS 1: 19 GCS shadows cancelled, 0 closed, 0 Xw survived
 Set master node info
 Submitted all remote-enqueue requests
2020-09-08 14:16:19.005000 +08:00
 Dwn-cvts replayed, VALBLKs dubious
 All grantable enqueues granted
 Post SMON to start 1st pass IR
2020-09-08 14:16:22.014000 +08:00
ARC1: Becoming the active heartbeat ARCH
ARC1: Becoming the active heartbeat ARCH
2020-09-08 14:16:23.328000 +08:00
 Submitted all GCS remote-cache requests
 Post SMON to start 1st pass IR
 Fix write in gcs resources
2020-09-08 14:16:24.368000 +08:00
Reconfiguration complete
Recovery session aborted due to instance crash
Close the database due to aborted recovery session
SMON: disabling tx recovery
2020-09-08 14:16:54.955000 +08:00
Stopping background process MMNL
Stopping background process MMON
2020-09-08 14:17:26.530000 +08:00
Background process MMON not dead after 30 seconds
Killing background process MMON
2020-09-08 14:18:04.907000 +08:00
Starting background process MMON
MMON started with pid=27, OS id=18743
Starting background process MMNL
MMNL started with pid=1865, OS id=18745
CLOSE: killing server sessions.
2020-09-08 14:18:07.003000 +08:00
Active process 3858 user 'grid' program 'oracle@anbob2'
Active process 14847 user 'grid' program 'oracle@anbob2'
Active process 3435 user 'grid' program 'oracle@anbob2'
Active process 25029 user 'grid' program 'oracle@anbob2'
Active process 9789 user 'grid' program 'oracle@anbob2'
Active process 23815 user 'grid' program 'oracle@anbob2'
...
Active process 24285 user 'itmuser' program 'oracle@anbob2 (TNS V1-V3)'
Active process 10045 user 'grid' program 'oracle@anbob2'
Active process 24229 user 'grid' program 'oracle@anbob2'
CLOSE: all sessions shutdown successfully.
SMON: disabling tx recovery
SMON: disabling cache recovery
2020-09-08 14:19:06.638000 +08:00
2020-09-08 14:26:56.608000 +08:00
alter database recover managed standby database using current logfile disconnect from session
Attempt to start background Managed Standby Recovery process (tbcsc2)
MRP0 started with pid=73, OS id=9746
MRP0: Background Managed Standby Recovery process started (tbcsc2)
2020-09-08 14:27:01.712000 +08:00
started logmerger process
Managed Standby Recovery starting Real Time Apply
2020-09-08 14:27:08.171000 +08:00
Parallel Media Recovery started with 32 slaves
Waiting for all non-current ORLs to be archived...
All non-current ORLs have been archived.
Recovery of Online Redo Log: Thread 2 Group 41 Seq 201120 Reading mem 0
  Mem# 0: /dev/yyc_oravg04/ryyc_redo41
Media Recovery Log /yyc1_arch/arch_1_288357_920590168.arc
Completed: alter database recover managed standby database using current logfile disconnect from session
2020-09-08 14:27:29.221000 +08:00
-- Because the archived logs from host 1 were missing, redo could not be applied, so we cancelled redo apply and aborted the database
alter database recover managed standby database cancel
2020-09-08 14:31:57.640000 +08:00
WARNING: inbound connection timed out (ORA-3136)
2020-09-08 14:33:30.958000 +08:00
Shutting down instance (abort)
In short: when the instance that is applying redo terminates abnormally, all other instances that are OPEN READ ONLY will be closed. In a RAC ADG environment, if the apply instance crashes in the middle of applying redo, it leaves cache fusion locks behind on the surviving instances, so queries there could see inconsistent data. The database therefore has to be closed and reopened to restore a consistent state between the buffer caches and the data files. If DG Broker is configured (and the version is above 11.2.0.2), this happens automatically; without DG Broker, simply open the database manually on a surviving node and then issue the redo apply command to continue applying logs there.
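The manual steps above can be sketched as the following SQL*Plus session on the surviving standby instance (a sketch under the assumptions of this incident, not a transcript from it):

```sql
-- On the surviving standby instance, now sitting in MOUNT after the automatic close:
SQL> alter database open;
-- A physical standby opens READ ONLY, restoring the Active Data Guard service.

-- Then restart real-time redo apply on this node:
SQL> alter database recover managed standby database
  2  using current logfile disconnect from session;
```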
Here is the relevant MOS explanation:
Symptoms
In an Active Data Guard RAC standby, if the redo apply instance crashes, all other instances of that standby that were open Read Only will be closed and returned to the MOUNT state. This disconnects all readers of the Active Data Guard standby.
Cause
In an Active Data Guard RAC standby, if the redo apply instance crashes in the middle of media recovery, it leaves the RAC cache fusion locks on the surviving instances and the data files on disk in an in-flux state. In such a situation, queries on the surviving instances can potentially see inconsistent data. To resolve this in-flux state, the entire standby database is closed. Upon opening the first instance after such a close, the buffer caches and datafiles are made consistent again.
But from 12.1, when the apply instance crashes in the middle of applying changes, one of the remaining open instances will be automatically posted to perform "ADG instance recovery". We can confirm that ADG instance recovery ran by checking the alert log for messages like "Beginning ADG Instance Recovery" and "Completed ADG Instance Recovery". If DG Broker is enabled, the Broker will start the MRP on any of the surviving instances.
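To verify which instance (if any) is now running redo apply after such a recovery, a query like the following can be used (a sketch; `GV$MANAGED_STANDBY` is the RAC-wide view of the managed recovery processes):

```sql
-- Check whether redo apply (MRP0) is running, and on which instance:
SQL> select inst_id, process, status, thread#, sequence#
  2    from gv$managed_standby
  3   where process like 'MRP%';
-- An MRP0 row with STATUS = 'APPLYING_LOG' indicates redo apply is active on that instance.
```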