stream恢复处理一例(转)

目前酒窖数据库已经恢复正常了,具体如下:25618347_201203081451431.jpg

时间还应该倒回到2天前的早上,话说一大早刚上班没多久,我们主管跟我说了个情况,酒窖的stream是不是停掉了,主站最新订单信息都没有传过来,查看了下监控界面:25618347_201203081500561.jpg

25618347_201203081508401.jpg

25618347_201203081512561.jpg

出现了此类等待事件(同时伴有configuration事件),查询流运行情况发现,各个进程都是‘ENABLE’状态的,但是告警日志里面只有注册归档日志的信息,却没有进行挖掘,也没有报错,仔细观察发现propagation进程出现一个错误告警:ORA-25307 ENQUEUE RATE TOO HIGH. ENABLE FLOW CONTROL,因为之前也出现过这种情况,之后会自动恢复,就没过多关注,然后询问相关开发人员,前一天晚上酒窖站上了大概20个新功能,初步怀疑是新功能模块导致的,查询相关等待事件v$session_wait中等待事件类型非Idle的,发现上述等待一直持续,但是根据P1,P2,P3参数也匹配不到具体的sql使用情况,然后查询v$streams_transaction视图发现有大事务一直卡在那里,推测是流池的问题,试着调整之后重启相关进程仍不得解决,最后不得以重启propagation、capture和apply进程,发现propagation和apply进程不能正常关闭,最后强制关闭,然后重启:

begin
dbms_propagation_adm.stop_propagation(
propagation_name => 'PROPAGATION$_1033',force=>true);
end;
/
begin
dbms_apply_adm.stop_apply(
apply_name => 'APPLY$_YESMYNET_1142',force=>true);
end;
/
问题依旧,最后跟我们主管商量,抱着试试看的态度,等下班时间重启酒窖数据库......
晚7点10分左右,tomcat等服务都停掉之后,重启数据库(先把相关进程propagation、capture和apply进程停掉),起来之后,把三个进程启动起来,观察告警日志发现,日志开始应用,但是应用到某些归档的时候就卡住不动了,应用的过程中不断的报600错误: 25618347_201203081542351.jpg
根据报错信息和相关的trace文件来看,应该是流的高级队列出现了问题,
网上对这个BUG的描述如下:

ORA-600 [kwqbdrcp101] after Drop or Recreate of Propagation

(估计是之前强制停掉propagation进程引起的,所以以后得慎重~~)

Applies to:

Oracle Server - Enterprise Edition - Version: 10.2.0.4 and later [Release: 10.2 and later]
~ S6q]!cRq9p8?P0Information in this document applies to any platform.

p7r@a&q*UN0
Symptoms:
ORA-00600: internal error code, arguments: [kwqbdrcp101], [], [], [], [], [], [], []
Current SQL statement for this session:
{W?u-MW_;p0DECLARE job BINARY_INTEGER := :job; next_date DATE := :mydate; broken BOOLEAN := FALSE; BEGIN next_date := sys.dbms_aqadm.aq$_propaq(job); :mydate := next_date; IF broken THEN :b := 1; ELSE :b := 0; END IF; END;
ITPUB个人空间.`6D$P;DB6{
Queue_to_queue is false.
Changes
Recreate the streams environment.
X3|Zjhr'?:nbbB0Drop/recreate the propagation.
Cause

5q#_K?u&p w0 Due to message existence in Queue.
没办法重建propagation进程:
1. Stop the Propagation.
exec DBMS_PROPAGATION_ADM.STOP_PROPAGATION('propagation_name'); --正常手段停不下来的话,可以强制停掉
2. Drop the Propagation.
exec DBMS_PROPAGATION_ADM.DROP_PROPAGATION('propagation_name');
3. Purge the Queue table.
DECLARE
options dbms_aqadm.aq$_purge_options_t;
BEGIN
options.delivery_mode:=DBMS_AQADM.BUFFERED;
DBMS_AQADM.PURGE_QUEUE_TABLE('YESMYNET$APPQT',NULL,options);
END;
/
4. Create the Propagation.
BEGIN
DBMS_PROPAGATION_ADM.CREATE_PROPAGATION(
propagation_name => 'PROPAGATION$_1033',
source_queue => 'STRMADMIN2.YESMYNET$CAPQ',
destination_queue => 'STRMADMIN2.YESMYNET$APPQ',
destination_dblink => null ,--如果源队列和目标队列位于同一个database,这里置为null;
rule_set_name => 'RULESET$_1032' -- specify rule_set name, if you want
--propagation to use a rule_set, otherwise eliminate this parameter.
);
END;
/
重建完之后启动进程,可恶的600错误终于消失了: 25618347_201203081602561.jpg
但是还是应用到一定的归档就卡住了,然后观察监控界面发现: 25618347_201203081607401.jpg
看来到此问题的根本原因还没有找到,最后查询相关资料发现,linux下内存设置不合理可能会出现 Streams capture: waiting for subscribers to catch up等待事件,而我们的酒窖库64G的内存,分配给SGA的达到了30G,PGA10G,最后利用内存调整顾问,把sga调整了下,stream pool也稍微进行了调整,然后重启数据库......
发现这类等待事件终于消失了,然后启动stream相关进程,疯狂的应用归档,贼爽了!运行到一半卡住了:
25618347_201203081616131.jpg
最后发现thread 3的一个归档没有传过来,手工拷贝过来,注册一下,终于又欢快的跑了起来:
Wed Mar 07 00:09:14 CST 2012
LOGMINER: Begin mining logfile for session 141 thread 2 sequence 24014, /arch/stdarch/_arch2_24014_744833311.arc
Wed Mar 07 00:09:15 CST 2012
LOGMINER: End mining logfile: /arch/_arch3_13377_744833311.arc
Wed Mar 07 00:16:33 CST 2012
alter database register logfile '/arch/_arch3_13378_744833311.dbf' for 'YESMYNET$CAP'
Wed Mar 07 00:16:33 CST 2012
There are 1 logfiles specified.
ALTER DATABASE REGISTER [LOGICAL] LOGFILE
Completed:
alter database register logfile '/arch/_arch3_13378_744833311.dbf' for 'YESMYNET$CAP'
Wed Mar 07 00:16:33 CST 2012
LOGMINER: Begin mining logfile for session 141 thread 3 sequence 13378, /arch/_arch3_13378_744833311.dbf
Wed Mar 07 00:16:35 CST 2012
LOGMINER: End mining logfile: /arch/stdarch/_arch1_25563_744833311.arc
Wed Mar 07 00:16:35 CST 2012
LOGMINER: Begin mining logfile for session 141 thread 2 sequence 24014, /arch/stdarch/_arch2_24014_744833311.arc
Wed Mar 07 00:16:35 CST 2012
LOGMINER: End mining logfile: /arch/_arch3_13377_744833311.arc
Wed Mar 07 00:17:47 CST 2012
LOGMINER: Begin mining logfile for session 141 thread 3 sequence 13378, /arch/_arch3_13378_744833311.dbf
Wed Mar 07 00:17:48 CST 2012
LOGMINER: End mining logfile: /arch/stdarch/_arch1_25563_744833311.arc
Wed Mar 07 00:17:48 CST 2012
LOGMINER: Begin mining logfile for session 141 thread 1 sequence 25564, /arch/_arch1_25564_744833311.arc
Wed Mar 07 00:17:48 CST 2012
LOGMINER: End mining logfile: /arch/stdarch/_arch2_24014_744833311.arc
Wed Mar 07 00:17:48 CST 2012
LOGMINER: Begin mining logfile for session 141 thread 2 sequence 24015, /arch/_arch2_24015_744833311.arc
。。。。
。。。。
LOGMINER: Begin mining logfile for session 141 thread 3 sequence 13417, /arch/_arch3_13417_744833311.arc
Wed Mar 07 00:24:02 CST 2012
LOGMINER: End mining logfile: /arch/stdarch/_arch1_25612_744833311.arc
Wed Mar 07 00:24:02 CST 2012
LOGMINER: Begin mining logfile for session 141 thread 1 sequence 25613, /arch/_arch1_25613_744833311.arc
Wed Mar 07 00:24:05 CST 2012
LOGMINER: End mining logfile: /arch/_arch1_25613_744833311.arc
Wed Mar 07 00:24:05 CST 2012
LOGMINER: Begin mining logfile for session 141 thread 1 sequence 25614, /arch/_arch1_25614_744833311.arc
Wed Mar 07 00:24:06 CST 2012
LOGMINER: End mining logfile: /arch/_arch1_25614_744833311.arc
Wed Mar 07 00:24:06 CST 2012
LOGMINER: Begin mining logfile for session 141 thread 1 sequence 25615, /oracle/oradata/blog/stdredo41.log
Wed Mar 07 00:24:06 CST 2012
LOGMINER: End mining logfile: /arch/_arch2_24067_744833311.arc
Wed Mar 07 00:24:06 CST 2012
LOGMINER: Begin mining logfile for session 141 thread 2 sequence 24068, /oracle/oradata/blog/stdredo181.log
Wed Mar 07 00:24:06 CST 2012
LOGMINER: End mining logfile: /arch/_arch3_13417_744833311.arc
Wed Mar 07 00:24:06 CST 2012
LOGMINER: Begin mining logfile for session 141 thread 3 sequence 13418, /oracle/oradata/blog/stdredo321.log
一直到7号凌晨0点25分左右,2,3百个归档终于全部应用完了,开心啊,终于搞定了,收工~~

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/22949165/viewspace-718094/,如需转载,请注明出处,否则将追究法律责任。

转载于:http://blog.itpub.net/22949165/viewspace-718094/

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值