linux 进程hang,GoldenGate Extract进程hang问题解决一例

一直运行正常的extract进程突然hang住了,起初怀疑是不是找不到归档所致,但是细细的检查了下,extract所需的归档都还在,而且日志中也没报任何错。而且将extract进程停掉后,也能正常起来,没有任何错误或者告警:

// ggserr.log

2015-05-25 11:03:27 INFO OGG-01513 Oracle GoldenGate Capture for Oracle, elis.prm: Positioning to (Thread 1) Sequence 5824, RBA 290544656, SCN 1415.2504802626.

2015-05-25 11:03:27 INFO OGG-01516 Oracle GoldenGate Capture for Oracle, elis.prm: Positioned to (Thread 1) Sequence 5824, RBA 290544656, SCN 1415.2504802626, May 20, 2015 4:50:25 PM.

2015-05-25 11:03:27 INFO OGG-01513 Oracle GoldenGate Capture for Oracle, elis.prm: Positioning to (Thread 2) Sequence 4798, RBA 18781712, SCN 1415.2504799356.

2015-05-25 11:03:27 INFO OGG-01516 Oracle GoldenGate Capture for Oracle, elis.prm: Positioned to (Thread 2) Sequence 4798, RBA 18781712, SCN 1415.2504799356, May 20, 2015 4:50:19 PM.

2015-05-25 11:03:27 INFO OGG-00993 Oracle GoldenGate Capture for Oracle, elis.prm: EXTRACT ELIS started.

//extract进程的rpt文件

2015-05-25 11:03:27 INFO OGG-00546 Default thread stack size: 196608.

2015-05-25 11:03:27 INFO OGG-00547 Increasing thread stack size from 196608 to 1048576.

2015-05-25 11:03:27 INFO OGG-01513 Positioning to (Thread 1) Sequence 5824, RBA 290544656, SCN 1415.2504802626.

2015-05-25 11:03:27 INFO OGG-01516 Positioned to (Thread 1) Sequence 5824, RBA 290544656, SCN 1415.2504802626, May 20, 2015 4:50:25 PM.

2015-05-25 11:03:27 INFO OGG-01513 Positioning to (Thread 2) Sequence 4798, RBA 18781712, SCN 1415.2504799356.

2015-05-25 11:03:27 INFO OGG-01516 Positioned to (Thread 2) Sequence 4798, RBA 18781712, SCN 1415.2504799356, May 20, 2015 4:50:19 PM.

2015-05-25 11:03:27 INFO OGG-01517 Position of first record processed for Thread 1, Sequence 5824, RBA 290544656, SCN 1415.2504802623, May 20, 2015 4:50:25 PM.

2015-05-25 11:03:27 INFO OGG-01644 BOUNDED RECOVERY: COMPLETE: for object pool 1: p33619994_Redo Thread 1 at SeqNo: 5824, RBA: 290545168, SCN: 1415.2504802626 (6079883526466), Thread: 1.

2015-05-25 11:03:27 INFO OGG-01055 Recovery initialization completed for target file /ogg/ggate/data/dirext/lis/ea002534, at RBA 1109.

2015-05-25 11:03:27 INFO OGG-01478 Output file /ogg/ggate/data/dirext/lis/ea is using format RELEASE 11.2.

2015-05-25 11:03:27 INFO OGG-01517 Position of first record processed for Thread 2, Sequence 4798, RBA 18781712, SCN 1415.2504799356, May 20, 2015 4:50:19 PM.

2015-05-25 11:03:27 INFO OGG-01026 Rolling over remote file /ogg/ggate/data/dirext/lis/ea002534.

2015-05-25 11:03:27 INFO OGG-01053 Recovery completed for target file /ogg/ggate/data/dirext/lis/ea002535, at RBA 1109.

2015-05-25 11:03:27 INFO OGG-01057 Recovery completed for all targets.

从日志中看,extract进程没有问题,但是extract进程实际上是运行不正常的:

GGSCI (hxddlis01) 8> info all

Program Status Group Lag at Chkpt Time Since Chkpt

MANAGER RUNNING

JAGENT RUNNING

EXTRACT RUNNING ELIS 96:14:58 00:00:04

EXTRACT RUNNING PL2IS 00:00:00 00:00:00

EXTRACT RUNNING PLIS 00:00:00 00:00:09

extract进程的lag越来越大,而且trail文件也不再增长。这情况真是让人一头雾水。我们先来看看当前系统的运行环境:

GoldenGate:11.2.1.0.27

Database: 11.2.0.3 两节点的RAC,使用ASM

OS: AIX 6.1

然后搜索Oracle Support,找到一篇文章1432994.1,和碰到的情形有点相像。在我们的环境中也是使用了DBLOGREADER,但是不同的是DBLOGREADERBUFSIZE已经配置成1M:

...

TRANLOGOPTIONS BUFSIZE 1048576

TRANLOGOPTIONS DBLOGREADER, DBLOGREADERBUFSIZE 1048576

--TRANLOGOPTIONS _NOREADAHEAD ANY

EXTTRAIL /ogg/ggate/data/dirext/lis/ea

...

这也是AIX下能配置的最大值。难道该值还是太大的缘故?所以就尝试将该值改成了512K:

...

TRANLOGOPTIONS BUFSIZE 1048576

TRANLOGOPTIONS DBLOGREADER, DBLOGREADERBUFSIZE 524288

--TRANLOGOPTIONS _NOREADAHEAD ANY

EXTTRAIL /ogg/ggate/data/dirext/lis/ea

...

然后重启extract进程,结果就正常了。估计使用DBLOGREADER还是有不少的bug。

0b1331709591d260c1c78e86d0c51c18.png

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值