OGG-01164 Column Index (0) out of Sequence 处理

一套ogg rep端 报错OGG-01164 Column Index (0) out of Sequence XXX

开始以为是def文件有问题,比较抽取端和复制端的def中的改表,发现是完全一致的。

discard文件也没有异常报错。

查询mos发现文章描述很详细,研究一下。

Replicat Abends with OGG-01164 Column Index (0) out of Sequence (文档 ID 1393003.1)

Oracle GoldenGate - Version 11.1.1.1.0 and later
Information in this document applies to any platform.

SYMPTOMS

  • Oracle GoldenGate (OGG) replicat abends with an error similar to following:
ERROR OGG-01164 Column index (0) out of sequence for table QBORPT.APPGLOBALCCUPDATERINFO_1, last column index = 4.
ERROR OGG-01668 PROCESS ABENDING.
  • Extract pump is reading source trails from a NFS location. 



CAUSE

Trail record has not been written correctly by the extract pump, and it has been zero filled each of the records to the end with binary zeros. So when the replicat reads the record it can't correctly compose it to apply it to the target

Here is an example of the target trail file in error using logdump utility: 


Logdump 12 >next
___________________________________________________________________
Hdr-Ind : E (x45) Partition : . (x04)
UndoFlag : . (x00) BeforeAfter: A (x41)
RecLength : 213 (x00d5) IO Time : 2011/12/26 20:42:02.044.337
IOType : 5 (x05) OrigNode : 255 (xff)
TransInd : . (x03) FormatType : R (x52)
SyskeyLen : 0 (x00) Incomplete : . (x00)
AuditRBA : 307  AuditPos : 185517604
Continued : N (x00) RecCount : 1 (x01)

2011/12/26 20:42:02.044.337 Insert Len 213 RBA 38639386
Name:  QBORPT.APPGLOBALCCUPDATERINFO_1
After Image: Partition 4 G s
0000 000d 0000 0009 3331 3732 3638 3136 3500 0100 | ........317268165...
1400 0000 1039 3934 3437 3537 3736 3238 3733 3431 | .....994475776287341
3900 0200 1500 0032 3031 352d 3131 2d33 303a 3030 | 9......2015-11-30:00
3a30 303a 3030 0003 0015 ffff 3139 3030 2d30 312d | :00:00......1900-01-
3031 3a30 303a 3030 3a30 3000 0400 0600 0000 0256 | 01:00:00:00........V
4900  0000 0000 0000 0000 0000 0000 0000 0000 0000 | I...................
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 | ....................
Column 0 (x0000), Len 13 (x000d)
0000 0009 3331 3732 3638 3136 35 | ....317268165
Column 1 (x0001), Len 20 (x0014)
0000 0010 3939 3434 3735 3737 3632 3837 3334 3139 | ....9944757762873419
Column 2 (x0002), Len 21 (x0015)
0000 3230 3135 2d31 312d 3330 3a30 303a 3030 3a30 | ..2015-11-30:00:00:0
30 | 0
Column 3 (x0003), Len 21 (x0015)
ffff 3139 3030 2d30 312d 3031 3a30 303a 3030 3a30 | ..1900-01-01:00:00:0
30 | 0
Column 4 (x0004), Len 6 (x0006)
0000 0002 5649 | ....VI
Record Length 0 (x0000).



In this case the source trail that the Extract pump was reading was on NFS, and it is suspected that there was a buffer issue and the correct bytes were not read from the source.

The source trail looks like this in logdump has more than 5 columns :

Hdr-Ind : E (x45) Partition : . (x04)
UndoFlag : . (x00) BeforeAfter: A (x41)
RecLength : 213 (x00d5) IO Time : 2011/12/26 20:42:02.000.000
IOType : 5 (x05) OrigNode : 255 (xff)
TransInd : . (x03) FormatType : R (x52)
SyskeyLen : 0 (x00) Incomplete : . (x00)
AuditRBA : 307  AuditPos : 185517604
Continued : N (x00) RecCount : 1 (x01)

2011/12/26 20:42:02.000.000 Insert Len 213 RBA 296906523
Name:  QBORPT.APPGLOBALCCUPDATERINFO_1
After Image: Partition 4 G s
0000 000d 0000 0009 3331 3732 3638 3136 3500 0100 | ........317268165...
1400 0000 1039 3934 3437 3537 3736 3238 3733 3431 | .....994475776287341
3900 0200 1500 0032 3031 352d 3131 2d33 303a 3030 | 9......2015-11-30:00
3a30 303a 3030 0003 0015 ffff 3139 3030 2d30 312d | :00:00......1900-01-
3031 3a30 303a 3030 3a30 3000 0400 0600 0000 0256 | 01:00:00:00........V
4900 0500 15ff ff31 3930 302d 3031 2d30 313a 3030 | I......1900-01-01:00
3a30 303a 3030 0006 0008 0000 0004 6e75 6c6c 0007 | :00:00........null..
Column 0 (x0000), Len 13 (x000d)
0000 0009 3331 3732 3638 3136 35 | ....317268165
Column 1 (x0001), Len 20 (x0014)
0000 0010 3939 3434 3735 3737 3632 3837 3334 3139 | ....9944757762873419
Column 2 (x0002), Len 21 (x0015)
0000 3230 3135 2d31 312d 3330 3a30 303a 3030 3a30 | ..2015-11-30:00:00:0
30 | 0
Column 3 (x0003), Len 21 (x0015)
ffff 3139 3030 2d30 312d 3031 3a30 303a 3030 3a30 | ..1900-01-01:00:00:0
30 | 0
Column 4 (x0004), Len 6 (x0006)
0000 0002 5649 | ....VI
Column 5 (x0005), Len 21 (x0015)
ffff 3139 3030 2d30 312d 3031 3a30 303a 3030 3a30 | ..1900-01-01:00:00:0
30 | 0
Column 6 (x0006), Len 8 (x0008)
0000 0004 6e75 6c6c | ....null
Column 7 (x0007), Len 21 (x0015)
ffff 3139 3030 2d30 312d 3031 3a30 303a 3030 3a30 | ..1900-01-01:00:00:0
30 | 0
Column 8 (x0008), Len 21 (x0015)
ffff 3139 3030 2d30 312d 3031 3a30 303a 3030 3a30 | ..1900-01-01:00:00:0
30 | 0
Column 9 (x0009), Len 21 (x0015)
ffff 3139 3030 2d30 312d 3031 3a30 303a 3030 3a30 | ..1900-01-01:00:00:0
30 | 0

SOLUTION

Overview of the solution :  
You must identify the location of the record in the target trail, and then match that to the source trail that is complete and then resend that complete trail record via the pump, and then alter the replicat to apply the new complete record. 

1. Add GROUPTRANSOPS 1 MAXTRANSOPS 1 to the replicat,
2. Restart the replicat (so we are now appling all the data up to the bad record). It will abend again.
3. Remove GROUPTRANSOPS 1 MAXTRANSOPS 1 from the replicat 
4. Review your replicat  report file for the trail sequence number and rba.  You will be using this to locate the record you had the failure on.

5. Use logdump to find the AUDITPOS in the target trail record where the error occurred , 
execute logdump utility 

./logdump
logdump> ghdr on 
logdump> detail data
logdump> Open <trailfilename> 
logdump> pos <rba of the trail record with the issue> 
logdump> next 

note the AUDITRBA and AUDITPOS in the header, and the commit time stamp for the record

6. Use logdump on the source trail to locate the same  AUDITPOS and check the AUDITRBA .  You will see the record is complete without zero's that you saw on the Target trail(check the example in the cause area that has been bolded)  , to determine which trail you need you need to open use the  time stamps file creation against the commit time stamp from the target trail

execute logdump 
./logdump
logdump>ghdr on
logdump>detail data
logdump> Open <sourcetrail name> 
logdump> filter include filename <owner.tablename from the target record> 
logdump> filter include auditrba <the AUDITPOS value from the target record> 
logdump> filter match all
logdump> next
logdump> filter clear

logdump> n


7. Alter the Pump extract to reposition it to start sending the data from Source trail record that we just found the matching AUDITRBA and AUDITPOS  values in the record in step 5 , note the output log sequence number of the Extract pump as this is the seqno + 1 is the sequence number you want to alter your replicat to.

alter replicat <name> extseqno <99999> extrba <99999> 

8. Alter the replicat to apply data from the start of the output log sequence + 1 (then next trail)  that your pump has just sent(alter replicat xxx  extseqno <current trail written by the pump>, extrba 0 
This will be skip over the trails that had already been pumped, as you are resending the data from the point where your replicat had the error forward.  So you don't need the previously pumped trails. 

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

官网虽然很详细,但我的AUDITRBA所指定的AUDITPOS 都没有显示,因为我的logdump显示的是一个insert的after image

不过不要紧,大体理解官方文档的意思。

目标端alter repxxx begin now (之前是append,要记录之前到现在的一个大概时间,比如停了8小时)

源端抽取 stop extxxx 

                 alter extxxx etrollover     (重新换一个文件,有可能不需要直接略过这一步)

                 alter extxxx begin XXXX(8小时之前的时间,从故障时间之前开始抽取)

源端投递stop dpexxx

                alter dpexxx    extseqno nn extrba nnn   (指定投递新产生的抽取文件,若源端第二步省略,这步也可省略)

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值