刚刚观察测试库gg运行状况发现,pump进程异常abend,err log报错日志如下:
2012-05-24 10:45:54
ERROR OGG-01031 Oracle GoldenGate Capture for Oracle, dpump.prm: There is a problem in network communication, a remote file problem, encryption keys for target and source do not match (if using ENCRYPT) or an unknown error. (Reply received is Unable to open file "/home/oracle/ggs/dirdat/k1005466" (error 11, Resource temporarily unavailable)).
2012-05-24 10:45:54 ERROR OGG-01668 Oracle GoldenGate Capture for Oracle, dpump.prm: PROCESS ABENDING.
2012-05-24 10:45:54 ERROR OGG-01668 Oracle GoldenGate Capture for Oracle, dpump.prm: PROCESS ABENDING.
该错误官方说明如下:
OGG-01031: There is a problem in network communication, a remote file problem,
encryption keys for target and source do not match (if using ENCRYPT) or an
unknown error. (Reply received is {0})
Cause: Extract was not able to send data to the target.
Action: Examine the network for an outage between the source and target system.
Make certain that MGRPORT in RMTHOST matches the one in the parameter file
of the remote Manager. Check for errors on the target system that indicate the
Manager or Collector process is not running, or that Collector does not have
privileges to write to the remote trail. Check for encryption errors if the ENCRYPT
option is specified with the RMTHOST or RMTHOSTOPTIONS parameter: This
might indicate that the encryption key that was sent from the source does not
match the one in the ENCKEYS file on the target, or that the key or ENCKEYS file
on the target is missing.
encryption keys for target and source do not match (if using ENCRYPT) or an
unknown error. (Reply received is {0})
Cause: Extract was not able to send data to the target.
Action: Examine the network for an outage between the source and target system.
Make certain that MGRPORT in RMTHOST matches the one in the parameter file
of the remote Manager. Check for errors on the target system that indicate the
Manager or Collector process is not running, or that Collector does not have
privileges to write to the remote trail. Check for encryption errors if the ENCRYPT
option is specified with the RMTHOST or RMTHOSTOPTIONS parameter: This
might indicate that the encryption key that was sent from the source does not
match the one in the ENCKEYS file on the target, or that the key or ENCKEYS file
on the target is missing.
貌似是网络原因造成的,排查源库和目标库的相关参数发现,都是一致的,也没什么问题,再到目标库查看相关进程都是running状态,但是出现了一个比较明显的告警信息,如下:
2012-05-24 10:57:50
WARNING OGG-01223 Oracle GoldenGate Collector for Oracle: Unable to lock file "/home/oracle/ggs/dirdat/k1005466" (error 11, Resource temporarily unavailable). Lock currently held by process id (PID) 1731.
2012-05-24 10:57:50 WARNING OGG-01223 Oracle GoldenGate Collector for Oracle: Unable to open file "/home/oracle/ggs/dirdat/k1005466" (error 11, Resource temporarily unavailable).
2012-05-24 10:57:50 INFO OGG-01676 Oracle GoldenGate Collector for Oracle: Terminating after client disconnect.
2012-05-24 10:59:22 INFO OGG-00963 Oracle GoldenGate Manager for Oracle, mgr.prm: Command received from EXTRACT on host (START SERVER CPU -1 PRI -1 TIMEOUT 300 PARAMS ).
2012-05-24 10:59:22 INFO OGG-01677 Oracle GoldenGate Collector for Oracle: Waiting for connection (started dynamically).
2012-05-24 10:59:22 INFO OGG-00963 Oracle GoldenGate Manager for Oracle, mgr.prm: Command received from SERVER on host localhost.localdomain (REPORT 10710 7810).
2012-05-24 10:59:22 INFO OGG-00974 Oracle GoldenGate Manager for Oracle, mgr.prm: Manager started collector process (Port 7810).
2012-05-24 10:59:22 INFO OGG-01228 Oracle GoldenGate Collector for Oracle: Timeout in 300 seconds.
2012-05-24 10:57:50 WARNING OGG-01223 Oracle GoldenGate Collector for Oracle: Unable to open file "/home/oracle/ggs/dirdat/k1005466" (error 11, Resource temporarily unavailable).
2012-05-24 10:57:50 INFO OGG-01676 Oracle GoldenGate Collector for Oracle: Terminating after client disconnect.
2012-05-24 10:59:22 INFO OGG-00963 Oracle GoldenGate Manager for Oracle, mgr.prm: Command received from EXTRACT on host (START SERVER CPU -1 PRI -1 TIMEOUT 300 PARAMS ).
2012-05-24 10:59:22 INFO OGG-01677 Oracle GoldenGate Collector for Oracle: Waiting for connection (started dynamically).
2012-05-24 10:59:22 INFO OGG-00963 Oracle GoldenGate Manager for Oracle, mgr.prm: Command received from SERVER on host localhost.localdomain (REPORT 10710 7810).
2012-05-24 10:59:22 INFO OGG-00974 Oracle GoldenGate Manager for Oracle, mgr.prm: Manager started collector process (Port 7810).
2012-05-24 10:59:22 INFO OGG-01228 Oracle GoldenGate Collector for Oracle: Timeout in 300 seconds.
仔细观察发现,源和目标库的相关操作都是对目标库的trail文件k1005466进行操作的,推测大致的过程是这样的:
启动pump进程后,该进程要往目标库的第5466号trail文件写数据,而此时目标库的该文件被进程编号为1731的进程所锁定,pump无法获得该文件上的锁,造成等待,延时,最终超时而abend掉(这种情况表面看来很像是由于网络原因造成的,其实不然),之后在目标库查看1731号进程的具体情况如下:
[oracle@trptdb ggs]$ ps -ef | grep 1731
oracle 1731 1 0 May23 ? 00:00:13 ./server -w 300 -p 7819-12818 -m 7809 -k -l /home/oracle/ggs/ggserr.log
oracle 10699 10451 0 10:58 pts/1 00:00:00 grep 1731
oracle 1731 1 0 May23 ? 00:00:13 ./server -w 300 -p 7819-12818 -m 7809 -k -l /home/oracle/ggs/ggserr.log
oracle 10699 10451 0 10:58 pts/1 00:00:00 grep 1731
发现是ogg的server进程锁住了第5466号文件,推测应该是读取5466号文件写入ggserr log所造成吧,具体有待查证,之后干脆直接kill掉该进程,然后再次启动源库的pump进程,2端开始正常应用:
源库ggserr.log:
2012-05-24 10:52:26 INFO OGG-00987 Oracle GoldenGate Command Interpreter for Oracle: GGSCI command (oracle): start dpump.
2012-05-24 10:52:26 INFO OGG-00963 Oracle GoldenGate Manager for Oracle, mgr.prm: Command received from GGSCI on host 127.0.0.1 (START EXTRACT DPUMP ).
2012-05-24 10:52:26 INFO OGG-00975 Oracle GoldenGate Manager for Oracle, mgr.prm: EXTRACT DPUMP starting.
2012-05-24 10:52:26 INFO OGG-00992 Oracle GoldenGate Capture for Oracle, dpump.prm: EXTRACT DPUMP starting.
2012-05-24 10:52:26 INFO OGG-00993 Oracle GoldenGate Capture for Oracle, dpump.prm: EXTRACT DPUMP started.
2012-05-24 10:52:31 INFO OGG-01226 Oracle GoldenGate Capture for Oracle, dpump.prm: Socket buffer size set to 27985 (flush size 27985).
2012-05-24 10:52:32 INFO OGG-01056 Oracle GoldenGate Capture for Oracle, dpump.prm: Recovery initialization completed for target file /home/oracle/ggs/dirdat/k1005466, at RBA 5777739, CSN 76400322526.
2012-05-24 10:52:32 INFO OGG-01478 Oracle GoldenGate Capture for Oracle, dpump.prm: Output file /home/oracle/ggs/dirdat/k1 is using format RELEASE 10.4/11.1.
2012-05-24 10:52:32 INFO OGG-01026 Oracle GoldenGate Capture for Oracle, dpump.prm: Rolling over remote file /home/oracle/ggs/dirdat/k1005467.
2012-05-24 10:52:32 INFO OGG-01054 Oracle GoldenGate Capture for Oracle, dpump.prm: Recovery completed for target file /home/oracle/ggs/dirdat/k1005467, at RBA 1393, CSN 76400322526.
2012-05-24 10:52:32 INFO OGG-01057 Oracle GoldenGate Capture for Oracle, dpump.prm: Recovery completed for all targets.
2012-05-24 10:52:33 WARNING OGG-00869 Oracle GoldenGate Capture for Oracle, dpump.prm: No unique key is defined for table M_GOODS_STOCK2. All viable columns will be used to represent the key, but may not guarantee uniqueness. KEYCOLS may be used to define the key.
2012-05-24 10:52:33 WARNING OGG-00869 Oracle GoldenGate Capture for Oracle, dpump.prm: No unique key is defined for table M_GOODS_STOCK. All viable columns will be used to represent the key, but may not guarantee uniqueness. KEYCOLS may be used to define the key.
2012-05-24 10:52:34 WARNING OGG-00869 Oracle GoldenGate Capture for Oracle, dpump.prm: No unique key is defined for table M_PAGE_CLICK. All viable columns will be used to represent the key, but may not guarantee uniqueness. KEYCOLS may be used to define the key.
2012-05-24 10:52:26 INFO OGG-00963 Oracle GoldenGate Manager for Oracle, mgr.prm: Command received from GGSCI on host 127.0.0.1 (START EXTRACT DPUMP ).
2012-05-24 10:52:26 INFO OGG-00975 Oracle GoldenGate Manager for Oracle, mgr.prm: EXTRACT DPUMP starting.
2012-05-24 10:52:26 INFO OGG-00992 Oracle GoldenGate Capture for Oracle, dpump.prm: EXTRACT DPUMP starting.
2012-05-24 10:52:26 INFO OGG-00993 Oracle GoldenGate Capture for Oracle, dpump.prm: EXTRACT DPUMP started.
2012-05-24 10:52:31 INFO OGG-01226 Oracle GoldenGate Capture for Oracle, dpump.prm: Socket buffer size set to 27985 (flush size 27985).
2012-05-24 10:52:32 INFO OGG-01056 Oracle GoldenGate Capture for Oracle, dpump.prm: Recovery initialization completed for target file /home/oracle/ggs/dirdat/k1005466, at RBA 5777739, CSN 76400322526.
2012-05-24 10:52:32 INFO OGG-01478 Oracle GoldenGate Capture for Oracle, dpump.prm: Output file /home/oracle/ggs/dirdat/k1 is using format RELEASE 10.4/11.1.
2012-05-24 10:52:32 INFO OGG-01026 Oracle GoldenGate Capture for Oracle, dpump.prm: Rolling over remote file /home/oracle/ggs/dirdat/k1005467.
2012-05-24 10:52:32 INFO OGG-01054 Oracle GoldenGate Capture for Oracle, dpump.prm: Recovery completed for target file /home/oracle/ggs/dirdat/k1005467, at RBA 1393, CSN 76400322526.
2012-05-24 10:52:32 INFO OGG-01057 Oracle GoldenGate Capture for Oracle, dpump.prm: Recovery completed for all targets.
2012-05-24 10:52:33 WARNING OGG-00869 Oracle GoldenGate Capture for Oracle, dpump.prm: No unique key is defined for table M_GOODS_STOCK2. All viable columns will be used to represent the key, but may not guarantee uniqueness. KEYCOLS may be used to define the key.
2012-05-24 10:52:33 WARNING OGG-00869 Oracle GoldenGate Capture for Oracle, dpump.prm: No unique key is defined for table M_GOODS_STOCK. All viable columns will be used to represent the key, but may not guarantee uniqueness. KEYCOLS may be used to define the key.
2012-05-24 10:52:34 WARNING OGG-00869 Oracle GoldenGate Capture for Oracle, dpump.prm: No unique key is defined for table M_PAGE_CLICK. All viable columns will be used to represent the key, but may not guarantee uniqueness. KEYCOLS may be used to define the key.
目标库ggserr.log:
2012-05-24 10:59:27 INFO OGG-01229 Oracle GoldenGate Collector for Oracle: Connected to :22983.
2012-05-24 10:59:27 INFO OGG-01669 Oracle GoldenGate Collector for Oracle: Opening /home/oracle/ggs/dirdat/k1005466 (byte -1, current EOF 5777739).
2012-05-24 10:59:27 INFO OGG-01670 Oracle GoldenGate Collector for Oracle: Closing /home/oracle/ggs/dirdat/k1005466.
2012-05-24 10:59:27 INFO OGG-01669 Oracle GoldenGate Collector for Oracle: Opening /home/oracle/ggs/dirdat/k1005466 (byte 5777739, current EOF 5777739).
2012-05-24 10:59:27 INFO OGG-01735 Oracle GoldenGate Collector for Oracle: Synchronizing /home/oracle/ggs/dirdat/k1005466 to disk.
2012-05-24 10:59:27 INFO OGG-01735 Oracle GoldenGate Collector for Oracle: Synchronizing /home/oracle/ggs/dirdat/k1005466 to disk.
2012-05-24 10:59:27 INFO OGG-01670 Oracle GoldenGate Collector for Oracle: Closing /home/oracle/ggs/dirdat/k1005466.
2012-05-24 10:59:27 INFO OGG-01669 Oracle GoldenGate Collector for Oracle: Opening /home/oracle/ggs/dirdat/k1005467 (byte -1, current EOF 0).
2012-05-24 10:59:27 INFO OGG-01020 Oracle GoldenGate Delivery for Oracle, rep1.prm: Processed extract process RESTART_ABEND record at seq 5467, rba 1333 (aborted 0 records).
2012-05-24 10:59:30 INFO OGG-01735 Oracle GoldenGate Collector for Oracle: Synchronizing /home/oracle/ggs/dirdat/k1005467 to disk.
2012-05-24 10:59:30 INFO OGG-01735 Oracle GoldenGate Collector for Oracle: Synchronizing /home/oracle/ggs/dirdat/k1005467 to disk.
2012-05-24 10:59:30 INFO OGG-01670 Oracle GoldenGate Collector for Oracle: Closing /home/oracle/ggs/dirdat/k1005467.
2012-05-24 10:59:30 INFO OGG-01669 Oracle GoldenGate Collector for Oracle: Opening /home/oracle/ggs/dirdat/k1005468 (byte -1, current EOF 0).
2012-05-24 10:59:31 INFO OGG-01735 Oracle GoldenGate Collector for Oracle: Synchronizing /home/oracle/ggs/dirdat/k1005468 to disk.
2012-05-24 10:59:31 INFO OGG-01735 Oracle GoldenGate Collector for Oracle: Synchronizing /home/oracle/ggs/dirdat/k1005468 to disk.
2012-05-24 10:59:31 INFO OGG-01670 Oracle GoldenGate Collector for Oracle: Closing /home/oracle/ggs/dirdat/k1005468.
2012-05-24 10:59:31 INFO OGG-01669 Oracle GoldenGate Collector for Oracle: Opening /home/oracle/ggs/dirdat/k1005469 (byte -1, current EOF 0).
2012-05-24 10:59:33 INFO OGG-01735 Oracle GoldenGate Collector for Oracle: Synchronizing /home/oracle/ggs/dirdat/k1005469 to disk.
2012-05-24 10:59:33 INFO OGG-01735 Oracle GoldenGate Collector for Oracle: Synchronizing /home/oracle/ggs/dirdat/k1005469 to disk.
2012-05-24 10:59:33 INFO OGG-01670 Oracle GoldenGate Collector for Oracle: Closing /home/oracle/ggs/dirdat/k1005469.
2012-05-24 10:59:33 INFO OGG-01669 Oracle GoldenGate Collector for Oracle: Opening /home/oracle/ggs/dirdat/k1005470 (byte -1, current EOF 0).
2012-05-24 10:59:41 WARNING OGG-00869 Oracle GoldenGate Delivery for Oracle, rep1.prm: No unique key is defined for table 'M_GOODS_STOCK2'. All viable columns will be used to represent the key, but may not guarantee uniqueness. KEYCOLS may be used to define the key.
2012-05-24 10:59:27 INFO OGG-01669 Oracle GoldenGate Collector for Oracle: Opening /home/oracle/ggs/dirdat/k1005466 (byte -1, current EOF 5777739).
2012-05-24 10:59:27 INFO OGG-01670 Oracle GoldenGate Collector for Oracle: Closing /home/oracle/ggs/dirdat/k1005466.
2012-05-24 10:59:27 INFO OGG-01669 Oracle GoldenGate Collector for Oracle: Opening /home/oracle/ggs/dirdat/k1005466 (byte 5777739, current EOF 5777739).
2012-05-24 10:59:27 INFO OGG-01735 Oracle GoldenGate Collector for Oracle: Synchronizing /home/oracle/ggs/dirdat/k1005466 to disk.
2012-05-24 10:59:27 INFO OGG-01735 Oracle GoldenGate Collector for Oracle: Synchronizing /home/oracle/ggs/dirdat/k1005466 to disk.
2012-05-24 10:59:27 INFO OGG-01670 Oracle GoldenGate Collector for Oracle: Closing /home/oracle/ggs/dirdat/k1005466.
2012-05-24 10:59:27 INFO OGG-01669 Oracle GoldenGate Collector for Oracle: Opening /home/oracle/ggs/dirdat/k1005467 (byte -1, current EOF 0).
2012-05-24 10:59:27 INFO OGG-01020 Oracle GoldenGate Delivery for Oracle, rep1.prm: Processed extract process RESTART_ABEND record at seq 5467, rba 1333 (aborted 0 records).
2012-05-24 10:59:30 INFO OGG-01735 Oracle GoldenGate Collector for Oracle: Synchronizing /home/oracle/ggs/dirdat/k1005467 to disk.
2012-05-24 10:59:30 INFO OGG-01735 Oracle GoldenGate Collector for Oracle: Synchronizing /home/oracle/ggs/dirdat/k1005467 to disk.
2012-05-24 10:59:30 INFO OGG-01670 Oracle GoldenGate Collector for Oracle: Closing /home/oracle/ggs/dirdat/k1005467.
2012-05-24 10:59:30 INFO OGG-01669 Oracle GoldenGate Collector for Oracle: Opening /home/oracle/ggs/dirdat/k1005468 (byte -1, current EOF 0).
2012-05-24 10:59:31 INFO OGG-01735 Oracle GoldenGate Collector for Oracle: Synchronizing /home/oracle/ggs/dirdat/k1005468 to disk.
2012-05-24 10:59:31 INFO OGG-01735 Oracle GoldenGate Collector for Oracle: Synchronizing /home/oracle/ggs/dirdat/k1005468 to disk.
2012-05-24 10:59:31 INFO OGG-01670 Oracle GoldenGate Collector for Oracle: Closing /home/oracle/ggs/dirdat/k1005468.
2012-05-24 10:59:31 INFO OGG-01669 Oracle GoldenGate Collector for Oracle: Opening /home/oracle/ggs/dirdat/k1005469 (byte -1, current EOF 0).
2012-05-24 10:59:33 INFO OGG-01735 Oracle GoldenGate Collector for Oracle: Synchronizing /home/oracle/ggs/dirdat/k1005469 to disk.
2012-05-24 10:59:33 INFO OGG-01735 Oracle GoldenGate Collector for Oracle: Synchronizing /home/oracle/ggs/dirdat/k1005469 to disk.
2012-05-24 10:59:33 INFO OGG-01670 Oracle GoldenGate Collector for Oracle: Closing /home/oracle/ggs/dirdat/k1005469.
2012-05-24 10:59:33 INFO OGG-01669 Oracle GoldenGate Collector for Oracle: Opening /home/oracle/ggs/dirdat/k1005470 (byte -1, current EOF 0).
2012-05-24 10:59:41 WARNING OGG-00869 Oracle GoldenGate Delivery for Oracle, rep1.prm: No unique key is defined for table 'M_GOODS_STOCK2'. All viable columns will be used to represent the key, but may not guarantee uniqueness. KEYCOLS may be used to define the key.
由此看来遇到问题的时候,errlog里面会提供大量有用的排错信息,这一点也需要细心和耐心,同时可以查阅官方对错误和警告的详细说明,对解决问题很有帮助。
记录一下~~
来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/25618347/viewspace-730797/,如需转载,请注明出处,否则将追究法律责任。
转载于:http://blog.itpub.net/25618347/viewspace-730797/