问题背景
使用Flink Oracle CDC组件
<dependency>
<groupId>com.ververica</groupId>
<artifactId>flink-connector-oracle-cdc</artifactId>
<version>2.1.0</version>
</dependency>
程序在运行一段时间后出现如下报错
Caused by: java.sql.SQLRecoverableException: ORA-00603: ORACLE server session terminated by fatal error
ORA-27102: out of memory
Linux-x86_64 Error: 12: Cannot allocate memory
Additional information: 108
Additional information: 655360
ORA-04030: out of process memory when trying to allocate 8392728 bytes (krvxdups: priv,redo read buffer)
at oracle.jdbc.driver.T4CTTIoer11.processError(T4CTTIoer11.java:509)
at oracle.jdbc.driver.T4CTTIoer11.processError(T4CTTIoer11.java:461)
at oracle.jdbc.driver.T4C8Oall.processError(T4C8Oall.java:1104)
at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:550)
at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:268)
at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:655)
at oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:270)
at oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:91)
at oracle.jdbc.driver.T4CPreparedStatement.executeForRows(T4CPreparedStatement.java:970)
at oracle.jdbc.driver.OracleStatement.executeMaybeDescribe(OracleStatement.java:1012)
at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1168)
at oracle.jdbc.driver.OraclePreparedStatement.executeInternal(OraclePreparedStatement.java:3666)
at oracle.jdbc.driver.T4CPreparedStatement.executeInternal(T4CPreparedStatement.java:1426)
at oracle.jdbc.driver.OraclePreparedStatement.executeQuery(OraclePreparedStatement.java:3713)
at oracle.jdbc.driver.OraclePreparedStatementWrapper.executeQuery(OraclePreparedStatementWrapper.java:1167)
at io.debezium.connector.oracle.logminer.LogMinerStreamingChangeEventSource.execute(LogMinerStreamingChangeEventSource.java:184)
... 7 more
Caused by: Error : 603, Position : 0, Sql = SELECT SCN, SQL_REDO, OPERATION_CODE, TIMESTAMP, XID, CSF, TABLE_NAME, SEG_OWNER, OPERATION, USERNAME, ROW_ID, ROLLBACK FROM V$LOGMNR_CONTENTS WHERE SCN > :1 AND SCN <= :2 AND ((OPERATION_CODE IN (5,34) AND USERNAME NOT IN ('SYS','SYSTEM','SYSTEM')) OR (OPERATION_CODE IN (7,36)) OR (OPERATION_CODE IN (1,2,3) AND TABLE_NAME != 'LOG_MINING_FLUSH' AND SEG_OWNER NOT IN ('APPQOSSYS','AUDSYS','CTXSYS','DVSYS','DBSFWUSER','DBSNMP','GSMADMIN_INTERNAL','LBACSYS','MDSYS','OJVMSYS','OLAPSYS','ORDDATA','ORDSYS','OUTLN','SYS','SYSTEM','WMSYS','XDB') AND (REGEXP_LIKE(SEG_OWNER,'^SCOTT$','i')) AND (REGEXP_LIKE(SEG_OWNER || '.' || TABLE_NAME,'^SCOTT.aaa$','i')) )), OriginalSql = SELECT SCN, SQL_REDO, OPERATION_CODE, TIMESTAMP, XID, CSF, TABLE_NAME, SEG_OWNER, OPERATION, USERNAME, ROW_ID, ROLLBACK FROM V$LOGMNR_CONTENTS WHERE SCN > ? AND SCN <= ? AND ((OPERATION_CODE IN (5,34) AND USERNAME NOT IN ('SYS','SYSTEM','SYSTEM')) OR (OPERATION_CODE IN (7,36)) OR (OPERATION_CODE IN (1,2,3) AND TABLE_NAME != 'LOG_MINING_FLUSH' AND SEG_OWNER NOT IN ('APPQOSSYS','AUDSYS','CTXSYS','DVSYS','DBSFWUSER','DBSNMP','GSMADMIN_INTERNAL','LBACSYS','MDSYS','OJVMSYS','OLAPSYS','ORDDATA','ORDSYS','OUTLN','SYS','SYSTEM','WMSYS','XDB') AND (REGEXP_LIKE(SEG_OWNER,'^SCOTT$','i')) AND (REGEXP_LIKE(SEG_OWNER || '.' || TABLE_NAME,'^SCOTT.aaa$','i')) )), Error Msg = ORA-00603: ORACLE server session terminated by fatal error
ORA-27102: out of memory
Linux-x86_64 Error: 12: Cannot allocate memory
Additional information: 108
Additional information: 655360
ORA-04030: out of process memory when trying to allocate 8392728 bytes (krvxdups: priv,redo read buffer)
at oracle.jdbc.driver.T4CTTIoer11.processError(T4CTTIoer11.java:513)
... 22 more
问题分析
登录到oracle,使用如下语句
select name,total,round(total-free,2) used, round(free,2) free,round((total-free)/total*100,2) pctused from
(select 'SGA' name,(select sum(value/1024/1024) from v$sga) total,
(select sum(bytes/1024/1024) from v$sgastat where name='free memory')free from dual)
union
select name,total,round(used,2)used,round(total-used,2)free,round(used/total*100,2)pctused from (
select 'PGA' name,(select value/1024/1024 total from v$pgastat where name='aggregate PGA target parameter')total,
(select value/1024/1024 used from v$pgastat where name='total PGA allocated')used from dual);
查看oracle PGA和SGA的内存如下
当程序在运行时,上面USED会一直增加。
查看源码,LogMinerStreamingChangeEventSource
类的execute
方法里存在如下逻辑
每次循环都会执行startLogMining
逻辑,该方法会开启logmnr,语句如下
BEGIN
sys.dbms_logmnr.start_logmnr ( startScn => '1749194', endScn => '1749199', OPTIONS => DBMS_LOGMNR.DICT_FROM_ONLINE_CATALOG + DBMS_LOGMNR.CONTINUOUS_MINE + DBMS_LOGMNR.NO_ROWID_IN_STMT );
END;
每次执行时,通过上述语句观察到的PGA内存使用都会增加。
而endMining
方法是结束logmnr并释放内存,语句如下
BEGIN
SYS.DBMS_LOGMNR.END_LOGMNR ();
END;
但是程序一直没有进入if条件,所以一直没有进入endMining
方法从而释放内存。
进一步分析
if条件逻辑里做的事情,是判断当前程序里的currentRedoLogSequences
和通过如下语句查询的是否一致
SELECT * FROM V$LOG WHERE STATUS = 'CURRENT'
;
如果在线日志没有归档,那么返回false;如果日志归档,则返回true。
什么情况下日志归档?
文件如果写满了就会归档,参考oracle11G归档日志管理。
综上,为什么会报开头的错误?因为是测试化境,程序运行时表数据一直没有变更,导致日志不会归档。
所以一直无法进入if条件从而释放内存。