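Background
Flink CDC is a powerful and practical tool for real-time data synchronization.
In practice it still has some shortcomings. For example, when synchronizing Oracle tables that have neither a primary key nor a unique key, idempotence on the target side cannot be guaranteed.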
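To make the idempotence problem concrete, here is a minimal sketch in plain Java, not Flink code. It assumes at-least-once delivery, simulated by delivering the same batch twice, and compares a keyless append-only target with a keyed upsert target. The class, the `ChangeEvent` type and the sample values are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ReplayIdempotenceSketch {

    /** Hypothetical change event: a row key plus the row payload. */
    static final class ChangeEvent {
        final String key;
        final String payload;

        ChangeEvent(String key, String payload) {
            this.key = key;
            this.payload = payload;
        }
    }

    /** Keyless target: every event is appended, so a replay adds duplicates. */
    static List<String> applyAppendOnly(List<ChangeEvent> events) {
        List<String> table = new ArrayList<>();
        for (ChangeEvent e : events) {
            table.add(e.payload);
        }
        return table;
    }

    /** Keyed target: events are upserted by key, so a replay overwrites in place. */
    static Map<String, String> applyUpsert(List<ChangeEvent> events) {
        Map<String, String> table = new LinkedHashMap<>();
        for (ChangeEvent e : events) {
            table.put(e.key, e.payload);
        }
        return table;
    }

    /** Simulates at-least-once delivery by delivering the whole batch twice. */
    static List<ChangeEvent> replayTwice(List<ChangeEvent> events) {
        List<ChangeEvent> out = new ArrayList<>(events);
        out.addAll(events);
        return out;
    }

    public static void main(String[] args) {
        List<ChangeEvent> batch = Arrays.asList(
                new ChangeEvent("k1", "alice"),
                new ChangeEvent("k2", "bob"));

        // Keyless target: 2 source rows become 4 target rows after a replay.
        System.out.println(applyAppendOnly(replayTwice(batch)).size()); // prints 4
        // Keyed target: the replay leaves exactly 2 rows.
        System.out.println(applyUpsert(replayTwice(batch)).size());     // prints 2
    }
}
```

Without some stable key per row, the target has no way to tell a replayed event from a genuinely new row.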
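Solution
Every Oracle table has a pseudocolumn, ROWID, but the data that CDC synchronizes does not include it.
The fix is to modify the source code so that each change event carries its ROWID into the Flink program, and to set ROWID as the primary key when creating the target table. The source changes are listed class by class.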
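On the Flink side, the ROWID has to be copied out of the event's source block into the row written to the target. The sketch below models a change event with plain Java maps, which is an assumption made to keep it self-contained; a real deserializer would read the Kafka Connect Struct instead. The field name `px_rowid` matches the `ROW_ID` constant added to `SourceInfo`, while the target column name `ROW_ID`, the class name and the sample ROWID value are hypothetical. Since the field is declared with an optional schema, the sketch treats it as possibly absent.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

public class RowIdKeySketch {

    // Same name as SourceInfo.ROW_ID in the patched connector.
    static final String ROW_ID_FIELD = "px_rowid";

    /** Reads px_rowid from the event's "source" block, if it is present. */
    static Optional<String> extractRowId(Map<String, Object> event) {
        Object source = event.get("source");
        if (!(source instanceof Map)) {
            return Optional.empty();
        }
        Object rowId = ((Map<?, ?>) source).get(ROW_ID_FIELD);
        return rowId instanceof String ? Optional.of((String) rowId) : Optional.empty();
    }

    /** Builds the target row: the ROWID key column first, then the "after" columns. */
    static Map<String, Object> toTargetRow(Map<String, Object> event) {
        String rowId = extractRowId(event).orElseThrow(
                () -> new IllegalStateException("change event carries no " + ROW_ID_FIELD));
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("ROW_ID", rowId); // hypothetical primary-key column on the target
        Object after = event.get("after");
        if (after instanceof Map) {
            for (Map.Entry<?, ?> e : ((Map<?, ?>) after).entrySet()) {
                row.put(String.valueOf(e.getKey()), e.getValue());
            }
        }
        return row;
    }

    public static void main(String[] args) {
        Map<String, Object> source = new LinkedHashMap<>();
        source.put(ROW_ID_FIELD, "AAAR3sAAEAAAACXAAA"); // made-up sample ROWID

        Map<String, Object> after = new LinkedHashMap<>();
        after.put("NAME", "alice");

        Map<String, Object> event = new LinkedHashMap<>();
        event.put("source", source);
        event.put("after", after);

        System.out.println(toTargetRow(event));
    }
}
```

Failing loudly when the field is missing is a design choice for this sketch: a row written without its key would silently reintroduce the duplication problem.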
Class debezium-1.9.7.Final\debezium-connector-oracle\src\main\java\io\debezium\connector\oracle\SourceInfo.java
// Add the following members
public static final String ROW_ID = "px_rowid";
private String rowid;
public String getRowId() {
    return rowid;
}
public void setRowid(String rowId) {
    this.rowid = rowId;
}
Class debezium-1.9.7.Final\debezium-connector-oracle\src\main\java\io\debezium\connector\oracle\OracleSourceInfoStructMaker.java
```java
public OracleSourceInfoStructMaker(String connector, String version, CommonConnectorConfig connectorConfig) {
    super(connector, version, connectorConfig);
    final SchemaBuilder schemaBuilder = commonSchemaBuilder()
            .name("io.debezium.connector.oracle.Source")
            .field(SourceInfo.SCHEMA_NAME_KEY, Schema.STRING_SCHEMA)
            .field(SourceInfo.TABLE_NAME_KEY, Schema.STRING_SCHEMA)
            .field(SourceInfo.TXID_KEY, Schema.OPTIONAL_STRING_SCHEMA)
            // added: declare the ROWID field in the source schema
            .field(SourceInfo.ROW_ID, Schema.OPTIONAL_STRING_SCHEMA)
            .field(SourceInfo.EVENT_SCN_KEY, Schema.OPTIONAL_STRING_SCHEMA)
            .field(SourceInfo.COMMIT_SCN_KEY, Schema.OPTIONAL_STRING_SCHEMA)
            .field(SourceInfo.LCR_POSITION_KEY, Schema.OPTIONAL_STRING_SCHEMA)
            .field(CommitScn.ROLLBACK_SEGMENT_ID_KEY, Schema.OPTIONAL_STRING_SCHEMA)
            .field(CommitScn.SQL_SEQUENCE_NUMBER_KEY, Schema.OPTIONAL_INT32_SCHEMA);
    this.schema = CommitScn.schemaBuilder(schemaBuilder).build();
}

public Struct struct(SourceInfo sourceInfo) {
    final String eventScn = sourceInfo.getEventScn() == null ? null : sourceInfo.getEventScn().toString();
    final Struct ret = super.commonStruct(sourceInfo)
            .put(SourceInfo.SCHEMA_NAME_KEY, sourceInfo.tableSchema())
            .put(SourceInfo.TABLE_NAME_KEY, sourceInfo.table())
            .put(SourceInfo.TXID_KEY, sourceInfo.getTransactionId())
            // added: copy the ROWID into the source struct
            .put(SourceInfo.ROW_ID, sourceInfo.getRowId())
            .put(SourceInfo.EVENT_SCN_KEY, eventScn);
    if (sourceInfo.getLcrPosition() != null) {
        ret.put(SourceInfo.LCR_POSITION_KEY, sourceInfo.getLcrPosition());
    }
    if (sourceInfo.getRsId() != null) {
        ret.put(CommitScn.ROLLBACK_SEGMENT_ID_KEY, sourceInfo.getRsId());
    }
    ret.put(CommitScn.SQL_SEQUENCE_NUMBER_KEY, sourceInfo.getSsn());
    final CommitScn commitScn = sourceInfo.getCommitScn();
    if (commitScn != null) {
        commitScn.store(sourceInfo, ret);
    }
    return ret;
}
```
Class debezium-1.9.7.Final\debezium-connector-oracle\src\main\java\io\debezium\connector\oracle\OracleOffsetContext.java
// Add the following method. Take care not to put it inside the inner Builder class.
public void setRowId(String rowId) {
    sourceInfo.setRowid(rowId);
}
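After making the changes above, remember to install the modified Debezium dependency (for example with mvn install); otherwise the CDC code changes in the next step will report errors.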
Class flink-cdc-release-3.0.1\flink-cdc-connect\flink-cdc-source-connectors\flink-connector-oracle-cdc\src\main\java\io\debezium\connector\oracle\logminer\processor\AbstractLogMinerEventProcessor.java
protected void handleCommit(OraclePartition partition, LogMinerEventRow row)
        throws InterruptedException {
    ...
    @Override
    public void accept(LogMinerEvent event, long eventsProcessed)
            throws InterruptedException {
        if (smallestScn.isNull() || commitScn.compareTo(smallestScn) < 0) {
            offsetContext.setScn(event.getScn());
            metrics.setOldestScn(event.getScn());
        }
        offsetContext.setEventScn(event.getScn());
        offsetContext.setTransactionId(transactionId);
        // Add the following statement
        offsetContext.setRowId(event.getRowId());
        offsetContext.setSourceTime(
                event.getChangeTime()
                        .minusSeconds(databaseOffset.getTotalSeconds()));
        offsetContext.setTableId(event.getTableId());
        offsetContext.setRedoThread(row.getThread());
        if (eventsProcessed == numEvents) {
            // reached the last event update the commit scn in the offsets
            offsetContext.getCommitScn().recordCommit(row);
        }
    ...
}