1: Today's topic is how Flink CDC reads Oracle change data (CDC)
The first thing to look at is the main code Flink CDC uses to read from Oracle:
// Package names as used by Flink CDC 2.x
import com.ververica.cdc.connectors.oracle.OracleSource;
import com.ververica.cdc.debezium.JsonDebeziumDeserializationSchema;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.source.SourceFunction;

public class OracleExample {
    public static void main(String[] args) throws Exception {
        SourceFunction<String> sourceFunction = OracleSource.<String>builder()
                //.url("jdbc:oracle:thin:@{hostname}:{port}:{database}")
                .hostname("162.14.97.42")
                .port(1521)
                .database("helowin") // monitor the helowin database
                .schemaList("HR") // monitor the HR schema
                .tableList("HR.EMPLOYEES") // monitor the HR.EMPLOYEES table
                .username("system")
                .password("system")
                .deserializer(new JsonDebeziumDeserializationSchema()) // converts SourceRecord to JSON String
                .build();
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env
                .addSource(sourceFunction)
                .print().setParallelism(1); // use parallelism 1 for sink to keep message ordering
        env.execute();
    }
}
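As you can see, the key class is OracleSource. Stepping into it, you find mostly configuration properties, plus the properties used to configure a DebeziumSourceFunction. Its last method, build(), leads into the DebeziumSourceFunction class. This class extends Flink's RichSourceFunction and therefore implements the SourceFunction interface. Its main method is run, and the code is as follows: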
public void run(SourceContext<T> sourceContext) throws Exception {
    properties.setProperty("name", "engine");
    properties.setProperty("offset.storage", FlinkOffsetBackingStore.class.getCanonicalName());
    if (restoredOffsetState != null) {
        // restored from state
        properties.setProperty(FlinkOffsetBackingStore.OFFSET_STATE_VALUE, restoredOffsetState);
    }
    // DO NOT include schema change, e.g. DDL
    properties.setProperty("include.schema.changes", "false");
    // disable the offset flush totally
    properties.setProperty("offset.flush.interval.ms", String.valueOf(Long.MAX_VALUE));
    // disable tombstones
    properties.setProperty("tombstones.on.delete", "false");
    if (engineInstanceName == null) {
        // not restore from recovery
        engineInstanceName = UUID.randomUUID().toString();
    }
    // history instance name to initialize FlinkDatabaseHistory
    properties.setProperty(
            FlinkDatabaseHistory.DATABASE_HISTORY_INSTANCE_NAME, engineInstanceName);
    // we have to use a persisted DatabaseHistory implementation, otherwise, recovery can't
    // continue to read binlog
    // see
    // https://stackoverflow.com/questions/57147584/debezium-error-schema-isnt-know-to-this-connector
    // and https://debezium.io/blog/2018/03/16/note-on-database-history-topic-configuration/
    properties.setProperty("database.history", determineDatabase().getCanonicalName());
    // we have to filter out the heartbeat events, otherwise the deserializer will fail
    String dbzHeartbeatPrefix =
            properties.getProperty(
                    Heartbeat.HEARTBEAT_TOPICS_PREFIX.name(),
                    Heartbeat.HEARTBEAT_TOPICS_PREFIX.defaultValueAsString());
    // plays the consumer role
    this.debeziumChangeFetcher =
            new DebeziumChangeFetcher<>(
                    sourceContext,
                    deserializer,
                    restoredOffsetState == null, // DB snapshot phase if restore state is null
                    dbzHeartbeatPrefix,
                    handover);
    // plays the producer role
    this.engine =
            DebeziumEngine.create(Connect.class)
                    .using(properties)
                    .notifying(changeConsumer)
                    .using(OffsetCommitPolicy.always())
                    .using(
                            (success, message, error) -> {
                                if (success) {
                                    // Close the handover and prepare to exit.
                                    handover.close();
                                } else {
                                    handover.reportError(error);
                                }
                            })
                    .build();
    // run the engine asynchronously
    executor.execute(engine);
    debeziumStarted = true;
    // initialize metrics
    // make RuntimeContext#getMetricGroup compatible between Flink 1.13 and Flink 1.14
    final Method getMetricGroupMethod =
            getRuntimeContext().getClass().getMethod("getMetricGroup");
    getMetricGroupMethod.setAccessible(true);
    final MetricGroup metricGroup =
            (MetricGroup) getMetricGroupMethod.invoke(getRuntimeContext());
    metricGroup.gauge(
            "currentFetchEventTimeLag",
            (Gauge<Long>) () -> debeziumChangeFetcher.getFetchDelay());
    metricGroup.gauge(
            "currentEmitEventTimeLag",
            (Gauge<Long>) () -> debeziumChangeFetcher.getEmitDelay());
    metricGroup.gauge(
            "sourceIdleTime", (Gauge<Long>) () -> debeziumChangeFetcher.getIdleTime());
    // start the real debezium consumer
    debeziumChangeFetcher.runFetchLoop();
}
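This method has two key fields, engine and debeziumChangeFetcher. Together they read the Oracle CDC data using the producer-consumer pattern: engine plays the producer and debeziumChangeFetcher plays the consumer. The producer reads the data and puts it into a Handover object, and the consumer then reads it from there.
Stepping into the Handover class, there are two important methods, shown below.
produce: hands a batch of data over to the consumer (in Java threading terms, it wakes up the consumer thread)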
public void produce(final List<ChangeEvent<SourceRecord, SourceRecord>> element)
        throws InterruptedException {
    checkNotNull(element);
    synchronized (lock) {
        while (next != null && !wakeupProducer) {
            lock.wait();
        }
        wakeupProducer = false;
        // an error marks this as closed for the producer
        if (error != null) {
            ExceptionUtils.rethrow(error, error.getMessage());
        } else {
            // if there is no error, then this is open and can accept this element
            next = element;
            lock.notifyAll();
        }
    }
}
pollNext: consumes a batch of data and, once it has been taken, wakes up the producer so it can continue producing
public List<ChangeEvent<SourceRecord, SourceRecord>> pollNext() throws Exception {
    synchronized (lock) {
        while (next == null && error == null) {
            lock.wait();
        }
        List<ChangeEvent<SourceRecord, SourceRecord>> n = next;
        if (n != null) {
            next = null;
            lock.notifyAll();
            return n;
        } else {
            ExceptionUtils.rethrowException(error, error.getMessage());
            // this statement cannot be reached since the above method always throws an
            // exception this is only here to silence the compiler and any warnings
            return Collections.emptyList();
        }
    }
}
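To make the wait/notify handshake between the two methods easier to follow, here is a minimal, self-contained sketch of a single-slot handover. It is not the Flink CDC class: SimpleHandover, runDemo and the "event-N" strings are hypothetical names chosen for illustration, and error handling and the wakeupProducer flag are left out on purpose.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class HandoverSketch {

    /** Hypothetical single-slot buffer: at most one batch is in flight at a time. */
    static final class SimpleHandover<T> {
        private final Object lock = new Object();
        private List<T> next; // null means the slot is empty

        /** Called by the producer thread; blocks while the previous batch is still in the slot. */
        void produce(List<T> batch) throws InterruptedException {
            synchronized (lock) {
                while (next != null) {
                    lock.wait();
                }
                next = batch;
                lock.notifyAll(); // wake up the consumer
            }
        }

        /** Called by the consumer thread; blocks while the slot is empty. */
        List<T> pollNext() throws InterruptedException {
            synchronized (lock) {
                while (next == null) {
                    lock.wait();
                }
                List<T> batch = next;
                next = null;
                lock.notifyAll(); // wake up the producer
                return batch;
            }
        }
    }

    /** Starts a producer thread that hands over batchCount batches and returns what the consumer received. */
    static List<String> runDemo(int batchCount) throws InterruptedException {
        SimpleHandover<String> handover = new SimpleHandover<>();
        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < batchCount; i++) {
                    handover.produce(Collections.singletonList("event-" + i));
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        // The calling thread plays the consumer role.
        List<String> received = new ArrayList<>();
        for (int i = 0; i < batchCount; i++) {
            received.addAll(handover.pollNext());
        }
        producer.join();
        return received;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runDemo(3)); // prints [event-0, event-1, event-2]
    }
}
```

Because the slot holds only one batch, the producer can never run more than one batch ahead of the consumer, which gives simple back-pressure without a queue.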
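The produce method is called from handleBatch in DebeziumChangeConsumer. Despite its name, this class consumes events from the Debezium engine and therefore acts as the producer towards the Handover: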
public void handleBatch(
        List<ChangeEvent<SourceRecord, SourceRecord>> events,
        RecordCommitter<ChangeEvent<SourceRecord, SourceRecord>> recordCommitter) {
    try {
        currentCommitter = recordCommitter;
        handover.produce(events);
    } catch (Throwable e) {
        // Hold this exception in handover and trigger the fetcher to exit
        handover.reportError(e);
    }
}
The pollNext method is called from runFetchLoop in the DebeziumChangeFetcher class, which consumes the data and passes it on to Flink for processing:
public void runFetchLoop() throws Exception {
    try {
        // begin snapshot database phase
        if (isInDbSnapshotPhase) {
            List<ChangeEvent<SourceRecord, SourceRecord>> events = handover.pollNext();
            synchronized (checkpointLock) {
                LOG.info(
                        "Database snapshot phase can't perform checkpoint, acquired Checkpoint lock.");
                handleBatch(events);
                while (isRunning && isInDbSnapshotPhase) {
                    handleBatch(handover.pollNext());
                }
            }
            LOG.info("Received record from streaming binlog phase, released checkpoint lock.");
        }
        // begin streaming binlog phase
        while (isRunning) {
            // If the handover is closed or has errors, exit.
            // If there is no streaming phase, the handover will be closed by the engine.
            handleBatch(handover.pollNext());
        }
    } catch (Handover.ClosedException e) {
        // ignore
    }
}
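The two phases of runFetchLoop can be illustrated with a small standalone sketch. It assumes, based only on the log messages above, that the snapshot flag is cleared when the first non-snapshot record arrives. The Record class, its fromSnapshot flag and the demo method are hypothetical and are not part of Flink CDC.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SnapshotLockSketch {

    /** Hypothetical record: a payload plus a flag saying whether it came from the snapshot. */
    static final class Record {
        final String value;
        final boolean fromSnapshot;

        Record(String value, boolean fromSnapshot) {
            this.value = value;
            this.fromSnapshot = fromSnapshot;
        }
    }

    private final Object checkpointLock = new Object();
    private boolean inSnapshotPhase = true;
    private final List<String> log = new ArrayList<>();

    /** Drains all batches: snapshot batches under the lock, the rest without it. */
    List<String> runFetchLoop(List<List<Record>> batches) {
        int i = 0;
        if (inSnapshotPhase) {
            // Holding the lock for the whole snapshot phase keeps checkpoints out.
            synchronized (checkpointLock) {
                log.add("lock acquired");
                while (i < batches.size() && inSnapshotPhase) {
                    handleBatch(batches.get(i++));
                }
            }
            log.add("lock released");
        }
        // Streaming phase: checkpoints may now run between batches.
        while (i < batches.size()) {
            handleBatch(batches.get(i++));
        }
        return log;
    }

    /** Assumption: the first record that is not from the snapshot ends the snapshot phase. */
    private void handleBatch(List<Record> batch) {
        for (Record r : batch) {
            if (inSnapshotPhase && !r.fromSnapshot) {
                inSnapshotPhase = false;
            }
            log.add("emit " + r.value);
        }
    }

    /** Two snapshot batches followed by two streaming batches. */
    static List<String> demo() {
        List<List<Record>> batches = Arrays.asList(
                Arrays.asList(new Record("s1", true), new Record("s2", true)),
                Arrays.asList(new Record("s3", true)),
                Arrays.asList(new Record("c1", false)),
                Arrays.asList(new Record("c2", false)));
        return new SnapshotLockSketch().runFetchLoop(batches);
    }

    public static void main(String[] args) {
        // prints [lock acquired, emit s1, emit s2, emit s3, emit c1, lock released, emit c2]
        System.out.println(demo());
    }
}
```

In this sketch the lock is released only after the batch containing the first streaming record has been handled, which matches the order of the two log statements in the original method.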
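Summary: the architecture borrows the producer-consumer idea familiar from Kafka to let Flink CDC consume Oracle CDC data. Two threads are started, each playing one role and doing only its own work. Handover serves as the buffer that passes data from the producer to the consumer. Because the two threads do not communicate with each other directly, error reporting also goes through the Handover: when the engine fails, it uses DebeziumEngine.CompletionCallback to report the error to the Handover and wakes up the consumer so that it can check the error.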
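The error path described in the summary can be sketched as follows. This is a simplified illustration, not the real Handover: ErrorAwareHandover, runDemo and the error message are hypothetical, and the failure is wrapped in a RuntimeException here only to keep the example short.

```java
public class HandoverErrorSketch {

    /** Hypothetical handover that carries either a value or an error to the consumer. */
    static final class ErrorAwareHandover {
        private final Object lock = new Object();
        private String next;
        private Throwable error;

        /** Called on the producer side when the engine fails; wakes up a waiting consumer. */
        void reportError(Throwable t) {
            synchronized (lock) {
                if (error == null) {
                    error = t; // keep only the first error
                }
                lock.notifyAll();
            }
        }

        /** Called by the consumer; returns the next value or rethrows the reported error. */
        String pollNext() throws InterruptedException {
            synchronized (lock) {
                while (next == null && error == null) {
                    lock.wait();
                }
                if (next != null) {
                    String n = next;
                    next = null;
                    lock.notifyAll();
                    return n;
                }
                throw new RuntimeException("producer failed", error);
            }
        }
    }

    /** Simulates an engine thread that fails and a consumer that learns about it via the handover. */
    static String runDemo() throws InterruptedException {
        ErrorAwareHandover handover = new ErrorAwareHandover();
        // Stand-in for the completion callback branch that runs when success is false.
        Thread engine = new Thread(
                () -> handover.reportError(new IllegalStateException("engine stopped unexpectedly")));
        engine.start();
        try {
            handover.pollNext();
            return "no error";
        } catch (RuntimeException e) {
            return e.getCause().getMessage();
        } finally {
            engine.join();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runDemo()); // prints engine stopped unexpectedly
    }
}
```

The consumer never talks to the engine thread directly. It only observes the error field of the shared object, which is why the wait condition in pollNext checks both next and error.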