How Flink unifies stream and batch processing (Flink 1.14 source code)
-
source
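- taskmanager.Task#doRun calls taskmanager.Task#restoreAndInvoke.
- restoreAndInvoke in turn runs the StreamTask, which drives the operator methods (map, filter, and so on). This call blocks, so the code after it does not run yet.
- When the SourceFunction of the SourceStreamTask stops emitting data (everything has been sent), the source thread (LegacySourceFunctionThread sourceThread) ends. SourceStreamTask#processInput then calls mailboxProcessor.suspend(), which sets suspended to true; MailboxProcessor#runMailboxLoop exits its loop, and the StreamTask finishes its invoke() method. In detail:
- Once the source thread (LegacySourceFunctionThread sourceThread) ends, sourceThread.getCompletionFuture() completes and no longer blocks.
- The loop condition of MailboxProcessor#runMailboxLoop is isNextLoopPossible(), i.e. !suspended (the loop continues only while suspended is false).
- mailboxProcessor.suspend() sets suspended to true, so MailboxProcessor#runMailboxLoop exits its loop.
- The blocking call in taskmanager.Task#restoreAndInvoke returns, and the code after it starts to run.
- If the task has downstream consumers, it loops over its result partition writers and calls partitionWriter.finish() (i.e. BufferWritingResultPartition#finish).
- That method loops over its subpartitions and calls subpartition.finish() (i.e. PipelinedSubpartition#finish).
- PipelinedSubpartition#finish emits the EndOfPartitionEvent.INSTANCE event.
- At this point the source task is ready to finish.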
-
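The source-side sequence above can be modeled with a small toy program. This is only a sketch under simplifying assumptions, not Flink code: ToyMailbox, runSourceTask, and the string "EndOfPartitionEvent" are hypothetical stand-ins for MailboxProcessor, the Task/SourceStreamTask interplay, and EndOfPartitionEvent.INSTANCE, and the wake-up mail used in suspend() is a simplification of how the real mailbox is woken.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

/** Toy model of the source-side shutdown sequence; all names are illustrative. */
public class SourceShutdownSketch {

    /** Hypothetical stand-in for MailboxProcessor: loops until suspended. */
    static final class ToyMailbox {
        private final LinkedBlockingQueue<Runnable> mails = new LinkedBlockingQueue<>();
        private volatile boolean suspended = false;

        void suspend() {
            suspended = true;
            mails.add(() -> {}); // no-op mail that wakes the loop if it is blocked in take()
        }

        boolean isNextLoopPossible() {
            return !suspended;
        }

        void runMailboxLoop() throws InterruptedException {
            while (isNextLoopPossible()) {
                mails.take().run();
            }
        }
    }

    /** Runs a bounded "source" and returns the contents of each subpartition. */
    public static List<List<String>> runSourceTask(int records, int subpartitions)
            throws InterruptedException {
        List<List<String>> partition = new ArrayList<>();
        for (int i = 0; i < subpartitions; i++) {
            partition.add(new ArrayList<>());
        }

        ToyMailbox mailbox = new ToyMailbox();
        CompletableFuture<Void> completionFuture = new CompletableFuture<>();

        // The source thread emits a bounded number of records, then ends.
        Thread sourceThread = new Thread(() -> {
            for (int i = 0; i < records; i++) {
                partition.get(i % subpartitions).add("record-" + i);
            }
            completionFuture.complete(null);
        });

        // When the source thread is done, suspend the mailbox loop.
        completionFuture.whenComplete((ignored, error) -> mailbox.suspend());
        sourceThread.start();

        // Blocks until suspend() is called, like the invoke() step.
        mailbox.runMailboxLoop();

        // After the loop exits, finish every subpartition with an end marker.
        for (List<String> subpartition : partition) {
            subpartition.add("EndOfPartitionEvent");
        }
        return partition;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runSourceTask(3, 2));
    }
}
```

The point of the sketch is the ordering: records are written first, the loop exits only after the source thread completes, and the end marker is appended to every subpartition last.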
Downstream operators
-
The downstream operator receives the EndOfPartitionEvent.INSTANCE event.
-
SingleInputGate#transformEvent
- Receives the EndOfPartitionEvent.INSTANCE event.
- Sets hasReceivedAllEndOfPartitionEvents to true (once every input channel of the gate has delivered its EndOfPartitionEvent).
-
SingleInputGate#getNextBufferOrEvent then returns Optional.empty():
-
if (hasReceivedAllEndOfPartitionEvents) { return Optional.empty(); }
-
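The end-of-partition bookkeeping described above can be sketched as a toy input gate. This is an illustration under assumptions, not the real SingleInputGate: ToyInputGate, its Item class, the enqueue method, and the use of plain strings as payloads are all hypothetical. It assumes the flag flips only after every channel has delivered its end marker.

```java
import java.util.ArrayDeque;
import java.util.BitSet;
import java.util.Optional;
import java.util.Queue;

/** Toy stand-in for an input gate; every name here is illustrative. */
public class ToyInputGate {
    public static final String END_OF_PARTITION = "EndOfPartitionEvent";

    private static final class Item {
        final int channel;
        final String payload;

        Item(int channel, String payload) {
            this.channel = channel;
            this.payload = payload;
        }
    }

    private final int numberOfInputChannels;
    private final BitSet channelsWithEndOfPartition;
    private final Queue<Item> pending = new ArrayDeque<>();
    private boolean hasReceivedAllEndOfPartitionEvents = false;

    public ToyInputGate(int numberOfInputChannels) {
        this.numberOfInputChannels = numberOfInputChannels;
        this.channelsWithEndOfPartition = new BitSet(numberOfInputChannels);
    }

    /** Simulates data or an event arriving on one channel. */
    public void enqueue(int channel, String payload) {
        pending.add(new Item(channel, payload));
    }

    public Optional<String> getNextBufferOrEvent() {
        if (hasReceivedAllEndOfPartitionEvents) {
            return Optional.empty();
        }
        Item item = pending.poll();
        if (item == null) {
            return Optional.empty(); // nothing available right now, but not finished
        }
        if (END_OF_PARTITION.equals(item.payload)) {
            // Remember which channel ended; finish only when all channels have ended.
            channelsWithEndOfPartition.set(item.channel);
            hasReceivedAllEndOfPartitionEvents =
                    channelsWithEndOfPartition.cardinality() == numberOfInputChannels;
        }
        return Optional.of(item.payload);
    }

    public boolean isFinished() {
        return hasReceivedAllEndOfPartitionEvents;
    }
}
```

Because an empty result can mean either "nothing yet" or "finished", the caller has to check isFinished() to tell the two cases apart.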
-
In AbstractStreamTaskNetworkInput#emitNext, checkpointedInputGate.pollNext() receives Optional.empty().
-
SingleInputGate#isFinished returns true, so checkpointedInputGate.isFinished() returns true and AbstractStreamTaskNetworkInput#emitNext returns DataInputStatus.END_OF_INPUT.
-
-
StreamOneInputProcessor#processInput returns DataInputStatus.END_OF_INPUT.
-
StreamTask#processInput receives DataInputStatus.END_OF_INPUT and calls mailboxProcessor.suspend().
-
MailboxProcessor#runMailboxLoop then exits its loop, so the StreamTask finishes its invoke() method, and the downstream operator starts to finish as well.
-
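The downstream chain (emitNext returning END_OF_INPUT, processInput reacting to it, and the loop exiting) can be sketched as follows. This is a toy model under assumptions, not Flink code: DownstreamEndOfInputSketch, its single-channel input queue, the upper-casing "operator", and the reduced three-value DataInputStatus enum are all hypothetical, and the suspended flag stands in for mailboxProcessor.suspend().

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Toy model of a downstream task reacting to end of input; names are illustrative. */
public class DownstreamEndOfInputSketch {
    // Reduced version of the status enum; the real one has more values.
    enum DataInputStatus { MORE_AVAILABLE, NOTHING_AVAILABLE, END_OF_INPUT }

    public static final String END_OF_PARTITION = "EndOfPartitionEvent";

    private final Queue<String> input;
    private final List<String> processed = new ArrayList<>();
    private boolean gateFinished = false; // single input channel assumed
    private boolean suspended = false;    // stands in for mailboxProcessor.suspend()

    public DownstreamEndOfInputSketch(List<String> input) {
        this.input = new ArrayDeque<>(input);
    }

    /** Polls one element; an empty poll on a finished gate means END_OF_INPUT. */
    DataInputStatus emitNext() {
        String next = gateFinished ? null : input.poll();
        if (next == null) {
            return gateFinished
                    ? DataInputStatus.END_OF_INPUT
                    : DataInputStatus.NOTHING_AVAILABLE;
        }
        if (END_OF_PARTITION.equals(next)) {
            gateFinished = true;
            return DataInputStatus.MORE_AVAILABLE;
        }
        processed.add(next.toUpperCase()); // the "operator" of this toy task
        return DataInputStatus.MORE_AVAILABLE;
    }

    /** One step of the default action: suspend the loop at end of input. */
    void processInput() {
        if (emitNext() == DataInputStatus.END_OF_INPUT) {
            suspended = true;
        }
    }

    /**
     * Runs until suspended. The toy spins on NOTHING_AVAILABLE, so the input
     * must end with END_OF_PARTITION for this method to return.
     */
    public List<String> invoke() {
        while (!suspended) {
            processInput();
        }
        return processed;
    }

    public static void main(String[] args) {
        List<String> in = new ArrayList<>();
        in.add("a");
        in.add("b");
        in.add(END_OF_PARTITION);
        System.out.println(new DownstreamEndOfInputSketch(in).invoke());
    }
}
```

As with the source, the task does not stop because a record says so; it stops because the input reports that it is finished, which is what lets a bounded job drain stage by stage.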