Big tech companies place heavy emphasis on understanding the underlying source code: Hadoop, Spark, Flink.
Flink job submission flow:
(1) Taking the yarn-per-job submission flow as an example: bin/flink run -t yarn-per-job -c com.xxx.xxx.WordCount ./WordCount.jar
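The WordCount class in the command stands for the user's job. As a rough illustration of what such a job computes, here is a plain-Python sketch of the usual flatMap → (word, 1) → keyBy → sum logic; this simulates the dataflow only and uses none of Flink's APIs:

```python
from collections import defaultdict

def word_count(lines):
    """Simulate the pipeline a typical Flink WordCount job runs:
    flatMap (line -> words), then keyBy(word).sum(1)."""
    counts = defaultdict(int)
    for line in lines:
        for word in line.split():   # flatMap: split each line into words
            counts[word] += 1       # keyBy(word).sum(1): per-key counter
    return dict(counts)

print(word_count(["hello flink", "hello yarn"]))
# {'hello': 2, 'flink': 1, 'yarn': 1}
```

In the real job this logic is distributed: each operator instance runs as a subtask in a TaskManager slot, which is what the steps below set up.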
(2) 1.1 The shell script starts and launches the client JVM
1.2 CliFrontendParser parses the command-line arguments
1.3 FlinkYarnSessionCli is selected as the active CLI
1.4 The user's main() is executed
1.5 The StreamGraph is generated
1.6 The StreamGraph is translated into a JobGraph
1.7 The jar and configuration files are uploaded
1.8 The submission parameters and launch command are assembled:
bin/java ApplicationMaster -jar --class …
1.9 YarnClient submits the application to YARN's ResourceManager via submitApplication
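The client-side steps above can be condensed into a toy walkthrough. Only CliFrontendParser, FlinkYarnSessionCli, and submitApplication are real Flink/YARN names; the function itself and its step strings are purely illustrative:

```python
def submit_per_job(main_class):
    """Toy model of the yarn-per-job client path (steps 1.1-1.9).
    This is a simplified narration, not Flink's actual code path."""
    steps = []
    steps.append("parse CLI arguments (CliFrontendParser)")       # 1.2
    steps.append("select FlinkYarnSessionCli as the active CLI")  # 1.3
    steps.append("run user main(), building the StreamGraph")     # 1.4-1.5
    steps.append("translate StreamGraph -> JobGraph")             # 1.6
    steps.append("upload jar + config to the staging directory")  # 1.7
    am_cmd = f"bin/java ApplicationMaster --class {main_class}"   # 1.8
    steps.append("YarnClient.submitApplication()")                # 1.9
    return steps, am_cmd

steps, cmd = submit_per_job("WordCount")
print(cmd)  # bin/java ApplicationMaster --class WordCount
```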
(3) The ResourceManager instructs a NodeManager to start the ApplicationMaster in a container on that node
(4) 3.1 The AM starts the Dispatcher
3.2 The AM starts Flink's own (internal) ResourceManager
3.3 The Dispatcher starts a JobMaster for the job
3.4 The JobMaster translates the JobGraph into an ExecutionGraph
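The two graph translations above (StreamGraph → JobGraph in step 1.6, JobGraph → ExecutionGraph in step 3.4) can be sketched as follows. The chaining rule shown is deliberately simplified: real operator chaining also requires forward partitioning, the same slot-sharing group, and other conditions, not just equal parallelism:

```python
def to_job_graph(stream_ops):
    """Toy operator chaining: merge adjacent ops with equal parallelism
    into one JobVertex (Flink's actual chaining rules are stricter)."""
    vertices = []
    for name, par in stream_ops:
        if vertices and vertices[-1][1] == par:
            vertices[-1] = (vertices[-1][0] + "->" + name, par)  # chain
        else:
            vertices.append((name, par))
    return vertices

def to_execution_graph(job_vertices):
    """Toy parallelization: each JobVertex expands into `par` subtasks,
    which is what the ExecutionGraph schedules onto slots."""
    return [f"{name}[{i}]" for name, par in job_vertices for i in range(par)]

jg = to_job_graph([("source", 1), ("map", 2), ("sum", 2), ("sink", 1)])
# [('source', 1), ('map->sum', 2), ('sink', 1)]
eg = to_execution_graph(jg)
# ['source[0]', 'map->sum[0]', 'map->sum[1]', 'sink[0]']
```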
(5) The JobMaster's SlotPool registers with the SlotManager and requests the slots the job needs; the SlotManager checks in real time whether enough slots are available
(6) If slots are insufficient, the SlotManager requests new resources from YARN's ResourceManager via requestNewWorker
(7) A TaskManager is started in the newly allocated container
(8) runTaskManager starts the TaskExecutor
(9) The TaskExecutor registers its slots with the SlotManager
(10) The SlotManager allocates slots on the TaskExecutor to the job
(11) The TaskExecutor offers the allocated slots to the SlotPool
(12) Finally, the tasks are deployed to the TaskExecutor for execution via submitTask()
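Steps (9)–(11) can be modeled as a tiny simulation of the slot negotiation. These classes are toy stand-ins, not Flink's real SlotManager, SlotPool, or TaskExecutor implementations:

```python
class TaskExecutor:
    """Toy worker: owns a fixed number of slots (step 7-8)."""
    def __init__(self, n_slots):
        self.slots = [f"slot-{i}" for i in range(n_slots)]

class SlotManager:
    """Toy model of Flink-RM-side slot bookkeeping (steps 9-10)."""
    def __init__(self):
        self.free = []                 # list of (task_executor, slot_id)
    def register(self, te):            # (9) TaskExecutor registers its slots
        self.free += [(te, s) for s in te.slots]
    def allocate(self):                # (10) hand out one free slot, if any
        return self.free.pop(0) if self.free else None

class SlotPool:
    """Toy model of the JobMaster-side SlotPool (steps 5 and 11)."""
    def __init__(self):
        self.offered = []
    def offer(self, slot):             # (11) TaskExecutor offers the slot
        self.offered.append(slot)

sm, pool = SlotManager(), SlotPool()
te = TaskExecutor(n_slots=2)
sm.register(te)                        # (9)
pool.offer(sm.allocate())              # (10) + (11)
print(len(pool.offered))               # 1 slot ready -> submitTask() can run (12)
```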