1. Configure the maximum number of YARN ApplicationMaster attempts in yarn-site.xml:
<property>
  <name>yarn.resourcemanager.am.max-attempts</name>
  <value>4</value>
</property>
2. Configure the number of application attempts on the Flink side:
[root@hadoop2 flink-1.5.0]# vi conf/flink-conf.yaml
yarn.application-attempts: 10
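This parameter is the maximum number of times the Flink job (called an application in YARN) may be restarted when the JobManager (that is, the Application Master) is recovered.
Note that in a Flink on YARN setup, when the JobManager (Application Master) fails, YARN tries to restart the JobManager (AM), and once it is back up the Flink job (application) is started again. The value of yarn.application-attempts should therefore not exceed yarn.resourcemanager.am.max-attempts.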
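The constraint in the note above can be checked mechanically before deploying. The following is only a rough sketch: the helper names (read_yarn_property, read_flink_option, attempts_consistent) are hypothetical and not part of Flink or Hadoop, and it assumes flink-conf.yaml contains flat "key: value" lines as in this walkthrough.

```python
import xml.etree.ElementTree as ET


def read_yarn_property(xml_text, name):
    """Return the <value> of the named <property> in yarn-site.xml content, or None."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None


def read_flink_option(conf_text, key):
    """Return the value of a flat 'key: value' line in flink-conf.yaml content, or None."""
    for line in conf_text.splitlines():
        line = line.strip()
        if line.startswith(key + ":"):
            return line[len(key) + 1:].strip()
    return None


def attempts_consistent(yarn_site_xml, flink_conf):
    """True if yarn.application-attempts <= yarn.resourcemanager.am.max-attempts.

    Raises ValueError if either setting is missing (no defaults are assumed here).
    """
    am_max = read_yarn_property(yarn_site_xml, "yarn.resourcemanager.am.max-attempts")
    attempts = read_flink_option(flink_conf, "yarn.application-attempts")
    if am_max is None or attempts is None:
        raise ValueError("both settings must be present to compare them")
    return int(attempts) <= int(am_max)


if __name__ == "__main__":
    # Values taken from steps 1 and 2 of this walkthrough.
    yarn_site = (
        "<configuration><property>"
        "<name>yarn.resourcemanager.am.max-attempts</name><value>4</value>"
        "</property></configuration>"
    )
    flink_conf = "yarn.application-attempts: 10"
    print(attempts_consistent(yarn_site, flink_conf))
```

With the values used in steps 1 and 2 (4 on the YARN side, 10 on the Flink side) the check reports False, so one of the two settings would need adjusting to satisfy the rule stated above.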
3. Configure ZooKeeper:
[root@hadoop2 flink-1.5.0]# vi conf/flink-conf.yaml
high-availability: zookeeper
high-availability.storageDir: hdfs:///flink/ha/
high-availability.zookeeper.quorum: 10.108.4.203:2181,10.108.4.204:2181,10.108.4.205:2181
high-availability.zookeeper.path.root: /flink
high-availability.cluster-id: /cluster_yarn
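A quick sanity check of the HA settings above can catch typos in the quorum string before the cluster is started. This is a minimal sketch, not a Flink tool: the function names are hypothetical, the list of keys treated as mandatory is an assumption for illustration, and the parser only handles flat "key: value" lines.

```python
# Keys treated as mandatory here; this list is an assumption made for the sketch.
REQUIRED_HA_KEYS = (
    "high-availability",
    "high-availability.storageDir",
    "high-availability.zookeeper.quorum",
)


def parse_flat_conf(conf_text):
    """Parse flat 'key: value' lines into a dict, skipping blanks and comments."""
    options = {}
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        # Split on the first colon only, so values such as hdfs:///... stay intact.
        key, value = line.split(":", 1)
        options[key.strip()] = value.strip()
    return options


def missing_ha_keys(options):
    """Return the required HA keys that are absent or empty."""
    return [key for key in REQUIRED_HA_KEYS if not options.get(key)]


def parse_quorum(quorum):
    """Split 'host:port,host:port' into a list of (host, port) tuples."""
    servers = []
    for entry in quorum.split(","):
        host, port = entry.strip().rsplit(":", 1)
        servers.append((host, int(port)))
    return servers


if __name__ == "__main__":
    conf = """
high-availability: zookeeper
high-availability.storageDir: hdfs:///flink/ha/
high-availability.zookeeper.quorum: 10.108.4.203:2181,10.108.4.204:2181,10.108.4.205:2181
high-availability.zookeeper.path.root: /flink
high-availability.cluster-id: /cluster_yarn
"""
    options = parse_flat_conf(conf)
    print(missing_ha_keys(options))
    print(parse_quorum(options["high-availability.zookeeper.quorum"]))
```

A malformed quorum entry (for example a missing port) makes parse_quorum raise ValueError, which is usually easier to diagnose than a connection failure at startup.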
4. co