Standalone Cluster
Add the hosts to the masters and slaves configuration files
Note: the configuration must be identical on every server
masters
10.4.243.134:8081
10.5.233.254:8081
slaves
10.4.243.134
10.5.233.254
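Because both files must be the same on every server, it can help to generate them from a single host list. The following is a minimal sketch, not part of the original setup: the function name write_cluster_files and its arguments are hypothetical, while the addresses and port 8081 are the ones listed above.

```shell
#!/bin/sh
# Hypothetical helper: write conf/masters and conf/slaves from one host list.
# Usage: write_cluster_files <conf_dir> <webui_port> <host>...
write_cluster_files() {
  conf_dir="$1"
  port="$2"
  shift 2
  mkdir -p "$conf_dir"
  # Start from empty files so repeated runs give the same result.
  : > "$conf_dir/masters"
  : > "$conf_dir/slaves"
  for host in "$@"; do
    # masters entries carry the web UI port; slaves entries are bare hosts.
    printf '%s:%s\n' "$host" "$port" >> "$conf_dir/masters"
    printf '%s\n' "$host" >> "$conf_dir/slaves"
  done
}
```

For example, write_cluster_files ./conf 8081 10.4.243.134 10.5.233.254 reproduces the two files shown above. Copying the conf directory to each server is left out of the sketch.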
Set the JobManager address and ports
jobmanager.rpc.address: 10.4.243.134
jobmanager.rpc.port: 6123
rest.port: 8081
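To confirm that a server really carries these values, the keys can be read back from flink-conf.yaml. This is only a sketch: the helper name flink_conf_get is hypothetical, and it assumes simple top-level "key: value" lines like the ones above.

```shell
#!/bin/sh
# Hypothetical helper: print the value of a top-level key in flink-conf.yaml.
# Usage: flink_conf_get <file> <key>
flink_conf_get() {
  # Only lines that begin with "key:" match, so commented-out lines
  # (starting with '#') are skipped. The first match wins.
  awk -v key="$2" '
    index($0, key ":") == 1 {
      val = substr($0, length(key) + 2)
      sub(/^[ \t]+/, "", val)   # trim leading whitespace
      sub(/[ \t]+$/, "", val)   # trim trailing whitespace
      print val
      exit
    }' "$1"
}
```

Running flink_conf_get conf/flink-conf.yaml jobmanager.rpc.address on each server and comparing the output is one way to spot a host whose configuration has drifted.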
Start the cluster
./start-cluster.sh
Starting cluster.
used deprecated key `jobmanager.heap.mb`, please replace with key `jobmanager.heap.size`
Starting standalonesession daemon on host 10.4.243.134.
Nasty PTR record "10.4.243.134" is set up for 10.4.243.134, ignoring
used deprecated key `taskmanager.heap.mb`, please replace with key `taskmanager.heap.size`
Starting taskexecutor daemon on host 10.4.243.134.
Nasty PTR record "10.5.233.254" is set up for 10.5.233.254, ignoring
used deprecated key `taskmanager.heap.mb`, please replace with key `taskmanager.heap.size`
Starting taskexecutor daemon on host 10.5.233.254.
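The log above warns that jobmanager.heap.mb and taskmanager.heap.mb are deprecated in favor of the heap.size keys. One possible way to migrate a config file is sketched below. The function name migrate_heap_keys is hypothetical, and the sketch assumes the old value is a plain number of megabytes, so it appends "m".

```shell
#!/bin/sh
# Hypothetical helper: rewrite the deprecated *.heap.mb keys to *.heap.size.
# Reads flink-conf.yaml content on stdin and writes the result to stdout.
# Assumption: the old value is a bare number of megabytes, so "m" is appended.
migrate_heap_keys() {
  sed \
    -e 's/^jobmanager\.heap\.mb:[ ]*\([0-9][0-9]*\)[ ]*$/jobmanager.heap.size: \1m/' \
    -e 's/^taskmanager\.heap\.mb:[ ]*\([0-9][0-9]*\)[ ]*$/taskmanager.heap.size: \1m/'
}
```

A possible usage is migrate_heap_keys < conf/flink-conf.yaml > flink-conf.yaml.new, followed by a manual review before replacing the original file.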
Result after a successful start
By default, each machine runs one TaskManager
Set the task slots and memory
jobmanager.rpc.address: 10.4.243.134
jobmanager.rpc.port: 6123
jobmanager.heap.size: 1024m
taskmanager.heap.size: 2048m
taskmanager.numberOfTaskSlots: 4
rest.port: 8081
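With one TaskManager per host in the slaves file and 4 slots per TaskManager, the two-host cluster above offers 2 x 4 = 8 task slots. The sketch below computes that number from the files. The function name total_task_slots is hypothetical, and it falls back to 1 slot per TaskManager (Flink's default) when the key is not set.

```shell
#!/bin/sh
# Hypothetical helper: total slots = TaskManager hosts x slots per TaskManager.
# Usage: total_task_slots <slaves_file> <flink_conf_file>
total_task_slots() {
  # Count non-blank lines in the slaves file (one TaskManager per line).
  tm_count=$(grep -c '[^[:space:]]' "$1")
  # Read taskmanager.numberOfTaskSlots from flink-conf.yaml (first match only).
  slots=$(sed -n 's/^taskmanager\.numberOfTaskSlots:[ ]*\([0-9][0-9]*\).*$/\1/p' "$2" | head -n 1)
  # Fall back to 1 slot when the key is missing.
  echo $((tm_count * ${slots:-1}))
}
```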
Result after a successful start
HA setup
Configure HA through ZooKeeper
# Note: remove the JobManager address
# jobmanager.rpc.address: 10.4.243.134
jobmanager.rpc.port: 6123
jobmanager.heap.size: 1024m
taskmanager.heap.size: 1024m
taskmanager.numberOfTaskSlots: 4
# High-availability mode
high-availability: zookeeper
# ZooKeeper hosts; separate multiple entries with commas
high-availability.zookeeper.quorum: 10.4.243.134:2181
# Root ZooKeeper node, under which the namespaces of all cluster nodes are placed
high-availability.zookeeper.path.root: /flink
# Cluster ID
high-availability.cluster-id: /cluster_one
# Metadata needed to recover a failed JobManager, stored in HDFS
high-availability.storageDir: hdfs:///flink/recovery/
rest.port: 8081
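Before starting the cluster in HA mode, a quick sanity check of the file can catch a forgotten key. The sketch below is an illustration only: the function name check_ha_conf is hypothetical, and the set of keys it checks is simply the subset used in the configuration above, following the note that jobmanager.rpc.address should be removed.

```shell
#!/bin/sh
# Hypothetical helper: check that flink-conf.yaml contains the HA keys used
# above and that jobmanager.rpc.address is not set.
# Returns 0 when the file looks consistent, 1 otherwise.
check_ha_conf() {
  conf="$1"
  status=0
  for key in high-availability high-availability.zookeeper.quorum high-availability.storageDir; do
    # The key must start the line, so commented-out entries do not count.
    if ! grep -q "^${key}:" "$conf"; then
      echo "missing key: ${key}" >&2
      status=1
    fi
  done
  if grep -q '^jobmanager\.rpc\.address:' "$conf"; then
    echo "jobmanager.rpc.address should be removed or commented out" >&2
    status=1
  fi
  return "$status"
}
```

For example, check_ha_conf conf/flink-conf.yaml && ./bin/start-cluster.sh would start the cluster only when the check passes.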
Result after startup
Kill the current JobManager
[root@10 flink-1.7.1]# jps
75960 StandaloneSessionClusterEntrypoint
[root@10 flink-1.7.1]# kill -9 75960
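The PID lookup in the transcript above can be scripted for repeated failover tests. This is a sketch under the assumption that the JobManager shows up in jps as StandaloneSessionClusterEntrypoint, as it does above. The function name jobmanager_pid is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: extract the PID of the standalone JobManager
# from `jps` output supplied on stdin.
jobmanager_pid() {
  # jps prints "<pid> <main class>"; keep the PID of the matching class.
  awk '$2 == "StandaloneSessionClusterEntrypoint" { print $1 }'
}

# Usage (not run here):
#   pid=$(jps | jobmanager_pid)
#   [ -n "$pid" ] && kill -9 "$pid"
```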
Check the web UI on the other machine
Restart the JobManager that was just killed
./bin/jobmanager.sh start 10.4.243.134