1. Download Flink
https://flink.apache.org/zh/downloads.html, latest release at the time of writing: Apache Flink 1.10.0 for Scala 2.11 (asc, sha512)
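On a server without a browser, the package can be fetched directly; the URL below follows the standard Apache archive layout and is an assumption, so adjust it if you download from a mirror. Comparing the sha512 against the checksum published on the download page is a quick integrity check:
workapp]# wget https://archive.apache.org/dist/flink/flink-1.10.0/flink-1.10.0-bin-scala_2.11.tgz
workapp]# sha512sum flink-1.10.0-bin-scala_2.11.tgz
# compare the output against the published .sha512 value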
2. Upload to the cluster servers and extract
workapp]# tar -zxvf flink-1.10.0-bin-scala_2.11.tgz
3. Configuration
workapp]# vim flink-1.10.0/conf/flink-conf.yaml
# Hostname of the master (JobManager) node
jobmanager.rpc.address: jdddata-processing-02
# The RPC port where the JobManager is reachable.
jobmanager.rpc.port: 6123
# The heap size for the JobManager JVM
jobmanager.heap.size: 1024m
# The heap size for the TaskManager JVM
taskmanager.heap.size: 2048m
# The number of task slots that each TaskManager offers. Each slot runs one parallel pipeline.
# How many slots each machine can run in parallel; one CPU core can run one slot
taskmanager.numberOfTaskSlots: 2
# The parallelism used for programs that did not specify and other parallelism.
# Maximum parallelism of the whole cluster: number of slave nodes * CPU cores per node (here 5 nodes * 2 slots = 10)
parallelism.default: 10
Edit conf/masters so it matches the master hostname in flink-conf.yaml; here the local node 02 is chosen as the master
jdddata-processing-02:8081
Edit conf/slaves, which lists all the slave nodes; note that jobmanager.rpc.address in every slave's flink-conf.yaml is the same, pointing at the current master node
jdddata-processing-03
jdddata-processing-04
jdddata-processing-05
jdddata-processing-06
jdddata-processing-07
4. Sync to all the nodes
workapp]# rsync -avzP -e 'ssh -p 62222' flink-1.10.0/ root@jdddata-processing-07:/data/workapp/flink-1.10.0
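The command above syncs a single node; a small loop over conf/slaves covers the rest (a sketch assuming the same SSH port and /data/workapp path everywhere; the directory must sit at the identical path on every node, because start-cluster.sh starts the TaskManagers over SSH using the same path):
for host in $(cat flink-1.10.0/conf/slaves); do
  rsync -avzP -e 'ssh -p 62222' flink-1.10.0/ root@${host}:/data/workapp/flink-1.10.0
done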
5. Start/stop
These only need to be run on the master node (start/stop):
bin/start-cluster.sh
bin/stop-cluster.sh
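To check that the cluster actually came up, the JobManager REST endpoint answers on the port configured in conf/masters (8081 here), and jps shows the Flink processes; in Flink 1.10 standalone mode the JobManager runs as StandaloneSessionClusterEntrypoint and each TaskManager as TaskManagerRunner. A quick sanity-check sketch:
workapp]# curl http://jdddata-processing-02:8081/overview
# expected: JSON with the number of registered taskmanagers and slots
workapp]# jps
# master: StandaloneSessionClusterEntrypoint; each slave: TaskManagerRunner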
Possible problems:
First: set the HADOOP_CONF_DIR environment variable
export HADOOP_CONF_DIR="/etc/hadoop/conf"
Second: the SSH port here is not the default 22, so set the FLINK_SSH_OPTS environment variable (the value below is an example; for this cluster it would be -p 62222)
export FLINK_SSH_OPTS="-p 12345"
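Both exports must be visible to the user that runs start-cluster.sh on the master. A minimal sketch that persists them (the port is taken from the rsync step above; adjust paths to your environment):
workapp]# echo 'export HADOOP_CONF_DIR="/etc/hadoop/conf"' >> ~/.bashrc
workapp]# echo 'export FLINK_SSH_OPTS="-p 62222"' >> ~/.bashrc
workapp]# source ~/.bashrc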