Execution flow
- The client submits a job to the JobManager
- The JobManager allocates the compute resources the job needs
- The JobManager distributes tasks to the TaskManagers for execution
- Each TaskManager periodically reports its status to the JobManager and returns the result once computation finishes
Installation
This guide installs Flink 1.10.0.
- Cluster layout
- Server: node1 (Master + Slave): JobManager + TaskManager
- Server: node2 (Slave): TaskManager
- Server: node3 (Slave): TaskManager
- Upload and extract
- Upload
- Extract
tar -zxvf flink-1.10.0-bin-scala_2.11.tgz -C ../server
- Create a symlink
cd /export/server
ln -s flink-1.10.0/ flink
- Edit flink-conf.yaml
vim /export/server/flink/conf/flink-conf.yaml
jobmanager.rpc.address: node1
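Besides the RPC address, a couple of other flink-conf.yaml keys are commonly set at this point; a minimal sketch (the values are examples, not requirements):

```yaml
# address of the JobManager for this standalone cluster
jobmanager.rpc.address: node1
# task slots per TaskManager; often matched to CPU cores (example value)
taskmanager.numberOfTaskSlots: 2
# default parallelism for jobs that do not set one (example value)
parallelism.default: 2
```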
- Edit masters
vim /export/server/flink/conf/masters
node1:8081
- Edit slaves
vim /export/server/flink/conf/slaves
node1
node2
node3
- Add the HADOOP_CONF_DIR environment variable
vim /etc/profile
export HADOOP_CONF_DIR=/export/server/hadoop/etc/hadoop
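After editing /etc/profile, the variable can be checked in the current shell; a quick sketch (path assumed to match the Hadoop install above):

```shell
# set and verify HADOOP_CONF_DIR in the current shell
export HADOOP_CONF_DIR=/export/server/hadoop/etc/hadoop
echo "$HADOOP_CONF_DIR"   # prints /export/server/hadoop/etc/hadoop
```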
- Distribute to the other machines
scp -r /export/server/flink node2:/export/server/flink
scp -r /export/server/flink node3:/export/server/flink
scp /etc/profile node2:/etc/profile
scp /etc/profile node3:/etc/profile
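The four scp commands above can be collected into a small deploy script; a sketch assuming the hostnames and paths from the cluster layout (it only generates the script, since the nodes may not be reachable from here):

```shell
# write a one-shot deploy script covering both flink and /etc/profile
cat > deploy-flink.sh <<'EOF'
#!/bin/sh
for host in node2 node3; do
  scp -r /export/server/flink "${host}:/export/server/"
  scp /etc/profile "${host}:/etc/profile"
done
EOF
chmod +x deploy-flink.sh
```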
- source
Run on all machines:
source /etc/profile
- Start the cluster
Start the cluster:
/export/server/flink/bin/start-cluster.sh
Start the history server:
/export/server/flink/bin/historyserver.sh start
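The history server only lists jobs that have been archived, and archiving is configured in flink-conf.yaml. A minimal sketch (the HDFS path is an example):

```yaml
# the JobManager uploads finished-job archives here
jobmanager.archive.fs.dir: hdfs://node1:8020/flink/completed-jobs/
# the history server reads archives from the same location
historyserver.archive.fs.dir: hdfs://node1:8020/flink/completed-jobs/
# history server web UI port (8082 is the default)
historyserver.web.port: 8082
```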
- WebUI
http://node1:8081/#/overview
Test
Run the official example job:
- HDFS and the Flink cluster must be running
- Create a words.txt file containing some words under /wordcount/input on HDFS
/export/server/flink/bin/flink run /export/server/flink/examples/batch/WordCount.jar \
--input hdfs://node1:8020/wordcount/input/words.txt \
--output hdfs://node1:8020/wordcount/output/result.txt \
--parallelism 2
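The words.txt input referenced above can be prepared locally and then pushed to HDFS; a sketch (the words are arbitrary, and the hdfs commands are shown commented out because they need a running cluster):

```shell
# create a small sample input locally
printf 'hello flink\nhello hadoop\nhello world\n' > words.txt
wc -l < words.txt   # 3
# upload to HDFS (uncomment when HDFS is up):
# hdfs dfs -mkdir -p /wordcount/input
# hdfs dfs -put words.txt /wordcount/input/words.txt
```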