Deploying the Flink 1.9.0 Service on CM 6.3.X
1. Download the Flink packages
Option 1: download the parcel, manifest.json, and CSD jar from the official Cloudera archive:
https://archive.cloudera.com/csa/1.0.0.0/parcels/
https://archive.cloudera.com/csa/1.0.0.0/csd/
Option 2: a Baidu Netdisk link containing the same three files; the official download is slow (roughly 3 hours).
Baidu Netdisk: https://pan.baidu.com/s/1DVG8z77wGOohQSamerqT_w  extraction code: 94ul
Place the files in the corresponding directories on the CM Server node:
FLINK-1.9.0-csa1.0.0.0-cdh6.3.0.jar goes under /opt/cloudera/csd/.
FLINK-1.9.0-csa1.0.0.0-cdh6.3.0-el7.parcel and manifest.json go into the httpd document root, in a newly created flink-1.9.0 directory; make sure that directory is reachable from an external browser.
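The repository setup above can be sketched as shell commands. The httpd document root path is environment-specific, so a temp directory stands in for it here (and an empty placeholder file stands in for the real parcel) so that the sketch runs anywhere; CM also expects a `.sha` checksum file next to each parcel in a local repository:

```shell
# Sketch only: a temp dir stands in for the real httpd document root
# (e.g. /var/www/html), and an empty file stands in for the parcel.
REPO="$(mktemp -d)/flink-1.9.0"
mkdir -p "$REPO"
# In a real deployment, copy the downloaded files here instead:
#   cp FLINK-1.9.0-csa1.0.0.0-cdh6.3.0-el7.parcel manifest.json "$REPO"/
touch "$REPO/FLINK-1.9.0-csa1.0.0.0-cdh6.3.0-el7.parcel"
# Generate the .sha file CM validates the parcel against after download:
sha1sum "$REPO"/FLINK-*.parcel | awk '{print $1}' \
  > "$REPO/FLINK-1.9.0-csa1.0.0.0-cdh6.3.0-el7.parcel.sha"
ls "$REPO"
```

With a real httpd root, verify reachability with a browser (or curl) against http://<cm-host>/flink-1.9.0/manifest.json before moving on.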
2. Deploy via the Cloudera Manager UI
2.1 Restart the CM Server service; Flink then shows up under Add Service.
[root@cdh1 flink-1.9.0]# service cloudera-scm-server restart
2.2 Activate Flink
Administration -> Settings -> Parcels -> Remote Parcel Repository URLs: add the URL of your local Flink repository.
Hosts -> Parcels -> Flink: run the three steps Download, Distribute, Activate.
After activation, the parcel status is shown in the figure.
2.3 Add the Flink service
Add Service -> select Flink -> Continue. Resources are limited here, so only a single node is used for now; keep the defaults and click through the remaining steps.
Deployment is now complete.
3. Test Flink
Create a word file and upload it to /tmp on HDFS:
[root@cdh4 streaming]# vim word.txt
[root@cdh4 streaming]# cat word.txt
aaa
bbb
ccc
aaa
cc
ccc
aaa
cc
[root@cdh4 streaming]# hadoop fs -put ./word.txt /tmp/
[root@cdh4 streaming]# pwd
/opt/cloudera/parcels/FLINK/lib/flink/examples/streaming
Run the Flink WordCount.jar:
[root@cdh4 examples]# flink run -m yarn-cluster -yn 2 -yjm 700 -ytm 700 /opt/cloudera/parcels/FLINK/lib/flink/examples/streaming/WordCount.jar --input hdfs://cdh2:8020/tmp/word.txt --output hdfs://cdh2:8020/tmp/result
Check the result:
[root@cdh4 streaming]# hadoop fs -get /tmp/result
[root@cdh4 streaming]# cat result
(aaa,1)
(bbb,1)
(ccc,1)
(aaa,2)
(cc,1)
(ccc,2)
(aaa,3)
(cc,2)
[root@cdh4 streaming]#
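Note that the streaming WordCount emits a running count: every input line updates the count for that word and immediately emits the new (word, count) pair, which is why aaa appears three times with increasing counts instead of once with a final total. The same semantics can be mimicked locally with awk:

```shell
# Replay the word.txt contents and print a running count per word,
# mirroring the streaming WordCount output format.
printf 'aaa\nbbb\nccc\naaa\ncc\nccc\naaa\ncc\n' |
  awk '{ count[$1]++; print "(" $1 "," count[$1] ")" }'
```

This produces the same eight lines as the HDFS result above, ending with (cc,2).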
Done.
Error 1: caused by setting -yjm and -ytm too small
20/07/09 09:42:10 ERROR cli.CliFrontend: Error while running the command.
org.apache.flink.client.deployment.ClusterDeploymentException: Couldn't deploy Yarn session cluster
......
.....
Caused by: java.lang.IllegalArgumentException: The configuration value 'containerized.heap-cutoff-min'='600' is larger than the total container memory 256
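The cause: in Flink 1.9, a YARN container reserves a heap cutoff of max(containerized.heap-cutoff-min, containerized.heap-cutoff-ratio x container memory) before sizing the JVM heap, so a container smaller than cutoff-min (600 MB here, per the log) cannot start at all. A rough sketch of the arithmetic, assuming the default cutoff ratio of 0.25:

```shell
# Heap-cutoff arithmetic for a 700 MB TaskManager container (-ytm 700).
TM_MB=700          # container memory requested via -ytm
CUTOFF_MIN=600     # containerized.heap-cutoff-min (value shown in the log)
RATIO_PCT=25       # containerized.heap-cutoff-ratio default, as a percentage
CUTOFF=$(( TM_MB * RATIO_PCT / 100 ))          # 175 MB from the ratio...
[ "$CUTOFF" -lt "$CUTOFF_MIN" ] && CUTOFF=$CUTOFF_MIN   # ...but the 600 MB floor wins
echo "heap left for the TaskManager JVM: $(( TM_MB - CUTOFF )) MB"
```

So 700 MB is roughly the minimum that works with these defaults, leaving only about 100 MB of heap; either raise -yjm/-ytm or lower containerized.heap-cutoff-min.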
Error 2: caused by a pre-existing /tmp/result in HDFS; deleting it fixes the error.
20/07/09 09:48:44 ERROR cli.CliFrontend: Error while running the command.
org.apache.flink.client.program.ProgramInvocationException: Could not retrieve the execution result. (JobID: e2c0bc2094f19cb7fb4cba44f12625b5)
...
...
Caused by: org.apache.flink.runtime.client.JobExecutionException: Cannot initialize task 'Keyed Aggregation -> Sink: Unnamed': File or directory already exists. Existing files and directories are not overwritten in NO_OVERWRITE mode. Use OVERWRITE mode to overwrite existing files and directories.