FlinkX Usage
1. Server Preparation
1.1 Starting Hadoop, HBase, and Kafka
hadoop:
cd /opt/install/hadoop-2.5.2
sh hadoop-start.sh
hbase:
cd /opt/install/zookeeper-3.4.5
./bin/zkServer.sh start ./conf/zoo.cfg
cd /opt/install/hbase-0.98.6-hadoop2
bin/hbase-daemon.sh start master
bin/hbase-daemon.sh start regionserver
kafka:
cd /opt/install/kafka_2.11-2.2.0
./bin/kafka-server-start.sh -daemon config/server.properties
./bin/kafka-topics.sh --zookeeper CentOS8:2181 --list
./bin/kafka-topics.sh --zookeeper CentOS8:2181 --describe --topic topic01
./bin/kafka-console-producer.sh --broker-list CentOS8:9092 --topic topic01
2. FlinkX Submission Modes
2.1 Running a Job in Local Mode
cd /opt/install/flinkx-1.8_release
bin/flinkx \
-mode local \
-job /opt/install/flinkx-1.8_release/job/stream.json \
-pluginRoot /opt/install/flinkx-1.8_release/plugins
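The job file passed with -job is a JSON description of the sync task. The actual contents of stream.json are not shown in these notes; the following is only a minimal sketch of a stream-to-stream test job using FlinkX's streamreader and streamwriter plugins, with the parameter values chosen for illustration:

```json
{
  "job": {
    "content": [
      {
        "reader": {
          "name": "streamreader",
          "parameter": {
            "column": [ { "name": "id", "type": "int" } ],
            "sliceRecordCount": [ "100" ]
          }
        },
        "writer": {
          "name": "streamwriter",
          "parameter": { "print": true }
        }
      }
    ],
    "setting": {
      "speed": { "channel": 1 }
    }
  }
}
```

The streamreader generates rows in memory and the streamwriter prints them, which makes this a convenient smoke test before wiring up real source and sink plugins.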
2.2 Running in Standalone Mode
bin/flinkx \
-mode standalone \
-job /opt/install/flinkx-1.8_release/job/stream.json \
-pluginRoot /opt/install/flinkx-1.8_release/plugins \
-flinkconf /opt/install/flink-1.8.1/conf \
-flinkLibJar /opt/install/flink-1.8.1/lib \
-confProp "{\"flink.checkpoint.interval\":60000}"
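The value passed to -confProp must be a single JSON string. The backslash-escaped quotes in the command above exist only so that bash passes the inner quotes through; what FlinkX actually receives is plain JSON, as this small sketch shows:

```shell
# Same escaping as in the -confProp argument above: the shell strips the
# backslashes, leaving a plain JSON object in the variable.
conf_prop="{\"flink.checkpoint.interval\":60000}"
echo "$conf_prop"
```

Here flink.checkpoint.interval is in milliseconds, so 60000 enables a checkpoint every 60 seconds.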
2.3 Running a Job in Yarn Session Mode
First make sure the YARN cluster is available, then manually start a yarn session:
cd /opt/install/flink-1.8.1
nohup ./bin/yarn-session.sh -n 1 -s 2 -jm 1024 -tm 1024 &
bin/flinkx \
-mode yarn \
-job /opt/install/flinkx-1.8_release/job/stream.json \
-pluginRoot /opt/install/flinkx-1.8_release/plugins \
-flinkconf /opt/install/flink-1.8.1/conf \
-yarnconf /opt/install/hadoop-2.7.5/etc/hadoop
---- Troubleshooting
Container [pid=6263,containerID=container_1494900155967_0001_02_000001] is running beyond virtual memory limits
The same virtual-memory error can be hit by any YARN application; the example below occurred when submitting a job with spark-submit in Spark client mode:
User: hadoop
Name: Spark Pi
Application Type: SPARK
Application Tags:
YarnApplicationState: FAILED
FinalStatus Reported by AM: FAILED
Started: 16-May-2017 10:03:02
Elapsed: 14sec
Tracking URL: History
Diagnostics: Application application_1494900155967_0001 failed 2 times due to AM Container for appattempt_1494900155967_0001_000002 exited with exitCode: -103
For more detailed output, check application tracking page:http://master:8088/proxy/application_1494900155967_0001/Then, click on links to logs of each attempt.
Diagnostics: Container [pid=6263,containerID=container_1494900155967_0001_02_000001] is running beyond virtual memory limits. Current usage: 107.3 MB of 1 GB physical memory used; 2.2 GB of 2.1 GB virtual memory used. Killing container.
This means the container needed 2.2 GB of memory while its virtual memory limit was only 2.1 GB, so YARN killed the container.
In this case spark.executor.memory was set to 1 GB, i.e. 1 GB of physical memory. YARN's default virtual-to-physical memory ratio is 2.1, so the virtual memory limit was 2.1 GB, less than the required 2.2 GB. The fix is to raise the virtual-to-physical ratio by adding a setting to yarn-site.xml:
<property>
  <name>yarn.nodemanager.vmem-pmem-ratio</name>
  <value>2.5</value>
</property>
Then restart YARN. With the higher ratio the container gets 2.5 GB of virtual memory, and the job runs without the error.
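The arithmetic behind the fix, using the numbers from the error message above, can be checked directly:

```shell
# A container with 1 GB of physical memory under the default ratio of 2.1
# is limited to 1 * 2.1 = 2.1 GB of virtual memory -- below the 2.2 GB the
# JVM requested, so YARN killed it. A ratio of 2.5 raises the limit to 2.5 GB.
default_limit=$(awk 'BEGIN{printf "%.1f", 1 * 2.1}')
raised_limit=$(awk 'BEGIN{printf "%.1f", 1 * 2.5}')
echo "default=${default_limit}GB raised=${raised_limit}GB"
```

Alternatively, the same error can be avoided by raising the container's physical memory request instead of the ratio, since the virtual limit is computed as physical memory times the ratio.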
2.4 Running a Job in Yarn Per-Job Mode
bin/flinkx \
-mode yarnPer \
-job /opt/install/flinkx-1.8_release/job/stream.json \
-pluginRoot /opt/install/flinkx-1.8_release/plugins \
-flinkconf /opt/install/flink-1.8.1/conf \
-yarnconf /opt/install/hadoop-2.7.5/etc/hadoop \
-flinkLibJar /opt/install/flink-1.8.1/lib \
-confProp "{\"flink.checkpoint.interval\":60000}" \
-pluginLoadMode classpath