Flink has three deployment modes: local, standalone cluster, and Flink on YARN. This post covers the third, Flink on YARN, which is the most common in production.
First, start Flink on YARN.
A problem appears right away:
The first launch fails with: ClassNotFoundException: yarn.exceptions.YarnException
The fix is simple: put the official flink-shaded-hadoop-2-uber-2.7.5-7.0.jar into Flink's lib directory.
Since this jar is hosted on servers abroad and downloads slowly, I have also uploaded it to CSDN for convenience:
flink-shaded-hadoop-2-uber-2.7.5-7.0.jar
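If you prefer to fetch the jar yourself, it is published on Maven Central. A minimal sketch, assuming FLINK_HOME points at your install (the path below matches the one in the log later in this post; adjust it for your machine):

```shell
# Assumed install path for illustration; point FLINK_HOME at your actual Flink directory.
FLINK_HOME=/data4/soft/flink-1.10.1
JAR=flink-shaded-hadoop-2-uber-2.7.5-7.0.jar

# Download from Maven Central (can be slow from some networks) and
# drop the jar straight into Flink's lib/ directory:
wget "https://repo.maven.apache.org/maven2/org/apache/flink/flink-shaded-hadoop-2-uber/2.7.5-7.0/${JAR}" \
     -P "${FLINK_HOME}/lib/"
```

Note the version suffix must match your Hadoop line (here a Hadoop 2.7.5 build, shaded release 7.0); a mismatched uber jar can cause other class-loading errors.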
Start a YARN session:
bin/yarn-session.sh -n 4 -s 2 -jm 1000 -tm 1000
-n : number of TaskManagers, analogous to the number of executors in Spark
-s : number of slots per TaskManager (default 1), analogous to executor-cores; a common recommendation is to set the slot count to the number of cores on each machine
-jm : JobManager memory, analogous to driver-memory
-tm : memory per TaskManager, analogous to executor-memory
After starting, you should see output like the following.
bin/yarn-session.sh -n 4 -s 2 -jm 1000 -tm 1000
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/data4/soft/flink-1.10.1/lib/slf4j-log4j12-1.7.15.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.4.2.0-258/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2020-06-05 16:47:56,765 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.address, localhost
2020-06-05 16:47:56,766 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.port, 6123
2020-06-05 16:47:56,767 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.heap.size, 1024m
2020-06-05 16:47:56,767 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.memory.process.size, 1728m
2020-06-05 16:47:56,767 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.numberOfTaskSlots, 1
2020-06-05 16:47:56,767 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: parallelism.default, 1
2020-06-05 16:47:56,768 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.execution.failover-strategy, region
2020-06-05 16:47:57,347 WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2020-06-05 16:47:57,555 INFO org.apache.flink.runtime.security.modules.HadoopModule - Hadoop user set to hdfs (auth:SIMPLE), credentials check status: true
2020-06-05 16:47:57,592 INFO org.apache.flink.runtime.security.modules.JaasModule - Jaas file will be created as /tmp/jaas-1514014952401295599.conf.
2020-06-05 16:47:57,602 WARN org.apache.flink.yarn.cli.FlinkYarnSessionCli - The configuration directory ('/data4/soft/flink-1.10.1/conf') already contains a LOG4J config file.If you want to use logback, then please delete or rename the log configuration file.
2020-06-05 16:47:58,227 INFO org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: http://tsl-hadoop-02:8188/ws/v1/timeline/
2020-06-05 16:47:58,466 INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at tsl-hadoop-08/10.0.0.9:8050
2020-06-05 16:47:58,755 INFO org.apache.flink.runtime.clusterframework.TaskExecutorProcessUtils - The derived from fraction jvm overhead memory (172.800mb (181193935 bytes)) is less than its min value 192.000mb (201326592 bytes), min value will be used instead
2020-06-05 16:47:58,954 WARN org.apache.flink.yarn.YarnClusterDescriptor - The JobManager or TaskManager memory is below the smallest possible YARN Container size. The value of 'yarn.scheduler.minimum-allocation-mb' is '2048'. Please increase the memory size.YARN will allocate the smaller containers but the scheduler will account for the minimum-allocation-mb, maybe not all instances you requested will start.
2020-06-05 16:47:58,954 INFO org.apache.flink.yarn.YarnClusterDescriptor - Cluster specification: ClusterSpecification{masterMemoryMB=2048, taskManagerMemoryMB=1728, slotsPerTaskManager=1}
2020-06-05 16:47:59,533 WARN org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory - The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
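The memory-related lines near the end are worth decoding. With taskmanager.memory.process.size = 1728m, Flink derives a JVM overhead of 1728 × 0.1 = 172.8mb, which falls below the minimum, so the minimum is used instead; that is exactly what the INFO line reports, and the subsequent WARN explains that YARN's 2048mb minimum container size will round the JobManager request up (masterMemoryMB=2048 in the cluster specification). A quick sketch of the overhead arithmetic, assuming Flink 1.10's default fraction (0.1) and minimum (192mb):

```shell
# Flink 1.10 defaults (assumed): taskmanager.memory.jvm-overhead.fraction=0.1,
# taskmanager.memory.jvm-overhead.min=192mb.
process_mb=1728          # taskmanager.memory.process.size from the log above
min_overhead_mb=192

# Derived overhead: fraction of the total process size.
derived=$(awk -v p="$process_mb" 'BEGIN { printf "%.1f", p * 0.1 }')

# Effective overhead: clamped to the configured minimum.
overhead=$(awk -v d="$derived" -v m="$min_overhead_mb" \
  'BEGIN { print ((d < m) ? m : d) }')

echo "derived=${derived}mb effective=${overhead}mb"
```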
Then open the YARN application web UI; if you see the corresponding application running there, Flink on YARN has started successfully.
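Besides the web UI, you can confirm the session from the command line and then submit a job to it. The commands below are a sketch; the Flink CLI finds the running session automatically via the YARN properties file written at session startup:

```shell
# List running YARN applications; a healthy session shows up
# in RUNNING state (the application name varies by Flink version).
yarn application -list

# Submit a job to the running session. The WordCount example jar
# ships with the Flink distribution under examples/.
bin/flink run ./examples/batch/WordCount.jar
```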