flink_kafka_yarn

flink

Problems

1. GC overhead limit exceeded: even after repeated collections, the GC still cannot reclaim enough heap space.

Solution: switch to the G1 garbage collector.
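
A minimal sketch of enabling G1 (assuming the JVM options are passed through flink-conf.yaml; env.java.opts is a standard Flink configuration key):

# flink-conf.yaml -- applies -XX:+UseG1GC to the JobManager and TaskManager JVMs
env.java.opts: -XX:+UseG1GC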

2. After changing the Flink parallelism, submission reports: Deployment took more than 60 seconds. Please check if the requested resources are available in the YARN cluster

Solution: usually the YARN queue does not have enough free resources for the requested containers; reduce the parallelism or free up/expand the queue.
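
To check how much capacity the queue currently has free, the YARN CLI can print queue status (queue name taken from the submit commands below):

yarn queue -status root.users.test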

3. Environment problem; try submitting a few more times. Symptom:

2021-03-09 11:55:48|main|ERROR|org.apache.flink.runtime.entrypoint.ClusterEntrypoint|runClusterEntrypoint|520 - Could not start cluster entrypoint YarnJobClusterEntrypoint.
org.apache.flink.runtime.entrypoint.ClusterEntrypointException: Failed to initialize the cluster entrypoint YarnJobClusterEntrypoint.
	at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:187)
	at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint.java:518)
	at org.apache.flink.yarn.entrypoint.YarnJobClusterEntrypoint.main(YarnJobClusterEntrypoint.java:119)
Caused by: java.lang.Exception: unable to establish the security context
	at org.apache.flink.runtime.security.SecurityUtils.install(SecurityUtils.java:73)
	at org.apache.flink.yarn.entrypoint.YarnEntrypointUtils.installSecurityContext(YarnEntrypointUtils.java:57)
	at org.apache.flink.yarn.entrypoint.YarnJobClusterEntrypoint.installSecurityContext(YarnJobClusterEntrypoint.java:58)
	at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:166)
	... 2 common frames omitted
Caused by: java.lang.RuntimeException: unable to generate a JAAS configuration file
	at org.apache.flink.runtime.security.modules.JaasModule.generateDefaultConfigFile(JaasModule.java:170)
	at org.apache.flink.runtime.security.modules.JaasModule.install(JaasModule.java:94)
	at org.apache.flink.runtime.security.SecurityUtils.install(SecurityUtils.java:67)
	... 5 common frames omitted
Caused by: java.nio.file.NoSuchFileException: /alidata1/soft/flink/tmp/jaas-6589995419609878833.conf
	at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
	at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
	at java.nio.file.Files.newByteChannel(Files.java:361)
	at java.nio.file.Files.createFile(Files.java:632)
	at java.nio.file.TempFileHelper.create(TempFileHelper.java:138)
	at java.nio.file.TempFileHelper.createTempFile(TempFileHelper.java:161)
	at java.nio.file.Files.createTempFile(Files.java:852)
	at org.apache.flink.runtime.security.modules.JaasModule.generateDefaultConfigFile(JaasModule.java:163)
	... 7 common frames omitted

4. Pending record count must be zero at this point

Reference: 【Flink基础】-- 写入 Kafka 的两种方式 (CSDN blog post on the two ways of writing to Kafka from Flink)

When the Flink sink parallelism is smaller than the number of partitions of the downstream topic, some partitions never receive any data.

package org.apache.flink.streaming.connectors.kafka.partitioner;

import org.apache.flink.util.Preconditions;

/**
 * Default partitioner: each Flink sink subtask writes to exactly one Kafka
 * partition, chosen as (subtask id % number of partitions).
 */
public class FlinkFixedPartitioner<T> extends FlinkKafkaPartitioner<T> {
    private int parallelInstanceId;

    public FlinkFixedPartitioner() {
    }

    @Override
    public void open(int parallelInstanceId, int parallelInstances) {
        Preconditions.checkArgument(parallelInstanceId >= 0, "Id of this subtask cannot be negative.");
        Preconditions.checkArgument(parallelInstances > 0, "Number of subtasks must be larger than 0.");
        // Remember which parallel subtask this partitioner instance belongs to.
        this.parallelInstanceId = parallelInstanceId;
    }

    @Override
    public int partition(T record, byte[] key, byte[] value, String targetTopic, int[] partitions) {
        Preconditions.checkArgument(partitions != null && partitions.length > 0, "Partitions of the target topic is empty.");
        // Every record from this subtask lands in the same partition.
        return partitions[this.parallelInstanceId % partitions.length];
    }
}

Flink takes the sink subtask ID modulo the number of Kafka partitions. For example:

With Flink parallelism 3 (F0, F1, F2) and 2 partitions (P0, P1): F0 -> P0, F1 -> P1, F2 -> P0.

With Flink parallelism 2 (F0, F1) and 3 partitions (P0, P1, P2): F0 -> P0, F1 -> P1, and P2 never receives data.

Therefore, the default partitioner has two pitfalls (a workaround sketch follows this list):

  • When the sink parallelism is lower than the number of topic partitions, some partitions receive no data at all.

  • After the topic's partition count is increased, the job must be restarted to discover the new partitions.
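
One common workaround, shown as a sketch (not from the original post; topic name, broker list, and schema are placeholders): pass an empty custom partitioner so the connector falls back to Kafka's own partitioning, letting every partition receive data regardless of sink parallelism.

import java.util.Optional;
import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer;
import org.apache.flink.streaming.connectors.kafka.partitioner.FlinkKafkaPartitioner;

public class NoFixedPartitionerExample {
    public static FlinkKafkaProducer<String> buildProducer() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "ip1:9092,ip2:9092,ip3:9092");

        // Optional.empty() means "do not use FlinkFixedPartitioner": records
        // are partitioned by the Kafka producer itself (key hash / round-robin),
        // so a sink with parallelism 2 can still fill a 3-partition topic.
        return new FlinkKafkaProducer<>(
                "test",                    // target topic
                new SimpleStringSchema(),  // value serialization
                props,
                Optional.<FlinkKafkaPartitioner<String>>empty());
    }
}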

5. Caused by: java.nio.file.NoSuchFileException: /alidata1/soft/flink/tmp/jaas-4073091767736825725.conf

Caused by: java.lang.RuntimeException: unable to generate a JAAS configuration file
	at org.apache.flink.runtime.security.modules.JaasModule.generateDefaultConfigFile(JaasModule.java:170)
	at org.apache.flink.runtime.security.modules.JaasModule.install(JaasModule.java:94)
	at org.apache.flink.runtime.security.SecurityUtils.install(SecurityUtils.java:67)
	... 5 more
Caused by: java.nio.file.NoSuchFileException: /alidata1/soft/flink/tmp/jaas-4073091767736825725.conf
	at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
	at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
	at java.nio.file.Files.newByteChannel(Files.java:361)
	at java.nio.file.Files.createFile(Files.java:632)
	at java.nio.file.TempFileHelper.create(TempFileHelper.java:138)
	at java.nio.file.TempFileHelper.createTempFile(TempFileHelper.java:161)
	at java.nio.file.Files.createTempFile(Files.java:852)
	at org.apache.flink.runtime.security.modules.JaasModule.generateDefaultConfigFile(JaasModule.java:163)

Cause: no Flink restart strategy was configured. Set one, for example a fixed-delay restart:

env.setRestartStrategy(RestartStrategies.fixedDelayRestart(3, 10)); // 3 attempts, 10 ms between attempts
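
In context, a minimal sketch (the class name is illustrative; the actual pipeline is omitted):

import org.apache.flink.api.common.restartstrategy.RestartStrategies;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class RestartStrategyExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Without a restart strategy, the first unrecoverable task failure
        // fails the whole job instead of restarting it from a checkpoint.
        env.setRestartStrategy(RestartStrategies.fixedDelayRestart(3, 10));
        // ... define sources, transformations and sinks here, then:
        env.execute("restart-strategy-example");
    }
}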

Flink on YARN submit command

-yqu specifies the YARN queue and -ynm sets a custom application name on YARN:

flink run \
-d \
-m yarn-cluster \
-p 4 \
-ys 1 \
-yjm 1024m \
-ytm 4096m \
-yqu root.users.test \
-ynm DataAPI-MarvelTradeMonitorJob \
-c com.ppdai.tsflink.druidHS.DruidLogToES6 \
-yD env.java.opts="-Dlogback.configurationFile=file:///var/lib/hadoop-hdfs/jar_flink/test/logback.xml" \
/var/lib/hadoop-hdfs/jar_flink/test/tsflink-druids-1.0-SNAPSHOT-jar-with-dependencies.jar




Submit and restore from a retained checkpoint ({JobName} and {jarName} are placeholders):

/var/lib/hadoop-hdfs/flink-1.9.0/bin/flink run -d -m yarn-cluster -p 2 -ys 1 -yjm 1024m -ytm 4096m -yqu root.users.test \
-ynm DataAPI-{JobName} \
-s hdfs://caasnameservice/flink-checkpoints/test/{JobName}/34375eb4cd7f24b8f4d8fc9d8913a7f0/chk-2460/_metadata \
-c com.test.realtime.{JobName} \
-yD env.java.opts="-Dlogback.configurationFile=file:///var/lib/hadoop-hdfs/jar_flink/test/logback.xml" \
/var/lib/hadoop-hdfs/jar_flink/test/{jarName}.jar  hdfs://caasnameservice /user/hdfs/test/{JobName}.properties




/var/lib/hadoop-hdfs/flink-1.9.0/bin/flink run \
-d \
-m yarn-cluster \
-p 2 \
-ys 1 \
-yjm 1024m \
-ytm 2048m \
-yqu root.users.test \
-ynm DataAPI-MarvelTradeMonitorJob \
-c com.test.datacloud.MarvelTradeMonitorJob \
-yD env.java.opts="-Dlogback.configurationFile=file:///var/lib/hadoop-hdfs/jar_flink/test/logback.xml" \
/var/lib/hadoop-hdfs/jar_flink/test/datacloud-query-task-1.0-SNAPSHOT-jar-with-dependencies.jar hdfs://nameservice1:8020 /user/hdfs/test/MarvelTradeMonitorJob.properties

savepoint

Save a savepoint to a specified directory and stop the job:

flink stop -p [savepointDir] jobId
/var/lib/hadoop-hdfs/flink-1.9.0/bin/flink savepoint -yid application_1578367242038_3489 bcb89804600c227efc4d30d8af3d3d00 hdfs://nameservice1/flink-savepoints/test/TBillRepaymentTradeJob

Note that flink savepoint only triggers a savepoint and leaves the job running; flink stop -p (see below) takes a savepoint and stops the job in one step.

Restore a job from a specified savepoint:

flink run -s [savepointDir] xxxx.jar
flink run -s hdfs://namenode01.td.com/tmp/flink/savepoints/savepoint-40dcc6-a90008f0f82f flink-app-jobs.jar
/var/lib/hadoop-hdfs/flink-1.9.0/bin/flink run -d -m yarn-cluster -p 2 -yjm 1024m -ytm 2048m -yqu root.users.test \
-ynm DataAPI-RepayResultRecordNewJob \
-s hdfs://caasnameservice/flink-checkpoints/test/RepayResultRecordNewJob/d2d5322888094e7ab16911dc01d927ef/chk-9/_metadata \
-c com.test.realtime.RepayResultRecordNewJob \
/var/lib/hadoop-hdfs/jar_flink/test/user-rt-compute-1.1-jar-with-dependencies.jar  hdfs://caasnameservice /user/hdfs/test/RepayResultRecordNewJob.properties

flink stop

/var/lib/hadoop-hdfs/flink-1.9.0/bin/flink stop -p hdfs://nameservice1/flink_savepoints/test/TBillRepaymentTradeJob -yid application_1578367242038_3489 bcb89804600c227efc4d30d8af3d3d00


 

yarn

View TaskManager logs

yarn logs -applicationId application_1575946844259_176643

List applications

yarn application -list

Kill an application

yarn application -kill <applicationId>

kafka

Check whether Kafka is running

ps -ef | grep server.properties

Stop Kafka

bin/kafka-server-stop.sh

Start Kafka

bin/kafka-server-start.sh -daemon config/server.properties

Create a topic

bin/kafka-topics.sh --create --zookeeper ip1:2181,ip2:2181,ip3:2181/kafka --replication-factor 2 --partitions 2 --topic test

List topics / describe a topic

bin/kafka-topics.sh --list --zookeeper ip:2181
bin/kafka-topics.sh --describe --zookeeper ip:2181 --topic test

Delete a topic

bin/kafka-topics.sh --delete --zookeeper ip:2181 --topic test

Produce messages to a topic (the console producer does not echo messages):

bin/kafka-console-producer.sh --broker-list ip1:9092,ip2:9092,ip3:9092 --topic test

Consume messages from a topic (messages are displayed):

bin/kafka-console-consumer.sh --bootstrap-server ip1:9092,ip2:9092,ip3:9092 --topic test
