[Flink] Checkpoint expired before completing.

While synchronizing data with Flink, the job failed with the error "Checkpoint expired before completing."


11:32:34,455 WARN  org.apache.flink.runtime.checkpoint.CheckpointFailureManager [Checkpoint Timer]  - Failed to trigger or complete checkpoint 4 for job 1b1d41031ea45d15bdb3324004c2d749. (2 consecutive failed attempts so far)
org.apache.flink.runtime.checkpoint.CheckpointException: Checkpoint expired before completing.
	at org.apache.flink.runtime.checkpoint.CheckpointCoordinator$CheckpointCanceller.run(CheckpointCoordinator.java:2143)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:266)
	at java.util.concurrent.FutureTask.run(FutureTask.java)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
11:32:34,459 INFO  org.jobslink.flink.sink.OperateMysqlDataSink                 [Source: CDC Sourceorg.jobslink.flink.TradeAndWorkTypeAndSkillsCDCJob -> (Filter -> Flat Map -> Filter -> (Sink: Print to Std. Out, Sink: sink jk_skills_base), Filter -> Flat Map -> Filter -> (Sink: Print to Std. Out, Sink: sink jk_trade_base), Filter -> Flat Map -> Filter -> (Sink: Print to Std. Out, Sink: sink jk_worktypes_base)) (1/1)#0]  - READ isExitSql is : [ SELECT count(1) count from jobslink_data_platform.src_skills_base where id= 1325753409319084034 ] 
11:32:34,468 INFO  org.apache.flink.runtime.jobmaster.JobMaster                 [flink-akka.actor.default-dispatcher-9]  - Trying to recover from a global failure.
org.apache.flink.util.FlinkRuntimeException: Exceeded checkpoint tolerable failure threshold.
	at org.apache.flink.runtime.checkpoint.CheckpointFailureManager.checkFailureAgainstCounter(CheckpointFailureManager.java:206)
	at org.apache.flink.runtime.checkpoint.CheckpointFailureManager.handleJobLevelCheckpointException(CheckpointFailureManager.java:169)
	at org.apache.flink.runtime.checkpoint.CheckpointFailureManager.handleCheckpointException(CheckpointFailureManager.java:122)
	at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.abortPendingCheckpoint(CheckpointCoordinator.java:2082)
	at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.abortPendingCheckpoint(CheckpointCoordinator.java:2061)
	at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.access$600(CheckpointCoordinator.java:98)
	at org.apache.flink.runtime.checkpoint.CheckpointCoordinator$CheckpointCanceller.run(CheckpointCoordinator.java:2143)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:266)
	at java.util.concurrent.FutureTask.run(FutureTask.java)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
11:32:34,470 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph       [flink-akka.actor.default-dispatcher-9]  - Job org.jobslink.flink.TradeAndWorkTypeAndSkillsCDCJob (1b1d41031ea45d15bdb3324004c2d749) switched from state RUNNING to RESTARTING.
11:32:34,471 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph       [flink-akka.actor.default-dispatcher-9]  - Source: CDC Sourceorg.jobslink.flink.TradeAndWorkTypeAndSkillsCDCJob -> (Filter -> Flat Map -> Filter -> (Sink: Print to Std. Out, Sink: sink base), Filter -> Flat Map -> Filter -> (Sink: Print to Std. Out, Sink: sink base), Filter -> Flat Map -> Filter -> (Sink: Print to Std. Out, Sink: sink base)) (1/1) (3525ceb58f2dc3264812966ec8600a19) switched from RUNNING to CANCELING.

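The checkpoint timed out: it did not complete within the configured checkpoint timeout, and once the consecutive failures exceeded the tolerable threshold the job was restarted.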
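The sequence in the log ("2 consecutive failed attempts so far", then "Exceeded checkpoint tolerable failure threshold") can be pictured with a small counter. This is only a simplified sketch, not Flink's code: the class `CheckpointFailureCounter` and its methods are hypothetical, and the tolerable count of 1 is an assumption chosen because it reproduces the log, not a value stated anywhere in the job.

```java
/** Simplified model of a consecutive-failure counter (hypothetical, not Flink code). */
final class CheckpointFailureCounter {
    private final int tolerableFailures;
    private int consecutiveFailures = 0;

    CheckpointFailureCounter(int tolerableFailures) {
        if (tolerableFailures < 0) {
            throw new IllegalArgumentException("tolerableFailures must be >= 0");
        }
        this.tolerableFailures = tolerableFailures;
    }

    /** A completed checkpoint resets the streak of failures. */
    void onCheckpointCompleted() {
        consecutiveFailures = 0;
    }

    /** Records an expired checkpoint; returns true when the job should fail over. */
    boolean onCheckpointExpired() {
        consecutiveFailures++;
        return consecutiveFailures > tolerableFailures;
    }

    int consecutiveFailures() {
        return consecutiveFailures;
    }

    public static void main(String[] args) {
        // Assumed threshold: one tolerated failure.
        CheckpointFailureCounter counter = new CheckpointFailureCounter(1);
        System.out.println(counter.onCheckpointExpired()); // false: first failure is tolerated
        System.out.println(counter.onCheckpointExpired()); // true: second consecutive failure
    }
}
```

Under this model, raising the threshold only delays the restart; the checkpoints themselves still expire until the timeout or the checkpoint duration changes.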

Reconfigure the job's checkpoint parameters as follows:

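// Checkpoint timeout in milliseconds; a checkpoint still running after this long is aborted as expired.
env.getCheckpointConfig().setCheckpointTimeout(60000);
// Interval in milliseconds at which checkpoints are periodically scheduled.
env.getCheckpointConfig().setCheckpointInterval(60000);
// Maximum number of checkpoint attempts that may be in progress at the same time.
env.getCheckpointConfig().setMaxConcurrentCheckpoints(500);
// Minimum pause in milliseconds between checkpoint attempts.
env.getCheckpointConfig().setMinPauseBetweenCheckpoints(500);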
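To see how the interval, minimum pause, and timeout above interact, here is a rough stand-alone model using the same three values. It is a sketch under stated assumptions, not Flink's scheduler: `CheckpointSchedule` is a hypothetical class, it assumes only one checkpoint runs at a time, and it assumes an expired checkpoint is followed by the same minimum pause as a completed one.

```java
/** Simplified timing model for periodic checkpoints (hypothetical, not Flink code). */
final class CheckpointSchedule {
    private final long intervalMs;
    private final long minPauseMs;
    private final long timeoutMs;

    CheckpointSchedule(long intervalMs, long minPauseMs, long timeoutMs) {
        this.intervalMs = intervalMs;
        this.minPauseMs = minPauseMs;
        this.timeoutMs = timeoutMs;
    }

    /** True if a checkpoint needing durationMs would be aborted by the timeout. */
    boolean expires(long durationMs) {
        return durationMs > timeoutMs;
    }

    /** Time at which the checkpoint ends, either by completing or by expiring. */
    long endTime(long startMs, long durationMs) {
        return startMs + Math.min(durationMs, timeoutMs);
    }

    /** Earliest next trigger: respects both the interval and the minimum pause. */
    long nextTrigger(long startMs, long durationMs) {
        long byInterval = startMs + intervalMs;
        long byMinPause = endTime(startMs, durationMs) + minPauseMs;
        return Math.max(byInterval, byMinPause);
    }

    public static void main(String[] args) {
        // Values taken from the configuration above: 60 s interval, 500 ms pause, 60 s timeout.
        CheckpointSchedule schedule = new CheckpointSchedule(60_000, 500, 60_000);
        System.out.println(schedule.nextTrigger(0, 20_000)); // 60000: the interval dominates
        System.out.println(schedule.nextTrigger(0, 59_800)); // 60300: the minimum pause dominates
        System.out.println(schedule.expires(90_000));        // true: exceeds the 60 s timeout
        System.out.println(schedule.nextTrigger(0, 90_000)); // 60500: expiry plus the pause
    }
}
```

In this model, a checkpoint that regularly needs longer than the timeout expires every time, so the timeout has to sit comfortably above the checkpoint duration actually observed for the job.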

Alternatively, set the same options in Flink's configuration file, flink-conf.yaml.
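A possible flink-conf.yaml equivalent of the Java settings is sketched below. The key names are Flink's `execution.checkpointing.*` options as found in recent releases, which is an assumption about the version in use, so check them against the configuration reference for your Flink version. The last key does not appear in the original post and its value is purely illustrative.

```yaml
# Cluster-wide checkpoint defaults mirroring the Java settings above.
execution.checkpointing.interval: 60s
execution.checkpointing.timeout: 60s
execution.checkpointing.min-pause: 500ms
# Not in the original post: how many consecutive checkpoint failures are
# tolerated before the job fails over (illustrative value).
execution.checkpointing.tolerable-failed-checkpoints: 3
```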

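A checkpoint expires when it does not complete within the configured checkpoint timeout. An expired checkpoint is discarded, so Flink cannot restore from it, and repeated expirations can cause the job to fail.

Checkpoint expiry usually has one of the following causes:

1. The checkpoint timeout is too short. The timeout is set with `execution.checkpointing.timeout` in the Flink configuration file or with `setCheckpointTimeout` in code. If the value is too small, checkpoints expire before they can finish. Adjust it to the application's requirements and state size.
2. The checkpoint takes too long. If checkpointing is slow and the timeout is short, the checkpoint expires before it completes. Optimizing the job's operator logic, adding resources, or adjusting the parallelism can shorten checkpoint duration.
3. Insufficient resources. If the cluster does not have enough memory or disk space to store and process checkpoint data, checkpoints may not finish in time. Add cluster resources or adjust the job's parallelism.
4. Network latency. If checkpoint data runs into network delays or failures in transit, the checkpoint may not complete in time. Check network connectivity, increase bandwidth, or tune the network configuration.

Investigate and adjust according to which of these applies to your job.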
