Flink AsyncDataStream Asynchronous Requests: Notes on a Production Issue

We were recently building a real-time statistics feature. Early in development, I used Flink's AsyncDataStream.orderedWait() to issue asynchronous requests and land the computed statistics in a database; along the way the job also had to look up dimension-table data in MongoDB. With the small data volumes of local testing this worked fine, but once the job ran in production, the following exception appeared:

java.util.concurrent.RejectedExecutionException: java.lang.IllegalStateException: Mailbox is in state CLOSED, but is required to be in state OPEN for put operations.
	at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxExecutorImpl.execute(MailboxExecutorImpl.java:60)
	at org.apache.flink.streaming.api.operators.async.AsyncWaitOperator$ResultHandler.processInMailbox(AsyncWaitOperator.java:335)
	at org.apache.flink.streaming.api.operators.async.AsyncWaitOperator$ResultHandler.complete(AsyncWaitOperator.java:330)
	at com.it.flink.base.sink.IndexStatisticsAsyncFunc.lambda$asyncInvoke$1(IndexStatisticsAsyncFunc.java:103)
	at io.vertx.ext.jdbc.impl.JDBCClientImpl.lambda$null$8(JDBCClientImpl.java:295)
	at io.vertx.core.impl.ContextImpl.executeTask(ContextImpl.java:369)
	at io.vertx.core.impl.EventLoopContext.lambda$executeAsync$0(EventLoopContext.java:38)
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:510)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:518)
	at io.netty.util.concurrent.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1044)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalStateException: Mailbox is in state CLOSED, but is required to be in state OPEN for put operations.
	at org.apache.flink.streaming.runtime.tasks.mailbox.TaskMailboxImpl.checkPutStateConditions(TaskMailboxImpl.java:265)
	at org.apache.flink.streaming.runtime.tasks.mailbox.TaskMailboxImpl.put(TaskMailboxImpl.java:193)
	at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxExecutorImpl.execute(MailboxExecutorImpl.java:58)
	... 13 more

The problem was clear: there were not enough MongoDB connections. Under production load the dimension-table lookups could not get connections, the async requests failed, and the task shut down; the trace above is a symptom of that shutdown, because once the task is closing, its mailbox is in state CLOSED, and the late async callbacks that try to deliver results through AsyncWaitOperator$ResultHandler are rejected.

Fix: I removed the async request path and wrote the results to the database directly. After that change the problem no longer appeared. A minimal sketch of the synchronous alternative follows.
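For illustration, here is a hedged sketch of what the synchronous path can look like: a RichSinkFunction that writes each record with a blocking JDBC call. The class name, JDBC URL, table, and columns are all hypothetical (the actual job used a Vert.x JDBC client), so treat this as the general pattern, not the post's implementation.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

// Synchronous JDBC sink: each record is written before the next one is processed,
// so no async callback can outlive the task. URL, table, and columns are hypothetical.
public class IndexStatisticsJdbcSink extends RichSinkFunction<Tuple2<String, Long>> {
    private transient Connection connection;
    private transient PreparedStatement statement;

    @Override
    public void open(Configuration parameters) throws Exception {
        connection = DriverManager.getConnection("jdbc:mysql://db-host:3306/stats", "user", "pass");
        statement = connection.prepareStatement(
                "INSERT INTO index_statistics (metric, cnt) VALUES (?, ?)");
    }

    @Override
    public void invoke(Tuple2<String, Long> value, Context context) throws Exception {
        statement.setString(1, value.f0);
        statement.setLong(2, value.f1);
        statement.executeUpdate(); // blocking write: completes before the next record
    }

    @Override
    public void close() throws Exception {
        if (statement != null) statement.close();
        if (connection != null) connection.close();
    }
}

Because invoke() blocks until the write finishes, no callback can outlive the task, which is exactly why the mailbox error disappears on this path.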

There is one more situation worth recording. While troubleshooting the problem above, I tried closing the MongoDB connection after every query via try-with-resources:

// try-with-resources closes the client returned by MongoConnection.getInstance() when the block exits
try (MongoClient mongoClient = MongoConnection.getInstance()) { /* run the lookup */ }

But at runtime this failed with the following error:

ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console. Set system property 'org.apache.logging.log4j.simplelog.StatusLogger.level' to TRACE to show Log4j2 internal initialization logging.
java.lang.IllegalStateException: state should be: open
	at com.mongodb.assertions.Assertions.isTrue(Assertions.java:70)
	at com.mongodb.connection.BaseCluster.selectServer(BaseCluster.java:82)
	at com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.<init>(ClusterBinding.java:75)
	at com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.<init>(ClusterBinding.java:71)
	at com.mongodb.binding.ClusterBinding.getReadConnectionSource(ClusterBinding.java:63)
	at com.mongodb.operation.OperationHelper.withConnection(OperationHelper.java:402)
	at com.mongodb.operation.FindOperation.execute(FindOperation.java:510)
	at com.mongodb.operation.FindOperation.execute(FindOperation.java:81)
	at com.mongodb.Mongo.execute(Mongo.java:836)
	at com.mongodb.Mongo$2.execute(Mongo.java:823)
	at com.mongodb.OperationIterable.iterator(OperationIterable.java:47)
	at com.mongodb.FindIterableImpl.iterator(FindIterableImpl.java:151)
	at com.it.flink.base.source.MongoUpdateSelectSource.getDocumentByPrimaryKey(MongoUpdateSelectSource.java:55)
	at com.it.flink.base.transform.EffectiveFlatMapFunc$MongoUpdateType.getMessage(EffectiveFlatMapFunc.java:213)
	at com.it.flink.base.transform.EffectiveFlatMapFunc.flatMap(EffectiveFlatMapFunc.java:59)
	at com.it.flink.base.transform.EffectiveFlatMapFunc.flatMap(EffectiveFlatMapFunc.java:24)
	at org.apache.flink.streaming.api.operators.StreamFlatMap.processElement(StreamFlatMap.java:50)
	at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:641)
	at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:616)
	at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:596)
	at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:730)
	at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:708)
	at org.apache.flink.streaming.api.operators.StreamSourceContexts$NonTimestampContext.collect(StreamSourceContexts.java:104)
	at org.apache.flink.streaming.api.operators.StreamSourceContexts$NonTimestampContext.collectWithTimestamp(StreamSourceContexts.java:111)
	at org.apache.flink.streaming.connectors.kafka.internals.AbstractFetcher.emitRecordWithTimestamp(AbstractFetcher.java:398)
	at org.apache.flink.streaming.connectors.kafka.internal.Kafka010Fetcher.emitRecord(Kafka010Fetcher.java:91)
	at org.apache.flink.streaming.connectors.kafka.internal.Kafka09Fetcher.runFetchLoop(Kafka09Fetcher.java:156)
	at org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase.run(FlinkKafkaConsumerBase.java:718)
	at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:100)
	at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:63)
	at org.apache.flink.streaming.runtime.tasks.SourceStreamTask$LegacySourceFunctionThread.run(SourceStreamTask.java:200)

This shows the MongoDB client must not be closed after each query. MongoConnection.getInstance() hands out a shared singleton MongoClient, so the try-with-resources block closes that shared instance; every later lookup then finds the client in a closed state and fails with "state should be: open". A MongoClient maintains its own internal connection pool and is designed to live for the lifetime of the application: create it once, reuse it for every query, and close it only when the job shuts down, as in the sketch below.
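A hedged sketch of that lifecycle inside a Flink function, with the client opened once in open() and closed only in close(); the class, host, database, and collection names here are illustrative, not the post's actual EffectiveFlatMapFunc.

import com.mongodb.MongoClient;
import com.mongodb.client.MongoCollection;
import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.util.Collector;
import org.bson.Document;

// One MongoClient per task, alive for the whole job: opened once, reused for
// every lookup, closed only when the task shuts down.
public class DimensionLookupFlatMap extends RichFlatMapFunction<String, Document> {
    private transient MongoClient client;
    private transient MongoCollection<Document> dimTable;

    @Override
    public void open(Configuration parameters) {
        client = new MongoClient("mongo-host", 27017);
        dimTable = client.getDatabase("dim").getCollection("dim_table");
    }

    @Override
    public void flatMap(String key, Collector<Document> out) {
        // Each lookup reuses the client's internal connection pool.
        Document doc = dimTable.find(new Document("_id", key)).first();
        if (doc != null) {
            out.collect(doc);
        }
    }

    @Override
    public void close() {
        if (client != null) {
            client.close(); // safe here: the task is done with MongoDB
        }
    }
}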

With that, the immediate problem is solved; as a follow-up I still need to work out exactly why the async request path ran into the mailbox error above.
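One avenue worth checking is the timeout path: AsyncFunction#timeout completes the result future exceptionally by default, which fails the task, and any callbacks still in flight afterwards would then hit the now-CLOSED mailbox, which would match the trace above. Below is a hedged sketch of overriding the hook so a slow lookup degrades to an empty result instead; the class name and lookup body are illustrative, not the post's actual IndexStatisticsAsyncFunc.

import java.util.Collections;
import java.util.concurrent.CompletableFuture;
import org.apache.flink.streaming.api.functions.async.ResultFuture;
import org.apache.flink.streaming.api.functions.async.RichAsyncFunction;

// By default AsyncFunction#timeout completes the future exceptionally and fails
// the job; overriding it lets a slow lookup degrade to an empty result instead.
public class DimLookupAsyncFunc extends RichAsyncFunction<String, String> {

    @Override
    public void asyncInvoke(String input, ResultFuture<String> resultFuture) {
        CompletableFuture
                .supplyAsync(() -> lookup(input)) // the real lookup ran on Vert.x
                .thenAccept(result -> resultFuture.complete(Collections.singleton(result)));
    }

    @Override
    public void timeout(String input, ResultFuture<String> resultFuture) {
        // Emit nothing rather than throwing, so one slow request does not kill the task.
        resultFuture.complete(Collections.emptyList());
    }

    private String lookup(String input) {
        return input; // placeholder for the actual dimension lookup
    }
}

Whether an empty result is acceptable depends on the semantics of the statistics; silently dropping a record may or may not be tolerable for a given job.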

Flink's async I/O lets a streaming job issue many asynchronous requests concurrently and receive their responses as they complete, which raises processing throughput. It is governed by two parameters. The timeout defines how long an async request may go unanswered before it is considered failed, guarding against requests that never get a response. The capacity defines how many async requests may be in flight at the same time, bounding the backlog of pending requests. [1]

Flink provides two modes for ordering the result records. In unordered mode, a result record is emitted as soon as its async request finishes, so the order of records can change across the async I/O operator; this mode has the lowest latency and the least overhead, and suits jobs that use processing time as their basic time characteristic. In ordered mode, results are emitted in the same order in which the requests were triggered; to achieve this, the operator buffers each result record until all records ahead of it have been emitted (or timed out). Ordered mode therefore adds some latency and checkpoint overhead, because records and results must be kept in checkpointed state for longer. [3]

In short, Flink's async I/O bounds concurrency and waiting time through the capacity and timeout parameters to improve streaming throughput, and the unordered/ordered modes control the order of the emitted results.

Reference ([1], [2], [3]): Flink之外部数据访问的异步 I/O, https://blog.csdn.net/weixin_45366499/article/details/115265800
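To make the timeout and capacity parameters concrete, here is a minimal wiring sketch with AsyncDataStream, reusing the hypothetical DimLookupAsyncFunc from above; the 5-second timeout and capacity of 100 are illustrative values only, not recommendations.

import java.util.concurrent.TimeUnit;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.AsyncDataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class AsyncIoWiring {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        DataStream<String> input = env.fromElements("k1", "k2", "k3");

        // orderedWait preserves input order; each result is buffered until all
        // earlier records have been emitted (or timed out).
        DataStream<String> ordered = AsyncDataStream.orderedWait(
                input,
                new DimLookupAsyncFunc(), // the async function sketched above
                5, TimeUnit.SECONDS,      // timeout: a request unanswered after 5s is failed
                100);                     // capacity: at most 100 requests in flight

        // unorderedWait has the same signature but emits each result as soon as
        // its request completes, at the cost of reordering the stream.
        DataStream<String> unordered = AsyncDataStream.unorderedWait(
                input, new DimLookupAsyncFunc(), 5, TimeUnit.SECONDS, 100);

        ordered.print();
        unordered.print();
        env.execute("async-io-demo");
    }
}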
