Kafka exception: DefaultOffsetCommitCallback.onComplete(ConsumerCoordinator.java:537) - Offset commit failed

Exception details:

ConsumerCoordinator$DefaultOffsetCommitCallback.onComplete(ConsumerCoordinator.java:537) -Offset commit failed.
org.apache.kafka.clients.consumer.CommitFailedException: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured session.timeout.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing the session timeout or by reducing the maximum size of batches returned in poll() with max.poll.records.
        at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator$OffsetCommitResponseHandler.handle(ConsumerCoordinator.java:600)
        at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator$OffsetCommitResponseHandler.handle(ConsumerCoordinator.java:541)
        at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:679)
        at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:658)
Environment: HDP big-data platform, Kafka broker 2.0.0, Spark 2.3.2.

After a change in business logic, the Spark Streaming batch interval was increased to 5 minutes. As soon as the service was restarted, the error above appeared.

From the error message, the time between successive poll() calls exceeded the consumer's maximum allowed idle time, so the group coordinator concluded that the consumer had left the group and triggered a rebalance, assigning its partitions to another member. That is why the Spark Streaming consumer's subsequent offset commit failed.

Proposed fix: increase this idle-time parameter, max.poll.interval.ms.
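For reference, a minimal sketch of how this parameter is passed to the direct stream through the spark-streaming-kafka-0-10 API. It assumes an existing StreamingContext ssc; the topic name is hypothetical, while group.id and bootstrap.servers are taken from the startup log shown further below.

import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

// Consumer settings for the direct stream; the 1-hour value is illustrative.
val kafkaParams = Map[String, Object](
  "bootstrap.servers"  -> "192.168.1.177:6667,192.168.0.130:6667,192.168.0.117:6667",
  "key.deserializer"   -> classOf[StringDeserializer],
  "value.deserializer" -> classOf[StringDeserializer],
  "group.id"           -> "tinyeyePortScan",
  "enable.auto.commit" -> (false: java.lang.Boolean),
  // Maximum time allowed between poll() calls before the group
  // coordinator evicts the consumer and triggers a rebalance.
  "max.poll.interval.ms" -> "3600000"
)

val stream = KafkaUtils.createDirectStream[String, String](
  ssc, // an existing StreamingContext with the 5-minute batch interval
  PreferConsistent,
  Subscribe[String, String](Seq("someTopic"), kafkaParams) // topic name is hypothetical
)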

In practice, however, the error kept appearing even after raising the value to 1 hour.

Reading further back in the logs, the following Kafka consumer output shows up during task startup.

2022-03-29 14:09:49,285 INFO org.apache.kafka.common.config.AbstractConfig.logAll(AbstractConfig.java:178) -ConsumerConfig values:
        metric.reporters = []
        metadata.max.age.ms = 300000
        partition.assignment.strategy = [org.apache.kafka.clients.consumer.RangeAssignor]
        reconnect.backoff.ms = 50
        sasl.kerberos.ticket.renew.window.factor = 0.8
        max.partition.fetch.bytes = 1048576
        bootstrap.servers = [192.168.1.177:6667, 192.168.0.130:6667, 192.168.0.117:6667]
        ssl.keystore.type = JKS
        enable.auto.commit = false
        sasl.mechanism = GSSAPI
        interceptor.classes = null
        exclude.internal.topics = true
        ssl.truststore.password = null
        client.id = consumer-1
        ssl.endpoint.identification.algorithm = null
        max.poll.records = 2147483647
        check.crcs = true
        request.timeout.ms = 600000
        heartbeat.interval.ms = 3000
        auto.commit.interval.ms = 5000
        receive.buffer.bytes = 65536
        ssl.truststore.type = JKS
        ssl.truststore.location = null
        ssl.keystore.password = null
        fetch.min.bytes = 1
        send.buffer.bytes = 131072
        value.deserializer = class org.apache.kafka.common.serialization.StringDeserializer
        group.id = tinyeyePortScan
        retry.backoff.ms = 100
        sasl.kerberos.kinit.cmd = /usr/bin/kinit
        sasl.kerberos.service.name = null
        sasl.kerberos.ticket.renew.jitter = 0.05
        ssl.trustmanager.algorithm = PKIX
        ssl.key.password = null
        fetch.max.wait.ms = 500
        sasl.kerberos.min.time.before.relogin = 60000
        connections.max.idle.ms = 480000
        session.timeout.ms = 300000
        metrics.num.samples = 2
        key.deserializer = class org.apache.kafka.common.serialization.StringDeserializer
        ssl.protocol = TLS
        ssl.provider = null
        ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
        ssl.keystore.location = null
        ssl.cipher.suites = null
        security.protocol = PLAINTEXT
        ssl.keymanager.algorithm = SunX509
        metrics.sample.window.ms = 30000
        auto.offset.reset = latest

2022-03-29 14:09:49,322 WARN org.apache.kafka.common.config.AbstractConfig.logUnused(AbstractConfig.java:186) -The configuration max.poll.interval.ms = 450000 was supplied but isn't a known config.
2022-03-29 14:09:49,326 INFO org.apache.kafka.common.utils.AppInfoParser$AppInfo.<init>(AppInfoParser.java:83) -Kafka version : 0.10.0.1
2022-03-29 14:09:49,326 INFO org.apache.kafka.common.utils.AppInfoParser$AppInfo.<init>(AppInfoParser.java:84) -Kafka commitId : a7a17cdec9eaa6c5

Two things stand out in this log: max.poll.interval.ms is reported as an unknown configuration option, and the Kafka client version is 0.10.0.1.

This tells us that our max.poll.interval.ms setting never took effect, and that the consumer API version (0.10.0.1) does not match the actual cluster version (2.0.0). The Kafka client enters the project through this Spark dependency:

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming-kafka-0-10_${scala.binary.version}</artifactId>
    <version>${spark.version}</version>
</dependency>

Referencing spark-streaming-kafka-0-10_ directly in a Maven project transitively pulls in kafka-clients 0.10.0.1, and max.poll.interval.ms is not yet a configurable option in that version (it was only introduced in client 0.10.1.0 by KIP-62), which explains the behavior above.
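A quick way to confirm which client version actually lands on the classpath is Maven's dependency tree (the includes filter is optional):

mvn dependency:tree -Dincludes=org.apache.kafka:kafka-clients

Before the fix, this reports kafka-clients:jar:0.10.0.1 arriving transitively through spark-streaming-kafka-0-10.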

Solution

Add the following to the pom:

<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka-clients</artifactId>
    <version>2.0.0</version>
</dependency>

With this explicit declaration in place, Maven's nearest-wins dependency mediation picks kafka-clients 2.0.0 over the transitive 0.10.0.1, the max.poll.interval.ms parameter takes effect, and the problem is resolved.
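As a sanity check after rebuilding: the consumer startup log should now report Kafka version : 2.0.0 instead of 0.10.0.1, the "isn't a known config" warning should be gone, and max.poll.interval.ms should appear among the listed ConsumerConfig values.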
