Kafka Error Handling Notes

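A Kafka broker went down and was left that way for a long time. Restarting it then produced errors, and consumers appeared to deadlock. This is how it was resolved.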

  1. Error messages:

2019-04-23 17:22:42,423 WARN kafka.controller.KafkaController: [Controller 782]: Partition [hjw_test8,6] failed to complete preferred replica leader election. Leader is 201
2019-04-23 17:22:42,423 ERROR kafka.server.ReplicaFetcherThread: [ReplicaFetcherThread-5-203], Error for partition [alarm_callback_topic,2] to broker 203:org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.
2019-04-23 17:22:42,426 ERROR state.change.logger: Controller 782 epoch 227 encountered error while electing leader for partition [004_8,3] due to: Preferred replica 203 for partition [004_8,3] is either not alive or not in the isr. Current leader and ISR: [{"leader":759,"leader_epoch":3,"isr":[759]}].
2019-04-23 17:22:42,426 ERROR state.change.logger: Controller 782 epoch 227 initiated state change for partition [004_8,3] from OfflinePartition to OnlinePartition failed
kafka.common.StateChangeFailedException: encountered error while electing leader for partition [004_8,3] due to: Preferred replica 203 for partition [004_8,3] is either not alive or not in the isr. Current leader and ISR: [{"leader":759,"leader_epoch":3,"isr":[759]}].
2019-04-23 17:22:42,430 ERROR kafka.server.ReplicaFetcherThread: [ReplicaFetcherThread-1-203], Error for partition [kjTest,8] to broker 203:org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.
2019-04-23 17:22:42,430 ERROR kafka.server.ReplicaFetcherThread: [ReplicaFetcherThread-1-203], Error for partition [app_error_log,2] to broker 203:org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.
2019-04-23 17:22:42,430 ERROR kafka.server.ReplicaFetcherThread: [ReplicaFetcherThread-1-203], Error for partition [kjTest2,8] to broker 203:org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.
2019-04-23 17:22:42,430 ERROR kafka.server.ReplicaFetcherThread: [ReplicaFetcherThread-1-203], Error for partition [fj1001,9] to broker 203:org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.
2019-04-23 17:22:42,430 ERROR kafka.server.ReplicaFetcherThread: [ReplicaFetcherThread-1-203], Error for partition [lzsw_alarm_topic,1] to broker 203:org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.
[Kafka Server 782], Proceeding to do an unclean shutdown as all the controlled shutdown attempts failed

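  2. Likely cause: the replica's offset may be ahead of the leader's, which prevents the broker from starting. The log also shows that broker 203, the preferred replica, is either not alive or not in the ISR, so the preferred replica leader election fails.
  3. Fix: run the preferred replica election command to rebalance leadership across all topics.
  4. Steps: go to the Kafka directory (the script lives under bin/) and run the following command. In a cluster, running it on one node is enough.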
./kafka-preferred-replica-election.sh --zookeeper localhost:2181
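
To check whether the election actually moved leadership back, it can help to list the partitions whose current leader is not the preferred replica (the first broker in the Replicas list). The following is a minimal sketch, not part of the original fix: the helper name non_preferred_leaders is made up for illustration, and it assumes the "Topic: ... Partition: ... Leader: ... Replicas: ..." line layout printed by kafka-topics.sh --describe.

```shell
# Hypothetical helper: reads `kafka-topics.sh --describe` output on stdin and
# prints every partition whose leader differs from its preferred replica.
# Assumes per-partition lines of the form
#   Topic: <name>  Partition: <n>  Leader: <id>  Replicas: <id,id,...>  Isr: <...>
non_preferred_leaders() {
  awk '
    {
      topic = ""; part = ""; leader = ""; replicas = ""
      # Pick out the value that follows each known label.
      for (i = 1; i < NF; i++) {
        if ($i == "Topic:")     topic = $(i + 1)
        if ($i == "Partition:") part = $(i + 1)
        if ($i == "Leader:")    leader = $(i + 1)
        if ($i == "Replicas:")  replicas = $(i + 1)
      }
      # Topic summary lines carry no Partition/Replicas pair; skip them.
      if (part == "" || replicas == "") next
      split(replicas, r, ",")
      if (leader != r[1])
        printf "%s-%s leader=%s preferred=%s\n", topic, part, leader, r[1]
    }'
}

# Possible usage on a ZooKeeper-era installation such as the one in this post:
#   ./kafka-topics.sh --describe --zookeeper localhost:2181 | non_preferred_leaders
```

An empty result after the election suggests every partition is led by its preferred replica again. Any lines that remain point at partitions whose preferred broker is still out of the ISR.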

Single point of failure in a Kafka cluster

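  1. Problem: the Kafka cluster has three nodes, but after one of them is stopped the whole cluster no longer works properly.
  2. Cause: investigation showed that every partition of the __consumer_offsets topic was stored on a single Kafka server, so with only one replica the topic is a single point of failure. Note: __consumer_offsets is created automatically by Kafka, with 50 partitions by default.
  3. Fix:
  • First adjust the following parameters in the configuration file:
    num.partitions=3 (default number of partitions: 3)
    auto.create.topics.enable=true (create topics automatically)
    default.replication.factor=3 (default replication factor: 3)
    The replication factor of __consumer_offsets itself is controlled by offsets.topic.replication.factor (default 3), so it is worth setting that explicitly as well.
  • Once every node has been updated, delete __consumer_offsets in ZooKeeper.
Go to the zookeeper/bin directory, run ./zkCli.sh, then enter the commands below. rmr is the recursive delete in ZooKeeper 3.4; on 3.5 and later use deleteall.
ls /brokers/topics
rmr /brokers/topics/__consumer_offsets
ls /brokers/topics
  • Finally, restart ZooKeeper and Kafka.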
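
After the restart, one way to confirm that __consumer_offsets is no longer held by a single broker is to count the replicas of each partition. This is a rough sketch under assumptions: partitions_below_rf is a hypothetical helper name, and it relies on the same "Topic: ... Partition: ... Replicas: ..." layout that kafka-topics.sh --describe prints.

```shell
# Hypothetical helper: reads `kafka-topics.sh --describe` output on stdin and
# prints every partition that has fewer replicas than the given minimum.
# $1 = minimum acceptable replica count (for example 3).
partitions_below_rf() {
  awk -v min="$1" '
    {
      topic = ""; part = ""; replicas = ""
      # Pick out the value that follows each known label.
      for (i = 1; i < NF; i++) {
        if ($i == "Topic:")     topic = $(i + 1)
        if ($i == "Partition:") part = $(i + 1)
        if ($i == "Replicas:")  replicas = $(i + 1)
      }
      # Topic summary lines carry no Partition/Replicas pair; skip them.
      if (part == "" || replicas == "") next
      n = split(replicas, r, ",")
      if (n < min + 0)
        printf "%s-%s replicas=%s\n", topic, part, replicas
    }'
}

# Possible usage on a ZooKeeper-era installation such as the one in this post:
#   ./kafka-topics.sh --describe --zookeeper localhost:2181 \
#       --topic __consumer_offsets | partitions_below_rf 3
```

No output means every listed partition has at least the requested number of replicas. Any line printed names a partition that would still be lost if its only broker went down.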