Notes on Common Kafka Problems in Production

程序小法师

Last modified 2022-11-17 09:48:18

Category: Java. Tags: kafka

First published 2022-04-16 14:46:41

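This article looks at the points in a production Kafka pipeline where messages can be lost (producer, broker, and consumer) and at why Kafka uses pull rather than push. For message loss it covers mitigations such as consumer-side rate limiting and manual offset commits, and it also touches on Kafka's replication strategy.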

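Message loss

To avoid losing messages you have to look at the whole pipeline. Messages are only safe if no stage along the path, from producer through broker to consumer, can drop them.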

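Producer

There are plenty of tutorials on how a producer avoids losing messages. It comes down to configuring the ack level (acks) and adding retries. For very important messages you could retry indefinitely, though personally I don't think that is a good solution. I would rather persist the messages that still fail after a bounded number of retries and deal with them later, either by manual intervention or with a scheduled retry job. With that approach you need to consider whether the persistence layer can absorb the load when message volume is high: a MySQL database typically handles on the order of a thousand TPS (this depends on the machine's specs; the performance tests published by Tencent Cloud are a useful reference).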
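
A minimal sketch of the "bounded retries, then persist" idea above. The config keys are standard Kafka producer settings, but the values are only illustrative. `sendOrPersist`, `trySend`, and `failedStore` are hypothetical names: `trySend` stands in for a real producer send, and the list stands in for whatever table or file the retry job would read.

```java
import java.util.List;
import java.util.Properties;
import java.util.function.Predicate;

final class ProducerReliabilitySketch {

    // Producer settings as plain key/value pairs (values are illustrative).
    static Properties reliableProducerProps(String bootstrapServers) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("acks", "all");                // wait for the in-sync replicas
        props.put("retries", "5");               // bounded client-side retries
        props.put("enable.idempotence", "true"); // avoid duplicates caused by retries
        return props;
    }

    // Hypothetical helper: try up to maxAttempts times, then park the message.
    // trySend returns true on success; it is an assumption, not a Kafka API.
    static boolean sendOrPersist(String message, int maxAttempts,
                                 Predicate<String> trySend, List<String> failedStore) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (trySend.test(message)) {
                return true;
            }
        }
        failedStore.add(message); // later picked up manually or by a scheduled job
        return false;
    }
}
```

In a real service the fallback store has to keep up with the failure rate, which is where the MySQL throughput concern comes in.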

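Broker

By default, with acks=all, a write is acknowledged once every replica currently in the in-sync replica set (ISR) has received the message.

How the broker avoids losing data is covered in books such as 深入理解Kafka:核心设计与实践原理 (Understanding Kafka in Depth: Core Design and Practical Principles). Loss usually happens when a broker crashes before the data has been flushed to disk. With acks=all the probability should be small: data can only be lost if the leader and the entire ISR go down before the flush, and otherwise it should not happen. (I don't remember the details exactly. TODO)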
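
One detail worth keeping in mind: the ISR can shrink, and `min.insync.replicas` (default 1) is what sets its lower bound for acks=all writes. The toy model below is only a sketch of that rule, assuming unclean leader election is disabled; `AckModel` and its methods are hypothetical names, not Kafka APIs.

```java
final class AckModel {

    // With acks=all the leader rejects the write when the ISR is smaller than
    // min.insync.replicas; otherwise it acknowledges once every ISR member has it.
    static boolean writeAccepted(int isrSize, int minInsyncReplicas) {
        return isrSize >= minInsyncReplicas;
    }

    // An acknowledged record exists on at least minInsyncReplicas brokers,
    // so it survives this many broker failures.
    static int failuresTolerated(int minInsyncReplicas) {
        return minInsyncReplicas - 1;
    }
}
```

With the default of 1, an acknowledged write may live on the leader alone, so a common setup is a replication factor of 3 with `min.insync.replicas=2`.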

Consumer

By default the consumer pulls messages in batches and commits offsets automatically. The batch size is configurable.
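
For reference, these are the consumer settings behind that default behaviour, written out explicitly. The keys and default values are the standard Kafka consumer ones; `ConsumerDefaultsSketch` is just a hypothetical holder class.

```java
import java.util.Properties;

final class ConsumerDefaultsSketch {

    // The client defaults that produce "batch pull + auto commit".
    static Properties defaultLikeProps(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("group.id", groupId);
        props.put("enable.auto.commit", "true");      // offsets are committed automatically
        props.put("auto.commit.interval.ms", "5000"); // at most every 5 s, during poll()
        props.put("max.poll.records", "500");         // upper bound on records per poll()
        return props;
    }
}
```

Setting `enable.auto.commit` to `false` is the switch that hands offset commits over to the application.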

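With synchronous consumption, if the consumer restarts or is redeployed after processing only part of a fetched batch, the offsets for that batch have not been committed yet, so after the restart those messages are consumed again. The consumer deals with this by making processing idempotent. Common implementations, such as a unique index in the database, are widely documented.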
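
A minimal sketch of idempotent processing, assuming every message carries a unique id. The in-memory set stands in for a table with a unique index; `IdempotentHandler` and its methods are hypothetical names.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

final class IdempotentHandler {
    // Stand-in for a table with a unique index on the message id.
    private final Set<String> processedIds = ConcurrentHashMap.newKeySet();
    private final AtomicInteger applied = new AtomicInteger();

    // Returns true if the message was applied, false if it was a duplicate.
    boolean handle(String messageId) {
        if (!processedIds.add(messageId)) {
            return false; // plays the role of a unique-key violation on insert
        }
        applied.incrementAndGet(); // the real business logic would run here
        return true;
    }

    int appliedCount() {
        return applied.get();
    }
}
```

With a real database, the id insert and the business update would need to share one transaction so that a crash between the two cannot leave them out of sync.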

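The more common consumption model is asynchronous, though: one thread pulls messages and several worker threads (a thread pool) process them in parallel. Because the workers do not block the polling thread, offsets are auto-committed right after each pull and the consumer keeps pulling. With a high message volume and no countermeasure, this can end in an OOM.

The usual fix is rate limiting on the consumer side. (Broker-side quotas are another option, but they exist to keep the broker from being overloaded. There may be other approaches as well; pointers are welcome.) Common rate-limiting algorithms are sliding window, leaky bucket, and token bucket, and the differences between them matter. However, different messages cost different amounts of CPU. When CPU-heavy messages dominate, the limit has to come down, otherwise the service may misbehave once the CPU is under pressure. When cheap messages dominate, the limit can go up. Adjusting the limit dynamically is a problem in its own right (Alibaba's Sentinel has a load-based limiting scheme that is worth a look).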
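
As one example of the algorithms mentioned, here is a minimal token bucket. It is a sketch, not a tuned limiter: the clock is passed in explicitly so the behaviour is easy to reason about, and `TokenBucket` and its parameters are hypothetical names.

```java
final class TokenBucket {
    private final double capacity;
    private final double refillPerNano;
    private double tokens;
    private long lastRefillNanos;

    TokenBucket(long capacity, double tokensPerSecond, long nowNanos) {
        this.capacity = capacity;
        this.refillPerNano = tokensPerSecond / 1_000_000_000.0;
        this.tokens = capacity; // start full, so an initial burst is allowed
        this.lastRefillNanos = nowNanos;
    }

    // Takes one token if available; callers pass System.nanoTime() in production.
    synchronized boolean tryAcquire(long nowNanos) {
        long elapsed = Math.max(0L, nowNanos - lastRefillNanos);
        tokens = Math.min(capacity, tokens + elapsed * refillPerNano);
        lastRefillNanos = Math.max(lastRefillNanos, nowNanos);
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```

The polling thread would call `tryAcquire` before handing a message to the pool and back off when it returns false. Making the rate adaptive to CPU load, as discussed above, would mean replacing the fixed `tokensPerSecond` with a value that is adjusted at runtime.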

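One more thought: I later wondered whether switching the thread pool's rejection policy to "run on the calling thread" would solve this problem.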
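
A small experiment that suggests it can at least bound memory: with a bounded queue and `ThreadPoolExecutor.CallerRunsPolicy`, a rejected task runs on the submitting thread, so the submitter (the polling thread in our case) is slowed down instead of piling up work. `CallerRunsBackpressure` is a hypothetical demo class; the pool and queue sizes are chosen only to make the effect deterministic.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class CallerRunsBackpressure {

    // Submits `tasks` jobs to a 1-thread pool with a 1-slot queue while the worker
    // is held busy, and returns how many jobs ran on the submitting thread.
    static int countCallerRuns(int tasks) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(1),
                new ThreadPoolExecutor.CallerRunsPolicy());
        Thread caller = Thread.currentThread();
        CountDownLatch release = new CountDownLatch(1);
        AtomicInteger ranOnCaller = new AtomicInteger();

        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                if (Thread.currentThread() == caller) {
                    ranOnCaller.incrementAndGet(); // the submitter did the work itself
                    return;
                }
                try {
                    release.await(); // simulate a slow worker
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        release.countDown();
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return ranOnCaller.get();
    }
}
```

Two caveats. If the polling thread is stuck in a task for longer than `max.poll.interval.ms`, the consumer is removed from the group and a rebalance follows. And this only addresses memory pressure; it does nothing for the offset-commit problem.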

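Asynchronous consumption can also lose messages. The polling thread commits the offset right after handing the tasks to the workers, so if the service crashes or processing fails at that point, those messages are never processed after the restart. The usual fix is to commit offsets manually once processing has finished.

Manual commits introduce another problem under parallel consumption. Threads run at different speeds, so there is no guarantee that a message with a smaller offset is processed and committed first. If a message with a larger offset finishes first and its offset is committed, and the process then crashes, the consumer resumes after that larger offset on restart, and the unfinished messages before it are lost.

Falling back to synchronous consumption and committing after every single message is clearly not a good approach either, because throughput cannot scale that way.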

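A simple approach is a consumption barrier: process each polled batch concurrently and commit the offset only after the whole batch is done. That does improve throughput, but it is certainly not optimal. The consumption rate is bounded by the slowest thread, and if processing a single message fails, the offset for the whole batch cannot be committed and consumption stalls.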
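
A sketch of the barrier idea, kept free of Kafka classes so it stays self-contained. `BarrierCommit`, `processBatch`, and `handler` are hypothetical names; `invokeAll` plays the role of the barrier, and the returned value is what the caller would pass to a manual commit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.LongPredicate;

final class BarrierCommit {

    // Processes offsets [firstOffset, lastOffset] in parallel. Returns the offset to
    // commit (lastOffset + 1) if every message succeeded, or -1 to commit nothing.
    static long processBatch(long firstOffset, long lastOffset, int threads,
                             LongPredicate handler) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Boolean>> jobs = new ArrayList<>();
            for (long o = firstOffset; o <= lastOffset; o++) {
                final long offset = o;
                jobs.add(() -> handler.test(offset));
            }
            // Barrier: invokeAll returns only after every job has finished.
            for (Future<Boolean> result : pool.invokeAll(jobs)) {
                if (!result.get()) {
                    return -1L;
                }
            }
            return lastOffset + 1; // Kafka commits the offset of the next record to read
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1L;
        } catch (ExecutionException e) {
            return -1L; // a handler threw: treat the batch as failed
        } finally {
            pool.shutdown();
        }
    }
}
```

A real consumer would reuse one pool instead of creating one per batch, and it would need a policy for the -1 case (retry, skip, or park the failing message); otherwise the stall described above is exactly what happens.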

深入理解Kafka:核心设计与实践原理 describes a sliding-window approach to committing offsets. Details of how it works: TODO
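
Until that TODO is filled in, here is a much simpler sketch in the same spirit. It is not the book's algorithm: it only tracks the highest contiguous run of completed offsets, so that an offset is never committed past a message that is still in flight. `OffsetTracker` is a hypothetical name, and one instance per partition is assumed.

```java
import java.util.TreeSet;

final class OffsetTracker {
    private long nextToCommit;                          // lowest offset not yet known complete
    private final TreeSet<Long> done = new TreeSet<>(); // completed offsets above that point

    OffsetTracker(long startOffset) {
        this.nextToCommit = startOffset;
    }

    // Marks one offset as processed and returns the offset that is now safe to commit.
    synchronized long complete(long offset) {
        if (offset >= nextToCommit) {
            done.add(offset);
        }
        // Advance over every offset that closes the gap at the front.
        while (!done.isEmpty() && done.first() == nextToCommit) {
            done.pollFirst();
            nextToCommit++;
        }
        return nextToCommit;
    }
}
```

Workers call `complete` when they finish, and the polling thread commits the returned value. One stuck message still holds the commit point back, so a real implementation would cap how far ahead processing may run and pause polling once that window is full.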

Manual commits still require the consumer to process messages idempotently.

Why pull rather than broker-side push

From the official documentation:

There are pros and cons to both approaches. However, a push-based system has difficulty dealing with diverse consumers as the broker controls the rate at which data is transferred. The goal is generally for the consumer to be able to consume at the maximum possible rate; unfortunately, in a push system this means the consumer tends to be overwhelmed when its rate of consumption falls below the rate of production (a denial of service attack, in essence). A pull-based system has the nicer property that the consumer simply falls behind and catches up when it can. This can be mitigated with some kind of backoff protocol by which the consumer can indicate it is overwhelmed, but getting the rate of transfer to fully utilize (but never over-utilize) the consumer is trickier than it seems. Previous attempts at building systems in this fashion led us to go with a more traditional pull model.

Another advantage of a pull-based system is that it lends itself to aggressive batching of data sent to the consumer. A push-based system must choose to either send a request immediately or accumulate more data and then send it later without knowledge of whether the downstream consumer will be able to immediately process it. If tuned for low latency, this will result in sending a single message at a time only for the transfer to end up being buffered anyway, which is wasteful. A pull-based design fixes this as the consumer always pulls all available messages after its current position in the log (or up to some configurable max size). So one gets optimal batching without introducing unnecessary latency.

The deficiency of a naive pull-based system is that if the broker has no data the consumer may end up polling in a tight loop, effectively busy-waiting for data to arrive. To avoid this we have parameters in our pull request that allow the consumer request to block in a "long poll" waiting until data arrives (and optionally waiting until a given number of bytes is available to ensure large transfer sizes).

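
On the consumer side, the long-poll parameters in that passage correspond to `fetch.min.bytes` and `fetch.max.wait.ms`. A small sketch with illustrative values follows; the keys are standard consumer settings, and `LongPollSettings` is a hypothetical holder class.

```java
import java.util.Properties;

final class LongPollSettings {

    // The broker holds a fetch request until one of the two conditions is met.
    static Properties longPollProps() {
        Properties props = new Properties();
        props.put("fetch.min.bytes", "65536"); // answer once 64 KiB is available (default 1)
        props.put("fetch.max.wait.ms", "500"); // or after 500 ms, whichever comes first
        return props;
    }
}
```

Raising `fetch.min.bytes` trades a little latency for larger, more efficient batches.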

Replication strategy

There are a rich variety of algorithms in this family including ZooKeeper's Zab, Raft, and Viewstamped Replication. The most similar academic publication we are aware of to Kafka's actual implementation is PacificA from Microsoft.

The downside of majority vote is that it doesn't take many failures to leave you with no electable leaders. To tolerate one failure requires three copies of the data, and to tolerate two failures requires five copies of the data. In our experience having only enough redundancy to tolerate a single failure is not enough for a practical system, but doing every write five times, with 5x the disk space requirements and 1/5th the throughput, is not very practical for large volume data problems. This is likely why quorum algorithms more commonly appear for shared cluster configuration such as ZooKeeper but are less common for primary data storage. For example in HDFS the namenode's high-availability feature is built on a majority-vote-based journal, but this more expensive approach is not used for the data itself.

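In short, a majority-vote algorithm needs more redundant copies to tolerate the same number of failures, which becomes expensive at large data volumes.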
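
The arithmetic behind that, as a tiny sketch. The 2f + 1 figure matches the numbers in the quote above, and the f + 1 figure is what the Kafka documentation states for its ISR approach; `ReplicaMath` is a hypothetical name.

```java
final class ReplicaMath {

    // Majority-vote quorum: tolerating f failures needs 2f + 1 copies.
    static int majorityQuorumReplicas(int failures) {
        return 2 * failures + 1;
    }

    // ISR-style replication: f + 1 copies tolerate f failures.
    static int isrReplicas(int failures) {
        return failures + 1;
    }
}
```

For two tolerated failures that is five copies against three, which is the disk and throughput cost the quoted passage is pointing at.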