[Kafka Tutorial Series 20] Kafka Replication and Leader Election

dagai888

Category: Message Middleware. Tags: java, kafka, mq

Article link: https://blog.csdn.net/dcm19920115/article/details/93387259

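Kafka ensures data availability by replicating topic partitions across multiple servers. Each partition has one leader and zero or more followers, and all reads and writes go through the leader. When a failure occurs, a follower can take over. Kafka guarantees that a committed message is not lost as long as at least one in-sync replica remains alive. If all replicas die, an unclean leader election may take place, which can lead to data inconsistency. The system also offers availability and durability guarantees, and producers can choose whether to wait for message acknowledgement. Replica management keeps partitions evenly distributed across the cluster.

(Summary generated automatically by CSDN.)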

Replication

Kafka replicates the log for each topic's partitions across a configurable number of servers (you can set this replication factor on a topic-by-topic basis). In other words, the messages themselves are backed up, and each copy is called a replica. This allows automatic failover to these replicas when a server in the cluster fails, so messages remain available in the presence of failures.

Other messaging systems provide some replication-related features, but, in our (totally biased) opinion, this appears to be a tacked-on thing, not heavily used, and with large downsides: replicas are inactive, throughput is heavily impacted, it requires fiddly manual configuration, etc. Kafka is meant to be used with replication by default. In fact, we implement un-replicated topics as replicated topics where the replication factor is one.

The unit of replication is the topic partition. Under non-failure conditions, each partition in Kafka has a single leader and zero or more followers. The total number of replicas, including the leader, constitutes the replication factor. All reads and writes go to the leader of the partition. Typically, there are many more partitions than brokers, and the leaders are evenly distributed among brokers. The logs on the followers are identical to the leader's log: all have the same offsets and messages in the same order (though, of course, at any given time the leader may have a few as-yet unreplicated messages at the end of its log).
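To make the relationship between brokers, partitions, and the replication factor concrete, here is a minimal sketch of a round-robin placement. It is only an illustration: the class name `ReplicaAssignmentSketch` and the method `assign` are hypothetical, and this simplified scheme is an assumption for demonstration, not Kafka's actual assignment algorithm.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified round-robin replica placement (illustrative only, not Kafka's real algorithm).
class ReplicaAssignmentSketch {

    /**
     * Returns, for each partition, the ids of the brokers that hold its replicas.
     * The first broker in each list is treated as the partition leader,
     * the remaining ones as followers.
     */
    static List<List<Integer>> assign(int brokers, int partitions, int replicationFactor) {
        if (replicationFactor < 1 || replicationFactor > brokers) {
            throw new IllegalArgumentException(
                "replication factor must be between 1 and the number of brokers");
        }
        List<List<Integer>> assignment = new ArrayList<>();
        for (int p = 0; p < partitions; p++) {
            List<Integer> replicas = new ArrayList<>();
            for (int r = 0; r < replicationFactor; r++) {
                // Shift the starting broker by one per partition so leaders rotate.
                replicas.add((p + r) % brokers);
            }
            assignment.add(replicas);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // 3 brokers, 6 partitions, replication factor 2 (one leader plus one follower).
        List<List<Integer>> assignment = assign(3, 6, 2);
        for (int p = 0; p < assignment.size(); p++) {
            List<Integer> replicas = assignment.get(p);
            System.out.println("partition " + p + ": leader=" + replicas.get(0)
                + ", replicas=" + replicas);
        }
    }
}
```

With more partitions than brokers, rotating the starting broker spreads leadership evenly, and a replication factor of one yields a leader with no followers, which is how an un-replicated topic looks in this model.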

Followers consume messages from the leader just as a normal Kafka consumer would and apply them to their own log. Having the followers pull from the leader has the nice property of allowing each follower to naturally batch together the log entries it is applying to its log.
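The pull-based replication described above can be sketched with an in-memory model. The names `FollowerFetchSketch`, `Log`, and `fetchOnce` are hypothetical, and the list index stands in for the message offset. This is an assumption-laden illustration of the idea, not how the broker is actually implemented.

```java
import java.util.ArrayList;
import java.util.List;

// In-memory illustration of a follower pulling batches from its leader.
class FollowerFetchSketch {

    // A toy log: the index of an entry in the list plays the role of its offset.
    static class Log {
        final List<String> entries = new ArrayList<>();

        long endOffset() {
            return entries.size();
        }

        // Returns up to maxBatch entries starting at fromOffset.
        List<String> read(long fromOffset, int maxBatch) {
            int from = (int) Math.min(fromOffset, entries.size());
            int to = Math.min(from + maxBatch, entries.size());
            return new ArrayList<>(entries.subList(from, to));
        }
    }

    // One fetch round: the follower asks for everything after its own log end
    // and appends whatever comes back as a single batch.
    static int fetchOnce(Log leader, Log follower, int maxBatch) {
        List<String> batch = leader.read(follower.endOffset(), maxBatch);
        follower.entries.addAll(batch);
        return batch.size();
    }

    public static void main(String[] args) {
        Log leader = new Log();
        Log follower = new Log();
        for (int i = 0; i < 5; i++) {
            leader.entries.add("message-" + i);
        }
        int fetched;
        // Keep pulling until the leader has nothing new to hand out.
        while ((fetched = fetchOnce(leader, follower, 3)) > 0) {
            System.out.println("fetched " + fetched + ", follower end offset = "
                + follower.endOffset());
        }
    }
}
```

Because the follower decides when to ask and for how much, entries that accumulated on the leader since the last request arrive together, which is the batching property the paragraph refers to.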

As with most distributed systems, automatically handling failures requires a precise definition of what it means for a node to be "alive". For Kafka, node liveness has two conditions:

A node must be able to maintain its session with ZooKeeper (via ZooKeeper's heartbeat mechanism).
If it is a follower, it must replicate the writes happening on the leader and not fall "too far" behind.

We refer to nodes satisfying these two conditions as being "in sync" to avoid the vagueness of "alive" or "failed". The leader keeps track of the set of "in sync" nodes. If a follower dies, gets stuck, or falls behind, the leader will remove it from the list of in-sync replicas. How far behind is "too far" is controlled by the replica.lag.max.messages configuration, and what counts as a stuck replica is controlled by the replica.lag.time.max.ms configuration. (replica.lag.max.messages was removed in Kafka 0.9.0; later versions rely on replica.lag.time.max.ms alone.)
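The in-sync check can be sketched as follows, using the two settings named above. The class `IsrSketch` and its methods are hypothetical names introduced for illustration; the two constructor arguments only play the roles of replica.lag.max.messages and replica.lag.time.max.ms, and the real broker logic is more involved.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustrative model of how a leader could decide which followers are "in sync".
class IsrSketch {

    // What the leader remembers about one follower.
    static class FollowerState {
        final long logEndOffset; // how far the follower has replicated
        final long lastFetchMs;  // when the follower last sent a fetch request

        FollowerState(long logEndOffset, long lastFetchMs) {
            this.logEndOffset = logEndOffset;
            this.lastFetchMs = lastFetchMs;
        }
    }

    private final long maxLagMessages; // stands in for replica.lag.max.messages
    private final long maxLagTimeMs;   // stands in for replica.lag.time.max.ms
    private final Map<Integer, FollowerState> followers = new HashMap<>();

    IsrSketch(long maxLagMessages, long maxLagTimeMs) {
        this.maxLagMessages = maxLagMessages;
        this.maxLagTimeMs = maxLagTimeMs;
    }

    // Called whenever a follower fetches from the leader.
    void onFetch(int followerId, long followerLogEndOffset, long nowMs) {
        followers.put(followerId, new FollowerState(followerLogEndOffset, nowMs));
    }

    // Returns the ids of followers that are neither too far behind nor stuck.
    Set<Integer> inSyncFollowers(long leaderLogEndOffset, long nowMs) {
        Set<Integer> inSync = new TreeSet<>();
        for (Map.Entry<Integer, FollowerState> e : followers.entrySet()) {
            FollowerState s = e.getValue();
            boolean tooFarBehind = leaderLogEndOffset - s.logEndOffset > maxLagMessages;
            boolean stuck = nowMs - s.lastFetchMs > maxLagTimeMs;
            if (!tooFarBehind && !stuck) {
                inSync.add(e.getKey());
            }
        }
        return inSync;
    }

    public static void main(String[] args) {
        IsrSketch isr = new IsrSketch(4, 10_000);
        isr.onFetch(1, 100, 1_000); // fully caught up
        isr.onFetch(2, 90, 1_000);  // 10 messages behind a leader at offset 100
        System.out.println("in sync: " + isr.inSyncFollowers(100, 5_000));
    }
}
```

A follower that stops fetching fails the time check, and one that keeps fetching but cannot keep up fails the message-count check, which mirrors the "stuck" and "too far behind" cases in the text.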

In distributed systems terminology, we only attempt to handle a "fail/recover" model of failures, where nodes suddenly cease working and then later recover (perhaps without knowing that they have died). Kafka does not handle so-called "Byzantine" failures, in which nodes produce arbitrary or malicious responses (perhaps due to bugs or foul play).

We can now more precisely define that a message is considered committed when all in sync replicas for that partition have applied it to their log. Only committed messages are ever given out to the consumer. This means that the consumer need n