Message queue
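Note: in the point-to-point model, consumers poll for messages. Each message is received by exactly one consumer and is deleted once received; even when several consumers are listening, only one of them consumes a given message.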
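The point-to-point behavior described above can be sketched with a toy in-memory queue. This is only an illustration: `PointToPointQueue` and `drain` are hypothetical names, and a real broker would add persistence, acknowledgements, and concurrency.

```python
from collections import deque


class PointToPointQueue:
    """Toy in-memory queue: each message goes to exactly one polling consumer."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def poll(self):
        # Receiving removes the message, so no other consumer can see it.
        return self._messages.popleft() if self._messages else None


def drain(queue, consumer_names):
    """Let the consumers poll in turn until the queue is empty."""
    received = {name: [] for name in consumer_names}
    while True:
        progressed = False
        for name in consumer_names:
            message = queue.poll()
            if message is not None:
                received[name].append(message)
                progressed = True
        if not progressed:
            return received


queue = PointToPointQueue()
for m in ["m1", "m2", "m3"]:
    queue.send(m)
result = drain(queue, ["c1", "c2"])
print(result)  # {'c1': ['m1', 'm3'], 'c2': ['m2']}
```

Both consumers are listening, yet every message shows up in exactly one consumer's list.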
Kafka architecture
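Kafka only guarantees ordering within a partition: inside each partition, messages are consumed in the order they were produced. For example, if offset 2 of partition0 has not been consumed yet, offset 3 of partition0 cannot be consumed. Across different partitions, there is no guarantee that the consumption order matches the production order.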
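A minimal sketch of this per-partition ordering, assuming a toy model in which a partition is an append-only list and a reader always advances one offset at a time. `PartitionLog` and `OrderedReader` are hypothetical names, not Kafka client classes.

```python
class PartitionLog:
    """Toy append-only log for one partition; offsets are list indexes."""

    def __init__(self):
        self.records = []

    def append(self, value):
        self.records.append(value)
        return len(self.records) - 1  # offset of the new record


class OrderedReader:
    """Reads one partition strictly in offset order."""

    def __init__(self, log):
        self.log = log
        self.position = 0  # next offset to consume

    def poll(self):
        if self.position >= len(self.log.records):
            return None
        record = (self.position, self.log.records[self.position])
        self.position += 1
        return record


p0, p1 = PartitionLog(), PartitionLog()
p0.append("a0")  # produced 1st
p1.append("b0")  # produced 2nd
p0.append("a1")  # produced 3rd

r0, r1 = OrderedReader(p0), OrderedReader(p1)
# Reading p1 before p0 is allowed, so the global order can differ from the
# production order, but within p0 the reader still sees a0 before a1.
print(r1.poll(), r0.poll(), r0.poll())  # (0, 'b0') (0, 'a0') (1, 'a1')
```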
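If replication is enabled, a Kafka producer publishes messages to the leader, and the followers copy data from the leader as backups. By default, consumers also read from the leader. In other words, reads, writes, and replication all go through the leader, and followers only hold backups. When the leader fails, one of the followers is chosen as the new leader.
The benefit is that followers do not serve reads, so there is no need to keep followers consistent with each other, only each follower consistent with the leader. This is simpler to implement and maintain, and less error-prone.
By contrast, in ZooKeeper (zk) the leader handles both reads and writes, while followers serve reads only. When a follower receives a write request, it forwards the request to the leader.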
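The leader/follower flow for Kafka can be illustrated with a toy model. Everything here is an assumption made for illustration: `ReplicatedPartition` is a hypothetical class, and promoting the follower with the longest log is a simplification (Kafka picks the new leader from its set of in-sync replicas).

```python
class ReplicatedPartition:
    """Toy model: one leader log plus follower copies."""

    def __init__(self, replica_ids):
        self.logs = {rid: [] for rid in replica_ids}
        self.leader = replica_ids[0]

    def produce(self, message):
        # Writes go to the leader only.
        self.logs[self.leader].append(message)

    def replicate(self):
        # Each follower copies whatever it is missing from the leader.
        leader_log = self.logs[self.leader]
        for rid, log in self.logs.items():
            if rid != self.leader:
                log.extend(leader_log[len(log):])

    def consume(self, offset):
        # Reads are served by the leader as well.
        return self.logs[self.leader][offset]

    def fail_leader(self):
        # Simplification: drop the leader, promote the follower with the longest log.
        del self.logs[self.leader]
        self.leader = max(self.logs, key=lambda rid: len(self.logs[rid]))


partition = ReplicatedPartition(["b0", "b1", "b2"])
partition.produce("m0")
partition.produce("m1")
partition.replicate()
partition.fail_leader()
print(partition.leader, partition.consume(1))  # b1 m1
```

Followers never talk to each other in this sketch; each one only catches up with the leader, which is the simplification the notes point out.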
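Two of Kafka's partition assignment strategies for consumers:
Range works per topic: for each topic, the partitions are laid out in order and split as evenly as possible among the consumers, which are taken in lexicographic order.
The algorithm: let n = partition count / consumer count (integer division) and m = partition count % consumer count. The first m consumers each get n+1 partitions, and the remaining (consumer count - m) consumers each get n partitions.
Example: t0 has 5 partitions [p0,p1,p2,p3,p4] and there are two consumers [c0,c1]. Here n=2 and m=1, so c0 gets 5/2 rounded up = 3 partitions and c1 gets 5/2 rounded down = 2. The first 3 go to c0 and the last 2 to c1, i.e. c0[p0,p1,p2], c1[p3,p4].
Problem: the assignment can be uneven. Suppose there are 2 topics with 3 partitions each and 2 consumers, so the subscribed partitions are t0p0, t0p1, t0p2, t1p0, t1p1, t1p2. The final assignment is C0: t0p0, t0p1, t1p0, t1p1; C1: t0p2, t1p2. C0 gets twice as many partitions as C1.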
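The range algorithm above can be written out as a short sketch. It assumes every consumer subscribes to every topic and names partitions as `<topic>p<index>`; `range_assign` is a hypothetical helper, not Kafka's own implementation.

```python
def range_assign(partitions_per_topic, consumers):
    """partitions_per_topic maps topic -> partition count.

    Assumes all consumers subscribe to all topics.
    """
    consumers = sorted(consumers)  # lexicographic consumer order
    assignment = {c: [] for c in consumers}
    for topic in sorted(partitions_per_topic):
        count = partitions_per_topic[topic]
        n, m = divmod(count, len(consumers))
        start = 0
        for i, consumer in enumerate(consumers):
            size = n + 1 if i < m else n  # first m consumers get one extra
            assignment[consumer] += [f"{topic}p{p}" for p in range(start, start + size)]
            start += size
    return assignment


print(range_assign({"t0": 5}, ["c0", "c1"]))
# {'c0': ['t0p0', 't0p1', 't0p2'], 'c1': ['t0p3', 't0p4']}
print(range_assign({"t0": 3, "t1": 3}, ["C0", "C1"]))
# {'C0': ['t0p0', 't0p1', 't1p0', 't1p1'], 'C1': ['t0p2', 't1p2']}
```

Because the split restarts for every topic, the "extra" partition always lands on the same leading consumers, which is where the imbalance in the second call comes from.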
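RoundRobin iterates over all topics and all consumers together. As an example, take t1[p0,p1,p2], t2[p0,p1,p2,p3], t3[p0,p1],
where c1 subscribes to [t1,t3], c2 to [t2,t3], c3 to [t1], and c4 to [t2]. The strategy walks through the partitions and the consumers in turn and assigns in a greedy-like fashion.
That is: first check t1p0 against c1. c1 subscribes to t1, so the assignment succeeds. Move on to the next partition and consumer, t1p1 and c2. c2 does not subscribe to t1,
so move on to the next consumer, t1p1 and c3. c3 subscribes to t1, so the assignment succeeds. Move on to the next partition and consumer, t1p2 and c4. c4 does not subscribe to t1,
so move on to the next consumer, wrapping around to t1p2 and c1. c1 subscribes to t1, so the assignment succeeds, and so on.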
/*
* Note: the following is a comment from the source code
* For example, suppose there are two consumers C0 and C1, two topics t0 and t1, and each topic has 3 partitions,
* resulting in partitions t0p0, t0p1, t0p2, t1p0, t1p1, and t1p2.
* The assignment will be:
* C0: [t0p0, t0p2, t1p1]
* C1: [t0p1, t1p0, t1p2]
*
* When subscriptions differ across consumer instances, the assignment process still considers each
* consumer instance in round robin fashion but skips over an instance if it is not subscribed to
* the topic. Unlike the case when subscriptions are identical, this can result in imbalanced
* assignments. For example, we have three consumers C0, C1, C2, and three topics t0, t1, t2,
* with 1, 2, and 3 partitions, respectively. Therefore, the partitions are t0p0, t1p0, t1p1, t2p0,
* t2p1, t2p2. C0 is subscribed to t0; C1 is subscribed to t0, t1; and C2 is subscribed to t0, t1, t2.
*
* The assignment will be:
* C0: [t0p0]
* C1: [t1p0]
* C2: [t1p1, t2p0, t2p1, t2p2]
*/
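The round-robin walk described above, including the skip-if-not-subscribed rule from the source comment, can be sketched as follows. `round_robin_assign` is a hypothetical helper written for these notes, not Kafka's implementation; it assumes topics and consumers are visited in sorted order and that partitions are named `<topic>p<index>`.

```python
def round_robin_assign(partitions_per_topic, subscriptions):
    """partitions_per_topic maps topic -> partition count.

    subscriptions maps consumer -> set of subscribed topics.
    """
    consumers = sorted(subscriptions)
    assignment = {c: [] for c in consumers}
    idx = 0  # position in the circular list of consumers
    for topic in sorted(partitions_per_topic):
        if not any(topic in topics for topics in subscriptions.values()):
            continue  # nobody subscribes to this topic, so skip it
        for p in range(partitions_per_topic[topic]):
            # Skip consumers that are not subscribed to this topic.
            while topic not in subscriptions[consumers[idx]]:
                idx = (idx + 1) % len(consumers)
            assignment[consumers[idx]].append(f"{topic}p{p}")
            idx = (idx + 1) % len(consumers)
    return assignment


# Second example from the source comment: differing subscriptions.
print(round_robin_assign(
    {"t0": 1, "t1": 2, "t2": 3},
    {"C0": {"t0"}, "C1": {"t0", "t1"}, "C2": {"t0", "t1", "t2"}},
))
# {'C0': ['t0p0'], 'C1': ['t1p0'], 'C2': ['t1p1', 't2p0', 't2p1', 't2p2']}
```

Under the same assumptions, the c1..c4 example from the notes comes out as c1: [t1p0, t1p2, t3p0], c2: [t2p0, t2p2, t3p1], c3: [t1p1], c4: [t2p1, t2p3].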
Blog
https://www.cnblogs.com/cjsblog/p/9664536.html