Anyone who has studied Kafka knows that, within a single consumer group, each message on a topic is consumed only once (consumers subscribe to topics by groupId). A topic can be given multiple partitions, and when a consumer group contains multiple consumers, the partitions are distributed among them by an assignment strategy. For example, with a topic that has 5 partitions (p1, p2, p3, p4, p5) and a consumer group with two consumers (c1, c2), one possible assignment is c1 → (p1, p2, p3) and c2 → (p4, p5).
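A minimal sketch of that range-style split (illustrative only, not Kafka's actual RangeAssignor code; the partition and consumer names are made up):

import java.util.Arrays;
import java.util.List;

public class RangeAssignDemo {
    public static void main(String[] args) {
        List<String> partitions = Arrays.asList("p1", "p2", "p3", "p4", "p5");
        List<String> consumers = Arrays.asList("c1", "c2");

        int perConsumer = partitions.size() / consumers.size(); // 5 / 2 = 2
        int remainder = partitions.size() % consumers.size();   // 5 % 2 = 1

        int cursor = 0;
        for (int i = 0; i < consumers.size(); i++) {
            // the first `remainder` consumers each take one extra partition
            int count = perConsumer + (i < remainder ? 1 : 0);
            System.out.println(consumers.get(i) + " -> "
                    + partitions.subList(cursor, cursor + count));
            cursor += count;
        }
        // prints: c1 -> [p1, p2, p3]   c2 -> [p4, p5]
    }
}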
This raises a question. I once ran a local test: I sent messages to a topic with 5 partitions and started only a single consumer client, yet messages from different partitions were consumed concurrently. By Kafka's model, with only one consumer, consumption should have been sequential, so the problem had to be in the Spring Boot integration.
org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory has a configuration property, concurrency, which is exactly what causes this behavior. Look at the initialization code: it calls the parent class's initialization method and then passes the property on to the ConcurrentMessageListenerContainer:
@Override
protected void initializeContainer(ConcurrentMessageListenerContainer<K, V> instance) {
    super.initializeContainer(instance);
    if (this.concurrency != null) {
        // hand the factory-level concurrency setting over to the container
        instance.setConcurrency(this.concurrency);
    }
}
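For context, a minimal sketch of how this factory might be wired in a Spring Boot application (the bean names, group id, and bootstrap address are assumptions for illustration):

import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.annotation.EnableKafka;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.core.ConsumerFactory;
import org.springframework.kafka.core.DefaultKafkaConsumerFactory;

@Configuration
@EnableKafka
public class KafkaConsumerConfig {

    @Bean
    public ConsumerFactory<String, String> consumerFactory() {
        Map<String, Object> props = new HashMap<>();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "demo-group");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        return new DefaultKafkaConsumerFactory<>(props);
    }

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory() {
        ConcurrentKafkaListenerContainerFactory<String, String> factory =
                new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory());
        // with a 5-partition topic, this starts 2 KafkaMessageListenerContainers
        factory.setConcurrency(2);
        return factory;
    }
}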
In ConcurrentMessageListenerContainer, the startup logic lives in doStart. You can see that when explicit partitions are configured and concurrency is greater than the number of partitions, concurrency is reset to the partition count. As I understand it, the concurrency value decides how many threads are started to consume, and consumption is divided up by partition, so if concurrency exceeded the partition count the extra containers would just run idle. The method then loops, creating one KafkaMessageListenerContainer per unit of concurrency and calling start, which invokes that container's own doStart; that doStart creates a ListenerConsumer thread, and ListenerConsumer contains the familiar KafkaConsumer initialization and the poll() loop.
@Override
protected void doStart() {
    if (!isRunning()) {
        ContainerProperties containerProperties = getContainerProperties();
        TopicPartitionInitialOffset[] topicPartitions = containerProperties.getTopicPartitions();
        if (topicPartitions != null
                && this.concurrency > topicPartitions.length) {
            this.logger.warn("When specific partitions are provided, the concurrency must be less than or "
                    + "equal to the number of partitions; reduced from " + this.concurrency + " to "
                    + topicPartitions.length);
            this.concurrency = topicPartitions.length;
        }
        setRunning(true);
        for (int i = 0; i < this.concurrency; i++) {
            KafkaMessageListenerContainer<K, V> container;
            if (topicPartitions == null) {
                container = new KafkaMessageListenerContainer<>(this.consumerFactory, containerProperties);
            }
            else {
                container = new KafkaMessageListenerContainer<>(this.consumerFactory, containerProperties,
                        partitionSubset(containerProperties, i));
            }
            if (getBeanName() != null) {
                container.setBeanName(getBeanName() + "-" + i);
            }
            if (getApplicationEventPublisher() != null) {
                container.setApplicationEventPublisher(getApplicationEventPublisher());
            }
            container.setClientIdSuffix("-" + i);
            container.start();
            this.containers.add(container);
        }
    }
}
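As a quick illustration of that cap, here is a hedged sketch that builds the container by hand with three explicit partitions and asks for a higher concurrency. The topic name and the pre-existing consumerFactory are assumptions, and the ContainerProperties/TopicPartitionInitialOffset types match the older spring-kafka version this post analyzes:

// assumes: ConsumerFactory<String, String> consumerFactory is already configured
ContainerProperties containerProps = new ContainerProperties(
        new TopicPartitionInitialOffset("demo-topic", 0),
        new TopicPartitionInitialOffset("demo-topic", 1),
        new TopicPartitionInitialOffset("demo-topic", 2));
containerProps.setMessageListener((MessageListener<String, String>) record ->
        System.out.println(record.partition() + ": " + record.value()));

ConcurrentMessageListenerContainer<String, String> container =
        new ConcurrentMessageListenerContainer<>(consumerFactory, containerProps);
container.setConcurrency(5); // only 3 partitions were provided...
container.start();           // ...so doStart() logs the warning and reduces concurrency to 3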
One more thing worth mentioning about how KafkaMessageListenerContainer is created: the partitionSubset method. If concurrency is 1 it simply takes all of the partitions; the else branch applies a partition assignment of its own. As everyone knows, when multiple consumers consume multiple partitions Kafka has built-in assignment strategies, but here, because explicit partitions are handed out among one consumer's containers, a custom assignment is implemented. The logic is simple: divide the partition count by the concurrency and hand a contiguous range to each container, with the last container also taking any remainder.
private TopicPartitionInitialOffset[] partitionSubset(ContainerProperties containerProperties, int i) {
    TopicPartitionInitialOffset[] topicPartitions = containerProperties.getTopicPartitions();
    if (this.concurrency == 1) {
        return topicPartitions;
    }
    else {
        int numPartitions = topicPartitions.length;
        if (numPartitions == this.concurrency) {
            return new TopicPartitionInitialOffset[] { topicPartitions[i] };
        }
        else {
            int perContainer = numPartitions / this.concurrency;
            TopicPartitionInitialOffset[] subset;
            if (i == this.concurrency - 1) {
                subset = Arrays.copyOfRange(topicPartitions, i * perContainer, topicPartitions.length);
            }
            else {
                subset = Arrays.copyOfRange(topicPartitions, i * perContainer, (i + 1) * perContainer);
            }
            return subset;
        }
    }
}
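To see the arithmetic concretely, here is a standalone sketch that reproduces partitionSubset's range math for 5 partitions and a concurrency of 2 (plain strings stand in for TopicPartitionInitialOffset):

import java.util.Arrays;

public class PartitionSubsetDemo {
    public static void main(String[] args) {
        String[] topicPartitions = {"p0", "p1", "p2", "p3", "p4"};
        int concurrency = 2;
        int perContainer = topicPartitions.length / concurrency; // 5 / 2 = 2

        for (int i = 0; i < concurrency; i++) {
            String[] subset = (i == concurrency - 1)
                    // the last container takes the remainder as well
                    ? Arrays.copyOfRange(topicPartitions, i * perContainer, topicPartitions.length)
                    : Arrays.copyOfRange(topicPartitions, i * perContainer, (i + 1) * perContainer);
            System.out.println("container-" + i + " -> " + Arrays.toString(subset));
        }
        // prints: container-0 -> [p0, p1]   container-1 -> [p2, p3, p4]
    }
}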
That wraps up the analysis of this code. If you spot any mistakes, feel free to raise them for discussion! I will dig into more of this source code in a follow-up post.