Notes on a Kafka Consumer Throughput Optimization
The original code:
There were multiple sources: several Kafka sources and one ES (Elasticsearch) source.
1. Consumer data interface
interface Source {
    List<String> poll();
}
2. Implementation
class KafkaSource implements Source {
    public List<String> poll() {
        ConsumerRecords<String, String> records = kafkaConsumer.poll(500);
        List<String> dataList = new ArrayList<>(); // would a LinkedList be better here?
        for (ConsumerRecord<String, String> record : records) {
            String data = Adaptor.adaptor(record);
            dataList.add(data);
        }
        return dataList;
    }
}
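(On the in-code question: since dataList is append-only and iterated exactly once, ArrayList is generally the better choice here; a LinkedList would add a node allocation per element without helping this access pattern.)

The post mentions one ES source alongside the Kafka ones but never shows it. A minimal sketch of how it might fit the same interface; EsSource, esClient, and fetchNextBatch() are hypothetical names, not from the original code:

class EsSource implements Source {
    // esClient.fetchNextBatch() stands in for whatever paging/scroll
    // query the real Elasticsearch client performs
    public List<String> poll() {
        List<String> dataList = new ArrayList<>();
        for (String hit : esClient.fetchNextBatch()) {
            dataList.add(hit); // plus whatever Adaptor step the real code applies
        }
        return dataList;
    }
}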
3. The actual consumer
class Server {
    Source source;

    public Server(Source source) {
        this.source = source;
    }

    void run() {
        while (true) {
            List<String> dataList = source.poll();
            for (String data : dataList) {
                doSomething(data);
            }
        }
    }
}
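Wiring the original pipeline together then looks roughly like this (a usage sketch; the no-argument KafkaSource constructor is an assumption). Note that run() blocks forever, so each source needs its own Server loop:

Server kafkaServer = new Server(new KafkaSource());
kafkaServer.run(); // poll a batch into a List, loop over it, repeat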
Throughput: 10,000 records/s
The redesigned code (using the Visitor pattern): instead of the Server pulling a batch into a List and looping over it, each source now pushes records one at a time into a visitor callback.
1. Consumer data interface
public interface Source {
    void consume(SourceVisitor sourceVisitor);

    interface SourceVisitor {
        void accept(String data);
    }
}
2. Implementation
class KafkaSource implements Source {
    public void consume(final SourceVisitor sourceVisitor) {
        // executorService, consumer, and pollMillis are fields of KafkaSource
        executorService.submit(new Runnable() {
            @Override
            public void run() {
                while (true) {
                    ConsumerRecords<String, String> consumerRecords = consumer.poll(pollMillis);
                    for (ConsumerRecord<String, String> record : consumerRecords) {
                        // each record goes straight to the visitor; no intermediate List
                        sourceVisitor.accept(Adaptor.adapt(record.value()));
                    }
                    consumer.commitAsync();
                }
            }
        });
    }
}
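Two details are worth calling out here: the polling loop now runs on its own thread (executorService is assumed to be an executor field of KafkaSource), so consume() returns immediately and every source consumes in parallel; and because the configuration in step 4 disables auto-commit, the explicit commitAsync() after each batch is what advances the offsets, giving at-least-once delivery (a record may be reprocessed if the process dies between accept() and the commit).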
3. The actual consumer
public class Server {
    @Override
    public void start() {
        // start consuming from every source (Kafka and ES)
        for (Source source : sources) {
            source.consume(new MatchAccepter());
        }
    }

    private class MatchAccepter implements Source.SourceVisitor {
        public void accept(String data) {
            doSomething(data);
        }
    }
}
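Starting the whole pipeline is then a single call. A minimal usage sketch, assuming Server exposes a constructor that takes the sources (the original post does not show one) and that EsSource from the earlier sketch has been reworked to the new interface:

import java.util.Arrays;
import java.util.List;

List<Source> sources = Arrays.asList(new KafkaSource(), new KafkaSource(), new EsSource());
Server server = new Server(sources);
server.start(); // returns immediately; each source consumes on its own thread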
4. Adjust the Kafka consumer configuration to control how much data each fetch returns
<!-- common consumer configuration -->
<util:properties id="commonConsumerConfig">
    <prop key="enable.auto.commit">${kafka.consumer.auto.commit.enable:false}</prop>
    <prop key="request.timeout.ms">${kafka.consumer.request.timeout.ms:50000}</prop>
    <prop key="auto.commit.interval.ms">${kafka.consumer.auto.commit.interval.ms:1000}</prop>
    <prop key="max.partition.fetch.bytes">${kafka.consumer.max.partition.fetch.bytes:1000000}</prop>
    <prop key="auto.offset.reset">${kafka.consumer.auto.offset.reset:latest}</prop>
    <prop key="heartbeat.interval.ms">${kafka.consumer.heartbeat.interval.ms:25000}</prop>
    <prop key="session.timeout.ms">${kafka.consumer.session.timeout.ms:30000}</prop>
    <prop key="key.deserializer">org.apache.kafka.common.serialization.StringDeserializer</prop>
    <prop key="value.deserializer">org.apache.kafka.common.serialization.StringDeserializer</prop>
</util:properties>
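For readers not using Spring's <util:properties>, the equivalent plain-Java setup might look like the sketch below; the bootstrap servers, group id, and topic name are placeholders, not values from the original post:

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;

Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");  // placeholder
props.put("group.id", "monitor-consumers");        // placeholder
props.put("enable.auto.commit", "false");          // offsets are committed manually via commitAsync()
props.put("request.timeout.ms", "50000");
props.put("max.partition.fetch.bytes", "1000000"); // caps the bytes fetched per partition per request
props.put("auto.offset.reset", "latest");
props.put("heartbeat.interval.ms", "25000");
props.put("session.timeout.ms", "30000");
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Collections.singletonList("monitor-topic")); // placeholder topic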
Throughput: 70,000 records/s
Summary:
As you can see, the original approach, reading records out, packing them into a List, and then looping over that List, is very simple and is the first thing that comes to mind, but it was an order of magnitude less efficient than the Visitor-pattern version (the Kafka configuration changes may also account for part of the gain). After the change, each record is consumed as soon as it is read; logically, each record is handled once instead of three times, eliminating roughly two thirds of the wasted work.
For monitoring data like ours, Kafka may carry 100,000 records per second, so every little spot deserves attention, and this path is the most critical of all. After the change, throughput shot up immediately (with 2 GB of memory).
Finally: the Visitor pattern works really well!