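Kafka Parameter Effects and Performance Testing

Kafka provides two test scripts, kafka-producer-perf-test.sh and kafka-consumer-perf-test.sh. Kafka has a great many parameters: some are fine at their defaults, while others affect performance enormously, and only by testing them do you get an intuitive feel for what they do. Let's test the producer first.

First, here is how the producer script is used: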

[hdfs@namenode02 tmp]$  /opt/cloudera/parcels/KAFKA/lib/kafka/bin/kafka-producer-perf-test.sh
usage: producer-performance [-h] --topic TOPIC --num-records NUM-RECORDS --record-size RECORD-SIZE --throughput THROUGHPUT
                            --producer-props PROP-NAME=PROP-VALUE [PROP-NAME=PROP-VALUE ...]

This tool is used to verify the producer performance.

optional arguments:
  -h, --help             show this help message and exit
  --topic TOPIC          produce messages to this topic
  --num-records NUM-RECORDS
                         number of messages to produce
  --record-size RECORD-SIZE
                         message size in bytes
  --throughput THROUGHPUT
                         throttle maximum message throughput to *approximately* THROUGHPUT messages/sec
  --producer-props PROP-NAME=PROP-VALUE [PROP-NAME=PROP-VALUE ...]
                         kafka producer related configuaration properties like bootstrap.servers,client.id etc..
[hdfs@namenode02 tmp]$ 

The baseline test command is shown below. It sends 1,000,000 records of 100 bytes each:

/opt/cloudera/parcels/KAFKA/lib/kafka/bin/kafka-producer-perf-test.sh --topic jlwang --num-records 1000000 --record-size 100 --producer-props  bootstrap.servers=datanode04.isesol.com:9092 --throughput 1000000 

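Since none of the defaults were changed, the main parameters in effect are:

buffer.memory = 33554432              The message buffer: records the producer sends are placed in this buffer first.

block.on.buffer.full = false                What happens when so much is sent that the buffer fills up? Kafka frees buffer space as batches are sent out, but if that cannot keep pace with the rate of production, this parameter decides whether the send blocks or fails with an error.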

request.timeout.ms = 30000

acks = 1

retries = 0

max.request.size = 1048576

linger.ms = 0                                    

batch.size = 16384

Next we mainly test the effect of batch.size, buffer.memory, acks, and linger.ms.
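Each variation is passed as an extra PROP=VALUE pair after --producer-props. The sketch below only prints the command line for each case so it can be reviewed first. The path, topic, and broker are copied from the baseline command above, and perf_cmd is a hypothetical helper name, not part of Kafka.

```shell
# Values copied from the baseline command in this article; adjust for your cluster.
PERF=/opt/cloudera/parcels/KAFKA/lib/kafka/bin/kafka-producer-perf-test.sh
TOPIC=jlwang
BROKER=datanode04.isesol.com:9092

# perf_cmd (hypothetical helper): print the perf-test command for one set of
# producer overrides. Arguments are PROP=VALUE pairs appended after bootstrap.servers.
perf_cmd() {
  echo "$PERF --topic $TOPIC --num-records 1000000 --record-size 100" \
       "--throughput 1000000 --producer-props bootstrap.servers=$BROKER $*"
}

# One line per variation tested in this article.
perf_cmd
perf_cmd acks=all
perf_cmd acks=all linger.ms=100
perf_cmd buffer.memory=100000
perf_cmd batch.size=1 acks=1
```

To actually run one case, pipe the printed command to a shell, for example: perf_cmd acks=all | sh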


Default:

1000000 records sent, 288184.438040 records/sec (27.48 MB/sec), 574.34 ms avg latency, 918.00 ms max

acks=all :

1000000 records sent, 121212.121212 records/sec (11.56 MB/sec), 1566.87 ms avg latency, 2640.00 ms max latency

acks=all, linger.ms=100 :

1000000 records sent, 128188.693757 records/sec (12.23 MB/sec), 1506.37 ms avg latency, 1960.00 ms max latency

buffer.memory=100000 :

1000000 records sent, 66427.527567 records/sec (6.34 MB/sec), 1.06 ms avg latency, 307.00 ms max latency

batch.size=1, acks=1 :

16669 records sent, 3333.8 records/sec (0.32 MB/sec), 2587.5 ms avg latency, 4303.0 max latency.

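The run then failed with org.apache.kafka.common.errors.TimeoutException: Batch Expired. With batch.size=1 every 100-byte record is larger than the batch size, so there is effectively no batching; records were produced far faster than they could be sent, and the batches waiting in the buffer timed out.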
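When comparing several runs, it can help to reduce each result line to the three figures that matter. The following is a minimal sketch that assumes the output format shown above; summarize is a hypothetical helper name.

```shell
# summarize (hypothetical helper): read perf-test output on stdin and print
# records/sec, MB/sec, and average latency for every "records sent" line.
summarize() {
  awk '/records sent/ {
    gsub(/[(),]/, "")   # drop parentheses and commas so the fields split cleanly
    printf "%s rec/s  %s MB/s  %s ms avg\n", $4, $6, $8
  }'
}

# Two result lines copied from this article.
printf '%s\n' \
  '1000000 records sent, 288184.438040 records/sec (27.48 MB/sec), 574.34 ms avg latency, 918.00 ms max' \
  '1000000 records sent, 121212.121212 records/sec (11.56 MB/sec), 1566.87 ms avg latency, 2640.00 ms max latency' \
  | summarize
```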


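There is really no need to test further: each of these parameters has a considerable effect on overall send performance. If the send volume is large, consider raising buffer.memory, batch.size, and linger.ms, and setting acks to 1. As for how much to raise them, frankly I think doubling the values is enough; they do not need to be very large. If the send volume is modest, Kafka's defaults are quite good and will handle most systems.

One more point: record size also strongly affects the send rate, so avoid overly large records in production. In the results below I set --record-size 10000, and the record rate drops to roughly 5,000 records/sec, compared with 288,184 records/sec for 100-byte records. Note that the byte throughput, at about 45 to 54 MB/sec for most intervals, is actually higher than the 27.48 MB/sec of the baseline run.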
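As a rough illustration of the doubling suggestion, the sketch below derives doubled values from the defaults listed earlier and assembles them into a property string. The linger.ms value is an arbitrary assumption, since its default of 0 cannot be doubled, and none of these settings were measured in this article.

```shell
# Defaults listed earlier in this article.
BUFFER_MEMORY_DEFAULT=33554432
BATCH_SIZE_DEFAULT=16384

# Double the byte-sized settings using shell arithmetic.
BUFFER_MEMORY=$((BUFFER_MEMORY_DEFAULT * 2))
BATCH_SIZE=$((BATCH_SIZE_DEFAULT * 2))

# linger.ms defaults to 0, so doubling it changes nothing; 5 is an arbitrary
# illustrative value, not a figure taken from the tests in this article.
LINGER_MS=5

PROPS="acks=1 buffer.memory=$BUFFER_MEMORY batch.size=$BATCH_SIZE linger.ms=$LINGER_MS"
echo "$PROPS"
```

The resulting string can be appended after bootstrap.servers in the --producer-props argument of the test command.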

24499 records sent, 4899.8 records/sec (46.73 MB/sec), 364.1 ms avg latency, 748.0 max latency.
28500 records sent, 5700.0 records/sec (54.36 MB/sec), 346.4 ms avg latency, 742.0 max latency.
28134 records sent, 5626.8 records/sec (53.66 MB/sec), 363.0 ms avg latency, 806.0 max latency.
28037 records sent, 5607.4 records/sec (53.48 MB/sec), 362.7 ms avg latency, 821.0 max latency.
23201 records sent, 4640.2 records/sec (44.25 MB/sec), 429.9 ms avg latency, 1088.0 max latency.
17055 records sent, 3411.0 records/sec (32.53 MB/sec), 605.7 ms avg latency, 1361.0 max latency.
21415 records sent, 4283.0 records/sec (40.85 MB/sec), 490.0 ms avg latency, 1019.0 max latency.
26560 records sent, 5312.0 records/sec (50.66 MB/sec), 383.6 ms avg latency, 853.0 max latency.
23193 records sent, 4638.6 records/sec (44.24 MB/sec), 446.7 ms avg latency, 1225.0 max latency.
26156 records sent, 5231.2 records/sec (49.89 MB/sec), 387.6 ms avg latency, 1068.0 max latency.
28024 records sent, 5604.8 records/sec (53.45 MB/sec), 372.2 ms avg latency, 855.0 max latency.
27209 records sent, 5441.8 records/sec (51.90 MB/sec), 377.0 ms avg latency, 842.0 max latency.
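The MB/sec column appears to be records/sec multiplied by the record size and divided by 1024 * 1024, which makes it easy to compare runs with different record sizes. The sketch below checks that reading against two lines from this article; mb_per_sec is a hypothetical helper name.

```shell
# mb_per_sec (hypothetical helper): convert a record rate and a record size
# in bytes into MB/sec, where 1 MB = 1024 * 1024 bytes.
mb_per_sec() {
  awk -v rate="$1" -v size="$2" 'BEGIN { printf "%.2f\n", rate * size / 1048576 }'
}

mb_per_sec 288184.438040 100   # baseline run, 100-byte records
mb_per_sec 4899.8 10000        # first interval with 10000-byte records
```

Both results match the figures the tool reported (27.48 and 46.73), so the larger records move more bytes per second even though far fewer records are sent.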


I won't run specific tests for the consumer, mainly because fewer parameters matter there: receive.buffer.bytes, auto.offset.reset, max.partition.fetch.bytes, fetch.min.bytes, isolation.level, max.poll.interval.ms, request.timeout.ms

These are probably the only parameters you will actually set; the rest are rarely used.




