Producer Performance Tuning: A Formula and Its Verification

美伊小公主的奶爸

Category: Kafka. Tags: Kafka Producer, parameter tuning, source code

Article link: https://blog.csdn.net/cymvp/article/details/76020614

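This article covers performance tuning of the Kafka Producer, focusing on how send.buffer.bytes and receive.buffer.bytes affect low-latency and high-latency networks. It analyzes how acks affects the record send rate and shows why these two buffer sizes matter in high-latency networks. Experiments present performance benchmarks under different configurations, including max.in.flight.requests.per.connection, record_size, partitions, and brokers.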

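Background

The Kafka Producer has many parameters that affect its write performance. Most people find them confusing and tend to mix up parameters with similar names; even when the meaning of each parameter is clear, it is hard to know how to combine them to get high Producer performance.

By studying the source code, experimenting, and thinking it through, I arrived at a formula for Producer throughput. It contains the important Producer-side tuning parameters and gives an intuitive way to tune your own Producer, instead of blindly trying parameter combinations with no direction.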

Environment

Hardware

CPU: 32 cores, Intel(R) Xeon(R) CPU E5-2640 v2 @ 2.00GHz

Memory: 128GB

Network: 1Gb

OS: Linux version 3.10.0-514.el7.x86_64

Topic

Topic Name	Broker Count	Partition Count per Broker	Replicas
test_topic	1	16	1
test_single_partition	3	1	3
test_performance	4	1	3

Tools

The tests use the official performance testing tool, kafka-producer-perf-test.sh.

Tuning parameters

Producer parameters

batch.size

send.buffer.bytes

receive.buffer.bytes // can simply be set equal to send.buffer.bytes

acks

linger.ms

max.in.flight.requests.per.connection

buffer.memory

Parameters of the test tool

record-size

num-records

throughput
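
To make the relation between the two parameter groups concrete, here is a minimal sketch that assembles a kafka-producer-perf-test.sh command line from them. The broker address and all numeric values except buffer.memory and linger.ms are hypothetical placeholders chosen for illustration, not the settings used in the experiments.

```python
import shlex


def build_perf_test_command(topic, num_records, record_size, throughput, producer_props):
    """Assemble an argv list for kafka-producer-perf-test.sh."""
    argv = [
        "kafka-producer-perf-test.sh",
        "--topic", topic,
        "--num-records", str(num_records),
        "--record-size", str(record_size),
        "--throughput", str(throughput),
        "--producer-props",
    ]
    # Producer parameters are passed as key=value pairs after --producer-props.
    argv += [f"{key}={value}" for key, value in producer_props.items()]
    return argv


# Hypothetical values, for illustration only.
props = {
    "bootstrap.servers": "broker1:9092",  # assumed broker address
    "batch.size": 16384,
    "send.buffer.bytes": 131072,
    "receive.buffer.bytes": 131072,  # set equal to send.buffer.bytes
    "acks": 1,
    "linger.ms": 0,
    "max.in.flight.requests.per.connection": 1,
    "buffer.memory": 200 * 1024 * 1024,  # 200MB
}

# A throughput of -1 tells the tool not to throttle the send rate.
command = build_perf_test_command("test_topic", 100000, 20000, -1, props)
print(shlex.join(command))
```

Building the argument list in one place makes it easier to vary a single parameter between runs while keeping the rest fixed.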

Data Center

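To reproduce the performance problems caused by network latency and show how to tune for them, this article uses a scenario in which data is transferred between two data centers. An empirical value for the latency between the two data centers can be obtained with ping; how the actual latency is estimated is described in detail later.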

$ ping nycgmq01
PING nycgmq01.fwmrm.net (10.0.13.205) 56(84) bytes of data.
64 bytes from 10.0.13.205: icmp_seq=1 ttl=59 time=71.9 ms
64 bytes from 10.0.13.205: icmp_seq=2 ttl=59 time=71.9 ms
64 bytes from 10.0.13.205: icmp_seq=3 ttl=59 time=72.0 ms
64 bytes from 10.0.13.205: icmp_seq=4 ttl=59 time=71.8 ms
64 bytes from 10.0.13.205: icmp_seq=5 ttl=59 time=71.8 ms
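
As a small illustration of turning this ping output into a single empirical number, the sketch below averages the time= fields of the five replies shown above. The helper name mean_ping_rtt_ms is made up for this example, and the result is only the ICMP round-trip time, not the Request RTT used in the formula.

```python
import re

# The five replies from the ping output above.
PING_OUTPUT = """\
64 bytes from 10.0.13.205: icmp_seq=1 ttl=59 time=71.9 ms
64 bytes from 10.0.13.205: icmp_seq=2 ttl=59 time=71.9 ms
64 bytes from 10.0.13.205: icmp_seq=3 ttl=59 time=72.0 ms
64 bytes from 10.0.13.205: icmp_seq=4 ttl=59 time=71.8 ms
64 bytes from 10.0.13.205: icmp_seq=5 ttl=59 time=71.8 ms
"""


def mean_ping_rtt_ms(ping_output):
    """Average the 'time=... ms' samples found in ping output."""
    samples = [float(value) for value in re.findall(r"time=([\d.]+) ms", ping_output)]
    if not samples:
        raise ValueError("no RTT samples found in ping output")
    return sum(samples) / len(samples)


rtt_ms = mean_ping_rtt_ms(PING_OUTPUT)
print(f"mean ping RTT: {rtt_ms:.2f} ms")  # mean ping RTT: 71.88 ms
```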

Formula

I first state the formula for Producer performance and then explain it in detail:

if (record-size > batch.size) {
    packet_size = record-size; // no further records can be appended to the batch
} else {
    packet_size = floor(batch.size / record-size) * record-size; // fill the batch with whole records
}

request_size = packet_size * partitions_per_broker;

speed = min(max.in.flight.requests.per.connection * request_size, send.buffer.bytes) / RTT;

The variables in the formula are essentially all covered by the parameters listed above.

The parameters that appear directly in the formula are batch.size, record-size, send.buffer.bytes (receive.buffer.bytes), and max.in.flight.requests.per.connection.
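
The formula can be sketched as a small Python function to make it easier to plug in numbers. This is only a transcription of the pseudocode, assuming all sizes are in bytes and RTT is in seconds, so the result is in bytes per second. The function and argument names are my own, and the values in the usage lines are hypothetical rather than measured.

```python
def estimate_throughput(record_size, batch_size, partitions_per_broker,
                        max_in_flight, send_buffer_bytes, rtt_seconds):
    """Estimate Producer throughput (bytes/second) to one broker."""
    if record_size > batch_size:
        # A record larger than batch.size occupies a batch by itself.
        packet_size = record_size
    else:
        # Only whole records fit into a batch, hence the integer division.
        packet_size = (batch_size // record_size) * record_size

    # One request carries one batch for each partition hosted on the broker.
    request_size = packet_size * partitions_per_broker

    # Data sent per round trip is capped by the socket send buffer.
    bytes_per_round_trip = min(max_in_flight * request_size, send_buffer_bytes)
    return bytes_per_round_trip / rtt_seconds


# Hypothetical inputs: 20000-byte records, 16 KB batches, a 128 KB send buffer,
# and an assumed RTT of 72 ms.
single = estimate_throughput(20000, 16384, 1, 1, 131072, 0.072)
capped = estimate_throughput(20000, 16384, 16, 5, 131072, 0.072)
print(f"1 partition, 1 in flight:   {single:,.0f} bytes/s")
print(f"16 partitions, 5 in flight: {capped:,.0f} bytes/s")
```

In the second call the product of in-flight requests and request size exceeds the send buffer, so the buffer becomes the limiting term and raising max.in.flight.requests.per.connection further would not change the estimate.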

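Parameters that do not appear in the formula:

1 acks: this is related to the RTT in the formula and is covered in detail later.

2 buffer.memory: compared with the other parameters, this one has little effect on Producer performance, unless your Producer and Brokers share a very high-quality, very low-latency network. In the experiments in this article I fix it at 200MB so that a value that is too small does not disturb the verification of the formula.

3 linger.ms: this parameter depends on how fast the upper-layer application calls the Producer. In this article record-size is usually set larger than batch.size, so linger.ms has no practical effect, because multiple records cannot accumulate in one batch. We therefore set it to 0.

4 num-records and throughput: these two parameters simulate the total number of messages sent and the sending rate, respectively. They do not affect Producer performance.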

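How to calculate RTT

RTT appears in the figure above. The round-trip time (RTT) is the length of time it takes for a signal to be sent plus the length of time it takes for an acknowledgment of that signal to be received. This time delay therefore consists of the propagation times between the two points of a signal.

So how do we calculate RTT? If we can guarantee that every Request the Producer sends to a Broker contains exactly one Record, and we know how many Records are sent per second, then we know how many Requests are sent per second. From that we can compute the round-trip time of a single Request, which is the RTT.

To make every Request contain exactly one Record, and following the data-sending principle described in the figure above, the parameters are set up as follows:

1 Set record-size >= batch.size, so that a batch contains only one Record;

2 Create a topic with a single partition, so that a Request contains only one batch;

3 Set max.in.flight.requests.per.connection=1, so that only one Request at a time is sent to a broker.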
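
Under these three conditions each Request carries one Record and the next Request waits for the previous response, so RTT is simply the reciprocal of the observed record rate. A minimal sketch of that arithmetic follows; the rate of 12.5 records per second is a made-up input, not a measurement from this article.

```python
def rtt_from_record_rate(records_per_second):
    """RTT in seconds, assuming one Record per Request and one Request in flight."""
    if records_per_second <= 0:
        raise ValueError("records_per_second must be positive")
    # One Request completes per round trip, so requests/s equals 1 / RTT.
    return 1.0 / records_per_second


# Hypothetical rate as reported by the perf test tool.
rtt_seconds = rtt_from_record_rate(12.5)
print(f"estimated RTT: {rtt_seconds * 1000:.1f} ms")  # estimated RTT: 80.0 ms
```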

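Besides the transmission latency between the Producer and the Broker, RTT also depends on how long the Broker takes to process a Request after receiving it. I therefore list the measured RTT for the three acks values -1, 0, and 1: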