- 1. Kafka version: kafka_2.11-2.1.0-kafka-4.0.0.jar
- 2. server.properties: all tuning parameters left at their defaults
- 3. Topic: null, created with all parameters at their defaults
- 4. Loading a 1 GB txt file, with only three producer parameters set:
acks: all
batch.size: 1048576
linger.ms: 10
batch.size and linger.ms are the two parameters with the greatest impact on Kafka producer performance. batch.size is the basic unit of the producer's batched sends and defaults to 16384 bytes (16 KB); linger.ms is the expiry threshold the sender thread checks when deciding whether a batch is ready to send, and defaults to 0 ms.
So does the producer send a batch once it reaches batch.size, or once linger.ms has elapsed? The short answer: the producer sends the batch as soon as either of the two conditions is met.
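As a concrete illustration of the setup in point 4, here is a minimal sketch of a producer configuration carrying these three parameters. The bootstrap.servers address and the serializer class names are assumptions (the original post does not show its producer code), and the sketch only builds the Properties object, so no broker is needed to run it:

```java
import java.util.Properties;

public class ProducerConfigSketch {
    static Properties buildProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // hypothetical broker address
        // The three parameters from the experiment:
        props.put("acks", "all");            // wait for all in-sync replicas to ack
        props.put("batch.size", "1048576");  // 1 MB batches (default is 16384)
        props.put("linger.ms", "10");        // wait up to 10 ms for a batch to fill
        // Serializer classes are assumptions; any KafkaProducer needs such a pair.
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        return props;
    }

    public static void main(String[] args) {
        Properties p = buildProps();
        // A batch is sent as soon as EITHER it reaches batch.size bytes
        // OR linger.ms has elapsed since the first record was appended.
        System.out.println("batch.size=" + p.getProperty("batch.size")
                + " linger.ms=" + p.getProperty("linger.ms"));
    }
}
```

Passing this Properties object to `new KafkaProducer<>(props)` would reproduce the setup described in point 4.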
- 5. The load failed with:
[2019-12-18 16:50:50,325] WARN [Producer clientId=producer-1] Got error produce response in correlation id 645 on topic-partition null-0, splitting and retrying (2147483647 attempts left). Error: MESSAGE_TOO_LARGE (org.apache.kafka.clients.producer.internals.Sender)
[2019-12-18 16:50:50,359] WARN [Producer clientId=producer-1] Got error produce response in correlation id 646 on topic-partition null-0, splitting and retrying (2147483647 attempts left). Error: MESSAGE_TOO_LARGE (org.apache.kafka.clients.producer.internals.Sender)
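A side note on the WARN lines above: the "(2147483647 attempts left)" figure is simply Integer.MAX_VALUE, which to my understanding is the producer's default retries in this Kafka version; that is why the split-and-retry loop never gives up on its own:

```java
public class RetriesNote {
    public static void main(String[] args) {
        // 2147483647 in the log is Integer.MAX_VALUE; with that many retries
        // left, the producer keeps splitting and retrying indefinitely.
        System.out.println(Integer.MAX_VALUE); // prints 2147483647
    }
}
```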
- 6. Root cause (the following analysis is quoted from the upstream Kafka issue tracker):
Currently, producers do the batch splitting based on the batch size. However, the split will never succeed when the batch size is much larger than the topic-level max message size.
For instance, if the batch size is set to 8MB but we maintain the default value for broker-side `message.max.bytes` (1000012, about 1 MB), the producer will endlessly try to split a large batch but never succeed, as shown below:
[2019-05-10 16:25:09,233] WARN [Producer clientId=producer-1] Got error produce response in correlation id 61 on topic-partition test-0, splitting and retrying (2147483647 attempts left). Error: MESSAGE_TOO_LARGE (org.apache.kafka.clients.producer.internals.Sender:617)
[2019-05-10 16:25:10,021] WARN [Producer clientId=producer-1] Got error produce response in correlation id 62 on topic-partition test-0, splitting and retrying (2147483647 attempts left). Error: MESSAGE_TOO_LARGE (org.apache.kafka.clients.producer.internals.Sender:617)
[2019-05-10 16:25:10,758] WARN [Producer clientId=producer-1] Got error produce response in correlation id 63 on topic-partition test-0, splitting and retrying (2147483647 attempts left). Error: MESSAGE_TOO_LARGE (org.apache.kafka.clients.producer.internals.Sender:617)
[2019-05-10 16:25:12,071] WARN [Producer clientId=producer-1] Got error produce response in correlation id 64 on topic-partition test-0, splitting and retrying (2147483647 attempts left). Error: MESSAGE_TOO_LARGE (org.apache.kafka.clients.producer.internals.Sender:617)
A better solution is to have the producer split based on the minimum of these two configs. However, it is tricky for the client to get the topic-level or broker-level config values. There seem to be three ways to do this:
- When the broker throws `RecordTooLargeException`, do not swallow its real message, since it already contains the max message size; the client can then easily read it from the response.
- Add code to issue `DescribeConfigsRequest` to retrieve the value.
- If splitting fails, gradually decrease the batch size until the split succeeds. For example:
// In RecordAccumulator.java
private int steps = 1;
// ...
public int splitAndReenqueue(ProducerBatch bigBatch) {
    // ...
    Deque<ProducerBatch> dq = bigBatch.split(this.batchSize / steps);
    if (dq.size() == 1) { // split failed: the batch could not be broken up further
        steps++;
    }
    // ...
}
- 7. Adjusting the load parameters, with the topic's message.max.bytes left at its default (1000012):
batch.size = 1000001: load succeeded;
batch.size = 1000012: load succeeded;
batch.size = 1000112: load failed.
- 8. Open question: setting only the producer-side compression to gzip, with batch.size = 1048576 (larger than the default message.max.bytes) and everything else unchanged, MESSAGE_TOO_LARGE never occurs. I don't know why; I'd appreciate an explanation from anyone knowledgeable, thanks.
- 9. Then, with the topic's compression.type set to lz4 and the producer still using gzip with batch.size = 1048576, the load failed again.
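A plausible explanation for points 8 and 9, offered as an assumption rather than a verified answer: the broker's message.max.bytes check applies to the batch as it arrives over the wire, and plain text compresses extremely well under gzip, so even a full 1 MB batch shrinks far below the 1000012-byte limit; when the topic's compression.type (lz4 here) differs from the producer's codec, the broker has to recompress the batch, which changes the size accounting again and may reintroduce the error. The compressible-text effect itself can be demonstrated with the JDK's own GZIPOutputStream:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.zip.GZIPOutputStream;

public class GzipBatchDemo {
    // Returns the gzip-compressed size of the given payload, in bytes.
    static int gzipSize(byte[] data) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(data);
        }
        return bos.size();
    }

    public static void main(String[] args) throws IOException {
        byte[] batch = new byte[1048576];  // one full 1 MB batch, matching batch.size above
        Arrays.fill(batch, (byte) 'a');    // repetitive text is highly compressible
        int compressed = gzipSize(batch);
        System.out.println("raw=" + batch.length + " gzipped=" + compressed);
        // The gzipped size lands far below the default message.max.bytes (1000012),
        // which would let an oversized raw batch pass the broker's size check.
    }
}
```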