- 1. Kafka version: kafka_2.11-2.1.0-kafka-4.0.0.jar
- 2. server.properties: all tuning parameters left at their defaults
- 3. Topic: null, created with all parameters at their defaults
- 4. Loading a 1 GB txt file, with only three producer parameters set:
acks: all
batch.size: 1048576
linger.ms: 10
batch.size and linger.ms are the two parameters with the greatest impact on Kafka producer performance. batch.size is the basic unit of the producer's batched sends and defaults to 16384 bytes (16 KB); linger.ms is the expiry threshold the sender thread checks when deciding whether a batch is ready to send, and defaults to 0 ms.
So does the producer send a batch once it reaches batch.size, or once linger.ms has elapsed? The short answer: the producer sends the batch as soon as either of the two conditions is met.
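As a concrete illustration of the setup in point 4, here is a minimal sketch of a producer configuration carrying these three parameters. The bootstrap.servers address and the serializer class names are assumptions (the original post does not show its producer code), and the sketch only builds the Properties object, so no broker is needed to run it:

```java
import java.util.Properties;

public class ProducerConfigSketch {
    static Properties buildProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // hypothetical broker address
        // The three parameters from the experiment:
        props.put("acks", "all");            // wait for all in-sync replicas to ack
        props.put("batch.size", "1048576");  // 1 MB batches (default is 16384)
        props.put("linger.ms", "10");        // wait up to 10 ms for a batch to fill
        // Serializer classes are assumptions; any KafkaProducer needs such a pair.
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        return props;
    }

    public static void main(String[] args) {
        Properties p = buildProps();
        // A batch is sent as soon as EITHER it reaches batch.size bytes
        // OR linger.ms has elapsed since the first record was appended.
        System.out.println("batch.size=" + p.getProperty("batch.size")
                + " linger.ms=" + p.getProperty("linger.ms"));
    }
}
```

Passing this Properties object to `new KafkaProducer<>(props)` would reproduce the setup described in point 4.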
- 5. The load failed with:
[2019-12-18 16:50:50,325] WARN [Producer clientId=producer-1] Got error produce response in correlation id 645 on topic-partition null-0, splitting and retrying (2147483647 attempts left). Error: MESSAGE_TOO_LARGE (org.apache.kafka.clients.producer.internals.Sender)
[2019-12-18 16:50:50,359] WARN [Producer clientId=producer-1] Got error produce response in correlation id 646 on topic-partition null-0, splitting and retrying (2147483647 attempts left). Error: MESSAGE_TOO_LARGE (org.apache.kafka.clients.producer.internals.Sender)
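A side note on the WARN lines above: the "(2147483647 attempts left)" figure is simply Integer.MAX_VALUE, which to my understanding is the producer's default retries in this Kafka version; that is why the split-and-retry loop never gives up on its own:

```java
public class RetriesNote {
    public static void main(String[] args) {
        // 2147483647 in the log is Integer.MAX_VALUE; with that many retries
        // left, the producer keeps splitting and retrying indefinitely.
        System.out.println(Integer.MAX_VALUE); // prints 2147483647
    }
}
```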
- 6. Root cause (the following analysis is quoted from the upstream Kafka issue tracker):
Currently, producers do the batch splitting based on the batch size. However, the split will never succeed when the batch size is much larger than the topic-level max message size.
For instance, if the batch size is set to 8MB but we maintain the default value for broker-side `message.max.bytes` (1000012, about 1 MB), the producer will endlessly try to split a large batch but never succeed, as shown below:
[2019-05-10 16:25:09,233] WARN [Producer clientId=producer-1] Got error produce response in correlation id 61 on topic-partition test-0, splitting and retrying (2147483647 attempts left). Error: MESSAGE_TOO_LARGE (org.apache.kafka.clients.producer.internals.Sender:617)
[2019-05-10 16:25:10,021] WARN [Producer clientId=producer-1] Got error produce response in correlation id 62 on topic-partition test-0, splitting and retrying (2147483647 attempts left). Error: MESSAGE_TOO_LARGE (org.apache.kafka.clients.producer.internals.Sender:617)
[2019-05-10 16:25:10,758] WARN [Producer clientId=producer-1] Got error produce response in correlation id 63 on topic-partition test-0, splitting and retrying (2147483647 attempts left). Error: MESSAGE_TOO_LARGE (org.apache.kafka.clients.producer.internals.Sender:617)
[2019-05-10 16:25:12,071] WARN [Producer clientId=producer-1] Got error produce response in correlation id 64 on topic-partition test-0, splitting and retrying (2147483647 attempts left). Error: MESSAGE_TOO_LARGE (org.apache.kafka.clients.producer.internals.Sender:617)
A better solution is to have the producer split based on the minimum of these two configs. However, it is tricky for the client to get the topic-level or broker-level config values. There seem to be three ways to do this:
- When the broker throws `RecordTooLargeException`, do not swallow its real message, since it already contains the max message size; the client can then easily read it from the response.
- Add code to issue `DescribeConfigsRequest` to retrieve the value.
- If splitting fails, gradually decrease the batch size until the split succeeds. For example:
// In RecordAccumulator.java
private int steps = 1;
// ...
public int splitAndReenqueue(ProducerBatch bigBatch) {
    // ...
    Deque<ProducerBatch> dq = bigBatch.split(this.batchSize / steps);
    if (dq.size() == 1) { // split failed: the batch could not be broken up further
        steps++;
    }
    // ...
}
- 7. Adjusting the load parameters, with the topic's message.max.bytes left at its default (1000012):
batch.size = 1000001: load succeeded;
batch.size = 1000012: load succeeded;
batch.size = 1000112: load failed.
- 8. Open question: setting only the producer-side compression to gzip, with batch.size = 1048576 (larger than the default message.max.bytes) and everything else unchanged, MESSAGE_TOO_LARGE never occurs. I don't know why; I'd appreciate an explanation from anyone knowledgeable, thanks.
- 9. Then, with the topic's compression.type set to lz4 and the producer still using gzip with batch.size = 1048576, the load failed again.
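A plausible explanation for points 8 and 9, offered as an assumption rather than a verified answer: the broker's message.max.bytes check applies to the batch as it arrives over the wire, and plain text compresses extremely well under gzip, so even a full 1 MB batch shrinks far below the 1000012-byte limit; when the topic's compression.type (lz4 here) differs from the producer's codec, the broker has to recompress the batch, which changes the size accounting again and may reintroduce the error. The compressible-text effect itself can be demonstrated with the JDK's own GZIPOutputStream:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.zip.GZIPOutputStream;

public class GzipBatchDemo {
    // Returns the gzip-compressed size of the given payload, in bytes.
    static int gzipSize(byte[] data) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(data);
        }
        return bos.size();
    }

    public static void main(String[] args) throws IOException {
        byte[] batch = new byte[1048576];  // one full 1 MB batch, matching batch.size above
        Arrays.fill(batch, (byte) 'a');    // repetitive text is highly compressible
        int compressed = gzipSize(batch);
        System.out.println("raw=" + batch.length + " gzipped=" + compressed);
        // The gzipped size lands far below the default message.max.bytes (1000012),
        // which would let an oversized raw batch pass the broker's size check.
    }
}
```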