前言
本篇文章主要从以下几个方面来分析批量消息:批量消息的介绍、如何发送批量消息、批量消息在broker端时如何存储、消费者如何消费批量消息。
一、批量消息
1. 批量消息特点
批量消息有如下特点:
- 批量消息具有相同的topic
- 批量消息具有相同的waitStoreMsgOK属性
- 批量消息不支持延迟消息
- 批量消息的大小不超过4M
2.使用场景
- 提高消息的吞吐能力和处理效率
- 降低下游系统的API调用频率
3.demo示例
RocketMQ源码的example包中给出了批量消息的2个使用示例,一个是发送批量消息,一个是在发送大批量消息时通过切分消息来确保消息的大小没有超过限制。具体代码如下:
demo 1:
public class SimpleBatchProducer {
public static void main(String[] args) throws Exception {
DefaultMQProducer producer = new DefaultMQProducer("BatchProducerGroupName");
producer.start();
//If you just send messages of no more than 1MiB at a time, it is easy to use batch
//Messages of the same batch should have: same topic, same waitStoreMsgOK and no schedule support
String topic = "BatchTest";
List<Message> messages = new ArrayList<>();
messages.add(new Message(topic, "Tag", "OrderID001", "Hello world 0".getBytes()));
messages.add(new Message(topic, "Tag", "OrderID002", "Hello world 1".getBytes()));
messages.add(new Message(topic, "Tag", "OrderID003", "Hello world 2".getBytes()));
producer.send(messages);
}
}
demo 2:
public class SplitBatchProducer {
public static void main(String[] args) throws Exception {
DefaultMQProducer producer = new DefaultMQProducer("BatchProducerGroupName");
producer.start();
//large batch
String topic = "BatchTest";
List<Message> messages = new ArrayList<>(100 * 1000);
for (int i = 0; i < 100 * 1000; i++) {
messages.add(new Message(topic, "Tag", "OrderID" + i, ("Hello world " + i).getBytes()));
}
//split the large batch into small ones:
ListSplitter splitter = new ListSplitter(messages);
while (splitter.hasNext()) {
List<Message> listItem = splitter.next();
producer.send(listItem);
}
}
}
class ListSplitter implements Iterator<List<Message>> {
private int sizeLimit = 1000 * 1000;
private final List<Message> messages;
private int currIndex;
public ListSplitter(List<Message> messages) {
this.messages = messages;
}
@Override
public boolean hasNext() {
return currIndex < messages.size();
}
@Override
public List<Message> next() {
int nextIndex = currIndex;
int totalSize = 0;
for (; nextIndex < messages.size(); nextIndex++) {
Message message = messages.get(nextIndex);
int tmpSize = message.getTopic().length() + message.getBody().length;
Map<String, String> properties = message.getProperties();
for (Map.Entry<String, String> entry : properties.entrySet()) {
tmpSize += entry.getKey().length() + entry.getValue().length();
}
tmpSize = tmpSize + 20; //for log overhead
if (tmpSize > sizeLimit) {
//it is unexpected that single message exceeds the sizeLimit
//here just let it go, otherwise it will block the splitting process
if (nextIndex - currIndex == 0) {
//if the next sublist has no element, add this one and then break, otherwise just break
nextIndex++;
}
break;
}
if (tmpSize + totalSize > sizeLimit) {
break;
} else {
totalSize += tmpSize;
}
}
List<Message> subList = messages.subList(currIndex, nextIndex);
currIndex = nextIndex;
return subList;
}
@Override
public void remove() {
throw new UnsupportedOperationException("Not allowed to remove");
}
}
二、批量消息发送
批量消息发送的过程与普通消息发送基本一致,具体发送详情可以参考笔者之前分享的文章RocketMQ源码分析之普通消息发送,此外需要注意目前批量消息的发送方式只支持同步和异步,没有单向发送。
批量消息发送流程虽然与普通消息发送基本一致,但是在发送前会先对消息进行batch操作,即将消息从Collection转换为MessageBatch,这里我们先简单看下MessageBatch对象,可以发现MessageBatch继承了Message并且它还包含一个属性List messages(存储批量消息的容器)
batch包含以下操作:
1.通过调用generateFromList方法将批量消息从Collection转换为MessageBatch
2.遍历MessageBatch对象,并对每条消息做以下操作:
- 检查消息的topic以及消息
- 为每条消息构建UNIQ_KEY,并将该kv存储在其properties属性中
- 为每条消息设置topic
3.将MessageBatch对象encode并将其结果设置到MessageBatch对象的body属性中
private MessageBatch batch(Collection<Message> msgs) throws MQClientException {
MessageBatch msgBatch;
try {
msgBatch = MessageBatch.generateFromList(msgs);
for (Message message : msgBatch) {
Validators.checkMessage(message, this);
MessageClientIDSetter.setUniqID(message);
message.setTopic(withNamespace(message.getTopic()));
}
msgBatch.setBody(msgBatch.encode());
} catch (Exception e) {
throw new MQClientException("Failed to initiate the MessageBatch", e);
}
msgBatch.setTopic(withNamespace(msgBatch.getTopic()));
return msgBatch;
}
下面重点分析generateFromList方法和encode方法:
1.generateFromList
generateFromList方法的主要操作如下:
- 遍历messages,检查每条消息的DELAY级别是否大于0以及消息的topic是否以%RETRY%开头,如果是则抛出异常UnsupportedOperationException,从这里可以看出批量消息不支持延迟消息以及Retry Group。此外还会检查每条消息的topic和waitStoreMsgOK是否一致,如果不一致也会抛出异常UnsupportedOperationException
- 使用循环获取的messageList构建MessageBatch对象,并设置topic和waitStoreMsgOK
public static MessageBatch generateFromList(Collection<Message> messages) {
assert messages != null;
assert messages.size() > 0;
List<Message> messageList = new ArrayList<Message>(messages.size());
Message first = null;
for (Message message : messages) {
if (message.getDelayTimeLevel() > 0) {
throw new UnsupportedOperationException("TimeDelayLevel is not supported for batching");
}
if (message.getTopic().startsWith(MixAll.RETRY_GROUP_TOPIC_PREFIX)) {
throw new UnsupportedOperationException("Retry Group is not supported for batching");
}
if (first == null) {
first = message;
} else {
if (!first.getTopic().equals(message.getTopic())) {
throw new UnsupportedOperationException("The topic of the messages in one batch should be the same");
}
if (first.isWaitStoreMsgOK() != message.isWaitStoreMsgOK()) {
throw new UnsupportedOperationException("The waitStoreMsgOK of the messages in one batch should the same");
}
}
messageList.add(message);
}
MessageBatch messageBatch = new MessageBatch(messageList);
messageBatch.setTopic(first.getTopic());
messageBatch.setWaitStoreMsgOK(first.isWaitStoreMsgOK());
return messageBatch;
}
2.encode
encode方法的实现是在encodeMessages中,该方法会遍历所有的消息并对每条消息进行编码,编码只需要消息的flag、body和properties属性,每条消息会按照以下格式进行编码,最后将编码后的结果转换成字节数组,这样就会获取到一个包含所有消息的字节数组的列表,最后将该字节数组的列表拷贝到一个字节数组中。这里为什么要进行这一系列的操作呢,原因是MessageBatch继承了Message,其body属性是byte[]。
public byte[] encode() {
return MessageDecoder.encodeMessages(messages);
}
public static byte[] encodeMessages(List<Message> messages) {
//TO DO refactor, accumulate in one buffer, avoid copies
List<byte[]> encodedMessages = new ArrayList<byte[]>(messages.size());
int allSize = 0;
for (Message message : messages) {
byte[] tmp = encodeMessage(message);
encodedMessages.add(tmp);
allSize += tmp.length;
}
byte[] allBytes = new byte[allSize];
int pos = 0;
for (byte[] bytes : encodedMessages) {
System.arraycopy(bytes, 0, allBytes, pos, bytes.length);
pos += bytes.length;
}
return allBytes;
}
public static byte[] encodeMessage(Message message) {
//only need flag, body, properties
byte[] body = message.getBody();
int bodyLen = body.length;
String properties = messageProperties2String(message.getProperties());
byte[] propertiesBytes = properties.getBytes(CHARSET_UTF8);
//note properties length must not more than Short.MAX
short propertiesLength = (short) propertiesBytes.length;
int sysFlag = message.getFlag();
int storeSize = 4 // 1 TOTALSIZE
+ 4 // 2 MAGICCOD
+ 4 // 3 BODYCRC
+ 4 // 4 FLAG
+ 4 + bodyLen // 4 BODY
+ 2 + propertiesLength;
ByteBuffer byteBuffer = ByteBuffer.allocate(storeSize);
// 1 TOTALSIZE
byteBuffer.putInt(storeSize);
// 2 MAGICCODE
byteBuffer.putInt(0);
// 3 BODYCRC
byteBuffer.putInt(0);
// 4 FLAG
int flag = message.getFlag();
byteBuffer.putInt(flag);
// 5 BODY
byteBuffer.putInt(bodyLen);
byteBuffer.put(body);
// 6 properties
byteBuffer.putShort(propertiesLength);
byteBuffer.put(propertiesBytes);
return byteBuffer.array();
}
三、批量消息存储
producer向broker发送存储批量消息的请求类型是:RequestCode.SEND_BATCH_MESSAGE,broker处理该请求的processor是SendMessageProcessor,SendMessageProcessor在处理请求时会先判断该请求是普通消息还是批量消息,判断方法是请求的requestHeader的batch属性表示是否是批量消息,true表示是批量消息,false表示是普通消息,批量消息会调用asyncSendBatchMessage方法来处理。
private CompletableFuture<RemotingCommand> asyncSendBatchMessage(ChannelHandlerContext ctx, RemotingCommand request,
SendMessageContext mqtraceContext,
SendMessageRequestHeader requestHeader) {
final RemotingCommand response = preSend(ctx, request, requestHeader);
final SendMessageResponseHeader responseHeader = (SendMessageResponseHeader)response.readCustomHeader();
if (response.getCode() != -1) {
return CompletableFuture.completedFuture(response);
}
int queueIdInt = requestHeader.getQueueId();
TopicConfig topicConfig = this.brokerController.getTopicConfigManager().selectTopicConfig(requestHeader.getTopic());
if (queueIdInt < 0) {
queueIdInt = randomQueueId(topicConfig.getWriteQueueNums());
}
if (requestHeader.getTopic().length() > Byte.MAX_VALUE) {
response.setCode(ResponseCode.MESSAGE_ILLEGAL);
response.setRemark("message topic length too long " + requestHeader.getTopic().length());
return CompletableFuture.completedFuture(response);
}
MessageExtBatch messageExtBatch = new MessageExtBatch();
messageExtBatch.setTopic(requestHeader.getTopic());
messageExtBatch.setQueueId(queueIdInt);
int sysFlag = requestHeader.getSysFlag();
if (TopicFilterType.MULTI_TAG == topicConfig.getTopicFilterType()) {
sysFlag |= MessageSysFlag.MULTI_TAGS_FLAG;
}
messageExtBatch.setSysFlag(sysFlag);
messageExtBatch.setFlag(requestHeader.getFlag());
MessageAccessor.setProperties(messageExtBatch, MessageDecoder.string2messageProperties(requestHeader.getProperties()));
messageExtBatch.setBody(request.getBody());
messageExtBatch.setBornTimestamp(requestHeader.getBornTimestamp());
messageExtBatch.setBornHost(ctx.channel().remoteAddress());
messageExtBatch.setStoreHost(this.getStoreHost());
messageExtBatch.setReconsumeTimes(requestHeader.getReconsumeTimes() == null ? 0 : requestHeader.getReconsumeTimes());
String clusterName = this.brokerController.getBrokerConfig().getBrokerClusterName();
MessageAccessor.putProperty(messageExtBatch, MessageConst.PROPERTY_CLUSTER, clusterName);
CompletableFuture<PutMessageResult> putMessageResult = this.brokerController.getMessageStore().asyncPutMessages(messageExtBatch);
return handlePutMessageResultFuture(putMessageResult, response, request, messageExtBatch, responseHeader, mqtraceContext, ctx, queueIdInt);
}
这里可以看到broker端会根据请求中的信息来构建MessageExtBatch对象,在构建的同时会设置broker的地址以及集群名称,broker端存储批量消息的过程与存储普通消息类似,不同之处在于对批量消息数据的组织。下面就看看不同之处。
1.在对批量消息存储前会对MessageExtBatch对象进行encode操作,并将其结果设置到MessageExtBatch对象的encodedBuff属性。这里encode方法完成的功能是将批量消息中的所有消息按照以下格式进行组织,与普通消息所不同的是在properties这里需要将该条消息的properties和批量消息的properties都写进去,此外还需要注意:在这一步中并没有确定消息的QUEUEOFFSET和PHYSICALOFFSET,所以这两个属性值都先填充为0
messageExtBatch.setEncodedBuff(batchEncoder.encode(messageExtBatch));
public ByteBuffer encode(final MessageExtBatch messageExtBatch) {
msgBatchMemory.clear(); //not thread-safe
int totalMsgLen = 0;
ByteBuffer messagesByteBuff = messageExtBatch.wrap();
int sysFlag = messageExtBatch.getSysFlag();
int bornHostLength = (sysFlag & MessageSysFlag.BORNHOST_V6_FLAG) == 0 ? 4 + 4 : 16 + 4;
int storeHostLength = (sysFlag & MessageSysFlag.STOREHOSTADDRESS_V6_FLAG) == 0 ? 4 + 4 : 16 + 4;
ByteBuffer bornHostHolder = ByteBuffer.allocate(bornHostLength);
ByteBuffer storeHostHolder = ByteBuffer.allocate(storeHostLength);
// properties from MessageExtBatch
String batchPropStr = MessageDecoder.messageProperties2String(messageExtBatch.getProperties());
final byte[] batchPropData = batchPropStr.getBytes(MessageDecoder.CHARSET_UTF8);
final short batchPropLen = (short) batchPropData.length;
while (messagesByteBuff.hasRemaining()) {
// 1 TOTALSIZE
messagesByteBuff.getInt();
// 2 MAGICCODE
messagesByteBuff.getInt();
// 3 BODYCRC
messagesByteBuff.getInt();
// 4 FLAG
int flag = messagesByteBuff.getInt();
// 5 BODY
int bodyLen = messagesByteBuff.getInt();
int bodyPos = messagesByteBuff.position();
int bodyCrc = UtilAll.crc32(messagesByteBuff.array(), bodyPos, bodyLen);
messagesByteBuff.position(bodyPos + bodyLen);
// 6 properties
short propertiesLen = messagesByteBuff.getShort();
int propertiesPos = messagesByteBuff.position();
messagesByteBuff.position(propertiesPos + propertiesLen);
final byte[] topicData = messageExtBatch.getTopic().getBytes(MessageDecoder.CHARSET_UTF8);
final int topicLength = topicData.length;
final int msgLen = calMsgLength(messageExtBatch.getSysFlag(), bodyLen, topicLength,
propertiesLen + batchPropLen);
// Exceeds the maximum message
if (msgLen > this.maxMessageSize) {
CommitLog.log.warn("message size exceeded, msg total size: " + msgLen + ", msg body size: " + bodyLen
+ ", maxMessageSize: " + this.maxMessageSize);
throw new RuntimeException("message size exceeded");
}
totalMsgLen += msgLen;
// Determines whether there is sufficient free space
if (totalMsgLen > maxMessageSize) {
throw new RuntimeException("message size exceeded");
}
// 1 TOTALSIZE
this.msgBatchMemory.putInt(msgLen);
// 2 MAGICCODE
this.msgBatchMemory.putInt(CommitLog.MESSAGE_MAGIC_CODE);
// 3 BODYCRC
this.msgBatchMemory.putInt(bodyCrc);
// 4 QUEUEID
this.msgBatchMemory.putInt(messageExtBatch.getQueueId());
// 5 FLAG
this.msgBatchMemory.putInt(flag);
// 6 QUEUEOFFSET
this.msgBatchMemory.putLong(0);
// 7 PHYSICALOFFSET
this.msgBatchMemory.putLong(0);
// 8 SYSFLAG
this.msgBatchMemory.putInt(messageExtBatch.getSysFlag());
// 9 BORNTIMESTAMP
this.msgBatchMemory.putLong(messageExtBatch.getBornTimestamp());
// 10 BORNHOST
this.resetByteBuffer(bornHostHolder, bornHostLength);
this.msgBatchMemory.put(messageExtBatch.getBornHostBytes(bornHostHolder));
// 11 STORETIMESTAMP
this.msgBatchMemory.putLong(messageExtBatch.getStoreTimestamp());
// 12 STOREHOSTADDRESS
this.resetByteBuffer(storeHostHolder, storeHostLength);
this.msgBatchMemory.put(messageExtBatch.getStoreHostBytes(storeHostHolder));
// 13 RECONSUMETIMES
this.msgBatchMemory.putInt(messageExtBatch.getReconsumeTimes());
// 14 Prepared Transaction Offset, batch does not support transaction
this.msgBatchMemory.putLong(0);
// 15 BODY
this.msgBatchMemory.putInt(bodyLen);
if (bodyLen > 0)
this.msgBatchMemory.put(messagesByteBuff.array(), bodyPos, bodyLen);
// 16 TOPIC
this.msgBatchMemory.put((byte) topicLength);
this.msgBatchMemory.put(topicData);
// 17 PROPERTIES
this.msgBatchMemory.putShort((short) (propertiesLen + batchPropLen));
if (propertiesLen > 0) {
this.msgBatchMemory.put(messagesByteBuff.array(), propertiesPos, propertiesLen);
}
if (batchPropLen > 0) {
this.msgBatchMemory.put(batchPropData, 0, batchPropLen);
}
}
msgBatchMemory.flip();
return msgBatchMemory;
}
2.在进行消息写入时,首先会遍历MessageExtBatch对象的encodedBuff属性,计算出每条消息的QUEUEOFFSET和PHYSICALOFFSET并将计算结果写入对应属性,在遍历的过程中会判断当前commitlog剩余空间是否能够容纳本批次的所有消息,如果不能则写入空消息。最后在遍历完encodedBuff后将encodedBuff写入commitlog中,到这里我们可以看出批量消息是不能跨文件且在存储时被拆分为一条条进行存储。
public AppendMessageResult doAppend(final long fileFromOffset, final ByteBuffer byteBuffer, final int maxBlank,
final MessageExtBatch messageExtBatch) {
byteBuffer.mark();
//physical offset
long wroteOffset = fileFromOffset + byteBuffer.position();
// Record ConsumeQueue information
keyBuilder.setLength(0);
keyBuilder.append(messageExtBatch.getTopic());
keyBuilder.append('-');
keyBuilder.append(messageExtBatch.getQueueId());
String key = keyBuilder.toString();
Long queueOffset = CommitLog.this.topicQueueTable.get(key);
if (null == queueOffset) {
queueOffset = 0L;
CommitLog.this.topicQueueTable.put(key, queueOffset);
}
long beginQueueOffset = queueOffset;
int totalMsgLen = 0;
int msgNum = 0;
msgIdBuilder.setLength(0);
final long beginTimeMills = CommitLog.this.defaultMessageStore.now();
ByteBuffer messagesByteBuff = messageExtBatch.getEncodedBuff();
int sysFlag = messageExtBatch.getSysFlag();
int storeHostLength = (sysFlag & MessageSysFlag.STOREHOSTADDRESS_V6_FLAG) == 0 ? 4 + 4 : 16 + 4;
ByteBuffer storeHostHolder = ByteBuffer.allocate(storeHostLength);
this.resetByteBuffer(storeHostHolder, storeHostLength);
ByteBuffer storeHostBytes = messageExtBatch.getStoreHostBytes(storeHostHolder);
messagesByteBuff.mark();
while (messagesByteBuff.hasRemaining()) {
// 1 TOTALSIZE
final int msgPos = messagesByteBuff.position();
final int msgLen = messagesByteBuff.getInt();
final int bodyLen = msgLen - 40; //only for log, just estimate it
// Exceeds the maximum message
if (msgLen > this.maxMessageSize) {
CommitLog.log.warn("message size exceeded, msg total size: " + msgLen + ", msg body size: " + bodyLen
+ ", maxMessageSize: " + this.maxMessageSize);
return new AppendMessageResult(AppendMessageStatus.MESSAGE_SIZE_EXCEEDED);
}
totalMsgLen += msgLen;
// Determines whether there is sufficient free space
if ((totalMsgLen + END_FILE_MIN_BLANK_LENGTH) > maxBlank) {
this.resetByteBuffer(this.msgStoreItemMemory, 8);
// 1 TOTALSIZE
this.msgStoreItemMemory.putInt(maxBlank);
// 2 MAGICCODE
this.msgStoreItemMemory.putInt(CommitLog.BLANK_MAGIC_CODE);
// 3 The remaining space may be any value
//ignore previous read
messagesByteBuff.reset();
// Here the length of the specially set maxBlank
byteBuffer.reset(); //ignore the previous appended messages
byteBuffer.put(this.msgStoreItemMemory.array(), 0, 8);
return new AppendMessageResult(AppendMessageStatus.END_OF_FILE, wroteOffset, maxBlank, msgIdBuilder.toString(), messageExtBatch.getStoreTimestamp(),
beginQueueOffset, CommitLog.this.defaultMessageStore.now() - beginTimeMills);
}
//move to add queue offset and commitlog offset
messagesByteBuff.position(msgPos + 20);
messagesByteBuff.putLong(queueOffset);
messagesByteBuff.putLong(wroteOffset + totalMsgLen - msgLen);
storeHostBytes.rewind();
String msgId;
if ((sysFlag & MessageSysFlag.STOREHOSTADDRESS_V6_FLAG) == 0) {
msgId = MessageDecoder.createMessageId(this.msgIdMemory, storeHostBytes, wroteOffset + totalMsgLen - msgLen);
} else {
msgId = MessageDecoder.createMessageId(this.msgIdV6Memory, storeHostBytes, wroteOffset + totalMsgLen - msgLen);
}
if (msgIdBuilder.length() > 0) {
msgIdBuilder.append(',').append(msgId);
} else {
msgIdBuilder.append(msgId);
}
queueOffset++;
msgNum++;
messagesByteBuff.position(msgPos + msgLen);
}
messagesByteBuff.position(0);
messagesByteBuff.limit(totalMsgLen);
byteBuffer.put(messagesByteBuff);
messageExtBatch.setEncodedBuff(null);
AppendMessageResult result = new AppendMessageResult(AppendMessageStatus.PUT_OK, wroteOffset, totalMsgLen, msgIdBuilder.toString(),
messageExtBatch.getStoreTimestamp(), beginQueueOffset, CommitLog.this.defaultMessageStore.now() - beginTimeMills);
result.setMsgNum(msgNum);
CommitLog.this.topicQueueTable.put(key, queueOffset);
return result;
}
由于批量消息在存储时会将其拆分成一条条存储,所以和普通消息存储没有区别,消费者在消费批量消息时和普通消息一样。消费消息可以参考笔者之前的学习笔记:RocketMQ源码分析之consumer并发消费信息、RocketMQ源码分析之消息拉取流程、RocketMQ源码分析之RebalanceService