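As a message queue, RocketMQ needs an efficient mechanism for storing messages. The store module is RocketMQ's message storage module, and this article walks through the core flow by which it stores a message.
I. Core classes of message storage
DefaultMessageStore is the core class of message storage and its entry point. It is defined as follows: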
public class DefaultMessageStore implements MessageStore {
/**
* Message filter
*/
private final MessageFilter messageFilter = new DefaultMessageFilter();
/**
* MessageStore configuration
*/
private final MessageStoreConfig messageStoreConfig;
/**
* CommitLog
*/
private final CommitLog commitLog;
/**
* Consume queue table
*/
private final ConcurrentHashMap<String/* topic */, ConcurrentHashMap<Integer/* queueId */, ConsumeQueue>> consumeQueueTable;
/**
* Thread service that flushes the consume queues
*/
private final FlushConsumeQueueService flushConsumeQueueService;
private final CleanCommitLogService cleanCommitLogService;
private final CleanConsumeQueueService cleanConsumeQueueService;
/**
* Index service implementation
*/
private final IndexService indexService;
private final AllocateMappedFileService allocateMappedFileService;
/**
* Thread service that replays (reputs) messages
*/
@SuppressWarnings("SpellCheckingInspection")
private final ReputMessageService reputMessageService;
private final HAService haService;
private final ScheduleMessageService scheduleMessageService;
private final StoreStatsService storeStatsService;
private final TransientStorePool transientStorePool;
private final RunningFlags runningFlags = new RunningFlags();
private final SystemClock systemClock = new SystemClock();
private final ScheduledExecutorService scheduledExecutorService =
Executors.newSingleThreadScheduledExecutor(new ThreadFactoryImpl("StoreScheduledThread"));
private final BrokerStatsManager brokerStatsManager;
private final MessageArrivingListener messageArrivingListener;
private final BrokerConfig brokerConfig;
private volatile boolean shutdown = true;
private StoreCheckpoint storeCheckpoint;
private AtomicLong printTimes = new AtomicLong(0);
}
II. Message storage flow
In DefaultMessageStore, messages are written through the putMessage method, shown below:
Code: DefaultMessageStore#putMessage
public PutMessageResult putMessage(MessageExtBrokerInner msg) {
......
// Writes are not allowed on a slave node
if (BrokerRole.SLAVE == this.messageStoreConfig.getBrokerRole()) {
long value = this.printTimes.getAndIncrement();
if ((value % 50000) == 0) {
log.warn("message store is slave mode, so putMessage is forbidden ");
}
return new PutMessageResult(PutMessageStatus.SERVICE_NOT_AVAILABLE, null);
}
// Check whether the store currently accepts writes
if (!this.runningFlags.isWriteable()) {
long value = this.printTimes.getAndIncrement();
if ((value % 50000) == 0) {
log.warn("message store is not writeable, so putMessage is forbidden " + this.runningFlags.getFlagBits());
}
return new PutMessageResult(PutMessageStatus.SERVICE_NOT_AVAILABLE, null);
} else {
this.printTimes.set(0);
}
// Topic is too long
if (msg.getTopic().length() > Byte.MAX_VALUE) {
log.warn("putMessage message topic length too long " + msg.getTopic().length());
return new PutMessageResult(PutMessageStatus.MESSAGE_ILLEGAL, null);
}
// Message properties are too long
if (msg.getPropertiesString() != null && msg.getPropertiesString().length() > Short.MAX_VALUE) {
log.warn("putMessage message properties length too long " + msg.getPropertiesString().length());
return new PutMessageResult(PutMessageStatus.PROPERTIES_SIZE_EXCEEDED, null);
}
if (this.isOSPageCacheBusy()) {
return new PutMessageResult(PutMessageStatus.OS_PAGECACHE_BUSY, null);
}
long beginTime = this.getSystemClock().now();
// Append the message to the commitLog
PutMessageResult result = this.commitLog.putMessage(msg);
long eclipseTime = this.getSystemClock().now() - beginTime;
if (eclipseTime > 500) {
log.warn("putMessage not in lock eclipse time(ms)={}, bodyLength={}", eclipseTime, msg.getBody().length);
}
this.storeStatsService.setPutMessageEntireTimeMax(eclipseTime);
if (null == result || !result.isOk()) {
this.storeStatsService.getPutMessageFailedTimes().incrementAndGet();
}
return result;
}
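The guard checks at the top of putMessage can be condensed into a small standalone sketch. The class PutGuardSketch, the enum Status, and the method check are hypothetical names introduced only for illustration. Only the order of the checks and the two thresholds (Byte.MAX_VALUE for the topic length, Short.MAX_VALUE for the properties length) are taken from the listing above; the throttled logging and the page cache check are left out.

```java
class PutGuardSketch {
    // Hypothetical stand-in for the PutMessageStatus values used in the listing
    enum Status { OK, SERVICE_NOT_AVAILABLE, MESSAGE_ILLEGAL, PROPERTIES_SIZE_EXCEEDED }

    // Applies the checks in the same order as the listing: role, writable flag,
    // topic length, properties length
    static Status check(boolean slave, boolean writeable, String topic, String properties) {
        if (slave || !writeable) {
            return Status.SERVICE_NOT_AVAILABLE;
        }
        if (topic.length() > Byte.MAX_VALUE) {
            return Status.MESSAGE_ILLEGAL;
        }
        if (properties != null && properties.length() > Short.MAX_VALUE) {
            return Status.PROPERTIES_SIZE_EXCEEDED;
        }
        return Status.OK;
    }

    // Builds a string of n 'x' characters without relying on newer JDK helpers
    static String filler(int n) {
        return new String(new char[n]).replace('\0', 'x');
    }

    public static void main(String[] args) {
        System.out.println(check(false, true, "TopicTest", null));  // OK
        System.out.println(check(true, true, "TopicTest", null));   // SERVICE_NOT_AVAILABLE
        System.out.println(check(false, true, filler(128), null));  // MESSAGE_ILLEGAL
    }
}
```

The topic limit is Byte.MAX_VALUE (127) because the topic length is later stored in a single byte, and the properties limit is Short.MAX_VALUE (32767) because the properties length is stored in a short.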
As the code above shows, the write is ultimately delegated to CommitLog's putMessage method:
Code: CommitLog#putMessage
/**
* Appends a message and returns the result
*
* @param msg the message
* @return the result
*/
public PutMessageResult putMessage(final MessageExtBrokerInner msg) {
......
// Get the mapped file to write to
MappedFile unlockMappedFile = null;
MappedFile mappedFile = this.mappedFileQueue.getLastMappedFile();
// Acquire the append lock so that only one thread at a time can put data
lockForPutMessage(); //spin...
try {
long beginLockTimestamp = this.defaultMessageStore.getSystemClock().now();
this.beginTimeInLock = beginLockTimestamp;
// Here settings are stored timestamp, in order to ensure an orderly
// global
msg.setStoreTimestamp(beginLockTimestamp);
// Create a new mapped file if none exists or the current one is full
if (null == mappedFile || mappedFile.isFull()) {
mappedFile = this.mappedFileQueue.getLastMappedFile(0); // Mark: NewFile may be cause noise
}
if (null == mappedFile) {
log.error("create maped file1 error, topic: " + msg.getTopic() + " clientAddr: " + msg.getBornHostString());
beginTimeInLock = 0;
return new PutMessageResult(PutMessageStatus.CREATE_MAPEDFILE_FAILED, null);
}
// Append the message to the MappedFile's MappedByteBuffer/writeBuffer and advance its wrotePosition; nothing has been committed or flushed yet
result = mappedFile.appendMessage(msg, this.appendMessageCallback);
......
}
......
return putMessageResult;
}
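The `//spin...` comment on lockForPutMessage suggests a spin lock. The body of lockForPutMessage is not shown in this article, so the following is only a minimal sketch, assuming the lock is a compare-and-set loop on an AtomicBoolean in which true means free and false means held. The class SpinLockSketch and its methods are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class SpinLockSketch {
    // Assumed convention: true = free (can lock), false = held (in lock)
    private final AtomicBoolean free = new AtomicBoolean(true);

    void lock() {
        // Spin until this thread is the one that flips the flag from true to false
        while (!free.compareAndSet(true, false)) {
            Thread.yield();
        }
    }

    void unlock() {
        free.compareAndSet(false, true);
    }

    // Four threads increment a plain int under the lock; returns the final count
    static int runDemo() {
        final SpinLockSketch lock = new SpinLockSketch();
        final int[] counter = {0};
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 10_000; j++) {
                    lock.lock();
                    try {
                        counter[0]++;
                    } finally {
                        lock.unlock();
                    }
                }
            });
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter[0];
    }

    public static void main(String[] args) {
        System.out.println(runDemo()); // 40000
    }
}
```

A spin lock of this kind suits a critical section that is very short, such as copying one message into a buffer, because waiting threads avoid the cost of being parked and woken up.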
CommitLog holds the actual message content. It is defined as follows:
public class CommitLog {
/**
* MAGIC_CODE - MESSAGE
* Message's MAGIC CODE daa320a7
* Marks a segment as a message, i.e. [totalSize, MESSAGE_MAGIC_CODE, message]
*/
public final static int MESSAGE_MAGIC_CODE = 0xAABBCCDD ^ 1880681586 + 8;
/**
* MAGIC_CODE - BLANK
* End of file empty MAGIC CODE cbd43194
* Marks a segment as blank, i.e. [totalSize, BLANK_MAGIC_CODE, blank]
* Used as the tail entry when the CommitLog file cannot hold the next message
*/
private final static int BLANK_MAGIC_CODE = 0xBBCCDDEE ^ 1880681586 + 8;
/**
* Mapped file queue
*/
private final MappedFileQueue mappedFileQueue;
/**
* Message store
*/
private final DefaultMessageStore defaultMessageStore;
/**
* Thread service that flushes the commitLog
*/
private final FlushCommitLogService flushCommitLogService;
/**
* If TransientStorePool enabled, we must flush message to FileChannel at fixed periods
* Thread service that commits the commitLog
*/
private final FlushCommitLogService commitLogService;
/**
* Callback that writes a message into the buffer
*/
private final AppendMessageCallback appendMessageCallback;
/**
* Map from topic-queueId to queue offset
*/
private HashMap<String/* topic-queue_id */, Long/* offset */> topicQueueTable = new HashMap<>(1024);
/**
* TODO
*/
private volatile long confirmOffset = -1L;
/**
* Time at which the lock was acquired.
* 0 when the lock is not held
*/
private volatile long beginTimeInLock = 0;
/**
* true: Can lock, false : in lock.
* Spin lock for putting messages (implemented with a while loop)
*/
private AtomicBoolean putMessageSpinLock = new AtomicBoolean(true);
/**
* Reentrant lock for putting messages
*/
private ReentrantLock putMessageNormalLock = new ReentrantLock(); // Non fair Sync
}
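The two magic code expressions are easy to misread, because in Java the `+` operator binds more tightly than `^`, so each expression is evaluated as `0x... ^ (1880681586 + 8)`. The short check below reproduces the hexadecimal values quoted in the comments of the listing; the class name MagicCodeSketch is a hypothetical name used only for this illustration.

```java
class MagicCodeSketch {
    // Same expressions as in the CommitLog listing above
    static final int MESSAGE_MAGIC_CODE = 0xAABBCCDD ^ 1880681586 + 8;
    static final int BLANK_MAGIC_CODE = 0xBBCCDDEE ^ 1880681586 + 8;

    // What the message constant would be if ^ were applied before +
    static final int XOR_FIRST = (0xAABBCCDD ^ 1880681586) + 8;

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(MESSAGE_MAGIC_CODE)); // daa320a7
        System.out.println(Integer.toHexString(BLANK_MAGIC_CODE));   // cbd43194
        System.out.println(Integer.toHexString(XOR_FIRST));          // daa320b7
    }
}
```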
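In CommitLog, MappedFileQueue holds a list of MappedFile objects named mappedFiles, and its getLastMappedFile method creates a MappedFile on demand. MappedFile is the class that actually holds the message data, and every write ends up going through MappedFile's appendMessage method.
Code: MappedFile#appendMessage
/**
* Appends a message to the file.
* In practice the message is written into the mapped file's buffer
*
* @param msg the message
* @param cb the callback that performs the write
* @return the append result
*/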
public AppendMessageResult appendMessage(final MessageExtBrokerInner msg, final AppendMessageCallback cb) {
assert msg != null;
assert cb != null;
int currentPos = this.wrotePosition.get();
if (currentPos < this.fileSize) {
ByteBuffer byteBuffer = writeBuffer != null ? writeBuffer.slice() : this.mappedByteBuffer.slice();
byteBuffer.position(currentPos);
AppendMessageResult result = cb.doAppend(this.getFileFromOffset(), byteBuffer, this.fileSize - currentPos, msg);
this.wrotePosition.addAndGet(result.getWroteBytes());
this.storeTimestamp = result.getStoreTimestamp();
return result;
}
log.error("MappedFile.appendMessage return null, wrotePosition: " + currentPos + " fileSize: "
+ this.fileSize);
return new AppendMessageResult(AppendMessageStatus.UNKNOWN_ERROR);
}
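The core of appendMessage is the combination of a shared write position and a slice of the underlying buffer. The sketch below isolates that pattern. It is a simplification under stated assumptions: AppendSketch and its members are hypothetical names, a heap ByteBuffer stands in for the mapped buffer, and a write that does not fit simply returns -1, whereas the real callback handles that case itself.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicInteger;

class AppendSketch {
    private final int fileSize;
    private final ByteBuffer buffer;                       // stand-in for mappedByteBuffer/writeBuffer
    private final AtomicInteger wrotePosition = new AtomicInteger(0);

    AppendSketch(int fileSize) {
        this.fileSize = fileSize;
        this.buffer = ByteBuffer.allocate(fileSize);
    }

    // Returns the number of bytes written, or -1 if the data does not fit
    int append(byte[] data) {
        int currentPos = wrotePosition.get();
        if (currentPos >= fileSize || data.length > fileSize - currentPos) {
            return -1;
        }
        // slice() shares the content but has its own position, so the
        // original buffer's position is never disturbed
        ByteBuffer slice = buffer.slice();
        slice.position(currentPos);
        slice.put(data);
        wrotePosition.addAndGet(data.length);
        return data.length;
    }

    int getWrotePosition() {
        return wrotePosition.get();
    }

    byte byteAt(int index) {
        return buffer.get(index);
    }

    public static void main(String[] args) {
        AppendSketch file = new AppendSketch(6);
        System.out.println(file.append("abc".getBytes(StandardCharsets.UTF_8))); // 3
        System.out.println(file.append("de".getBytes(StandardCharsets.UTF_8)));  // 2
        System.out.println(file.getWrotePosition());                             // 5
        System.out.println(file.append("fg".getBytes(StandardCharsets.UTF_8)));  // -1
    }
}
```

As in the listing, the sketch assumes that callers are already serialized by the append lock, so reading wrotePosition and later adding to it does not race with another writer.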
As the code shows, the data is finally written by the doAppend method of the AppendMessageCallback callback. The concrete implementation is in the DefaultAppendMessageCallback class.
Code: DefaultAppendMessageCallback#doAppend
@Override
public AppendMessageResult doAppend(final long fileFromOffset, final ByteBuffer byteBuffer, final int maxBlank, final MessageExtBrokerInner msgInner) {
// STORETIMESTAMP + STOREHOSTADDRESS + OFFSET <br>
// Physical write offset: the file's starting offset plus the position already written within the current file
long wroteOffset = fileFromOffset + byteBuffer.position();
// Compute the msgId used in the commitLog
this.resetByteBuffer(hostHolder, 8);
// Create the messageId, composed of ip + port + wroteOffset
String msgId = MessageDecoder.createMessageId(this.msgIdMemory, msgInner.getStoreHostBytes(hostHolder), wroteOffset);
// Record ConsumeQueue information: look up the queue offset
keyBuilder.setLength(0);
keyBuilder.append(msgInner.getTopic());
keyBuilder.append('-');
keyBuilder.append(msgInner.getQueueId());
// topic-queueId
String key = keyBuilder.toString();
// Sequence number of this message within its queue, i.e. its ordinal position in the queue
Long queueOffset = CommitLog.this.topicQueueTable.get(key);
if (null == queueOffset) {
queueOffset = 0L;
CommitLog.this.topicQueueTable.put(key, queueOffset);
}
// Transaction messages that require special handling
final int tranType = MessageSysFlag.getTransactionValue(msgInner.getSysFlag());
switch (tranType) {
// Prepared and Rollback message is not consumed, will not enter the
// consumer queue
case MessageSysFlag.TRANSACTION_PREPARED_TYPE:
case MessageSysFlag.TRANSACTION_ROLLBACK_TYPE:
queueOffset = 0L;
break;
case MessageSysFlag.TRANSACTION_NOT_TYPE:
case MessageSysFlag.TRANSACTION_COMMIT_TYPE:
default:
break;
}
// Compute the message length
final byte[] propertiesData =
msgInner.getPropertiesString() == null ? null : msgInner.getPropertiesString().getBytes(MessageDecoder.CHARSET_UTF8);
final int propertiesLength = propertiesData == null ? 0 : propertiesData.length;
if (propertiesLength > Short.MAX_VALUE) {
log.warn("putMessage message properties length too long. length={}", propertiesData.length);
return new AppendMessageResult(AppendMessageStatus.PROPERTIES_SIZE_EXCEEDED);
}
final byte[] topicData = msgInner.getTopic().getBytes(MessageDecoder.CHARSET_UTF8);
final int topicLength = topicData.length;
final int bodyLength = msgInner.getBody() == null ? 0 : msgInner.getBody().length;
final int msgLen = calMsgLength(bodyLength, topicLength, propertiesLength);
// Exceeds the maximum message
if (msgLen > this.maxMessageSize) {
CommitLog.log.warn("message size exceeded, msg total size: " + msgLen + ", msg body size: " + bodyLength
+ ", maxMessageSize: " + this.maxMessageSize);
return new AppendMessageResult(AppendMessageStatus.MESSAGE_SIZE_EXCEEDED);
}
// The file is nearly full and the remaining space cannot hold this message; maxBlank is the remaining space
if ((msgLen + END_FILE_MIN_BLANK_LENGTH) > maxBlank) {
this.resetByteBuffer(this.msgStoreItemMemory, maxBlank);
// 1 TOTAL_SIZE
this.msgStoreItemMemory.putInt(maxBlank);
// 2 MAGIC_CODE
this.msgStoreItemMemory.putInt(CommitLog.BLANK_MAGIC_CODE);
// 3 The remaining space may be any value
//
// Here the length of the specially set maxBlank
final long beginTimeMills = CommitLog.this.defaultMessageStore.now();
byteBuffer.put(this.msgStoreItemMemory.array(), 0, maxBlank);
return new AppendMessageResult(AppendMessageStatus.END_OF_FILE, wroteOffset, maxBlank, msgId, msgInner.getStoreTimestamp(),
queueOffset, CommitLog.this.defaultMessageStore.now() - beginTimeMills);
}
// Initialization of storage space
this.resetByteBuffer(msgStoreItemMemory, msgLen);
// 1 TOTAL_SIZE
this.msgStoreItemMemory.putInt(msgLen);
// 2 MAGIC_CODE
this.msgStoreItemMemory.putInt(CommitLog.MESSAGE_MAGIC_CODE);
// 3 BODY_CRC
this.msgStoreItemMemory.putInt(msgInner.getBodyCRC());
// 4 QUEUE_ID
this.msgStoreItemMemory.putInt(msgInner.getQueueId());
// 5 FLAG
this.msgStoreItemMemory.putInt(msgInner.getFlag());
// 6 QUEUE_OFFSET
this.msgStoreItemMemory.putLong(queueOffset);
// 7 PHYSICAL_OFFSET
this.msgStoreItemMemory.putLong(fileFromOffset + byteBuffer.position());
// 8 SYS_FLAG
this.msgStoreItemMemory.putInt(msgInner.getSysFlag());
// 9 BORN_TIMESTAMP
this.msgStoreItemMemory.putLong(msgInner.getBornTimestamp());
// 10 BORN_HOST
this.resetByteBuffer(hostHolder, 8);
this.msgStoreItemMemory.put(msgInner.getBornHostBytes(hostHolder));
// 11 STORE_TIMESTAMP
this.msgStoreItemMemory.putLong(msgInner.getStoreTimestamp());
// 12 STORE_HOST_ADDRESS
this.resetByteBuffer(hostHolder, 8);
this.msgStoreItemMemory.put(msgInner.getStoreHostBytes(hostHolder));
//this.msgStoreItemMemory.put(msgInner.getStoreHostBytes());
// 13 RECONSUME_TIMES
this.msgStoreItemMemory.putInt(msgInner.getReconsumeTimes());
// 14 Prepared Transaction Offset
this.msgStoreItemMemory.putLong(msgInner.getPreparedTransactionOffset());
// 15 BODY
this.msgStoreItemMemory.putInt(bodyLength);
if (bodyLength > 0) { this.msgStoreItemMemory.put(msgInner.getBody()); }
// 16 TOPIC
this.msgStoreItemMemory.put((byte)topicLength);
this.msgStoreItemMemory.put(topicData);
// 17 PROPERTIES
this.msgStoreItemMemory.putShort((short)propertiesLength);
if (propertiesLength > 0) { this.msgStoreItemMemory.put(propertiesData); }
final long beginTimeMills = CommitLog.this.defaultMessageStore.now();
// Write messages to the queue buffer
byteBuffer.put(this.msgStoreItemMemory.array(), 0, msgLen);
AppendMessageResult result = new AppendMessageResult(AppendMessageStatus.PUT_OK, wroteOffset, msgLen, msgId,
msgInner.getStoreTimestamp(), queueOffset, CommitLog.this.defaultMessageStore.now() - beginTimeMills);
switch (tranType) {
// PREPARED_TYPE and ROLLBACK_TYPE transactional messages do not enter the ConsumeQueue, so queueOffset is not incremented here
case MessageSysFlag.TRANSACTION_PREPARED_TYPE:
case MessageSysFlag.TRANSACTION_ROLLBACK_TYPE:
break;
case MessageSysFlag.TRANSACTION_NOT_TYPE:
case MessageSysFlag.TRANSACTION_COMMIT_TYPE:
// The next update ConsumeQueue information: advance the queue's offset
CommitLog.this.topicQueueTable.put(key, ++queueOffset);
break;
default:
break;
}
return result;
}
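doAppend calls calMsgLength, but the body of that method is not shown in this article. The sketch below derives the length purely from the seventeen fields written in the listing above, so it should be read as a reconstruction from that field list, not as the library's own code. The class name MsgLengthSketch is hypothetical.

```java
class MsgLengthSketch {
    // Sums the size of every field in the order doAppend writes them
    static int calMsgLength(int bodyLength, int topicLength, int propertiesLength) {
        return 4                      // 1 TOTAL_SIZE (int)
            + 4                       // 2 MAGIC_CODE (int)
            + 4                       // 3 BODY_CRC (int)
            + 4                       // 4 QUEUE_ID (int)
            + 4                       // 5 FLAG (int)
            + 8                       // 6 QUEUE_OFFSET (long)
            + 8                       // 7 PHYSICAL_OFFSET (long)
            + 4                       // 8 SYS_FLAG (int)
            + 8                       // 9 BORN_TIMESTAMP (long)
            + 8                       // 10 BORN_HOST (8-byte host holder)
            + 8                       // 11 STORE_TIMESTAMP (long)
            + 8                       // 12 STORE_HOST_ADDRESS (8-byte host holder)
            + 4                       // 13 RECONSUME_TIMES (int)
            + 8                       // 14 Prepared Transaction Offset (long)
            + 4 + bodyLength          // 15 BODY (int length prefix + bytes)
            + 1 + topicLength         // 16 TOPIC (byte length prefix + bytes)
            + 2 + propertiesLength;   // 17 PROPERTIES (short length prefix + bytes)
    }

    public static void main(String[] args) {
        System.out.println(calMsgLength(0, 0, 0)); // 91
        System.out.println(calMsgLength(5, 9, 0)); // 105
    }
}
```

Under this layout every stored message carries 91 bytes of fixed overhead. The one-byte topic length prefix and the two-byte properties length prefix are the reason for the Byte.MAX_VALUE and Short.MAX_VALUE limits checked earlier.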
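In DefaultAppendMessageCallback#doAppend, the message is first assembled field by field, and then byteBuffer.put writes the assembled bytes into the target ByteBuffer (the off-heap writeBuffer when present, otherwise the MappedByteBuffer). Once the bytes are in the ByteBuffer, the system flushes them to disk asynchronously.
In summary, the core message storage flow in RocketMQ is shown in the figure below: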
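The flush services themselves are not shown in this article. As a rough illustration of the underlying JDK primitives only, the sketch below writes bytes into a memory-mapped buffer and then forces them to disk with MappedByteBuffer.force(). The class MappedFlushSketch, the method writeAndFlush, and the temporary file are hypothetical, and the sketch flushes synchronously for simplicity instead of from a background thread.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MappedFlushSketch {
    // Writes text through a mapped buffer, forces it to disk, then reads the file back
    static String writeAndFlush(String text) {
        byte[] data = text.getBytes(StandardCharsets.UTF_8);
        try {
            Path file = Files.createTempFile("commitlog-sketch", ".bin");
            file.toFile().deleteOnExit();
            try (FileChannel channel = FileChannel.open(
                    file, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                // Mapping in READ_WRITE mode grows the empty file to data.length bytes
                MappedByteBuffer mapped =
                    channel.map(FileChannel.MapMode.READ_WRITE, 0, data.length);
                mapped.put(data);   // lands in the page cache, not necessarily on disk yet
                mapped.force();     // ask the OS to write the mapped region to the device
            }
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeAndFlush("hello store")); // hello store
    }
}
```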