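1. CommitLog Overview
CommitLog is one of the core file types in RocketMQ's storage layer: it stores every message that RocketMQ receives. This article analyzes what CommitLog provides and how it works. The analysis covers only the CommitLog class and does not touch the DLedgerCommitLog implementation.
Source Code Analysis
Creation and Recovery
Let's start with CommitLog's member variables: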
// Message's MAGIC CODE daa320a7
public final static int MESSAGE_MAGIC_CODE = -626843481;
protected static final InternalLogger log = InternalLoggerFactory.getLogger(LoggerName.STORE_LOGGER_NAME);
// End of file empty MAGIC CODE cbd43194
protected final static int BLANK_MAGIC_CODE = -875286124;
protected final MappedFileQueue mappedFileQueue;
protected final DefaultMessageStore defaultMessageStore;
// Responsible for flushing messages to disk
private final FlushCommitLogService flushCommitLogService;
//If TransientStorePool enabled, we must flush message to FileChannel at fixed periods
// Responsible for committing messages
private final FlushCommitLogService commitLogService;
private final AppendMessageCallback appendMessageCallback;
private final ThreadLocal<MessageExtBatchEncoder> batchEncoderThreadLocal;
protected HashMap<String/* topic-queueid */, Long/* offset */> topicQueueTable = new HashMap<String, Long>(1024);
protected volatile long confirmOffset = -1L;
private volatile long beginTimeInLock = 0;
protected final PutMessageLock putMessageLock;
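From the member variables we can draw the following preliminary conclusions:
- CommitLog's underlying storage is built on MappedFileQueue (analyzed in an earlier article).
- Because writes go through MappedFileQueue, a message can pass through three stages: it is first written to a direct-memory buffer, then committed to the PageCache, and finally flushed to disk. Judging by the comment on commitLogService, the first stage applies only when TransientStorePool is enabled. Our guess is that commitLogService and flushCommitLogService implement the commit and flush stages respectively.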
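The two magic-code constants in the field list are written as signed decimal ints, while their comments give hex values. As a quick sanity check, the sketch below confirms that the two forms agree. The class name MagicCodeCheck and the helper hex are ours, purely for illustration; they are not part of RocketMQ.

```java
class MagicCodeCheck {
    // Values copied from the CommitLog field list.
    static final int MESSAGE_MAGIC_CODE = -626843481;
    static final int BLANK_MAGIC_CODE = -875286124;

    /** Renders a signed int as its unsigned 32-bit hex string (illustrative helper). */
    static String hex(int value) {
        return Integer.toHexString(value);
    }

    public static void main(String[] args) {
        System.out.println(hex(MESSAGE_MAGIC_CODE)); // daa320a7
        System.out.println(hex(BLANK_MAGIC_CODE));   // cbd43194
    }
}
```

MESSAGE_MAGIC_CODE marks a regular message, and BLANK_MAGIC_CODE marks the unused tail at the end of a file, as the source comments state.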
Next, let's look at the constructor:
public CommitLog(final DefaultMessageStore defaultMessageStore) {
    // Default argument values:
    // System.getProperty("user.home") + File.separator + "store" + File.separator + "commitlog";
    // 1024 * 1024 * 1024
    this.mappedFileQueue = new MappedFileQueue(defaultMessageStore.getMessageStoreConfig().getStorePathCommitLog(),
        defaultMessageStore.getMessageStoreConfig().getMappedFileSizeCommitLog(), defaultMessageStore.getAllocateMappedFileService());
    this.defaultMessageStore = defaultMessageStore;
    if (FlushDiskType.SYNC_FLUSH == defaultMessageStore.getMessageStoreConfig().getFlushDiskType()) {
        this.flushCommitLogService = new GroupCommitService();
    } else {
        this.flushCommitLogService = new FlushRealTimeService();
    }
    this.commitLogService = new CommitRealTimeService();
    this.appendMessageCallback = new DefaultAppendMessageCallback(defaultMessageStore.getMessageStoreConfig().getMaxMessageSize());
    batchEncoderThreadLocal = new ThreadLocal<MessageExtBatchEncoder>() {
        @Override
        protected MessageExtBatchEncoder initialValue() {
            return new MessageExtBatchEncoder(defaultMessageStore.getMessageStoreConfig().getMaxMessageSize());
        }
    };
    this.putMessageLock = defaultMessageStore.getMessageStoreConfig().isUseReentrantLockWhenPutMessage() ? new PutMessageReentrantLock() : new PutMessageSpinLock();
}
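The CommitLog constructor does the following:
First, it creates the MappedFileQueue. By default the files are stored under System.getProperty("user.home") + File.separator + "store" + File.separator + "commitlog", and each file is 1024 * 1024 * 1024 bytes, i.e. 1 GB.
Next, it chooses the flush strategy from the configuration, that is, the implementation of flushCommitLogService: GroupCommitService implements synchronous flushing, and FlushRealTimeService implements asynchronous flushing.
Then it initializes the commit strategy, commitLogService, whose implementation class is CommitRealTimeService.
Finally, it initializes the remaining member variables such as appendMessageCallback, batchEncoderThreadLocal and putMessageLock; we will analyze their roles later.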
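To make the default values concrete, here is a minimal sketch that evaluates the two expressions quoted in the constructor comment. CommitLogDefaults and its methods are hypothetical names for illustration; the real values come from MessageStoreConfig and can be overridden there.

```java
import java.io.File;

class CommitLogDefaults {
    /** Default size of one CommitLog file in bytes, as quoted in the constructor comment. */
    static int mappedFileSize() {
        return 1024 * 1024 * 1024;
    }

    /** Default CommitLog directory for a given home directory (illustrative helper). */
    static String storePath(String userHome) {
        return userHome + File.separator + "store" + File.separator + "commitlog";
    }

    public static void main(String[] args) {
        int size = mappedFileSize();
        System.out.println(size);                   // 1073741824
        System.out.println(size / (1024 * 1024));   // 1024 (MiB), i.e. 1 GiB
        System.out.println(storePath(System.getProperty("user.home")));
    }
}
```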
Once a CommitLog has been created, its load() and start() methods must be called before it can start working. The two methods are implemented as follows:
public boolean load() {
    boolean result = this.mappedFileQueue.load();
    log.info("load commit log " + (result ? "OK" : "Failed"));
    return result;
}
public void start() {
    this.flushCommitLogService.start();
    if (defaultMessageStore.getMessageStoreConfig().isTransientStorePoolEnable()) {
        this.commitLogService.start();
    }
}
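load() simply loads the MappedFileQueue. What MappedFileQueue does during loading was covered in an earlier article, so we will not repeat it here.
start() always starts flushCommitLogService, and starts commitLogService only when TransientStorePool is enabled. As noted earlier, a message can pass through three stages: the commitLogService task moves messages that were written to direct memory into the file's PageCache, and flushCommitLogService flushes them to disk. We will analyze the implementation details below.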
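The three stages can be illustrated with plain java.nio, independently of RocketMQ. The sketch below is only an analogy under our own assumptions: ThreeStageWrite and write are hypothetical names, and the real CommitLog uses MappedFile and background service threads rather than one synchronous method.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class ThreeStageWrite {
    /** Writes payload via direct buffer -> FileChannel -> force, and returns the file size. */
    static long write(Path file, byte[] payload) throws IOException {
        // Stage 1: append the message into off-heap (direct) memory.
        ByteBuffer writeBuffer = ByteBuffer.allocateDirect(payload.length);
        writeBuffer.put(payload);

        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            // Stage 2 ("commit"): hand the bytes to the FileChannel, i.e. the OS page cache.
            writeBuffer.flip();
            while (writeBuffer.hasRemaining()) {
                channel.write(writeBuffer);
            }
            // Stage 3 ("flush"): ask the OS to persist the data to the storage device.
            channel.force(false);
            return channel.size();
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("commitlog-sketch", ".bin");
        long size = write(file, "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(size); // 5
        Files.delete(file);
    }
}
```

In this analogy, stage 2 corresponds to the work of commitLogService and stage 3 to the work of flushCommitLogService.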
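Next, let's look at how CommitLog recovers. There are two cases: recovery after a normal shutdown and recovery after an abnormal exit. After a normal shutdown, recovery goes through the recoverNormally method: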
/**
 * Checks each message's CRC and whether the stored data length matches the length field. If a check fails, all data from the first failure onward is discarded.
 * When the normal exit, data recovery, all memory data have been flush
 */
public void recoverNormally(long maxPhyOffsetOfConsumeQueue) {
    boolean checkCRCOnRecover = this.defaultMessageStore.getMessageStoreConfig().isCheckCRCOnRecover();
    final List<MappedFile> mappedFiles = this.mappedFileQueue.getMappedFiles();
    if (!mappedFiles.isEmpty()) {
        // Began to recover from the last third file
        // Recover the last three files
        int index = mappedFiles.size() - 3;
        if (index < 0)
            index = 0;
        MappedFile mappedFile = mappedFiles.get(index);
        ByteBuffer byteBuffer = mappedFile.sliceByteBuffer();
        long processOffset = mappedFile.getFileFromOffset();
        long mappedFileOffset = 0;
        while (true) {
            // Validate a single message
            DispatchRequest dispatchRequest = this.checkMessageAndReturnSize(byteBuffer, checkCRCOnRecover);
            int size = dispatchRequest.getMsgSize();
            // Normal data
            if (dispatchRequest.isSuccess() && size > 0) {
                mappedFileOffset += size;
            }
            // Come the end of the file, switch to the next file Since the
            // return 0 representatives met last hole,
            // this can not be included in truncate offset
            else if (dispatchRequest.isSuccess() && size == 0) {
                index++;
                if (index >= mappedFiles.size()) {
                    // Current branch can not happen
                    log.info("recover last 3 physics file over, last mapped file " + mappedFile.getFileName());
                    break;
                } else {
                    mappedFile = mappedFiles.get(index);
                    byteBuffer = mappedFile.sliceByteBuffer();
                    processOffset = mappedFile.getFileFromOffset();
                    mappedFileOffset = 0;
                    log.info("recover next physics file, " + mappedFile.getFileName());
                }
            }
            // Intermediate file read error
            else if (!dispatchRequest.isSuccess()) {
                log.info("recover physics file end, " + mappedFile.getFileName());
                break;
            }
        }
        // Discard all data from the first inconsistent position onward
        processOffset += mappedFileOffset;
        this.mappedFileQueue.setFlushedWhere(processOffset);
        this.mappedFileQueue.setCommittedWhere(processOffset);
        this.mappedFileQueue.truncateDirtyFiles(processOffset);
        // Clear ConsumeQueue redundant data
        if (maxPhyOffsetOfConsumeQueue >= processOffset) {
            log.warn("maxPhyOffsetOfConsumeQueue({}) >= processOffset({}), truncate dirty logic files", maxPhyOffsetOfConsumeQueue, processOffset);
            this.defaultMessageStore.truncateDirtyLogicFiles(processOffset);
        }
    } else {
        // Commitlog case files are deleted
        log.warn("The commitlog files are deleted, and delete the consume queue files");
        this.mappedFileQueue.setFlushedWhere(0);
        this.mappedFileQueue.setCommittedWhere(0);
        this.defaultMessageStore.destroyLogics();
    }
}
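The recovery logic after a normal shutdown is straightforward. Only the newest MappedFiles are checked, three at most. Starting from the third-to-last MappedFile, the method repeatedly calls *public DispatchRequest checkMessageAndReturnSize(java.nio.ByteBuffer byteBuffer, final boolean checkCRC)* to validate one message at a time. The loop stops only when all of these files have been validated or when a message fails validation. Within the loop, mappedFileOffset accumulates the total size of the messages validated so far in the current MappedFile, and processOffset holds the start offset of that file. After the loop, mappedFileOffset is added to processOffset, so processOffset points just past the last message that validated successfully. The commit and flush pointers of the whole MappedFileQueue are then set to processOffset, and truncateDirtyFiles(processOffset) removes whatever lies beyond it. In other words, if a message fails validation, CommitLog discards that message and all data after it. If the ConsumeQueue has already recorded offsets at or beyond processOffset, those entries are truncated as well.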
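The offset bookkeeping in this loop is easier to follow on a toy model. The sketch below is a simplified simulation under our own assumptions, not RocketMQ code: each "file" is a fixed-size ByteBuffer, a record is just [totalSize][magicCode] plus padding, file i is assumed to start at offset i * fileSize, and the names RecoverySketch, checkAndReturnSize, recover and file are hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.List;

class RecoverySketch {
    static final int MESSAGE_MAGIC_CODE = -626843481;
    static final int BLANK_MAGIC_CODE = -875286124;

    /** Simplified check: >0 message size, 0 end-of-file marker, -1 invalid data. */
    static int checkAndReturnSize(ByteBuffer buf) {
        if (buf.remaining() < 8) return -1;
        int start = buf.position();
        int totalSize = buf.getInt();
        int magic = buf.getInt();
        if (magic == BLANK_MAGIC_CODE) return 0;
        if (magic != MESSAGE_MAGIC_CODE || totalSize < 8 || start + totalSize > buf.limit()) return -1;
        buf.position(start + totalSize); // skip the rest of the record
        return totalSize;
    }

    /** Returns the offset just past the last valid message, scanning at most the last three files. */
    static long recover(List<ByteBuffer> files, int fileSize) {
        if (files.isEmpty()) return 0;
        int index = Math.max(files.size() - 3, 0);
        ByteBuffer buf = files.get(index).duplicate();
        long processOffset = (long) index * fileSize; // assumed start offset of file `index`
        long mappedFileOffset = 0;
        while (true) {
            int size = checkAndReturnSize(buf);
            if (size > 0) {
                mappedFileOffset += size;              // valid message
            } else if (size == 0) {                    // end of file: move to the next one
                index++;
                if (index >= files.size()) break;
                buf = files.get(index).duplicate();
                processOffset = (long) index * fileSize;
                mappedFileOffset = 0;
            } else {
                break;                                 // invalid data: stop here
            }
        }
        return processOffset + mappedFileOffset;
    }

    /** Builds a test file; `sealed` appends the end-of-file marker after the messages. */
    static ByteBuffer file(int fileSize, boolean sealed, int... messageSizes) {
        ByteBuffer buf = ByteBuffer.allocate(fileSize);
        for (int messageSize : messageSizes) {
            int start = buf.position();
            buf.putInt(messageSize).putInt(MESSAGE_MAGIC_CODE);
            buf.position(start + messageSize);
        }
        if (sealed && buf.remaining() >= 8) {
            buf.putInt(buf.remaining()).putInt(BLANK_MAGIC_CODE);
        }
        buf.clear(); // reset position and limit; the content is kept
        return buf;
    }

    public static void main(String[] args) {
        int fileSize = 64;
        List<ByteBuffer> files = Arrays.asList(
            file(fileSize, true, 16, 24),   // not scanned: only the last three files are
            file(fileSize, true, 32),
            file(fileSize, true, 16, 16),
            file(fileSize, false, 24));     // active file: zeros follow the last message
        System.out.println(recover(files, fileSize)); // 216
    }
}
```

With four 64-byte files the scan starts at the second file (offset 64). It ends in the last file after one 24-byte message, where the zero-filled tail has an illegal magic code, so the result is 192 + 24 = 216.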
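Now let's look at how a single message is validated. The listing is the three-argument overload, to which the two-argument call in recoverNormally delegates. Reading it also reveals the format in which a message is stored in the CommitLog: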
/**
 * Mainly checks the body CRC and whether the amount of data stored in the file matches the length field
 * check the message and returns the message size
 *
 * @return 0 Come the end of the file // >0 Normal messages // -1 Message checksum failure
 */
public DispatchRequest checkMessageAndReturnSize(java.nio.ByteBuffer byteBuffer, final boolean checkCRC,
    final boolean readBody) {
    try {
        // 1 TOTAL SIZE
        int totalSize = byteBuffer.getInt();
        // 2 MAGIC CODE
        int magicCode = byteBuffer.getInt();
        switch (magicCode) {
            case MESSAGE_MAGIC_CODE:
                break;
            case BLANK_MAGIC_CODE:
                return new DispatchRequest(0, true /* success */);
            default:
                log.warn("found a illegal magic code 0x" + Integer.toHexString(magicCode));
                return new DispatchRequest(-1, false /* success */);
        }
        byte[] bytesContent = new byte[totalSize];
        int bodyCRC = byteBuffer.getInt();
        int queueId = byteBuffer.getInt();
        int flag = byteBuffer.getInt();
        long queueOffset = byteBuffer.getLong();
        long physicOffset = byteBuffer.getLong();
        int sysFlag = byteBuffer.getInt();
        long bornTimeStamp = byteBuffer