Recently I ran into a problem with RocketMQ: our testers reported that notifications had stopped arriving.
The application log showed:
service not available now. It may be caused by one of the following reasons:the broker's disk is full
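In other words, the disk is full and the message store has stopped.
I immediately ran df to check disk usage, but the Use% column was nowhere near 100%. So why would the broker report the disk as full?
Time to look at the RocketMQ source.
The handlePutMessageResult method of the SendMessageProcessor class contains the following code: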
...
case SERVICE_NOT_AVAILABLE:
    response.setCode(ResponseCode.SERVICE_NOT_AVAILABLE);
    response.setRemark(
        "service not available now. It may be caused by one of the following reasons: " +
            "the broker's disk is full [" + diskUtil() + "], messages are put to the slave, message store has been shut down, etc.");
    break;
...
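Next question: what causes SERVICE_NOT_AVAILABLE in the first place?
Found it.
The method org.apache.rocketmq.store.DefaultMessageStore#checkStoreStatus returns PutMessageStatus.SERVICE_NOT_AVAILABLE in three cases:
1. When the message store (DefaultMessageStore) has already been shut down
2. When the broker role configured for the message store is SLAVE
3. When this.runningFlags.isWriteable() returns false, i.e. the store is not writeable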
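The three cases can be modelled roughly as follows. This is a simplified sketch for illustration, not the actual body of checkStoreStatus: the class StoreStatusSketch, its two enums, and the check method are hypothetical names, and the real method may perform other checks that are not covered here.

```java
// Hypothetical model of the three conditions listed above.
// None of these names come from RocketMQ itself.
class StoreStatusSketch {
    enum Role { MASTER, SLAVE }
    enum Status { PUT_OK, SERVICE_NOT_AVAILABLE }

    static Status check(boolean shutdown, Role role, boolean writeable) {
        if (shutdown) {
            return Status.SERVICE_NOT_AVAILABLE; // case 1: store already shut down
        }
        if (role == Role.SLAVE) {
            return Status.SERVICE_NOT_AVAILABLE; // case 2: broker is a slave
        }
        if (!writeable) {
            return Status.SERVICE_NOT_AVAILABLE; // case 3: running flags say "not writeable"
        }
        return Status.PUT_OK;
    }

    public static void main(String[] args) {
        // A running master whose flags report "not writeable" still rejects the message.
        System.out.println(check(false, Role.MASTER, false));
    }
}
```

The point of the sketch is that all three causes collapse into the same status, which is why the error text in the log lists several possible reasons at once.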
Our problem is clearly the third case. So when does this.runningFlags.isWriteable() return false? Digging further into the source turns up the following code:
if ((this.flagBits & (NOT_WRITEABLE_BIT | WRITE_LOGICS_QUEUE_ERROR_BIT | DISK_FULL_BIT | WRITE_INDEX_FILE_ERROR_BIT)) == 0) {
    return true;
}
return false;
Here we spot a key constant, DISK_FULL_BIT, which looks closely related to the error in the log. Searching further leads to the following code:
public boolean getAndMakeDiskFull() {
    boolean result = !((this.flagBits & DISK_FULL_BIT) == DISK_FULL_BIT);
    this.flagBits |= DISK_FULL_BIT;
    return result;
}

public boolean getAndMakeDiskOK() {
    boolean result = !((this.flagBits & DISK_FULL_BIT) == DISK_FULL_BIT);
    this.flagBits &= ~DISK_FULL_BIT;
    return result;
}
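To make the flag logic concrete, here is a minimal, self-contained sketch of how such a bit field behaves. The class name RunningFlagsSketch and the bit positions are assumptions chosen for illustration, not RocketMQ's actual definitions; only the method bodies follow the snippets quoted above.

```java
// Illustrative sketch only: the class name and the bit positions are assumptions.
class RunningFlagsSketch {
    private static final int NOT_WRITEABLE_BIT = 1 << 1;
    private static final int WRITE_LOGICS_QUEUE_ERROR_BIT = 1 << 2;
    private static final int WRITE_INDEX_FILE_ERROR_BIT = 1 << 3;
    private static final int DISK_FULL_BIT = 1 << 4;

    private int flagBits = 0;

    // Writeable only while none of the blocking bits is set.
    boolean isWriteable() {
        int blocking = NOT_WRITEABLE_BIT | WRITE_LOGICS_QUEUE_ERROR_BIT
            | DISK_FULL_BIT | WRITE_INDEX_FILE_ERROR_BIT;
        return (flagBits & blocking) == 0;
    }

    // Sets the disk-full bit; returns true if the disk was considered OK before the call.
    boolean getAndMakeDiskFull() {
        boolean wasOk = (flagBits & DISK_FULL_BIT) != DISK_FULL_BIT;
        flagBits |= DISK_FULL_BIT;
        return wasOk;
    }

    // Clears the disk-full bit; returns true if the disk was considered OK before the call.
    boolean getAndMakeDiskOK() {
        boolean wasOk = (flagBits & DISK_FULL_BIT) != DISK_FULL_BIT;
        flagBits &= ~DISK_FULL_BIT;
        return wasOk;
    }

    public static void main(String[] args) {
        RunningFlagsSketch flags = new RunningFlagsSketch();
        System.out.println(flags.isWriteable());        // true
        System.out.println(flags.getAndMakeDiskFull()); // true  (was OK, now marked full)
        System.out.println(flags.isWriteable());        // false
        System.out.println(flags.getAndMakeDiskFull()); // false (already marked full)
        System.out.println(flags.getAndMakeDiskOK());   // false (was full, now cleared)
        System.out.println(flags.isWriteable());        // true
    }
}
```

Both methods return the state before the change, so a caller can tell whether the call actually flipped the flag and log only on a transition.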
That gives us two methods. Going by the names, it is easy to guess which one to follow: getAndMakeDiskFull, of course.
Three places call it and set the disk-full flag.
Here is the first:
double physicRatio = UtilAll.getDiskPartitionSpaceUsedPercent(getStorePathPhysic());
if (physicRatio > diskSpaceWarningLevelRatio) {
    boolean diskok = DefaultMessageStore.this.runningFlags.getAndMakeDiskFull();
    if (diskok) {
        DefaultMessageStore.log.error("physic disk maybe full soon " + physicRatio + ", so mark disk full");
    }
    cleanImmediately = true;
} else if (physicRatio > diskSpaceCleanForciblyRatio) {
    cleanImmediately = true;
} else {
    boolean diskok = DefaultMessageStore.this.runningFlags.getAndMakeDiskOK();
    if (!diskok) {
        DefaultMessageStore.log.info("physic disk space OK " + physicRatio + ", so mark disk ok");
    }
}
The second:
String storePathLogics = StorePathConfigHelper
    .getStorePathConsumeQueue(DefaultMessageStore.this.getMessageStoreConfig().getStorePathRootDir());
double logicsRatio = UtilAll.getDiskPartitionSpaceUsedPercent(storePathLogics);
if (logicsRatio > diskSpaceWarningLevelRatio) {
    boolean diskok = DefaultMessageStore.this.runningFlags.getAndMakeDiskFull();
    if (diskok) {
        DefaultMessageStore.log.error("logics disk maybe full soon " + logicsRatio + ", so mark disk full");
    }
    cleanImmediately = true;
} else if (logicsRatio > diskSpaceCleanForciblyRatio) {
    cleanImmediately = true;
} else {
    boolean diskok = DefaultMessageStore.this.runningFlags.getAndMakeDiskOK();
    if (!diskok) {
        DefaultMessageStore.log.info("logics disk space OK " + logicsRatio + ", so mark disk ok");
    }
}
if (logicsRatio < 0 || logicsRatio > ratio) {
    DefaultMessageStore.log.info("logics disk maybe full soon, so reclaim space, " + logicsRatio);
    return true;
}
And the third:
String storePathPhysic = DefaultMessageStore.this.getMessageStoreConfig().getStorePathCommitLog();
double physicRatio = UtilAll.getDiskPartitionSpaceUsedPercent(storePathPhysic);
double ratio = DefaultMessageStore.this.getMessageStoreConfig().getDiskMaxUsedSpaceRatio() / 100.0;
if (physicRatio > ratio) {
    DefaultMessageStore.log.info("physic disk of commitLog used: " + physicRatio);
}
if (physicRatio > this.diskSpaceWarningLevelRatio) {
    boolean diskok = DefaultMessageStore.this.runningFlags.getAndMakeDiskFull();
    if (diskok) {
        DefaultMessageStore.log.error("physic disk of commitLog maybe full soon, used " + physicRatio + ", so mark disk full");
    }
    return true;
} else {
    boolean diskok = DefaultMessageStore.this.runningFlags.getAndMakeDiskOK();
    if (!diskok) {
        DefaultMessageStore.log.info("physic disk space of commitLog OK " + physicRatio + ", so mark disk ok");
    }
    return false;
}
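All three snippets compare a ratio from UtilAll.getDiskPartitionSpaceUsedPercent against a threshold. I have not quoted that method, so the following is only a rough stand-in built on java.io.File, assuming the ratio means used space divided by total space on the partition that holds the path. The class DiskUsageSketch and the method usedRatio are hypothetical names, and the value need not match the Use% column of df exactly.

```java
import java.io.File;

// Hypothetical stand-in for a "used space ratio" helper; not RocketMQ's implementation.
class DiskUsageSketch {
    // Returns used/total for the partition containing the path as a value in [0, 1],
    // or -1 if the path does not exist or the total size cannot be determined.
    static double usedRatio(String path) {
        File file = new File(path);
        if (!file.exists()) {
            return -1;
        }
        long total = file.getTotalSpace();
        if (total <= 0) {
            return -1;
        }
        long free = file.getFreeSpace();
        return (total - free) / (double) total;
    }

    public static void main(String[] args) {
        String path = args.length > 0 ? args[0] : ".";
        System.out.printf("%s used ratio: %.2f%n", path, usedRatio(path));
    }
}
```

Running it against the store directory gives a number on the same 0-to-1 scale that the snippets above compare with diskSpaceWarningLevelRatio.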
Eventually we arrive at the following code:
private final double diskSpaceWarningLevelRatio =
    Double.parseDouble(System.getProperty("rocketmq.broker.diskSpaceWarningLevelRatio", "0.90"));
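The lookup can be tried in isolation. The sketch below reuses the property name and default from the line above; the class WarningLevelSketch and its methods are hypothetical. Because the value comes from System.getProperty, it is a JVM system property, so in principle a -D option on the JVM command line would change it. Whether raising it is a sensible fix is a separate question.

```java
// Hypothetical wrapper around the property lookup quoted above.
class WarningLevelSketch {
    // Reads the threshold from the JVM system property, falling back to 0.90.
    static double warningLevelRatio() {
        return Double.parseDouble(
            System.getProperty("rocketmq.broker.diskSpaceWarningLevelRatio", "0.90"));
    }

    // Mirrors the comparison in the snippets: strictly greater than the threshold.
    static boolean wouldMarkDiskFull(double usedRatio) {
        return usedRatio > warningLevelRatio();
    }

    public static void main(String[] args) {
        System.out.println("threshold = " + warningLevelRatio());
        System.out.println("0.85 -> " + wouldMarkDiskFull(0.85));
        System.out.println("0.91 -> " + wouldMarkDiskFull(0.91));
    }
}
```

With the default in place, a disk that df reports as roughly 91% used is already over the line, even though it is far from 100%.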
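So the answer is this: by default, once disk usage exceeds 90%, the broker marks the disk as full and stops accepting messages, long before df shows 100%. How do we fix that?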