Background
In production, one message could not be resent through the console. The error was:
org.apache.rocketmq.client.exception.MQClientException:
CODE: 208 DESC: query message by key finished, but no message.
For more information, please visit the url, http://rocketmq.apache.org/docs/faq/
Cause
MQAdminImpl.class
public MessageExt queryMessageByUniqKey(String topic,
        String uniqKey) throws InterruptedException, MQClientException {
    QueryResult qr = queryMessageByUniqKey(topic, uniqKey, 32,
            MessageClientIDSetter.getNearlyTimeFromID(uniqKey).getTime() - 1000, Long.MAX_VALUE);
    if (qr != null && qr.getMessageList() != null && qr.getMessageList().size() > 0) {
        return qr.getMessageList().get(0);
    } else {
        return null;
    }
}
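The client parses a time out of the msgId, subtracts 1000 ms, and queries the broker for messages from that point onward. Next, use the mqadmin command to look at the timestamps recorded for this message:
./mqadmin queryMsgByUniqueKey -n xxx:xxx -t xxx -i 7F0000015xxxxxxx0159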
Topic: xxx
Tags: [xxx]
Keys: [xxx]
Queue ID: 7
Queue Offset: 15697
CommitLog Offset: 549800974789
Reconsume Times: 0
Born Timestamp: 2023-04-19 17:31:23,025
Store Timestamp: 2023-04-19 17:31:22,937
Born Host: ip1:port1
Store Host: ip2:port2
System Flag: 0
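Analysis
- Note the gap between the Born and Store timestamps. Normally Born should be earlier than Store, but here Born is later than Store. In other words, the system clocks of the producer and this broker disagree: the producer's clock runs ahead of the broker's. As a result, the query, whose start time is 2023-04-19 17:31:23,025 - 1000 ms = 2023-04-19 17:31:22,025, does not find the message, because the broker stored it at 2023-04-19 17:31:22,937.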
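The arithmetic in this analysis can be reproduced with a small standalone sketch. It does not call any RocketMQ API: the class name ClockSkewCheck and its two helper methods are hypothetical. Like the analysis above, it assumes that the time encoded in the msgId is roughly the Born Timestamp printed by mqadmin.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of RocketMQ: works only on the
// timestamp strings printed by mqadmin queryMsgByUniqueKey.
public class ClockSkewCheck {
    // Timestamp layout used in the mqadmin output, e.g. "2023-04-19 17:31:23,025".
    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss,SSS");

    private static LocalDateTime parse(String ts) {
        return LocalDateTime.parse(ts, FMT);
    }

    // Born minus Store in milliseconds. A positive value means the
    // producer's clock is ahead of the broker's clock.
    public static long skewMillis(String born, String store) {
        return Duration.between(parse(store), parse(born)).toMillis();
    }

    // Start of the query window: the msgId time (assumed here to be
    // close to the Born Timestamp) minus the 1000 ms margin from
    // queryMessageByUniqKey.
    public static String queryStart(String msgIdTime) {
        return parse(msgIdTime).minus(Duration.ofMillis(1000)).format(FMT);
    }

    public static void main(String[] args) {
        String born = "2023-04-19 17:31:23,025";
        String store = "2023-04-19 17:31:22,937";
        System.out.println("skew (born - store) = " + skewMillis(born, store) + " ms");
        System.out.println("query start         = " + queryStart(born));
    }
}
```

Running it against Born and Store values taken from other messages on the same producer and broker pair is a quick way to see whether Born is consistently later than Store.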
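Solution
Synchronize the producer and broker clocks with NTP. As long as the difference between them stays within 1 second, the message can be queried normally.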