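Background:
Recently, incorrect use of RocketMQ transactional messages caused a production incident. The symptom: the local transaction failed, yet the half message was still delivered, which ultimately left the data inconsistent.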
RocketMQ transactional message execution steps:
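The two-phase flow can be sketched with a small in-memory simulation. This is only an illustration of the generally documented sequence (store half message, run local transaction, commit or roll back, check back when the state is unknown). `FakeBroker`, `TxState`, and `sendInTransaction` are hypothetical names invented for this sketch and are not part of the RocketMQ or ONS client API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public class TransactionFlowSketch {
    // Possible outcomes of the local transaction (hypothetical enum for this sketch).
    enum TxState { COMMIT, ROLLBACK, UNKNOWN }

    // Hypothetical in-memory stand-in for the broker; NOT the RocketMQ API.
    static class FakeBroker {
        final List<String> halfMessages = new ArrayList<>(); // stored, invisible to consumers
        final List<String> delivered = new ArrayList<>();    // visible to consumers

        // Phase 1: persist the half message.
        void storeHalf(String body) {
            halfMessages.add(body);
        }

        // Phase 2: commit delivers, rollback discards, unknown leaves the message pending.
        void endTransaction(String body, TxState state) {
            if (state == TxState.UNKNOWN) {
                return;
            }
            halfMessages.remove(body);
            if (state == TxState.COMMIT) {
                delivered.add(body);
            }
        }

        // Check-back: ask the producer for the final state of every pending half message.
        void checkBack(Supplier<TxState> checker) {
            for (String body : new ArrayList<>(halfMessages)) {
                endTransaction(body, checker.get());
            }
        }
    }

    // Producer side: send the half message, run the local transaction, report its state.
    static void sendInTransaction(FakeBroker broker, String body, Supplier<TxState> localTx) {
        broker.storeHalf(body);
        TxState state = localTx.get();
        broker.endTransaction(body, state);
    }

    public static void main(String[] args) {
        FakeBroker broker = new FakeBroker();
        sendInTransaction(broker, "order-1", () -> TxState.COMMIT);
        sendInTransaction(broker, "order-2", () -> TxState.ROLLBACK);
        sendInTransaction(broker, "order-3", () -> TxState.UNKNOWN);
        System.out.println("delivered before check-back: " + broker.delivered);
        System.out.println("pending before check-back: " + broker.halfMessages);

        // The broker later checks back; here the producer answers COMMIT.
        broker.checkBack(() -> TxState.COMMIT);
        System.out.println("delivered after check-back: " + broker.delivered);
    }
}
```

In this sketch, "order-1" is delivered immediately, "order-2" is discarded, and "order-3" stays pending until the check-back resolves it. The point to notice is that a half message only becomes visible to consumers after an explicit commit, whether that commit arrives directly or through a check-back.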
Incident reconstruction:
Sending the half message and executing the local transaction
Problem 1: after the local transaction threw an exception, the exception was not caught by the catch block below.
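Problem 2: in principle, after the local transaction fails, the RocketMQ server should call the client's transaction status check-back interface, but no check-back happened.
Problem 3: the half message was eventually delivered by RocketMQ anyway.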
Incident analysis:
Let's trace through the RocketMQ client source code:
@Override
public SendResult send(final Message message, final LocalTransactionExecuter executer, Object arg) {
this.checkONSProducerServiceState(this.transactionMQProducer.getDefaultMQProducerImpl());
com.aliyun.openservices.shade.com.alibaba.rocketmq.common.message.Message msgRMQ = ONSUtil.msgConvert(message);
com.aliyun.openservices.shade.com.alibaba.rocketmq.client.producer.TransactionSendResult sendResultRMQ = null;
try {
// Send the half message
sendResultRMQ = transactionMQProducer.sendMessageInTransaction(msgRMQ,
new com.aliyun.openservices.shade.com.alibaba.rocketmq.client.producer.LocalTransactionExecuter() {
@Override
public LocalTransactionState executeLocalTransactionBranch(
com.aliyun.openservices.shade.com.alibaba.rocketmq.common.message.Message msg,
Object arg) {
String msgId = msg.getProperty(Constants.TRANSACTION_ID);
message.setMsgID(msgId);
TransactionStatus transactionStatus = executer.execute(message, arg);
if (TransactionStatus.CommitTransaction == transactionStatus)