Reference: the official design document on transactional message design.
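Transactional message flow diagram
Steps
- The Producer sends a half message to the Broker.
- The Broker responds with the result of writing the message.
- The Producer executes the local transaction according to the send result (if the write failed, the half message is not visible to the business side and the local logic is not executed).
- The Producer issues a Commit or Rollback according to the local transaction state (a Commit generates the message index, which makes the message visible to consumers).
- For transactional messages that received neither Commit nor Rollback (messages in the pending state), the Broker initiates a "check-back".
- The Producer receives the check-back message and checks the state of the corresponding local transaction.
- The Producer sends Commit or Rollback again according to the local transaction state.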
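Key design points
- Half messages are stored in the RMQ_SYS_TRANS_HALF_TOPIC topic, which consumers cannot see.
- A scheduled task consumes these half messages and asks the Producer for the transaction outcome. Each time it checks back, it writes the half message to the HALF topic again so that the consume offset of the HALF queue can advance.
- Once a half message is known to be committed or rolled back, a result message (whose body is the half message's offset in the HALF consume queue) is written to the RMQ_SYS_TRANS_OP_HALF_TOPIC topic.
- Half messages are never physically deleted. The presence of a result message for a half message in the OP_HALF topic means that the half message has been processed.
- A Commit additionally restores the original message and writes it to the real topic, so that consumers can consume it.
- If the state still cannot be determined after more than 15 check-backs, or the message has outlived the 72h CommitLog file retention time, the message is discarded, which amounts to a rollback.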
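The bookkeeping described in these points can be pictured with a toy in-memory model. This is only a sketch under simplifying assumptions: the class `HalfOpModel` and all of its methods are hypothetical, a list index stands in for the consume-queue offset, and nothing here is RocketMQ code.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of the half/op bookkeeping; hypothetical, NOT RocketMQ code.
class HalfOpModel {
    // Stand-in for the half topic: the list index plays the role of the queue offset.
    private final List<String> halfQueue = new ArrayList<>();
    // Stand-in for the op topic: offsets of half messages that were committed or rolled back.
    private final Set<Long> opOffsets = new HashSet<>();
    // Stand-in for the real topic that consumers read.
    private final List<String> realTopic = new ArrayList<>();

    // Phase one: append a half message and return its offset.
    long putHalf(String body) {
        halfQueue.add(body);
        return halfQueue.size() - 1;
    }

    // Commit: restore the message into the real topic, then record the op entry.
    void commit(long halfOffset) {
        if (opOffsets.contains(halfOffset)) {
            return; // already resolved
        }
        realTopic.add(halfQueue.get((int) halfOffset));
        opOffsets.add(halfOffset);
    }

    // Rollback: only record the op entry; the half message itself stays in place.
    void rollback(long halfOffset) {
        opOffsets.add(halfOffset);
    }

    // Half messages without an op entry are the ones a check-back would look at.
    List<Long> pendingOffsets() {
        List<Long> pending = new ArrayList<>();
        for (long i = 0; i < halfQueue.size(); i++) {
            if (!opOffsets.contains(i)) {
                pending.add(i);
            }
        }
        return pending;
    }

    List<String> visibleToConsumers() {
        return new ArrayList<>(realTopic);
    }
}
```

Note that neither `commit` nor `rollback` removes anything from `halfQueue`; only the op set changes, which mirrors the append-only design described above.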
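The following sections walk through the source code step by step, following this flow.
Detailed steps
Step 1: Sending the transactional message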
org.apache.rocketmq.client.impl.producer.DefaultMQProducerImpl#sendMessageInTransaction
Before sending, two properties are set: one marks the message as transactional, the other stores the producer group.
MessageAccessor.putProperty(msg, MessageConst.PROPERTY_TRANSACTION_PREPARED, "true");
MessageAccessor.putProperty(msg, MessageConst.PROPERTY_PRODUCER_GROUP, this.defaultMQProducer.getProducerGroup());
Send the message synchronously:
this.sendDefaultImpl(msg, CommunicationMode.SYNC, null, timeout);
Set the transaction bits in the message's system flag (sysFlag):
final String tranMsg = msg.getProperty(MessageConst.PROPERTY_TRANSACTION_PREPARED);
if (tranMsg != null && Boolean.parseBoolean(tranMsg)) {
sysFlag |= MessageSysFlag.TRANSACTION_PREPARED_TYPE;
}
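To make the bit manipulation concrete, here is a self-contained sketch of how the transaction type can be packed into two bits of sysFlag. The class `TxFlagSketch` is hypothetical, and the numeric constants are an assumption based on my reading of the 4.x `MessageSysFlag` class, so verify them against the version you use.

```java
// Hypothetical mirror of the transaction bits in sysFlag; values are assumed, not authoritative.
final class TxFlagSketch {
    static final int TRANSACTION_NOT_TYPE = 0;
    static final int TRANSACTION_PREPARED_TYPE = 0x1 << 2;
    static final int TRANSACTION_COMMIT_TYPE = 0x2 << 2;
    static final int TRANSACTION_ROLLBACK_TYPE = 0x3 << 2; // also serves as the two-bit mask

    // Extract only the transaction bits.
    static int getTransactionValue(int flag) {
        return flag & TRANSACTION_ROLLBACK_TYPE;
    }

    // Clear the transaction bits, then set the requested type; other bits are untouched.
    static int resetTransactionValue(int flag, int type) {
        return (flag & ~TRANSACTION_ROLLBACK_TYPE) | type;
    }

    public static void main(String[] args) {
        int sysFlag = 0x1;                    // some unrelated low bit is already set
        sysFlag |= TRANSACTION_PREPARED_TYPE; // what the producer does for a half message
        System.out.println(getTransactionValue(sysFlag) == TRANSACTION_PREPARED_TYPE);
        // The Broker resets the bits before storing the half message.
        System.out.println(resetTransactionValue(sysFlag, TRANSACTION_NOT_TYPE) == 0x1);
    }
}
```

Because the type occupies its own two bits, setting or resetting it leaves unrelated flags such as the low bit in the example intact.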
Step 2: The Broker writes the half message
org.apache.rocketmq.broker.processor.SendMessageProcessor
The processor receives and handles the send request, and checks whether the message is transactional:
String traFlag = oriProps.get(MessageConst.PROPERTY_TRANSACTION_PREPARED);
if (traFlag != null && Boolean.parseBoolean(traFlag)) {
if (this.brokerController.getBrokerConfig().isRejectTransactionMessage()) {
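// The Broker is configured to reject transactional messages: return directly
return response;
}
// Special handling for transactional messages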
putMessageResult = this.brokerController.getTransactionalMessageService().prepareMessage(msgInner);
} else {
putMessageResult = this.brokerController.getMessageStore().putMessage(msgInner);
}
org.apache.rocketmq.broker.transaction.queue.TransactionalMessageBridge#putHalfMessage
Convert the message into a half message and store it:
public PutMessageResult putHalfMessage(MessageExtBrokerInner messageInner) {
return store.putMessage(parseHalfMessageInner(messageInner));
}
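Conversion: stash the real topic and queue id in the message properties, reset the transaction bits of the system flag, set the topic to RMQ_SYS_TRANS_HALF_TOPIC and the queue id to 0.
The message is then stored in the CommitLog under the half topic, so consumers cannot consume it.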
private MessageExtBrokerInner parseHalfMessageInner(MessageExtBrokerInner msgInner) {
MessageAccessor.putProperty(msgInner, MessageConst.PROPERTY_REAL_TOPIC, msgInner.getTopic());
MessageAccessor.putProperty(msgInner, MessageConst.PROPERTY_REAL_QUEUE_ID,
String.valueOf(msgInner.getQueueId()));
msgInner.setSysFlag(
MessageSysFlag.resetTransactionValue(msgInner.getSysFlag(), MessageSysFlag.TRANSACTION_NOT_TYPE));
msgInner.setTopic(TransactionalMessageUtil.buildHalfTopic());
msgInner.setQueueId(0);
msgInner.setPropertiesString(MessageDecoder.messageProperties2String(msgInner.getProperties()));
return msgInner;
}
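The stash-and-restore idea can be shown in isolation with a plain map of properties. This is a minimal sketch: `HalfRewriteSketch`, its `Msg` type and the two property keys are illustrative stand-ins chosen for this example, not the Broker's classes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of swapping the topic and remembering the real one in properties.
class HalfRewriteSketch {
    static final String HALF_TOPIC = "RMQ_SYS_TRANS_HALF_TOPIC";
    static final String KEY_REAL_TOPIC = "REAL_TOPIC";   // illustrative key name
    static final String KEY_REAL_QUEUE_ID = "REAL_QID";  // illustrative key name

    static final class Msg {
        final String topic;
        final int queueId;
        final Map<String, String> properties;

        Msg(String topic, int queueId, Map<String, String> properties) {
            this.topic = topic;
            this.queueId = queueId;
            this.properties = new HashMap<>(properties);
        }
    }

    // Phase one: hide the message under the half topic, queue 0.
    static Msg toHalf(Msg original) {
        Map<String, String> props = new HashMap<>(original.properties);
        props.put(KEY_REAL_TOPIC, original.topic);
        props.put(KEY_REAL_QUEUE_ID, String.valueOf(original.queueId));
        return new Msg(HALF_TOPIC, 0, props);
    }

    // Commit: rebuild the original destination and drop the helper properties.
    static Msg restore(Msg half) {
        Map<String, String> props = new HashMap<>(half.properties);
        String topic = props.remove(KEY_REAL_TOPIC);
        int queueId = Integer.parseInt(props.remove(KEY_REAL_QUEUE_ID));
        return new Msg(topic, queueId, props);
    }
}
```

A round trip through `toHalf` and `restore` yields the original topic and queue id, which is the property the commit path relies on.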
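Step 3: Executing the local transaction
The Producer calls the synchronous send API and waits for the Broker's result. If the send throws, the exception is rethrown and the local transaction is not executed.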
SendResult sendResult = null;
try {
sendResult = this.send(msg);
} catch (Exception e) {
throw new MQClientException("send message Exception", e);
}
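If the send status is SEND_OK, the local transaction is executed. For the failure statuses (flush-to-disk timeout, slave sync timeout, slave not available), the local transaction is not executed and the local transaction state is set to rollback.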
LocalTransactionState localTransactionState = LocalTransactionState.UNKNOW;
switch (sendResult.getSendStatus()) {
case SEND_OK: {
try {
...
localTransactionState = transactionListener.executeLocalTransaction(msg, arg);
// The local transaction returned no state: default to UNKNOW
if (null == localTransactionState) {
localTransactionState = LocalTransactionState.UNKNOW;
}
...
}
break;
case FLUSH_DISK_TIMEOUT:
case FLUSH_SLAVE_TIMEOUT:
case SLAVE_NOT_AVAILABLE:
localTransactionState = LocalTransactionState.ROLLBACK_MESSAGE;
break;
default:
break;
}
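The mapping from send status to local transaction state can be isolated into a small function. The sketch below uses local enums and a hypothetical `decide` helper rather than the client's own types, and treating a thrown exception as UNKNOW is an assumption about the elided part of the snippet above.

```java
import java.util.function.Supplier;

// Hypothetical, self-contained version of the decision made after the half message is sent.
class LocalTxDecisionSketch {
    enum SendStatus { SEND_OK, FLUSH_DISK_TIMEOUT, FLUSH_SLAVE_TIMEOUT, SLAVE_NOT_AVAILABLE }
    enum LocalState { COMMIT_MESSAGE, ROLLBACK_MESSAGE, UNKNOW }

    static LocalState decide(SendStatus status, Supplier<LocalState> localTransaction) {
        switch (status) {
            case SEND_OK:
                try {
                    LocalState state = localTransaction.get();
                    // A null return is treated as "unknown" and left to the check-back.
                    return state != null ? state : LocalState.UNKNOW;
                } catch (RuntimeException e) {
                    // Assumption: a failing local transaction is also left to the check-back.
                    return LocalState.UNKNOW;
                }
            case FLUSH_DISK_TIMEOUT:
            case FLUSH_SLAVE_TIMEOUT:
            case SLAVE_NOT_AVAILABLE:
                // The half message may not be durable: skip the local transaction entirely.
                return LocalState.ROLLBACK_MESSAGE;
            default:
                return LocalState.UNKNOW;
        }
    }
}
```

The key point is that the local transaction only runs on SEND_OK; in every other case the supplier is never invoked.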
Based on the local transaction result, the Producer sends a Commit or Rollback request to the Broker.
try {
this.endTransaction(sendResult, localTransactionState, localException);
} catch (Exception e) {
}
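The RemotingCommand request code is END_TRANSACTION, and the request is sent one-way. A failed send is tolerable, because the Broker will later check back the local transaction state.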
EndTransactionRequestHeader requestHeader = new EndTransactionRequestHeader();
requestHeader.setTransactionId(transactionId);
requestHeader.setCommitLogOffset(id.getOffset());
switch (localTransactionState) {
case COMMIT_MESSAGE:
requestHeader.setCommitOrRollback(MessageSysFlag.TRANSACTION_COMMIT_TYPE);
break;
case ROLLBACK_MESSAGE:
requestHeader.setCommitOrRollback(MessageSysFlag.TRANSACTION_ROLLBACK_TYPE);
break;
case UNKNOW:
requestHeader.setCommitOrRollback(MessageSysFlag.TRANSACTION_NOT_TYPE);
break;
default:
break;
}
this.mQClientFactory.getMQClientAPIImpl().endTransactionOneway(brokerAddr, requestHeader, remark,
this.defaultMQProducer.getSendMsgTimeout());
Step 4: The Broker commits or rolls back the half message
The Broker receives the END_TRANSACTION command:
org.apache.rocketmq.broker.processor.EndTransactionProcessor#processRequest
Requests that are neither Commit nor Rollback are ignored:
switch (requestHeader.getCommitOrRollback()) {
case MessageSysFlag.TRANSACTION_NOT_TYPE: {
return null;
}
case MessageSysFlag.TRANSACTION_COMMIT_TYPE: {
break;
}
case MessageSysFlag.TRANSACTION_ROLLBACK_TYPE: {
break;
}
default:
return null;
}
Handling Commit or Rollback:
OperationResult result = new OperationResult();
if (MessageSysFlag.TRANSACTION_COMMIT_TYPE == requestHeader.getCommitOrRollback()) {
// Look up the half message
result = this.brokerController.getTransactionalMessageService().commitMessage(requestHeader);
// Was the lookup successful?
if (result.getResponseCode() == ResponseCode.SUCCESS) {
// Verify that the half message matches the request
RemotingCommand res = checkPrepareMessage(result.getPrepareMessage(), requestHeader);
if (res.getCode() == ResponseCode.SUCCESS) {
// Restore the original message from the half message
MessageExtBrokerInner msgInner = endMessageTransaction(result.getPrepareMessage());
msgInner.setSysFlag(MessageSysFlag.resetTransactionValue(msgInner.getSysFlag(), requestHeader.getCommitOrRollback()));
msgInner.setQueueOffset(requestHeader.getTranStateTableOffset());
msgInner.setPreparedTransactionOffset(requestHeader.getCommitLogOffset());
msgInner.setStoreTimestamp(result.getPrepareMessage().getStoreTimestamp());
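// Write the original message to the CommitLog again so that consumers can consume it normally
RemotingCommand sendResult = sendFinalMessage(msgInner);
if (sendResult.getCode() == ResponseCode.SUCCESS) {
// Stored successfully: mark the half message as deleted (by writing an Op message)
this.brokerController.getTransactionalMessageService().deletePrepareMessage(result.getPrepareMessage());
}
// On failure, leave it to a later check-back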
return sendResult;
}
return res;
}
} else if (MessageSysFlag.TRANSACTION_ROLLBACK_TYPE == requestHeader.getCommitOrRollback()) {
result = this.brokerController.getTransactionalMessageService().rollbackMessage(requestHeader);
if (result.getResponseCode() == ResponseCode.SUCCESS) {
RemotingCommand res = checkPrepareMessage(result.getPrepareMessage(), requestHeader);
if (res.getCode() == ResponseCode.SUCCESS) {
// Mark the half message as deleted (by writing an Op message)
this.brokerController.getTransactionalMessageService().deletePrepareMessage(result.getPrepareMessage());
}
return res;
}
}
Looking up the half message (Commit and Rollback both read it by CommitLog offset):
public OperationResult commitMessage(EndTransactionRequestHeader requestHeader) {
return getHalfMessageByOffset(requestHeader.getCommitLogOffset());
}
public OperationResult rollbackMessage(EndTransactionRequestHeader requestHeader) {
return getHalfMessageByOffset(requestHeader.getCommitLogOffset());
}
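Verify that the producer group, the queue offset (tranStateTableOffset) and the CommitLog offset of the half message match the values passed in the request: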
private RemotingCommand checkPrepareMessage(MessageExt msgExt, EndTransactionRequestHeader requestHeader) {
final RemotingCommand response = RemotingCommand.createResponseCommand(null);
if (msgExt != null) {
final String pgroupRead = msgExt.getProperty(MessageConst.PROPERTY_PRODUCER_GROUP);
if (!pgroupRead.equals(requestHeader.getProducerGroup())) {
...
}
if (msgExt.getQueueOffset() != requestHeader.getTranStateTableOffset()) {
...
}
if (msgExt.getCommitLogOffset() != requestHeader.getCommitLogOffset()) {
...
}
} else {
response.setCode(ResponseCode.SYSTEM_ERROR);
response.setRemark("Find prepared transaction message failed");
return response;
}
response.setCode(ResponseCode.SUCCESS);
return response;
}
To commit, the original message must first be restored:
private MessageExtBrokerInner endMessageTransaction(MessageExt msgExt) {
MessageExtBrokerInner msgInner = new MessageExtBrokerInner();
// Restore the topic and queue id saved in the properties
msgInner.setTopic(msgExt.getUserProperty(MessageConst.PROPERTY_REAL_TOPIC));
msgInner.setQueueId(Integer.parseInt(msgExt.getUserProperty(MessageConst.PROPERTY_REAL_QUEUE_ID)));
msgInner.setBody(msgExt.getBody());
msgInner.setFlag(msgExt.getFlag());
msgInner.setBornTimestamp(msgExt.getBornTimestamp());
msgInner.setBornHost(msgExt.getBornHost());
msgInner.setStoreHost(msgExt.getStoreHost());
msgInner.setReconsumeTimes(msgExt.getReconsumeTimes());
msgInner.setWaitStoreMsgOK(false);
msgInner.setTransactionId(msgExt.getUserProperty(MessageConst.PROPERTY_UNIQ_CLIENT_MESSAGE_ID_KEYIDX));
msgInner.setSysFlag(msgExt.getSysFlag());
TopicFilterType topicFilterType =
(msgInner.getSysFlag() & MessageSysFlag.MULTI_TAGS_FLAG) == MessageSysFlag.MULTI_TAGS_FLAG ? TopicFilterType.MULTI_TAG
: TopicFilterType.SINGLE_TAG;
long tagsCodeValue = MessageExtBrokerInner.tagsString2tagsCode(topicFilterType, msgInner.getTags());
msgInner.setTagsCode(tagsCodeValue);
MessageAccessor.setProperties(msgInner, msgExt.getProperties());
msgInner.setPropertiesString(MessageDecoder.messageProperties2String(msgExt.getProperties()));
// Clear the saved topic and queue id from the properties
MessageAccessor.clearProperty(msgInner, MessageConst.PROPERTY_REAL_TOPIC);
MessageAccessor.clearProperty(msgInner, MessageConst.PROPERTY_REAL_QUEUE_ID);
return msgInner;
}
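Commit: the original message is stored in the CommitLog again and the store result is returned. The flush-timeout and slave-related statuses (FLUSH_DISK_TIMEOUT, FLUSH_SLAVE_TIMEOUT, SLAVE_NOT_AVAILABLE) are also treated as success.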
private RemotingCommand sendFinalMessage(MessageExtBrokerInner msgInner) {
final RemotingCommand response = RemotingCommand.createResponseCommand(null);
final PutMessageResult putMessageResult = this.brokerController.getMessageStore().putMessage(msgInner);
if (putMessageResult != null) {
switch (putMessageResult.getPutMessageStatus()) {
// Success
case PUT_OK:
case FLUSH_DISK_TIMEOUT:
case FLUSH_SLAVE_TIMEOUT:
case SLAVE_NOT_AVAILABLE:
response.setCode(ResponseCode.SUCCESS);
response.setRemark(null);
break;
// Failed
case CREATE_MAPEDFILE_FAILED:
case MESSAGE_ILLEGAL:
case PROPERTIES_SIZE_EXCEEDED:
case SERVICE_NOT_AVAILABLE:
case OS_PAGECACHE_BUSY:
case UNKNOWN_ERROR:
default:
response.setCode(ResponseCode.SYSTEM_ERROR);
response.setRemark("UNKNOWN_ERROR DEFAULT");
break;
}
return response;
} else {
response.setCode(ResponseCode.SYSTEM_ERROR);
response.setRemark("store putMessage return null");
}
return response;
}
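Once the original message has been stored successfully, the half message is marked as deleted and the transactional message is complete: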
public boolean deletePrepareMessage(MessageExt msgExt) {
if (this.transactionalMessageBridge.putOpMessage(msgExt, TransactionalMessageUtil.REMOVETAG)) {
return true;
} else {
return false;
}
}
public boolean putOpMessage(MessageExt messageExt, String opType) {
MessageQueue messageQueue = new MessageQueue(messageExt.getTopic(),
this.brokerController.getBrokerConfig().getBrokerName(), messageExt.getQueueId());
if (TransactionalMessageUtil.REMOVETAG.equals(opType)) {
return addRemoveTagInTransactionOp(messageExt, messageQueue);
}
return true;
}
The body of the Op message is the consume-queue offset of the half message:
private boolean addRemoveTagInTransactionOp(MessageExt messageExt, MessageQueue messageQueue) {
Message message = new Message(TransactionalMessageUtil.buildOpTopic(), TransactionalMessageUtil.REMOVETAG,
String.valueOf(messageExt.getQueueOffset()).getBytes(TransactionalMessageUtil.charset));
writeOp(message, messageQueue);
return true;
}
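Since the Op body is just the offset rendered as text, encoding and decoding it is a one-liner each way. The sketch below assumes UTF-8 as the charset and uses a hypothetical `OpBodySketch` helper; it is not the Broker's own utility.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper showing the Op body format: the half queue offset as a decimal string.
class OpBodySketch {
    // Assumption: the charset is UTF-8.
    static byte[] encode(long halfQueueOffset) {
        return String.valueOf(halfQueueOffset).getBytes(StandardCharsets.UTF_8);
    }

    // Reverse operation used when the check task matches Op messages to half messages.
    static long decode(byte[] body) {
        return Long.parseLong(new String(body, StandardCharsets.UTF_8));
    }
}
```

Storing only the offset keeps Op messages tiny, and the decoded value is enough to identify which half message has been resolved.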
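Step 5: The Broker's scheduled check-back
After storing a half message, the Broker periodically runs a check-back if it receives neither a Commit nor a Rollback command.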
public class TransactionalMessageCheckService extends ServiceThread {
public void run() {
// Default interval: 60s
long checkInterval = brokerController.getBrokerConfig().getTransactionCheckInterval();
while (!this.isStopped()) {
this.waitForRunning(checkInterval);
}
}
@Override
protected void onWaitEnd() {
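// By default a message must have been stored for more than 6s before it is checked back
long timeout = brokerController.getBrokerConfig().getTransactionTimeOut();
// Maximum number of check-backs per message; by default the half message is discarded after 15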
int checkMax = brokerController.getBrokerConfig().getTransactionCheckMax();
this.brokerController.getTransactionalMessageService().check(timeout, checkMax, this.brokerController.getTransactionalMessageCheckListener());
}
}
The check-back logic itself:
public void check(long transactionTimeout, int transactionCheckMax,
AbstractTransactionalMessageCheckListener listener) {
try {
String topic = MixAll.RMQ_SYS_TRANS_HALF_TOPIC;
// Fetch all consume queues of the half topic
Set<MessageQueue> msgQueues = transactionalMessageBridge.fetchMessageQueues(topic);
if (msgQueues == null || msgQueues.size() == 0) {
log.warn("The queue of topic is empty :" + topic);
return;
}
log.debug("Check topic={}, queues={}", topic, msgQueues);
for (MessageQueue messageQueue : msgQueues) {
long startTime = System.currentTimeMillis();
// Derive the Op queue from the half queue: same Broker name and queue id, only the topic differs
MessageQueue opQueue = getOpQueue(messageQueue);
// Consume offset of the half queue
long halfOffset = transactionalMessageBridge.fetchConsumeOffset(messageQueue);
// Consume offset of the Op queue
long opOffset = transactionalMessageBridge.fetchConsumeOffset(opQueue);
log.info("Before check, the queue={} msgOffset={} opOffset={}", messageQueue, halfOffset, opOffset);
if (halfOffset < 0 || opOffset < 0) {
log.error("MessageQueue: {} illegal offset read: {}, op offset: {},skip this queue", messageQueue,
halfOffset, opOffset);
continue;
}
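// Offsets of Op messages that have already been processed
List<Long> doneOpOffset = new ArrayList<>();
// Half messages that already have an Op message (half queue offset -> Op queue offset)
HashMap<Long, Long> removeMap = new HashMap<>();
// Compare the two queues; by default 32 messages are pulled from the Op topic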
PullResult pullResult = fillOpRemoveMap(removeMap, opQueue, opOffset, halfOffset, doneOpOffset);
if (null == pullResult) {
log.error("The queue={} check msgOffset={} with opOffset={} failed, pullResult is null",
messageQueue, halfOffset, opOffset);
continue;
}
// single thread
int getMessageNullCount = 1; // number of times a null message was fetched
long newOffset = halfOffset; // latest progress of the half queue
long i = halfOffset;
while (true) {
// One check round runs for at most 60s per queue, then exits and leaves the rest to the next round
if (System.currentTimeMillis() - startTime > MAX_PROCESS_TIME_LIMIT) {
log.info("Queue={} process time reach max={}", messageQueue, MAX_PROCESS_TIME_LIMIT);
break;
}
// The message has already been processed: skip it
if (removeMap.containsKey(i)) {
log.info("Half offset {} has been committed/rolled back", i);
removeMap.remove(i);
} else {
// Fetch the half message
GetResult getResult = getHalfMsg(messageQueue, i);
MessageExt msgExt = getResult.getMsg();
if (msgExt == null) {
// Nothing fetched: retry once
if (getMessageNullCount++ > MAX_RETRY_COUNT_WHEN_HALF_NULL) {
break;
}
if (getResult.getPullResult().getPullStatus() == PullStatus.NO_NEW_MSG) {
log.debug("No new msg, the miss offset={} in={}, continue check={}, pull result={}", i,
messageQueue, getMessageNullCount, getResult.getPullResult());
// No more messages in this queue: end the check for this queue
break;
} else {
log.info("Illegal offset, the miss offset={} in={}, continue check={}, pull result={}",
i, messageQueue, getMessageNullCount, getResult.getPullResult());
i = getResult.getPullResult().getNextBeginOffset();
newOffset = i;
// Pull again from the corrected offset
continue;
}
}
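// Discard if the message was checked more than the maximum (15 by default); skip if it has outlived the file retention time (72h)
if (needDiscard(msgExt, transactionCheckMax) || needSkip(msgExt)) {
listener.resolveDiscardMsg(msgExt);
// Advance the progress by one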
newOffset = i + 1;
i++;
continue;
}
// Stop when reaching a message stored after this check round started
if (msgExt.getStoreTimestamp() >= startTime) {
log.debug("Fresh stored. the miss offset={}, check it later, store={}", i,
new Date(msgExt.getStoreTimestamp()));
break;
}
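// Time elapsed since the message was born (created by the producer)
long valueOfCurrentMinusBorn = System.currentTimeMillis() - msgExt.getBornTimestamp();
// Immunity time: the message is only checked back after this long (6s by default)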
long checkImmunityTime = transactionTimeout;
String checkImmunityTimeStr = msgExt.getUserProperty(MessageConst.PROPERTY_CHECK_IMMUNITY_TIME_IN_SECONDS);
if (null != checkImmunityTimeStr) {
checkImmunityTime = getImmunityTime(checkImmunityTimeStr, transactionTimeout);
if (valueOfCurrentMinusBorn < checkImmunityTime) {
if (checkPrepareQueueOffset(removeMap, doneOpOffset, msgExt)) {
newOffset = i + 1;
i++;
continue;
}
}
} else {
if ((0 <= valueOfCurrentMinusBorn) && (valueOfCurrentMinusBorn < checkImmunityTime)) {
log.debug("New arrived, the miss offset={}, check it later checkImmunity={}, born={}", i,
checkImmunityTime, new Date(msgExt.getBornTimestamp()));
break;
}
}
List<MessageExt> opMsg = pullResult.getMsgFoundList();
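// Check back if there are no Op messages and the half message has outlived the immunity time,
// or if the last Op message pulled was born more than transactionTimeout after this round started,
// or if the born timestamp lies in the future (negative age)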
boolean isNeedCheck = (opMsg == null && valueOfCurrentMinusBorn > checkImmunityTime)
|| (opMsg != null && (opMsg.get(opMsg.size() - 1).getBornTimestamp() - startTime > transactionTimeout))
|| (valueOfCurrentMinusBorn <= -1);
// A check-back is needed
if (isNeedCheck) {
// Write the half message to the CommitLog again so the offset can advance; the check count is stored with it
if (!putBackHalfMsgQueue(msgExt, i)) {
continue;
}
// Send the check request to the producer
listener.resolveHalfMsg(msgExt);
} else {
// Cannot decide yet: load more Op messages
pullResult = fillOpRemoveMap(removeMap, opQueue, pullResult.getNextBeginOffset(), halfOffset, doneOpOffset);
log.info("The miss offset:{} in messageQueue:{} need to get more opMsg, result is:{}", i,
messageQueue, pullResult);
continue;
}
}
newOffset = i + 1;
i++;
}
// Save the check progress of the half queue
if (newOffset != halfOffset) {
transactionalMessageBridge.updateConsumeOffset(messageQueue, newOffset);
}
long newOpOffset = calculateOpOffset(doneOpOffset, opOffset);
// Save the check progress of the Op queue
if (newOpOffset != opOffset) {
transactionalMessageBridge.updateConsumeOffset(opQueue, newOpOffset);
}
}
} catch (Exception e) {
e.printStackTrace();
log.error("Check error", e);
}
}
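The core of the loop, stripped of retries, time limits and Op paging, can be sketched as a pure function. This is a heavily simplified model: `CheckScanSketch` and its parameters are hypothetical, the list index stands in for the half queue offset, and the re-append of checked messages (putBackHalfMsgQueue) is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Simplified model of one check round over a single half queue; NOT the Broker's code.
class CheckScanSketch {
    static final class Result {
        final List<Long> toCheckBack; // offsets that need a check-back request
        final long newHalfOffset;     // where the next round should start

        Result(List<Long> toCheckBack, long newHalfOffset) {
            this.toCheckBack = toCheckBack;
            this.newHalfOffset = newHalfOffset;
        }
    }

    // bornTimestamps.get(i) is the born time of the half message at offset i.
    static Result scan(List<Long> bornTimestamps, long halfOffset, Set<Long> resolvedOffsets,
                       long now, long immunityMillis) {
        List<Long> toCheckBack = new ArrayList<>();
        long newOffset = halfOffset;
        for (long i = halfOffset; i < bornTimestamps.size(); i++) {
            if (!resolvedOffsets.contains(i)) {
                long age = now - bornTimestamps.get((int) i);
                if (age < immunityMillis) {
                    break; // too young: stop here and look again in the next round
                }
                toCheckBack.add(i);
            }
            // Resolved or checked: the progress moves past this offset.
            newOffset = i + 1;
        }
        return new Result(toCheckBack, newOffset);
    }
}
```

The scan stops at the first message that is still inside its immunity window, so the saved offset never skips over a message whose fate is unknown and unchecked.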
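When a check-back is needed, putBackHalfMsgQueue is called first to store the half message in the CommitLog again, and the progress advances by one. RocketMQ only appends to its files sequentially, so a message cannot truly be deleted.
A thread-pool task then sends the check-back request without waiting for the result, because the Producer will send the outcome back as a separate request.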
public void resolveHalfMsg(final MessageExt msgExt) {
executorService.execute(new Runnable() {
@Override
public void run() {
try {
sendCheckMessage(msgExt);
} catch (Exception e) {
LOGGER.error("Send check message error!", e);
}
}
});
}
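Build the request and, based on the message's producer group, pick one producer from the group in round-robin fashion to send the check-back to: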
public void sendCheckMessage(MessageExt msgExt) throws Exception {
CheckTransactionStateRequestHeader checkTransactionStateRequestHeader = new CheckTransactionStateRequestHeader();
checkTransactionStateRequestHeader.setCommitLogOffset(msgExt.getCommitLogOffset());
checkTransactionStateRequestHeader.setOffsetMsgId(msgExt.getMsgId());
checkTransactionStateRequestHeader.setMsgId(msgExt.getUserProperty(MessageConst.PROPERTY_UNIQ_CLIENT_MESSAGE_ID_KEYIDX));
checkTransactionStateRequestHeader.setTransactionId(checkTransactionStateRequestHeader.getMsgId());
checkTransactionStateRequestHeader.setTranStateTableOffset(msgExt.getQueueOffset());
msgExt.setTopic(msgExt.getUserProperty(MessageConst.PROPERTY_REAL_TOPIC));
msgExt.setQueueId(Integer.parseInt(msgExt.getUserProperty(MessageConst.PROPERTY_REAL_QUEUE_ID)));
msgExt.setStoreSize(0);
String groupId = msgExt.getProperty(MessageConst.PROPERTY_PRODUCER_GROUP);
Channel channel = brokerController.getProducerManager().getAvaliableChannel(groupId);
if (channel != null) {
brokerController.getBroker2Client().checkProducerTransactionState(groupId, channel, checkTransactionStateRequestHeader, msgExt);
} else {
LOGGER.warn("Check transaction failed, channel is null. groupId={}", groupId);
}
}
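Both producers and consumers send heartbeats to every Broker, so the Broker knows the channels of each producer group. Starting from a round-robin index, it returns the first channel that is active and writable: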
public Channel getAvaliableChannel(String groupId) {
HashMap<Channel, ClientChannelInfo> channelClientChannelInfoHashMap = groupChannelTable.get(groupId);
List<Channel> channelList = new ArrayList<Channel>();
if (channelClientChannelInfoHashMap != null) {
for (Channel channel : channelClientChannelInfoHashMap.keySet()) {
channelList.add(channel);
}
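// Number of channels registered for this group; nothing to pick if it is empty
int size = channelList.size();
if (size == 0) {
return null;
}
int index = positiveAtomicCounter.incrementAndGet() % size;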
Channel channel = channelList.get(index);
int count = 0;
boolean isOk = channel.isActive() && channel.isWritable();
while (count++ < GET_AVALIABLE_CHANNEL_RETRY_COUNT) {
if (isOk) {
return channel;
}
index = (++index) % size;
channel = channelList.get(index);
isOk = channel.isActive() && channel.isWritable();
}
} else {
log.warn("Check transaction failed, channel table is empty. groupId={}", groupId);
return null;
}
return null;
}
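The selection strategy is independent of Netty channels and can be sketched generically. In this minimal version, `RoundRobinPickSketch`, the `usable` predicate and the retry count of 3 are all assumptions made for illustration.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

// Hypothetical generic version of "round-robin start, then probe a few neighbours".
class RoundRobinPickSketch {
    private static final int RETRY_COUNT = 3; // assumed value
    private final AtomicInteger counter = new AtomicInteger();

    <T> T pick(List<T> candidates, Predicate<T> usable) {
        int size = candidates.size();
        if (size == 0) {
            return null;
        }
        // floorMod keeps the index non-negative even after the counter overflows.
        int index = Math.floorMod(counter.incrementAndGet(), size);
        for (int attempt = 0; attempt < RETRY_COUNT; attempt++) {
            T candidate = candidates.get(index);
            if (usable.test(candidate)) {
                return candidate;
            }
            index = (index + 1) % size;
        }
        return null; // no usable candidate within the retry budget
    }
}
```

Successive calls start from different positions, which spreads check-back requests across the producers of a group instead of always hitting the first one.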
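The RemotingCommand request code used is CHECK_TRANSACTION_STATE.
Step 6: The Producer checks the local transaction
The producer receives the Broker's check-back request and resolves the Broker address: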
org.apache.rocketmq.client.impl.ClientRemotingProcessor#checkTransactionState
final String addr = RemotingHelper.parseChannelRemoteAddr(ctx.channel());
// Check the local transaction
producer.checkTransactionState(addr, messageExt, requestHeader);
The producer uses a thread pool to handle the Broker's check-back requests:
public void checkTransactionState(final String addr, final MessageExt msg,
final CheckTransactionStateRequestHeader header) {
Runnable request = new Runnable() {
private final String brokerAddr = addr;
private final MessageExt message = msg;
private final CheckTransactionStateRequestHeader checkRequestHeader = header;
private final String group = DefaultMQProducerImpl.this.defaultMQProducer.getProducerGroup();
@Override
public void run() {
// Check whether a local transaction listener is registered
TransactionCheckListener transactionCheckListener = DefaultMQProducerImpl.this.checkListener();
TransactionListener transactionListener = getCheckListener();
if (transactionCheckListener != null || transactionListener != null) {
LocalTransactionState localTransactionState = LocalTransactionState.UNKNOW;
Throwable exception = null;
try {
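// Query the local transaction outcome; this is the part the producer application must implement
if (transactionCheckListener != null) {
// Old vs. new API: the old transaction interface is deprecated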
localTransactionState = transactionCheckListener.checkLocalTransactionState(message);
} else if (transactionListener != null) {
log.debug("Used new check API in transaction message");
localTransactionState = transactionListener.checkLocalTransaction(message);
} else {
log.warn("CheckTransactionState, pick transactionListener by group[{}] failed", group);
}
} catch (Throwable e) {
log.error("Broker call checkTransactionState, but checkLocalTransactionState exception", e);
exception = e;
}
// Report the local outcome to the Broker so it can resolve the transaction
this.processTransactionState(
localTransactionState,
group,
exception);
} else {
log.warn("CheckTransactionState, pick transactionCheckListener by group[{}] failed", group);
}
}
// Similar to the endTransaction method
private void processTransactionState(
final LocalTransactionState localTransactionState,
final String producerGroup,
final Throwable exception) {
...
try {
DefaultMQProducerImpl.this.mQClientFactory.getMQClientAPIImpl().endTransactionOneway(brokerAddr, thisHeader, remark,
3000);
} catch (Exception e) {
}
}
};
this.checkExecutor.submit(request);
}
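On the application side, the check-back is only useful if the producer can look up what happened to a given transaction. One simple way to do that is to record outcomes in a table keyed by transaction id. The sketch below is a hypothetical, in-memory illustration with its own enum and class names; a real application would typically consult a database, and would implement the client's listener interface instead.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory outcome table backing both "execute" and "check".
class LocalTxTableSketch {
    enum LocalState { COMMIT_MESSAGE, ROLLBACK_MESSAGE, UNKNOW }

    private final Map<String, LocalState> outcomes = new ConcurrentHashMap<>();

    // Runs the business logic and records its outcome under the transaction id.
    LocalState execute(String transactionId, Runnable businessLogic) {
        outcomes.put(transactionId, LocalState.UNKNOW);
        try {
            businessLogic.run();
            outcomes.put(transactionId, LocalState.COMMIT_MESSAGE);
        } catch (RuntimeException e) {
            outcomes.put(transactionId, LocalState.ROLLBACK_MESSAGE);
        }
        return outcomes.get(transactionId);
    }

    // Answers a check-back: unknown ids stay UNKNOW so the Broker asks again later.
    LocalState check(String transactionId) {
        return outcomes.getOrDefault(transactionId, LocalState.UNKNOW);
    }
}
```

Returning UNKNOW for an id that has no recorded outcome is the conservative choice, since it defers the decision to a later check-back rather than guessing.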
Step 7: The Broker commits or rolls back the half message again
This invokes org.apache.rocketmq.broker.processor.EndTransactionProcessor#processRequest again; the logic is the same as in step 4.
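Transactional message topic directories
Summary
A transactional message is first written as a half message: its topic and queue are replaced, and the original topic and queue id are saved in the message properties. Because the topic has been replaced, the message is not dispatched to the consume queue of the original topic, so consumers are unaware of it and cannot consume it.
After phase one has written a message that is invisible to users, phase two depends on the outcome. For a Commit, the message must become visible, so the original message is restored and stored in the CommitLog again, and the phase-one message is marked as deleted. For a Rollback, the phase-one message is cancelled.
RocketMQ cannot truly delete a message because its files are written sequentially. Instead, it uses Op messages to record that a transactional message has reached a final state (Commit or Rollback). Op messages live in an internal topic (like the half topic) that users never consume.
The body of an Op message is the consume-queue offset of the corresponding half message, so an Op message can be matched to its half message. A half message without a corresponding Op message is one whose transaction state is still undetermined (phase two may have failed). The check-back compares half messages against Op messages to decide which ones to check and to advance the processing progress.
For undetermined messages, the Broker initiates a check-back by sending the message to a Producer in the same producer group. That Producer checks the local transaction state for the message and then issues a Commit or Rollback.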