Transactional Messages

Overview

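A RocketMQ transactional message (Transactional Message) lets an application place its local transaction and the message send in one global transaction: both succeed or both fail. RocketMQ transactional messages provide distributed-transaction functionality similar to X/Open XA, and through them a distributed transaction reaches eventual consistency.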

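The design mainly addresses atomicity between sending a message on the producer side and executing the local transaction. Three properties of RocketMQ make this possible. Two-way communication between broker and producer lets the broker act naturally as the transaction coordinator. RocketMQ's own storage mechanism gives transactional messages persistence. Its high-availability and reliable-messaging design ensure that a transaction still reaches eventual consistency when the system fails.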

Transactional Message Design

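Apache RocketMQ has supported distributed transactional messages since version 4.3.0. It commits a transactional message using the idea of two-phase commit (2PC) and adds a compensation step (transaction status check-back) to handle messages whose second phase timed out or failed, as shown in the figure below.

[Figure: transactional message flow]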

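  1. The transaction initiator first sends a prepare message to MQ.
  2. Once the prepare message has been sent successfully, it executes the local transaction.
  3. It returns commit or rollback according to the result of the local transaction.
  4. It sends the operation message (Operation Message), that is, the commit or rollback. On rollback, MQ deletes the prepare message and never delivers it; on commit, MQ delivers the message to consumers.
  5. If the executing side crashes or times out while running the local transaction, MQ repeatedly asks the other producers in the same group for the transaction status.
  6. Successful consumption on the consumer side is guaranteed by MQ.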
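
The six steps above can be condensed into a small in-memory model. This is only a sketch: `MiniBroker`, `TxFlowState`, and every method name are hypothetical and are not RocketMQ classes. It shows that a prepared message becomes visible to consumers only after a commit, and that a message left unresolved is settled later by a check-back.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical outcome of a local transaction.
enum TxFlowState { COMMIT, ROLLBACK, UNKNOWN }

// Hypothetical in-memory stand-in for the broker side of the flow.
final class MiniBroker {
    private final Map<Long, String> prepared = new LinkedHashMap<>(); // prepare messages, hidden from consumers
    private final List<String> delivered = new ArrayList<>();         // messages visible to consumers
    private long nextId = 0;

    // Step 1: store the prepare message and return its id.
    long prepare(String body) {
        prepared.put(nextId, body);
        return nextId++;
    }

    // Step 4: apply the producer's decision; UNKNOWN leaves the message pending.
    void end(long id, TxFlowState state) {
        if (state == TxFlowState.UNKNOWN) {
            return;
        }
        String body = prepared.remove(id);
        if (body != null && state == TxFlowState.COMMIT) {
            delivered.add(body);
        }
    }

    // Step 5: ask a producer of the group about every message that is still pending.
    void checkBack(Function<Long, TxFlowState> producerCheck) {
        for (Long id : new ArrayList<>(prepared.keySet())) {
            end(id, producerCheck.apply(id));
        }
    }

    List<String> delivered() { return delivered; }

    int pending() { return prepared.size(); }
}
```

The model deliberately omits persistence, timeouts, and the limit on the number of check-backs.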

RocketMQ Transactional Message Implementation

The implementation makes full use of RocketMQ's own mechanisms. It has no external dependencies and is also high-performance, scalable, and fully asynchronous.

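Concretely, RocketMQ records the progress of transactional messages in two internal queues, one under the Half Topic and one under the Operation Topic, as shown in the figure below:

[Figure: Half Topic and Operation Topic queues]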

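The queue under the Half Topic holds the prepare messages. The queue under the Operation Topic holds the commit/rollback records for those prepare messages, and the body of each record is the offset of the prepare message it refers to. The broker compares the two queues to find transactions that have timed out without being committed or rolled back, and checks them back.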
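
The comparison of the two queues can be sketched as a set difference. This is a simplified model, not the broker's code: `PendingHalfFinder` and `findPending` are hypothetical names, and the queues are plain lists.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper that finds prepare messages with no commit/rollback record.
final class PendingHalfFinder {
    // halfOffsets: offsets of the prepare messages in the half queue, in order.
    // opBodies: bodies of the operation records; each body is the half-queue offset it resolves.
    static List<Long> findPending(List<Long> halfOffsets, List<String> opBodies) {
        Set<Long> resolved = new HashSet<>();
        for (String body : opBodies) {
            resolved.add(Long.parseLong(body));
        }
        List<Long> pending = new ArrayList<>();
        for (Long offset : halfOffsets) {
            if (!resolved.contains(offset)) {
                pending.add(offset); // candidate for a check-back
            }
        }
        return pending;
    }
}
```

With half offsets 0 to 3 and operation records for 0 and 2, the offsets 1 and 3 remain as check-back candidates.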

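Transactional messages are implemented as one application of ordinary messages. The implementation is layered so that RocketMQ's existing storage mechanism did not have to change, as shown in the figure below:

[Figure: layered implementation of transactional messages]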

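On the user side, the user implements two methods, one that executes the local transaction and one that answers check-backs for it, so the user only has to track the state of the local transaction. The service layer abstracts the two-phase commit of a transactional message and implements check-back for timed-out transactions: it keeps scanning the progress of current transactions and calls back to the producer side to obtain the state of those that have timed out. This keeps transactions from hanging and also avoids a single point of failure on the producer side. In the storage layer, RocketMQ wraps the operations on the underlying queue storage in a Bridge, which operates the two internal queues. Users can also implement their own service on another storage medium; RocketMQ loads it through ServiceProvider.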
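
The two user-side methods can be sketched as follows. The interface `TxCallbacks` is a hypothetical stand-in written for this example and is not the RocketMQ listener interface; the idea is only that the execute method records the outcome under the transaction id, and the check method looks it up.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical state enum for this sketch.
enum UserTxState { COMMIT_MESSAGE, ROLLBACK_MESSAGE, UNKNOW }

// Hypothetical stand-in for the two callbacks a user implements.
interface TxCallbacks {
    UserTxState executeLocalTransaction(String transactionId, Runnable businessLogic);

    UserTxState checkLocalTransaction(String transactionId);
}

final class RecordingTxCallbacks implements TxCallbacks {
    // Assumption: an in-memory map is enough for the demo; see the note below.
    private final Map<String, UserTxState> outcomes = new ConcurrentHashMap<>();

    @Override
    public UserTxState executeLocalTransaction(String transactionId, Runnable businessLogic) {
        UserTxState state;
        try {
            businessLogic.run();
            state = UserTxState.COMMIT_MESSAGE;
        } catch (RuntimeException e) {
            state = UserTxState.ROLLBACK_MESSAGE;
        }
        outcomes.put(transactionId, state);
        return state;
    }

    @Override
    public UserTxState checkLocalTransaction(String transactionId) {
        // No record yet: report the state as unknown so the broker asks again later.
        return outcomes.getOrDefault(transactionId, UserTxState.UNKNOW);
    }
}
```

Because a check-back may reach a different producer instance of the same group, a real implementation would likely keep the outcome in storage shared by the group, for example in the same database transaction as the business data.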

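As the design above shows, RocketMQ transactional messages handle the eventual consistency of a transaction well. The transaction initiator only needs to execute the local transaction and implement the check-back interface that reports its state. In addition, when the upstream transaction rate peaks, the message queue absorbs the load and keeps excessive pressure off downstream services.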

Transactional Message Initialization

Initializing the transactional message environment assigns checkExecutor, the executor that runs the checks of local transaction state.

public void initTransactionEnv() {
    TransactionMQProducer producer = (TransactionMQProducer) this.defaultMQProducer;
    // Preferred: use the executor configured on the producer as checkExecutor
    if (producer.getExecutorService() != null) {
        this.checkExecutor = producer.getExecutorService();
    } else {
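        // Deprecated path, not recommended; prefer configuring the executor on the producer
        // Queue that holds check requests for messages whose transaction state is unknown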
        this.checkRequestQueue = new LinkedBlockingQueue<Runnable>(producer.getCheckRequestHoldMax());
        this.checkExecutor = new ThreadPoolExecutor(
            producer.getCheckThreadPoolMinSize(),
            producer.getCheckThreadPoolMaxSize(),
            1000 * 60,
            TimeUnit.MILLISECONDS,
            this.checkRequestQueue);
    }
}
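
The choice between the two branches can be isolated in a few lines. This is a sketch with hypothetical names (`CheckExecutorChooser`, `choose`); the pool sizes and queue capacity passed in the usage below are illustrative values, not a statement about RocketMQ's defaults.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper mirroring the two branches of initTransactionEnv().
final class CheckExecutorChooser {
    static ExecutorService choose(ExecutorService userSupplied, int minThreads, int maxThreads, int holdMax) {
        if (userSupplied != null) {
            return userSupplied; // preferred: the caller owns and sizes the pool
        }
        // Fallback: a pool with a bounded queue of pending check requests.
        return new ThreadPoolExecutor(minThreads, maxThreads, 60, TimeUnit.SECONDS,
            new LinkedBlockingQueue<Runnable>(holdMax));
    }
}
```

A call such as `CheckExecutorChooser.choose(null, 1, 1, 2000)` yields the fallback pool. With a bounded queue and `ThreadPoolExecutor`'s default rejection policy, a check request submitted while all threads are busy and the queue is full is rejected with a `RejectedExecutionException`, which is one reason to size the pool explicitly.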

Message Sending and Storage

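Sending a transactional message follows the same overall flow as sending an ordinary message, with a small amount of transaction-specific handling mixed in.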

The prepare message of a transactional message is sent to the half topic RMQ_SYS_TRANS_HALF_TOPIC and is then persisted to disk asynchronously.

Producer-side send handling
// Send a transactional message
public TransactionSendResult sendMessageInTransaction(final Message msg,
    final LocalTransactionExecuter localTransactionExecuter, final Object arg)
    throws MQClientException {
    // Get the transaction listener
    TransactionListener transactionListener = getCheckListener();
    if (null == localTransactionExecuter && null == transactionListener) {
        throw new MQClientException("tranExecutor is null", null);
    }
    // Transactional messages do not support delayed or batch messages.
    // ignore DelayTimeLevel parameter
    if (msg.getDelayTimeLevel() != 0) {
        MessageAccessor.clearProperty(msg, MessageConst.PROPERTY_DELAY_TIME_LEVEL);
    }
    // Validate the message to send
    Validators.checkMessage(msg, this.defaultMQProducer);

    SendResult sendResult = null;
    MessageAccessor.putProperty(msg, MessageConst.PROPERTY_TRANSACTION_PREPARED, "true");
    MessageAccessor.putProperty(msg, MessageConst.PROPERTY_PRODUCER_GROUP, this.defaultMQProducer.getProducerGroup());
    try {
        // Send the message synchronously
        sendResult = this.send(msg);
    } catch (Exception e) {
        throw new MQClientException("send message Exception", e);
    }
    // Initialize the local transaction state
    LocalTransactionState localTransactionState = LocalTransactionState.UNKNOW;
    Throwable localException = null;
    switch (sendResult.getSendStatus()) {
        case SEND_OK: {
            try {
                // Set the transaction id property
                if (sendResult.getTransactionId() != null) {
                    msg.putUserProperty("__transactionId__", sendResult.getTransactionId());
                }
                // Use the message UNIQ_KEY as the transaction id
                String transactionId = msg.getProperty(MessageConst.PROPERTY_UNIQ_CLIENT_MESSAGE_ID_KEYIDX);
                if (null != transactionId && !"".equals(transactionId)) {
                    msg.setTransactionId(transactionId);
                }
                // Deprecated interface; the else branch is the preferred one
                if (null != localTransactionExecuter) {
                    localTransactionState = localTransactionExecuter.executeLocalTransactionBranch(msg, arg);
                } else if (transactionListener != null) {
                    log.debug("Used new transaction API");
                    // Execute the local transaction
                    localTransactionState = transactionListener.executeLocalTransaction(msg, arg);
                }
                if (null == localTransactionState) {
                    localTransactionState = LocalTransactionState.UNKNOW;
                }
                // Log when the result is anything other than COMMIT_MESSAGE
                if (localTransactionState != LocalTransactionState.COMMIT_MESSAGE) {
                    log.info("executeLocalTransactionBranch return {}", localTransactionState);
                    log.info(msg.toString());
                }
            } catch (Throwable e) {
                log.info("executeLocalTransactionBranch exception", e);
                log.info(msg.toString());
                localException = e;
            }
        }
        break;
        // These send statuses result in the rollback state
        case FLUSH_DISK_TIMEOUT:
        case FLUSH_SLAVE_TIMEOUT:
        case SLAVE_NOT_AVAILABLE:
            localTransactionState = LocalTransactionState.ROLLBACK_MESSAGE;
            break;
        default:
            break;
    }

    try {
        // End the transaction: send the commit/rollback Operation message
        this.endTransaction(sendResult, localTransactionState, localException);
    } catch (Exception e) {
        log.warn("local transaction execute " + localTransactionState + ", but end broker transaction failed", e);
    }
    // Build the result of the transactional send
    TransactionSendResult transactionSendResult = new TransactionSendResult();
    transactionSendResult.setSendStatus(sendResult.getSendStatus());
    transactionSendResult.setMessageQueue(sendResult.getMessageQueue());
    transactionSendResult.setMsgId(sendResult.getMsgId());
    transactionSendResult.setQueueOffset(sendResult.getQueueOffset());
    transactionSendResult.setTransactionId(sendResult.getTransactionId());
    transactionSendResult.setLocalTransactionState(localTransactionState);
    return transactionSendResult;
}
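
The way the method above derives the final local transaction state can be reduced to one small function. This is a sketch with hypothetical types (`SendStatusSketch`, `LocalStateSketch`, `LocalStateResolver`) that mirrors the switch statement; unlike the original, which catches Throwable, it catches only RuntimeException.

```java
import java.util.function.Supplier;

// Hypothetical copies of the statuses used in the listing above.
enum SendStatusSketch { SEND_OK, FLUSH_DISK_TIMEOUT, FLUSH_SLAVE_TIMEOUT, SLAVE_NOT_AVAILABLE }

enum LocalStateSketch { COMMIT_MESSAGE, ROLLBACK_MESSAGE, UNKNOW }

final class LocalStateResolver {
    static LocalStateSketch resolve(SendStatusSketch status, Supplier<LocalStateSketch> localTransaction) {
        if (status != SendStatusSketch.SEND_OK) {
            // The prepare message was not stored cleanly: do not run the local transaction.
            return LocalStateSketch.ROLLBACK_MESSAGE;
        }
        try {
            LocalStateSketch state = localTransaction.get();
            return state != null ? state : LocalStateSketch.UNKNOW;
        } catch (RuntimeException e) {
            // An exception leaves the state unknown, so the broker will check back later.
            return LocalStateSketch.UNKNOW;
        }
    }
}
```

The asymmetry is worth noting: a failed send maps to rollback, while a failed local transaction maps to unknown and is settled by a later check-back.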

sendResult = this.send(msg);
The synchronous send goes through the ordinary message send flow, which contains a small amount of transaction code, for example:

DefaultMQProducerImpl.sendKernelImpl
// Is this a transactional prepare message (TRAN_MSG)?
final String tranMsg = msg.getProperty(MessageConst.PROPERTY_TRANSACTION_PREPARED);
if (tranMsg != null && Boolean.parseBoolean(tranMsg)) {
    sysFlag |= MessageSysFlag.TRANSACTION_PREPARED_TYPE;
}

// Mark the message type of a transactional message as a prepared half message
String isTrans = msg.getProperty(MessageConst.PROPERTY_TRANSACTION_PREPARED);
if (isTrans != null && isTrans.equals("true")) {
    // Set the message type in the context to half (prepare) message
    context.setMsgType(MessageType.Trans_Msg_Half);
}
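
The sysFlag manipulation above is plain bit arithmetic: the transaction type occupies two bits of the flag. The sketch below reproduces that arithmetic in a hypothetical class `TxSysFlagSketch`. The bit values are written from memory of RocketMQ's MessageSysFlag and should be treated as assumptions to verify against the version in use.

```java
// Hypothetical copy of the transaction bits of the system flag (values are assumptions).
final class TxSysFlagSketch {
    static final int TRANSACTION_NOT_TYPE = 0;
    static final int TRANSACTION_PREPARED_TYPE = 0x1 << 2;
    static final int TRANSACTION_COMMIT_TYPE = 0x2 << 2;
    static final int TRANSACTION_ROLLBACK_TYPE = 0x3 << 2; // also serves as the two-bit mask

    // Extract only the transaction bits.
    static int getTransactionValue(int flag) {
        return flag & TRANSACTION_ROLLBACK_TYPE;
    }

    // Clear the transaction bits, then set the new type; all other bits are kept.
    static int resetTransactionValue(int flag, int type) {
        return (flag & ~TRANSACTION_ROLLBACK_TYPE) | type;
    }
}
```

Starting from a flag with an unrelated low bit set, `flag |= TRANSACTION_PREPARED_TYPE` marks the message as prepared, and `resetTransactionValue(flag, TRANSACTION_COMMIT_TYPE)` later switches it to committed without touching the unrelated bit.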

Broker-side message storage

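Broker-side storage follows the same flow as for ordinary messages, plus a topic rewrite: the prepare message is stored under the half topic RMQ_SYS_TRANS_HALF_TOPIC and is then persisted to disk asynchronously.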

SendMessageProcessor.asyncSendMessage decides how a message is stored; transactional messages are stored through TransactionalMessageService.asyncPrepareMessage().

// Transactional message
String transFlag = origProps.get(MessageConst.PROPERTY_TRANSACTION_PREPARED);
if (transFlag != null && Boolean.parseBoolean(transFlag)) {
    // The broker is configured to reject transactional messages
    if (this.brokerController.getBrokerConfig().isRejectTransactionMessage()) {
        response.setCode(ResponseCode.NO_PERMISSION);
        response.setRemark(
                "the broker[" + this.brokerController.getBrokerConfig().getBrokerIP1()
                        + "] sending transaction message is forbidden");
        return CompletableFuture.completedFuture(response);
    }
    // Handle the transactional prepare message asynchronously
    putMessageResult = this.brokerController.getTransactionalMessageService().asyncPrepareMessage(msgInner);
} else {
    // Store a non-transactional message
    putMessageResult = this.brokerController.getMessageStore().asyncPutMessage(msgInner);
}

TransactionalMessageBridge.asyncPutHalfMessage() then persists the prepare message under the half topic asynchronously. Before the write, the topic and queueId of the prepare message are rewritten to the half topic RMQ_SYS_TRANS_HALF_TOPIC and queueId 0.

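TransactionalMessageBridge wraps the transactional message service's operations on the underlying queue storage for half messages and operation messages. It is used to operate these two topics and the internal queues under them.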

// Handle the transactional prepare message: persist it under the half topic asynchronously
public CompletableFuture<PutMessageResult> asyncPutHalfMessage(MessageExtBrokerInner messageInner) {
    return store.asyncPutMessage(parseHalfMessageInner(messageInner));
}
// Parse the prepare message and rewrite its topic and queueId
private MessageExtBrokerInner parseHalfMessageInner(MessageExtBrokerInner msgInner) {
    // Save the real topic and queueId in the properties
    MessageAccessor.putProperty(msgInner, MessageConst.PROPERTY_REAL_TOPIC, msgInner.getTopic());
    MessageAccessor.putProperty(msgInner, MessageConst.PROPERTY_REAL_QUEUE_ID,
        String.valueOf(msgInner.getQueueId()));
    // Set the transaction-related system flag
    msgInner.setSysFlag(
        MessageSysFlag.resetTransactionValue(msgInner.getSysFlag(), MessageSysFlag.TRANSACTION_NOT_TYPE));
    // Set the topic of the prepare message to RMQ_SYS_TRANS_HALF_TOPIC
    msgInner.setTopic(TransactionalMessageUtil.buildHalfTopic());
    // The queue id is 0
    msgInner.setQueueId(0);
    // Serialize the properties into the properties string
    msgInner.setPropertiesString(MessageDecoder.messageProperties2String(msgInner.getProperties()));
    return msgInner;
}
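
The rewrite performed by parseHalfMessageInner(), and its reversal at commit time, can be modelled with a plain property map. `HalfMessageSketch` is a hypothetical class, and the two property keys are assumptions chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical message model: a topic, a queue id, and string properties.
final class HalfMessageSketch {
    static final String HALF_TOPIC = "RMQ_SYS_TRANS_HALF_TOPIC";
    static final String REAL_TOPIC = "REAL_TOPIC";   // assumed property key
    static final String REAL_QUEUE_ID = "REAL_QID";  // assumed property key

    String topic;
    int queueId;
    final Map<String, String> properties = new HashMap<>();

    HalfMessageSketch(String topic, int queueId) {
        this.topic = topic;
        this.queueId = queueId;
    }

    // Prepare phase: remember the real destination, then redirect to the half topic.
    void toHalf() {
        properties.put(REAL_TOPIC, topic);
        properties.put(REAL_QUEUE_ID, String.valueOf(queueId));
        topic = HALF_TOPIC;
        queueId = 0;
    }

    // Commit phase: restore the real destination from the saved properties.
    void restore() {
        topic = properties.remove(REAL_TOPIC);
        queueId = Integer.parseInt(properties.remove(REAL_QUEUE_ID));
    }
}
```

Because consumers subscribe to the real topic, a message parked under the half topic is invisible to them until it has been restored and stored again.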

Sending the Operation Message

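An Operation message is a record of an operation. Here the flow is as follows: the producer sends the transactional message to the broker; the broker returns success or failure; on receiving that result the producer executes its local transaction and then sends the commit or rollback of the transaction to the broker, which records it under the Operation topic. This is what ensures the eventual consistency of the transaction.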

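After the broker has stored the prepare message, it returns the store status. The producer ends the transaction based on this status by sending a commit or rollback Operation message, and the broker then acts on the transaction state:
Commit: the broker commits the prepare message to its real topic, where it waits to be consumed.
Rollback: the broker discards the prepare message, which is never delivered.
Unknown: the broker checks the transaction state back with the producer to confirm it again, as a compensation measure.
RocketMQ does not check back forever. By default it checks back 15 times; if the transaction state is still unknown after 15 checks, RocketMQ rolls the message back by default.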
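
The bounded check-back can be sketched as a counter kept in the message properties. This is a simplified model rather than the broker's actual method: `CheckTimesSketch` is a hypothetical class and the property key is an assumption.

```java
import java.util.Map;

// Hypothetical helper that counts check-backs in a message's properties.
final class CheckTimesSketch {
    static final String CHECK_TIMES = "TRANSACTION_CHECK_TIMES"; // assumed property key

    // Returns true when the message has used up its check-backs;
    // otherwise records one more check and returns false.
    static boolean needDiscard(Map<String, String> props, int checkMax) {
        int times = Integer.parseInt(props.getOrDefault(CHECK_TIMES, "0"));
        if (times >= checkMax) {
            return true;
        }
        props.put(CHECK_TIMES, String.valueOf(times + 1));
        return false;
    }
}
```

With `checkMax` set to 15, the first 15 calls for a message return false and the next call returns true, at which point the message is given up on.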

Producer ends the transaction

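The producer receives the store result from the broker, executes the local transaction, and then calls endTransaction(), which ends the transaction based on the resulting state by sending a commit or rollback Operation message.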

This is the second half of DefaultMQProducerImpl.sendMessageInTransaction(), which is listed in full earlier in this document: after the synchronous send and the local transaction, it calls endTransaction(sendResult, localTransactionState, localException) and then builds the TransactionSendResult.

Depending on the local transaction state, the producer submits a commit or rollback Operation message by sending an END_TRANSACTION request to the broker.

/**
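 * End the transaction: send the Operation message (commit/rollback) to the broker
 * @param sendResult result of sending the transactional message
 * @param localTransactionState local transaction state
 * @param localException exception thrown by the local transaction, if any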
 */
public void endTransaction(
    final SendResult sendResult,
    final LocalTransactionState localTransactionState,
    final Throwable localException) throws RemotingException, MQBrokerException, InterruptedException, UnknownHostException {
    // The producer sends the transaction Operation message to the broker
    final MessageId id;

    if (sendResult.getOffsetMsgId() != null) {
        // Unique message id assigned on the broker side
        id = MessageDecoder.decodeMessageId(sendResult.getOffsetMsgId());
    } else {
        // Unique message id assigned on the producer side
        id = MessageDecoder.decodeMessageId(sendResult.getMsgId());
    }
    // Get the transaction id
    String transactionId = sendResult.getTransactionId();
    // Address of the broker that holds the message
    final String brokerAddr = this.mQClientFactory.findBrokerAddressInPublish(sendResult.getMessageQueue().getBrokerName());
    // Build the Operation message request
    EndTransactionRequestHeader requestHeader = new EndTransactionRequestHeader();
    requestHeader.setTransactionId(transactionId);
    requestHeader.setCommitLogOffset(id.getOffset());
    switch (localTransactionState) {
        case COMMIT_MESSAGE:
            // Commit the message
            requestHeader.setCommitOrRollback(MessageSysFlag.TRANSACTION_COMMIT_TYPE);
            break;
        case ROLLBACK_MESSAGE:
            // Roll back the message
            requestHeader.setCommitOrRollback(MessageSysFlag.TRANSACTION_ROLLBACK_TYPE);
            break;
        case UNKNOW:
            // Transaction state unknown
            requestHeader.setCommitOrRollback(MessageSysFlag.TRANSACTION_NOT_TYPE);
            break;
        default:
            break;
    }

    requestHeader.setProducerGroup(this.defaultMQProducer.getProducerGroup());
    requestHeader.setTranStateTableOffset(sendResult.getQueueOffset());
    requestHeader.setMsgId(sendResult.getMsgId());
    // Exception information from the local transaction
    String remark = localException != null ? ("executeLocalTransactionBranch exception: " + localException.toString()) : null;
    // End the transaction: send the Operation message one-way
    this.mQClientFactory.getMQClientAPIImpl().endTransactionOneway(brokerAddr, requestHeader, remark,
        this.defaultMQProducer.getSendMsgTimeout());
}

Broker handles the end-transaction request

The broker handles end-transaction requests. A producer sends this request in two situations:

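  1. The transaction ends normally and the producer sends the end-transaction request.
  2. The broker initiates a transaction check-back, and the producer responds with the check result by sending this same end-transaction request.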

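EndTransactionProcessor handles the END_TRANSACTION request. Depending on whether the producer's local transaction reports commit, rollback, or unknown, the broker stores the corresponding Operation message. For both commit and rollback, the Operation message of the transactional message is marked with the delete flag d and written to the commitlog file.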

The request carries one of three transaction states, which map to two kinds of Operation message records.

Transaction states:

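  • Commit (success)
    Fetch the prepare message, restore its real topic and other properties, and write the real message to the commitlog file. Then delete the prepare message: mark the Operation message of this transactional message with d and write it to the commitlog file under the Operation topic.
  • Rollback (failure)
    Fetch the prepare message and delete it: mark the Operation message of this transactional message with d and write it to the commitlog file under the Operation topic.
  • Unknown
    Return null directly. The producer sends the endTransaction Operation message one-way and does not care whether it receives a response from the broker, which is why null can be returned here. Where does the reliability of this exchange come from, then? From the broker, which bounds the number of check-backs (15 by default) and thereby determines whether the transaction eventually completes.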

// The broker handles the Operation message according to commit/rollback
@Override
public RemotingCommand processRequest(ChannelHandlerContext ctx, RemotingCommand request) throws
    RemotingCommandException {
    final RemotingCommand response = RemotingCommand.createResponseCommand(null);
    final EndTransactionRequestHeader requestHeader =
        (EndTransactionRequestHeader)request.decodeCommandCustomHeader(EndTransactionRequestHeader.class);
    LOGGER.debug("Transaction request:{}", requestHeader);
    // Slave nodes do not handle transactional messages
    if (BrokerRole.SLAVE == brokerController.getMessageStoreConfig().getBrokerRole()) {
        response.setCode(ResponseCode.SLAVE_NOT_AVAILABLE);
        LOGGER.warn("Message store is slave mode, so end transaction is forbidden. ");
        return response;
    }
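    // fromTransactionCheck tells whether this request answers a check-back.
    // It is false for a request sent when the producer ends the transaction normally.
    // It is true when the producer, answering a check-back from the broker, sends commit/rollback again based on its local transaction state.
    // Both paths do the same work; they differ only in how the commit state is handled.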
    if (requestHeader.getFromTransactionCheck()) {
        switch (requestHeader.getCommitOrRollback()) {
            // Unknown transaction state: return null directly
            case MessageSysFlag.TRANSACTION_NOT_TYPE: {
                LOGGER.warn("Check producer[{}] transaction state, but it's pending status."
                        + "RequestHeader: {} Remark: {}",
                    RemotingHelper.parseChannelRemoteAddr(ctx.channel()),
                    requestHeader.toString(),
                    request.getRemark());
                return null;
            }
            // Commit state
            case MessageSysFlag.TRANSACTION_COMMIT_TYPE: {
                LOGGER.warn("Check producer[{}] transaction state, the producer commit the message."
                        + "RequestHeader: {} Remark: {}",
                    RemotingHelper.parseChannelRemoteAddr(ctx.channel()),
                    requestHeader.toString(),
                    request.getRemark());

                break;
            }
            // Rollback state
            case MessageSysFlag.TRANSACTION_ROLLBACK_TYPE: {
                LOGGER.warn("Check producer[{}] transaction state, the producer rollback the message."
                        + "RequestHeader: {} Remark: {}",
                    RemotingHelper.parseChannelRemoteAddr(ctx.channel()),
                    requestHeader.toString(),
                    request.getRemark());
                break;
            }
            default:
                return null;
        }
    } else {
        switch (requestHeader.getCommitOrRollback()) {
            // Unknown transaction state
            case MessageSysFlag.TRANSACTION_NOT_TYPE: {
                LOGGER.warn("The producer[{}] end transaction in sending message,  and it's pending status."
                        + "RequestHeader: {} Remark: {}",
                    RemotingHelper.parseChannelRemoteAddr(ctx.channel()),
                    requestHeader.toString(),
                    request.getRemark());
                return null;
            }

            case MessageSysFlag.TRANSACTION_COMMIT_TYPE: {
                break;
            }
            // Rollback state
            case MessageSysFlag.TRANSACTION_ROLLBACK_TYPE: {
                LOGGER.warn("The producer[{}] end transaction in sending message, rollback the message."
                        + "RequestHeader: {} Remark: {}",
                    RemotingHelper.parseChannelRemoteAddr(ctx.channel()),
                    requestHeader.toString(),
                    request.getRemark());
                break;
            }
            default:
                return null;
        }
    }
    OperationResult result = new OperationResult();
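    // For the unknown state the method has already returned null above: the producer sends the endTransaction
    // Operation message one-way and does not wait for the broker's response. Reliability comes from the broker,
    // which bounds the number of check-backs (15 by default).

    // For both commit and rollback, the Operation message is marked with the delete flag d and written to the commitlog file
    // Commit the transaction Operation message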
    if (MessageSysFlag.TRANSACTION_COMMIT_TYPE == requestHeader.getCommitOrRollback()) {
        // Fetch the prepare message from the commitlog by offset
        result = this.brokerController.getTransactionalMessageService().commitMessage(requestHeader);
        if (result.getResponseCode() == ResponseCode.SUCCESS) {
            RemotingCommand res = checkPrepareMessage(result.getPrepareMessage(), requestHeader);
            if (res.getCode() == ResponseCode.SUCCESS) {
                // Restore the real topic and other properties from the prepare message
                MessageExtBrokerInner msgInner = endMessageTransaction(result.getPrepareMessage());
                // Reset the transaction bits of the system flag
                msgInner.setSysFlag(MessageSysFlag.resetTransactionValue(msgInner.getSysFlag(), requestHeader.getCommitOrRollback()));
                msgInner.setQueueOffset(requestHeader.getTranStateTableOffset());
                msgInner.setPreparedTransactionOffset(requestHeader.getCommitLogOffset());
                msgInner.setStoreTimestamp(result.getPrepareMessage().getStoreTimestamp());
                // The transaction is being committed, so clear the prepared marker
                MessageAccessor.clearProperty(msgInner, MessageConst.PROPERTY_TRANSACTION_PREPARED);
                // Store the restored real message in the commitlog
                RemotingCommand sendResult = sendFinalMessage(msgInner);
                // On success, delete the prepare message: mark its Operation message with d and write it to the commitlog under the Operation topic
                if (sendResult.getCode() == ResponseCode.SUCCESS) {
                    this.brokerController.getTransactionalMessageService().deletePrepareMessage(result.getPrepareMessage());
                }
                return sendResult;
            }
            return res;
        }
    // Roll back the transaction Operation message
    } else if (MessageSysFlag.TRANSACTION_ROLLBACK_TYPE == requestHeader.getCommitOrRollback()) {
        // Rollback: fetch the prepare message from the commitlog by offset
        result = this.brokerController.getTransactionalMessageService().rollbackMessage(requestHeader);
        if (result.getResponseCode() == ResponseCode.SUCCESS) {
            RemotingCommand res = checkPrepareMessage(result.getPrepareMessage(), requestHeader);
            if (res.getCode() == ResponseCode.SUCCESS) {
                // On success, delete the prepare message: mark its Operation message with d and write it to the commitlog under the Operation topic
                this.brokerController.getTransactionalMessageService().deletePrepareMessage(result.getPrepareMessage());
            }
            return res;
        }
    }
    response.setCode(result.getResponseCode());
    response.setRemark(result.getResponseRemark());
    return response;
}
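
The three outcomes handled by processRequest() can be reduced to an append-only model. `EndTxSketch` is a hypothetical class and the lists stand in for queues; the sketch is meant to show that "deleting" a prepare message never removes it from storage but appends an operation record for its offset.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical append-only model of the broker's end-transaction handling.
final class EndTxSketch {
    enum Decision { COMMIT, ROLLBACK, UNKNOWN }

    final List<String> halfQueue = new ArrayList<>();    // prepare messages, never removed
    final List<String> realMessages = new ArrayList<>(); // messages stored under their real topic
    final List<Long> opQueue = new ArrayList<>();        // half offsets that have been settled

    long putHalf(String body) {
        halfQueue.add(body);
        return halfQueue.size() - 1;
    }

    // Returns false for the unknown state, where nothing is written.
    boolean endTransaction(long halfOffset, Decision decision) {
        if (decision == Decision.UNKNOWN) {
            return false;
        }
        if (decision == Decision.COMMIT) {
            realMessages.add(halfQueue.get((int) halfOffset)); // store the restored real message
        }
        opQueue.add(halfOffset); // commit and rollback both record the offset as done
        return true;
    }
}
```

After a commit of offset 0, a rollback of offset 1, and an unknown result for offset 2, the half queue still holds three entries, the operation queue holds offsets 0 and 1, and only the first message has been stored for consumers.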

Writing the Operation message

Deleting the prepare message means marking the Operation message of the transactional message with d and writing it to the commitlog file under the Operation topic.

/**
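 * Delete the prepare message: mark the Operation message of this transactional message with d and write it to the commitlog file under the Operation topic
 * @param msgExt the prepare message
 * @return true if the Operation message was written, false otherwise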
 */
@Override
public boolean deletePrepareMessage(MessageExt msgExt) {
    // Write the delete flag d: mark the Operation message of this message as deleted
    if (this.transactionalMessageBridge.putOpMessage(msgExt, TransactionalMessageUtil.REMOVETAG)) {
        log.debug("Transaction op message write successfully. messageId={}, queueId={} msgExt:{}", msgExt.getMsgId(), msgExt.getQueueId(), msgExt);
        return true;
    } else {
        log.error("Transaction op message write failed. messageId is {}, queueId is {}", msgExt.getMsgId(), msgExt.getQueueId());
        return false;
    }
}
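/**
 * Delete the prepare message: mark the Operation message of this transactional message with d and write it to the commitlog file under the Operation topic
 * @param messageExt the prepare message
 * @param opType operation type; only the remove tag is handled
 * @return always true in this listing
 */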
public boolean putOpMessage(MessageExt messageExt, String opType) {
    // Build the MessageQueue of the prepare message
    MessageQueue messageQueue = new MessageQueue(messageExt.getTopic(),
        this.brokerController.getBrokerConfig().getBrokerName(), messageExt.getQueueId());
    // If the op type is the remove tag, record this message as deleted (d) under the Operation topic
    if (TransactionalMessageUtil.REMOVETAG.equals(opType)) {
        return addRemoveTagInTransactionOp(messageExt, messageQueue);
    }
    return true;
}
/**
 * When a transactional message is committed or rolled back, write an Operation message with flag 'd'
 * for the message's offset into the queue under the Operation topic; it is then written to the commitlog file.
 *
 * @param messageExt the prepare message
 * @param messageQueue the MessageQueue of the prepare message
 * @return This method will always return true.
 */
private boolean addRemoveTagInTransactionOp(MessageExt messageExt, MessageQueue messageQueue) {
    // Create the Operation message flagged as removed; the remove tag is set as the message tag
    Message message = new Message(TransactionalMessageUtil.buildOpTopic(), TransactionalMessageUtil.REMOVETAG,
        String.valueOf(messageExt.getQueueOffset()).getBytes(TransactionalMessageUtil.charset));
    // Write the Operation message to the commitlog
    writeOp(message, messageQueue);
    return true;
}

/**
 * @param message the Operation message
 * @param mq the MessageQueue of the prepare message
 */
private void writeOp(Message message, MessageQueue mq) {
    MessageQueue opQueue;
    if (opQueueMap.containsKey(mq)) {
        opQueue = opQueueMap.get(mq);
    } else {
        opQueue = getOpQueueByHalf(mq);
        MessageQueue oldQueue = opQueueMap.putIfAbsent(mq, opQueue);
        if (oldQueue != null) {
            opQueue = oldQueue;
        }
    }
    if (opQueue == null) {
        // Queue for the Operation message
        opQueue = new MessageQueue(TransactionalMessageUtil.buildOpTopic(), mq.getBrokerName(), mq.getQueueId());
    }
    // Store the message in the commitlog file
    putMessage(makeOpMessageInner(message, opQueue));
}

public boolean putMessage(MessageExtBrokerInner messageInner) {
    PutMessageResult putMessageResult = store.putMessage(messageInner);
    if (putMessageResult != null
        && putMessageResult.getPutMessageStatus() == PutMessageStatus.PUT_OK) {
        return true;
    } else {
        LOGGER.error("Put message failed, topic: {}, queueId: {}, msgId: {}",
            messageInner.getTopic(), messageInner.getQueueId(), messageInner.getMsgId());
        return false;
    }
}
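
The lookup in writeOp() maps each half queue to one operation queue on the same broker with the same queue id, and caches the result. The sketch below expresses that lookup with `computeIfAbsent`. `QueueRef` and `OpQueueCache` are hypothetical classes, and the operation topic name is an assumption written from memory.

```java
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical value type identifying a queue by topic, broker name, and queue id.
final class QueueRef {
    final String topic;
    final String brokerName;
    final int queueId;

    QueueRef(String topic, String brokerName, int queueId) {
        this.topic = topic;
        this.brokerName = brokerName;
        this.queueId = queueId;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof QueueRef)) {
            return false;
        }
        QueueRef other = (QueueRef) o;
        return queueId == other.queueId && topic.equals(other.topic) && brokerName.equals(other.brokerName);
    }

    @Override
    public int hashCode() {
        return Objects.hash(topic, brokerName, queueId);
    }
}

final class OpQueueCache {
    static final String OP_TOPIC = "RMQ_SYS_TRANS_OP_HALF_TOPIC"; // assumed topic name

    private final ConcurrentMap<QueueRef, QueueRef> opQueues = new ConcurrentHashMap<>();

    // One operation queue per half queue, created on first use and reused afterwards.
    QueueRef opQueueFor(QueueRef halfQueue) {
        return opQueues.computeIfAbsent(halfQueue,
            half -> new QueueRef(OP_TOPIC, half.brokerName, half.queueId));
    }
}
```

`computeIfAbsent` folds the containsKey, get, and putIfAbsent steps of the original into a single atomic call, which is one possible simplification rather than what the source does.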

Flow overview

[Figure: flow overview of sending the Operation message]


Transaction Compensation Mechanism
Flow overview

[Figure: flow overview of transaction compensation]

Broker side

BrokerController initializes the transaction services and starts the transaction check-back service on brokers in the Master role.

private void initialTransaction() {
    this.transactionalMessageService = ServiceProvider.loadClass(ServiceProvider.TRANSACTION_SERVICE_ID, TransactionalMessageService.class);
    if (null == this.transactionalMessageService) {
        this.transactionalMessageService = new TransactionalMessageServiceImpl(new TransactionalMessageBridge(this, this.getMessageStore()));
        log.warn("Load default transaction message hook service: {}", TransactionalMessageServiceImpl.class.getSimpleName());
    }
    this.transactionalMessageCheckListener = ServiceProvider.loadClass(ServiceProvider.TRANSACTION_LISTENER_ID, AbstractTransactionalMessageCheckListener.class);
    if (null == this.transactionalMessageCheckListener) {
        this.transactionalMessageCheckListener = new DefaultTransactionalMessageCheckListener();
        log.warn("Load default discard message hook service: {}", DefaultTransactionalMessageCheckListener.class.getSimpleName());
    }
    this.transactionalMessageCheckListener.setBrokerController(this);
    this.transactionalMessageCheckService = new TransactionalMessageCheckService(this);
}

public void start() throws Exception {
    // Only when DLedger-based CommitLog replication is not enabled
    if (!messageStoreConfig.isEnableDLegerCommitLog()) {
        // On the Master node, start checking the state of transactional messages
        startProcessorByHa(messageStoreConfig.getBrokerRole());
        // A Slave sends requests to the Master to synchronize topic config, consumer group consume offsets, delay offsets, subscription group info, and so on
        handleSlaveSynchronize(messageStoreConfig.getBrokerRole());
        // Once this broker has finished synchronizing, register it with all NameServers
        this.registerBrokerAll(true, false, true);
    }
}

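TransactionalMessageCheckService runs on the broker and periodically compares half messages with operation messages. For messages whose transaction state is unknown, it checks the state back with the producer. By default a check runs every 60 seconds.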

@Override
public void run() {
    log.info("Start transaction check service thread!");
    // Check-back interval, 60 seconds by default
    long checkInterval = brokerController.getBrokerConfig().getTransactionCheckInterval();
    while (!this.isStopped()) {
        this.waitForRunning(checkInterval);
    }
    log.info("End transaction check service thread!");
}

@Override
protected void onWaitEnd() {
    long timeout = brokerController.getBrokerConfig().getTransactionTimeOut();
    int checkMax = brokerController.getBrokerConfig().getTransactionCheckMax();
    long begin = System.currentTimeMillis();
    log.info("Begin to check prepare message, begin time:{}", begin);
    // Check the transaction state
    this.brokerController.getTransactionalMessageService().check(timeout, checkMax, this.brokerController.getTransactionalMessageCheckListener());
    log.info("End to check prepare message, consumed time:{}", System.currentTimeMillis() - begin);
}

State check-back

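The core of the check-back is the check() method. It processes half messages that have been neither committed nor rolled back and sends requests to the producer to check their transaction state.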

@Override
public void check(long transactionTimeout, int transactionCheckMax,
    AbstractTransactionalMessageCheckListener listener) {
    try {
        String topic = TopicValidator.RMQ_SYS_TRANS_HALF_TOPIC;
        // Get the queues of prepare messages
        Set<MessageQueue> msgQueues = transactionalMessageBridge.fetchMessageQueues(topic);
        if (msgQueues == null || msgQueues.size() == 0) {
            log.warn("The queue of topic is empty :" + topic);
            return;
        }
        log.debug("Check topic={}, queues={}", topic, msgQueues);
        // Iterate over the prepare message queues
        for (MessageQueue messageQueue : msgQueues) {
            long startTime = System.currentTimeMillis();
            // Get the corresponding Operation MessageQueue
            MessageQueue opQueue = getOpQueue(messageQueue);
            // Offset of the prepare messages
            long halfOffset = transactionalMessageBridge.fetchConsumeOffset(messageQueue);
            // Offset of the operation messages
            long opOffset = transactionalMessageBridge.fetchConsumeOffset(opQueue);
            log.info("Before check, the queue={} msgOffset={} opOffset={}", messageQueue, halfOffset, opOffset);
            if (halfOffset < 0 || opOffset < 0) {
                log.error("MessageQueue: {} illegal offset read: {}, op offset: {},skip this queue", messageQueue,
                    halfOffset, opOffset);
                continue;
            }

            List<Long> doneOpOffset = new ArrayList<>();
            HashMap<Long, Long> removeMap = new HashMap<>();
            // Among the records to be examined, put those already committed or rolled back into removeMap
            PullResult pullResult = fillOpRemoveMap(removeMap, opQueue, opOffset, halfOffset, doneOpOffset);
            if (null == pullResult) {
                log.error("The queue={} check msgOffset={} with opOffset={} failed, pullResult is null",
                    messageQueue, halfOffset, opOffset);
                continue;
            }
            // single thread
            int getMessageNullCount = 1;
            long newOffset = halfOffset;
            long i = halfOffset;
            while (true) {
                // Processing time limit: 60 seconds
                if (System.currentTimeMillis() - startTime > MAX_PROCESS_TIME_LIMIT) {
                    log.info("Queue={} process time reach max={}", messageQueue, MAX_PROCESS_TIME_LIMIT);
                    break;
                }
                // Already committed or rolled back: move on to the next one
                if (removeMap.containsKey(i)) {
                    log.info("Half offset {} has been committed/rolled back", i);
                    Long removedOpOffset = removeMap.remove(i);
                    doneOpOffset.add(removedOpOffset);
                } else {
                    // Get the current transactional message
                    GetResult getResult = getHalfMsg(messageQueue, i);
                    MessageExt msgExt = getResult.getMsg();
                    if (msgExt == null) {
                        if (getMessageNullCount++ > MAX_RETRY_COUNT_WHEN_HALF_NULL) {
                            break;
                        }
                        if (getResult.getPullResult().getPullStatus() == PullStatus.NO_NEW_MSG) {
                            log.debug("No new msg, the miss offset={} in={}, continue check={}, pull result={}", i,
                                messageQueue, getMessageNullCount, getResult.getPullResult());
                            break;
                        } else {
                            log.info("Illegal offset, the miss offset={} in={}, continue check={}, pull result={}",
                                i, messageQueue, getMessageNullCount, getResult.getPullResult());
                            i = getResult.getPullResult().getNextBeginOffset();
                            newOffset = i;
                            continue;
                        }
                    }
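                    // needDiscard(): whether the maximum number of checks has been exceeded; each check increments the
                    // TRANSACTION_CHECK_TIMES property by 1, and the default maximum is 15
                    // needSkip(): whether the message is older than the file retention time
                    // (72 hours by default, configurable in the broker configuration file)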
                    if (needDiscard(msgExt, transactionCheckMax) || needSkip(msgExt)) {
                        listener.resolveDiscardMsg(msgExt);
                        newOffset = i + 1;
                        i++;
                        continue;
                    }
                    if (msgExt.getStoreTimestamp() >= startTime) {
                        log.debug("Fresh stored. the miss offset={}, check it later, store={}", i,
                            new Date(msgExt.getStoreTimestamp()));
                        break;
                    }
                    // Time elapsed since the message's born timestamp
                    long valueOfCurrentMinusBorn = System.currentTimeMillis() - msgExt.getBornTimestamp();
                    // Immunity time before a check may start: a message is only checked once some time has passed
                    // after it was sent; the default is 6 seconds
                    long checkImmunityTime = transactionTimeout;
                    // Read the user-defined immunity time, if one was set on the message
                    String checkImmunityTimeStr = msgExt.getUserProperty(MessageConst.PROPERTY_CHECK_IMMUNITY_TIME_IN_SECONDS);
                    if (null != checkImmunityTimeStr) {
                        checkImmunityTime = getImmunityTime(checkImmunityTimeStr, transactionTimeout);
                        // The message is still younger than the immunity time
                        if (valueOfCurrentMinusBorn < checkImmunityTime) {
                            if (checkPrepareQueueOffset(removeMap, doneOpOffset, msgExt)) {
                                newOffset = i + 1;
                                i++;
                                continue;
                            }
                        }
                    } else {
                        if ((0 <= valueOfCurrentMinusBorn) && (valueOfCurrentMinusBorn < checkImmunityTime)) {
                            log.debug("New arrived, the miss offset={}, check it later checkImmunity={}, born={}", i,
                                checkImmunityTime, new Date(msgExt.getBornTimestamp()));
                            break;
                        }
                    }
                    List<MessageExt> opMsg = pullResult.getMsgFoundList();
                    // Decide whether the conditions for a check are met
                    boolean isNeedCheck = (opMsg == null && valueOfCurrentMinusBorn > checkImmunityTime)
                        || (opMsg != null && (opMsg.get(opMsg.size() - 1).getBornTimestamp() - startTime > transactionTimeout))
                        || (valueOfCurrentMinusBorn <= -1);

                    if (isNeedCheck) {
                        // Conditions met: write the half message back to the half queue as a new message
                        if (!putBackHalfMsgQueue(msgExt, i)) {
                            continue;
                        }
                        // Send the check request asynchronously; msgExt now carries the latest offset
                        listener.resolveHalfMsg(msgExt);
                    } else {
                        pullResult = fillOpRemoveMap(removeMap, opQueue, pullResult.getNextBeginOffset(), halfOffset, doneOpOffset);
                        log.debug("The miss offset:{} in messageQueue:{} need to get more opMsg, result is:{}", i,
                            messageQueue, pullResult);
                        continue;
                    }
                }
                // Advance to the next half message
                newOffset = i + 1;
                i++;
            }
            // Update the consume offset of the half queue
            if (newOffset != halfOffset) {
                transactionalMessageBridge.updateConsumeOffset(messageQueue, newOffset);
            }
            // Compute the new consume offset of the op queue and update it
            long newOpOffset = calculateOpOffset(doneOpOffset, opOffset);
            if (newOpOffset != opOffset) {
                transactionalMessageBridge.updateConsumeOffset(opQueue, newOpOffset);
            }
        }
    } catch (Throwable e) {
        log.error("Check error", e);
    }

}
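
The three-way isNeedCheck condition in check() is easier to reason about in isolation. The following is a minimal sketch of that boolean expression only, not RocketMQ code: the class name and parameter names are hypothetical, and each argument stands in for a value that check() computes from the message and the pulled op messages.

```java
final class CheckDecisionSketch {

    /**
     * Mirrors the shape of the isNeedCheck expression in check().
     * All names are illustrative stand-ins, not RocketMQ API.
     *
     * @param noOpMessages           true when the pulled op message list is null
     * @param elapsedSinceBornMs     now minus the half message's born timestamp
     * @param checkImmunityTimeMs    immunity time that applies to this message
     * @param lastOpBornMinusStartMs born timestamp of the last pulled op message minus the start time of this pass
     * @param transactionTimeoutMs   broker-side transaction timeout
     */
    static boolean isNeedCheck(boolean noOpMessages,
                               long elapsedSinceBornMs,
                               long checkImmunityTimeMs,
                               long lastOpBornMinusStartMs,
                               long transactionTimeoutMs) {
        // 1) No op messages were pulled and the half message has outlived its immunity time.
        boolean expiredWithoutOp = noOpMessages && elapsedSinceBornMs > checkImmunityTimeMs;
        // 2) Op messages exist, and the last one was born more than the timeout after this pass started.
        boolean opBatchBeyondTimeout = !noOpMessages && lastOpBornMinusStartMs > transactionTimeoutMs;
        // 3) The born timestamp lies in the future, for example because of clock skew.
        boolean bornInFuture = elapsedSinceBornMs <= -1;
        return expiredWithoutOp || opBatchBeyondTimeout || bornInFuture;
    }

    public static void main(String[] args) {
        // A 7-second-old half message, a 6-second immunity time, and no op messages pulled.
        System.out.println(isNeedCheck(true, 7_000, 6_000, 0, 6_000));
    }
}
```

When none of the three branches holds, check() does not send a check request; it pulls the next batch of op messages and re-evaluates the same half offset.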

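The fillOpRemoveMap() method deserves a closer look. Among the half messages about to be examined for a check, it finds those that have already been committed or rolled back and stores them in removeMap.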

/**
 * Among the half messages about to be examined for a check, find those already committed
 * or rolled back and store them in removeMap.
 * Read op message, parse op message, and fill removeMap
 *
 * @param removeMap Half message to be remove, key:halfOffset, value: opOffset.
 * @param opQueue Op message queue.
 * @param pullOffsetOfOp The begin offset of op message queue.
 * @param miniOffset The current minimum offset of half message queue.
 * @param doneOpOffset Stored op messages that have been processed.
 * @return Op message result.
 */
private PullResult fillOpRemoveMap(HashMap<Long, Long> removeMap,
    MessageQueue opQueue, long pullOffsetOfOp, long miniOffset, List<Long> doneOpOffset) {
    // Pull up to 32 op messages
    PullResult pullResult = pullOpMsg(opQueue, pullOffsetOfOp, 32);
    if (null == pullResult) {
        return null;
    }
    if (pullResult.getPullStatus() == PullStatus.OFFSET_ILLEGAL
        || pullResult.getPullStatus() == PullStatus.NO_MATCHED_MSG) {
        log.warn("The miss op offset={} in queue={} is illegal, pullResult={}", pullOffsetOfOp, opQueue,
            pullResult);
        transactionalMessageBridge.updateConsumeOffset(opQueue, pullResult.getNextBeginOffset());
        return pullResult;
    } else if (pullResult.getPullStatus() == PullStatus.NO_NEW_MSG) {
        log.warn("The miss op offset={} in queue={} is NO_NEW_MSG, pullResult={}", pullOffsetOfOp, opQueue,
            pullResult);
        return pullResult;
    }
    List<MessageExt> opMsg = pullResult.getMsgFoundList();
    if (opMsg == null) {
        log.warn("The miss op offset={} in queue={} is empty, pullResult={}", pullOffsetOfOp, opQueue, pullResult);
        return pullResult;
    }
    for (MessageExt opMessageExt : opMsg) {
        // The body of an op message is the half-queue offset of a transaction message that was committed or rolled back
        Long queueOffset = getLong(new String(opMessageExt.getBody(), TransactionalMessageUtil.charset));
        log.debug("Topic: {} tags: {}, OpOffset: {}, HalfOffset: {}", opMessageExt.getTopic(),
            opMessageExt.getTags(), opMessageExt.getQueueOffset(), queueOffset);
        if (TransactionalMessageUtil.REMOVETAG.equals(opMessageExt.getTags())) {
            if (queueOffset < miniOffset) {
                doneOpOffset.add(opMessageExt.getQueueOffset());
            } else {
                removeMap.put(queueOffset, opMessageExt.getQueueOffset());
            }
        } else {
            log.error("Found a illegal tag in opMessageExt= {} ", opMessageExt);
        }
    }
    log.debug("Remove map: {}", removeMap);
    log.debug("Done op list: {}", doneOpOffset);
    return pullResult;
}
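
The core of fillOpRemoveMap() is the partition inside the for loop. The following is a minimal, self-contained sketch of that step, assuming each op message can be reduced to a pair of offsets. OpRecord and partition are hypothetical names introduced for illustration; the pull, the tag check, and the logging are left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class OpPartitionSketch {

    /** Hypothetical stand-in for an op message: its op-queue offset and the half-queue offset stored in its body. */
    static final class OpRecord {
        final long opOffset;
        final long halfOffset;

        OpRecord(long opOffset, long halfOffset) {
            this.opOffset = opOffset;
            this.halfOffset = halfOffset;
        }
    }

    /**
     * Splits pulled op records the way the loop in fillOpRemoveMap() does:
     * records pointing below miniOffset are already behind the half-queue progress,
     * the rest mark half messages that the check loop should skip.
     */
    static void partition(List<OpRecord> pulled, long miniOffset,
                          Map<Long, Long> removeMap, List<Long> doneOpOffset) {
        for (OpRecord op : pulled) {
            if (op.halfOffset < miniOffset) {
                doneOpOffset.add(op.opOffset);
            } else {
                removeMap.put(op.halfOffset, op.opOffset);
            }
        }
    }

    public static void main(String[] args) {
        Map<Long, Long> removeMap = new HashMap<>();
        List<Long> doneOpOffset = new ArrayList<>();
        // Made-up offsets: half-queue progress is at 100.
        partition(Arrays.asList(new OpRecord(10, 98), new OpRecord(11, 101), new OpRecord(12, 100)),
            100, removeMap, doneOpOffset);
        System.out.println("removeMap=" + removeMap + " doneOpOffset=" + doneOpOffset);
    }
}
```

The offsets in main are invented for the example and are unrelated to the figure in the next section.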
Relationship between the Half message queue and the Operation message queue

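The diagram below illustrates how the Half message queue and the Operation message queue relate to each other, and how the messages that need a check are found.

Walking through the diagram: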

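  1. removeMap is a Map whose key is the offset of a message in the half queue and whose value is the offset of the matching message in the op queue. The diagram shows two pairs: (100005, 80002) and (100004, 80003).

  2. doneOpOffset is a List that stores op queue offsets. In the diagram it contains only 8004.

  3. When check() walks the half queue, 100004 is already in removeMap, so the rest of the logic is skipped and the loop moves on to 100005, where isNeedCheck decides whether the message qualifies for a check.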
    image.png
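
check() hands doneOpOffset to calculateOpOffset() to obtain the new op queue consume offset, but the body of that method is not shown in this article. The following is a minimal sketch of one plausible behaviour, under the assumption that the op offset may only advance across a contiguous run of processed offsets starting at the current one, so that no unprocessed op message is skipped. The class name is hypothetical and the offsets in main are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class OpOffsetSketch {

    /**
     * Assumed behaviour: sort the processed op offsets, then advance from oldOffset
     * one step at a time while the next processed offset equals the current position.
     * Stops at the first gap.
     */
    static long calculateOpOffset(List<Long> doneOpOffset, long oldOffset) {
        List<Long> sorted = new ArrayList<>(doneOpOffset); // leave the caller's list untouched
        Collections.sort(sorted);
        long newOffset = oldOffset;
        for (long done : sorted) {
            if (done == newOffset) {
                newOffset++;
            } else {
                break;
            }
        }
        return newOffset;
    }

    public static void main(String[] args) {
        // Offsets 5 and 6 are processed, 7 is not, 8 is: progress can only move past 5 and 6.
        System.out.println(calculateOpOffset(Arrays.asList(5L, 6L, 8L), 5));
    }
}
```

Under this assumption, an op message that was processed out of order stays in doneOpOffset until the gap before it is closed in a later pass.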

Producer local transaction status check

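For messages whose transaction status is unknown, for example because the commit or rollback was lost to network delay, RocketMQ provides a compensation mechanism: the broker checks back with the producer for the status of the local transaction.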

ClientRemotingProcessor.processRequest handles the CHECK_TRANSACTION_STATE request that the broker sends.

The request is wrapped in a Runnable and submitted to checkExecutor, the thread pool used for transaction checks.

switch (request.getCode()) {
	// Handle the transaction check request
	case RequestCode.CHECK_TRANSACTION_STATE:
	    return this.checkTransactionState(ctx, request);
}
/**
 * The producer receives the broker's transaction check request and checks the transaction status.
 * @param addr
 * @param msg
 * @param header
 */
@Override
public void checkTransactionState(final String addr, final MessageExt msg,
    final CheckTransactionStateRequestHeader header) {
    // Wrap the local transaction check in a Runnable, which is submitted to the checkExecutor thread pool at the end.
    Runnable request = new Runnable() {
        private final String brokerAddr = addr;
        private final MessageExt message = msg;
        private final CheckTransactionStateRequestHeader checkRequestHeader = header;
        private final String group = DefaultMQProducerImpl.this.defaultMQProducer.getProducerGroup();

        @Override
        public void run() {
            TransactionCheckListener transactionCheckListener = DefaultMQProducerImpl.this.checkListener();
            TransactionListener transactionListener = getCheckListener();
            if (transactionCheckListener != null || transactionListener != null) {
                LocalTransactionState localTransactionState = LocalTransactionState.UNKNOW;
                Throwable exception = null;
                try {
                    // transactionCheckListener is deprecated and will be removed
                    if (transactionCheckListener != null) {
                        localTransactionState = transactionCheckListener.checkLocalTransactionState(message);
                    } else if (transactionListener != null) {
                        log.debug("Used new check API in transaction message");
                        // Check the local transaction
                        localTransactionState = transactionListener.checkLocalTransaction(message);
                    } else {
                        log.warn("CheckTransactionState, pick transactionListener by group[{}] failed", group);
                    }
                } catch (Throwable e) {
                    log.error("Broker call checkTransactionState, but checkLocalTransactionState exception", e);
                    exception = e;
                }
                // Based on the local transaction state, send the commit/rollback request to the broker
                this.processTransactionState(
                    localTransactionState,
                    group,
                    exception);
            } else {
                log.warn("CheckTransactionState, pick transactionCheckListener by group[{}] failed", group);
            }
        }

        /**
         * Based on the local transaction state, send the commit/rollback request to the broker.
         * @param localTransactionState
         * @param producerGroup
         * @param exception
         */
        private void processTransactionState(
            final LocalTransactionState localTransactionState,
            final String producerGroup,
            final Throwable exception) {
            // Header of the end-transaction request
            final EndTransactionRequestHeader thisHeader = new EndTransactionRequestHeader();
            thisHeader.setCommitLogOffset(checkRequestHeader.getCommitLogOffset());
            thisHeader.setProducerGroup(producerGroup);
            thisHeader.setTranStateTableOffset(checkRequestHeader.getTranStateTableOffset());
            thisHeader.setFromTransactionCheck(true);

            String uniqueKey = message.getProperties().get(MessageConst.PROPERTY_UNIQ_CLIENT_MESSAGE_ID_KEYIDX);
            if (uniqueKey == null) {
                uniqueKey = message.getMsgId();
            }
            thisHeader.setMsgId(uniqueKey);
            thisHeader.setTransactionId(checkRequestHeader.getTransactionId());
            switch (localTransactionState) {
                // Map the local transaction state to the transaction flag
                case COMMIT_MESSAGE:
                    thisHeader.setCommitOrRollback(MessageSysFlag.TRANSACTION_COMMIT_TYPE);
                    break;
                case ROLLBACK_MESSAGE:
                    thisHeader.setCommitOrRollback(MessageSysFlag.TRANSACTION_ROLLBACK_TYPE);
                    log.warn("when broker check, client rollback this transaction, {}", thisHeader);
                    break;
                case UNKNOW:
                    thisHeader.setCommitOrRollback(MessageSysFlag.TRANSACTION_NOT_TYPE);
                    log.warn("when broker check, client does not know this transaction state, {}", thisHeader);
                    break;
                default:
                    break;
            }

            String remark = null;
            if (exception != null) {
                remark = "checkLocalTransactionState Exception: " + RemotingHelper.exceptionSimpleDesc(exception);
            }

            try {
                // The check is finished: send the end-transaction request one-way
                DefaultMQProducerImpl.this.mQClientFactory.getMQClientAPIImpl().endTransactionOneway(brokerAddr, thisHeader, remark,
                    3000);
            } catch (Exception e) {
                log.error("endTransactionOneway exception", e);
            }
        }
    };
    // Submit the transaction status check request to the thread pool
    this.checkExecutor.submit(request);
}
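
On the application side, the check request ends up in the user's checkLocalTransaction callback. The following is a minimal sketch of how such a callback might answer from a locally recorded status, written without the RocketMQ client library so that it runs on its own. TxState mirrors the three values of LocalTransactionState seen above; LocalTxCheckSketch, statusByTxId, and the use of a plain transaction id string in place of the message objects are assumptions made for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class LocalTxCheckSketch {

    /** Stand-in mirroring the three values of RocketMQ's LocalTransactionState. */
    enum TxState { COMMIT_MESSAGE, ROLLBACK_MESSAGE, UNKNOW }

    /** Hypothetical local status table keyed by transaction id; a real system would usually persist this. */
    private final Map<String, TxState> statusByTxId = new ConcurrentHashMap<>();

    /** Phase-one callback: run the local transaction and record its outcome. */
    TxState executeLocalTransaction(String txId, Runnable localWork) {
        try {
            localWork.run();
            statusByTxId.put(txId, TxState.COMMIT_MESSAGE);
            return TxState.COMMIT_MESSAGE;
        } catch (RuntimeException e) {
            statusByTxId.put(txId, TxState.ROLLBACK_MESSAGE);
            return TxState.ROLLBACK_MESSAGE;
        }
    }

    /** Check callback: answer from the recorded status, or UNKNOW when nothing has been recorded yet. */
    TxState checkLocalTransaction(String txId) {
        return statusByTxId.getOrDefault(txId, TxState.UNKNOW);
    }

    public static void main(String[] args) {
        LocalTxCheckSketch listener = new LocalTxCheckSketch();
        listener.executeLocalTransaction("tx-1", () -> { /* local database work would go here */ });
        System.out.println(listener.checkLocalTransaction("tx-1"));
        System.out.println(listener.checkLocalTransaction("tx-never-seen"));
    }
}
```

Returning UNKNOW for an unrecorded transaction leaves the half message in place, so the broker can ask again on a later pass until the maximum number of checks is reached.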
References

https://www.infoq.cn/article/2018/08/rocketmq-4.3-release
