Why rocketmq_client_consumer_pull_rt is high when the producer's send TPS is low


Preface

In a recent local test I found that when the producer sends only one message per minute, the rocketmq_client_consumer_pull_rt metric is still very high even though the consumer is consuming normally. This article analyzes why.


I. Scenario Description

Start one producer and one consumer locally. The producer sends one message per minute, and the consumer just consumes normally. In Grafana the rocketmq_client_consumer_pull_rt metric turns out to be very high. This metric is the response time of the consumer's message pulls.
[Figure: Grafana panel showing high rocketmq_client_consumer_pull_rt values]
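For reference, the test setup can be reproduced with a minimal sketch like the one below. The topic, the group names and the namesrv address are placeholders chosen for illustration, not the exact values used in my test.

// Minimal sketch of the test setup. "TestTopic", the group names and the namesrv address are placeholders.
import org.apache.rocketmq.client.consumer.DefaultMQPushConsumer;
import org.apache.rocketmq.client.consumer.listener.ConsumeConcurrentlyStatus;
import org.apache.rocketmq.client.consumer.listener.MessageListenerConcurrently;
import org.apache.rocketmq.client.producer.DefaultMQProducer;
import org.apache.rocketmq.common.message.Message;

public class PullRtDemo {
    public static void main(String[] args) throws Exception {
        // consumer: just consume whatever arrives
        DefaultMQPushConsumer consumer = new DefaultMQPushConsumer("demo_consumer_group");
        consumer.setNamesrvAddr("127.0.0.1:9876");
        consumer.subscribe("TestTopic", "*");
        consumer.registerMessageListener((MessageListenerConcurrently) (msgs, ctx) -> {
            System.out.println("consumed " + msgs.size() + " message(s)");
            return ConsumeConcurrentlyStatus.CONSUME_SUCCESS;
        });
        consumer.start();

        // producer: one message per minute, i.e. a very low TPS
        DefaultMQProducer producer = new DefaultMQProducer("demo_producer_group");
        producer.setNamesrvAddr("127.0.0.1:9876");
        producer.start();
        while (true) {
            producer.send(new Message("TestTopic", "hello".getBytes()));
            Thread.sleep(60_000L);
        }
    }
}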

II. Cause Analysis

The scenario above boils down to this: the message the consumer wants to pull has not yet been written to the broker, and in that case the rocketmq_client_consumer_pull_rt metric is high. The reason is that the broker does not return a response immediately in this situation. Instead, it uses a polling mechanism to hold the request for a while. If the message the consumer asked for arrives within that window, it is returned to the consumer; if it does not arrive and the request has been held longer than the suspend timeout (BROKER_SUSPEND_MAX_TIME_MILLIS), the broker returns an empty response to the client. Either way, the pull response time includes the hold time, which is why the metric looks high. The rest of this article walks through RocketMQ's long polling mechanism.
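Before going through the source, here is a purely conceptual model of the behavior we observe. This is not how the broker is implemented (it holds requests asynchronously and wakes them up on message arrival, as shown later); handlePull, maxOffsetInQueue and the two build*Response helpers are hypothetical names used only to illustrate why the measured pull RT approaches the suspend timeout when no message arrives.

// Conceptual model only -- not RocketMQ source. All helper methods are hypothetical.
RemotingCommand handlePull(String topic, int queueId, long requestedOffset, long suspendTimeoutMillis)
        throws InterruptedException {
    long start = System.currentTimeMillis();
    while (System.currentTimeMillis() - start < suspendTimeoutMillis) {
        if (maxOffsetInQueue(topic, queueId) > requestedOffset) {
            // a new message arrived during the hold: respond immediately
            return buildFoundResponse(topic, queueId, requestedOffset);
        }
        Thread.sleep(1000); // the real broker re-checks held requests periodically and on message arrival
    }
    // nothing arrived within the hold window: respond PULL_NOT_FOUND.
    // Either way, the consumer's measured pull RT includes the time spent waiting here,
    // so with one message per minute most pulls take close to suspendTimeoutMillis (15 s).
    return buildPullNotFoundResponse();
}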

1. The broker's PullMessageProcessor handles the consumer's pull request

(1) Reading the message from the commitlog via the consumequeue
The key code below shows how the broker reads messages from the commitlog based on the consumequeue. Because the message the consumer is asking for has not been sent by the producer yet, getMessage returns GetMessageStatus.OFFSET_OVERFLOW_ONE, and PullMessageProcessor then converts that status into the response code ResponseCode.PULL_NOT_FOUND.

final GetMessageResult getMessageResult =
            this.brokerController.getMessageStore().getMessage(requestHeader.getConsumerGroup(), requestHeader.getTopic(),
                requestHeader.getQueueId(), requestHeader.getQueueOffset(), requestHeader.getMaxMsgNums(), messageFilter);

public GetMessageResult getMessage(final String group, final String topic, final int queueId, final long offset,
        final int maxMsgNums,
        final MessageFilter messageFilter) {
        if (this.shutdown) {
            log.warn("message store has shutdown, so getMessage is forbidden");
            return null;
        }

        if (!this.runningFlags.isReadable()) {
            log.warn("message store is not readable, so getMessage is forbidden " + this.runningFlags.getFlagBits());
            return null;
        }

        long beginTime = this.getSystemClock().now();

        GetMessageStatus status = GetMessageStatus.NO_MESSAGE_IN_QUEUE;
        long nextBeginOffset = offset;
        long minOffset = 0;
        long maxOffset = 0;

        GetMessageResult getResult = new GetMessageResult();

        final long maxOffsetPy = this.commitLog.getMaxOffset();

        ConsumeQueue consumeQueue = findConsumeQueue(topic, queueId);
        if (consumeQueue != null) {
            minOffset = consumeQueue.getMinOffsetInQueue();
            maxOffset = consumeQueue.getMaxOffsetInQueue();

            if (maxOffset == 0) {
                status = GetMessageStatus.NO_MESSAGE_IN_QUEUE;
                nextBeginOffset = nextOffsetCorrection(offset, 0);
            } else if (offset < minOffset) {
                status = GetMessageStatus.OFFSET_TOO_SMALL;
                nextBeginOffset = nextOffsetCorrection(offset, minOffset);
            } else if (offset == maxOffset) {
                // the requested offset equals the queue's current max offset: the next message has not arrived yet
                status = GetMessageStatus.OFFSET_OVERFLOW_ONE;
                nextBeginOffset = nextOffsetCorrection(offset, offset);
            } else if (offset > maxOffset) {
                status = GetMessageStatus.OFFSET_OVERFLOW_BADLY;
                if (0 == minOffset) {
                    nextBeginOffset = nextOffsetCorrection(offset, minOffset);
                } else {
                    nextBeginOffset = nextOffsetCorrection(offset, maxOffset);
                }
            } else {
                SelectMappedBufferResult bufferConsumeQueue = consumeQueue.getIndexBuffer(offset);
                if (bufferConsumeQueue != null) {
                    try {
                        status = GetMessageStatus.NO_MATCHED_MESSAGE;

                        long nextPhyFileStartOffset = Long.MIN_VALUE;
                        long maxPhyOffsetPulling = 0;
                        ...
}

// back in PullMessageProcessor.processRequest(): map GetMessageStatus to a response code
switch (getMessageResult.getStatus()) {
                case FOUND:
                    response.setCode(ResponseCode.SUCCESS);
                    break;
                case MESSAGE_WAS_REMOVING:
                    response.setCode(ResponseCode.PULL_RETRY_IMMEDIATELY);
                    break;
                case NO_MATCHED_LOGIC_QUEUE:
                case NO_MESSAGE_IN_QUEUE:
                    if (0 != requestHeader.getQueueOffset()) {
                        response.setCode(ResponseCode.PULL_OFFSET_MOVED);

                        // XXX: warn and notify me
                        log.info("the broker store no queue data, fix the request offset {} to {}, Topic: {} QueueId: {} Consumer Group: {}",
                            requestHeader.getQueueOffset(),
                            getMessageResult.getNextBeginOffset(),
                            requestHeader.getTopic(),
                            requestHeader.getQueueId(),
                            requestHeader.getConsumerGroup()
                        );
                    } else {
                        response.setCode(ResponseCode.PULL_NOT_FOUND);
                    }
                    break;
                case NO_MATCHED_MESSAGE:
                    response.setCode(ResponseCode.PULL_RETRY_IMMEDIATELY);
                    break;
                case OFFSET_FOUND_NULL:
                    response.setCode(ResponseCode.PULL_NOT_FOUND);
                    break;
                case OFFSET_OVERFLOW_BADLY:
                    response.setCode(ResponseCode.PULL_OFFSET_MOVED);
                    // XXX: warn and notify me
                    log.info("the request offset: {} over flow badly, broker max offset: {}, consumer: {}",
                        requestHeader.getQueueOffset(), getMessageResult.getMaxOffset(), channel.remoteAddress());
                    break;
                case OFFSET_OVERFLOW_ONE:
                    // this is our case: OFFSET_OVERFLOW_ONE becomes PULL_NOT_FOUND
                    response.setCode(ResponseCode.PULL_NOT_FOUND);
                    break;
                case OFFSET_TOO_SMALL:
                    response.setCode(ResponseCode.PULL_OFFSET_MOVED);
                    log.info("the request offset too small. group={}, topic={}, requestOffset={}, brokerMinOffset={}, clientIp={}",
                        requestHeader.getConsumerGroup(), requestHeader.getTopic(), requestHeader.getQueueOffset(),
                        getMessageResult.getMinOffset(), channel.remoteAddress());
                    break;
                default:
                    assert false;
                    break;
            }

PullMessageProcessor then handles each response code differently. Before looking at the PULL_NOT_FOUND branch, three parameters need to be introduced:

  • brokerAllowSuspend

When PullMessageProcessor processes a pull request for the first time, brokerAllowSuspend is set to true (see the processRequest overload shown after the PULL_NOT_FOUND branch below);

case ResponseCode.PULL_NOT_FOUND:

                    // hold the request only when both the broker (first pass) and the consumer allow suspension
                    if (brokerAllowSuspend && hasSuspendFlag) {
                        long pollingTimeMills = suspendTimeoutMillisLong;
                        if (!this.brokerController.getBrokerConfig().isLongPollingEnable()) {
                            pollingTimeMills = this.brokerController.getBrokerConfig().getShortPollingTimeMills();
                        }

                        String topic = requestHeader.getTopic();
                        long offset = requestHeader.getQueueOffset();
                        int queueId = requestHeader.getQueueId();
                        PullRequest pullRequest = new PullRequest(request, channel, pollingTimeMills,
                            this.brokerController.getMessageStore().now(), offset, subscriptionData, messageFilter);
                        this.brokerController.getPullRequestHoldService().suspendPullRequest(topic, queueId, pullRequest);
                        // a null response means nothing is written back to the consumer for now
                        response = null;
                        break;
                    }
                    
// entry point invoked by the netty handler: the third argument is brokerAllowSuspend, which is true on the first pass
public RemotingCommand processRequest(final ChannelHandlerContext ctx,
        RemotingCommand request) throws RemotingCommandException {
        return this.processRequest(ctx.channel(), request, true);
    }
    
  • hasSuspendFlag
    hasSuspendFlag is set to true by the consumer in its pull request (a rough sketch of the consumer-side code follows)

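The fragment below is roughly what the original screenshot showed: in the 4.x client, DefaultMQPushConsumerImpl.pullMessage() builds the pull sysFlag with the suspend bit set to true. It is quoted from memory and abbreviated, so treat it as a sketch rather than an exact excerpt.

// in org.apache.rocketmq.client.impl.consumer.DefaultMQPushConsumerImpl#pullMessage (sketch, abbreviated)
int sysFlag = PullSysFlag.buildSysFlag(
    commitOffsetEnable,    // commitOffset
    true,                  // suspend -> becomes hasSuspendFlag on the broker side
    subExpression != null, // subscription
    classFilter            // class filter
);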

  • suspendTimeoutMillisLong
    suspendTimeoutMillisLong is also set by the consumer in its pull request; its value is 15 seconds (see the sketch below)
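Again roughly what the screenshots showed: the consumer passes BROKER_SUSPEND_MAX_TIME_MILLIS (15 s) as the broker suspend timeout when calling pullKernelImpl. The fragment is abbreviated and quoted from memory, so take it as a sketch of the 4.x client code; CONSUMER_TIMEOUT_MILLIS_WHEN_SUSPEND is the matching client-side RPC timeout.

// constants in DefaultMQPushConsumerImpl
private static final long BROKER_SUSPEND_MAX_TIME_MILLIS = 1000 * 15;       // how long the broker may hold the pull
private static final long CONSUMER_TIMEOUT_MILLIS_WHEN_SUSPEND = 1000 * 30; // client-side RPC timeout for a suspended pull

// in DefaultMQPushConsumerImpl#pullMessage (sketch, abbreviated): the constant is passed on as
// brokerSuspendMaxTimeMillis, which the broker sees as suspendTimeoutMillisLong
this.pullAPIWrapper.pullKernelImpl(
    pullRequest.getMessageQueue(),
    subExpression,
    ...
    sysFlag,
    commitOffsetValue,
    BROKER_SUSPEND_MAX_TIME_MILLIS,
    CONSUMER_TIMEOUT_MILLIS_WHEN_SUSPEND,
    CommunicationMode.ASYNC,
    pullCallback
);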
Since both brokerAllowSuspend and hasSuspendFlag are true, the PULL_NOT_FOUND branch shown earlier does the following:

  • First, pollingTimeMills is set to suspendTimeoutMillisLong, i.e. 1000 * 15 ms

  • It then checks whether long polling is enabled on the broker; if not, pollingTimeMills is set to shortPollingTimeMills (1000 ms). Long polling is enabled by default and is controlled by the broker config item longPollingEnable

  • The topic name, queueId and queue offset are read from the request header and a PullRequest object is built from them

  • PullRequestHoldService.suspendPullRequest is called to hold the request; concretely, the request is stored in a map named pullRequestTable whose key is topic@queueId and whose value is the list of held requests for that queue

public void suspendPullRequest(final String topic, final int queueId, final PullRequest pullRequest) {
        String key = this.buildKey(topic, queueId);
        ManyPullRequest mpr = this.pullRequestTable.get(key);
        if (null == mpr) {
            mpr = new ManyPullRequest();
            ManyPullRequest prev = this.pullRequestTable.putIfAbsent(key, mpr);
            if (prev != null) {
                mpr = prev;
            }
        }

        mpr.addPullRequest(pullRequest);
    }
  • Finally, the response to be returned to the consumer is set to null. Because the response is null, nothing is written back over the channel, so no read event fires on the consumer side and no response handling is triggered; a rough sketch of the write-back logic is shown below
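The original screenshot showed the write-back logic in the remoting layer. Roughly, in the 4.x broker, NettyRemotingAbstract.processRequestCommand only writes a response when the processor returned a non-null one. The fragment below is quoted from memory and abbreviated (newer versions route this through a callback, but the null check is the same idea).

// in org.apache.rocketmq.remoting.netty.NettyRemotingAbstract#processRequestCommand (sketch, abbreviated)
final RemotingCommand response = pair.getObject1().processRequest(ctx, cmd); // pair holds the registered PullMessageProcessor
if (!cmd.isOnewayRPC()) {
    if (response != null) {
        response.setOpaque(cmd.getOpaque());
        response.markResponseType();
        ctx.writeAndFlush(response);
    }
    // response == null (our held PULL_NOT_FOUND case): nothing is written,
    // so the consumer's pull stays pending until the request is woken up
}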

2. PullRequestHoldService

Next, let's look at how PullRequestHoldService works.

  • If long polling is enabled on the broker, the service scans pullRequestTable every 5 seconds; otherwise it scans every shortPollingTimeMills (1 second by default);
  • Each scan calls checkHoldRequest to walk pullRequestTable. For every held request, PullRequestHoldService first obtains the maximum queue offset of that topic/queueId and compares it with the offset the request wants to pull from. If the requested offset is still the larger one, the maximum offset is fetched again (inside notifyMessageArriving) and compared once more. If the maximum offset is now greater than the requested offset, a new message has arrived; it is matched against the request's tag filter, and on a match executeRequestWhenWakeup is called to pull the message and return it to the consumer. If the requested offset is still the larger one and the hold has reached its timeout, executeRequestWhenWakeup is called as well and the (empty) response is returned to the consumer. If the requested offset is larger but the timeout has not been reached, the request is put back into pullRequestTable
    Note: as notifyMessageArriving shows, when the requested offset is not smaller than the max offset passed in, the max offset is fetched again for another comparison. Also note the timeout check: each held request records when it was suspended and its timeout; with long polling enabled that timeout is the suspendTimeoutMillis sent by the consumer (BROKER_SUSPEND_MAX_TIME_MILLIS), otherwise it is 1 second. A sketch of checkHoldRequest follows, and then the run() and notifyMessageArriving code.
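For completeness, here is roughly what checkHoldRequest looks like in the 4.x broker (quoted from memory and lightly abbreviated, so treat it as a sketch): it reads the current max offset of each held topic/queueId and delegates to notifyMessageArriving.

// in PullRequestHoldService (sketch)
private void checkHoldRequest() {
    for (String key : this.pullRequestTable.keySet()) {
        String[] kArray = key.split(TOPIC_QUEUEID_SEPARATOR); // key format: topic@queueId
        if (2 == kArray.length) {
            String topic = kArray[0];
            int queueId = Integer.parseInt(kArray[1]);
            // current max offset of the queue, used as the maxOffset argument of notifyMessageArriving
            final long offset = this.brokerController.getMessageStore().getMaxOffsetInQueue(topic, queueId);
            try {
                this.notifyMessageArriving(topic, queueId, offset);
            } catch (Throwable e) {
                log.error("check hold request failed. topic={}, queueId={}", topic, queueId, e);
            }
        }
    }
}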
public void run() {
        log.info("{} service started", this.getServiceName());
        while (!this.isStopped()) {
            try {
                if (this.brokerController.getBrokerConfig().isLongPollingEnable()) {
                    // long polling enabled (the default): re-check held requests every 5 seconds
                    this.waitForRunning(5 * 1000);
                } else {
                    // short polling: re-check every shortPollingTimeMills (1 second by default)
                    this.waitForRunning(this.brokerController.getBrokerConfig().getShortPollingTimeMills());
                }
                }

                long beginLockTimestamp = this.systemClock.now();
                this.checkHoldRequest();
                long costTime = this.systemClock.now() - beginLockTimestamp;
                if (costTime > 5 * 1000) {
                    log.info("[NOTIFYME] check hold request cost {} ms.", costTime);
                }
            } catch (Throwable e) {
                log.warn(this.getServiceName() + " service has exception. ", e);
            }
        }

        log.info("{} service end", this.getServiceName());
    }

public void notifyMessageArriving(final String topic, final int queueId, final long maxOffset, final Long tagsCode,
        long msgStoreTime, byte[] filterBitMap, Map<String, String> properties) {
        String key = this.buildKey(topic, queueId);
        ManyPullRequest mpr = this.pullRequestTable.get(key);
        if (mpr != null) {
            List<PullRequest> requestList = mpr.cloneListAndClear();
            if (requestList != null) {
                List<PullRequest> replayList = new ArrayList<PullRequest>();

                for (PullRequest request : requestList) {
                    long newestOffset = maxOffset;
                    if (newestOffset <= request.getPullFromThisOffset()) {
                        // the offset passed in may be stale: re-read the queue's current max offset
                        newestOffset = this.brokerController.getMessageStore().getMaxOffsetInQueue(topic, queueId);
                    }

                    // a message beyond the requested offset has arrived
                    if (newestOffset > request.getPullFromThisOffset()) {
                        boolean match = request.getMessageFilter().isMatchedByConsumeQueue(tagsCode,
                            new ConsumeQueueExt.CqExtUnit(tagsCode, msgStoreTime, filterBitMap));
                        // match by bit map, need eval again when properties is not null.
                        if (match && properties != null) {
                            match = request.getMessageFilter().isMatchedByCommitLog(null, properties);
                        }

                        if (match) {
                            try {
                                this.brokerController.getPullMessageProcessor().executeRequestWhenWakeup(request.getClientChannel(),
                                    request.getRequestCommand());
                            } catch (Throwable e) {
                                log.error("execute request when wakeup failed.", e);
                            }
                            continue;
                        }
                    }

                    // the hold has timed out: wake the request up anyway and let it return an empty response
                    if (System.currentTimeMillis() >= (request.getSuspendTimestamp() + request.getTimeoutMillis())) {
                        try {
                            this.brokerController.getPullMessageProcessor().executeRequestWhenWakeup(request.getClientChannel(),
                                request.getRequestCommand());
                        } catch (Throwable e) {
                            log.error("execute request when wakeup failed.", e);
                        }
                        continue;
                    }

                    // neither a new message nor a timeout: keep holding the request
                    replayList.add(request);
                }

                if (!replayList.isEmpty()) {
                    mpr.addPullRequest(replayList);
                }
            }
        }
    }

One last point worth noting: when executeRequestWhenWakeup submits the RequestTask back to PullMessageProcessor, the brokerAllowSuspend parameter is set to false, so this time the request will not be suspended again regardless of whether any message can be pulled. A rough sketch of the method is shown below.
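The sketch below reflects what the original screenshot showed (quoted from memory and abbreviated): the woken-up request is re-run through processRequest with brokerAllowSuspend = false, and this time a non-null response is written back to the consumer.

// in PullMessageProcessor (sketch, abbreviated)
public void executeRequestWhenWakeup(final Channel channel, final RemotingCommand request)
    throws RemotingCommandException {
    Runnable run = new Runnable() {
        @Override
        public void run() {
            try {
                // third argument brokerAllowSuspend = false: do not hold the request a second time
                final RemotingCommand response =
                    PullMessageProcessor.this.processRequest(channel, request, false);
                if (response != null) {
                    response.setOpaque(request.getOpaque());
                    response.markResponseType();
                    channel.writeAndFlush(response); // the consumer finally receives its pull response here
                }
            } catch (RemotingCommandException e1) {
                log.error("executeRequestWhenWakeup run", e1);
            }
        }
    };
    this.brokerController.getPullMessageExecutor().submit(new RequestTask(run, channel, request));
}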

3. A new message arrives at the broker

When a new message arrives at the broker, ReputMessageService dispatches it and builds the corresponding consumequeue entry. There is one key piece of logic to note here, sketched below: if the current broker is a master, long polling is enabled and messageArrivingListener is not null (it is created during broker startup, so it is not null), the arriving method is called. arriving in turn calls PullRequestHoldService.notifyMessageArriving, which walks pullRequestTable to check whether the newly arrived message is the one a held request is waiting for.
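The fragment below is roughly what the screenshot showed, from DefaultMessageStore.ReputMessageService.doReput() in the 4.x broker (quoted from memory, so treat it as a sketch); the listener is the NotifyMessageArrivingListener that simply forwards the arrival to PullRequestHoldService.notifyMessageArriving.

// in DefaultMessageStore.ReputMessageService#doReput (sketch)
if (BrokerRole.SLAVE != DefaultMessageStore.this.getMessageStoreConfig().getBrokerRole()
    && DefaultMessageStore.this.brokerConfig.isLongPollingEnable()
    && DefaultMessageStore.this.messageArrivingListener != null) {
    // wake up any pull request held for this topic/queueId
    DefaultMessageStore.this.messageArrivingListener.arriving(dispatchRequest.getTopic(),
        dispatchRequest.getQueueId(), dispatchRequest.getConsumeQueueOffset() + 1,
        dispatchRequest.getTagsCode(), dispatchRequest.getStoreTimestamp(),
        dispatchRequest.getBitMap(), dispatchRequest.getPropertiesMap());
}

// NotifyMessageArrivingListener (sketch): forwards the arrival to PullRequestHoldService
public void arriving(String topic, int queueId, long logicOffset, long tagsCode,
    long msgStoreTime, byte[] filterBitMap, Map<String, String> properties) {
    this.pullRequestHoldService.notifyMessageArriving(topic, queueId, logicOffset, tagsCode,
        msgStoreTime, filterBitMap, properties);
}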

Summary

To wrap up, here is the whole flow in place of a diagram:

  • The consumer sends a pull request with the suspend flag set and a suspend timeout of 15 seconds.
  • The requested offset equals the queue's max offset, so getMessage returns OFFSET_OVERFLOW_ONE, which PullMessageProcessor maps to PULL_NOT_FOUND.
  • Because brokerAllowSuspend and hasSuspendFlag are both true, the request is handed to PullRequestHoldService, stored in pullRequestTable, and no response is written back.
  • The held request is woken up either when a new message for that queue arrives (ReputMessageService → messageArrivingListener → notifyMessageArriving) or when the periodic check finds the hold has timed out; executeRequestWhenWakeup then re-runs the pull with brokerAllowSuspend = false and writes the response back.
  • In either case the response only comes back after the hold, so with a producer sending one message per minute most pulls take close to the 15-second suspend timeout, which is exactly why rocketmq_client_consumer_pull_rt looks so high.
