Troubleshooting Record: Monitoring Platform Reports That a RocketMQ Consumer Group Has Stopped Consuming

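Background:

One TOPIC carries two TAGs, TAGA and TAGB. Two consumer groups, GROUPA and GROUPB, subscribe to TAGA and TAGB respectively. Producers send a large volume of messages with TAGA, while no producer ever sends messages with TAGB.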

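Symptom:

The monitoring platform reports that GROUPB, the consumer group subscribed to TAGB, has stopped consuming and that its consume offset lags several thousand messages behind.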

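Cause:

In the code, a consumer group pulls messages with its TAG attached to the request. If a TAG never receives any messages, the consume offset of the corresponding consumer group is never updated, and the monitoring platform misreads this as the consumer group having stopped consuming.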
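To make the misjudgment concrete, here is a minimal sketch in Java. It assumes the monitor computes lag as the queue's max offset minus the group's committed offset, without looking at tags. The class and method names (`LagMonitorSketch`, `naiveLag`, `tagAwareLag`) and the numbers are hypothetical; none of this is RocketMQ or monitoring-platform code. A single queue is modeled as a list of tags.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical model, not RocketMQ code: one queue represented as a list of message tags.
class LagMonitorSketch {

    // Naive lag: everything between the committed offset and the end of the queue, ignoring tags.
    static long naiveLag(long queueMaxOffset, long committedOffset) {
        return queueMaxOffset - committedOffset;
    }

    // Tag-aware lag: counts only messages at or after committedOffset whose tag matches.
    static long tagAwareLag(List<String> queueTags, long committedOffset, String tag) {
        long count = 0;
        for (int i = (int) committedOffset; i < queueTags.size(); i++) {
            if (queueTags.get(i).equals(tag)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // The shared topic only ever receives TAGA messages (3000 is an illustrative number).
        List<String> queue = Collections.nCopies(3000, "TAGA");

        long groupBCommitted = 0L; // assumption: GROUPB's committed offset never moved

        // The naive view reports a backlog of 3000 for GROUPB ...
        System.out.println("naive lag for GROUPB     = " + naiveLag(queue.size(), groupBCommitted));
        // ... although no TAGB message is actually waiting.
        System.out.println("tag-aware lag for GROUPB = " + tagAwareLag(queue, groupBCommitted, "TAGB"));
    }
}
```

Under these assumptions, the naive figure grows with every TAGA message even though GROUPB has nothing to consume, which matches the alert described above.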

RocketMQ source code walkthrough (org.apache.rocketmq, version 4.9.1):

org.apache.rocketmq.client.impl.consumer.PullAPIWrapper#pullKernelImpl

public PullResult pullKernelImpl(MessageQueue mq, String subExpression, String expressionType, long subVersion, long offset, int maxNums, int sysFlag, long commitOffset, long brokerSuspendMaxTimeMillis, long timeoutMillis, CommunicationMode communicationMode, PullCallback pullCallback) throws MQClientException, RemotingException, MQBrokerException, InterruptedException {
    FindBrokerResult findBrokerResult = this.mQClientFactory.findBrokerAddressInSubscribe(mq.getBrokerName(), this.recalculatePullFromWhichNode(mq), false);
    if (null == findBrokerResult) {
        this.mQClientFactory.updateTopicRouteInfoFromNameServer(mq.getTopic());
        findBrokerResult = this.mQClientFactory.findBrokerAddressInSubscribe(mq.getBrokerName(), this.recalculatePullFromWhichNode(mq), false);
    }
 
    if (findBrokerResult != null) {
        if (!ExpressionType.isTagType(expressionType) && findBrokerResult.getBrokerVersion() < Version.V4_1_0_SNAPSHOT.ordinal()) {
            throw new MQClientException("The broker[" + mq.getBrokerName() + ", " + findBrokerResult.getBrokerVersion() + "] does not upgrade to support for filter message by " + expressionType, (Throwable)null);
        } else {
            int sysFlagInner = sysFlag;
            if (findBrokerResult.isSlave()) {
                sysFlagInner = PullSysFlag.clearCommitOffsetFlag(sysFlag);
            }
 
            PullMessageRequestHeader requestHeader = new PullMessageRequestHeader();
            requestHeader.setConsumerGroup(this.consumerGroup);
            requestHeader.setTopic(mq.getTopic());
            requestHeader.setQueueId(mq.getQueueId());
            requestHeader.setQueueOffset(offset);
            requestHeader.setMaxMsgNums(maxNums);
            requestHeader.setSysFlag(sysFlagInner);
            requestHeader.setCommitOffset(commitOffset);
            requestHeader.setSuspendTimeoutMillis(brokerSuspendMaxTimeMillis);
            // NOTE: the TAG filter expression is passed along with the pull request
            requestHeader.setSubscription(subExpression);
            requestHeader.setSubVersion(subVersion);
            requestHeader.setExpressionType(expressionType);
            String brokerAddr = findBrokerResult.getBrokerAddr();
            if (PullSysFlag.hasClassFilterFlag(sysFlagInner)) {
                brokerAddr = this.computePullFromWhichFilterServer(mq.getTopic(), brokerAddr);
            }
 
            PullResult pullResult = this.mQClientFactory.getMQClientAPIImpl().pullMessage(brokerAddr, requestHeader, timeoutMillis, communicationMode, pullCallback);
            return pullResult;
        }
    } else {
        throw new MQClientException("The broker[" + mq.getBrokerName() + "] not exist", (Throwable)null);
    }
}
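The effect of sending the subscription expression with the pull can be sketched with a simplified model. This is only an illustration, not the broker's actual filtering logic: `TagFilteredPullSketch`, `pull`, and `Result` are hypothetical names, the queue is a plain list of tags, and the scan window is assumed to be exactly `maxNums` entries. The status names mirror the ones handled in the client callback further down.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model of a tag-filtered pull; not the broker implementation.
class TagFilteredPullSketch {

    enum Status { FOUND, NO_MATCHED_MSG, NO_NEW_MSG }

    static final class Result {
        final Status status;
        final long nextBeginOffset;
        final List<Long> matchedOffsets;

        Result(Status status, long nextBeginOffset, List<Long> matchedOffsets) {
            this.status = status;
            this.nextBeginOffset = nextBeginOffset;
            this.matchedOffsets = matchedOffsets;
        }
    }

    // Scans at most maxNums entries starting at offset and keeps those whose tag equals subExpression.
    static Result pull(List<String> queueTags, long offset, int maxNums, String subExpression) {
        if (offset >= queueTags.size()) {
            // Nothing beyond the requested offset.
            return new Result(Status.NO_NEW_MSG, offset, new ArrayList<Long>());
        }
        List<Long> matched = new ArrayList<Long>();
        long next = offset;
        while (next < queueTags.size() && next - offset < maxNums) {
            if (queueTags.get((int) next).equals(subExpression)) {
                matched.add(next);
            }
            next++;
        }
        Status status = matched.isEmpty() ? Status.NO_MATCHED_MSG : Status.FOUND;
        return new Result(status, next, matched);
    }

    public static void main(String[] args) {
        List<String> queue = Arrays.asList("TAGA", "TAGA", "TAGB", "TAGA");
        long offset = 0L;
        // Pull three times with a window of 2, filtering on TAGB.
        for (int i = 0; i < 3; i++) {
            Result r = pull(queue, offset, 2, "TAGB");
            System.out.println(r.status + " nextBeginOffset=" + r.nextBeginOffset
                    + " matched=" + r.matchedOffsets);
            offset = r.nextBeginOffset;
        }
    }
}
```

In this model a consumer filtering on a tag that never appears only ever sees NO_MATCHED_MSG or NO_NEW_MSG, so what happens to its offset depends entirely on how the client handles those two statuses.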

org.apache.rocketmq.client.impl.consumer.DefaultMQPushConsumerImpl#pullMessage

public void pullMessage(final PullRequest pullRequest) {
    final ProcessQueue processQueue = pullRequest.getProcessQueue();
    if (processQueue.isDropped()) {
        this.log.info("the pull request[{}] is dropped.", pullRequest.toString());
    } else {
        pullRequest.getProcessQueue().setLastPullTimestamp(System.currentTimeMillis());
 
        try {
            this.makeSureStateOK();
        } catch (MQClientException var21) {
            this.log.warn("pullMessage exception, consumer state not ok", var21);
            this.executePullRequestLater(pullRequest, this.pullTimeDelayMillsWhenException);
            return;
        }
 
        if (this.isPause()) {
            this.log.warn("consumer was paused, execute pull request later. instanceName={}, group={}", this.defaultMQPushConsumer.getInstanceName(), this.defaultMQPushConsumer.getConsumerGroup());
            this.executePullRequestLater(pullRequest, 1000L);
        } else {
            long cachedMessageCount = processQueue.getMsgCount().get();
            long cachedMessageSizeInMiB = processQueue.getMsgSize().get() / 1048576L;
            if (cachedMessageCount > (long)this.defaultMQPushConsumer.getPullThresholdForQueue()) {
                this.executePullRequestLater(pullRequest, 50L);
                if (this.queueFlowControlTimes++ % 1000L == 0L) {
                    this.log.warn("the cached message count exceeds the threshold {}, so do flow control, minOffset={}, maxOffset={}, count={}, size={} MiB, pullRequest={}, flowControlTimes={}", new Object[]{this.defaultMQPushConsumer.getPullThresholdForQueue(), processQueue.getMsgTreeMap().firstKey(), processQueue.getMsgTreeMap().lastKey(), cachedMessageCount, cachedMessageSizeInMiB, pullRequest, this.queueFlowControlTimes});
                }
 
            } else if (cachedMessageSizeInMiB > (long)this.defaultMQPushConsumer.getPullThresholdSizeForQueue()) {
                this.executePullRequestLater(pullRequest, 50L);
                if (this.queueFlowControlTimes++ % 1000L == 0L) {
                    this.log.warn("the cached message size exceeds the threshold {} MiB, so do flow control, minOffset={}, maxOffset={}, count={}, size={} MiB, pullRequest={}, flowControlTimes={}", new Object[]{this.defaultMQPushConsumer.getPullThresholdSizeForQueue(), processQueue.getMsgTreeMap().firstKey(), processQueue.getMsgTreeMap().lastKey(), cachedMessageCount, cachedMessageSizeInMiB, pullRequest, this.queueFlowControlTimes});
                }
 
            } else {
                if (!this.consumeOrderly) {
                    if (processQueue.getMaxSpan() > (long)this.defaultMQPushConsumer.getConsumeConcurrentlyMaxSpan()) {
                        this.executePullRequestLater(pullRequest, 50L);
                        if (this.queueMaxSpanFlowControlTimes++ % 1000L == 0L) {
                            this.log.warn("the queue's messages, span too long, so do flow control, minOffset={}, maxOffset={}, maxSpan={}, pullRequest={}, flowControlTimes={}", new Object[]{processQueue.getMsgTreeMap().firstKey(), processQueue.getMsgTreeMap().lastKey(), processQueue.getMaxSpan(), pullRequest, this.queueMaxSpanFlowControlTimes});
                        }
 
                        return;
                    }
                } else {
                    if (!processQueue.isLocked()) {
                        this.executePullRequestLater(pullRequest, this.pullTimeDelayMillsWhenException);
                        this.log.info("pull message later because not locked in broker, {}", pullRequest);
                        return;
                    }
 
                    if (!pullRequest.isPreviouslyLocked()) {
                        long offset = -1L;
 
                        try {
                            offset = this.rebalanceImpl.computePullFromWhereWithException(pullRequest.getMessageQueue());
                        } catch (Exception var20) {
                            this.executePullRequestLater(pullRequest, this.pullTimeDelayMillsWhenException);
                            this.log.error("Failed to compute pull offset, pullResult: {}", pullRequest, var20);
                            return;
                        }
 
                        boolean brokerBusy = offset < pullRequest.getNextOffset();
                        this.log.info("the first time to pull message, so fix offset from broker. pullRequest: {} NewOffset: {} brokerBusy: {}", new Object[]{pullRequest, offset, brokerBusy});
                        if (brokerBusy) {
                            this.log.info("[NOTIFYME]the first time to pull message, but pull request offset larger than broker consume offset. pullRequest: {} NewOffset: {}", pullRequest, offset);
                        }
 
                        pullRequest.setPreviouslyLocked(true);
                        pullRequest.setNextOffset(offset);
                    }
                }
 
                final SubscriptionData subscriptionData = (SubscriptionData)this.rebalanceImpl.getSubscriptionInner().get(pullRequest.getMessageQueue().getTopic());
                if (null == subscriptionData) {
                    this.executePullRequestLater(pullRequest, this.pullTimeDelayMillsWhenException);
                    this.log.warn("find the consumer's subscription failed, {}", pullRequest);
                } else {
                    final long beginTimestamp = System.currentTimeMillis();
                    PullCallback pullCallback = new PullCallback() {
                        public void onSuccess(PullResult pullResult) {
                            if (pullResult != null) {
                                pullResult = DefaultMQPushConsumerImpl.this.pullAPIWrapper.processPullResult(pullRequest.getMessageQueue(), pullResult, subscriptionData);
                                switch (pullResult.getPullStatus()) {
                                    case FOUND:
                                        long prevRequestOffset = pullRequest.getNextOffset();
                                        pullRequest.setNextOffset(pullResult.getNextBeginOffset());
                                        long pullRT = System.currentTimeMillis() - beginTimestamp;
                                        DefaultMQPushConsumerImpl.this.getConsumerStatsManager().incPullRT(pullRequest.getConsumerGroup(), pullRequest.getMessageQueue().getTopic(), pullRT);
                                        long firstMsgOffset = Long.MAX_VALUE;
                                        if (pullResult.getMsgFoundList() != null && !pullResult.getMsgFoundList().isEmpty()) {
                                            firstMsgOffset = ((MessageExt)pullResult.getMsgFoundList().get(0)).getQueueOffset();
                                            DefaultMQPushConsumerImpl.this.getConsumerStatsManager().incPullTPS(pullRequest.getConsumerGroup(), pullRequest.getMessageQueue().getTopic(), (long)pullResult.getMsgFoundList().size());
                                            boolean dispatchToConsume = processQueue.putMessage(pullResult.getMsgFoundList());
                                            DefaultMQPushConsumerImpl.this.consumeMessageService.submitConsumeRequest(pullResult.getMsgFoundList(), processQueue, pullRequest.getMessageQueue(), dispatchToConsume);
                                            if (DefaultMQPushConsumerImpl.this.defaultMQPushConsumer.getPullInterval() > 0L) {
                                                DefaultMQPushConsumerImpl.this.executePullRequestLater(pullRequest, DefaultMQPushConsumerImpl.this.defaultMQPushConsumer.getPullInterval());
                                            } else {
                                                DefaultMQPushConsumerImpl.this.executePullRequestImmediately(pullRequest);
                                            }
                                        } else {
                                            DefaultMQPushConsumerImpl.this.executePullRequestImmediately(pullRequest);
                                        }
 
                                        if (pullResult.getNextBeginOffset() < prevRequestOffset || firstMsgOffset < prevRequestOffset) {
                                            DefaultMQPushConsumerImpl.this.log.warn("[BUG] pull message result maybe data wrong, nextBeginOffset: {} firstMsgOffset: {} prevRequestOffset: {}", new Object[]{pullResult.getNextBeginOffset(), firstMsgOffset, prevRequestOffset});
                                        }
                                        break;
                                    // NOTE: the offset is committed from this pull callback; per the analysis above, a NO_NEW_MSG pull result does not write the offset back
                                    case NO_NEW_MSG:
                                    case NO_MATCHED_MSG:
                                        pullRequest.setNextOffset(pullResult.getNextBeginOffset());
                                        DefaultMQPushConsumerImpl.this.correctTagsOffset(pullRequest);
                                        DefaultMQPushConsumerImpl.this.executePullRequestImmediately(pullRequest);
                                        break;
                                    case OFFSET_ILLEGAL:
                                        DefaultMQPushConsumerImpl.this.log.warn("the pull request offset illegal, {} {}", pullRequest.toString(), pullResult.toString());
                                        pullRequest.setNextOffset(pullResult.getNextBeginOffset());
                                        pullRequest.getProcessQueue().setDropped(true);
                                        DefaultMQPushConsumerImpl.this.executeTaskLater(new Runnable() {
                                            public void run() {
                                                try {
                                                    DefaultMQPushConsumerImpl.this.offsetStore.updateOffset(pullRequest.getMessageQueue(), pullRequest.getNextOffset(), false);
                                                    DefaultMQPushConsumerImpl.this.offsetStore.persist(pullRequest.getMessageQueue());
                                                    DefaultMQPushConsumerImpl.this.rebalanceImpl.removeProcessQueue(pullRequest.getMessageQueue());
                                                    DefaultMQPushConsumerImpl.this.log.warn("fix the pull request offset, {}", pullRequest);
                                                } catch (Throwable var2) {
                                                    DefaultMQPushConsumerImpl.this.log.error("executeTaskLater Exception", var2);
                                                }
 
                                            }
                                        }, 10000L);
                                }
                            }
 
                        }
 
                        public void onException(Throwable e) {
                            if (!pullRequest.getMessageQueue().getTopic().startsWith("%RETRY%")) {
                                DefaultMQPushConsumerImpl.this.log.warn("execute the pull request exception", e);
                            }
 
                            DefaultMQPushConsumerImpl.this.executePullRequestLater(pullRequest, DefaultMQPushConsumerImpl.this.pullTimeDelayMillsWhenException);
                        }
                    };
                    boolean commitOffsetEnable = false;
                    long commitOffsetValue = 0L;
                    if (MessageModel.CLUSTERING == this.defaultMQPushConsumer.getMessageModel()) {
                        commitOffsetValue = this.offsetStore.readOffset(pullRequest.getMessageQueue(), ReadOffsetType.READ_FROM_MEMORY);
                        if (commitOffsetValue > 0L) {
                            commitOffsetEnable = true;
                        }
                    }
 
                    String subExpression = null;
                    boolean classFilter = false;
                    SubscriptionData sd = (SubscriptionData)this.rebalanceImpl.getSubscriptionInner().get(pullRequest.getMessageQueue().getTopic());
                    if (sd != null) {
                        if (this.defaultMQPushConsumer.isPostSubscriptionWhenPull() && !sd.isClassFilterMode()) {
                            subExpression = sd.getSubString();
                        }
 
                        classFilter = sd.isClassFilterMode();
                    }
 
                    int sysFlag = PullSysFlag.buildSysFlag(commitOffsetEnable, true, subExpression != null, classFilter);
 
                    try {
                        this.pullAPIWrapper.pullKernelImpl(pullRequest.getMessageQueue(), subExpression, subscriptionData.getExpressionType(), subscriptionData.getSubVersion(), pullRequest.getNextOffset(), this.defaultMQPushConsumer.getPullBatchSize(), sysFlag, commitOffsetValue, 15000L, 30000L, CommunicationMode.ASYNC, pullCallback);
                    } catch (Exception var19) {
                        this.log.error("pullKernelImpl exception", var19);
                        this.executePullRequestLater(pullRequest, this.pullTimeDelayMillsWhenException);
                    }
 
                }
            }
        }
    }
}
 
 