RocketMQ Implementation Principles at the Source-Code Level

The benefits of using an MQ are asynchrony, decoupling, and peak shaving; the downsides are added complexity and reduced consistency/availability.

CommitLog synchronous/asynchronous flush mechanism


Synchronous flush


Asynchronous flush


Three kinds of threads are involved here: FlushRealTimeService, CommitRealTimeService, and the broker's main write thread. The CommitLog class coordinates these three threads, and the bridge they communicate through is the lastMappedFile in the MappedFileQueue.

The broker write thread and CommitRealTimeService are responsible for writing messages into that bridge and waking the FlushRealTimeService thread; once woken by either of the first two threads, FlushRealTimeService flushes the newly written messages in the bridge to disk.

How exactly the three threads coordinate:

With off-heap memory enabled, the broker write thread writes the message into a byteBuffer borrowed from the off-heap buffer pool. The CommitRealTimeService thread then uses the fileChannel field of the MappedFile currently being written and calls fileChannel.write(byteBuffer) to commit the message in the byteBuffer to the PageCache. Once the commit is done, CommitRealTimeService calls FlushRealTimeService#wakeup() to wake the FlushRealTimeService thread, which finally flushes the newly written messages in the bridge to disk.

Without off-heap memory, the broker write thread writes the message directly into the mappedByteBuffer field of the current MappedFile (i.e. straight into the PageCache). When the write finishes, the broker write thread calls FlushRealTimeService#wakeup() to wake the FlushRealTimeService thread, which then flushes the newly written messages in the bridge to disk.

In general there are two ways of reading and writing:
(1) Mmap + PageCache: both reads and writes of messages go through the page cache. With reads and writes sharing the page cache, contention is unavoidable; under concurrent reads and writes you run into page faults, memory locking, and dirty-page write-back.
(2) DirectByteBuffer (off-heap memory) + PageCache, a two-layer architecture that separates reads from writes: messages are written into the DirectByteBuffer (off-heap memory), while reads go through the PageCache (with the DirectByteBuffer, flushing takes two steps: first a commit into the PageCache, then a flush to the disk file). The benefit is that many of the in-memory operations that tend to block are avoided, lowering latency: fewer page faults, less memory locking, less dirty-page write-back.
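
A minimal, self-contained sketch of these two write/flush paths using plain Java NIO (an illustration of the idea only, not RocketMQ's actual MappedFile/CommitLog code; file names and sizes are made up):

import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class FlushPathsSketch {
    public static void main(String[] args) throws Exception {
        byte[] msg = "hello".getBytes();

        // Path 1: mmap + PageCache. Writing into the MappedByteBuffer goes straight
        // into the page cache; force() asks the OS to flush the dirty pages to disk.
        try (RandomAccessFile raf = new RandomAccessFile("commitlog-mmap.bin", "rw")) {
            MappedByteBuffer mapped = raf.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, 1024);
            mapped.put(msg);      // write into the page cache
            mapped.force();       // flush to disk (what FlushRealTimeService does)
        }

        // Path 2: off-heap DirectByteBuffer + PageCache (two-step flush).
        try (RandomAccessFile raf = new RandomAccessFile("commitlog-direct.bin", "rw")) {
            FileChannel channel = raf.getChannel();
            ByteBuffer direct = ByteBuffer.allocateDirect(1024); // borrowed from a pool in RocketMQ
            direct.put(msg);
            direct.flip();
            channel.write(direct); // step 1: commit into the page cache (CommitRealTimeService)
            channel.force(false);  // step 2: flush the page cache to disk (FlushRealTimeService)
        }
    }
}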
 

Message consumption overview


Note the meanings of the three kinds of offsets: the physical offset is the global byte offset within the CommitLog.


Dispatch


As you can see, pulling messages and consuming messages are actually two separate processes, but this is transparent to the user.


Each queue of a Topic is bound to exactly one pullRequest object, and each pullRequest object is in turn bound to exactly one red-black-tree queue, its processQueue.

Each dashed box represents a Java class; boxes of the same color represent calls made within the same thread; the blue box in the upper-right corner represents the callback thread.

  • When the RebalanceService thread performs a rebalance, it allocates a dedicated PullRequest for every queue owned by each push consumer (pull consumers are not handled by the PullMessageService thread), then calls PullMessageService#executePullRequestImmediately() to drop each PullRequest into the PullMessageService thread's blocking queue, pullRequestQueue (see the sketch after this list)
  • The PullMessageService thread's own run() method keeps taking PullRequests from pullRequestQueue and, using the offset recorded in each PullRequest, pulls messages from the broker and puts them into the local red-black tree
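
A rough sketch of the per-queue setup the rebalance step performs (field names match the PullRequest class shown further below; computePullFromWhere() stands in for the real offset lookup, and the rest of the rebalance bookkeeping is omitted):

// For every queue newly assigned to this push consumer:
PullRequest pullRequest = new PullRequest();
pullRequest.setConsumerGroup(consumerGroup);                   // the group doing the pulling
pullRequest.setMessageQueue(messageQueue);                     // the queue this request is bound to
pullRequest.setProcessQueue(new ProcessQueue());               // its dedicated red-black-tree cache
pullRequest.setNextOffset(computePullFromWhere(messageQueue)); // where to start pulling from

// Hand the request to the PullMessageService thread via its blocking queue.
mQClientFactory.getPullMessageService().executePullRequestImmediately(pullRequest);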

Each consumer client has one PullMessageService thread, which is responsible for pulling messages for multiple topics.

When the broker side has plenty of messages, the PullMessageService thread keeps looping through three steps in its while loop:

  • take a pullRequest from pullRequestQueue
  • after obtaining the pullRequest, perform some flow-control checks;
  • send the broker an asynchronous pull request (passing a callback function along to the broker call)

When the pull request sent to the broker brings back messages, the broker sends a response to the client; the client's Netty IO thread receives the response and hands it to a thread in the NettyClientPublicExecutor pool, and that thread invokes the callback registered earlier, which writes the messages into the TreeMap.

A typical asynchronous call passes in a callback or a listener, and eventually a different IO thread invokes that callback or listener, as sketched below.
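
A minimal, self-contained sketch of this callback pattern in plain Java (not RocketMQ's actual PullCallback plumbing): the caller registers a callback and returns immediately, and a separate IO/worker thread later invokes it.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncCallbackSketch {
    interface PullCallback {                      // mirrors the idea of RocketMQ's PullCallback
        void onSuccess(String result);
        void onException(Throwable e);
    }

    static void pullAsync(ExecutorService ioPool, PullCallback callback) {
        // The calling thread returns right after submitting; it does not wait for the response.
        ioPool.submit(() -> {
            try {
                String response = "32 messages";  // pretend this came back from the broker
                callback.onSuccess(response);     // the IO/worker thread drives the callback
            } catch (Throwable t) {
                callback.onException(t);
            }
        });
    }

    public static void main(String[] args) {
        ExecutorService ioPool = Executors.newSingleThreadExecutor();
        pullAsync(ioPool, new PullCallback() {
            public void onSuccess(String result) { System.out.println("got: " + result); }
            public void onException(Throwable e) { e.printStackTrace(); }
        });
        System.out.println("caller moved on without blocking");
        ioPool.shutdown();
    }
}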

Two things can wake the doRebalance (RebalanceService) thread (see the sketch after this list):

  • it wakes up by itself after waitForRunning(20s)
  • the broker wakes the doRebalance thread
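
A minimal sketch of that wait-or-be-woken behavior in plain Java (simplified from the ServiceThread waitForRunning/wakeup idea; the 20 s figure matches the text above):

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class RebalanceWaitSketch extends Thread {
    private final AtomicReference<CountDownLatch> waitPoint =
        new AtomicReference<>(new CountDownLatch(1));
    private volatile boolean stopped = false;

    public void wakeup() {                         // e.g. called when the broker notifies the client
        waitPoint.get().countDown();
    }

    @Override
    public void run() {
        while (!stopped) {
            try {
                // Wake up either after 20 s or when wakeup() is called, whichever comes first.
                waitPoint.get().await(20, TimeUnit.SECONDS);
            } catch (InterruptedException ignored) {
            }
            waitPoint.set(new CountDownLatch(1));  // re-arm for the next round
            doRebalance();
        }
    }

    private void doRebalance() {
        System.out.println("rebalancing at " + System.currentTimeMillis());
    }

    public static void main(String[] args) throws InterruptedException {
        RebalanceWaitSketch service = new RebalanceWaitSketch();
        service.start();
        Thread.sleep(1000);
        service.stopped = true;
        service.wakeup();                          // a broker notification arrives
    }
}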

The PullMessageService thread implementation:

public class PullMessageService extends ServiceThread {

    private final LinkedBlockingQueue<PullRequest> pullRequestQueue = new LinkedBlockingQueue<PullRequest>();
    private final MQClientInstance mQClientFactory;
    private final ScheduledExecutorService scheduledExecutorService = Executors.newSingleThreadScheduledExecutor(new ThreadFactory() {
            @Override
            public Thread newThread(Runnable r) {
                return new Thread(r, "PullMessageServiceScheduledThread");
            }
        });

    public PullMessageService(MQClientInstance mQClientFactory) {
        this.mQClientFactory = mQClientFactory;
    }

    public void executePullRequestLater(final PullRequest pullRequest, final long timeDelay) {
        if (!isStopped()) {
            this.scheduledExecutorService.schedule(new Runnable() {
                @Override
                public void run() {
                    PullMessageService.this.executePullRequestImmediately(pullRequest);
                }
            }, timeDelay, TimeUnit.MILLISECONDS);
        } else {
            log.warn("PullMessageServiceScheduledThread has shutdown");
        }
    }

    public void executePullRequestImmediately(final PullRequest pullRequest) {
        try {
            this.pullRequestQueue.put(pullRequest);
        } catch (InterruptedException e) {
            log.error("executePullRequestImmediately pullRequestQueue.put", e);
        }
    }

    private void pullMessage(final PullRequest pullRequest) {
        // One consumer machine is represented by a single MQClientInstance;
        // an MQClientInstance may hold many different consumers belonging to different consumer groups
        final MQConsumerInner consumer = this.mQClientFactory.selectConsumer(pullRequest.getConsumerGroup());
        if (consumer != null) {

            // Note: only DefaultMQPushConsumerImpl is handled here; pull consumers are not serviced by this thread
            DefaultMQPushConsumerImpl impl = (DefaultMQPushConsumerImpl) consumer;
            impl.pullMessage(pullRequest);
        } else {
            log.warn("No matched consumer for the PullRequest {}, drop it", pullRequest);
        }
    }

    @Override
    public void run() {
        log.info(this.getServiceName() + " service started");

        while (!this.isStopped()) {
            try {
                PullRequest pullRequest = this.pullRequestQueue.take();
                this.pullMessage(pullRequest);
            } catch (InterruptedException ignored) {
            } catch (Exception e) {
                log.error("Pull Message Service Run Method exception", e);
            }
        }

        log.info(this.getServiceName() + " service end");
    }

}

One consumer machine is represented by a single MQClientInstance

public class MQClientInstance {
    private final ClientConfig clientConfig;
    private final String clientId;

    private final ConcurrentMap<String/* group */, MQProducerInner> producerTable = new ConcurrentHashMap<String, MQProducerInner>();
    private final ConcurrentMap<String/* group */, MQConsumerInner> consumerTable = new ConcurrentHashMap<String, MQConsumerInner>();

}

An MQClientInstance's consumerTable may hold many different consumers belonging to different consumer groups. Each of these consumers corresponds to its own DefaultMQPushConsumerImpl instance, and each DefaultMQPushConsumerImpl starts, via its own start() method, a number of services that belong only to itself; for example, each instance starts its own dedicated "ConsumeMessageThread_" consumption thread pool with both coreSize and maxSize set to 20 (sketched below).
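
A minimal sketch of such a per-consumer pool (a field-style fragment, assuming the default consumeThreadMin = consumeThreadMax = 20; the real pool is created in ConsumeMessageConcurrentlyService and its queue/sizing details may differ):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

ExecutorService consumeExecutor = new ThreadPoolExecutor(
    20,                                // corePoolSize  (consumeThreadMin)
    20,                                // maximumPoolSize (consumeThreadMax)
    1000 * 60, TimeUnit.MILLISECONDS,  // keep-alive for idle threads
    new LinkedBlockingQueue<Runnable>(),
    new ThreadFactory() {
        private final AtomicLong index = new AtomicLong();
        @Override
        public Thread newThread(Runnable r) {
            return new Thread(r, "ConsumeMessageThread_" + index.incrementAndGet());
        }
    });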

From the cast above it appears that only push consumers are serviced by the PullMessageService; pull consumers drive their own pulls and never have PullRequests queued here.

public class PullRequest {
    private String consumerGroup;
    private MessageQueue messageQueue;
    private ProcessQueue processQueue; // the local red-black-tree cache queue
    private long nextOffset;
    private boolean lockedFirst = false;
}

A PullRequest is bound to one queue consumed by one consumer of one consumer group. If two consumers from two different consumer groups on the same machine subscribe to and consume the same queue of the same topic, there will be two distinct PullRequests, and both will be dropped into the blocking queue pullRequestQueue of the same PullMessageService.


As you can see, the more queues a topic has, the greater its potential parallelism.


RocketMQ's rebalance: every consumer client runs the same logic, sorting all the queues and all the consumers, so that every client computes the same view.

Kafka's rebalance, by contrast, is computed on a single node, and Kafka relies on ZooKeeper to elect that leader.
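
A minimal, self-contained sketch of the "everyone computes the same view" idea, modeled on the average-allocation strategy: every client sorts the same consumer and queue lists, then takes only its own slice, so no coordinator is needed.

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class AverageAllocateSketch {
    // Every consumer runs this same code over the same (sorted) inputs, so each one
    // independently arrives at the same overall assignment.
    static List<String> allocate(String currentCid, List<String> cidAll, List<String> mqAll) {
        Collections.sort(cidAll);
        Collections.sort(mqAll);
        int index = cidAll.indexOf(currentCid);
        int mod = mqAll.size() % cidAll.size();
        int averageSize = mqAll.size() <= cidAll.size() ? 1
            : (mod > 0 && index < mod ? mqAll.size() / cidAll.size() + 1 : mqAll.size() / cidAll.size());
        int startIndex = (mod > 0 && index < mod) ? index * averageSize : index * averageSize + mod;
        List<String> result = new ArrayList<>();
        for (int i = 0; i < averageSize && startIndex + i < mqAll.size(); i++) {
            result.add(mqAll.get(startIndex + i));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> cids = List.of("cid-b", "cid-a", "cid-c");
        List<String> queues = List.of("q0", "q1", "q2", "q3", "q4", "q5", "q6", "q7");
        for (String cid : cids) {
            System.out.println(cid + " -> " + allocate(cid, new ArrayList<>(cids), new ArrayList<>(queues)));
        }
    }
}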


The PullMessage on the left-hand side is the PullMessageService thread;

after the thread on the left finishes channel.writeAndFlush to send the PullRequest from the consumer to the recv queue of that channel on the broker side, it can return immediately and move on to its next task; it does not keep waiting for the broker's response to that PullRequest. The benefit is that the thread stays very lightweight.

Each pink bar represents a business thread pool. The RocketMQ broker defines many key/value pairs, where the key is the code of a specific request type and the value is the thread pool for that code, which is how the different kinds of work are isolated from each other.

Later, when a Netty IO thread (one of the worker group) sees a READ event on a channel handled by its selector, it calls channelRead() to read the PullRequest out of the kernel recv queue, then hands the PullRequest to the business thread pool dedicated to searching the CommitLog (after handing it off, the IO thread moves on to its next task). A thread in that business pool uses the offset in the PullRequest to pull the current batch of 32 messages via the ConsumeQueue and then the CommitLog; once the pull succeeds, that same business thread calls channel.writeAndFlush to send the 32 messages back to the consumer.

After the Netty thread's channelRead() receives the pullResult sent back by the broker, it hands it to the NettyClientPublicExecutor thread pool; the NettyClientPublicExecutor thread is in effect the thread that runs the callback (or the asynchronous listener).
The callback thread is thus the first thread on the client that touches the messages:
1. A thread in the NettyClientPublicExecutor pool deserializes the contents of the pullResult into a batch of messages (typically 32)
2. It puts these messages into the red-black tree
3. It then submits the messages in the red-black tree to the thread pool dedicated to consumption


Consumption progress is reported at two points in time (see the sketch after this list):

  • a timer reports the progress in the local offsetTable to the broker every 5 s
  • every time the client sends an asynchronous PullRequest to the broker, it also carries the local consumption progress along with it
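
A minimal, self-contained sketch of these two reporting paths in plain Java (in the real client, MQClientInstance schedules persistAllConsumerOffset() every persistConsumerOffsetInterval = 5000 ms and passes commitOffsetValue in pullKernelImpl; the names below are simplified stand-ins):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class OffsetReportSketch {
    // Local consumption progress keyed by queue (stands in for the offset store's offsetTable).
    private final Map<String, AtomicLong> offsetTable = new ConcurrentHashMap<>();
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

    void start() {
        // Timing 1: every 5 s, push the whole local offset table to the broker.
        scheduler.scheduleAtFixedRate(this::persistAll, 10, 5, TimeUnit.SECONDS);
    }

    void persistAll() {
        offsetTable.forEach((queue, offset) ->
            System.out.printf("report %s -> %d to broker%n", queue, offset.get()));
    }

    // Timing 2: piggyback the current offset on every pull request (commitOffsetValue in pullKernelImpl).
    long commitOffsetForPull(String queue) {
        return offsetTable.getOrDefault(queue, new AtomicLong(0)).get();
    }
}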


On the consumer side there is a local ConcurrentMap<MessageQueue, AtomicLong> cache. Whatever queue the current consume thread is working on, after it finishes a message, regardless of success or failure, it updates that queue's AtomicLong to the firstKey of the red-black tree, i.e. the smallest message offset still in the tree (this value is the consumption progress stored locally on the consumer).

The 32 messages are first put into the red-black-tree queue and then wrapped into 32 Runnable ConsumeRequests, which are submitted to the client's consume thread pool. Threads 1, 2, 3 … 20 of that pool consume the 32 messages concurrently, calling back the consumeMessage() method of the ConsumeMessageListener registered by the application to run the real consumption logic. Whether consumeMessage() succeeds, fails, or throws, each consume thread removes the message it just processed from the red-black tree and updates the consumer's local progress cache at the same time. Note that because the consume thread pool processes messages concurrently, it may also remove messages from the red-black tree concurrently, so the tree's remove method must be protected by a lock (see the sketch below).
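
A minimal, self-contained sketch of that locked removal (plain Java, in the spirit of ProcessQueue, which guards its TreeMap with a ReadWriteLock): each consume thread removes its message under the lock and gets back the offset that is safe to report.

import java.util.TreeMap;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ProcessQueueSketch {
    private final TreeMap<Long, String> msgTreeMap = new TreeMap<>(); // offset -> message body
    private final ReadWriteLock lockTreeMap = new ReentrantReadWriteLock();
    private volatile long queueOffsetMax = 0L;

    public void putMessage(long offset, String body) {
        lockTreeMap.writeLock().lock();
        try {
            msgTreeMap.put(offset, body);
            queueOffsetMax = Math.max(queueOffsetMax, offset);
        } finally {
            lockTreeMap.writeLock().unlock();
        }
    }

    /** Called concurrently by the consume threads after each message is processed. */
    public long removeMessage(long offset) {
        lockTreeMap.writeLock().lock();
        try {
            msgTreeMap.remove(offset);
            // The progress we may safely report is the smallest offset still being processed,
            // or one past the largest offset if the tree is now empty.
            return msgTreeMap.isEmpty() ? queueOffsetMax + 1 : msgTreeMap.firstKey();
        } finally {
            lockTreeMap.writeLock().unlock();
        }
    }
}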

The only difference is that when consumeMessage() fails, the retry count is incremented by 1 and the message is sent back to the broker for another consumption attempt, whereas on success the message is not sent back. Either way, the offset of that message has been passed over and will never be consumed at that offset again; even if a message previously sent back for retry is later pulled to the client again, it no longer carries the original offset, its new offset is necessarily larger than the old one.


This chunk of page cache stays in memory, and for a while all reads hit this hot region, so read performance is well guaranteed; heavy use of the PageCache is one of the reasons RocketMQ is so efficient.


According to the original author, off-heap memory is his go-to performance-tuning tool at work, with JVM GC tuning only in second place.

The downside of this off-heap buffer pool is that data is lost if the JVM crashes; under normal operation no messages are lost, latency is very smooth with barely any spikes, and it sustained hundreds of thousands of TPS during Alibaba's Double 11.


Pull-side flow control

When the local red-black tree holds more than 1000 messages, pulling is paused; when the cached messages exceed 100 MB it also pauses; and when the span between the smallest and largest offsets in the tree exceeds 2000, pulling is throttled as well (these defaults can be tuned, as sketched below).
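
These thresholds are configurable on DefaultMQPushConsumer; the values below are the defaults the text refers to:

DefaultMQPushConsumer consumer = new DefaultMQPushConsumer("demo_group");
consumer.setPullThresholdForQueue(1000);        // pause pulling when a queue caches more than 1000 messages
consumer.setPullThresholdSizeForQueue(100);     // pause pulling when cached messages exceed 100 MiB
consumer.setConsumeConcurrentlyMaxSpan(2000);   // pause pulling when the max-min offset span exceeds 2000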

public class Consumer {

    public static void main(String[] args) throws InterruptedException, MQClientException {
        DefaultMQPushConsumer consumer = new DefaultMQPushConsumer("please_rename_unique_group_name_4");

        consumer.setConsumeFromWhere(ConsumeFromWhere.CONSUME_FROM_FIRST_OFFSET);

        consumer.subscribe("TopicTest", "*");

        consumer.registerMessageListener(new MessageListenerConcurrently() {

            @Override
            public ConsumeConcurrentlyStatus consumeMessage(List<MessageExt> msgs,
                ConsumeConcurrentlyContext context) {
                System.out.printf("%s Receive New Messages: %s %n", Thread.currentThread().getName(), msgs);
                return ConsumeConcurrentlyStatus.CONSUME_SUCCESS;
            }
        });

        consumer.start();

        System.out.printf("Consumer Started.%n");
    }
}

A colleague once called some company's TCP method inside the registered listener, and as a result one of the 16 queues simply stopped being consumed. In that situation the first suspicion should be that some remote call inside the listener is blocking, and the prime suspect was the HttpClient call in that listener: it had no timeout configured, so it hung forever, which in turn kept producing flow-control logs such as "the cached message count exceeds the threshold {}, so do flow control".

Client side

package org.apache.rocketmq.client.impl.consumer;

public class PullMessageService extends ServiceThread {
    private final InternalLogger log = ClientLogger.getLog();
    private final LinkedBlockingQueue<PullRequest> pullRequestQueue = new LinkedBlockingQueue<PullRequest>();
    private final MQClientInstance mQClientFactory;
    
    // A scheduled thread pool is used heavily for delayed re-scheduling
    private final ScheduledExecutorService scheduledExecutorService = Executors.newSingleThreadScheduledExecutor(new ThreadFactory() {
            @Override
            public Thread newThread(Runnable r) {
                return new Thread(r, "PullMessageServiceScheduledThread");
            }
        });

    public PullMessageService(MQClientInstance mQClientFactory) {
        this.mQClientFactory = mQClientFactory;
    }

    public void executePullRequestLater(final PullRequest pullRequest, final long timeDelay) {
        if (!isStopped()) {
            this.scheduledExecutorService.schedule(new Runnable() {
                @Override
                public void run() {
                    PullMessageService.this.executePullRequestImmediately(pullRequest);
                }
            }, timeDelay, TimeUnit.MILLISECONDS);
        } else {
            log.warn("PullMessageServiceScheduledThread has shutdown");
        }
    }

    public void executePullRequestImmediately(final PullRequest pullRequest) {
        try {
            this.pullRequestQueue.put(pullRequest);
        } catch (InterruptedException e) {
            log.error("executePullRequestImmediately pullRequestQueue.put", e);
        }
    }

    public void executeTaskLater(final Runnable r, final long timeDelay) {
        if (!isStopped()) {
            this.scheduledExecutorService.schedule(r, timeDelay, TimeUnit.MILLISECONDS);
        } else {
            log.warn("PullMessageServiceScheduledThread has shutdown");
        }
    }

    public ScheduledExecutorService getScheduledExecutorService() {
        return scheduledExecutorService;
    }

    private void pullMessage(final PullRequest pullRequest) {
        final MQConsumerInner consumer = this.mQClientFactory.selectConsumer(pullRequest.getConsumerGroup());
        if (consumer != null) {
            DefaultMQPushConsumerImpl impl = (DefaultMQPushConsumerImpl) consumer;
            impl.pullMessage(pullRequest);
        } else {
            log.warn("No matched consumer for the PullRequest {}, drop it", pullRequest);
        }
    }

    @Override
    public void run() {
        log.info(this.getServiceName() + " service started");

        while (!this.isStopped()) {
            try {
                // take a pullRequest from the blocking queue
                PullRequest pullRequest = this.pullRequestQueue.take();
                this.pullMessage(pullRequest);
            } catch (InterruptedException ignored) {
            } catch (Exception e) {
                log.error("Pull Message Service Run Method exception", e);
            }
        }

        log.info(this.getServiceName() + " service end");
    }

    @Override
    public void shutdown(boolean interrupt) {
        super.shutdown(interrupt);
        ThreadUtils.shutdownGracefully(this.scheduledExecutorService, 1000, TimeUnit.MILLISECONDS);
    }

    @Override
    public String getServiceName() {
        return PullMessageService.class.getSimpleName();
    }

}

Each queue of a topic has exactly one corresponding PullRequest

public class PullRequest {
    private String consumerGroup;
    private MessageQueue messageQueue;
    private ProcessQueue processQueue;
    private long nextOffset;
    private boolean lockedFirst = false;
}

nextOffset is the position in the messageQueue from which the next pull should start; here nextOffset is the messageQueue's queue offset (0, 1, 2, ...).
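
Since nextOffset is a queue offset, here is a short, self-contained sketch of how a queue offset maps to a ConsumeQueue index entry (each entry is a fixed 20 bytes: 8-byte CommitLog physical offset + 4-byte message size + 8-byte tag hashcode; the buffer contents below are made up):

import java.nio.ByteBuffer;

public class ConsumeQueueEntrySketch {
    static final int CQ_STORE_UNIT_SIZE = 20;   // size of one ConsumeQueue index entry

    public static void main(String[] args) {
        // Pretend this buffer is the mapped ConsumeQueue file.
        ByteBuffer cq = ByteBuffer.allocate(CQ_STORE_UNIT_SIZE * 3);
        cq.putLong(123456L).putInt(180).putLong("TagA".hashCode()); // queue offset 0
        cq.putLong(123636L).putInt(200).putLong("TagB".hashCode()); // queue offset 1

        long queueOffset = 1;                                       // what PullRequest.nextOffset means
        cq.position((int) (queueOffset * CQ_STORE_UNIT_SIZE));      // locate the entry directly
        long physicalOffset = cq.getLong();                         // where the message sits in the CommitLog
        int size = cq.getInt();                                     // how many bytes to read there
        long tagsCode = cq.getLong();                               // used for broker-side tag filtering
        System.out.printf("physicalOffset=%d size=%d tagsCode=%d%n", physicalOffset, size, tagsCode);
    }
}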

public class DefaultMQPushConsumerImpl implements MQConsumerInner {

    public void pullMessage(final PullRequest pullRequest) {

        /*
        * Every pullRequest carries its own red-black tree;
        * after a pullRequest brings messages back from the broker, it drops them into that tree.
        * */
        final ProcessQueue processQueue = pullRequest.getProcessQueue();
        if (processQueue.isDropped()) {
            log.info("the pull request[{}] is dropped.", pullRequest.toString());
            return;
        }

        pullRequest.getProcessQueue().setLastPullTimestamp(System.currentTimeMillis());

        try {
            this.makeSureStateOK();
        } catch (MQClientException e) {
            log.warn("pullMessage exception, consumer state not ok", e);
            this.executePullRequestLater(pullRequest, pullTimeDelayMillsWhenException);
            return;
        }

        if (this.isPause()) {
            log.warn("consumer was paused, execute pull request later. instanceName={}, group={}", this.defaultMQPushConsumer.getInstanceName(), this.defaultMQPushConsumer.getConsumerGroup());
            this.executePullRequestLater(pullRequest, PULL_TIME_DELAY_MILLS_WHEN_SUSPEND);
            return;
        }

        long cachedMessageCount = processQueue.getMsgCount().get();
        long cachedMessageSizeInMiB = processQueue.getMsgSize().get() / (1024 * 1024);

        /*
        * Flow control: the consumer-side red-black-tree queue may hold at most 1000 backlogged messages.
        *
        * When the flow-control check fails, we return right away without ever
        * calling into the broker, which relieves pressure on the broker.
        * */
        if (cachedMessageCount > this.defaultMQPushConsumer.getPullThresholdForQueue()) {
            this.executePullRequestLater(pullRequest, PULL_TIME_DELAY_MILLS_WHEN_FLOW_CONTROL);
            if ((queueFlowControlTimes++ % 1000) == 0) {
                log.warn( "the cached message count exceeds the threshold {}, so do flow control, " +
                                "minOffset={}, maxOffset={}, count={}, size={} MiB, pullRequest={}, flowControlTimes={}",
                    this.defaultMQPushConsumer.getPullThresholdForQueue(), processQueue.getMsgTreeMap().firstKey(),
                        processQueue.getMsgTreeMap().lastKey(), cachedMessageCount, cachedMessageSizeInMiB, pullRequest, queueFlowControlTimes);
            }
            return;
        }

        /*
        * The total size of messages backlogged in the process queue exceeds 100 MB
        * */
        if (cachedMessageSizeInMiB > this.defaultMQPushConsumer.getPullThresholdSizeForQueue()) {
            this.executePullRequestLater(pullRequest, PULL_TIME_DELAY_MILLS_WHEN_FLOW_CONTROL);
            if ((queueFlowControlTimes++ % 1000) == 0) {
                log.warn( "the cached message size exceeds the threshold {} MiB, so do flow control," +
                                " minOffset={}, maxOffset={}, count={}, size={} MiB, pullRequest={}, flowControlTimes={}",
                    this.defaultMQPushConsumer.getPullThresholdSizeForQueue(), processQueue.getMsgTreeMap().firstKey(),
                        processQueue.getMsgTreeMap().lastKey(), cachedMessageCount, cachedMessageSizeInMiB, pullRequest, queueFlowControlTimes);
            }
            return;
        }

        if (!this.consumeOrderly) {
            /*
            * For non-ordered consumption:
            * even if fewer than 1000 messages are backlogged, flow control also kicks in when the gap
            * between the largest and smallest offsets in the red-black-tree queue exceeds 2000
            * */
            if (processQueue.getMaxSpan() > this.defaultMQPushConsumer.getConsumeConcurrentlyMaxSpan()) {
                this.executePullRequestLater(pullRequest, PULL_TIME_DELAY_MILLS_WHEN_FLOW_CONTROL);
                if ((queueMaxSpanFlowControlTimes++ % 1000) == 0) {
                    log.warn( "the queue's messages, span too long, so do flow control, minOffset={}, maxOffset={}, maxSpan={}, pullRequest={}, flowControlTimes={}",
                        processQueue.getMsgTreeMap().firstKey(), processQueue.getMsgTreeMap().lastKey(), processQueue.getMaxSpan(),
                        pullRequest, queueMaxSpanFlowControlTimes);
                }
                return;
            }
        } else {
            if (processQueue.isLocked()) {
                if (!pullRequest.isLockedFirst()) {
                    final long offset = this.rebalanceImpl.computePullFromWhere(pullRequest.getMessageQueue());
                    boolean brokerBusy = offset < pullRequest.getNextOffset();
                    log.info("the first time to pull message, so fix offset from broker. pullRequest: {} NewOffset: {} brokerBusy: {}",
                        pullRequest, offset, brokerBusy);
                    if (brokerBusy) {
                        log.info("[NOTIFYME]the first time to pull message, but pull request offset larger than broker consume offset. pullRequest: {} NewOffset: {}",
                            pullRequest, offset);
                    }

                    pullRequest.setLockedFirst(true);
                    pullRequest.setNextOffset(offset);
                }
            } else {
                this.executePullRequestLater(pullRequest, pullTimeDelayMillsWhenException);
                log.info("pull message later because not locked in broker, {}", pullRequest);
                return;
            }
        }

        final SubscriptionData subscriptionData = this.rebalanceImpl.getSubscriptionInner().get(pullRequest.getMessageQueue().getTopic());
        if (null == subscriptionData) {
            this.executePullRequestLater(pullRequest, pullTimeDelayMillsWhenException);
            log.warn("find the consumer's subscription failed, {}", pullRequest);
            return;
        }

        final long beginTimestamp = System.currentTimeMillis();

        PullCallback pullCallback = new PullCallback() {

            /*
            * Here we receive the pullResult the broker sent back to the client.
            *
            * After channelRead() receives the pullResult from the broker, it hands it to the
            * NettyClientPublicExecutor thread pool; that pool is effectively the callback thread,
            * and depending on success or failure it invokes one of the methods below:
            * 1. a NettyClientPublicExecutor thread deserializes the contents of pullResult into messages (typically 32)
            * 2. it first puts this batch of messages into the red-black tree
            * 3. it then submits the batch to the thread pool dedicated to consumption
            */
            @Override
            public void onSuccess(PullResult pullResult) {
                if (pullResult != null) {
                    /*
                    * 1. The client sent a pull request; a FOUND status from the broker means messages were found.
                    *
                    * What the broker actually sends back is a binary stream, while the application programmer
                    * sees a List<MessageExt> inside the listener; that conversion happens in processPullResult.
                    * */
                    pullResult = DefaultMQPushConsumerImpl.this.pullAPIWrapper.processPullResult(
                            pullRequest.getMessageQueue(),
                            pullResult,
                            subscriptionData);

                    switch (pullResult.getPullStatus()) {
                        /*
                         * 1. Take the message response returned by the broker and deserialize it first
                         * */
                        case FOUND:
                            /*
                            * getNextBeginOffset is the offset of the next batch to pull; pulling then continues from there
                            * */
                            long prevRequestOffset = pullRequest.getNextOffset();
                            pullRequest.setNextOffset(pullResult.getNextBeginOffset());
                            long pullRT = System.currentTimeMillis() - beginTimestamp;
                            DefaultMQPushConsumerImpl.this.getConsumerStatsManager().incPullRT(pullRequest.getConsumerGroup(),
                                pullRequest.getMessageQueue().getTopic(), pullRT);

                            long firstMsgOffset = Long.MAX_VALUE;
                            if (pullResult.getMsgFoundList() == null || pullResult.getMsgFoundList().isEmpty()) {
                                DefaultMQPushConsumerImpl.this.executePullRequestImmediately(pullRequest);
                            } else {
                                firstMsgOffset = pullResult.getMsgFoundList().get(0).getQueueOffset();

                                DefaultMQPushConsumerImpl.this.getConsumerStatsManager().incPullTPS(pullRequest.getConsumerGroup(),
                                    pullRequest.getMessageQueue().getTopic(), pullResult.getMsgFoundList().size());

                                /*
                                * Each queue of a topic is bound to exactly one pullRequest object.
                                *
                                * Every pullRequest carries its own red-black tree; once it has messages, it drops them
                                * into that tree. What processQueue actually holds is a TreeMap (a red-black tree).
                                *
                                * Put the messages pulled back from the broker (32 per pull by default) into the processQueue
                                * */
                                boolean dispatchToConsume = processQueue.putMessage(pullResult.getMsgFoundList());

                                /*
                                * Submit the 32 pulled messages to the consume thread pool
                                * */
                                DefaultMQPushConsumerImpl.this.consumeMessageService.submitConsumeRequest(
                                    pullResult.getMsgFoundList(),
                                    processQueue,
                                    pullRequest.getMessageQueue(),
                                    dispatchToConsume);

                                /*
                                * Put the pullRequest, with its nextOffset updated, back into the pullRequestQueue
                                * */
                                if (DefaultMQPushConsumerImpl.this.defaultMQPushConsumer.getPullInterval() > 0) {
                                    DefaultMQPushConsumerImpl.this.executePullRequestLater(pullRequest,
                                        DefaultMQPushConsumerImpl.this.defaultMQPushConsumer.getPullInterval());
                                } else {
                                    DefaultMQPushConsumerImpl.this.executePullRequestImmediately(pullRequest);
                                }
                            }

                            if (pullResult.getNextBeginOffset() < prevRequestOffset
                                || firstMsgOffset < prevRequestOffset) {
                                log.warn(
                                    "[BUG] pull message result maybe data wrong, nextBeginOffset: {} firstMsgOffset: {} prevRequestOffset: {}",
                                    pullResult.getNextBeginOffset(),
                                    firstMsgOffset,
                                    prevRequestOffset);
                            }

                            break;
                        case NO_NEW_MSG:
                            pullRequest.setNextOffset(pullResult.getNextBeginOffset());

                            DefaultMQPushConsumerImpl.this.correctTagsOffset(pullRequest);

                            DefaultMQPushConsumerImpl.this.executePullRequestImmediately(pullRequest);
                            break;
                        case NO_MATCHED_MSG:
                            pullRequest.setNextOffset(pullResult.getNextBeginOffset());

                            DefaultMQPushConsumerImpl.this.correctTagsOffset(pullRequest);

                            DefaultMQPushConsumerImpl.this.executePullRequestImmediately(pullRequest);
                            break;
                        case OFFSET_ILLEGAL:
                            log.warn("the pull request offset illegal, {} {}",
                                pullRequest.toString(), pullResult.toString());
                            pullRequest.setNextOffset(pullResult.getNextBeginOffset());

                            pullRequest.getProcessQueue().setDropped(true);
                            DefaultMQPushConsumerImpl.this.executeTaskLater(new Runnable() {

                                @Override
                                public void run() {
                                    try {
                                        DefaultMQPushConsumerImpl.this.offsetStore.updateOffset(pullRequest.getMessageQueue(),
                                            pullRequest.getNextOffset(), false);

                                        DefaultMQPushConsumerImpl.this.offsetStore.persist(pullRequest.getMessageQueue());

                                        DefaultMQPushConsumerImpl.this.rebalanceImpl.removeProcessQueue(pullRequest.getMessageQueue());

                                        log.warn("fix the pull request offset, {}", pullRequest);
                                    } catch (Throwable e) {
                                        log.error("executeTaskLater Exception", e);
                                    }
                                }
                            }, 10000);
                            break;
                        default:
                            break;
                    }
                }
            }

            @Override
            public void onException(Throwable e) {
                if (!pullRequest.getMessageQueue().getTopic().startsWith(MixAll.RETRY_GROUP_TOPIC_PREFIX)) {
                    log.warn("execute the pull request exception", e);
                }

                DefaultMQPushConsumerImpl.this.executePullRequestLater(pullRequest, pullTimeDelayMillsWhenException);
            }
        };

        boolean commitOffsetEnable = false;
        long commitOffsetValue = 0L;
        if (MessageModel.CLUSTERING == this.defaultMQPushConsumer.getMessageModel()) {
            commitOffsetValue = this.offsetStore.readOffset(pullRequest.getMessageQueue(), ReadOffsetType.READ_FROM_MEMORY);
            if (commitOffsetValue > 0) {
                commitOffsetEnable = true;
            }
        }

        String subExpression = null;
        boolean classFilter = false;
        SubscriptionData sd = this.rebalanceImpl.getSubscriptionInner().get(pullRequest.getMessageQueue().getTopic());
        if (sd != null) {
            if (this.defaultMQPushConsumer.isPostSubscriptionWhenPull() && !sd.isClassFilterMode()) {
                subExpression = sd.getSubString();
            }

            classFilter = sd.isClassFilterMode();
        }

        int sysFlag = PullSysFlag.buildSysFlag(
            commitOffsetEnable, // commitOffset
            true, // suspend
            subExpression != null, // subscription
            classFilter // class filter
        );
        try {

            /*
            * The queue id and nextOffset are passed to the broker here;
            * on the broker side, PullMessageProcessor#processRequest() receives the pull request
            * */
            this.pullAPIWrapper.pullKernelImpl(

                    /*
                    * Tell the broker which topic and which MessageQueue to take messages from
                    * */
                pullRequest.getMessageQueue(),
                subExpression,
                subscriptionData.getExpressionType(),
                subscriptionData.getSubVersion(),
                /*
                * Tell the broker which index entry of that MessageQueue to read, i.e. which queue offset (entry 1, entry 2, ...);
                * from the index entry the broker obtains the global physical offset + message size
                * */
                pullRequest.getNextOffset(),
                this.defaultMQPushConsumer.getPullBatchSize(),
                sysFlag,
                /*
                * Consumption progress is also reported here, i.e. every pull also carries the progress.
                *
                * In other words, progress is reported through two channels:
                * 1. along with each pull request, here
                * 2. every persistConsumerOffsetInterval = 1000 * 5 ms, the progress is updated in the offsetTable
                *    of the broker-side ConsumerOffsetManager
                * */
                commitOffsetValue,
                /*
                * Maximum timeout of the long poll: the broker must not hold the consumer's pull request indefinitely,
                * otherwise, once it holds too many requests, the broker would run out of memory
                * (setting a timeout is an important safety net for the system)
                * */
                BROKER_SUSPEND_MAX_TIME_MILLIS,
                CONSUMER_TIMEOUT_MILLIS_WHEN_SUSPEND,
                CommunicationMode.ASYNC,
                /*
                Send the pull request to the broker, passing a pullCallback along with it
                */
                pullCallback
            );
        } catch (Exception e) {
            log.error("pullKernelImpl exception", e);
            this.executePullRequestLater(pullRequest, pullTimeDelayMillsWhenException);
        }
    }

}

Message decoding and client-side re-filtering

public class PullAPIWrapper {

    public PullResult processPullResult(final MessageQueue mq, final PullResult pullResult,
        final SubscriptionData subscriptionData) {
        PullResultExt pullResultExt = (PullResultExt) pullResult;

        this.updatePullFromWhichNode(mq, pullResultExt.getSuggestWhichBrokerId());
        if (PullStatus.FOUND == pullResult.getPullStatus()) {
            /*
            * decodes: deserialization
            *
            * Decode the binary-stream messages the broker sent back over the network into individual messages
            * */
            ByteBuffer byteBuffer = ByteBuffer.wrap(pullResultExt.getMessageBinary());
            List<MessageExt> msgList = MessageDecoder.decodes(byteBuffer);

            /*
            * Second-pass filtering of messages (client-side filtering)
            *
            * Unlike the broker side, which compares tag hashCode values, the client compares the tag strings themselves
            * */
            List<MessageExt> msgListFilterAgain = msgList;
            if (!subscriptionData.getTagsSet().isEmpty() && !subscriptionData.isClassFilterMode()) {
                msgListFilterAgain = new ArrayList<MessageExt>(msgList.size());
                for (MessageExt msg : msgList) {
                    if (msg.getTags() != null) {
                        // compare by the tag itself
                        if (subscriptionData.getTagsSet().contains(msg.getTags())) {
                            msgListFilterAgain.add(msg);
                        }
                    }
                }
            }

            if (this.hasHook()) {
                FilterMessageContext filterMessageContext = new FilterMessageContext();
                filterMessageContext.setUnitMode(unitMode);
                filterMessageContext.setMsgList(msgListFilterAgain);
                this.executeHook(filterMessageContext);
            }

            for (MessageExt msg : msgListFilterAgain) {
                String traFlag = msg.getProperty(MessageConst.PROPERTY_TRANSACTION_PREPARED);
                if (Boolean.parseBoolean(traFlag)) {
                    msg.setTransactionId(msg.getProperty(MessageConst.PROPERTY_UNIQ_CLIENT_MESSAGE_ID_KEYIDX));
                }
                MessageAccessor.putProperty(msg, MessageConst.PROPERTY_MIN_OFFSET,
                    Long.toString(pullResult.getMinOffset()));
                MessageAccessor.putProperty(msg, MessageConst.PROPERTY_MAX_OFFSET,
                    Long.toString(pullResult.getMaxOffset()));
            }

            pullResultExt.setMsgFoundList(msgListFilterAgain);
        }

        pullResultExt.setMessageBinary(null);

        return pullResult;
    }

Structure of the pull result class

public class PullResult {
    private final PullStatus pullStatus;

    private final long nextBeginOffset;

    private final long minOffset;
    private final long maxOffset;

    private List<MessageExt> msgFoundList;
}

Server side

package org.apache.rocketmq.remoting.netty;

public abstract class NettyRemotingAbstract {

    /**
     * Entry of incoming command processing.
     *
     * Both NettyRemotingClient and NettyRemotingServer extend NettyRemotingAbstract,
     * and both of them call this processMessageReceived() method.
     */
    public void processMessageReceived(ChannelHandlerContext ctx, RemotingCommand msg) throws Exception {
        final RemotingCommand cmd = msg;
        if (cmd != null) {
            switch (cmd.getType()) {
                case REQUEST_COMMAND:
                    processRequestCommand(ctx, cmd);
                    break;
                case RESPONSE_COMMAND:
                    processResponseCommand(ctx, cmd);
                    break;
                default:
                    break;
            }
        }
    }
}

Where network requests are routed and dispatched

public abstract class NettyRemotingAbstract {

    /**
     * Process incoming request command issued by remote peer.
     *
     * @param ctx channel handler context.
     * @param cmd request command.
     */
    public void processRequestCommand(final ChannelHandlerContext ctx, final RemotingCommand cmd) {
        final Pair<NettyRequestProcessor, ExecutorService> matched = this.processorTable.get(cmd.getCode());
        final Pair<NettyRequestProcessor, ExecutorService> pair = null == matched ? this.defaultRequestProcessor : matched;
        final int opaque = cmd.getOpaque();

        if (pair != null) {
            Runnable run = new Runnable() {
                @Override
                public void run() {
                    try {
                        doBeforeRpcHooks(RemotingHelper.parseChannelRemoteAddr(ctx.channel()), cmd);
                        /*
                        * Depending on the request type (the requestCode), a different subclass of
                        * NettyRequestProcessor is chosen to execute the request.
                        *
                        * For example, a pull request is handled by PullMessageProcessor#processRequest();
                        * ultimately, though, the ExecutorService inside Pair<NettyRequestProcessor, ExecutorService>
                        * is what actually runs the request.
                        *
                        * This also means PullMessageProcessor#processRequest() lives inside a thread pool,
                        * i.e. it is called by multiple threads, so any containers inside it can have thread-safety issues.
                        * */
                        final RemotingCommand response = pair.getObject1().processRequest(ctx, cmd);
                        doAfterRpcHooks(RemotingHelper.parseChannelRemoteAddr(ctx.channel()), cmd, response);

                        if (!cmd.isOnewayRPC()) {
                            if (response != null) {
                                response.setOpaque(opaque);
                                response.markResponseType();
                                try {
                                    ctx.writeAndFlush(response);
                                } catch (Throwable e) {
                                    log.error("process request over, but response failed", e);
                                    log.error(cmd.toString());
                                    log.error(response.toString());
                                }
                            } else {

                            }
                        }
                    } catch (Throwable e) {
                        log.error("process request exception", e);
                        log.error(cmd.toString());

                        if (!cmd.isOnewayRPC()) {
                            final RemotingCommand response = RemotingCommand.createResponseCommand(RemotingSysResponseCode.SYSTEM_ERROR,
                                RemotingHelper.exceptionSimpleDesc(e));
                            response.setOpaque(opaque);
                            ctx.writeAndFlush(response);
                        }
                    }
                }
            };

            if (pair.getObject1().rejectRequest()) {
                final RemotingCommand response = RemotingCommand.createResponseCommand(RemotingSysResponseCode.SYSTEM_BUSY,
                    "[REJECTREQUEST]system busy, start flow control for a while");
                response.setOpaque(opaque);
                ctx.writeAndFlush(response);
                return;
            }

            try {
                /*
                * After receiving the client's pull request, NettyRemotingAbstract#processRequestCommand()
                * repackages the request into a runnable and drops it into the pullMessageExecutor thread pool.
                *
                * In other words, the netty thread does not process the pull request itself;
                * it hands the request over to a business thread pool so that the netty thread stays lightweight.
                *
                * The thread pool used here is the code-to-pool pairing registered in BrokerController#registerProcessor()
                * */
                final RequestTask requestTask = new RequestTask(run, ctx.channel(), cmd);
                pair.getObject2().submit(requestTask);

            } catch (RejectedExecutionException e) {
                if ((System.currentTimeMillis() % 10000) == 0) {
                    log.warn(RemotingHelper.parseChannelRemoteAddr(ctx.channel())
                        + ", too many requests and system thread pool busy, RejectedExecutionException "
                        + pair.getObject2().toString()
                        + " request code: " + cmd.getCode());
                }

                if (!cmd.isOnewayRPC()) {
                    final RemotingCommand response = RemotingCommand.createResponseCommand(RemotingSysResponseCode.SYSTEM_BUSY,
                        "[OVERLOAD]system busy, start flow control for a while");
                    response.setOpaque(opaque);
                    ctx.writeAndFlush(response);
                }
            }
        } else {
            String error = " request type " + cmd.getCode() + " not supported";
            final RemotingCommand response =
                RemotingCommand.createResponseCommand(RemotingSysResponseCode.REQUEST_CODE_NOT_SUPPORTED, error);
            response.setOpaque(opaque);
            ctx.writeAndFlush(response);
            log.error(RemotingHelper.parseChannelRemoteAddr(ctx.channel()) + error);
        }
    }

    /**
     * Process response from remote peer to the previous issued requests.
     *
     * @param ctx channel handler context.
     * @param cmd response command instance.
     */
    public void processResponseCommand(ChannelHandlerContext ctx, RemotingCommand cmd) {
        final int opaque = cmd.getOpaque();
        final ResponseFuture responseFuture = responseTable.get(opaque);
        if (responseFuture != null) {
            responseFuture.setResponseCommand(cmd);

            responseTable.remove(opaque);

            if (responseFuture.getInvokeCallback() != null) {
                executeInvokeCallback(responseFuture);
            } else {
                responseFuture.putResponse(cmd);
                responseFuture.release();
            }
        } else {
            log.warn("receive response, but not matched any request, " + RemotingHelper.parseChannelRemoteAddr(ctx.channel()));
            log.warn(cmd.toString());
        }
    }

}

Requests are forwarded according to their request code


Since these XxxxxProcessor classes are used here, there must be a place where they are registered: in BrokerController's initialization logic.
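
A condensed sketch of what that registration looks like (abridged from BrokerController#registerProcessor(); only two of the many code-to-processor bindings are shown):

    public void registerProcessor() {
        // Each request code gets its own processor plus its own isolated thread pool,
        // which together form the Pair<NettyRequestProcessor, ExecutorService> looked up above.
        SendMessageProcessor sendProcessor = new SendMessageProcessor(this);
        this.remotingServer.registerProcessor(RequestCode.SEND_MESSAGE, sendProcessor, this.sendMessageExecutor);

        this.remotingServer.registerProcessor(RequestCode.PULL_MESSAGE, this.pullMessageProcessor, this.pullMessageExecutor);
        // ... other codes (QUERY_MESSAGE, HEART_BEAT, UPDATE_CONSUMER_OFFSET, ...) follow the same pattern
    }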


public class BrokerController {
    private static final InternalLogger log = InternalLoggerFactory.getLogger(LoggerName.BROKER_LOGGER_NAME);
    private static final InternalLogger LOG_PROTECTION = InternalLoggerFactory.getLogger(LoggerName.PROTECTION_LOGGER_NAME);
    private static final InternalLogger LOG_WATER_MARK = InternalLoggerFactory.getLogger(LoggerName.WATER_MARK_LOGGER_NAME);
    private final BrokerConfig brokerConfig;
    private final NettyServerConfig nettyServerConfig;
    private final NettyClientConfig nettyClientConfig;
    private final MessageStoreConfig messageStoreConfig;

    /*
    * Manages the consumption progress of topic queues
    * */
    private final ConsumerOffsetManager consumerOffsetManager;
    private final ConsumerManager consumerManager;
    private final ConsumerFilterManager consumerFilterManager;
    private final ProducerManager producerManager;
    private final ClientHousekeepingService clientHousekeepingService;
    private final PullMessageProcessor pullMessageProcessor;
    private final PullRequestHoldService pullRequestHoldService;
    private final MessageArrivingListener messageArrivingListener;
    private final Broker2Client broker2Client;
    private final SubscriptionGroupManager subscriptionGroupManager;
    private final ConsumerIdsChangeListener consumerIdsChangeListener;
    private final RebalanceLockManager rebalanceLockManager = new RebalanceLockManager();
    private final BrokerOuterAPI brokerOuterAPI;
    private final ScheduledExecutorService scheduledExecutorService = Executors.newSingleThreadScheduledExecutor(new ThreadFactoryImpl(
        "BrokerControllerScheduledThread"));
    private final SlaveSynchronize slaveSynchronize;

    /*
     * The various per-business blocking queues, matching the isolated business thread pools below
     * */
    private final BlockingQueue<Runnable> sendThreadPoolQueue;
    private final BlockingQueue<Runnable> pullThreadPoolQueue;
    private final BlockingQueue<Runnable> replyThreadPoolQueue;
    private final BlockingQueue<Runnable> queryThreadPoolQueue;
    private final BlockingQueue<Runnable> clientManagerThreadPoolQueue;
    private final BlockingQueue<Runnable> heartbeatThreadPoolQueue;
    private final BlockingQueue<Runnable> consumerManagerThreadPoolQueue;
    private final BlockingQueue<Runnable> endTransactionThreadPoolQueue;

    private final FilterServerManager filterServerManager;
    private final BrokerStatsManager brokerStatsManager;
    private final List<SendMessageHook> sendMessageHookList = new ArrayList<SendMessageHook>();
    private final List<ConsumeMessageHook> consumeMessageHookList = new ArrayList<ConsumeMessageHook>();
    private MessageStore messageStore;
    private RemotingServer remotingServer;
    private RemotingServer fastRemotingServer;
    private TopicConfigManager topicConfigManager;

    /*
    * The various isolated business thread pools
    * */
    private ExecutorService sendMessageExecutor;
    private ExecutorService pullMessageExecutor;
    private ExecutorService replyMessageExecutor;
    private ExecutorService queryMessageExecutor;
    private ExecutorService adminBrokerExecutor;
    private ExecutorService clientManageExecutor;
    private ExecutorService heartbeatExecutor;
    private ExecutorService consumerManageExecutor;
    private ExecutorService endTransactionExecutor;

    private boolean updateMasterHAServerAddrPeriodically = false;
    private BrokerStats brokerStats;
    private InetSocketAddress storeHost;
    private BrokerFastFailure brokerFastFailure;
    private Configuration configuration;
    private FileWatchService fileWatchService;
    private TransactionalMessageCheckService transactionalMessageCheckService;
    private TransactionalMessageService transactionalMessageService;
    private AbstractTransactionalMessageCheckListener transactionalMessageCheckListener;
    private Future<?> slaveSyncFuture;
    private Map<Class,AccessValidator> accessValidatorMap = new HashMap<Class, AccessValidator>();

    public BrokerController(
        final BrokerConfig brokerConfig,
        final NettyServerConfig nettyServerConfig,
        final NettyClientConfig nettyClientConfig,
        final MessageStoreConfig messageStoreConfig
    ) {
        this.brokerConfig = brokerConfig;
        this.nettyServerConfig = nettyServerConfig;
        this.nettyClientConfig = nettyClientConfig;
        this.messageStoreConfig = messageStoreConfig;
        this.consumerOffsetManager = new ConsumerOffsetManager(this);

        /*
        * autoCreateTopicEnable mechanism - Step 1: during broker startup a TopicConfigManager is built; its constructor
        * first checks whether automatic topic creation is enabled, and if so adds the routing info of the default topic
        * to topicConfigTable.
        *
        * The routing info held in the broker's topic config manager is, on one hand, reported to the Nameserver via
        * heartbeats, and on the other hand periodically persisted on the broker under
        * ${ROCKET_HOME}/store/config/topics.json, so the routing info is not lost when the broker is restarted.
        * */
        this.topicConfigManager = new TopicConfigManager(this);
        this.pullMessageProcessor = new PullMessageProcessor(this);
        this.pullRequestHoldService = new PullRequestHoldService(this);
        this.messageArrivingListener = new NotifyMessageArrivingListener(this.pullRequestHoldService);
        this.consumerIdsChangeListener = new DefaultConsumerIdsChangeListener(this);
        this.consumerManager = new ConsumerManager(this.consumerIdsChangeListener);
        this.consumerFilterManager = new ConsumerFilterManager(this);
        this.producerManager = new ProducerManager();
        this.clientHousekeepingService = new ClientHousekeepingService(this);
        this.broker2Client = new Broker2Client(this);
        this.subscriptionGroupManager = new SubscriptionGroupManager(this);
        this.brokerOuterAPI = new BrokerOuterAPI(nettyClientConfig);
        this.filterServerManager = new FilterServerManager(this);

        this.slaveSynchronize = new SlaveSynchronize(this);

        /*
        * Initialize the various business blocking queues
        * */
        this.sendThreadPoolQueue = new LinkedBlockingQueue<Runnable>(this.brokerConfig.getSendThreadPoolQueueCapacity());
        this.pullThreadPoolQueue = new LinkedBlockingQueue<Runnable>(this.brokerConfig.getPullThreadPoolQueueCapacity());
        this.replyThreadPoolQueue = new LinkedBlockingQueue<Runnable>(this.brokerConfig.getReplyThreadPoolQueueCapacity());
        this.queryThreadPoolQueue = new LinkedBlockingQueue<Runnable>(this.brokerConfig.getQueryThreadPoolQueueCapacity());
        this.clientManagerThreadPoolQueue = new LinkedBlockingQueue<Runnable>(this.brokerConfig.getClientManagerThreadPoolQueueCapacity());
        this.consumerManagerThreadPoolQueue = new LinkedBlockingQueue<Runnable>(this.brokerConfig.getConsumerManagerThreadPoolQueueCapacity());
        this.heartbeatThreadPoolQueue = new LinkedBlockingQueue<Runnable>(this.brokerConfig.getHeartbeatThreadPoolQueueCapacity());
        this.endTransactionThreadPoolQueue = new LinkedBlockingQueue<Runnable>(this.brokerConfig.getEndTransactionPoolQueueCapacity());

        this.brokerStatsManager = new BrokerStatsManager(this.brokerConfig.getBrokerClusterName());
        this.setStoreHost(new InetSocketAddress(this.getBrokerConfig().getBrokerIP1(), this.getNettyServerConfig().getListenPort()));

        /*
        * Initialize the fast-failure component, passing in the brokerController itself
        * */
        this.brokerFastFailure = new BrokerFastFailure(this);
        this.configuration = new Configuration(
            log,
            BrokerPathConfigHelper.getBrokerConfigPath(),
            this.brokerConfig, this.nettyServerConfig, this.nettyClientConfig, this.messageStoreConfig
        );
    }

}

Broker initialization logic: BrokerController#initialize()


public class BrokerController {

    public boolean initialize() throws CloneNotSupportedException {
        boolean result = this.topicConfigManager.load();

        result = result && this.consumerOffsetManager.load();
        result = result && this.subscriptionGroupManager.load();
        result = result && this.consumerFilterManager.load();

        if (result) {
            try {
                this.messageStore = new DefaultMessageStore(this.messageStoreConfig, this.brokerStatsManager, this.messageArrivingListener,
                        this.brokerConfig);
                if (messageStoreConfig.isEnableDLegerCommitLog()) {
                    DLedgerRoleChangeHandler roleChangeHandler = new DLedgerRoleChangeHandler(this, (DefaultMessageStore) messageStore);
                    ((DLedgerCommitLog)((DefaultMessageStore) messageStore).getCommitLog()).getdLedgerServer().getdLedgerLeaderElector().addRoleChangeHandler(roleChangeHandler);
                }
                this.brokerStats = new BrokerStats((DefaultMessageStore) this.messageStore);
                //load plugin
                MessageStorePluginContext context = new MessageStorePluginContext(messageStoreConfig, brokerStatsManager, messageArrivingListener, brokerConfig);
                this.messageStore = MessageStoreFactory.build(context, this.messageStore);
                this.messageStore.getDispatcherList().addFirst(new CommitLogDispatcherCalcBitMap(this.brokerConfig, this.consumerFilterManager));
            } catch (IOException e) {
                result = false;
                log.error("Failed to initialize", e);
            }
        }

        result = result && this.messageStore.load();

        if (result) {
            this.remotingServer = new NettyRemotingServer(this.nettyServerConfig, this.clientHousekeepingService);
            NettyServerConfig fastConfig = (NettyServerConfig) this.nettyServerConfig.clone();
            fastConfig.setListenPort(nettyServerConfig.getListenPort() - 2);
            this.fastRemotingServer = new NettyRemotingServer(fastConfig, this.clientHousekeepingService);

            /*
            * Presumably this pool is used to call SendMessageProcessor#processRequest() on multiple threads
            * to persist single or batched messages.
            *
            * A producer sends a message-write request to the broker; the broker first puts the request into a queue
            * (sendThreadPoolQueue), default capacity 10000. A dedicated thread pool (sendMessageExecutor) then takes
            * tasks from that queue and performs the writes; to keep message handling ordered, this pool defaults to a single thread.
            * */
            this.sendMessageExecutor = new BrokerFixedThreadPoolExecutor(
                this.brokerConfig.getSendMessageThreadPoolNums(),
                this.brokerConfig.getSendMessageThreadPoolNums(),
                1000 * 60,
                TimeUnit.MILLISECONDS,
                this.sendThreadPoolQueue,
                new ThreadFactoryImpl("SendMessageThread_"));

            this.pullMessageExecutor = new BrokerFixedThreadPoolExecutor(
                this.brokerConfig.getPullMessageThreadPoolNums(),
                this.brokerConfig.getPullMessageThreadPoolNums(),
                1000 * 60,
                TimeUnit.MILLISECONDS,
                this.pullThreadPoolQueue,
                new ThreadFactoryImpl("PullMessageThread_"));

            this.replyMessageExecutor = new BrokerFixedThreadPoolExecutor(
                this.brokerConfig.getProcessReplyMessageThreadPoolNums(),
                this.brokerConfig.getProcessReplyMessageThreadPoolNums(),
                1000 * 60,
                TimeUnit.MILLISECONDS,
                this.replyThreadPoolQueue,
                new ThreadFactoryImpl("ProcessReplyMessageThread_"));

            this.queryMessageExecutor = new BrokerFixedThreadPoolExecutor(
                this.brokerConfig.getQueryMessageThreadPoolNums(),
                this.brokerConfig.getQueryMessageThreadPoolNums(),
                1000 * 60,
                TimeUnit.MILLISECONDS,
                this.queryThreadPoolQueue,
                new ThreadFactoryImpl("QueryMessageThread_"));

            this.adminBrokerExecutor =
                Executors.newFixedThreadPool(this.brokerConfig.getAdminBrokerThreadPoolNums(), new ThreadFactoryImpl(
                    "AdminBrokerThread_"));

            this.clientManageExecutor = new ThreadPoolExecutor(
                this.brokerConfig.getClientManageThreadPoolNums(),
                this.brokerConfig.getClientManageThreadPoolNums(),
                1000 * 60,
                TimeUnit.MILLISECONDS,
                this.clientManagerThreadPoolQueue,
                new ThreadFactoryImpl("ClientManageThread_"));

            this.heartbeatExecutor = new BrokerFixedThreadPoolExecutor(
                this.brokerConfig.getHeartbeatThreadPoolNums(),
                this.brokerConfig.getHeartbeatThreadPoolNums(),
                1000 * 60,
                TimeUnit.MILLISECONDS,
                this.heartbeatThreadPoolQueue,
                new ThreadFactoryImpl("HeartbeatThread_", true));

            this.endTransactionExecutor = new BrokerFixedThreadPoolExecutor(
                this.brokerConfig.getEndTransactionThreadPoolNums(),
                this.brokerConfig.getEndTransactionThreadPoolNums(),
                1000 * 60,
                TimeUnit.MILLISECONDS,
                this.endTransactionThreadPoolQueue,
                new ThreadFactoryImpl("EndTransactionThread_"));

            this.consumerManageExecutor =
                Executors.newFixedThreadPool(this.brokerConfig.getConsumerManageThreadPoolNums(), new ThreadFactoryImpl(
                    "ConsumerManageThread_"));



            /*
            * Register the various request processors
            * */
            this.registerProcessor();



            final long initialDelay = UtilAll.computeNextMorningTimeMillis() - System.currentTimeMillis();
            final long period = 1000 * 60 * 60 * 24;
            this.scheduledExecutorService.scheduleAtFixedRate(new Runnable() {
                @Override
                public void run() {
                    try {
                        BrokerController.this.getBrokerStats().record();
                    } catch (Throwable e) {
                        log.error("schedule record error.", e);
                    }
                }
            }, initialDelay, period, TimeUnit.MILLISECONDS);

            this.scheduledExecutorService.scheduleAtFixedRate(new Runnable() {
                @Override
                public void run() {
                    try {
                        BrokerController.this.consumerOffsetManager.persist();
                    } catch (Throwable e) {
                        log.error("schedule persist consumerOffset error.", e);
                    }
                }
            }, 1000 * 10, this.brokerConfig.getFlushConsumerOffsetInterval(), TimeUnit.MILLISECONDS);

            this.scheduledExecutorService.scheduleAtFixedRate(new Runnable() {
                @Override
                public void run() {
                    try {
                        BrokerController.this.consumerFilterManager.persist();
                    } catch (Throwable e) {
                        log.error("schedule persist consumer filter error.", e);
                    }
                }
            }, 1000 * 10, 1000 * 10, TimeUnit.MILLISECONDS);

            this.scheduledExecutorService.scheduleAtFixedRate(new Runnable() {
                @Override
                public void run() {
                    try {
                        BrokerController.this.protectBroker();
                    } catch (Throwable e) {
                        log.error("protectBroker error.", e);
                    }
                }
            }, 3, 3, TimeUnit.MINUTES);

            this.scheduledExecutorService.scheduleAtFixedRate(new Runnable() {
                @Override
                public void run() {
                    try {
                        BrokerController.this.printWaterMark();
                    } catch (Throwable e) {
                        log.error("printWaterMark error.", e);
                    }
                }
            }, 10, 1, TimeUnit.SECONDS);

            this.scheduledExecutorService.scheduleAtFixedRate(new Runnable() {

                @Override
                public void run() {
                    try {
                        log.info("dispatch behind commit log {} bytes", BrokerController.this.getMessageStore().dispatchBehindBytes());
                    } catch (Throwable e) {
                        log.error("schedule dispatchBehindBytes error.", e);
                    }
                }
            }, 1000 * 10, 1000 * 60, TimeUnit.MILLISECONDS);

            if (this.brokerConfig.getNamesrvAddr() != null) {
                this.brokerOuterAPI.updateNameServerAddressList(this.brokerConfig.getNamesrvAddr());
                log.info("Set user specified name server address: {}", this.brokerConfig.getNamesrvAddr());
            } else if (this.brokerConfig.isFetchNamesrvAddrByAddressServer()) {
                this.scheduledExecutorService.scheduleAtFixedRate(new Runnable() {

                    @Override
                    public void run() {
                        try {
                            BrokerController.this.brokerOuterAPI.fetchNameServerAddr();
                        } catch (Throwable e) {
                            log.error("ScheduledTask fetchNameServerAddr exception", e);
                        }
                    }
                }, 1000 * 10, 1000 * 60 * 2, TimeUnit.MILLISECONDS);
            }

            if (!messageStoreConfig.isEnableDLegerCommitLog()) {
                if (BrokerRole.SLAVE == this.messageStoreConfig.getBrokerRole()) {
                    if (this.messageStoreConfig.getHaMasterAddress() != null
                            && this.messageStoreConfig.getHaMasterAddress().length() >= 6) {
                        this.messageStore.updateHaMasterAddress(this.messageStoreConfig.getHaMasterAddress());
                        this.updateMasterHAServerAddrPeriodically = false;
                    } else {
                        this.updateMasterHAServerAddrPeriodically = true;
                    }
                } else {
                    // 定时打印master 与 slave的差距
                    this.scheduledExecutorService.scheduleAtFixedRate(new Runnable() {
                        @Override
                        public void run() {
                            try {
                                BrokerController.this.printMasterAndSlaveDiff();
                            } catch (Throwable e) {
                                log.error("schedule printMasterAndSlaveDiff error.", e);
                            }
                        }
                    }, 1000 * 10, 1000 * 60, TimeUnit.MILLISECONDS);
                }
            }

            if (TlsSystemConfig.tlsMode != TlsMode.DISABLED) {
                // Register a listener to reload SslContext
                try {
                    fileWatchService = new FileWatchService(
                        new String[] {
                            TlsSystemConfig.tlsServerCertPath,
                            TlsSystemConfig.tlsServerKeyPath,
                            TlsSystemConfig.tlsServerTrustCertPath
                        },
                        new FileWatchService.Listener() {
                            boolean certChanged, keyChanged = false;

                            @Override
                            public void onChanged(String path) {
                                if (path.equals(TlsSystemConfig.tlsServerTrustCertPath)) {
                                    log.info("The trust certificate changed, reload the ssl context");
                                    reloadServerSslContext();
                                }
                                if (path.equals(TlsSystemConfig.tlsServerCertPath)) {
                                    certChanged = true;
                                }
                                if (path.equals(TlsSystemConfig.tlsServerKeyPath)) {
                                    keyChanged = true;
                                }
                                if (certChanged && keyChanged) {
                                    log.info("The certificate and private key changed, reload the ssl context");
                                    certChanged = keyChanged = false;
                                    reloadServerSslContext();
                                }
                            }

                            private void reloadServerSslContext() {
                                ((NettyRemotingServer) remotingServer).loadSslContext();
                                ((NettyRemotingServer) fastRemotingServer).loadSslContext();
                            }
                        });
                } catch (Exception e) {
                    log.warn("FileWatchService created error, can't load the certificate dynamically");
                }
            }
            initialTransaction();
            initialAcl();
            initialRpcHooks();
        }
        return result;
    }
}

实际注册的各种处理器

public class BrokerController {

    /**
     * broker在启动时,给每种命令,注册了对应的处理器
     * */
    public void registerProcessor() {
        /**
         * SendMessageProcessor
         */
        SendMessageProcessor sendProcessor = new SendMessageProcessor(this);

        sendProcessor.registerSendMessageHook(sendMessageHookList);
        sendProcessor.registerConsumeMessageHook(consumeMessageHookList);

        /*
        * 不同的RequestCode对应不同的动作
        *
        * 这里就是,让下面这几个动作,都交给(sendProcessor, this.sendMessageExecutor)来执行
        * */
        this.remotingServer.registerProcessor(RequestCode.SEND_MESSAGE, sendProcessor, this.sendMessageExecutor);
        this.remotingServer.registerProcessor(RequestCode.SEND_MESSAGE_V2, sendProcessor, this.sendMessageExecutor);
        this.remotingServer.registerProcessor(RequestCode.SEND_BATCH_MESSAGE, sendProcessor, this.sendMessageExecutor);
        this.remotingServer.registerProcessor(RequestCode.CONSUMER_SEND_MSG_BACK, sendProcessor, this.sendMessageExecutor);
        this.fastRemotingServer.registerProcessor(RequestCode.SEND_MESSAGE, sendProcessor, this.sendMessageExecutor);
        this.fastRemotingServer.registerProcessor(RequestCode.SEND_MESSAGE_V2, sendProcessor, this.sendMessageExecutor);
        this.fastRemotingServer.registerProcessor(RequestCode.SEND_BATCH_MESSAGE, sendProcessor, this.sendMessageExecutor);
        this.fastRemotingServer.registerProcessor(RequestCode.CONSUMER_SEND_MSG_BACK, sendProcessor, this.sendMessageExecutor);

        /**
         * PullMessageProcessor
         */
        this.remotingServer.registerProcessor(RequestCode.PULL_MESSAGE, this.pullMessageProcessor, this.pullMessageExecutor);
        this.pullMessageProcessor.registerConsumeMessageHook(consumeMessageHookList);

        /**
         * ReplyMessageProcessor
         */
        ReplyMessageProcessor replyMessageProcessor = new ReplyMessageProcessor(this);
        replyMessageProcessor.registerSendMessageHook(sendMessageHookList);

        this.remotingServer.registerProcessor(RequestCode.SEND_REPLY_MESSAGE, replyMessageProcessor, replyMessageExecutor);
        this.remotingServer.registerProcessor(RequestCode.SEND_REPLY_MESSAGE_V2, replyMessageProcessor, replyMessageExecutor);
        this.fastRemotingServer.registerProcessor(RequestCode.SEND_REPLY_MESSAGE, replyMessageProcessor, replyMessageExecutor);
        this.fastRemotingServer.registerProcessor(RequestCode.SEND_REPLY_MESSAGE_V2, replyMessageProcessor, replyMessageExecutor);

        /**
         * QueryMessageProcessor
         */
        NettyRequestProcessor queryProcessor = new QueryMessageProcessor(this);
        this.remotingServer.registerProcessor(RequestCode.QUERY_MESSAGE, queryProcessor, this.queryMessageExecutor);
        this.remotingServer.registerProcessor(RequestCode.VIEW_MESSAGE_BY_ID, queryProcessor, this.queryMessageExecutor);

        this.fastRemotingServer.registerProcessor(RequestCode.QUERY_MESSAGE, queryProcessor, this.queryMessageExecutor);
        this.fastRemotingServer.registerProcessor(RequestCode.VIEW_MESSAGE_BY_ID, queryProcessor, this.queryMessageExecutor);

        /**
         * ClientManageProcessor
         */
        ClientManageProcessor clientProcessor = new ClientManageProcessor(this);
        this.remotingServer.registerProcessor(RequestCode.HEART_BEAT, clientProcessor, this.heartbeatExecutor);
        this.remotingServer.registerProcessor(RequestCode.UNREGISTER_CLIENT, clientProcessor, this.clientManageExecutor);
        this.remotingServer.registerProcessor(RequestCode.CHECK_CLIENT_CONFIG, clientProcessor, this.clientManageExecutor);

        this.fastRemotingServer.registerProcessor(RequestCode.HEART_BEAT, clientProcessor, this.heartbeatExecutor);
        this.fastRemotingServer.registerProcessor(RequestCode.UNREGISTER_CLIENT, clientProcessor, this.clientManageExecutor);
        this.fastRemotingServer.registerProcessor(RequestCode.CHECK_CLIENT_CONFIG, clientProcessor, this.clientManageExecutor);

        /**
         * ConsumerManageProcessor
         *
         * 负责处理,客户端来broker端
             * 查询消费进度
             * 更新消费进度
         *
         */
        ConsumerManageProcessor consumerManageProcessor = new ConsumerManageProcessor(this);
        this.remotingServer.registerProcessor(RequestCode.GET_CONSUMER_LIST_BY_GROUP, consumerManageProcessor, this.consumerManageExecutor);
        this.remotingServer.registerProcessor(RequestCode.UPDATE_CONSUMER_OFFSET, consumerManageProcessor, this.consumerManageExecutor);
        this.remotingServer.registerProcessor(RequestCode.QUERY_CONSUMER_OFFSET, consumerManageProcessor, this.consumerManageExecutor);

        this.fastRemotingServer.registerProcessor(RequestCode.GET_CONSUMER_LIST_BY_GROUP, consumerManageProcessor, this.consumerManageExecutor);
        this.fastRemotingServer.registerProcessor(RequestCode.UPDATE_CONSUMER_OFFSET, consumerManageProcessor, this.consumerManageExecutor);
        this.fastRemotingServer.registerProcessor(RequestCode.QUERY_CONSUMER_OFFSET, consumerManageProcessor, this.consumerManageExecutor);

        /**
         * EndTransactionProcessor
         */
        this.remotingServer.registerProcessor(RequestCode.END_TRANSACTION, new EndTransactionProcessor(this), this.endTransactionExecutor);
        this.fastRemotingServer.registerProcessor(RequestCode.END_TRANSACTION, new EndTransactionProcessor(this), this.endTransactionExecutor);

        /**
         * Default
         */
        AdminBrokerProcessor adminProcessor = new AdminBrokerProcessor(this);
        this.remotingServer.registerDefaultProcessor(adminProcessor, this.adminBrokerExecutor);
        this.fastRemotingServer.registerDefaultProcessor(adminProcessor, this.adminBrokerExecutor);
    }

}

消息过滤与重试

What is going to happen when consumer listener returns “RECONSUME_LATER”?

在Consumer的消息消费监听器中,如果抛出RuntimeException并且自己没有捕获,那也等同于直接返回了"RECONSUME_LATER"
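
下面是一个极简的消费监听器示意(其中DemoTopic、handleBizLogic()等均为假设的示例名):显式返回RECONSUME_LATER,或者让未捕获的异常抛出去,这一批消息都会走后面的重试流程。

import java.util.List;

import org.apache.rocketmq.client.consumer.DefaultMQPushConsumer;
import org.apache.rocketmq.client.consumer.listener.ConsumeConcurrentlyContext;
import org.apache.rocketmq.client.consumer.listener.ConsumeConcurrentlyStatus;
import org.apache.rocketmq.client.consumer.listener.MessageListenerConcurrently;
import org.apache.rocketmq.client.exception.MQClientException;
import org.apache.rocketmq.common.message.MessageExt;

public class RetryLaterDemo {

    public static void main(String[] args) throws MQClientException {
        DefaultMQPushConsumer consumer = new DefaultMQPushConsumer("demo_consumer_group");
        consumer.setNamesrvAddr("127.0.0.1:9876");
        consumer.subscribe("DemoTopic", "*");
        consumer.registerMessageListener(new MessageListenerConcurrently() {
            @Override
            public ConsumeConcurrentlyStatus consumeMessage(List<MessageExt> msgs,
                                                            ConsumeConcurrentlyContext context) {
                try {
                    for (MessageExt msg : msgs) {
                        handleBizLogic(msg); // 假设的业务处理方法,失败时抛出异常
                    }
                    return ConsumeConcurrentlyStatus.CONSUME_SUCCESS;
                } catch (Exception e) {
                    // 显式返回RECONSUME_LATER(或者不捕获、让异常抛出去,效果等同):
                    // 这一批消息会被发回broker,进入%RETRY%重试流程
                    return ConsumeConcurrentlyStatus.RECONSUME_LATER;
                }
            }
        });
        consumer.start();
    }

    private static void handleBizLogic(MessageExt msg) { /* 业务逻辑,此处省略 */ }
}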

62077bb295914c0ea79dc5923584f497.png

消费失败时,重试消息的处理逻辑:

注意,最原始的消息为1号消息,是存在程序员自定义的业务topic中的,而一旦1号消息消费失败,会被客户端重新发回broker端

broker端接收到这条发回的消息后,先把1号消息的topic由原业务topic更改为 %RETRY%+consumerGroup;由于此时消息带上了延迟级别,写入commitlog前又会将topic改为SCHEDULE_TOPIC_XXXX,并把真实topic(%RETRY%+consumerGroup)保存到消息属性key为PROPERTY_REAL_TOPIC对应的value中,然后将这条消息写入commitlog成为2号消息(注意,这里还会把消息的reconsumeTimes + 1)

2号消息存储的topic为SCHEDULE_TOPIC_XXXX, 一旦2号消息到达了指定的延迟时间后,会被再次取出成为3号消息,此时将2号消息的属性key为PROPERTY_REAL_TOPIC对应的value中保存的topic值 %RETRY%consumer group取出来,设置为3号消息的topic,并将3号消息重新投递到topic为%RETRY%consumer group对应的队列中去

后续,就可以按照正常的逻辑进行消费了
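
下面用一小段Java示意代码串一下这个改写过程(仅为流程示意,并非broker源码原样;MixAll、MessageAccessor、MessageConst是rocketmq-common中的工具类,延迟级别按"3 + reconsumeTimes"递增是源码中的默认规则):

import org.apache.rocketmq.common.MixAll;
import org.apache.rocketmq.common.message.MessageAccessor;
import org.apache.rocketmq.common.message.MessageConst;
import org.apache.rocketmq.common.message.MessageExt;

public class RetryTopicRewriteSketch {

    // 1号消息被发回broker后、落盘成2号消息前的topic改写(流程示意)
    public static void rewriteForRetry(MessageExt msg, String consumerGroup) {
        String retryTopic = MixAll.getRetryTopic(consumerGroup);   // "%RETRY%" + consumerGroup
        msg.setTopic(retryTopic);                                  // 原业务topic -> 重试topic
        msg.setDelayTimeLevel(3 + msg.getReconsumeTimes());        // 默认规则:延迟级别随重试次数递增
        msg.setReconsumeTimes(msg.getReconsumeTimes() + 1);        // 重试次数 + 1

        // 由于带有延迟级别,写commitlog前真实topic被暂存到属性PROPERTY_REAL_TOPIC中,
        // 消息本身以延迟topic SCHEDULE_TOPIC_XXXX 落盘,成为2号消息
        MessageAccessor.putProperty(msg, MessageConst.PROPERTY_REAL_TOPIC, retryTopic);
        msg.setTopic("SCHEDULE_TOPIC_XXXX");
    }
}
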
73fdc3c69089470b8db2f591e16feaeb.png

这里是分为了18个等级,分别对应18个queue,这也就有点类似于kafka中的时间轮,将相同延迟级别的消息放入同一个queue,方便统一管理控制

在kafka中是使用了时间轮,进行了更为精确的时间控制 

具体的延迟实现逻辑:

18个延迟级别,分别对应18个ConsumeQueue,并且这18个ConsumeQueue同属于同一个topic:SCHEDULE_TOPIC_XXXX

针对这18个ConsumeQueue,每个都创建了一个专属的延时TimerTask并丢入了一个统一的Timer定时任务实例中,这18个任务初始默认都是1s后执行

每个专属的延时TimerTask的执行逻辑是:

  1. 先从delayOffset.json中加载已有的消费进度,从而得到下一次要消费的offset,通过这个offset去对应的ConsumeQueue中拿到对应的索引条目,从中拿到phyOffset、size、tagHash
  2. 需要注意的是,往Commitlog中写入延时消息时,ReputMessageService会把该消息在ConsumeQueue索引条目中的tagHash字段,改写为该延时消息下次要执行的时间对应的时间戳nextExecTimestamp
  3. 专属的延时TimerTask拿到该延时消息对应索引条目中的nextExecTimestamp后,与当前时间戳now取一个差值countdown
  4. 如果countdown<=0,说明当前消息已经到了需要被消费的时候,那么就把这个索引条目对应的延时消息从commitlog中取出,并将该消息的topic从SCHEDULE_TOPIC_XXXX改成其属性key为PROPERTY_REAL_TOPIC对应的value中保存的topic值 %RETRY%consumer group,然后把改完后的3号消息重新调用DefaultMessageStore#putMessage()方法投入Commitlog,后续就进入了正常的消费逻辑
  5. 如果countdown>0,说明当前消息还需要再等countdown毫秒才可以被消费,此时就重新new出一个延时TimerTask并带上offset,然后把这个TimerTask丢入Timer中,指定再等countdown毫秒执行这个TimerTask

注意,这里有一个兜底策略:如果某个延迟队列很长时间都没有新消息进来,那么该延迟队列对应的TimerTask,也会每隔100ms被丢入Timer中一次。具体逻辑就是,该延迟队列的上一个TimerTask执行过程中发现该延迟队列没有新的延迟消息,则会在最后往Timer中丢入一个新的TimerTask,并指定这个TimerTask在100ms后执行,以此循环往复
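
下面是对这个调度逻辑的极简示意(并非ScheduleMessageService源码,readNextExecTimestamp()/redeliver()为占位方法):

import java.util.Timer;
import java.util.TimerTask;

// 极简示意:每个延迟级别一个TimerTask,按countdown决定"立即投递"还是"稍后再查"
public class DelayLevelTask extends TimerTask {

    private static final Timer TIMER = new Timer("ScheduleMessageTimerSketch");

    private final int delayLevel;
    private final long offset; // 该延迟级别对应ConsumeQueue的下一个消费位点

    public DelayLevelTask(int delayLevel, long offset) {
        this.delayLevel = delayLevel;
        this.offset = offset;
    }

    @Override
    public void run() {
        // 占位方法:从ConsumeQueue索引条目的tagHash位置读出下次执行的时间戳
        long nextExecTimestamp = readNextExecTimestamp(delayLevel, offset);
        long countdown = nextExecTimestamp - System.currentTimeMillis();
        if (countdown <= 0) {
            redeliver(delayLevel, offset); // 占位方法:恢复真实topic后重新putMessage()进commitlog
            TIMER.schedule(new DelayLevelTask(delayLevel, offset + 1), 100); // 处理下一个位点;没有新消息时也按100ms兜底调度
        } else {
            TIMER.schedule(new DelayLevelTask(delayLevel, offset), countdown); // 未到期,countdown毫秒后再查一次
        }
    }

    private long readNextExecTimestamp(int delayLevel, long offset) { return System.currentTimeMillis(); }

    private void redeliver(int delayLevel, long offset) { }
}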

消费过滤

Where does RocketMQ filter messages? Broker or Consumer?

b21f11f1bc674810a489b326a10bd92f.png

长轮询与零拷贝

d2692f9a908e41d7b5354a51649946b4.png

brokerAllowSuspend为true还是false,直接影响整个拉取逻辑的执行

理想情况下,消费者先发送PullRequest来到broker端,调用getMessage(brokerAllowSuspend)方法,如果PullRequest中要拉取的pullOffset已经达到对应ConsumeQueue当前最大的logicOffset(即暂时没有新消息可拉),则将该PullRequest丢入PullRequestHoldService的ConcurrentMap<String /*Topic+@+queueId*/, ManyPullRequest> pullRequestTable中

后续如果生产者有新的消息到来,会异步调起ReputMessageService#doReput()方法,该方法内部会调用messageArrivingListener#arriving()方法,后者会从pullRequestTable中获取ManyPullRequest,遍历其中是否有满足条件的PullRequest;如果有,则拿到专门处理PullMessageProcessor相关工作的线程池,在该线程池中创建一个task来调用getMessage(PullRequest, brokerAllowSuspend = false)

fef71e2e819b4b3ea11b689df099e0d5.png

为了避免出现上述的并发问题:消费者这边发现没有符合条件的消息,此时恰好发生了一次GC,之后才把该pullRequest放入pullRequestTable;而就在GC开始到pullRequest放入pullRequestTable之前,生产者发来了满足条件的新消息,导致messageArrivingListener#arriving()方法从pullRequestTable中获取ManyPullRequest时遍历不到这个PullRequest,从而出现PullRequest被长期hold的情况

rocketmq提供了一种兜底策略:PullRequestHoldService内部的死循环每5秒醒来一次,醒来后检查pullRequestTable内部的pullRequest是否达到了超时时间;如果已经超时,不管有没有满足条件的新消息,都会拿到专门处理PullMessageProcessor相关工作的线程池,在该线程池中创建一个task来调用getMessage(PullRequest, brokerAllowSuspend = false)
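
下面是一个极简的长轮询示意(并非broker源码,HeldRequest为简化后的挂起请求对象,阈值为示意值):

import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CopyOnWriteArrayList;

// 极简示意:PullRequest被hold与被唤醒的两条路径
public class LongPollingSketch {

    static class HeldRequest {                      // 被挂起的拉取请求(简化)
        final long pullFromThisOffset;              // 消费者要拉取的起始logicOffset
        final long suspendTimestamp = System.currentTimeMillis();
        final long timeoutMillis = 15_000;          // 长轮询超时时间(示意值)

        HeldRequest(long offset) { this.pullFromThisOffset = offset; }

        boolean timeout() { return System.currentTimeMillis() - suspendTimestamp >= timeoutMillis; }
    }

    private final ConcurrentMap<String /* topic@queueId */, List<HeldRequest>> pullRequestTable =
        new ConcurrentHashMap<>();

    // 路径一:getMessage()发现没有新消息,把请求hold进表里
    public void suspend(String topic, int queueId, HeldRequest req) {
        pullRequestTable.computeIfAbsent(topic + "@" + queueId, k -> new CopyOnWriteArrayList<>()).add(req);
    }

    // 路径二:新消息到达(ReputMessageService触发)或兜底线程每5s扫描一次
    public void notifyMessageArriving(String topic, int queueId, long maxLogicOffset) {
        List<HeldRequest> reqs = pullRequestTable.remove(topic + "@" + queueId);
        if (reqs == null) {
            return;
        }
        for (HeldRequest req : reqs) {
            if (maxLogicOffset > req.pullFromThisOffset || req.timeout()) {
                wakeUp(req);                        // 丢回拉消息线程池,再执行一次getMessage
            } else {
                suspend(topic, queueId, req);       // 条件仍不满足,继续hold
            }
        }
    }

    private void wakeUp(HeldRequest req) { /* 提交到pullMessageExecutor,此处省略 */ }
}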

FileChannel.force()和MappedByteBuffer.force()区别

FileChannel.force()底层对应的是fsync/fdatasync,而MappedByteBuffer.force()底层调用的是c语言的msync
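
下面用一小段JDK代码对照这两种刷盘调用(示意):

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class FlushDemo {

    public static void main(String[] args) throws IOException {
        try (FileChannel channel = FileChannel.open(Paths.get("demo.data"),
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {

            // FileChannel:write写入PageCache,force刷盘(底层对应fsync/fdatasync)
            channel.write(ByteBuffer.wrap("hello".getBytes(StandardCharsets.UTF_8)));
            channel.force(false);   // false:只要求数据落盘,不强制刷文件元数据

            // MappedByteBuffer:put写入PageCache,force刷盘(底层对应msync)
            MappedByteBuffer mbb = channel.map(FileChannel.MapMode.READ_WRITE, 0, 1024);
            mbb.put("world".getBytes(StandardCharsets.UTF_8));
            mbb.force();
        }
    }
}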

消息重试、定时消息、批量消息

32656012912548d4b1c677289647c485.png

延迟避退

可以借鉴这里的实现逻辑,本质就是一个分段函数:根据线程池中积压的任务数,决定往线程池提交任务前需要阻塞(避退)多长时间,见下面的示意
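
下面是一个极简的分段函数示意(阈值均为假设值,并非rocketmq源码中的配置):

// 极简的分段函数示意:积压越多,提交前避退(阻塞)越久
public class BackoffSketch {

    private static final int[]  PENDING_LEVELS = {500, 1000, 5000, 10000}; // 线程池中积压的任务数阈值(假设值)
    private static final long[] BLOCK_MILLIS   = {10,  50,   200,  500};   // 对应的避退时长(假设值)

    public static long blockMillisBeforeSubmit(int pendingTasks) {
        for (int i = PENDING_LEVELS.length - 1; i >= 0; i--) {
            if (pendingTasks >= PENDING_LEVELS[i]) {
                return BLOCK_MILLIS[i];
            }
        }
        return 0; // 积压很少,不需要避退
    }
}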

b7be59a3379648489ebe1fe2ee416543.png

如果broker端返回的响应码是591到595行列出的这几种,是会continue继续重试的;而如果broker端因为PageCache锁定太久等原因太繁忙,返回的是SYSTEM_BUSY,则不会进行发送重试

bc82ac51d73742ad951356122fff98b9.png

8d1c23c2692c41c797e1645efb45add9.png

b814306800d24c26b04029f62cbb42c1.png

472d3afdf4cc408eaaf6a349f0e20a5a.png

34016e3b1dd5434ab45f8d93b6dfea27.png

定时消息

c9fb959e4ca8498cb28cc8c4183308cc.png

508bb04cda494fcf99adbf41b60ef81c.png

ab3dd4ac4ff144a0a6d66c27d33d2026.png

377c02166fe24b8f90ee591a8ef34f6b.png

批量消息

f0bbb94159504949b2a106eb428ecaae.png

f0a2643a560b4be1ad16a4f6b3b57023.png

Remoting模块分析

2ac80063de31453f996e66ef87f174cd.png

BrokerController管理着rocketmq各个组件的生命周期,包括initialize、start、shutdown

46dab98afb2e4908b68298621588c6cd.png

rocketmq设计得比较好的一点是,各个组件都服务化,每个组件都有管理自己生命周期的方法,然后由BrokerController进行统一调用。我们学习的过程中,也可以按不同的组件分别学习

初始化

BrokerController#initialize()中,先初始化好处理不同请求的线程池,再通过registerProcessor()注册了很多处理器,不同的处理器使用不同的线程池,负责处理不同的请求;因为不同请求的量、复杂程度都不同,所以它们的处理线程池也是相互隔离的

根据不同的业务逻辑划分了不同的RequestCode命令,然后对不同的RequestCode进行了分类,每一类RequestCode命令,都有自己对应的XxxxxProcessor来进行处理,每个XxxxxProcessor都有负责执行自己的线程池

最终把上面这些关系存入 Map<RequestCode, Pair<NettyRequestProcessor, ExecutorService>> 中
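
下面是对这张路由表及其分发逻辑的极简示意(并非NettyRemotingAbstract源码原样):

import java.util.HashMap;
import java.util.concurrent.ExecutorService;

import io.netty.channel.ChannelHandlerContext;
import org.apache.rocketmq.common.Pair;
import org.apache.rocketmq.remoting.netty.NettyRequestProcessor;
import org.apache.rocketmq.remoting.protocol.RemotingCommand;

public class DispatchSketch {

    // RequestCode -> (处理器, 业务线程池) 的路由表
    private final HashMap<Integer /* RequestCode */, Pair<NettyRequestProcessor, ExecutorService>> processorTable =
        new HashMap<>();

    public void registerProcessor(int requestCode, NettyRequestProcessor processor, ExecutorService executor) {
        this.processorTable.put(requestCode, new Pair<>(processor, executor));
    }

    // IO线程收到请求后,按RequestCode找到对应的processor与线程池,把业务封装成task丢进去执行
    public void dispatch(final ChannelHandlerContext ctx, final RemotingCommand request) {
        final Pair<NettyRequestProcessor, ExecutorService> pair = this.processorTable.get(request.getCode());
        if (pair == null) {
            return; // 找不到对应处理器时,真实实现会交给defaultProcessor,此处省略
        }
        pair.getObject2().submit(new Runnable() {
            @Override
            public void run() {
                try {
                    RemotingCommand response = pair.getObject1().processRequest(ctx, request);
                    ctx.writeAndFlush(response);
                } catch (Exception e) {
                    // 业务异常:回写系统错误响应,此处省略
                }
            }
        });
    }
}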

启动

7f96bfb1cd7f476289331f738755543c.png

前面说到,rocketmq内部分成了很多不同的组件,每个不同的组件都有自己的生命周期管理方法,这里就是NettyRemotingServer组件的启动start()方法,BrokerController#start()内部会统一调用这些不同组件的启动方法

可以看到,NettyRemotingServer组件的启动start()方法内,正在启动netty的server,除了有boss、workers、还自定义了一个线程池defaultEventExecutorGroup来负责执行channelPipeline中的各个处理器的逻辑。也就是说,boss负责接收连接、workers负责把boss接收的连接注册到自己身上,并负责监听连接身上的读写事件,然后立马把监听到的读写事件转发给defaultEventExecutorGroup(workers做得非常轻量,就监听事件并转发,所以workers可以设置得小一点)
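
下面是一个极简的Netty服务端骨架示意(并非NettyRemotingServer源码原样,业务handler用占位的匿名handler表示,线程数取的是文中提到的默认值):

import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelInboundHandlerAdapter;
import io.netty.channel.ChannelInitializer;
import io.netty.channel.EventLoopGroup;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioServerSocketChannel;
import io.netty.handler.timeout.IdleStateHandler;
import io.netty.util.concurrent.DefaultEventExecutorGroup;

public class RemotingServerSketch {

    public static void main(String[] args) throws InterruptedException {
        EventLoopGroup boss = new NioEventLoopGroup(1);                            // 1个boss:接收连接
        EventLoopGroup workers = new NioEventLoopGroup(3);                         // N个worker:监听读写事件并转发
        DefaultEventExecutorGroup handlerGroup = new DefaultEventExecutorGroup(8); // M1:执行pipeline中的各个handler

        ServerBootstrap bootstrap = new ServerBootstrap()
            .group(boss, workers)
            .channel(NioServerSocketChannel.class)
            .childHandler(new ChannelInitializer<SocketChannel>() {
                @Override
                protected void initChannel(SocketChannel ch) {
                    ch.pipeline().addLast(handlerGroup,
                        new IdleStateHandler(0, 0, 120),          // 空闲连接检查
                        new ChannelInboundHandlerAdapter() {      // 占位的业务分发handler
                            @Override
                            public void channelRead(ChannelHandlerContext ctx, Object msg) {
                                // 真实实现中,这里按RequestCode把请求丢给对应的业务线程池(M2)
                            }
                        });
                }
            });
        bootstrap.bind(10911).sync();
    }
}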

6d3222ca46904b63b63536333c17064f.png

27a4ed3ac0144b99992b572cf6d066f8.png

5b28ca2da63f4986a2ebe34f3e0d093c.png

处理具体业务的Handler

a51de656ee264c92b5d3895450454963.png

分为两条不同的处理逻辑:一条处理broker收到的客户端请求,另一条处理broker发出请求后、客户端回过来的响应

4becc2e670f240f0885f796eadb52408.png

f96931500727466e971035d91a5c3f61.png

就在这里,根据客户端发来请求的不同RequestCode,将请求路由转发给不同的processor和线程池进行处理

线程模型

c130c0adbb7241ffabb8bceff70cadea.png

 金字塔模型

12cee8e7addf4b68ace0c59aa7f9d283.png

最底层的业务线程池的线程数量,都是可以动态调整的,比如花呗内部使用的是同步刷盘、同步复制,此时他们的sendMessageThreadPool就从默认值改成了64

46d9a5c9241b490383f0e2ec3e676116.png

每次有新的读写事件过来,都同步的更新一下lastReadTime和lastWriteTime

c2226be6f438432d91ffe8d311f80dc1.png

然后通过这个调度任务,根据这两个时间戳的上次更新时间,判断当前连接有多久没有新的读写事件进来,也就是当前连接已经空闲多久了,并对空闲超时的连接进行关闭,如下面的示意
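
下面是这个空闲扫描思路的极简示意(并非源码,maxIdleMillis等均为示意参数):

import java.nio.channels.SocketChannel;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// 极简示意:读写事件只负责更新时间戳,由调度任务按空闲时长关闭超时连接
public class IdleConnectionScanner {

    static class ChannelInfo {
        volatile long lastReadTime = System.currentTimeMillis();
        volatile long lastWriteTime = System.currentTimeMillis();
    }

    private final Map<SocketChannel, ChannelInfo> channelTable = new ConcurrentHashMap<>();
    private final ScheduledExecutorService scanner = Executors.newSingleThreadScheduledExecutor();

    public void start(final long maxIdleMillis) {
        scanner.scheduleAtFixedRate(new Runnable() {
            @Override
            public void run() {
                long now = System.currentTimeMillis();
                for (Map.Entry<SocketChannel, ChannelInfo> entry : channelTable.entrySet()) {
                    ChannelInfo info = entry.getValue();
                    long idle = now - Math.max(info.lastReadTime, info.lastWriteTime);
                    if (idle > maxIdleMillis) {
                        try { entry.getKey().close(); } catch (Exception ignore) { } // 关闭空闲超时的连接
                        channelTable.remove(entry.getKey());
                    }
                }
            }
        }, 10, 10, TimeUnit.SECONDS);
    }
}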

9cc11c988a60452f97cc1117f28f127c.png

长连接

590ffca9fb41403ab925d89cdc4431f2.png

0bca8502d48e4483a329db39d211b1e3.png

事务消息

TCC 模式的两阶段提交,第一阶段是用来尝试预留资源,第二阶段来扣减资源或释放资源

而消息队列中生产消息、保存消息、消费消息是不需要预留资源的,这是完全不同的业务场景。

这就是消息队列中的事务消息,跟分布式事务中两阶段提交的TCC模式在实现方式上的区别:TCC模式两阶段提交的目的主要是确定资源是否可用,而消息队列的事务消息不需要考虑资源预留,二者是不同的业务场景

Client模块分析

1. 从Nameserver获取topic的配置信息

2. 从broker获取该消费组下的消费者List

3. 重平衡

4. 当前消费者根据分配到的queue,去该queue所在的broker上拉取当前queue的消费进度offset

5. 根据上面返回的消费进度,去broker的queue拉取消息。注意,拉取请求中带有拉取offset、每批次最大拉取消息数(默认32)、消费进度commitOffset、长轮询的超时时间15s

6. 消费本批次消息

7. 拉取下一批次的消息、顺带提交前面的消费进度commitOffset。另外还有5s一次的定时任务也会提交消费进度

broker在有consumer上线或下线时,会通知consumer做rebalance,broker和consumer之间也是有长连接的

Rebalance模块中,topicSubscribeInfoTable保存每个topic对应的所有queue,方便所有consumer持有统一视图做排序,最后做rebalance

MQClientFactory也就是MQClientInstance,实际上就是一个大的容器,管理所有的consumer和producer,所以consumer和producer需要做的一些公共的事情,都在这个类中去完成

MQClientFactory启动start()后,会通过sendHeartbeatToAllBroker()给Broker发送心跳,内部会给所有的Master Broker发心跳

public class MQClientInstance {

    private void startScheduledTask() {

        // 2分钟,拉取一次NameServer的地址
        if (null == this.clientConfig.getNamesrvAddr()) {
            this.scheduledExecutorService.scheduleAtFixedRate(new Runnable() {

                @Override
                public void run() {
                    try {
                        MQClientInstance.this.mQClientAPIImpl.fetchNameServerAddr();
                    } catch (Exception e) {
                        log.error("ScheduledTask fetchNameServerAddr exception", e);
                    }
                }
            }, 1000 * 10, 1000 * 60 * 2, TimeUnit.MILLISECONDS);
        }

        // 每30秒,从多个NameServer中获取一次最新的topic路由信息,更新本地缓存
        this.scheduledExecutorService.scheduleAtFixedRate(new Runnable() {

            @Override
            public void run() {
                try {
                    MQClientInstance.this.updateTopicRouteInfoFromNameServer();
                } catch (Exception e) {
                    log.error("ScheduledTask updateTopicRouteInfoFromNameServer exception", e);
                }
            }
        }, 10, this.clientConfig.getPollNameServerInterval(), TimeUnit.MILLISECONDS);

        // 给所有的Master Broker,发送当前客户端的心跳
        this.scheduledExecutorService.scheduleAtFixedRate(new Runnable() {

            @Override
            public void run() {
                try {
                    MQClientInstance.this.cleanOfflineBroker();
                    MQClientInstance.this.sendHeartbeatToAllBrokerWithLock();
                } catch (Exception e) {
                    log.error("ScheduledTask sendHeartbeatToAllBroker exception", e);
                }
            }
        }, 1000, this.clientConfig.getHeartbeatBrokerInterval(), TimeUnit.MILLISECONDS);

        /*
        * RemoteBrokerOffsetStore的offsetTable中保存了客户端的消费进度,
        * 这里每隔5s,会向broker上报一次,实际也就是更新broker的ConsumeOffsetManager中保存的消费进度
        *
        * */
        this.scheduledExecutorService.scheduleAtFixedRate(new Runnable() {

            @Override
            public void run() {
                try {
                    MQClientInstance.this.persistAllConsumerOffset();
                } catch (Exception e) {
                    log.error("ScheduledTask persistAllConsumerOffset exception", e);
                }
            }
        }, 1000 * 10, this.clientConfig.getPersistConsumerOffsetInterval(), TimeUnit.MILLISECONDS);


        this.scheduledExecutorService.scheduleAtFixedRate(new Runnable() {

            @Override
            public void run() {
                try {
                    //动态调整线程池
                    MQClientInstance.this.adjustThreadPool();
                } catch (Exception e) {
                    log.error("ScheduledTask adjustThreadPool exception", e);
                }
            }
        }, 1, 1, TimeUnit.MINUTES);
    }
}

RocketMQ和Kafka对比

RocketMQ适合业务,kafka适合日志,因为kafka中大量运用了批处理的思想,比如消息发送就是攒一批再一起发

| 对比项 | RocketMQ | Kafka |
| --- | --- | --- |
| 性能(单机) | TPS 10W | TPS 100W |
| 可靠性 | 支持同步刷盘、异步刷盘、同步复制、异步复制(可靠性较好) | 支持异步刷盘、异步复制 |
| 实时性 | 较好 | - |
| 顺序性 | - | - |
| 消费失败重试 | 支持失败重试,且支持重试间隔时间顺延 | 不支持 |
| 延时消息/定时消息 | 支持 | 不支持 |
| 分布式事务 | 支持 | 不支持 |
| 消息查询机制 | 支持按messageId和消息内容查询消息 | 不支持 |
| 消息回溯 | 支持按某个时间戳(毫秒级)回溯消息 | 支持按某个偏移量offset回溯消息 |

总体来看,二者各有千秋,不分伯仲,术业有专攻

各个 MQ 使用的文件读写方式:

861d5bcf1fdc807e9b3c7382598e7dc6.png

kafka:record 的读写都是基于 FileChannel。index 读写基于 MMAP(厮大提示)。

RocketMQ:读盘基于 MMAP,写盘默认使用 MMAP,可通过修改配置,配置成 FileChannel,原因是作者想避免 PageCache 的锁竞争,通过两层架构实现读写分离

QMQ: 去哪儿MQ,读盘使用MMAP,写盘使用 FileChannel

ActiveMQ 5.15: 读写全部都是基于 RandomAccessFile,这也是我们抛弃 ActiveMQ 的原因

总结:

2436178da055f376be134e6e61884dab.png

参考链接:

MappedByteBuffer VS FileChannel,孰强孰弱?(51CTO博客)

面试题

5cf3ebdcb672462886f97e4a7702bd37.png

为什么在异步刷盘/同步复制时开启堆外内存transientStorePoolEnable后,集群压测几乎无法进行?

1>主从同步复制使用mappedByteBuffer;

2>开启堆外内存池transientStorePoolEnable后,数据先落到writeBuffer(堆外内存),再由异步提交线程通过FileChannel.write()把数据commit到PageCache中,最后刷盘;

3>未开启堆外内存池transientStorePoolEnable时,数据直接写入mappedByteBuffer(即PageCache);

由于开启堆外内存后,数据要先写入writeBuffer、再commit,才能到达主从同步所读取的mappedByteBuffer(PageCache),比直接写入mappedByteBuffer多了一个异步环节;再加上发送队列的等待时间默认只有200毫秒(waitTimeMillsInSendQueue=200),请求大量超时,这就是集群不能正常压测的原因

为什么在异步刷盘/同步复制时调大JVM堆内存后,性能明显提升?而且性能提升的倍数几乎与堆内存增大的倍数相当。

从模拟流程中可以看出,组装消息时使用的是堆内存:ByteBuffer msgStoreItemMemory = ByteBuffer.allocate(data.length()),这就是调大堆内存能显著提高写入TPS的原因。

RocketMQ同步复制性能优化【实战笔记】 - 腾讯云开发者社区-腾讯云



RocketMQ设计精髓

精髓一:结构清晰的线程模型

RocketMQ 在Netty原生的多线程 Reactor 模型上做了一系列的扩展和优化,记住主要的数字:1 + N + M1 + M2

  • 一个 Reactor 主线程(eventLoopGroupBoss,即为上面的1)负责监听 TCP 网络连接请求,建立好连接,创建 SocketChannel,并注册到 selector 上。
  • RocketMQ 的源码中会自动根据 OS 的类型选择 NIO 或 Epoll,也可以通过参数配置,然后监听真正的网络数据。
  • 拿到网络数据后,再丢给 Worker 线程池(eventLoopGroupSelector,即为上面的“N”,源码中默认设置为3),
  • 在真正执行业务逻辑之前需要进行 SSL 验证、编解码、空闲检查、网络连接管理,这些工作交给 defaultEventExecutorGroup(即为上面的“M1”,源码中默认设置为 8 )去做。
  • 而处理业务操作放在业务线程池中执行,根据 RemotingCommand 的业务请求码 code 去 processorTable 这个本地缓存变量中找到对应的 processor,然后封装成 task 任务后,提交给对应的业务 processor 处理线程池来执行(以发送消息为例就是 sendMessageExecutor,即为上面的 "M2")。从入口到业务逻辑的几个步骤中,线程池的线程数一直在增加,这跟每一步逻辑的复杂性相关:越复杂,需要的并发通道越宽。

通过这样的线程模型,使得Netty的IO线程,变得非常轻量,拿到socket上的数据,立马就把数据甩出去

精髓二:业务线程池隔离、职责单一

0e6504dfbb134643944fc7b34b6069dc.png

不同的指令,有不同的XxxxxxProcessor进行处理,同时对应着不同的业务线程池,不同的业务线程池执行业务,互不影响。比如sendMessageExecutor、pullMessageExecutor等

并且,各个业务线程池采取的是拒绝策略:线程池放不进去后,直接拒绝IO线程往业务线程池中提交任务,捕获RejectedExecutionException后,打印流控日志,并直接通过channel给客户端返回失败结果
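
下面是这种拒绝策略的极简示意(并非broker源码,线程数与队列容量为示意值):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// 极简示意:业务线程池使用有界队列 + Abort策略,放不下就拒绝,由IO线程捕获后回写失败响应
public class BusyRejectSketch {

    private final ExecutorService sendMessageExecutor = new ThreadPoolExecutor(
        16, 16, 60, TimeUnit.SECONDS,
        new LinkedBlockingQueue<Runnable>(10000),      // 有界队列(容量为示意值)
        new ThreadPoolExecutor.AbortPolicy());         // 队列满直接抛RejectedExecutionException

    public boolean trySubmit(Runnable bizTask) {
        try {
            sendMessageExecutor.submit(bizTask);
            return true;
        } catch (RejectedExecutionException e) {
            // 真实实现中:打印流控日志,并通过channel给客户端返回SYSTEM_BUSY响应
            return false;
        }
    }
}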

精髓三:各种兜底策略

消费者的PullMessageService线程,在从阻塞队列中拿到PullRequest并向Broker发送消息拉取请求时,会顺带上报当前队列的消费进度

另外,还会有一个兜底的定期任务,每5s执行一次,定期向Broker端上报一次当前存在消费者本地的消费进度

这里,也体现了RocketMQ设计的轻量,并不是每消费一条消息都会去Broker端上报一次消费进度,而是有一个批处理攒一波的概念

延迟消息、重试消息:如果某个延迟队列很长时间都没有新消息进来,那么该延迟队列对应的TimerTask,也会每隔100ms被丢入Timer中一次。具体逻辑就是,该延迟队列的上一个TimerTask执行过程中发现该延迟队列没有新的延迟消息,则会在最后往Timer中丢入一个新的TimerTask,并指定这个TimerTask在100ms后执行,以此循环往复

rocketmq还提供了一种兜底策略:PullRequestHoldService内部的死循环每5秒醒来一次,醒来后检查pullRequestTable内部的pullRequest是否达到了超时时间;如果已经超时,不管有没有满足条件的新消息,都会拿到专门处理PullMessageProcessor相关工作的线程池,在该线程池中创建一个task来调用getMessage(PullRequest, brokerAllowSuspend = false)

精髓四:自我保护机制

Broker端,比如限定每次拉取的消息总大小不超过256K等等

消费端,也有一些限流机制:本地红黑树中最大和最小偏移量之差不超过2000、红黑树中缓存的消息不超过1000条、消息总大小不超过100M等等,见下面的示意
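
下面是这几条流控规则的极简示意(阈值取的是上面提到的默认值):

// 极简示意:push消费者发起下一次拉取前的本地流控检查
public class PullFlowControlSketch {

    public static boolean flowControlled(long cachedMsgCount, long cachedMsgSizeBytes,
                                         long maxOffsetInTree, long minOffsetInTree) {
        if (cachedMsgCount > 1000) {
            return true;                                   // 本地缓存的消息条数超过1000
        }
        if (cachedMsgSizeBytes > 100L * 1024 * 1024) {
            return true;                                   // 本地缓存的消息总大小超过100M
        }
        if (maxOffsetInTree - minOffsetInTree > 2000) {
            return true;                                   // 红黑树中最大最小位点的跨度超过2000
        }
        return false;                                      // 未触发流控,正常发起拉取;否则延后再拉
    }
}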

精髓五:内存映射MMAP加速磁盘读写

读盘基于MMAP,写盘默认使用MMAP,可通过修改配置,配置成FileChannel,原因是作者想避免PageCache的锁竞争,通过两层架构实现读写分离

751307387bda4ae2b15343e99d6c1a4d.png

这一块PageCache一直在内存中,一段时间内读取的都是这块热点区域的消息,所以读消息的速度是有保证的,这也是rocketmq高效的原因之一:大量利用了PageCache

比如,一次拉取32条消息,如果没有PageCache,那么一次拉取,就要走32次随机磁盘IO读取,有了PageCache,实际上就变成32次内存读了

5477173e0cb843da8f9cc87b4219deb7.png

消费者拉取Broker端的消息到消费端,实际上是把Broker端磁盘中的数据,通过网络请求,转发到消费者,底层实际上是通过linux提供的一个零拷贝函数sendFile(fd1,fd2),fd1代表磁盘文件,fd2代表网络socket连接
