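Contents:
- RocketMQ Source Code Analysis: Setting Up the Source Environment
- RocketMQ Source Code Analysis: NameServer
- RocketMQ Source Code Analysis: Broker
- RocketMQ Source Code Analysis: Producer
- RocketMQ Source Code Analysis: Message Storage
- RocketMQ Source Code Analysis: Consumer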
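1. Understanding a core RocketMQ component: NameServer
Message middleware such as RocketMQ is generally designed around a topic-based publish/subscribe mechanism. A message producer (Producer) sends a message to a message server, which persists it. A message consumer (Consumer) subscribes to the topics it is interested in, and the message server either pushes messages to the consumer according to the subscription (routing) information (Push mode), or the consumer actively pulls them from the message server (Pull mode). This decouples producers from consumers.
To keep the failure of a single message server from bringing down the whole system, several message servers are usually deployed to share message storage. How, then, does a producer know which message server to send a message to? And if one of the message servers goes down, how does the producer notice without being restarted?
NameServer was designed to solve exactly these problems. The overall design is shown in the figure below:
When a Broker (message server) starts, it registers with every NameServer. Before sending a message, a Producer first fetches the list of Broker addresses from a NameServer and then picks one server from the list according to a load-balancing algorithm. The NameServer keeps a long-lived connection to each Broker and checks at a fixed interval whether the Broker is still alive; if a Broker is found to be down, it is removed from the route registry.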
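The "pick one server according to a load-balancing algorithm" step can be pictured with a minimal round-robin sketch. This is an illustration only: RoundRobinBrokerSelector is a hypothetical class, not part of RocketMQ, and the real producer works at the level of message queues with additional fault-avoidance logic. The address list stands in for whatever the producer fetched from the NameServer.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not RocketMQ code: cycles through broker addresses in order.
class RoundRobinBrokerSelector {
    private final AtomicInteger counter = new AtomicInteger();

    // brokerAddrs stands in for the address list fetched from the NameServer.
    String select(List<String> brokerAddrs) {
        if (brokerAddrs == null || brokerAddrs.isEmpty()) {
            throw new IllegalStateException("no broker available");
        }
        // floorMod keeps the index non-negative even after the counter overflows.
        int index = Math.floorMod(counter.getAndIncrement(), brokerAddrs.size());
        return brokerAddrs.get(index);
    }

    public static void main(String[] args) {
        RoundRobinBrokerSelector selector = new RoundRobinBrokerSelector();
        // Example addresses are made up for the demonstration.
        List<String> brokers = Arrays.asList("10.0.0.1:10911", "10.0.0.2:10911");
        for (int i = 0; i < 4; i++) {
            System.out.println(selector.select(brokers));
        }
    }
}
```

Because the list is re-fetched from the NameServer periodically, a Broker that has been removed from the route registry simply stops appearing among the candidates.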
2. Sequence diagram:
3. Source code walkthrough:
Locate the NameServer startup class (NamesrvStartup) and its entry point:
public static void main(String[] args) { // NameServer startup entry point
main0(args);
}
public static NamesrvController main0(String[] args) {
try {
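NamesrvController controller = createNamesrvController(args); // Create the NameServer controller. As noted above, the NameServer mainly acts as a registry that receives register/lookup requests from other components, so it can be thought of as a controller that accepts requests
start(controller); // Start the controller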
String tip = "The Name Server boot success. serializeType=" + RemotingCommand.getSerializeTypeConfigInThisServer();
log.info(tip);
System.out.printf("%s%n", tip);
return controller;
} catch (Throwable e) {
e.printStackTrace();
System.exit(-1); // Creating or starting the controller failed, so stop the JVM
}
return null;
}
- Creating the NamesrvController:
public static NamesrvController createNamesrvController(String[] args) throws IOException, JoranException {
System.setProperty(RemotingCommand.REMOTING_VERSION_KEY, Integer.toString(MQVersion.CURRENT_VERSION)); // Set the RocketMQ version property
Options options = ServerUtil.buildCommandlineOptions(new Options()); // Build the Options object used for command-line parsing
commandLine = ServerUtil.parseCmdLine("mqnamesrv", args, buildCommandlineOptions(options), new PosixParser()); // Parse the command-line arguments into a CommandLine object
if (null == commandLine) {
System.exit(-1);
return null;
}
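final NamesrvConfig namesrvConfig = new NamesrvConfig(); // Create the NameServer configuration object
final NettyServerConfig nettyServerConfig = new NettyServerConfig(); // Create the Netty server configuration object used to receive requests (components communicate mainly through Netty)
nettyServerConfig.setListenPort(9876); // Default listen port is 9876
// Handle the -c startup option (path of a configuration file)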
if (commandLine.hasOption('c')) {
String file = commandLine.getOptionValue('c');
if (file != null) {
InputStream in = new BufferedInputStream(new FileInputStream(file));
properties = new Properties();
properties.load(in);
MixAll.properties2Object(properties, namesrvConfig);
MixAll.properties2Object(properties, nettyServerConfig);
namesrvConfig.setConfigStorePath(file);
System.out.printf("load config properties file OK, %s%n", file);
in.close();
}
}
// Handle the -p startup option (print the configuration and exit)
if (commandLine.hasOption('p')) {
InternalLogger console = InternalLoggerFactory.getLogger(LoggerName.NAMESRV_CONSOLE_NAME);
MixAll.printObjectProperties(console, namesrvConfig);
MixAll.printObjectProperties(console, nettyServerConfig);
System.exit(0);
}
MixAll.properties2Object(ServerUtil.commandLine2Properties(commandLine), namesrvConfig);
if (null == namesrvConfig.getRocketmqHome()) {
System.out.printf("Please set the %s variable in your environment to match the location of the RocketMQ installation%n", MixAll.ROCKETMQ_HOME_ENV);
System.exit(-2);
}
LoggerContext lc = (LoggerContext) LoggerFactory.getILoggerFactory();
JoranConfigurator configurator = new JoranConfigurator();
configurator.setContext(lc);
lc.reset();
configurator.doConfigure(namesrvConfig.getRocketmqHome() + "/conf/logback_namesrv.xml");
log = InternalLoggerFactory.getLogger(LoggerName.NAMESRV_LOGGER_NAME);
MixAll.printObjectProperties(log, namesrvConfig);
MixAll.printObjectProperties(log, nettyServerConfig);
final NamesrvController controller = new NamesrvController(namesrvConfig, nettyServerConfig);
// remember all configs to prevent discard
controller.getConfiguration().registerConfig(properties);
return controller;
}
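The precedence at work in createNamesrvController (class default, then the hard-coded 9876, then any value from the -c file) can be illustrated with a small reflection-based sketch. It is written only in the spirit of MixAll.properties2Object and is not a copy of it: PropertiesOverrideSketch, ServerConfig, and applyProperties are hypothetical names, and only int, long, boolean, and String setters are handled.

```java
import java.lang.reflect.Method;
import java.util.Properties;

// Hypothetical sketch, not RocketMQ code.
class PropertiesOverrideSketch {
    // Stand-in for NettyServerConfig; only one field is modeled.
    static class ServerConfig {
        private int listenPort = 8888;
        public int getListenPort() { return listenPort; }
        public void setListenPort(int listenPort) { this.listenPort = listenPort; }
    }

    // For each public one-argument setXxx method, apply property "xxx" if it is present.
    static void applyProperties(Properties props, Object target) {
        for (Method m : target.getClass().getMethods()) {
            String name = m.getName();
            if (!name.startsWith("set") || name.length() <= 3 || m.getParameterCount() != 1) {
                continue;
            }
            String key = Character.toLowerCase(name.charAt(3)) + name.substring(4);
            String value = props.getProperty(key);
            if (value == null) {
                continue; // property absent: keep the current value
            }
            Class<?> type = m.getParameterTypes()[0];
            Object arg;
            if (type == int.class) {
                arg = Integer.parseInt(value);
            } else if (type == long.class) {
                arg = Long.parseLong(value);
            } else if (type == boolean.class) {
                arg = Boolean.parseBoolean(value);
            } else if (type == String.class) {
                arg = value;
            } else {
                continue; // unsupported parameter type in this sketch
            }
            try {
                m.invoke(target, arg);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("cannot apply property " + key, e);
            }
        }
    }

    public static void main(String[] args) {
        ServerConfig config = new ServerConfig();
        config.setListenPort(9876);                  // startup code overrides the class default
        Properties fromFile = new Properties();
        fromFile.setProperty("listenPort", "9877");  // value a -c file might contain
        applyProperties(fromFile, config);           // file value wins over both defaults
        System.out.println(config.getListenPort());
    }
}
```

The ordering matters: because the properties are applied after setListenPort(9876), a listenPort entry in the configuration file takes effect, while a file without that entry leaves 9876 in place.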
The code above is the overall flow for creating the NameServer controller. Note that the controller is built from the NamesrvConfig and NettyServerConfig objects, so let's look at the important properties of these two configuration classes:
/**
* Key properties of NamesrvConfig
*/
private String rocketmqHome = System.getProperty(MixAll.ROCKETMQ_HOME_PROPERTY, System.getenv(MixAll.ROCKETMQ_HOME_ENV)); // RocketMQ home directory, read from a system property or, failing that, an environment variable
private String kvConfigPath = System.getProperty("user.home") + File.separator + "namesrv" + File.separator + "kvConfig.json"; // Path where the NameServer persists its KV configuration
private String configStorePath = System.getProperty("user.home") + File.separator + "namesrv" + File.separator + "namesrv.properties"; // Default path of the NameServer configuration file
private String productEnvName = "center";
private boolean clusterTest = false;
private boolean orderMessageEnable = false; // Whether ordered messages are supported; disabled by default
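/**
* Key properties of NettyServerConfig
*/
private int listenPort = 8888; // NameServer listen port; the class default is 8888, but the startup code overrides it with 9876
private int serverWorkerThreads = 8; // Number of threads in the worker thread pool
private int serverCallbackExecutorThreads = 0; // Number of threads in the callback-handling pool (for example, an asynchronous send supplies a callback)
private int serverSelectorThreads = 3; // Number of I/O threads, which mainly handle network requests and decode request packets
private int serverOnewaySemaphoreValue = 256; // Maximum concurrency for one-way message sends
private int serverAsyncSemaphoreValue = 64; // Maximum concurrency for asynchronous message sends
private int serverChannelMaxIdleTimeSeconds = 120; // Maximum idle time of a network connection, 120 s by default
private int serverSocketSndBufSize = NettySystemConfig.socketSndbufSize; // Socket send buffer size
private int serverSocketRcvBufSize = NettySystemConfig.socketRcvbufSize; // Socket receive buffer size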
private boolean serverPooledByteBufAllocatorEnable = true;
Next, the controller is created:
public NamesrvController(NamesrvConfig namesrvConfig, NettyServerConfig nettyServerConfig) {
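this.namesrvConfig = namesrvConfig; // Store the NameServer configuration object
this.nettyServerConfig = nettyServerConfig; // Store the Netty server configuration object
this.kvConfigManager = new KVConfigManager(this); // Create the KV configuration manager
this.routeInfoManager = new RouteInfoManager(); // Create the route manager (a core field, used to register and look up routing information)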
this.brokerHousekeepingService = new BrokerHousekeepingService(this);
this.configuration = new Configuration(
log,
this.namesrvConfig, this.nettyServerConfig
);
this.configuration.setStorePathFromConfig(this.namesrvConfig, "configStorePath");
}
Once the controller has been created, it still has to be initialized and started:
public static NamesrvController start(final NamesrvController controller) throws Exception {
if (null == controller) {
throw new IllegalArgumentException("NamesrvController is null");
}
boolean initResult = controller.initialize(); // Initialize the controller
if (!initResult) { // Initialization failed
controller.shutdown(); // Shut the controller down
System.exit(-3);
}
Runtime.getRuntime().addShutdownHook(new ShutdownHookThread(log, new Callable<Void>() { // Register a hook that runs when the JVM shuts down
@Override
public Void call() throws Exception {
controller.shutdown(); // When the JVM shuts down, shut the controller down too
return null;
}
}));
controller.start(); // Actually start the controller
return controller;
}
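The shutdown-hook pattern used in start() can be tried out in isolation with a few lines. The sketch below is not RocketMQ's ShutdownHookThread; RunOnceShutdownHook is a hypothetical class that wraps a Callable and, as an assumed design choice, guards it so the cleanup runs at most once even if run() is invoked repeatedly.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch, not RocketMQ code: a shutdown hook whose callback runs at most once.
class RunOnceShutdownHook extends Thread {
    private final AtomicBoolean executed = new AtomicBoolean(false);
    private final Callable<Void> callback;

    RunOnceShutdownHook(Callable<Void> callback) {
        super("ShutdownHookSketch");
        this.callback = callback;
    }

    @Override
    public void run() {
        // compareAndSet lets only the first invocation through.
        if (!executed.compareAndSet(false, true)) {
            return;
        }
        try {
            callback.call();
        } catch (Exception e) {
            System.err.println("shutdown callback failed: " + e);
        }
    }

    public static void main(String[] args) {
        // The JVM starts registered hook threads when it begins shutting down.
        Runtime.getRuntime().addShutdownHook(new RunOnceShutdownHook(() -> {
            System.out.println("releasing resources before JVM exit");
            return null;
        }));
        System.out.println("service running");
    }
}
```

Registering the hook before controller.start(), as the original code does, means resources are released even if the process is terminated shortly after startup.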
- Controller initialization:
public boolean initialize() {
this.kvConfigManager.load(); // Load the KV configuration
this.remotingServer = new NettyRemotingServer(this.nettyServerConfig, this.brokerHousekeepingService); // Create the Netty server object that handles network traffic
this.remotingExecutor = Executors.newFixedThreadPool(nettyServerConfig.getServerWorkerThreads(), new ThreadFactoryImpl("RemotingExecutorThread_"));
this.registerProcessor();
this.scheduledExecutorService.scheduleAtFixedRate(new Runnable() { // Key call: schedule a task that first runs after a 5 s delay and then every 10 s
@Override
public void run() {
NamesrvController.this.routeInfoManager.scanNotActiveBroker(); // The route manager scans for inactive Brokers and removes them
}
}, 5, 10, TimeUnit.SECONDS);
this.scheduledExecutorService.scheduleAtFixedRate(new Runnable() { // Schedule a second periodic task (first run after 1 minute, then every 10 minutes)
@Override
public void run() {
NamesrvController.this.kvConfigManager.printAllPeriodically(); // Print all KV configuration entries
}
}, 1, 10, TimeUnit.MINUTES);
if (TlsSystemConfig.tlsMode != TlsMode.DISABLED) {
// Register a listener to reload SslContext
try {
fileWatchService = new FileWatchService(
new String[] {
TlsSystemConfig.tlsServerCertPath,
TlsSystemConfig.tlsServerKeyPath,
TlsSystemConfig.tlsServerTrustCertPath
},
new FileWatchService.Listener() {
boolean certChanged, keyChanged = false;
@Override
public void onChanged(String path) {
if (path.equals(TlsSystemConfig.tlsServerTrustCertPath)) {
log.info("The trust certificate changed, reload the ssl context");
reloadServerSslContext();
}
if (path.equals(TlsSystemConfig.tlsServerCertPath)) {
certChanged = true;
}
if (path.equals(TlsSystemConfig.tlsServerKeyPath)) {
keyChanged = true;
}
if (certChanged && keyChanged) {
log.info("The certificate and private key changed, reload the ssl context");
certChanged = keyChanged = false;
reloadServerSslContext();
}
}
private void reloadServerSslContext() {
((NettyRemotingServer) remotingServer).loadSslContext();
}
});
} catch (Exception e) {
log.warn("FileWatchService created error, can't load the certificate dynamically");
}
}
return true;
}
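That is the full controller initialization logic. The call to focus on is NamesrvController.this.routeInfoManager.scanNotActiveBroker();, through which the NameServer uses the route manager to periodically scan for inactive Brokers and remove their information. This is the most essential function of the NameServer as RocketMQ's registry.
First, let's look at how the route manager is created and what its core fields are:
private final static long BROKER_CHANNEL_EXPIRED_TIME = 1000 * 60 * 2; // Expiry time (2 minutes) of the long-lived channel to a Broker
private final ReadWriteLock lock = new ReentrantReadWriteLock(); // Read-write lock
private final HashMap<String/* topic */, List<QueueData>> topicQueueTable; // Topic-to-queue routing information; message sends are load-balanced according to this table
private final HashMap<String/* brokerName */, BrokerData> brokerAddrTable; // Basic Broker information: broker name, owning cluster name, and master/slave Broker addresses
private final HashMap<String/* clusterName */, Set<String/* brokerName */>> clusterAddrTable; // Cluster information: the names of all Brokers in each cluster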
private final HashMap<String/* brokerAddr */, BrokerLiveInfo> brokerLiveTable; // Broker liveness information; refreshed each time the NameServer receives a heartbeat (registration) from a Broker
private final HashMap<String/* brokerAddr */, List<String>/* Filter Server */> filterServerTable; // FilterServer list of each Broker, used for class-based message filtering
public RouteInfoManager() { // Constructor; initializes the fields above
this.topicQueueTable = new HashMap<String, List<QueueData>>(1024);
this.brokerAddrTable = new HashMap<String, BrokerData>(128);
this.clusterAddrTable = new HashMap<String, Set<String>>(32);
this.brokerLiveTable = new HashMap<String, BrokerLiveInfo>(256);
this.filterServerTable = new HashMap<String, List<String>>(256);
}
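As the code shows, the route manager's core fields are all HashMap objects. To understand the route manager better, let's look at three important value types.
A topic contains multiple message queues. By default, a Broker creates 4 read queues and 4 write queues for each topic, so the value type of topicQueueTable is a list:
/**
* Message queue data
*/
public class QueueData implements Comparable<QueueData> {
private String brokerName; // Name of the Broker the queues belong to
private int readQueueNums; // Number of read queues
private int writeQueueNums; // Number of write queues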
private int perm;
private int topicSysFlag;
...
}
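Multiple Brokers form a cluster. Within a cluster, Brokers that share the same brokerName form a Master-Slave group: a brokerId of 0 denotes the Master, and a brokerId greater than 0 denotes a Slave:
/**
* Broker
*/
public class BrokerData implements Comparable<BrokerData> {
private String cluster; // Name of the cluster the Broker belongs to
private String brokerName; // Broker name
private HashMap<Long/* brokerId */, String/* broker address */> brokerAddrs; // Map from brokerId to address for the nodes that share this brokerName (several nodes can share one name, which provides high availability)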
...
}
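BrokerLiveInfo holds the liveness state of a Broker:
/**
* Broker liveness information
*/
class BrokerLiveInfo {
private long lastUpdateTimestamp; // Timestamp of the most recent update
private DataVersion dataVersion; // Data version
private Channel channel; // Long-lived connection channel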
private String haServerAddr;
...
}
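How these tables fit together is easiest to see in a lookup. The sketch below is a simplified assumption, not RouteInfoManager itself: RouteLookupSketch is a hypothetical class, QueueData is reduced to just its brokerName, and BrokerData to its brokerId-to-address map. It resolves a topic to the Master addresses of the Brokers that host it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not RocketMQ code: a reduced view of two of the route tables.
class RouteLookupSketch {
    static final long MASTER_ID = 0L; // brokerId 0 denotes the Master

    // topic -> names of the Brokers hosting queues for it (QueueData reduced to brokerName)
    final Map<String, List<String>> topicQueueTable = new HashMap<>();
    // brokerName -> (brokerId -> address) (BrokerData reduced to its address map)
    final Map<String, Map<Long, String>> brokerAddrTable = new HashMap<>();

    // Resolve a topic to the Master address of every Broker group that serves it.
    List<String> masterAddrsForTopic(String topic) {
        List<String> result = new ArrayList<>();
        List<String> brokerNames = topicQueueTable.get(topic);
        if (brokerNames == null) {
            return result; // unknown topic: no route
        }
        for (String brokerName : brokerNames) {
            Map<Long, String> addrs = brokerAddrTable.get(brokerName);
            if (addrs != null && addrs.containsKey(MASTER_ID)) {
                result.add(addrs.get(MASTER_ID));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        RouteLookupSketch routes = new RouteLookupSketch();
        Map<Long, String> brokerA = new HashMap<>();
        brokerA.put(0L, "10.0.0.1:10911"); // master (made-up address)
        brokerA.put(1L, "10.0.0.2:10911"); // slave (made-up address)
        routes.brokerAddrTable.put("broker-a", brokerA);
        List<String> hosts = new ArrayList<>();
        hosts.add("broker-a");
        routes.topicQueueTable.put("TopicTest", hosts);
        System.out.println(routes.masterAddrsForTopic("TopicTest"));
    }
}
```

The two-step indirection (topic to brokerName, then brokerName to addresses) is why removing a dead Broker later requires touching several tables rather than one.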
With this understanding of the route manager's core fields, let's return to how the NameServer uses it to periodically scan for and remove inactive Brokers:
public void scanNotActiveBroker() {
Iterator<Entry<String, BrokerLiveInfo>> it = this.brokerLiveTable.entrySet().iterator();
while (it.hasNext()) { // Iterate over the liveness information of every Broker
Entry<String, BrokerLiveInfo> next = it.next();
long last = next.getValue().getLastUpdateTimestamp();
if ((last + BROKER_CHANNEL_EXPIRED_TIME) < System.currentTimeMillis()) { // Last update timestamp + channel expiry time < current time means this Broker has not contacted the NameServer within the allowed window, so it is treated as failed and its information must be removed
RemotingUtil.closeChannel(next.getValue().getChannel()); // Close the long-lived connection to this Broker
it.remove(); // Remove this Broker's liveness information
log.warn("The broker channel expired, {} {}ms", next.getKey(), BROKER_CHANNEL_EXPIRED_TIME);
this.onChannelDestroy(next.getKey(), next.getValue().getChannel()); // Remove this Broker's information from the other tables
}
}
}
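The expiry check is easier to reason about when the clock is passed in as a parameter. The following sketch makes that assumption: ExpiryScanSketch is a hypothetical class, BrokerLiveInfo is reduced to its last-update timestamp, and the method returns the removed addresses instead of closing channels.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not RocketMQ code: the scan loop with an injected clock.
class ExpiryScanSketch {
    static final long EXPIRED_MILLIS = 1000 * 60 * 2; // same 2-minute window as above

    // lastUpdate maps broker address -> timestamp of its last heartbeat.
    // Returns the addresses that were removed because they expired.
    static List<String> scan(Map<String, Long> lastUpdate, long nowMillis) {
        List<String> removed = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = lastUpdate.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            // Strict comparison: an entry exactly at the limit is still considered alive.
            if (entry.getValue() + EXPIRED_MILLIS < nowMillis) {
                removed.add(entry.getKey());
                it.remove(); // removing through the iterator avoids ConcurrentModificationException
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        Map<String, Long> table = new HashMap<>();
        long now = System.currentTimeMillis();
        table.put("10.0.0.1:10911", now - 3 * 60 * 1000); // silent for 3 minutes (made-up address)
        table.put("10.0.0.2:10911", now - 30 * 1000);     // heartbeat 30 seconds ago
        System.out.println("expired: " + scan(table, now));
        System.out.println("still alive: " + table.keySet());
    }
}
```

With a scan every 10 seconds and a 2-minute window, a dead Broker can remain in the tables for up to roughly 130 seconds unless its connection is closed first.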
public void onChannelDestroy(String remoteAddr, Channel channel) {
String brokerAddrFound = null;
if (channel != null) {
try {
try {
this.lock.readLock().lockInterruptibly(); // Acquire the read lock
Iterator<Entry<String, BrokerLiveInfo>> itBrokerLiveTable = this.brokerLiveTable.entrySet().iterator();
while (itBrokerLiveTable.hasNext()) { // Iterate over the liveness information of every Broker
Entry<String, BrokerLiveInfo> entry = itBrokerLiveTable.next();
if (entry.getValue().getChannel() == channel) { // The liveness entry of the Broker being removed still exists (not yet deleted)
brokerAddrFound = entry.getKey(); // So take the Broker address from that entry
break;
}
}
} finally {
this.lock.readLock().unlock(); // Release the read lock
}
} catch (Exception e) {
log.error("onChannelDestroy Exception", e);
}
}
if (null == brokerAddrFound) { // Null means the Broker's liveness entry has already been removed
brokerAddrFound = remoteAddr; // Fall back to the Broker address passed in as a parameter
} else {
log.info("the broker's channel destroyed, {}, clean it's data structure at once", brokerAddrFound);
}
if (brokerAddrFound != null && brokerAddrFound.length() > 0) {
try {
try {
this.lock.writeLock().lockInterruptibly(); // Acquire the write lock so the removals are thread-safe
this.brokerLiveTable.remove(brokerAddrFound); // Make sure the Broker's liveness information is removed
this.filterServerTable.remove(brokerAddrFound); // Remove the FilterServer list of this Broker
String brokerNameFound = null;
boolean removeBrokerName = false;
Iterator<Entry<String, BrokerData>> itBrokerAddrTable = this.brokerAddrTable.entrySet().iterator();
while (itBrokerAddrTable.hasNext() && (null == brokerNameFound)) { // Iterate over all Brokers
BrokerData brokerData = itBrokerAddrTable.next().getValue();
Iterator<Entry<Long, String>> it = brokerData.getBrokerAddrs().entrySet().iterator();
while (it.hasNext()) { // Iterate over the nodes that share this Broker name
Entry<Long, String> entry = it.next();
Long brokerId = entry.getKey();
String brokerAddr = entry.getValue();
if (brokerAddr.equals(brokerAddrFound)) { // Found the Broker address to remove
brokerNameFound = brokerData.getBrokerName(); // Record the name of that Broker
it.remove(); // Remove the address from the set of same-named Broker nodes
log.info("remove brokerAddr[{}, {}] from brokerAddrTable, because channel destroyed",
brokerId, brokerAddr);
break;
}
}
if (brokerData.getBrokerAddrs().isEmpty()) { // No addresses left: every node with this Broker name is down, so the name itself is no longer needed
removeBrokerName = true; // Flag that this Broker name must be removed
itBrokerAddrTable.remove(); // Remove the Broker from the address table
log.info("remove brokerName[{}] from brokerAddrTable, because channel destroyed", brokerData.getBrokerName());
}
}
if (brokerNameFound != null && removeBrokerName) { // The Broker name is being removed, so its entries in the other tables must go as well
Iterator<Entry<String, Set<String>>> it = this.clusterAddrTable.entrySet().iterator();
while (it.hasNext()) { // Iterate over all clusters
Entry<String, Set<String>> entry = it.next();
String clusterName = entry.getKey();
Set<String> brokerNames = entry.getValue();
boolean removed = brokerNames.remove(brokerNameFound); // Remove the Broker name from the cluster
if (removed) {
log.info("remove brokerName[{}], clusterName[{}] from clusterAddrTable, because channel destroyed", brokerNameFound, clusterName);
if (brokerNames.isEmpty()) { // No Broker names left: the cluster has no Broker nodes any more, so the cluster entry is no longer needed
log.info("remove the clusterName[{}] from clusterAddrTable, because channel destroyed and no broker in this cluster", clusterName);
it.remove(); // Remove the cluster entry
}
break;
}
}
}
if (removeBrokerName) {
Iterator<Entry<String, List<QueueData>>> itTopicQueueTable = this.topicQueueTable.entrySet().iterator();
while (itTopicQueueTable.hasNext()) { // Iterate over all topics
Entry<String, List<QueueData>> entry = itTopicQueueTable.next();
String topic = entry.getKey(); // Topic name
List<QueueData> queueDataList = entry.getValue(); // All queue data under this topic
Iterator<QueueData> itQueueData = queueDataList.iterator();
while (itQueueData.hasNext()) { // Iterate over all queue data under this topic
QueueData queueData = itQueueData.next();
if (queueData.getBrokerName().equals(brokerNameFound)) { // The queue data belongs to the Broker being removed
itQueueData.remove(); // Remove it from the list
log.info("remove topic[{} {}], from topicQueueTable, because channel destroyed", topic, queueData);
}
}
if (queueDataList.isEmpty()) { // The list is empty, so the topic entry is no longer needed
itTopicQueueTable.remove(); // Remove the topic entry
log.info("remove topic[{}] all queue, from topicQueueTable, because channel destroyed", topic);
}
}
}
} finally {
this.lock.writeLock().unlock(); // Release the write lock
}
} catch (Exception e) {
log.error("onChannelDestroy Exception", e);
}
}
}
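The cascade in onChannelDestroy (address, then Broker name, then cluster and topic entries) can be condensed into a short sketch. It is a simplification under stated assumptions: CascadeCleanupSketch is a hypothetical class, the value types are reduced to plain collections of names and addresses, logging is omitted, and the blocking lock() call is used instead of lockInterruptibly().

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch, not RocketMQ code: cascading removal across the route tables.
class CascadeCleanupSketch {
    final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    final Map<String, Map<Long, String>> brokerAddrTable = new HashMap<>(); // brokerName -> (brokerId -> addr)
    final Map<String, Set<String>> clusterAddrTable = new HashMap<>();      // clusterName -> brokerNames
    final Map<String, List<String>> topicQueueTable = new HashMap<>();      // topic -> brokerNames

    void removeBrokerAddr(String addr) {
        lock.writeLock().lock(); // all tables are modified under one write lock
        try {
            String emptiedBroker = null;
            Iterator<Map.Entry<String, Map<Long, String>>> it = brokerAddrTable.entrySet().iterator();
            while (it.hasNext()) {
                Map.Entry<String, Map<Long, String>> entry = it.next();
                // values().remove drops one mapping whose value equals addr, if any.
                if (entry.getValue().values().remove(addr)) {
                    if (entry.getValue().isEmpty()) {
                        emptiedBroker = entry.getKey(); // last node of this Broker name is gone
                        it.remove();
                    }
                    break;
                }
            }
            if (emptiedBroker == null) {
                return; // other nodes with the same name are still alive; keep cluster and topic data
            }
            final String name = emptiedBroker;
            clusterAddrTable.values().forEach(names -> names.remove(name));
            clusterAddrTable.values().removeIf(Set::isEmpty);   // drop clusters left without Brokers
            topicQueueTable.values().forEach(names -> names.removeIf(name::equals));
            topicQueueTable.values().removeIf(List::isEmpty);   // drop topics left without queues
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

The key design point mirrored here is that cluster and topic entries are only touched once the last address under a Broker name disappears, so losing a Slave alone leaves the topic routes intact.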
This concludes the walkthrough of the NameServer's core source code.