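1. How namesrv handles broker heartbeat packets
After a Broker sends its route-registration heartbeat packet, the NameServer handles it according to the requestCode carried in the packet. The NameServer's default network processor is DefaultRequestProcessor; the relevant code is as follows: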
public RemotingCommand processRequest(ChannelHandlerContext ctx,
RemotingCommand request) throws RemotingCommandException {
//Core of NameServer request handling: each request type takes a different path
switch (request.getCode()) {
//...
//Broker registration request
case RequestCode.REGISTER_BROKER:
Version brokerVersion = MQVersion.value2Version(request.getVersion());
if (brokerVersion.ordinal() >= MQVersion.Version.V3_0_11.ordinal()) {
return this.registerBrokerWithFilterServer(ctx, request);
} else {
//The method that actually registers the Broker
return this.registerBroker(ctx, request);
}
//...
default:
break;
}
return null;
}
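The dispatch above can be condensed into a small sketch. This is not RocketMQ code: SketchRequest, SketchProcessor, and both constants are hypothetical names and values chosen for illustration, and the handlers simply return the name of the method that the real processor would call.

```java
/** Simplified stand-in for a request; an assumption, not RocketMQ's RemotingCommand. */
final class SketchRequest {
    final int code;
    final int version;

    SketchRequest(int code, int version) {
        this.code = code;
        this.version = version;
    }
}

final class SketchProcessor {
    // Hypothetical values, used only to make the sketch runnable.
    static final int REGISTER_BROKER = 1;
    static final int FILTER_SERVER_MIN_VERSION = 11;

    /** Returns the name of the handler that would be chosen, or null for unknown codes. */
    static String process(SketchRequest request) {
        switch (request.code) {
            case REGISTER_BROKER:
                // Newer brokers take the filter-server-aware path, older ones the plain path.
                if (request.version >= FILTER_SERVER_MIN_VERSION) {
                    return "registerBrokerWithFilterServer";
                }
                return "registerBroker";
            default:
                return null;
        }
    }
}
```

For example, SketchProcessor.process(new SketchRequest(SketchProcessor.REGISTER_BROKER, 11)) selects the filter-server path, while an unrecognized code falls through to null, mirroring the default branch above.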
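The core registration logic is implemented by RouteInfoManager#registerBroker. The core code is as follows:
//RouteInfoManager is the core component that manages the routing information.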
RegisterBrokerResult result = this.namesrvController.getRouteInfoManager().registerBroker(
requestHeader.getClusterName(),
requestHeader.getBrokerAddr(),
requestHeader.getBrokerName(),
requestHeader.getBrokerId(),
requestHeader.getHaServerAddr(),
topicConfigWrapper,
null,
ctx.channel()
);
//Register the Broker
public RegisterBrokerResult registerBroker(
final String clusterName,
final String brokerAddr,
final String brokerName,
final long brokerId,
final String haServerAddr,
final TopicConfigSerializeWrapper topicConfigWrapper,
final List<String> filterServerList,
final Channel channel) {
RegisterBrokerResult result = new RegisterBrokerResult();
try {
try {
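//Take the write lock so that only one thread can write at a time.
this.lock.writeLock().lockInterruptibly();
//Broker names of this cluster, kept in a Set so duplicates are dropped automatically.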
Set<String> brokerNames = this.clusterAddrTable.get(clusterName);
if (null == brokerNames) {
brokerNames = new HashSet<String>();
this.clusterAddrTable.put(clusterName, brokerNames);
}
brokerNames.add(brokerName);
boolean registerFirst = false;
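//Look up the data by Broker name. brokerAddrTable is the core routing table.
BrokerData brokerData = this.brokerAddrTable.get(brokerName);
//On the first registration brokerData is null. Later heartbeat registrations reuse the existing entry instead of creating it again.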
if (null == brokerData) {
registerFirst = true;
brokerData = new BrokerData(clusterName, brokerName, new HashMap<Long, String>());
this.brokerAddrTable.put(brokerName, brokerData);
}
//Get the brokerId -> address map held by this brokerData
Map<Long, String> brokerAddrsMap = brokerData.getBrokerAddrs();
//Switch slave to master: first remove <1, IP:PORT> in namesrv, then add <0, IP:PORT>
//The same IP:PORT must only have one record in brokerAddrTable
Iterator<Entry<Long, String>> it = brokerAddrsMap.entrySet().iterator();
while (it.hasNext()) {
Entry<Long, String> item = it.next();
if (null != brokerAddr && brokerAddr.equals(item.getValue()) && brokerId != item.getKey()) {
it.remove();
}
}
String oldAddr = brokerData.getBrokerAddrs().put(brokerId, brokerAddr);
registerFirst = registerFirst || (null == oldAddr);
if (null != topicConfigWrapper
&& MixAll.MASTER_ID == brokerId) {
if (this.isBrokerTopicConfigChanged(brokerAddr, topicConfigWrapper.getDataVersion())
|| registerFirst) {
ConcurrentMap<String, TopicConfig> tcTable =
topicConfigWrapper.getTopicConfigTable();
if (tcTable != null) {
for (Map.Entry<String, TopicConfig> entry : tcTable.entrySet()) {
this.createAndUpdateQueueData(brokerName, entry.getValue());
}
}
}
}
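//On every heartbeat registration (every 30 seconds) a new BrokerLiveInfo is built, overwriting the previous one.
//The BrokerLiveInfo also stores the current timestamp, which represents the time of the latest heartbeat.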
BrokerLiveInfo prevBrokerLiveInfo = this.brokerLiveTable.put(brokerAddr,
new BrokerLiveInfo(
System.currentTimeMillis(),
topicConfigWrapper.getDataVersion(),
channel,
haServerAddr));
if (null == prevBrokerLiveInfo) {
log.info("new broker registered, {} HAServer: {}", brokerAddr, haServerAddr);
}
if (filterServerList != null) {
if (filterServerList.isEmpty()) {
this.filterServerTable.remove(brokerAddr);
} else {
this.filterServerTable.put(brokerAddr, filterServerList);
}
}
if (MixAll.MASTER_ID != brokerId) {
String masterAddr = brokerData.getBrokerAddrs().get(MixAll.MASTER_ID);
if (masterAddr != null) {
BrokerLiveInfo brokerLiveInfo = this.brokerLiveTable.get(masterAddr);
if (brokerLiveInfo != null) {
result.setHaServerAddr(brokerLiveInfo.getHaServerAddr());
result.setMasterAddr(masterAddr);
}
}
}
} finally {
this.lock.writeLock().unlock();
}
} catch (Exception e) {
log.error("registerBroker Exception", e);
}
return result;
}
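From the source we can see that namesrv handles a broker's heartbeat packet mainly by maintaining the field private final HashMap<String/* brokerName */, BrokerData> brokerAddrTable; registration amounts to updating this table. The same method also updates clusterAddrTable, brokerLiveTable, and filterServerTable along the way.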
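To make the table updates easier to follow, here is a minimal sketch of the registration path under simplifying assumptions: RouteTableSketch and its methods are hypothetical, BrokerData is reduced to a plain brokerId -> address map, BrokerLiveInfo is reduced to a timestamp, and topic configuration and filter servers are left out.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class RouteTableSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    // clusterName -> broker names
    private final Map<String, Set<String>> clusterAddrTable = new HashMap<>();
    // brokerName -> (brokerId -> address)
    private final Map<String, Map<Long, String>> brokerAddrTable = new HashMap<>();
    // address -> time of the latest heartbeat
    private final Map<String, Long> brokerLiveTable = new HashMap<>();

    /** Returns true when no address was registered under this brokerId before this call. */
    boolean register(String cluster, String brokerName, long brokerId, String addr, long nowMillis) {
        lock.writeLock().lock();
        try {
            clusterAddrTable.computeIfAbsent(cluster, k -> new HashSet<>()).add(brokerName);
            Map<Long, String> addrs = brokerAddrTable.computeIfAbsent(brokerName, k -> new HashMap<>());
            // Slave-to-master switch: the same address may appear under one brokerId only.
            addrs.entrySet().removeIf(e -> addr.equals(e.getValue()) && e.getKey() != brokerId);
            String oldAddr = addrs.put(brokerId, addr);
            // Every heartbeat overwrites the previous liveness record.
            brokerLiveTable.put(addr, nowMillis);
            return oldAddr == null;
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Returns a copy of the brokerId -> address map, empty if the broker is unknown. */
    Map<Long, String> addressesOf(String brokerName) {
        lock.readLock().lock();
        try {
            Map<Long, String> addrs = brokerAddrTable.get(brokerName);
            return addrs == null ? new HashMap<>() : new HashMap<>(addrs);
        } finally {
            lock.readLock().unlock();
        }
    }

    /** Returns the latest heartbeat time for the address, or null if it is not live. */
    Long lastHeartbeat(String addr) {
        lock.readLock().lock();
        try {
            return brokerLiveTable.get(addr);
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

Registering the same address first with brokerId 1 and later with brokerId 0 leaves a single entry under 0, which is the "remove <1, IP:PORT>, then add <0, IP:PORT>" behaviour described in the source comments.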
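2. How namesrv evicts inactive brokers
Earlier, when looking at namesrv initialization, we saw the scheduled task that evicts invalid brokers: after an initial delay of 5s, it scans the Brokers every 10s and removes the inactive ones.
//Start the scheduled task: scan the Brokers every 10s and remove inactive ones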
this.scheduledExecutorService.scheduleAtFixedRate(new Runnable() {
@Override
public void run() {
NamesrvController.this.routeInfoManager.scanNotActiveBroker();
}
}, 5, 10, TimeUnit.SECONDS);
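The scheduling call above can be tried in isolation with the sketch below. PeriodicScanSketch and runPeriodically are hypothetical names, and the millisecond delays are assumptions chosen so that the example finishes quickly; only ScheduledExecutorService#scheduleAtFixedRate itself is the same API the listing uses.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class PeriodicScanSketch {
    /**
     * Runs the task at a fixed rate and returns true if it completed
     * at least `runs` executions before the timeout elapsed.
     */
    static boolean runPeriodically(Runnable task, long initialDelayMs, long periodMs,
                                   int runs, long timeoutMs) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch latch = new CountDownLatch(runs);
        try {
            scheduler.scheduleAtFixedRate(() -> {
                task.run();
                latch.countDown();
            }, initialDelayMs, periodMs, TimeUnit.MILLISECONDS);
            return latch.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            // Restore the interrupt flag and report that the runs did not complete.
            Thread.currentThread().interrupt();
            return false;
        } finally {
            scheduler.shutdownNow();
        }
    }
}
```

One detail worth keeping in mind: scheduleAtFixedRate suppresses all later executions once a run throws, so a periodic scan is usually written so that exceptions do not escape the task.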
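Tracing the source, we can see that if namesrv receives no heartbeat request from a broker within two minutes, it treats that broker as inactive:
//Scan for inactive Brokers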
public void scanNotActiveBroker() {
//The scan runs over brokerLiveTable, the table of live brokers. The broker-name tables are cleaned up afterwards in onChannelDestroy.
Iterator<Entry<String, BrokerLiveInfo>> it = this.brokerLiveTable.entrySet().iterator();
while (it.hasNext()) {
Entry<String, BrokerLiveInfo> next = it.next();
long last = next.getValue().getLastUpdateTimestamp();
//Core liveness check based on the last heartbeat time. The timeout is two minutes.
if ((last + BROKER_CHANNEL_EXPIRED_TIME) < System.currentTimeMillis()) {
RemotingUtil.closeChannel(next.getValue().getChannel());
it.remove();
log.warn("The broker channel expired, {} {}ms", next.getKey(), BROKER_CHANNEL_EXPIRED_TIME);
this.onChannelDestroy(next.getKey(), next.getValue().getChannel());
}
}
}
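The expiry check can be isolated in a small sketch that takes the current time as a parameter, which makes the boundary easy to test. LivenessScanSketch is a hypothetical class, BrokerLiveInfo is reduced to a timestamp, and the two-minute constant is taken from the description above rather than from the RocketMQ source.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

final class LivenessScanSketch {
    // Two minutes, as stated in the text; assumed here rather than read from RocketMQ.
    static final long EXPIRY_MILLIS = 2 * 60 * 1000L;

    // broker address -> time of the latest heartbeat
    private final Map<String, Long> lastHeartbeat = new HashMap<>();

    void heartbeat(String addr, long nowMillis) {
        lastHeartbeat.put(addr, nowMillis);
    }

    boolean isLive(String addr) {
        return lastHeartbeat.containsKey(addr);
    }

    /** Removes every broker whose last heartbeat is older than the expiry and returns their addresses. */
    List<String> scan(long nowMillis) {
        List<String> evicted = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = lastHeartbeat.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            // Same strict comparison as the listing: last + expiry < now.
            if (entry.getValue() + EXPIRY_MILLIS < nowMillis) {
                String addr = entry.getKey();
                it.remove();
                evicted.add(addr);
            }
        }
        return evicted;
    }
}
```

Because the scan runs every 10s and the comparison is strict, a broker that stops sending heartbeats can stay in the table for roughly two minutes plus up to one scan interval before it is evicted.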
The core eviction logic is as follows:
public void onChannelDestroy(String remoteAddr, Channel channel) {
String brokerAddrFound = null;
// ---------------Step 1 start------------------------------
// Use the channel to find the matching broker address in the current live table
if (channel != null) {
try {
try {
this.lock.readLock().lockInterruptibly();
Iterator<Entry<String, BrokerLiveInfo>> itBrokerLiveTable =
this.brokerLiveTable.entrySet().iterator();
while (itBrokerLiveTable.hasNext()) {
Entry<String, BrokerLiveInfo> entry = itBrokerLiveTable.next();
if (entry.getValue().getChannel() == channel) {
brokerAddrFound = entry.getKey();
break;
}
}
} finally {
this.lock.readLock().unlock();
}
} catch (Exception e) {
log.error("onChannelDestroy Exception", e);
}
}
//If the broker is no longer in the live table, fall back to the remoteAddr argument
if (null == brokerAddrFound) {
brokerAddrFound = remoteAddr;
} else {
log.info("the broker's channel destroyed, {}, clean it's data structure at once", brokerAddrFound);
}
// ---------------Step 1 end------------------------------
if (brokerAddrFound != null && brokerAddrFound.length() > 0) {
try {
try {
// ---------------Step 2 start------------------------------
//Remove the broker from both brokerLiveTable and filterServerTable.
this.lock.writeLock().lockInterruptibly();
this.brokerLiveTable.remove(brokerAddrFound);
this.filterServerTable.remove(brokerAddrFound);
String brokerNameFound = null;
boolean removeBrokerName = false;
Iterator<Entry<String, BrokerData>> itBrokerAddrTable =
this.brokerAddrTable.entrySet().iterator();
//Then iterate over brokerAddrTable, find the brokerData that holds this broker address,
//and remove the address from that brokerData. If the removal leaves
//the brokerData without any broker address,
// remove the whole brokerData entry
while (itBrokerAddrTable.hasNext() && (null == brokerNameFound)) {
BrokerData brokerData = itBrokerAddrTable.next().getValue();
Iterator<Entry<Long, String>> it = brokerData.getBrokerAddrs().entrySet().iterator();
while (it.hasNext()) {
Entry<Long, String> entry = it.next();
Long brokerId = entry.getKey();
String brokerAddr = entry.getValue();
if (brokerAddr.equals(brokerAddrFound)) {
brokerNameFound = brokerData.getBrokerName();
it.remove();
log.info("remove brokerAddr[{}, {}] from brokerAddrTable, because channel destroyed",
brokerId, brokerAddr);
break;
}
}
if (brokerData.getBrokerAddrs().isEmpty()) {
removeBrokerName = true;
itBrokerAddrTable.remove();
log.info("remove brokerName[{}] from brokerAddrTable, because channel destroyed",
brokerData.getBrokerName());
}
}
// ---------------Step 2 end------------------------------
if (brokerNameFound != null && removeBrokerName) {
// ---------------Step 3 start------------------------------
Iterator<Entry<String, Set<String>>> it = this.clusterAddrTable.entrySet().iterator();
//Iterate over clusterAddrTable and, using the brokerName found in step 2,
//remove that brokerName from its cluster.
//If the set is empty after the removal, remove the whole cluster from clusterAddrTable.
while (it.hasNext()) {
Entry<String, Set<String>> entry = it.next();
String clusterName = entry.getKey();
Set<String> brokerNames = entry.getValue();
boolean removed = brokerNames.remove(brokerNameFound);
if (removed) {
log.info("remove brokerName[{}], clusterName[{}] from clusterAddrTable, because channel destroyed",
brokerNameFound, clusterName);
if (brokerNames.isEmpty()) {
log.info("remove the clusterName[{}] from clusterAddrTable, because channel destroyed and no broker in this cluster",
clusterName);
it.remove();
}
break;
}
}
}
// ---------------Step 3 end------------------------------
if (removeBrokerName) {
// ---------------Step 4 start------------------------------
Iterator<Entry<String, List<QueueData>>> itTopicQueueTable =
this.topicQueueTable.entrySet().iterator();
while (itTopicQueueTable.hasNext()) {
//Iterate over topicQueueTable and, by brokerName,
//remove that Broker's queue data from each Topic.
//If the Broker being removed was the Topic's only Broker, remove the Topic from the table as well.
Entry<String, List<QueueData>> entry = itTopicQueueTable.next();
String topic = entry.getKey();
List<QueueData> queueDataList = entry.getValue();
Iterator<QueueData> itQueueData = queueDataList.iterator();
while (itQueueData.hasNext()) {
QueueData queueData = itQueueData.next();
if (queueData.getBrokerName().equals(brokerNameFound)) {
itQueueData.remove();
log.info("remove topic[{} {}], from topicQueueTable, because channel destroyed",
topic, queueData);
}
}
if (queueDataList.isEmpty()) {
itTopicQueueTable.remove();
log.info("remove topic[{}] all queue, from topicQueueTable, because channel destroyed",
topic);
}
}
}
// ---------------Step 4 end------------------------------
} finally {
this.lock.writeLock().unlock();
}
} catch (Exception e) {
log.error("onChannelDestroy Exception", e);
}
}
}
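The four steps above can be condensed into the following sketch. It is a simplified model rather than the RocketMQ implementation: RouteEvictionSketch is a hypothetical class, QueueData is reduced to a broker name, filterServerTable is left out, and locking is omitted to keep the cascade visible.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class RouteEvictionSketch {
    final Map<String, Long> brokerLiveTable = new HashMap<>();              // address -> last heartbeat
    final Map<String, Map<Long, String>> brokerAddrTable = new HashMap<>(); // brokerName -> (brokerId -> address)
    final Map<String, Set<String>> clusterAddrTable = new HashMap<>();      // clusterName -> broker names
    final Map<String, List<String>> topicQueueTable = new HashMap<>();      // topic -> broker names serving it

    void add(String cluster, String brokerName, long brokerId, String addr, String topic) {
        brokerLiveTable.put(addr, 0L);
        brokerAddrTable.computeIfAbsent(brokerName, k -> new HashMap<>()).put(brokerId, addr);
        clusterAddrTable.computeIfAbsent(cluster, k -> new HashSet<>()).add(brokerName);
        List<String> brokers = topicQueueTable.computeIfAbsent(topic, k -> new ArrayList<>());
        if (!brokers.contains(brokerName)) {
            brokers.add(brokerName);
        }
    }

    void evict(String addr) {
        // Live-table cleanup: drop the liveness record.
        brokerLiveTable.remove(addr);
        // Address-table cleanup: drop the address, and the broker name if nothing is left.
        String emptiedBroker = null;
        Iterator<Map.Entry<String, Map<Long, String>>> it = brokerAddrTable.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Map<Long, String>> entry = it.next();
            if (entry.getValue().values().remove(addr)) {
                if (entry.getValue().isEmpty()) {
                    emptiedBroker = entry.getKey();
                    it.remove();
                }
                break;
            }
        }
        if (emptiedBroker == null) {
            return; // other addresses still serve this broker name
        }
        final String name = emptiedBroker;
        // Cluster cleanup: drop the broker name, then any cluster left empty.
        clusterAddrTable.values().forEach(names -> names.remove(name));
        clusterAddrTable.values().removeIf(Set::isEmpty);
        // Topic cleanup: drop the broker from every topic, then any topic left empty.
        topicQueueTable.values().forEach(brokers -> brokers.remove(name));
        topicQueueTable.values().removeIf(List::isEmpty);
    }
}
```

The point the sketch tries to show is the ordering: cluster and topic entries are touched only once the last address of a broker name has gone, so evicting a slave while its master is still registered leaves the topic routes untouched.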
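As the source of these two namesrv features shows, the overall logic of broker registration and route eviction is fairly simple: it just manipulates the data structures that hold the route metadata. It is worth understanding those data structures first (I covered them in the namesrv introduction on my blog), since that makes the source much easier to follow and analyze.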