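1. Key APIs and how they work
All instance-related requests are handled by InstanceController. This class is the API handler for service instances; it processes heartbeat, registration, deregistration, and similar requests.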
- Register new instance.
- Deregister instances.
- Update instance.
- Create a beat for instance.
- Batch update instance's metadata. old key exist = update, old key not exist = add.
- Batch delete instance's metadata. old key exist = delete, old key not exist = not operate
- Patch instance.
- Get all instance of input service.
- Get detail information of specified instance.
- .......
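Handling registration requests (Register new instance.)
The Nacos registry functionality lives in the naming submodule, and registration enters through InstanceController.register(). The method first parses the instance object sent by the client, then calls the serviceManager component to register the instance.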
@CanDistro
@PostMapping
@Secured(parser = NamingResourceParser.class, action = ActionTypes.WRITE)
public String register(HttpServletRequest request) throws Exception {
    // read the namespace ID from the request, falling back to the default namespace
    final String namespaceId = WebUtils
            .optional(request, CommonParams.NAMESPACE_ID, Constants.DEFAULT_NAMESPACE_ID);
    // read the required service name from the request
    final String serviceName = WebUtils.required(request, CommonParams.SERVICE_NAME);
    // check that serviceName is well formed
    NamingUtils.checkServiceNameFormat(serviceName);
    // build the instance from the request parameters
    final Instance instance = parseInstance(request);
    // todo write the instance into the registry
    serviceManager.registerInstance(namespaceId, serviceName, instance);
    return "ok";
}
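The optional/required parameter handling in register() can be sketched in isolation. This is a minimal illustration, not the Nacos implementation: RequestParams and its Map-based signature are hypothetical stand-ins for WebUtils and HttpServletRequest, and the exact treatment of blank values is an assumption.

```java
import java.util.Map;

// Hypothetical stand-in for the WebUtils helpers used by register(); not Nacos code.
final class RequestParams {
    private RequestParams() {
    }

    // Returns the parameter if present, otherwise the supplied default.
    static String optional(Map<String, String> params, String key, String defaultValue) {
        String value = params.get(key);
        return value == null ? defaultValue : value;
    }

    // Returns the parameter, or fails fast when the caller omitted it (assumed behavior).
    static String required(Map<String, String> params, String key) {
        String value = params.get(key);
        if (value == null || value.isEmpty()) {
            throw new IllegalArgumentException("Param '" + key + "' is required.");
        }
        return value;
    }
}
```

With this shape, a missing namespaceId quietly falls back to a default, while a missing serviceName aborts the request before anything touches the registry.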
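ServiceManager is the core component of the registry; it manages service registration, deregistration, fetching the service list, and so on. Next, look at ServiceManager.registerInstance().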
public void registerInstance(String namespaceId, String serviceName, Instance instance) throws NacosException {
    // create and initialize an empty service object
    // isEphemeral() gives the instance type; true means an ephemeral instance
    createEmptyService(namespaceId, serviceName, instance.isEphemeral());
    // get the service from the registry
    Service service = getService(namespaceId, serviceName);
    if (service == null) {
        throw new NacosException(NacosException.INVALID_PARAM,
                "service not found, namespace: " + namespaceId + ", service: " + serviceName);
    }
    // todo write the instance into the service, i.e. into the registry
    addInstance(namespaceId, serviceName, instance.isEphemeral(), instance);
}
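registerInstance() delegates to ServiceManager.addInstance(), which writes the instance into the registry's instance list and then syncs it to the other Nacos servers, so that instance information stays consistent across the cluster.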
public void addInstance(String namespaceId, String serviceName, boolean ephemeral, Instance... ips)
        throws NacosException {
    String key = KeyBuilder.buildInstanceListKey(namespaceId, serviceName, ephemeral);
    // get the service from the registry
    Service service = getService(namespaceId, serviceName);
    synchronized (service) {
        // todo merge the instances being registered into the service's instance list
        List<Instance> instanceList = addIpAddresses(service, ephemeral, ips);
        // wrap the list in an Instances object
        Instances instances = new Instances();
        instances.setInstanceList(instanceList);
        // todo sync this change to the other Nacos servers
        consistencyService.put(key, instances);
    }
}
Two parts deserve a closer look:
- addIpAddresses(service, ephemeral, ips): builds the updated instance list for the local registry
private List<Instance> addIpAddresses(Service service, boolean ephemeral, Instance... ips) throws NacosException {
    // Update the instance list of the current service. There are two kinds of update:
    // adding instances and removing instances.
    // UPDATE_INSTANCE_ACTION_ADD is add
    // UPDATE_INSTANCE_ACTION_REMOVE is remove
    return updateIpAddresses(service, UtilsAndCommons.UPDATE_INSTANCE_ACTION_ADD, ephemeral, ips);
}
updateIpAddresses()
public List<Instance> updateIpAddresses(Service service, String action, boolean ephemeral, Instance... ips)
        throws NacosException {
    // get the current instance-list data for this service from the consistency service
    Datum datum = consistencyService
            .get(KeyBuilder.buildInstanceListKey(service.getNamespaceId(), service.getName(), ephemeral));
    // get all instances of the requested type (ephemeral or persistent) from the local registry
    List<Instance> currentIPs = service.allIPs(ephemeral);
    Map<String, Instance> currentInstances = new HashMap<>(currentIPs.size());
    Set<String> currentInstanceIds = Sets.newHashSet();
    // iterate over the instances found in the registry
    for (Instance instance : currentIPs) {
        // put the instance into the map; key is ip:port, value is the instance
        currentInstances.put(instance.toIpAddr(), instance);
        // collect the instanceId into a set
        currentInstanceIds.add(instance.getInstanceId());
    }
    Map<String, Instance> instanceMap;
    if (datum != null && null != datum.value) {
        // todo for hosts present in both, the registry's instance data replaces the datum's data
        instanceMap = setValid(((Instances) datum.value).getInstanceList(), currentInstances);
    } else {
        instanceMap = new HashMap<>(ips.length);
    }
    for (Instance instance : ips) {
        // if the service does not yet contain the cluster this instance belongs to, create it
        if (!service.getClusterMap().containsKey(instance.getClusterName())) {
            Cluster cluster = new Cluster(instance.getClusterName(), service);
            // todo initialize the cluster's health check task
            cluster.init();
            service.getClusterMap().put(instance.getClusterName(), cluster);
            Loggers.SRV_LOG
                    .warn("cluster: {} not found, ip: {}, will create new cluster with default configuration.",
                            instance.getClusterName(), instance.toJson());
        }
        // for a remove action, drop the instance from instanceMap;
        // otherwise it is an add action, so put the instance into instanceMap
        if (UtilsAndCommons.UPDATE_INSTANCE_ACTION_REMOVE.equals(action)) {
            instanceMap.remove(instance.getDatumKey());
        } else {
            Instance oldInstance = instanceMap.get(instance.getDatumKey());
            if (oldInstance != null) {
                instance.setInstanceId(oldInstance.getInstanceId());
            } else {
                instance.setInstanceId(instance.generateInstanceId(currentInstanceIds));
            }
            instanceMap.put(instance.getDatumKey(), instance);
        }
    }
    if (instanceMap.size() <= 0 && UtilsAndCommons.UPDATE_INSTANCE_ACTION_ADD.equals(action)) {
        throw new IllegalArgumentException(
                "ip list can not be empty, service: " + service.getName() + ", ip list: " + JacksonUtils
                        .toJson(instanceMap.values()));
    }
    return new ArrayList<>(instanceMap.values());
}
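Stripped of clusters, instance IDs, and the datum lookup, updateIpAddresses() is a map merge keyed by address. The following is a simplified sketch under that reading; InstanceListMerge, Node, and the ip:port key are illustrative names, not Nacos types.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model of the add/remove merge; Node and its key format are illustrative.
final class InstanceListMerge {
    static final String ADD = "add";
    static final String REMOVE = "remove";

    static final class Node {
        final String ip;
        final int port;

        Node(String ip, int port) {
            this.ip = ip;
            this.port = port;
        }

        String key() {
            return ip + ":" + port;
        }
    }

    // Applies one action to a copy of the current list and returns the new list.
    static List<Node> update(List<Node> current, String action, Node... changes) {
        Map<String, Node> merged = new LinkedHashMap<>();
        for (Node node : current) {
            merged.put(node.key(), node);
        }
        for (Node node : changes) {
            if (REMOVE.equals(action)) {
                merged.remove(node.key());
            } else {
                merged.put(node.key(), node); // add, or overwrite the same ip:port
            }
        }
        if (merged.isEmpty() && ADD.equals(action)) {
            throw new IllegalArgumentException("ip list can not be empty");
        }
        return new ArrayList<>(merged.values());
    }
}
```

The input list is never mutated; the caller receives a fresh list and decides what to do with it, which matches how addInstance() hands the result to consistencyService.put().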
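- consistencyService.put(key, instances): syncs the instance list
put() first branches on the instance type. Ephemeral instances go through ephemeralConsistencyService, whose implementation class is DistroConsistencyServiceImpl; everything else goes through persistentConsistencyService, whose implementation class is PersistentConsistencyServiceDelegateImpl.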
// DelegateConsistencyServiceImpl.class
@Override
public void put(String key, Record value) throws NacosException {
    mapConsistencyService(key).put(key, value);
}
private ConsistencyService mapConsistencyService(String key) {
    // check whether the key belongs to an ephemeral instance list:
    // ephemeral keys go to ephemeralConsistencyService, all others to persistentConsistencyService
    return KeyBuilder.matchEphemeralKey(key) ? ephemeralConsistencyService : persistentConsistencyService;
}
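The delegate pattern above can be sketched on its own. This example assumes that an ephemeral key is recognized by the prefix shown in the ephemeral key format quoted later in this article; ConsistencyRouting, Store, and RecordingStore are hypothetical names introduced for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative routing by key prefix; none of these types are Nacos classes.
final class ConsistencyRouting {
    // Prefix taken from the ephemeral key format quoted in this article.
    static final String EPHEMERAL_PREFIX = "com.alibaba.nacos.naming.iplist.ephemeral.";

    interface Store {
        void put(String key, Object value);
    }

    // Test double that only remembers which keys it received.
    static final class RecordingStore implements Store {
        final List<String> keys = new ArrayList<>();

        @Override
        public void put(String key, Object value) {
            keys.add(key);
        }
    }

    static final class Delegate implements Store {
        private final Store ephemeral;
        private final Store persistent;

        Delegate(Store ephemeral, Store persistent) {
            this.ephemeral = ephemeral;
            this.persistent = persistent;
        }

        @Override
        public void put(String key, Object value) {
            route(key).put(key, value);
        }

        // Picks the backing store from the key alone, so callers never choose explicitly.
        Store route(String key) {
            return key.startsWith(EPHEMERAL_PREFIX) ? ephemeral : persistent;
        }
    }
}
```

Encoding the instance type in the key means the caller only ever talks to one consistency service, and the choice of protocol stays an internal detail.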
The instance passed in here is ephemeral, so the call ends up in the put() method of DistroConsistencyServiceImpl.
@Override
public void put(String key, Record value) throws NacosException {
    // add to the local cache
    onPut(key, value);
    // todo sync to the remote servers
    distroProtocol.sync(new DistroKey(key, KeyBuilder.INSTANCE_LIST_KEY_PREFIX), DataOperation.CHANGE,
            globalConfig.getTaskDispatchPeriod() / 2);
}
distroProtocol.sync() performs the synchronization by adding one delayed task for each of the other cluster members.
public void sync(DistroKey distroKey, DataOperation action, long delay) {
    for (Member each : memberManager.allMembersWithoutSelf()) {
        DistroKey distroKeyWithTarget = new DistroKey(distroKey.getResourceKey(), distroKey.getResourceType(),
                each.getAddress());
        DistroDelayTask distroDelayTask = new DistroDelayTask(distroKeyWithTarget, action, delay);
        distroTaskEngineHolder.getDelayTaskExecuteEngine().addTask(distroKeyWithTarget, distroDelayTask);
        if (Loggers.DISTRO.isDebugEnabled()) {
            Loggers.DISTRO.debug("[DISTRO-SCHEDULE] {} to {}", distroKey, each.getAddress());
        }
    }
}
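The fan-out in sync() can be pictured as a pending-task table with one entry per resource key and target member. The sketch below is only a model: SyncTaskQueue and Task are hypothetical, and the idea that a later task for the same key and target replaces the earlier one is an assumption of this sketch rather than something the code above confirms.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative pending-task table; not the Nacos delay task engine.
final class SyncTaskQueue {
    static final class Task {
        final String resourceKey;
        final String target;
        final long delayMs;

        Task(String resourceKey, String target, long delayMs) {
            this.resourceKey = resourceKey;
            this.target = target;
            this.delayMs = delayMs;
        }
    }

    private final Map<String, Task> pending = new LinkedHashMap<>();

    // Queues one task per remote member. Assumption: a newer task for the same
    // key and target overwrites the older one, so bursts of changes collapse.
    void sync(String resourceKey, List<String> otherMembers, long delayMs) {
        for (String member : otherMembers) {
            pending.put(resourceKey + "@" + member, new Task(resourceKey, member, delayMs));
        }
    }

    int pendingCount() {
        return pending.size();
    }
}
```

Delaying and keying the tasks this way would let several quick registrations for one service result in a single transfer per peer instead of one per change.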
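Active service deregistration (Deregister instances)
InstanceController's deregister() is the web endpoint for taking a service instance offline. The logic is relatively easy to follow: after resolving the instance from the request, it checks the local registry. If the service is not there, nothing needs to be done; if it is, the instance is removed and the change is synced to the other remote Nacos servers.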
@CanDistro
@DeleteMapping
@Secured(parser = NamingResourceParser.class, action = ActionTypes.WRITE)
public String deregister(HttpServletRequest request) throws Exception {
    // get the instance to operate on from the request
    Instance instance = getIpAddress(request);
    String namespaceId = WebUtils.optional(request, CommonParams.NAMESPACE_ID, Constants.DEFAULT_NAMESPACE_ID);
    String serviceName = WebUtils.required(request, CommonParams.SERVICE_NAME);
    NamingUtils.checkServiceNameFormat(serviceName);
    // get the service from the registry
    Service service = serviceManager.getService(namespaceId, serviceName);
    if (service == null) {
        Loggers.SRV_LOG.warn("remove instance from non-exist service: {}", serviceName);
        return "ok";
    }
    // todo remove the instance
    serviceManager.removeInstance(namespaceId, serviceName, instance.isEphemeral(), instance);
    return "ok";
}
serviceManager.removeInstance() takes a synchronized lock on the service while removing, so that other threads cannot modify the same service concurrently.
public void removeInstance(String namespaceId, String serviceName, boolean ephemeral, Instance... ips)
        throws NacosException {
    // get the current service from the registry
    Service service = getService(namespaceId, serviceName);
    synchronized (service) {
        // todo remove
        removeInstance(namespaceId, serviceName, ephemeral, service, ips);
    }
}
private void removeInstance(String namespaceId, String serviceName, boolean ephemeral, Service service,
        Instance... ips) throws NacosException {
    String key = KeyBuilder.buildInstanceListKey(namespaceId, serviceName, ephemeral);
    // todo remove the instances and return the list that remains after deregistration
    List<Instance> instanceList = substractIpAddresses(service, ephemeral, ips);
    Instances instances = new Instances();
    instances.setInstanceList(instanceList);
    // todo hand the change to the consistency service, which stores it, sends notifications, and syncs it to the other Nacos servers
    consistencyService.put(key, instances);
}
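First a key for the service's instance list is generated. The key depends on whether the instance is an ephemeral node. For an ephemeral node the key looks like com.alibaba.nacos.naming.iplist.ephemeral.{namespace}##{serviceName}; for a persistent node it looks like com.alibaba.nacos.naming.iplist.{namespace}##{serviceName}. Then substractIpAddresses() subtracts the instances being taken offline from the previous instance list, producing a new list without the deregistered instances.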
private List<Instance> substractIpAddresses(Service service, boolean ephemeral, Instance... ips)
        throws NacosException {
    // UPDATE_INSTANCE_ACTION_ADD is add
    // UPDATE_INSTANCE_ACTION_REMOVE is remove
    return updateIpAddresses(service, UtilsAndCommons.UPDATE_INSTANCE_ACTION_REMOVE, ephemeral, ips);
}
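UPDATE_INSTANCE_ACTION_REMOVE is the remove action.
Next, updateIpAddresses() is called; that method was already analyzed in the registration section. Removal follows almost the same flow as registration through addIpAddresses().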
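The two key formats described above can be reproduced with a few lines of string building. This is a sketch derived only from the formats quoted in this section; InstanceListKeys and its constant names are hypothetical and are not the KeyBuilder source.

```java
// Illustrative key construction based on the two formats quoted in the text.
final class InstanceListKeys {
    static final String PREFIX = "com.alibaba.nacos.naming.iplist.";
    static final String EPHEMERAL = "ephemeral.";
    static final String SEPARATOR = "##";

    private InstanceListKeys() {
    }

    // Ephemeral lists get an extra "ephemeral." segment; persistent lists do not.
    static String build(String namespaceId, String serviceName, boolean ephemeral) {
        return PREFIX + (ephemeral ? EPHEMERAL : "") + namespaceId + SEPARATOR + serviceName;
    }
}
```

Because register and deregister build the key from the same three inputs, both operations address the same stored instance list.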
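Receiving heartbeats on the server (Create a beat for instance.)
1. The /beat endpoint defines how heartbeats are received. It first reads namespaceId, serviceName, clusterName, ip, port, and other parameters from the request, then calls getInstance() to look up the corresponding instance in the registry. If the instance is null and the request carries beat data, it calls registerInstance() to re-register the instance and sync it.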
@CanDistro
@PutMapping("/beat")
@Secured(parser = NamingResourceParser.class, action = ActionTypes.WRITE)
public ObjectNode beat(HttpServletRequest request) throws Exception {
    // create a JSON node; it is the return value of this method, and the code below fills it in
    ObjectNode result = JacksonUtils.createEmptyJsonNode();
    result.put(SwitchEntry.CLIENT_BEAT_INTERVAL, switchDomain.getClientBeatInterval());
    // read beat from the request, i.e. the client-side beatInfo
    String beat = WebUtils.optional(request, "beat", StringUtils.EMPTY);
    RsInfo clientBeat = null;
    // deserialize beat into clientBeat
    if (StringUtils.isNotBlank(beat)) {
        clientBeat = JacksonUtils.toObj(beat, RsInfo.class);
    }
    String clusterName = WebUtils
            .optional(request, CommonParams.CLUSTER_NAME, UtilsAndCommons.DEFAULT_CLUSTER_NAME);
    String ip = WebUtils.optional(request, "ip", StringUtils.EMPTY);
    // read the client's port from the request; it is used below to look up the instance
    int port = Integer.parseInt(WebUtils.optional(request, "port", "0"));
    if (clientBeat != null) {
        if (StringUtils.isNotBlank(clientBeat.getCluster())) {
            clusterName = clientBeat.getCluster();
        } else {
            // fix #2533
            clientBeat.setCluster(clusterName);
        }
        ip = clientBeat.getIp();
        port = clientBeat.getPort();
    }
    String namespaceId = WebUtils.optional(request, CommonParams.NAMESPACE_ID, Constants.DEFAULT_NAMESPACE_ID);
    String serviceName = WebUtils.required(request, CommonParams.SERVICE_NAME);
    NamingUtils.checkServiceNameFormat(serviceName);
    Loggers.SRV_LOG.debug("[CLIENT-BEAT] full arguments: beat: {}, serviceName: {}", clientBeat, serviceName);
    // look up the instance of the requesting client in the registry
    Instance instance = serviceManager.getInstance(namespaceId, serviceName, clusterName, ip, port);
    // handle the case where the registry has no instance for this client
    if (instance == null) {
        // if the request carries no beat data, return immediately
        if (clientBeat == null) {
            result.put(CommonParams.CODE, NamingResponseCode.RESOURCE_NOT_FOUND);
            return result;
        }
        Loggers.SRV_LOG.warn("[CLIENT-BEAT] The instance has been removed for health mechanism, "
                + "perform data compensation operations, beat: {}, serviceName: {}", clientBeat, serviceName);
        // The case handled below: the registry has no instance for this client, but the request carries beat data.
        // This can happen when the first heartbeat reaches the server before the client's registration request
        // (network jitter and so on).
        // The fix is to build an instance from the beat data and register it.
        instance = new Instance();
        instance.setPort(clientBeat.getPort());
        instance.setIp(clientBeat.getIp());
        instance.setWeight(clientBeat.getWeight());
        instance.setMetadata(clientBeat.getMetadata());
        instance.setClusterName(clusterName);
        instance.setServiceName(serviceName);
        instance.setInstanceId(instance.getInstanceId());
        instance.setEphemeral(clientBeat.isEphemeral());
        // register
        serviceManager.registerInstance(namespaceId, serviceName, instance);
    }
    // get the service from the registry
    Service service = serviceManager.getService(namespaceId, serviceName);
    if (service == null) {
        throw new NacosException(NacosException.SERVER_ERROR,
                "service not found: " + serviceName + "@" + namespaceId);
    }
    if (clientBeat == null) {
        clientBeat = new RsInfo();
        clientBeat.setIp(ip);
        clientBeat.setPort(port);
        clientBeat.setCluster(clusterName);
    }
    // todo process this heartbeat
    service.processClientBeat(clientBeat);
    result.put(CommonParams.CODE, NamingResponseCode.OK);
    // this works like dynamic configuration:
    // if the instance metadata contains preserved.heart.beat.interval
    if (instance.containsMetadata(PreservedMetadataKeys.HEART_BEAT_INTERVAL)) {
        // send that interval back to the client
        result.put(SwitchEntry.CLIENT_BEAT_INTERVAL, instance.getInstanceHeartBeatInterval());
    }
    result.put(SwitchEntry.LIGHT_BEAT_ENABLED, switchDomain.isLightBeatEnabled());
    return result;
}
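The decision logic of beat() reduces to three outcomes: unknown instance without beat data, unknown instance with beat data, and known instance. A minimal sketch of that branching follows; BeatHandler, Beat, and the address-keyed map are hypothetical simplifications, and registration is modeled as nothing more than a map insert.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative model of the beat endpoint's branching; not Nacos code.
final class BeatHandler {
    enum Result { OK, RESOURCE_NOT_FOUND }

    static final class Beat {
        final String ip;
        final int port;

        Beat(String ip, int port) {
            this.ip = ip;
            this.port = port;
        }
    }

    // Stands in for the registry: address -> last heartbeat timestamp.
    private final Map<String, Long> lastBeatByAddress = new HashMap<>();

    // beat may be null, mirroring a light request that carries only ip and port.
    Result onBeat(String ip, int port, Beat beat, long now) {
        String address = beat != null ? beat.ip + ":" + beat.port : ip + ":" + port;
        if (!lastBeatByAddress.containsKey(address) && beat == null) {
            // Unknown instance and nothing to rebuild it from: ask the client to re-register.
            return Result.RESOURCE_NOT_FOUND;
        }
        // Known instance, or unknown instance with full beat data (compensating registration).
        lastBeatByAddress.put(address, now);
        return Result.OK;
    }

    boolean isRegistered(String address) {
        return lastBeatByAddress.containsKey(address);
    }
}
```

Returning RESOURCE_NOT_FOUND instead of failing gives the client a signal to send a complete registration on its next attempt.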
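2. With the instance's presence and availability ensured, the second step is to process the heartbeat in processClientBeat(clientBeat). It wraps the collected information in a ClientBeatProcessor and submits it as a task through scheduleNow(), so the health-status update runs on another thread.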
public void processClientBeat(final RsInfo rsInfo) {
    // create a processor, which is itself a task
    ClientBeatProcessor clientBeatProcessor = new ClientBeatProcessor();
    clientBeatProcessor.setService(this);
    clientBeatProcessor.setRsInfo(rsInfo);
    // schedule it for immediate execution, i.e. run clientBeatProcessor's run()
    HealthCheckReactor.scheduleNow(clientBeatProcessor);
}
ClientBeatProcessor implements Runnable so that it can be executed as that task. Next, look at its run() method.
@Override
public void run() {
    Service service = this.service;
    if (Loggers.EVT_LOG.isDebugEnabled()) {
        Loggers.EVT_LOG.debug("[CLIENT-BEAT] processing beat: {}", rsInfo.toString());
    }
    String ip = rsInfo.getIp();
    String clusterName = rsInfo.getCluster();
    int port = rsInfo.getPort();
    Cluster cluster = service.getClusterMap().get(clusterName);
    // get all ephemeral instances of this cluster
    List<Instance> instances = cluster.allIPs(true);
    // iterate over them to find the instance that sent this heartbeat
    for (Instance instance : instances) {
        // a match on both ip and port identifies it
        if (instance.getIp().equals(ip) && instance.getPort() == port) {
            if (Loggers.EVT_LOG.isDebugEnabled()) {
                Loggers.EVT_LOG.debug("[CLIENT-BEAT] refresh beat: {}", rsInfo.toString());
            }
            // update the last-heartbeat timestamp
            instance.setLastBeat(System.currentTimeMillis());
            // update the instance's health status
            // a marked instance (marked == true) is a persistent instance
            if (!instance.isMarked()) {
                // for ephemeral instances, healthy is the field that represents health status
                // if healthy is currently false but this heartbeat came from the instance,
                // the instance has "come back to life", so set healthy back to true
                if (!instance.isHealthy()) {
                    instance.setHealthy(true);
                    Loggers.EVT_LOG
                            .info("service: {} {POS} {IP-ENABLED} valid: {}:{}@{}, region: {}, msg: client beat ok",
                                    cluster.getService().getName(), ip, port, cluster.getName(),
                                    UtilsAndCommons.LOCALHOST_SITE);
                    // publish a service-changed event (important for the UDP push analyzed later)
                    getPushService().serviceChanged(service);
                }
            }
        }
    }
}
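The state change inside the loop can be isolated as a small function: always refresh the timestamp, and report a change only when an unmarked, unhealthy instance becomes healthy again. The sketch below uses hypothetical names (BeatRefresh, Node) and plain fields in place of the Instance getters and setters.

```java
// Illustrative per-instance heartbeat refresh; not the Nacos Instance class.
final class BeatRefresh {
    static final class Node {
        long lastBeat;
        boolean healthy;
        boolean marked;
    }

    private BeatRefresh() {
    }

    // Returns true when the node flipped from unhealthy to healthy,
    // i.e. when a service-changed notification would be worth publishing.
    static boolean refresh(Node node, long now) {
        node.lastBeat = now;
        if (!node.marked && !node.healthy) {
            node.healthy = true;
            return true;
        }
        return false;
    }
}
```

Only the transition triggers a notification, so a steady stream of heartbeats from a healthy instance costs a timestamp write and nothing more.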
The ephemeral and persistent instance sets
// set of persistent instances
@JsonIgnore
private Set<Instance> persistentInstances = new HashSet<>();
// set of ephemeral instances
@JsonIgnore
private Set<Instance> ephemeralInstances = new HashSet<>();