Spring Cloud Alibaba Components: An In-Depth Analysis of the Nacos Service Registration Source Code

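This article analyzes the Nacos source code for service registration, heartbeat renewal, and health checks. On startup, the Nacos client registers itself through the NacosServiceRegistryAutoConfiguration auto-configuration class. Registration is triggered by the WebServerInitializedEvent, and the client uses NacosNamingService to register the instance with the server. When the server receives the request, it creates or updates the Service and stores it in the ServiceMap and the Datastore. Heartbeat renewal is carried out by the scheduled BeatTask, which keeps the client instance alive. Health checking is performed by ClientBeatCheckTask, which periodically inspects instance state and handles instances that are unhealthy or have timed out. A closer look at these internals should help developers understand how Nacos works.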


Preface

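Before reading this article, it helps to have some familiarity with Nacos. The official documentation is available at https://nacos.io/zh-cn/index.html
This article walks through the source code behind Nacos service registration, service heartbeats, and health checks, in the hope of giving readers a deeper understanding of Nacos. Some details are not covered here and will be added in the next article.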

Core Entry Point

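The Spring Cloud Alibaba Nacos component follows the Spring Boot auto-configuration convention: the spring.factories file under META-INF in its jar lists a set of auto-configuration classes, which load the corresponding components automatically.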
For the Nacos client, the most important step at startup is loading the NacosServiceRegistryAutoConfiguration auto-configuration class.

@Configuration(proxyBeanMethods = false)
@EnableConfigurationProperties
@ConditionalOnNacosDiscoveryEnabled
@ConditionalOnProperty(value = "spring.cloud.service-registry.auto-registration.enabled",
		matchIfMissing = true)
@AutoConfigureAfter({ AutoServiceRegistrationConfiguration.class,
		AutoServiceRegistrationAutoConfiguration.class,
		NacosDiscoveryAutoConfiguration.class })
public class NacosServiceRegistryAutoConfiguration {

    // Registers the Nacos implementation of the service registry
    @Bean
	public NacosServiceRegistry nacosServiceRegistry(
			NacosDiscoveryProperties nacosDiscoveryProperties) {
		return new NacosServiceRegistry(nacosDiscoveryProperties);
	}
	@Bean
	@ConditionalOnBean(AutoServiceRegistrationProperties.class)
	public NacosRegistration nacosRegistration(
			NacosDiscoveryProperties nacosDiscoveryProperties,
			ApplicationContext context) {
		return new NacosRegistration(nacosDiscoveryProperties, context);
	}
    // Core class
	@Bean
	@ConditionalOnBean(AutoServiceRegistrationProperties.class)
	public NacosAutoServiceRegistration nacosAutoServiceRegistration(
			NacosServiceRegistry registry,
			AutoServiceRegistrationProperties autoServiceRegistrationProperties,
			NacosRegistration registration) {
		return new NacosAutoServiceRegistration(registry,
				autoServiceRegistrationProperties, registration);
	}
}

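The NacosServiceRegistryAutoConfiguration class registers several core classes in the Spring container through @Bean methods:
(1) NacosServiceRegistry: implements ServiceRegistry, the interface Spring Cloud defines for service registries, and provides the core registry methods that integrate the Nacos registry.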

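(2) NacosRegistration: holds a NacosDiscoveryProperties instance internally, which records the Nacos client's configuration.
(3) NacosAutoServiceRegistration: the entry point for service registration on the Nacos client. It implements ApplicationListener, Spring's event-listener interface, and listens for WebServerInitializedEvent. When a Spring Boot application starts, its context finishes bean initialization and then calls finishRefresh, which publishes the WebServerInitializedEvent. This class receives the event and runs its onApplicationEvent method.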

[Figure: NacosAutoServiceRegistration class diagram]

Client-Side Service Registration

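When a Spring Boot based client starts, it publishes a WebServerInitializedEvent. AbstractAutoServiceRegistration listens for this event and calls onApplicationEvent to start service registration.
That method follows the call chain bind() -> start() -> register() and ends in the register method of NacosServiceRegistry, which first wraps the client's configuration in an Instance object and then registers that instance.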

public abstract class AbstractAutoServiceRegistration<R extends Registration>
		implements AutoServiceRegistration, ApplicationContextAware,
		ApplicationListener<WebServerInitializedEvent> {
  ...
  @Override
  @SuppressWarnings("deprecation")
  public void onApplicationEvent(WebServerInitializedEvent event) {
	bind(event);
  }
}
public class NacosServiceRegistry implements ServiceRegistry<Registration> {
    ....
    @Override
	public void register(Registration registration) {

		if (StringUtils.isEmpty(registration.getServiceId())) {
			log.warn("No service to register for nacos client...");
			return;
		}
        // Get the service ID
		String serviceId = registration.getServiceId();
		// Get the group the service belongs to; defaults to DEFAULT_GROUP
		String group = nacosDiscoveryProperties.getGroup();
        // Wrap the client's registration info in an Instance object
		Instance instance = getNacosInstanceFromRegistration(registration);
		try {
		    // Register the client instance
			namingService.registerInstance(serviceId, group, instance);
			log.info("nacos registry, {} {} {}:{} register finished", group, serviceId,
					instance.getIp(), instance.getPort());
		}
		catch (Exception e) {
			log.error("nacos registry, {} register failed...{},", serviceId,
					registration.toString(), e);
			rethrowRuntimeException(e);
		}
	}
}
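The event-driven chain above can be illustrated without Spring. The following is a minimal sketch, not the Spring Cloud implementation: the class AutoRegistrationSketch, the RecordingRegistry stand-in, and the method onWebServerInitialized are all hypothetical names introduced here. It only shows the shape of bind() -> start() -> register(), with a guard so that a repeated event does not register twice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical, framework-free sketch of the bind() -> start() -> register() chain. */
class AutoRegistrationSketch {

    /** Stand-in for the registry; it only records what was registered. */
    static class RecordingRegistry {
        final List<String> registered = new ArrayList<>();

        void register(String serviceId, String ip, int port) {
            registered.add(serviceId + "@" + ip + ":" + port);
        }
    }

    private final RecordingRegistry registry;
    private final String serviceId;
    private final String ip;
    private final AtomicBoolean running = new AtomicBoolean(false);
    private int port;

    AutoRegistrationSketch(RecordingRegistry registry, String serviceId, String ip) {
        this.registry = registry;
        this.serviceId = serviceId;
        this.ip = ip;
    }

    /** Plays the role of onApplicationEvent: called once the web server knows its port. */
    void onWebServerInitialized(int localPort) {
        bind(localPort);
    }

    private void bind(int localPort) {
        this.port = localPort; // remember the port the server actually bound to
        start();
    }

    private void start() {
        // Register only once, even if the event is delivered again.
        if (running.compareAndSet(false, true)) {
            register();
        }
    }

    private void register() {
        if (serviceId == null || serviceId.isEmpty()) {
            return; // nothing to register, mirroring the empty-serviceId check above
        }
        registry.register(serviceId, ip, port);
    }
}
```

Calling onWebServerInitialized(8080) twice on the same object leaves exactly one entry in the registry, which is the behavior the running flag is there to guarantee in this sketch.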

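Next, registering the client instance is handed to NacosNamingService, a class provided by the Nacos Client module. Before sending the registration, it checks whether the instance is configured as ephemeral. If so, it reads the client configuration, builds a BeatInfo object with the heartbeat information, and starts heartbeat renewal (the heartbeat implementation is analyzed later). It then registers the service through the registerService method of NamingProxy.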

// Core class for Nacos service registration
public class NacosNamingService implements NamingService{

    /**
     * Client side: initiate service registration
     */
    @Override
    public void registerInstance(String serviceName, String groupName, Instance instance) throws NacosException {
    	 // Whether this is an ephemeral instance; defaults to true
        if (instance.isEphemeral()) {
        	// Read the client configuration and build the heartbeat info
            BeatInfo beatInfo = new BeatInfo();
            beatInfo.setServiceName(NamingUtils.getGroupedName(serviceName, groupName));
            beatInfo.setIp(instance.getIp());
            beatInfo.setPort(instance.getPort());
            beatInfo.setCluster(instance.getClusterName());
            beatInfo.setWeight(instance.getWeight());
            beatInfo.setMetadata(instance.getMetadata());
            beatInfo.setScheduled(false);
            // Set the heartbeat interval configured on the client
            beatInfo.setPeriod(instance.getInstanceHeartBeatInterval());
            // Start heartbeat renewal
            beatReactor.addBeatInfo(NamingUtils.getGroupedName(serviceName, groupName), beatInfo);
        }
        // Register the service
        serverProxy.registerService(NamingUtils.getGroupedName(serviceName, groupName), groupName, instance);
    }
}
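Both the beat key and the registration call above use NamingUtils.getGroupedName, and the heartbeat code later recovers the group with NamingUtils.getGroupName. The sketch below shows one way such a pair of helpers can work. It assumes a "@@" separator and a DEFAULT_GROUP fallback, which is how Nacos 1.x is commonly described; the class GroupedNameSketch and its constants are illustrative and are not the library's code.

```java
/** Hypothetical sketch of combining and splitting "group@@service" names. */
class GroupedNameSketch {

    // Assumed separator between group name and service name.
    static final String SPLITTER = "@@";
    // Assumed fallback when a name carries no group prefix.
    static final String DEFAULT_GROUP = "DEFAULT_GROUP";

    /** Builds the grouped name that is sent to the server. */
    static String getGroupedName(String serviceName, String groupName) {
        return groupName + SPLITTER + serviceName;
    }

    /** Extracts the group part, falling back to the default group. */
    static String getGroupName(String groupedName) {
        int i = groupedName.indexOf(SPLITTER);
        return i < 0 ? DEFAULT_GROUP : groupedName.substring(0, i);
    }

    /** Extracts the plain service name. */
    static String getServiceName(String groupedName) {
        int i = groupedName.indexOf(SPLITTER);
        return i < 0 ? groupedName : groupedName.substring(i + SPLITTER.length());
    }
}
```

Carrying the group inside the service name lets a single string identify the service in the beat map and in request parameters, at the cost of having to split it again wherever the group is needed on its own.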

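Registration flow: the request parameters are first collected into the params Map. A node is then picked at random from the server cluster, and an HTTP POST request is sent to that node to register the service. The request URL is: http://127.0.0.1:8488/nacos/v1/ns/instance
A note on why the node is chosen at random: from the client's point of view, all nodes in the server cluster are peers, so any of them can serve the request. If the chosen node fails, the client moves on to the next node in the list until one succeeds or every node has been tried.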

public class NamingProxy {
   
   public void registerService(String serviceName, String groupName, Instance instance) throws NacosException {

        NAMING_LOGGER.info("[REGISTER-SERVICE] {} registering service {} with instance: {}",
            namespaceId, serviceName, instance);
        // Build the HTTP request parameters for service registration
        final Map<String, String> params = new HashMap<String, String>(9);
        params.put(CommonParams.NAMESPACE_ID, namespaceId);
        params.put(CommonParams.SERVICE_NAME, serviceName);
        params.put(CommonParams.GROUP_NAME, groupName);
        params.put(CommonParams.CLUSTER_NAME, instance.getClusterName());
        params.put("ip", instance.getIp());
        params.put("port", String.valueOf(instance.getPort()));
        params.put("weight", String.valueOf(instance.getWeight()));
        params.put("enable", String.valueOf(instance.isEnabled()));
        params.put("healthy", String.valueOf(instance.isHealthy()));
        params.put("ephemeral", String.valueOf(instance.isEphemeral()));
        params.put("metadata", JSON.toJSONString(instance.getMetadata()));
        // Send the registration as a POST request
        reqAPI(UtilAndComs.NACOS_URL_INSTANCE, params, HttpMethod.POST);
    }
}
public String reqAPI(String api, Map<String, String> params, String body, List<String> servers, String method) throws NacosException {
    	 // Add the namespaceId parameter; defaults to public
        params.put(CommonParams.NAMESPACE_ID, getNamespaceId());

        if (CollectionUtils.isEmpty(servers) && StringUtils.isEmpty(nacosDomain)) {
            throw new NacosException(NacosException.INVALID_PARAM, "no server available");
        }

        NacosException exception = new NacosException();
        // Server cluster nodes, e.g. [127.0.0.1:8488], each an IP plus port
        if (servers != null && !servers.isEmpty()) {
            Random random = new Random(System.currentTimeMillis());
            // Pick a random starting node
            int index = random.nextInt(servers.size());
            for (int i = 0; i < servers.size(); i++) {
                String server = servers.get(index);
                try {
                	// Call the server
                    return callServer(api, params, body, server, method);
                } catch (NacosException e) {
                    exception = e;
                    if (NAMING_LOGGER.isDebugEnabled()) {
                        NAMING_LOGGER.debug("request {} failed.", server, e);
                    }
                }
                index = (index + 1) % servers.size();
            }
        }
       ...
    }
// Send the HTTP request
public String callServer(String api, Map<String, String> params, String body, String curServer, String method)throws NacosException {
        long start = System.currentTimeMillis();
        long end = 0;
        injectSecurityInfo(params);
        // Build the request headers
        List<String> headers = builderHeaders();

        String url;
        if (curServer.startsWith(UtilAndComs.HTTPS) || curServer.startsWith(UtilAndComs.HTTP)) {
            url = curServer + api;
        } else {
            if (!curServer.contains(UtilAndComs.SERVER_ADDR_IP_SPLITER)) {
                curServer = curServer + UtilAndComs.SERVER_ADDR_IP_SPLITER + serverPort;
            }
            // Build the request URL, e.g. http://127.0.0.1:8488/nacos/v1/ns/instance
            url = HttpClient.getPrefix() + curServer + api;
        }
        // Send the HTTP request
        HttpClient.HttpResult result = HttpClient.request(url, headers, params, body, UtilAndComs.ENCODING, method);
        end = System.currentTimeMillis();
        MetricsMonitor.getNamingRequestMonitor(method, url, String.valueOf(result.code))
            .observe(end - start);
        if (HttpURLConnection.HTTP_OK == result.code) {
            return result.content;
        }
        if (HttpURLConnection.HTTP_NOT_MODIFIED == result.code) {
            return StringUtils.EMPTY;
        }
        throw new NacosException(result.code, result.content);
}
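The node-selection loop in reqAPI can be isolated into a small, testable helper. This is a minimal sketch of the same idea (random starting index, then wrap-around failover), written independently of the Nacos client: FailoverSketch, requestWithFailover, and the call function that stands in for the HTTP request are all hypothetical.

```java
import java.util.List;
import java.util.Random;
import java.util.function.Function;

/** Hypothetical sketch of "random start, then try the next node" failover. */
class FailoverSketch {

    /**
     * Tries each server at most once, starting at a random index.
     * The call function stands in for the HTTP request and signals
     * failure by throwing a RuntimeException.
     */
    static String requestWithFailover(List<String> servers, Random random,
                                      Function<String, String> call) {
        if (servers == null || servers.isEmpty()) {
            throw new IllegalStateException("no server available");
        }
        RuntimeException last = null;
        int index = random.nextInt(servers.size());
        for (int i = 0; i < servers.size(); i++) {
            String server = servers.get(index);
            try {
                return call.apply(server); // first success wins
            } catch (RuntimeException e) {
                last = e; // remember the failure and move on
            }
            index = (index + 1) % servers.size();
        }
        throw new IllegalStateException("all servers failed", last);
    }
}
```

The random start spreads load across peer nodes, while the wrap-around loop guarantees that every node is attempted exactly once before the call gives up.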

Server-Side Handling of Registration Requests

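On the Nacos server, InstanceController exposes an endpoint that receives service registration requests, implemented by the register method. It delegates to the registerInstance method of ServiceManager, which is the core of service registration. Its execution flow is analyzed below: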

@RestController
@RequestMapping(UtilsAndCommons.NACOS_NAMING_CONTEXT + "/instance")
public class InstanceController {
    ...
    /**
     * Server side: service registration
     * @param request
     * @return
     * @throws Exception
     */
    @CanDistro
    @PostMapping
    @Secured(parser = NamingResourceParser.class, action = ActionTypes.WRITE)
    public String register(HttpServletRequest request) throws Exception {

        String serviceName = WebUtils.required(request, CommonParams.SERVICE_NAME);
        // Get the namespaceId; a common use is isolating configuration and resources between environments such as development and production
        String namespaceId = WebUtils.optional(request, CommonParams.NAMESPACE_ID, Constants.DEFAULT_NAMESPACE_ID);

        // Register the service
        serviceManager.registerInstance(namespaceId, serviceName, parseInstance(request));
        return "ok";
    }
}
 /**
 * Watches for changes to Service objects
 * Core manager storing all services in Nacos
 *
 * @author nkorange
 */
@Component
@DependsOn("nacosApplicationContext")
public class ServiceManager implements RecordListener<Service> {
    ...
    public void registerInstance(String namespaceId, String serviceName, Instance instance) throws NacosException {
    	 // Check whether a Service object already exists for this service; if not, create one, then register the instance and start health checks
        createEmptyService(namespaceId, serviceName, instance.isEphemeral());
        // Get the Service object for the service being registered
        Service service = getService(namespaceId, serviceName);

        if (service == null) {
            throw new NacosException(NacosException.INVALID_PARAM,
                "service not found, namespace: " + namespaceId + ", service: " + serviceName);
        }
        // Add the instance
        addInstance(namespaceId, serviceName, instance.isEphemeral(), instance);
    }
}

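First, createEmptyService is called. As the name suggests, it creates an empty Service object, but in practice it first checks whether a Service object already exists for the registering client instance and creates an empty one only if it does not. The new Service object is then put into the ServiceMap cache and registered as a listener, so that its onChange method can later be triggered to update the instance list under the Service (event listening is analyzed in the next article). ServiceMap holds the list of services, and service discovery reads the service list from it.
Service data structure: a Service keeps its instances in a Map<String, Cluster> clusterMap, where the key is the clusterName and the value is a Cluster object. Each Cluster object in turn keeps the ephemeral instances of that cluster in its ephemeralInstances set.
One more note: the Nacos server models a service as a Service object and each registered client as an Instance object, and one Service contains multiple Instances. Strictly speaking, when a client registers, the server may create a service and register an instance, or it may not create a service at all because one already exists, but it always registers an instance.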

public void createServiceIfAbsent(String namespaceId, String serviceName, boolean local, Cluster cluster) throws NacosException {
    	// First try to get the Service object for the service being registered
    	Service service = getService(namespaceId, serviceName);
        if (service == null) {
            Loggers.SRV_LOG.info("creating empty service {}:{}", namespaceId, serviceName);
            // Create a Service object for the service being registered
            service = new Service();
            service.setName(serviceName);
            service.setNamespaceId(namespaceId);
            service.setGroupName(NamingUtils.getGroupName(serviceName));
            // now validate the service. if failed, exception will be thrown
            service.setLastModifiedMillis(System.currentTimeMillis());
            // Compute the checksum of the service
            service.recalculateChecksum();
            if (cluster != null) {
                cluster.setService(service);
                service.getClusterMap().put(cluster.getName(), cluster);
            }
            // Validate the service and cluster names
            service.validate();
            // Cache the Service and start its health check
            putServiceAndInit(service);
            if (!local) {
            	// Persist the service
                addOrReplaceService(service);
            }
        }
    }
private void putServiceAndInit(Service service) throws NacosException {
    	 // Add the Service object to the serviceMap cache
        putService(service);
        // Start health checking
        service.init();
        // Register the Service as a listener so that it can update its instance list
        consistencyService.listen(KeyBuilder.buildInstanceListKey(service.getNamespaceId(), service.getName(), true), service);
        consistencyService.listen(KeyBuilder.buildInstanceListKey(service.getNamespaceId(), service.getName(), false), service);
        Loggers.SRV_LOG.info("[NEW-SERVICE] {}", service.toJSON());
}
public void putService(Service service) {
        if (!serviceMap.containsKey(service.getNamespaceId())) {
            synchronized (putServiceLock) {
                if (!serviceMap.containsKey(service.getNamespaceId())) {
                    serviceMap.put(service.getNamespaceId(), new ConcurrentHashMap<>(16));
                }
            }
        }
        // Record the service in the map
        serviceMap.get(service.getNamespaceId()).put(service.getName(), service);
}
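The nested layout described above (namespace, then service, then cluster, then instances) can be modeled with plain concurrent maps. The following is a simplified sketch, not the Nacos ServiceManager: ServiceMapSketch and its methods are hypothetical, instances are reduced to "ip:port" strings, and computeIfAbsent is used in place of the double-checked locking shown in putService.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of the namespace -> service -> cluster -> instances hierarchy. */
class ServiceMapSketch {

    // namespaceId -> (serviceName -> (clusterName -> set of "ip:port"))
    private final Map<String, Map<String, Map<String, Set<String>>>> serviceMap =
            new ConcurrentHashMap<>();

    /** Creates an empty service entry if none exists; returns true if it was created. */
    boolean createServiceIfAbsent(String namespaceId, String serviceName) {
        Map<String, Map<String, Set<String>>> services =
                serviceMap.computeIfAbsent(namespaceId, k -> new ConcurrentHashMap<>());
        return services.putIfAbsent(serviceName, new ConcurrentHashMap<>()) == null;
    }

    /** Registers an instance, creating the service and cluster entries on demand. */
    void addInstance(String namespaceId, String serviceName, String clusterName, String ipAndPort) {
        createServiceIfAbsent(namespaceId, serviceName);
        serviceMap.get(namespaceId).get(serviceName)
                .computeIfAbsent(clusterName, k -> ConcurrentHashMap.newKeySet())
                .add(ipAndPort);
    }

    /** Returns the instances of one cluster, or an empty set if nothing is registered. */
    Set<String> instances(String namespaceId, String serviceName, String clusterName) {
        return serviceMap.getOrDefault(namespaceId, Collections.emptyMap())
                .getOrDefault(serviceName, Collections.emptyMap())
                .getOrDefault(clusterName, Collections.emptySet());
    }
}
```

This mirrors the point made in the text: a second registration for the same service does not create another service entry, but it always adds an instance.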

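Next, addInstance is called to add the instance. It first builds a key from the namespaceId, the serviceName, and the ephemeral flag, which uniquely identifies the service's instance list. It then calls addIpAddresses to fetch the existing client instances from the current Service object and add the newly registered instance to that list. The resulting list is written to the datastore, and a CHANGE event is added to the notifier, which triggers the Service object's onChange method and updates the instance list under the Service. This completes the service registration flow.
Datastore data structure: the datastore keeps all client instances in a Map<String, Datum> dataMap, where the key is the service key (the unique identifier of the service) and the value is a Datum object. A Datum is itself a key-value pair: its key is the same service key and its value is an Instances object, which holds all client instances under that service key in its instanceList.
A point worth analyzing: the Service object already records its instance list through clusterMap, so why does the datastore record it again? The Service is hierarchical data used mainly to answer service queries, while the datastore is a flat structure used mainly to keep data consistent across the nodes of the server cluster.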

public void addInstance(String namespaceId, String serviceName, boolean ephemeral, Instance... ips) throws NacosException {
    	// Build a key from namespaceId and serviceName as the unique identifier of the Service
        String key = KeyBuilder.buildInstanceListKey(namespaceId, serviceName, ephemeral);

        Service service = getService(namespaceId, serviceName);

        synchronized (service) {
        	// Get all instances under the Service and initialize any missing Cluster
            List<Instance> instanceList = addIpAddresses(service, ephemeral, ips);

            Instances instances = new Instances();
            instances.setInstanceList(instanceList);
            // Core method
            // consistencyService is injected as a DelegateConsistencyServiceImpl delegate
            consistencyService.put(key, instances);
        }
}

addIpAddresses internally calls the updateIpAddresses method:

public List<Instance> updateIpAddresses(Service service, String action, boolean ephemeral, Instance... ips) throws NacosException {

    	 // Get the Datum for the key
        Datum datum = consistencyService.get(KeyBuilder.buildInstanceListKey(service.getNamespaceId(), service.getName(), ephemeral));

        // Get all instances currently held by the Service object
        List<Instance> currentIPs = service.allIPs(ephemeral);
        // Build a map keyed by the instance's IP and port
        Map<String, Instance> currentInstances = new HashMap<>(currentIPs.size());
        Set<String> currentInstanceIds = Sets.newHashSet();

        // Collect the current instances into the map and the ID set
        for (Instance instance : currentIPs) {
            currentInstances.put(instance.toIPAddr(), instance);
            currentInstanceIds.add(instance.getInstanceId());
        }

        Map<String, Instance> instanceMap;
        if (datum != null) {
            instanceMap = setValid(((Instances) datum.value).getInstanceList(), currentInstances);
        } else {
            instanceMap = new HashMap<>(ips.length);
        }

        // Iterate over the instances being registered by the client
        for (Instance instance : ips) {
        	// If the service has no cluster matching the instance's cluster name, create one
            if (!service.getClusterMap().containsKey(instance.getClusterName())) {
            	// Create the Cluster object for the instance
                Cluster cluster = new Cluster(instance.getClusterName(), service);
                // Start the cluster's health check
                cluster.init();
                // Add the new Cluster object to the map
                service.getClusterMap().put(instance.getClusterName(), cluster);
                Loggers.SRV_LOG
                        .warn("cluster: {} not found, ip: {}, will create new cluster with default configuration.",
                                instance.getClusterName(), instance.toJSON());
            }

            if (UtilsAndCommons.UPDATE_INSTANCE_ACTION_REMOVE.equals(action)) {
            	// Remove the instance
                instanceMap.remove(instance.getDatumKey());
            } else {
            	// Assign a unique ID to the instance
                instance.setInstanceId(instance.generateInstanceId(currentInstanceIds));
                // Add the instance
                instanceMap.put(instance.getDatumKey(), instance);
            }
        }
        if (instanceMap.size() <= 0 && UtilsAndCommons.UPDATE_INSTANCE_ACTION_ADD.equals(action)) {
            throw new IllegalArgumentException("ip list can not be empty, service: " + service.getName() + ", ip list: "
                + JSON.toJSONString(instanceMap.values()));
        }
        // Return the full list of instances under the service
        return new ArrayList<>(instanceMap.values());
    }
@org.springframework.stereotype.Service("distroConsistencyService")
public class DistroConsistencyServiceImpl implements EphemeralConsistencyService {
    ....
    @Override
    public void put(String key, Record value) throws NacosException {
    	// Fill the datastore and queue a listener task that updates the instance list under the Service
        onPut(key, value);
        // Keep data consistent at runtime:
        // when a client instance is registered on one node of the server cluster, it is synchronized to the other nodes
        taskDispatcher.addTask(key);
    }
    
    public void onPut(String key, Record value) {
        if (KeyBuilder.matchEphemeralInstanceListKey(key)) {
            Datum<Instances> datum = new Datum<>();
            datum.value = (Instances) value;
            datum.key = key;
            datum.timestamp.incrementAndGet();
            // Record in the dataStore
            dataStore.put(key, datum);
        }
        if (!listeners.containsKey(key)) {
            return;
        }
        // Publish the change event to listeners
        notifier.addTask(key, ApplyAction.CHANGE);
    }
}
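The two steps above, merging the changed instances into the current list and then writing the result to a flat store that notifies a listener, can be sketched together. This is a simplified illustration under several assumptions: DatastoreSketch, buildKey, and merge are hypothetical, the key format is only an example and not the real KeyBuilder output, instances are plain "ip:port" strings, and the listener is invoked directly where the real code queues a CHANGE task on the notifier.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

/** Hypothetical sketch of a flat key -> instance-list store with change listeners. */
class DatastoreSketch {

    private final Map<String, List<String>> dataStore = new ConcurrentHashMap<>();
    private final Map<String, Consumer<List<String>>> listeners = new ConcurrentHashMap<>();

    /** Illustrative key format only; the real key layout is an implementation detail. */
    static String buildKey(String namespaceId, String serviceName, boolean ephemeral) {
        return (ephemeral ? "ephemeral." : "persistent.") + namespaceId + "##" + serviceName;
    }

    /** Applies an add or remove of the changed instances to the current list. */
    static List<String> merge(List<String> current, boolean remove, String... changed) {
        Set<String> merged = new LinkedHashSet<>(current);
        for (String ipAndPort : changed) {
            if (remove) {
                merged.remove(ipAndPort);
            } else {
                merged.add(ipAndPort);
            }
        }
        return new ArrayList<>(merged);
    }

    /** Registers a listener for one key, as the Service does for its instance list. */
    void listen(String key, Consumer<List<String>> listener) {
        listeners.put(key, listener);
    }

    /** Stores the full instance list and notifies the listener, if any. */
    void put(String key, List<String> instances) {
        dataStore.put(key, instances);
        Consumer<List<String>> listener = listeners.get(key);
        if (listener != null) {
            listener.accept(instances); // the real code queues this asynchronously
        }
    }

    List<String> get(String key) {
        return dataStore.get(key);
    }
}
```

Writing the whole merged list under one flat key, rather than a single instance, is what makes the store easy to replicate: a peer node only needs the key and its latest value.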

Client-Side Heartbeat Renewal

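As seen in the registration analysis, when a client instance is ephemeral, a BeatInfo object is built from the instance information and heartbeat renewal is started. For this purpose the client provides a heartbeat task, BeatTask, which by default runs every 5 s.
Heartbeat renewal flow: the request parameters are built from the BeatInfo and collected into a Map, and an HTTP PUT request carrying the heartbeat is sent to the server at /nacos/v1/ns/instance/beat. The client then reads the server's response and checks whether the code is 20404 (resource not found). If it is, the client builds an Instance object from the BeatInfo and registers the instance again. Finally, it submits the next heartbeat task to the scheduled thread pool to keep the heartbeat going.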

public void addBeatInfo(String serviceName, BeatInfo beatInfo) {
        NAMING_LOGGER.info("[BEAT] adding beat: {} to beat map.", beatInfo);
        String key = buildKey(serviceName, beatInfo.getIp(), beatInfo.getPort());
        BeatInfo existBeat = null;
        //fix #1733
        if ((existBeat = dom2Beat.remove(key)) != null) {
            existBeat.setStopped(true);
        }
        dom2Beat.put(key, beatInfo);
        // Submit the heartbeat task
        executorService.schedule(new BeatTask(beatInfo), beatInfo.getPeriod(), TimeUnit.MILLISECONDS);
        MetricsMonitor.getDom2BeatSizeMonitor().set(dom2Beat.size());
}
class BeatTask implements Runnable {

        BeatInfo beatInfo;

        public BeatTask(BeatInfo beatInfo) {
            this.beatInfo = beatInfo;
        }

        /**
         * Send a heartbeat to the server
         */
        @Override
        public void run() {
            if (beatInfo.isStopped()) {
                return;
            }
            long nextTime = beatInfo.getPeriod();
            try {
            	// Send the heartbeat
                JSONObject result = serverProxy.sendBeat(beatInfo, BeatReactor.this.lightBeatEnabled);
                long interval = result.getIntValue("clientBeatInterval");
                boolean lightBeatEnabled = false;
                if (result.containsKey(CommonParams.LIGHT_BEAT_ENABLED)) {
                    lightBeatEnabled = result.getBooleanValue(CommonParams.LIGHT_BEAT_ENABLED);
                }
                BeatReactor.this.lightBeatEnabled = lightBeatEnabled;
                if (interval > 0) {
                    nextTime = interval;
                }
                int code = NamingResponseCode.OK;
                if (result.containsKey(CommonParams.CODE)) {
                    code = result.getIntValue(CommonParams.CODE);
                }
                // The server did not find the instance, so register it again
                if (code == NamingResponseCode.RESOURCE_NOT_FOUND) {
                    Instance instance = new Instance();
                    instance.setPort(beatInfo.getPort());
                    instance.setIp(beatInfo.getIp());
                    instance.setWeight(beatInfo.getWeight());
                    instance.setMetadata(beatInfo.getMetadata());
                    instance.setClusterName(beatInfo.getCluster());
                    instance.setServiceName(beatInfo.getServiceName());
                    instance.setInstanceId(instance.getInstanceId());
                    instance.setEphemeral(true);
                    try {
                        // Register the service
                        serverProxy.registerService(beatInfo.getServiceName(),
                            NamingUtils.getGroupName(beatInfo.getServiceName()), instance);
                    } catch (Exception ignore) {
                    }
                }
            } catch (NacosException ne) {
                NAMING_LOGGER.error("[CLIENT-BEAT] failed to send beat: {}, code: {}, msg: {}",
                    JSON.toJSONString(beatInfo), ne.getErrCode(), ne.getErrMsg());

            }
            // Schedule the next heartbeat
            executorService.schedule(new BeatTask(beatInfo), nextTime, TimeUnit.MILLISECONDS);
        }
    }
public JSONObject sendBeat(BeatInfo beatInfo, boolean lightBeatEnabled) throws NacosException {

        if (NAMING_LOGGER.isDebugEnabled()) {
            NAMING_LOGGER.debug("[BEAT] {} sending beat to server: {}", namespaceId, beatInfo.toString());
        }
        // Build the HTTP request parameters for the heartbeat
        Map<String, String> params = new HashMap<String, String>(8);
        String body = StringUtils.EMPTY;
        if (!lightBeatEnabled) {
            try {
                body = "beat=" + URLEncoder.encode(JSON.toJSONString(beatInfo), "UTF-8");
            } catch (UnsupportedEncodingException e) {
                throw new NacosException(NacosException.SERVER_ERROR, "encode beatInfo error", e);
            }
        }
        params.put(CommonParams.NAMESPACE_ID, namespaceId);
        params.put(CommonParams.SERVICE_NAME, beatInfo.getServiceName());
        params.put(CommonParams.CLUSTER_NAME, beatInfo.getCluster());
        params.put("ip", beatInfo.getIp());
        params.put("port", String.valueOf(beatInfo.getPort()));
        // Send the heartbeat
        // Endpoint: /nacos/v1/ns/instance/beat
        String result = reqAPI(UtilAndComs.NACOS_URL_BASE + "/instance/beat", params, body, HttpMethod.PUT);
        return JSON.parseObject(result);
    }
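The heartbeat loop above is a self-rescheduling task rather than a fixed-rate timer, which is what lets the server change the interval. The sketch below captures that pattern in isolation. It is not the Nacos BeatReactor: BeatTaskSketch, BeatResult, and the injected sendBeat, reRegister, and scheduler callbacks are hypothetical, and the value of OK is an assumption (only 20404 is named in the text).

```java
import java.util.function.BiConsumer;
import java.util.function.Supplier;

/** Hypothetical sketch of a heartbeat task that reschedules itself after each run. */
class BeatTaskSketch implements Runnable {

    static final int OK = 10200;                 // assumed success code
    static final int RESOURCE_NOT_FOUND = 20404; // "not found" code named in the text

    /** Minimal stand-in for the server's heartbeat response. */
    static final class BeatResult {
        final int code;
        final long intervalMillis; // server-suggested interval, 0 means "keep current"

        BeatResult(int code, long intervalMillis) {
            this.code = code;
            this.intervalMillis = intervalMillis;
        }
    }

    private final long periodMillis;
    private final Supplier<BeatResult> sendBeat;        // stands in for the HTTP PUT
    private final Runnable reRegister;                  // stands in for registerService
    private final BiConsumer<Runnable, Long> scheduler; // (task, delayMillis)
    private volatile boolean stopped;

    BeatTaskSketch(long periodMillis, Supplier<BeatResult> sendBeat,
                   Runnable reRegister, BiConsumer<Runnable, Long> scheduler) {
        this.periodMillis = periodMillis;
        this.sendBeat = sendBeat;
        this.reRegister = reRegister;
        this.scheduler = scheduler;
    }

    void stop() {
        stopped = true;
    }

    @Override
    public void run() {
        if (stopped) {
            return; // a stopped beat breaks the chain by not rescheduling
        }
        long next = periodMillis;
        try {
            BeatResult result = sendBeat.get();
            if (result.intervalMillis > 0) {
                next = result.intervalMillis; // let the server tune the interval
            }
            if (result.code == RESOURCE_NOT_FOUND) {
                reRegister.run(); // the server lost the instance, so register again
            }
        } catch (RuntimeException e) {
            // A failed beat must not end the loop; fall through and reschedule.
        }
        scheduler.accept(this, next);
    }
}
```

With a real executor the scheduler argument could be written as (task, delay) -> executor.schedule(task, delay, TimeUnit.MILLISECONDS) on a ScheduledExecutorService. Injecting it as a callback keeps the sketch deterministic to test.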

Server-Side Handling of Heartbeat Requests

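On the Nacos server, InstanceController exposes an endpoint that receives heartbeat requests, implemented by the beat method.
Heartbeat handling flow: the server first reads the heartbeat information from the request and uses it to check whether the client instance is already registered on the current server node. If it is not, the server registers the instance that sent the heartbeat (or, when the request carries no beat body, returns the resource-not-found code so that the client registers again). Next, the heartbeat information is wrapped in a clientBeat object and processed: a ClientBeatProcessor task is created and submitted to a thread pool. When the task runs, it gets the client instance list from the current Service object and finds the instance whose IP and port match the heartbeat. It updates that instance's last heartbeat time and checks whether the instance is healthy; if not, it marks the instance healthy again. This completes the heartbeat renewal flow.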

    @CanDistro 
    @PutMapping("/beat")
    @Secured(parser = NamingResourceParser.class, action = ActionTypes.WRITE)
    public JSONObject beat(HttpServletRequest request) throws Exception {

    	// Build the result to return after the server accepts the heartbeat
        JSONObject result = new JSONObject();

        // Tell the client the heartbeat interval
        result.put("clientBeatInterval", switchDomain.getClientBeatInterval());
        String serviceName = WebUtils.required(request, CommonParams.SERVICE_NAME);
        String namespaceId = WebUtils.optional(request, CommonParams.NAMESPACE_ID,
            Constants.DEFAULT_NAMESPACE_ID);
        String clusterName = WebUtils.optional(request, CommonParams.CLUSTER_NAME,
            UtilsAndCommons.DEFAULT_CLUSTER_NAME);
        String ip = WebUtils.optional(request, "ip", StringUtils.EMPTY);
        int port = Integer.parseInt(WebUtils.optional(request, "port", "0"));
        String beat = WebUtils.optional(request, "beat", StringUtils.EMPTY);

        RsInfo clientBeat = null;
        if (StringUtils.isNotBlank(beat)) {
            clientBeat = JSON.parseObject(beat, RsInfo.class);
        }

        if (clientBeat != null) {
            if (StringUtils.isNotBlank(clientBeat.getCluster())) {
                clusterName = clientBeat.getCluster();
            } else {
                // fix #2533
                clientBeat.setCluster(clusterName);
            }
            ip = clientBeat.getIp();
            port = clientBeat.getPort();
        }

        if (Loggers.SRV_LOG.isDebugEnabled()) {
            Loggers.SRV_LOG.debug("[CLIENT-BEAT] full arguments: beat: {}, serviceName: {}", clientBeat, serviceName);
        }

        // Look up the already registered instance for this client
        Instance instance = serviceManager.getInstance(namespaceId, serviceName, clusterName, ip, port);

        // The instance does not exist
        if (instance == null) {
            if (clientBeat == null) {
                result.put(CommonParams.CODE, NamingResponseCode.RESOURCE_NOT_FOUND);
                return result;
            }
            instance = new Instance();
            instance.setPort(clientBeat.getPort());
            instance.setIp(clientBeat.getIp());
            instance.setWeight(clientBeat.getWeight());
            instance.setMetadata(clientBeat.getMetadata());
            instance.setClusterName(clusterName);
            instance.setServiceName(serviceName);
            instance.setInstanceId(instance.getInstanceId());
            instance.setEphemeral(clientBeat.isEphemeral());
            // Register the service and create the instance
            serviceManager.registerInstance(namespaceId, serviceName, instance);
        }

        Service service = serviceManager.getService(namespaceId, serviceName);

        if (service == null) {
            throw new NacosException(NacosException.SERVER_ERROR,
                "service not found: " + serviceName + "@" + namespaceId);
        }
        if (clientBeat == null) {
            clientBeat = new RsInfo();
            clientBeat.setIp(ip);
            clientBeat.setPort(port);
            clientBeat.setCluster(clusterName);
        }
        // Process the client heartbeat
        service.processClientBeat(clientBeat);

        result.put(CommonParams.CODE, NamingResponseCode.OK);
        result.put("clientBeatInterval", instance.getInstanceHeartBeatInterval());
        result.put(SwitchEntry.LIGHT_BEAT_ENABLED, switchDomain.isLightBeatEnabled());
        return result;
    }
public void processClientBeat(final RsInfo rsInfo) {
    	// Create a heartbeat-processing task
        ClientBeatProcessor clientBeatProcessor = new ClientBeatProcessor();
        clientBeatProcessor.setService(this);
        clientBeatProcessor.setRsInfo(rsInfo);
        // Submit the task to the thread pool
        HealthCheckReactor.scheduleNow(clientBeatProcessor);
}
public class ClientBeatProcessor implements Runnable {
   
   /**
     * Process the heartbeat
     */
    @Override
    public void run() {
        Service service = this.service;
        if (Loggers.EVT_LOG.isDebugEnabled()) {
            Loggers.EVT_LOG.debug("[CLIENT-BEAT] processing beat: {}", rsInfo.toString());
        }
        // Get the IP and port from the heartbeat parameters
        String ip = rsInfo.getIp();
        String clusterName = rsInfo.getCluster();
        int port = rsInfo.getPort();
        Cluster cluster = service.getClusterMap().get(clusterName);
        // Get all ephemeral instances in the cluster
        List<Instance> instances = cluster.allIPs(true);
        
        for (Instance instance : instances) {
        	// Find the instance that matches the heartbeat's IP and port
            if (instance.getIp().equals(ip) && instance.getPort() == port) {
                if (Loggers.EVT_LOG.isDebugEnabled()) {
                    Loggers.EVT_LOG.debug("[CLIENT-BEAT] refresh beat: {}", rsInfo.toString());
                }
                // Update the instance's last heartbeat time, completing the renewal
                instance.setLastBeat(System.currentTimeMillis());
                if (!instance.isMarked()) {
                	// Check whether the instance is healthy
                    if (!instance.isHealthy()) {
                        // Mark the instance healthy
                        instance.setHealthy(true);
                        Loggers.EVT_LOG
                                .info("service: {} {POS} {IP-ENABLED} valid: {}:{}@{}, region: {}, msg: client beat ok",
                                        cluster.getService().getName(), ip, port, cluster.getName(),
                                        UtilsAndCommons.LOCALHOST_SITE);
                        // Broadcast the change
                        getPushService().serviceChanged(service);
                    }
                }
            }
        }
    }
}
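The core of the beat processor is small: refresh the timestamp of the matching instance and, if it had been marked unhealthy, restore it and report that a push is needed. The following sketch shows just that logic. BeatProcessorSketch and its Instance class are hypothetical simplifications, and the current time is passed in as a parameter so the behavior is easy to verify.

```java
import java.util.List;

/** Hypothetical sketch of server-side heartbeat processing for one cluster. */
class BeatProcessorSketch {

    /** Reduced instance model holding only the fields the beat logic needs. */
    static final class Instance {
        final String ip;
        final int port;
        long lastBeat;
        boolean healthy = true;
        boolean marked; // marked instances are excluded from health changes

        Instance(String ip, int port, long lastBeat) {
            this.ip = ip;
            this.port = port;
            this.lastBeat = lastBeat;
        }
    }

    /**
     * Refreshes the last-beat time of the matching instance.
     * Returns true if an instance flipped from unhealthy to healthy,
     * which is the point where the real code pushes a change to clients.
     */
    static boolean processBeat(List<Instance> instances, String ip, int port, long now) {
        boolean recovered = false;
        for (Instance instance : instances) {
            if (instance.ip.equals(ip) && instance.port == port) {
                instance.lastBeat = now; // the renewal itself
                if (!instance.marked && !instance.healthy) {
                    instance.healthy = true;
                    recovered = true;
                }
            }
        }
        return recovered;
    }
}
```

A beat from a healthy instance therefore only updates lastBeat, while a beat from an instance that had timed out also flips it back to healthy.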

Server-Side Health Checks

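As seen in the server-side registration analysis, a Service object is created when one does not yet exist. Once created, the server starts a health check based on that Service object: it creates a ClientBeatCheckTask and runs it every 5 s.
Health check process: the task first gets the client instances under the current Service and, for each one, checks whether the current system time minus the time of its last heartbeat exceeds the heartbeat timeout (15 s by default). If it does and the instance is currently healthy, the instance is marked unhealthy. The task then checks, for each instance, whether the current system time minus the time of its last heartbeat exceeds the instance deletion timeout (30 s by default), and removes the instance if so.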

public void init() {
    	// Start the server-side health check
        HealthCheckReactor.scheduleCheck(clientBeatCheckTask);

        for (Map.Entry<String, Cluster> entry : clusterMap.entrySet()) {
            entry.getValue().setService(this);
            entry.getValue().init();
        }
}

public static void scheduleCheck(ClientBeatCheckTask task) {
         // Run the health check task on a fixed delay
        futureMap.putIfAbsent(task.taskKey(), EXECUTOR.scheduleWithFixedDelay(task, 5000, 5000, TimeUnit.MILLISECONDS));
}
public class ClientBeatCheckTask implements Runnable {
    ....
    /**
     *  Health check
     */
    @Override
    public void run() {
        try {
            if (!getDistroMapper().responsible(service.getName())) {
                return;
            }

            if (!getSwitchDomain().isHealthCheckEnabled()) {
                return;
            }
            // Get all ephemeral instances under the service
            List<Instance> instances = service.allIPs(true);

            // first set health status of instances:
            for (Instance instance : instances) {
            	// Current time minus the client's last heartbeat time exceeds the heartbeat timeout (15 s by default)
                if (System.currentTimeMillis() - instance.getLastBeat() > instance.getInstanceHeartBeatTimeOut()) {
                    if (!instance.isMarked()) {
                    	// Is the instance currently healthy?
                    	if (instance.isHealthy()) {
                    		 // Mark the instance unhealthy
                            instance.setHealthy(false);
                            Loggers.EVT_LOG.info("{POS} {IP-DISABLED} valid: {}:{}@{}@{}, region: {}, msg: client timeout after {}, last beat: {}",
                                instance.getIp(), instance.getPort(), instance.getClusterName(), service.getName(),
                                UtilsAndCommons.LOCALHOST_SITE, instance.getInstanceHeartBeatTimeOut(), instance.getLastBeat());
                            // Broadcast the change over UDP to tell clients that some instances are unhealthy
                            getPushService().serviceChanged(service);
                            //发布实例超时的一个事件
                            SpringContext.getAppContext().publishEvent(new InstanceHeartbeatTimeoutEvent(this, instance));
                        }
                    }
                }
            }

            if (!getGlobalConfig().isExpireInstance()) {
                return;
            }

            // then remove obsolete instances:
            for (Instance instance : instances) {

                if (instance.isMarked()) {
                    continue;
                }
                //当前时间减去客户端最后一次发送心跳时间    大于   默认实例删除时间(默认30s)
                if (System.currentTimeMillis() - instance.getLastBeat() > instance.getIpDeleteTimeout()) {
                    // delete instance
                    Loggers.SRV_LOG.info("[AUTO-DELETE-IP] service: {}, ip: {}", service.getName(), JSON.toJSONString(instance));
                    //剔除实例
                    deleteIP(instance);
                }
            }

        } catch (Exception e) {
            Loggers.SRV_LOG.warn("Exception while processing client beat time out.", e);
        }

    }
}
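
The two passes above can be condensed into a small, self-contained sketch. It is a simplified model under stated assumptions, not the Nacos code: BeatCheckSketch and Inst are hypothetical names, the two timeouts are hard-coded to the 15 s and 30 s defaults mentioned in this article, and the push notification, event publishing and the isMarked check are left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified, hypothetical model of the two-pass heartbeat check; not the Nacos classes. */
final class BeatCheckSketch {
    static final long HEARTBEAT_TIMEOUT_MS = 15_000; // default quoted in the article
    static final long IP_DELETE_TIMEOUT_MS = 30_000; // default quoted in the article

    /** Minimal instance record: address, last heartbeat timestamp, health flag. */
    static final class Inst {
        final String ip;
        final long lastBeat;
        boolean healthy = true;

        Inst(String ip, long lastBeat) {
            this.ip = ip;
            this.lastBeat = lastBeat;
        }
    }

    /** Marks stale instances unhealthy, then returns the ones that should be removed. */
    static List<Inst> check(List<Inst> instances, long now) {
        // Pass 1: flip healthy -> unhealthy once the heartbeat timeout is exceeded.
        for (Inst inst : instances) {
            if (now - inst.lastBeat > HEARTBEAT_TIMEOUT_MS && inst.healthy) {
                inst.healthy = false;
            }
        }
        // Pass 2: collect instances that have been silent longer than the delete timeout.
        List<Inst> expired = new ArrayList<>();
        for (Inst inst : instances) {
            if (now - inst.lastBeat > IP_DELETE_TIMEOUT_MS) {
                expired.add(inst);
            }
        }
        return expired;
    }
}
```

With these defaults, an instance that last sent a heartbeat 20 s ago is only marked unhealthy, while one that has been silent for 40 s is both marked unhealthy and returned for removal.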

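Removing an instance

An instance is removed by sending an HTTP DELETE request to the current server node itself (127.0.0.1). While handling that request, the server looks up the Service object, takes the client instance list under that service, removes the target instance from the list, and then writes the updated list back to the Datastore.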

private void deleteIP(Instance instance) {

        try {
            NamingProxy.Request request = NamingProxy.Request.newRequest();
            request.appendParam("ip", instance.getIp())
                .appendParam("port", String.valueOf(instance.getPort()))
                .appendParam("ephemeral", "true")
                .appendParam("clusterName", instance.getClusterName())
                .appendParam("serviceName", service.getName())
                .appendParam("namespaceId", service.getNamespaceId());
            String url = "http://127.0.0.1:" + RunningConfig.getServerPort() + RunningConfig.getContextPath()
                + UtilsAndCommons.NACOS_NAMING_CONTEXT + "/instance?" + request.toUrl();

            // delete instance asynchronously:
            // Send an HTTP DELETE request to the local node
            HttpClient.asyncHttpDelete(url, null, null, new AsyncCompletionHandler() {
                @Override
                public Object onCompleted(Response response) throws Exception {
                    if (response.getStatusCode() != HttpURLConnection.HTTP_OK) {
                        Loggers.SRV_LOG.error("[IP-DEAD] failed to delete ip automatically, ip: {}, caused {}, resp code: {}",
                            instance.toJSON(), response.getResponseBody(), response.getStatusCode());
                    }
                    return null;
                }
            });
        } catch (Exception e) {
            Loggers.SRV_LOG.error("[IP-DEAD] failed to delete ip automatically, ip: {}, error: {}", instance.toJSON(), e);
        }
}
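
To make the shape of that loopback request concrete, here is a minimal sketch that assembles a comparable URL from the same kinds of parameters. LoopbackDeleteUrlSketch and its build method are hypothetical, and the base path passed in the usage note ("/nacos/v1/ns") is an assumed value for the context path plus the naming context, not something taken from the source shown here.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Map;

/** Hypothetical helper that mirrors how the delete URL is put together; not Nacos code. */
final class LoopbackDeleteUrlSketch {

    /** Builds http://127.0.0.1:{port}{basePath}/instance?k=v&... with URL-encoded parameters. */
    static String build(int serverPort, String basePath, Map<String, String> params) {
        StringBuilder url = new StringBuilder("http://127.0.0.1:")
            .append(serverPort)
            .append(basePath)
            .append("/instance");
        char separator = '?';
        for (Map.Entry<String, String> entry : params.entrySet()) {
            url.append(separator)
               .append(encode(entry.getKey()))
               .append('=')
               .append(encode(entry.getValue()));
            separator = '&';
        }
        return url.toString();
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }
}
```

For example, build(8848, "/nacos/v1/ns", params) with an insertion-ordered map holding ip, port, ephemeral and serviceName yields a URL whose query string carries those four parameters in order, with characters such as @ percent-encoded.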

On the Nacos server side, the InstanceController class exposes an endpoint that receives instance removal requests; the corresponding method is deregister

    /**
     * Remove an instance
     * @param request the HTTP DELETE request carrying the instance parameters
     * @return "ok" once the request has been handled
     * @throws Exception
     */
    @CanDistro
    @DeleteMapping
    @Secured(parser = NamingResourceParser.class, action = ActionTypes.WRITE)
    public String deregister(HttpServletRequest request) throws Exception {
        // Build the client instance from the request parameters
        Instance instance = getIPAddress(request);
        String namespaceId = WebUtils.optional(request, CommonParams.NAMESPACE_ID,
            Constants.DEFAULT_NAMESPACE_ID);
        String serviceName = WebUtils.required(request, CommonParams.SERVICE_NAME);

        // Look up the service object
        Service service = serviceManager.getService(namespaceId, serviceName);
        if (service == null) {
            Loggers.SRV_LOG.warn("remove instance from non-exist service: {}", serviceName);
            return "ok";
        }

        // Remove the instance from the service
        serviceManager.removeInstance(namespaceId, serviceName, instance.isEphemeral(), instance);

        return "ok";
    }

    // Note: deregister calls a four-argument overload, which presumably resolves the Service and delegates to this method
    public void removeInstance(String namespaceId, String serviceName, boolean ephemeral, Service service, Instance... ips) throws NacosException {

        // Build the unique key of the service's instance list
        String key = KeyBuilder.buildInstanceListKey(namespaceId, serviceName, ephemeral);

        // Compute the service's instance list with the given client instances removed
        List<Instance> instanceList = substractIpAddresses(service, ephemeral, ips);

        Instances instances = new Instances();
        // Wrap the updated list
        instances.setInstanceList(instanceList);

        // Write the updated list back through the consistency service
        consistencyService.put(key, instances);
    }
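
The removal step follows a "compute a new list, then replace the stored value" pattern. A minimal sketch of that pattern, assuming a plain in-memory map in place of the consistency service and strings in place of Instance objects (RemoveInstanceSketch, datastore and the key used in the usage note are all hypothetical):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical model of "subtract, then put"; not the Nacos ServiceManager. */
final class RemoveInstanceSketch {
    // Stands in for the consistency service / Datastore: instance list key -> addresses.
    final Map<String, List<String>> datastore = new ConcurrentHashMap<>();

    /** Builds a new list without the given addresses and replaces the stored list with it. */
    void removeInstance(String key, String... addresses) {
        List<String> current = datastore.getOrDefault(key, new ArrayList<>());
        // Copy first so the previously stored list is never mutated in place.
        List<String> remaining = new ArrayList<>(current);
        for (String address : addresses) {
            remaining.remove(address);
        }
        datastore.put(key, remaining);
    }
}
```

If "demo-key" maps to two addresses and removeInstance("demo-key", "10.0.0.1:8080") is called, the stored list afterwards contains only the other address, and the original list object is left untouched.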

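Summary

This article walked through the core implementation of service registration, heartbeat renewal and health checking in Spring Cloud Alibaba Nacos. Understanding these mechanisms gives a deeper picture of how Nacos works. The Nacos code base also contains many design choices and details worth borrowing, and studying them closely can noticeably improve our own coding skills.

My own ability is limited, so if any part of the analysis is inaccurate or the article contains errors, corrections are welcome. Thank you very much!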
