Nacos: Service Registration and the Heartbeat Mechanism

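1. Background

I have been using Nacos in projects for a long time, both as a configuration center and for service discovery. As far as registries go, I have used Zookeeper, Eureka, and Nacos, but my understanding stayed at the surface: I never looked into how a registry health-checks services or handles data consistency. My interest in the Nacos source code began with an intermittent bug in a production environment. Using Nacos as the entry point for understanding registries in depth helps with two things:

  1. choosing a registry with better judgment;
  2. understanding how a registry, as a typical distributed system, solves distributed-system problems such as data consistency.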

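2. Prerequisites

  1. The Nacos version covered here is 1.4.2. Nacos has since released 2.x versions, which make fairly large changes to the remote communication protocol; I may analyze version 2.0.3 later;

  2. This article covers only source-code analysis and does not explain how to use Nacos;

  3. In real projects Nacos is usually integrated through Spring Cloud Alibaba, so the walkthrough starts from Spring Cloud Alibaba;

  4. Nacos distinguishes persistent instances from ephemeral (non-persistent) instances; this article covers ephemeral instances.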

    Nacos: 1.4.2 Spring Cloud Alibaba: 2.2.4.RELEASE

3. Nacos service registration flow

[Figure: Nacos service registration flow diagram (image not available)]

4. Maven dependencies

<dependency>
    <groupId>com.alibaba.cloud</groupId>
    <artifactId>spring-cloud-starter-alibaba-nacos-discovery</artifactId>
    <exclusions>
        <exclusion>
            <groupId>com.alibaba.nacos</groupId>
            <artifactId>nacos-client</artifactId>
        </exclusion>
    </exclusions>
</dependency>

<dependency>
    <groupId>com.alibaba.nacos</groupId>
    <artifactId>nacos-client</artifactId>
    <version>1.4.2</version>
</dependency>

5. Service registration

  • Initialization of the registration classes

    The spring.factories file in the spring-cloud-starter-alibaba-nacos-discovery project:

    org.springframework.boot.autoconfigure.EnableAutoConfiguration=\
      com.alibaba.cloud.nacos.discovery.NacosDiscoveryAutoConfiguration,\
      com.alibaba.cloud.nacos.ribbon.RibbonNacosAutoConfiguration,\
      com.alibaba.cloud.nacos.endpoint.NacosDiscoveryEndpointAutoConfiguration,\
      com.alibaba.cloud.nacos.registry.NacosServiceRegistryAutoConfiguration,\
      com.alibaba.cloud.nacos.discovery.NacosDiscoveryClientConfiguration,\
      com.alibaba.cloud.nacos.discovery.reactive.NacosReactiveDiscoveryClientConfiguration,\
      com.alibaba.cloud.nacos.discovery.configclient.NacosConfigServerAutoConfiguration,\
      com.alibaba.cloud.nacos.NacosServiceAutoConfiguration
    org.springframework.cloud.bootstrap.BootstrapConfiguration=\
      com.alibaba.cloud.nacos.discovery.configclient.NacosDiscoveryClientConfigServiceBootstrapConfiguration
    
    

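    As its name and contents suggest, NacosServiceRegistryAutoConfiguration is responsible for initializing the registration-related classes; it creates the NacosServiceRegistry bean.

    The key registration method in NacosServiceRegistry: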

    @Override
    public void register(Registration registration) {
    
        if (StringUtils.isEmpty(registration.getServiceId())) {
            log.warn("No service to register for nacos client...");
            return;
        }
    
        // Get the NamingService, the interface Nacos exposes for service discovery
        NamingService namingService = namingService();
        String serviceId = registration.getServiceId();
        // Nacos group name
        String group = nacosDiscoveryProperties.getGroup();
    
        // Build the Instance object to register
        Instance instance = getNacosInstanceFromRegistration(registration);
    
        try {
            // Register the instance
            namingService.registerInstance(serviceId, group, instance);
            log.info("nacos registry, {} {} {}:{} register finished", group, serviceId,
                     instance.getIp(), instance.getPort());
        }
        catch (Exception e) {
            log.error("nacos registry, {} register failed...{},", serviceId,
                      registration.toString(), e);
            // rethrow a RuntimeException if the registration is failed.
            // issue : https://github.com/alibaba/spring-cloud-alibaba/issues/1132
            rethrowRuntimeException(e);
        }
    }
    
    /**
     * Returns the singleton NamingService.
     * @return
     */
    private NamingService namingService() {
        return nacosServiceManager
            .getNamingService(nacosDiscoveryProperties.getNacosProperties());
    }
    
    private Instance getNacosInstanceFromRegistration(Registration registration) {
        Instance instance = new Instance();
        instance.setIp(registration.getHost());
        instance.setPort(registration.getPort());
        instance.setWeight(nacosDiscoveryProperties.getWeight());
        instance.setClusterName(nacosDiscoveryProperties.getClusterName());
        instance.setEnabled(nacosDiscoveryProperties.isInstanceEnabled());
        instance.setMetadata(registration.getMetadata());
        instance.setEphemeral(nacosDiscoveryProperties.isEphemeral());
        return instance;
    }
    

    register() calls NamingService.registerInstance() to register the instance.

  • Registration logic of NamingService.registerInstance()

    This class is maintained in the nacos-client package.

    NamingService.registerInstance()

    @Override
    public void registerInstance(String serviceName, String groupName, Instance instance) throws NacosException {
        // Validate the property values of the instance
        NamingUtils.checkInstanceIsLegal(instance);
        // groupName@@serviceName
        String groupedServiceName = NamingUtils.getGroupedName(serviceName, groupName);
        // Ephemeral instances need heartbeats to be started
        if (instance.isEphemeral()) {
            BeatInfo beatInfo = beatReactor.buildBeatInfo(groupedServiceName, instance);
            beatReactor.addBeatInfo(groupedServiceName, beatInfo);
        }
        // Register with the Nacos server
        serverProxy.registerService(groupedServiceName, groupName, instance);
    }
    

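    What this method does:

    1. Starts heartbeats for ephemeral instances (analyzed in detail in the health check section)
    2. Calls the NamingProxy class to perform the registration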
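
As a small illustration of the `groupName@@serviceName` naming noted in the listing above, here is a minimal sketch. `GroupedNames` and its methods are hypothetical helpers written for this article, not the nacos-client `NamingUtils` API, and the validation rules are assumptions.

```java
/** Hypothetical helper illustrating the "groupName@@serviceName" convention. */
final class GroupedNames {

    // Separator taken from the "groupName@@serviceName" comment in the listing above.
    static final String SEPARATOR = "@@";

    private GroupedNames() {
    }

    /** Joins a group and a service name into "group@@service". */
    static String grouped(String serviceName, String groupName) {
        if (serviceName == null || serviceName.isEmpty()) {
            throw new IllegalArgumentException("serviceName must not be empty");
        }
        if (groupName == null || groupName.isEmpty()) {
            throw new IllegalArgumentException("groupName must not be empty");
        }
        return groupName + SEPARATOR + serviceName;
    }

    /** Returns the group part of a grouped name. */
    static String groupOf(String groupedName) {
        int idx = groupedName.indexOf(SEPARATOR);
        if (idx < 0) {
            throw new IllegalArgumentException("not a grouped name: " + groupedName);
        }
        return groupedName.substring(0, idx);
    }

    /** Returns the service part of a grouped name. */
    static String serviceOf(String groupedName) {
        int idx = groupedName.indexOf(SEPARATOR);
        if (idx < 0) {
            throw new IllegalArgumentException("not a grouped name: " + groupedName);
        }
        return groupedName.substring(idx + SEPARATOR.length());
    }
}
```

For example, `GroupedNames.grouped("order-service", "DEFAULT_GROUP")` yields a name that the server side can later split back into its group and service parts.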
  • NamingProxy

    /**
     * Register an instance.
     */
    public void registerService(String serviceName, String groupName, Instance instance) throws NacosException {
    
        NAMING_LOGGER.info("[REGISTER-SERVICE] {} registering service {} with instance: {}", namespaceId, serviceName,
                           instance);
    
        final Map<String, String> params = new HashMap<String, String>(16);
        params.put(CommonParams.NAMESPACE_ID, namespaceId);
        params.put(CommonParams.SERVICE_NAME, serviceName);
        params.put(CommonParams.GROUP_NAME, groupName);
        params.put(CommonParams.CLUSTER_NAME, instance.getClusterName());
        params.put("ip", instance.getIp());
        params.put("port", String.valueOf(instance.getPort()));
        params.put("weight", String.valueOf(instance.getWeight()));
        params.put("enable", String.valueOf(instance.isEnabled()));
        params.put("healthy", String.valueOf(instance.isHealthy()));
        params.put("ephemeral", String.valueOf(instance.isEphemeral()));
        params.put("metadata", JacksonUtils.toJson(instance.getMetadata()));
        // Call the /instance endpoint over HTTP
        reqApi(UtilAndComs.nacosUrlInstance, params, HttpMethod.POST);
    }
    
    public String reqApi(String api, Map<String, String> params, String method) throws NacosException {
        return reqApi(api, params, Collections.EMPTY_MAP, method);
    }
    
    public String reqApi(String api, Map<String, String> params, Map<String, String> body, String method)
        throws NacosException {
        return reqApi(api, params, body, getServerList(), method);
    }
    
    /**
     * Retry mechanism for calling the Nacos API.
     */
    public String reqApi(String api, Map<String, String> params, Map<String, String> body, List<String> servers,
                         String method) throws NacosException {
    
        params.put(CommonParams.NAMESPACE_ID, getNamespaceId());
    
        if (CollectionUtils.isEmpty(servers) && StringUtils.isBlank(nacosDomain)) {
            throw new NacosException(NacosException.INVALID_PARAM, "no server available");
        }
    
        NacosException exception = new NacosException();
        // When a Nacos server domain is configured, retry the call on failure up to maxRetry times
        if (StringUtils.isNotBlank(nacosDomain)) {
            for (int i = 0; i < maxRetry; i++) {
                try {
                    return callServer(api, params, body, nacosDomain, method);
                } catch (NacosException e) {
                    exception = e;
                    if (NAMING_LOGGER.isDebugEnabled()) {
                        NAMING_LOGGER.debug("request {} failed.", nacosDomain, e);
                    }
                }
            }
        } else {
            // Otherwise start from a randomly chosen server address; on failure move on to the next address until every server address has been tried
            Random random = new Random(System.currentTimeMillis());
            int index = random.nextInt(servers.size());
    
            for (int i = 0; i < servers.size(); i++) {
                String server = servers.get(index);
                try {
                    return callServer(api, params, body, server, method);
                } catch (NacosException e) {
                    exception = e;
                    if (NAMING_LOGGER.isDebugEnabled()) {
                        NAMING_LOGGER.debug("request {} failed.", server, e);
                    }
                }
                index = (index + 1) % servers.size();
            }
        }
    
        NAMING_LOGGER.error("request: {} failed, servers: {}, code: {}, msg: {}", api, servers, exception.getErrCode(),
                            exception.getErrMsg());
    
        throw new NacosException(exception.getErrCode(),
                                 "failed to req API:" + api + " after all servers(" + servers + ") tried: " + exception.getMessage());
    
    }
    
    
    /**
     * Call a Nacos server endpoint.
     */
    public String callServer(String api, Map<String, String> params, Map<String, String> body, String curServer,
                             String method) throws NacosException {
        long start = System.currentTimeMillis();
        long end = 0;
        injectSecurityInfo(params);
        Header header = builderHeader();
    
        String url;
        if (curServer.startsWith(UtilAndComs.HTTPS) || curServer.startsWith(UtilAndComs.HTTP)) {
            url = curServer + api;
        } else {
            if (!IPUtil.containsPort(curServer)) {
                curServer = curServer + IPUtil.IP_PORT_SPLITER + serverPort;
            }
            url = NamingHttpClientManager.getInstance().getPrefix() + curServer + api;
        }
    
        try {
            // Use NacosRestTemplate to talk to the Nacos server over HTTP
            HttpRestResult<String> restResult = nacosRestTemplate
                .exchangeForm(url, header, Query.newInstance().initParams(params), body, method, String.class);
            end = System.currentTimeMillis();
    
            MetricsMonitor.getNamingRequestMonitor(method, url, String.valueOf(restResult.getCode()))
                .observe(end - start);
    
            if (restResult.ok()) {
                return restResult.getData();
            }
            if (HttpStatus.SC_NOT_MODIFIED == restResult.getCode()) {
                return StringUtils.EMPTY;
            }
            throw new NacosException(restResult.getCode(), restResult.getMessage());
        } catch (Exception e) {
            NAMING_LOGGER.error("[NA] failed to request", e);
            throw new NacosException(NacosException.SERVER_ERROR, e);
        }
    }
    

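    That completes the client-side logic for registering an instance with the server: in short, the client calls a Nacos server API over HTTP. Next, let's look at the logic behind that API.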
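
The server-list branch of reqApi() (random starting index, then wrap around until every address has been tried once) can be sketched in isolation. This is a simplified illustration, not the NamingProxy code: `FailoverCaller` is a hypothetical class, the request is modeled as a plain `Function`, and only `RuntimeException` is treated as a failed call.

```java
import java.util.List;
import java.util.Random;
import java.util.function.Function;

/** Hypothetical sketch of "random start, then try each server once" failover. */
final class FailoverCaller {

    private final Random random;

    FailoverCaller(Random random) {
        this.random = random;
    }

    /**
     * Tries every server at most once, starting at a random index and wrapping around.
     * Returns the first successful response; throws if every server fails.
     */
    <T> T call(List<String> servers, Function<String, T> request) {
        if (servers == null || servers.isEmpty()) {
            throw new IllegalArgumentException("no server available");
        }
        RuntimeException last = null;
        int index = random.nextInt(servers.size());
        for (int i = 0; i < servers.size(); i++) {
            String server = servers.get(index);
            try {
                return request.apply(server);
            } catch (RuntimeException e) {
                // Remember the failure and move on to the next address.
                last = e;
            }
            index = (index + 1) % servers.size();
        }
        throw new IllegalStateException("all servers failed: " + servers, last);
    }
}
```

Starting at a random index spreads requests across the server nodes, while the wrap-around loop guarantees that a single unreachable node does not fail the call as long as another node answers.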

  • Logic of the POST /v1/ns/instance endpoint

    The endpoint is defined in InstanceController:

    /**
     * Register new instance.
     */
    @CanDistro
    @PostMapping
    @Secured(parser = NamingResourceParser.class, action = ActionTypes.WRITE)
    public String register(HttpServletRequest request) throws Exception {
        // Get the namespace
        final String namespaceId = WebUtils
            .optional(request, CommonParams.NAMESPACE_ID, Constants.DEFAULT_NAMESPACE_ID);
        // Check that the service name has the form group@@serviceName
        final String serviceName = WebUtils.required(request, CommonParams.SERVICE_NAME);
        NamingUtils.checkServiceNameFormat(serviceName);
        // Validate the instance parameters and fill in missing values
        final Instance instance = parseInstance(request);
        // Register the instance
        serviceManager.registerInstance(namespaceId, serviceName, instance);
        return "ok";
    }
    

    ServiceManager

    /**
     * Map(namespace, Map(group::serviceName, Service)).
     */
    private final Map<String, Map<String, Service>> serviceMap = new ConcurrentHashMap<>();
    
    /**
     * Add an instance to the Service in AP mode.
     */
    public void registerInstance(String namespaceId, String serviceName, Instance instance) throws NacosException {
        // Create an empty Service object in serviceMap
        createEmptyService(namespaceId, serviceName, instance.isEphemeral());
        // Fetch the Service object that was just created
        Service service = getService(namespaceId, serviceName);
    
        if (service == null) {
            throw new NacosException(NacosException.INVALID_PARAM,
                                     "service not found, namespace: " + namespaceId + ", service: " + serviceName);
        }
        // Add the instance
        addInstance(namespaceId, serviceName, instance.isEphemeral(), instance);
    }
    
    /**
     * Add instances to the Service.
     */
    public void addInstance(String namespaceId, String serviceName, boolean ephemeral, Instance... ips)
        throws NacosException {
    
        String key = KeyBuilder.buildInstanceListKey(namespaceId, serviceName, ephemeral);
    
        Service service = getService(namespaceId, serviceName);
    
        synchronized (service) {
            // Compute the Service's new instance list, including the instances being added
            List<Instance> instanceList = addIpAddresses(service, ephemeral, ips);
    
            Instances instances = new Instances();
            instances.setInstanceList(instanceList);
    
            consistencyService.put(key, instances);
        }
    }
    
    private List<Instance> addIpAddresses(Service service, boolean ephemeral, Instance... ips) throws NacosException {
        return updateIpAddresses(service, UtilsAndCommons.UPDATE_INSTANCE_ACTION_ADD, ephemeral, ips);
    }
    
    
    /**
     * Add, update, or remove instances of a Service.
     */
    public List<Instance> updateIpAddresses(Service service, String action, boolean ephemeral, Instance... ips)
        throws NacosException {
    
        Datum datum = consistencyService
            .get(KeyBuilder.buildInstanceListKey(service.getNamespaceId(), service.getName(), ephemeral));
        // Get all current instances of the Service
        List<Instance> currentIPs = service.allIPs(ephemeral);
        // Index the current instances by address and collect their instance IDs
        Map<String, Instance> currentInstances = new HashMap<>(currentIPs.size());
        Set<String> currentInstanceIds = Sets.newHashSet();
    
        for (Instance instance : currentIPs) {
            currentInstances.put(instance.toIpAddr(), instance);
            currentInstanceIds.add(instance.getInstanceId());
        }
    
        Map<String, Instance> instanceMap;
        if (datum != null && null != datum.value) {
            instanceMap = setValid(((Instances) datum.value).getInstanceList(), currentInstances);
        } else {
            instanceMap = new HashMap<>(ips.length);
        }
    
        for (Instance instance : ips) {
            if (!service.getClusterMap().containsKey(instance.getClusterName())) {
                Cluster cluster = new Cluster(instance.getClusterName(), service);
                cluster.init();
                // Add the newly created cluster to the Service
                service.getClusterMap().put(instance.getClusterName(), cluster);
                Loggers.SRV_LOG
                    .warn("cluster: {} not found, ip: {}, will create new cluster with default configuration.",
                          instance.getClusterName(), instance.toJson());
            }
    
            // Remove the instance, or add/update it
            if (UtilsAndCommons.UPDATE_INSTANCE_ACTION_REMOVE.equals(action)) {
                instanceMap.remove(instance.getDatumKey());
            } else {
                Instance oldInstance = instanceMap.get(instance.getDatumKey());
                if (oldInstance != null) {
                    instance.setInstanceId(oldInstance.getInstanceId());
                } else {
                    instance.setInstanceId(instance.generateInstanceId(currentInstanceIds));
                }
                instanceMap.put(instance.getDatumKey(), instance);
            }
    
        }
    
        if (instanceMap.size() <= 0 && UtilsAndCommons.UPDATE_INSTANCE_ACTION_ADD.equals(action)) {
            throw new IllegalArgumentException(
                "ip list can not be empty, service: " + service.getName() + ", ip list: " + JacksonUtils
                .toJson(instanceMap.values()));
        }
    
        return new ArrayList<>(instanceMap.values());
    }
    

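A service contains multiple instances, and all services are stored in serviceMap. Registration also involves keeping data consistent across Nacos server nodes; a separate article will cover data consistency in Nacos AP mode.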
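
The two-level layout described in the serviceMap Javadoc, Map(namespace, Map(group@@serviceName, Service)), can be illustrated with a much smaller model. In this sketch `ServiceRegistrySketch` is hypothetical and a "Service" is reduced to a set of "ip:port" strings; clusters, ephemeral flags, and the consistency layer are deliberately left out.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical, simplified model of a namespace -> service -> instances registry. */
final class ServiceRegistrySketch {

    // namespace -> (group@@serviceName -> set of "ip:port")
    private final Map<String, Map<String, Set<String>>> serviceMap = new ConcurrentHashMap<>();

    /** Registers an instance, creating the namespace and service entries if absent. */
    void register(String namespaceId, String groupedServiceName, String ip, int port) {
        serviceMap
            .computeIfAbsent(namespaceId, ns -> new ConcurrentHashMap<>())
            .computeIfAbsent(groupedServiceName, name -> ConcurrentHashMap.<String>newKeySet())
            .add(ip + ":" + port);
    }

    /** Returns a read-only view of the instances of a service, or an empty set. */
    Set<String> instances(String namespaceId, String groupedServiceName) {
        Map<String, Set<String>> services = serviceMap.get(namespaceId);
        if (services == null) {
            return Collections.<String>emptySet();
        }
        Set<String> instances = services.get(groupedServiceName);
        if (instances == null) {
            return Collections.<String>emptySet();
        }
        return Collections.unmodifiableSet(instances);
    }
}
```

Registering the same address twice leaves a single entry, which loosely mirrors how the listing above keys instances by their datum key before storing the list.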

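6. Health check

Nacos has several health-check mechanisms: between Nacos server nodes, from the server to persistent instances, and from the server to ephemeral instances. This section covers the server's health check of ephemeral instances. Why is a health check needed? Without one, when an instance goes offline, subscribers of the service would still receive the offline instance and their calls would fail. The server therefore needs to know whether each instance is healthy so that subscribers are not given offline instances.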

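For ephemeral instances the mechanism is heartbeat-based: each instance periodically sends a heartbeat to the server, which updates the instance's last heartbeat time. The server in turn periodically checks each instance's last heartbeat time; if it is older than the configured time threshold, the instance is considered unhealthy and is removed.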
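
The server-side half of this idea can be sketched as a table of last heartbeat times plus a periodic sweep. This is a minimal illustration under stated assumptions: `HeartbeatTable` is a hypothetical class, the timeout is whatever the caller passes in (the real thresholds are not given in this article), and time is supplied as a parameter so the logic stays deterministic.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch: track last heartbeat times and evict instances that went silent. */
final class HeartbeatTable {

    private final long timeoutMillis;

    // instance key (for example "ip:port") -> time of the last heartbeat in milliseconds
    private final Map<String, Long> lastBeat = new ConcurrentHashMap<>();

    HeartbeatTable(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Called whenever a heartbeat arrives for an instance. */
    void onBeat(String instanceKey, long nowMillis) {
        lastBeat.put(instanceKey, nowMillis);
    }

    /** Called periodically; removes instances whose last beat is older than the timeout. */
    int evictExpired(long nowMillis) {
        int removed = 0;
        Iterator<Map.Entry<String, Long>> it = lastBeat.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMillis - entry.getValue() > timeoutMillis) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    boolean contains(String instanceKey) {
        return lastBeat.containsKey(instanceKey);
    }
}
```

In a running server the sweep would be driven by a scheduled task; here the caller simply invokes `evictExpired(now)` whenever it wants the check to run.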

The health check for an ephemeral instance is set up when the instance registers, so let's return to instance registration:

  • NacosNamingService.registerInstance()
@Override
public void registerInstance(String serviceName, String groupName, Instance instance) throws NacosException {
    // Validate the property values of the instance
    NamingUtils.checkInstanceIsLegal(instance);
    // groupName@@serviceName
    String groupedServiceName = NamingUtils.getGroupedName(serviceName, groupName);
    // Ephemeral instances need heartbeats to be started
    if (instance.isEphemeral()) {
        BeatInfo beatInfo = beatReactor.buildBeatInfo(groupedServiceName, instance);
        beatReactor.addBeatInfo(groupedServiceName, beatInfo);
    }
    // Register with the Nacos server
    serverProxy.registerService(groupedServiceName, groupName, instance);
}

beatReactor is the component that drives the heartbeat. Here is how it is initialized:

public NacosNamingService(Properties properties) throws NacosException {
    init(properties);
}

private void init(Properties properties) throws NacosException {
    ValidatorUtils.checkInitParam(properties);
    this.namespace = InitUtils.initNamespaceForNaming(properties);
    InitUtils.initSerialization();
    initServerAddr(properties);
    InitUtils.initWebRootContext(properties);
    initCacheDir();
    initLogName(properties);

    this.serverProxy = new NamingProxy(this.namespace, this.endpoint, this.serverList, properties);
    this.beatReactor = new BeatReactor(this.serverProxy, initClientBeatThreadCount(properties));
    this.hostReactor = new HostReactor(this.serverProxy, beatReactor, this.cacheDir, isLoadCacheAtStart(properties),
                                       isPushEmptyProtect(properties), initPollingThreadCount(properties));
}

BeatReactor is initialized in the NacosNamingService constructor.

Key logic of BeatReactor:

private final ScheduledExecutorService executorService;

private final NamingProxy serverProxy;

private boolean lightBeatEnabled = false;

public final Map<String, BeatInfo> dom2Beat = new ConcurrentHashMap<String, BeatInfo>();

public BeatReactor(NamingProxy serverProxy, int threadCount) {
    this.serverProxy = serverProxy;
    // Create the scheduled thread pool
    this.executorService = new ScheduledThreadPoolExecutor(threadCount, new ThreadFactory() {
        @Override
        public Thread newThread(Runnable r) {
            Thread thread = new Thread(r);
            thread.setDaemon(true);
            thread.setName("com.alibaba.nacos.naming.beat.sender");
            return thread;
        }
    });
}

/**
 * Schedule a heartbeat task on the thread pool.
 */
public void addBeatInfo(String serviceName, BeatInfo beatInfo) {
    NAMING_LOGGER.info("[BEAT] adding beat: {} to beat map.", beatInfo);
    // serviceName#ip#port
    String key = buildKey(serviceName, beatInfo.getIp(), beatInfo.getPort());
    BeatInfo existBeat = null;
    //fix #1733
    if ((existBeat = dom2Beat.remove(key)) != null) {
        existBeat.setStopped(true);
    }
    // Store it in dom2Beat
    dom2Beat.put(key, beatInfo);
    // Run the task after a delay of beatInfo.getPeriod() milliseconds
    executorService.schedule(new BeatTask(beatInfo), beatInfo.getPeriod(), TimeUnit.MILLISECONDS);
    MetricsMonitor.getDom2BeatSizeMonitor().set(dom2Beat.size());
}


class BeatTask implements Runnable {

    BeatInfo beatInfo;

    public BeatTask(BeatInfo beatInfo) {
        this.beatInfo = beatInfo;
    }

    @Override
    public void run() {
        if (beatInfo.isStopped()) {
            return;
        }
        // Delay before the next run
        long nextTime = beatInfo.getPeriod();
        try {
            // Call the Nacos server endpoint PUT /instance/beat over HTTP
            JsonNode result = serverProxy.sendBeat(beatInfo, BeatReactor.this.lightBeatEnabled);
            long interval = result.get("clientBeatInterval").asLong();
            boolean lightBeatEnabled = false;
            if (result.has(CommonParams.LIGHT_BEAT_ENABLED)) {
                lightBeatEnabled = result.get(CommonParams.LIGHT_BEAT_ENABLED).asBoolean();
            }
            BeatReactor.this.lightBeatEnabled = lightBeatEnabled;
            // Take the delay before the next run from the response
            if (interval > 0) {
                nextTime = interval;
            }
            int code = NamingResponseCode.OK;
            // The code field of the response
            if (result.has(CommonParams.CODE)) {
                code = result.get(CommonParams.CODE).asInt();
            }
            // code == 20404 means the Nacos server has no such instance, so register it again
            if (code == NamingResponseCode.RESOURCE_NOT_FOUND) {
                Instance instance = new Instance();
                instance.setPort(beatInfo.getPort());
                instance.setIp(beatInfo.getIp());
                instance.setWeight(beatInfo.getWeight());
                instance.setMetadata(beatInfo.getMetadata());
                instance.setClusterName(beatInfo.getCluster());
                instance.setServiceName(beatInfo.getServiceName());
                instance.setInstanceId(instance.getInstanceId());
                instance.setEphemeral(true);
                try {
                    serverProxy.registerService(beatInfo.getServiceName(),
                                                NamingUtils.getGroupName(beatInfo.getServiceName()), instance);
                } catch (Exception ignore) {
                }
            }
        } catch (NacosException ex) {
            NAMING_LOGGER.error("[CLIENT-BEAT] failed to send beat: {}, code: {}, msg: {}",
                                JacksonUtils.toJson(beatInfo), ex.getErrCode(), ex.getErrMsg());

        } catch (Exception unknownEx) {
            NAMING_LOGGER.error("[CLIENT-BEAT] failed to send beat: {}, unknown exception msg: {}",
                                JacksonUtils.toJson(beatInfo), unknownEx.getMessage(), unknownEx);
        } finally {
            // Schedule the next heartbeat after the chosen delay
            executorService.schedule(new BeatTask(beatInfo), nextTime, TimeUnit.MILLISECONDS);
        }
    }
}

NamingProxy.sendBeat()

/**
 * Send a heartbeat.
 */
public JsonNode sendBeat(BeatInfo beatInfo, boolean lightBeatEnabled) throws NacosException {

    if (NAMING_LOGGER.isDebugEnabled()) {
        NAMING_LOGGER.debug("[BEAT] {} sending beat to server: {}", namespaceId, beatInfo.toString());
    }
    Map<String, String> params = new HashMap<String, String>(8);
    Map<String, String> bodyMap = new HashMap<String, String>(2);
    if (!lightBeatEnabled) {
        bodyMap.put("beat", JacksonUtils.toJson(beatInfo));
    }
    params.put(CommonParams.NAMESPACE_ID, namespaceId);
    params.put(CommonParams.SERVICE_NAME, beatInfo.getServiceName());
    params.put(CommonParams.CLUSTER_NAME, beatInfo.getCluster());
    params.put("ip", beatInfo.getIp());
    params.put("port", String.valueOf(beatInfo.getPort()));
    String result = reqApi(UtilAndComs.nacosUrlBase + "/instance/beat", params, bodyMap, HttpMethod.PUT);
    return JacksonUtils.toObj(result);
}

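In summary, a scheduled thread pool, executorService, is created first; a BeatTask (a Runnable) is then created and submitted to executorService for delayed execution. The run() logic of BeatTask mainly calls the Nacos server endpoint PUT /instance/beat and reads the interval until the next run from the response; in the finally block it submits a new BeatTask to the thread pool for delayed execution, so the heartbeat keeps going even when a call fails.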
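
The self-rescheduling pattern can be reduced to a few lines. The sketch below is an illustration, not the BeatReactor code: `SelfReschedulingBeat` is hypothetical, the heartbeat call is modeled as a `LongSupplier` that returns the server-suggested interval in milliseconds (a value of 0 or less means "no suggestion"), and the task reschedules itself rather than creating a new task object each time.

```java
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

/** Hypothetical sketch of a heartbeat task that reschedules itself after every run. */
final class SelfReschedulingBeat implements Runnable {

    private final ScheduledExecutorService executor;
    // Assumption: sends one heartbeat and returns the server-suggested interval in ms.
    private final LongSupplier sendBeat;
    private final long defaultPeriodMillis;
    private volatile boolean stopped;

    SelfReschedulingBeat(ScheduledExecutorService executor, LongSupplier sendBeat,
                         long defaultPeriodMillis) {
        this.executor = executor;
        this.sendBeat = sendBeat;
        this.defaultPeriodMillis = defaultPeriodMillis;
    }

    /** Schedules the first heartbeat after the default period. */
    void start() {
        executor.schedule(this, defaultPeriodMillis, TimeUnit.MILLISECONDS);
    }

    /** Stops the chain: the next run returns early and does not reschedule. */
    void stop() {
        stopped = true;
    }

    @Override
    public void run() {
        if (stopped) {
            return;
        }
        long next = defaultPeriodMillis;
        try {
            long suggested = sendBeat.getAsLong();
            if (suggested > 0) {
                next = suggested;
            }
        } catch (RuntimeException e) {
            // A failed heartbeat must not break the chain; keep the default period.
        } finally {
            if (!stopped) {
                try {
                    executor.schedule(this, next, TimeUnit.MILLISECONDS);
                } catch (RejectedExecutionException ignored) {
                    // The executor was shut down in the meantime; nothing left to do.
                }
            }
        }
    }
}
```

Compared with scheduleAtFixedRate, rescheduling from the finally block lets every run choose its own delay, which is what allows the server to adjust the heartbeat interval through the response.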

  • The PUT /instance/beat endpoint

    InstanceController

    @CanDistro
    @PutMapping("/beat")
    @Secured(parser = NamingResourceParser.class, action = ActionTypes.WRITE)
    public ObjectNode beat(HttpServletRequest request) throws Exception {
    
        ObjectNode result = JacksonUtils.createEmptyJsonNode();
        // Interval until the next heartbeat: 5 seconds
        result.put(SwitchEntry.CLIENT_BEAT_INTERVAL, switchDomain.getClientBeatInterval());
        String beat = WebUtils.optional(request, "beat", StringUtils.EMPTY);
        RsInfo clientBeat = null;
        if (StringUtils.isNotBlank(beat)) {
            clientBeat = JacksonUtils.toObj(beat, RsInfo.class);
        }
        String clusterName = WebUtils
            .optional(request, CommonParams.CLUSTER_NAME, UtilsAndCommons.DEFAULT_CLUSTER_NAME);
        String ip = WebUtils.optional(request, "ip", StringUtils.EMPTY);
        int port = Integer.parseInt(WebUtils.optional(request, "port", "0"));
        if (clientBeat != null) {
            if (StringUtils.isNotBlank(clientBeat.getCluster())) {
                clusterName = clientBeat.getCluster();
            } else {
                // fix #2533
                clientBeat.setCluster(clusterName);
            }
            ip = clientBeat.getIp();
            port = clientBeat.getPort();
        }
        String namespaceId = WebUtils.optional(request, CommonParams.NAMESPACE_ID, Constants.DEFAULT_NAMESPACE_ID);
        String serviceName = WebUtils.required(request, CommonParams.SERVICE_NAME);
        NamingUtils.checkServiceNameFormat(serviceName);
        Loggers.SRV_LOG.debug("[CLIENT-BEAT] full arguments: beat: {}, serviceName: {}", clientBeat, serviceName);
        // Look up the instance
        Instance instance = serviceManager.getInstance(namespaceId, serviceName, clusterName, ip, port);
    
        // The instance was not found
        if (instance == null) {
            // If the request carries no beat body, return code=20404
            if (clientBeat == null) {
                result.put(CommonParams.CODE, NamingResponseCode.RESOURCE_NOT_FOUND);
                return result;
            }
    
            Loggers.SRV_LOG.warn("[CLIENT-BEAT] The instance has been removed for health mechanism, "
                                 + "perform data compensation operations, beat: {}, serviceName: {}", clientBeat, serviceName);
    
            // Register the instance
            instance = new Instance();
            instance.setPort(clientBeat.getPort());
            instance.setIp(clientBeat.getIp());
            instance.setWeight(clientBeat.getWeight());
            instance.setMetadata(clientBeat.getMetadata());
            instance.setClusterName(clusterName);
            instance.setServiceName(serviceName);
            instance.setInstanceId(instance.getInstanceId());
            instance.setEphemeral(clientBeat.isEphemeral());
    
            serviceManager.registerInstance(namespaceId, serviceName, instance);
        }
    
        // Find the corresponding Service
        Service service = serviceManager.getService(namespaceId, serviceName);
    
        if (service == null) {
            throw new NacosException(NacosException.SERVER_ERROR,
                                     "service not found: " + serviceName + "@" + namespaceId);
        }
        if (clientBeat == null) {
            clientBeat = new RsInfo();
            clientBeat.setIp(ip);
            clientBeat.setPort(port);
            clientBeat.setCluster(clusterName);
        }
        // Process the instance heartbeat
        service.processClientBeat(clientBeat);
    
        result.put(CommonParams.CODE, NamingResponseCode.OK);
        // If the instance metadata contains preserved.heart.beat.interval, use the client-configured interval for the next heartbeat
        if (instance.containsMetadata(PreservedMetadataKeys.HEART_BEAT_INTERVAL)) {
            result.put(SwitchEntry.CLIENT_BEAT_INTERVAL, instance.getInstanceHeartBeatInterval());
        }
        // Return the lightBeatEnabled switch (true here)
        result.put(SwitchEntry.LIGHT_BEAT_ENABLED, switchDomain.isLightBeatEnabled());
        return result;
    }
    

    ServiceManager.getInstance() looks up the instance:

    /**
     * Find a single instance under a Service.
     */
    public Instance getInstance(String namespaceId, String serviceName, String cluster, String ip, int port) {
        Service service = getService(namespaceId, serviceName);
        if (service == null) {
            return null;
        }
    
        List<String> clusters = new ArrayList<>();
        clusters.add(cluster);
    
        List<Instance> ips = service.allIPs(clusters);
        if (ips == null || ips.isEmpty()) {
            return null;
        }
    
        for (Instance instance : ips) {
            // Return the instance whose IP and port both match
            if (instance.getIp().equals(ip) && instance.getPort() == port) {
                return instance;
            }
        }
    
        return null;
    }
    

    Service.processClientBeat() handles the instance heartbeat:

    /**
     * Process a client heartbeat.
     */
    public void processClientBeat(final RsInfo rsInfo) {
        ClientBeatProcessor clientBeatProcessor = new ClientBeatProcessor();
        clientBeatProcessor.setService(this);
        clientBeatProcessor.setRsInfo(rsInfo);
        HealthCheckReactor.scheduleNow(clientBeatProcessor);
    }
    

    The thread pool immediately runs run() in ClientBeatProcessor:

    @Override
    public void run() {
        Service service = this.service;
        if (Loggers.EVT_LOG.isDebugEnabled()) {
            Loggers.EVT_LOG.debug("[CLIENT-BEAT] processing beat: {}", rsInfo.toString());
        }
    
        String ip = rsInfo.getIp();
        String clusterName = rsInfo.getCluster();
        int port = rsInfo.getPort();
        Cluster cluster = service.getClusterMap().get(clusterName);
        List<Instance> instances = cluster.allIPs(true);
    
        for (Instance instance : instances) {
            if (instance.getIp().equals(ip) && instance.getPort() == port) {
                if (Loggers.EVT_LOG.isDebugEnabled()) {
                    Loggers.EVT_LOG.debug("[CLIENT-BEAT] refresh beat: {}", rsInfo.toString());
                }
                // Update the last heartbeat time
                instance.setLastBeat(System.currentTimeMillis());
                if (!instance.isMarked()) {
                    // If the instance was unhealthy, mark it healthy again
                    if (!instance.isHealthy()) {
                        instance.setHealthy(true);
                        Loggers.EVT_LOG
                            .info("service: {} {POS} {IP-ENABLED} valid: {}:{}@{}, region: {}, msg: client beat ok",
                                  cluster.getService().getName(), ip, port, cluster.getName(),
                                  UtilsAndCommons.LOCALHOST_SITE);
                        // Tell PushService that the service changed so the change is propagated
                        getPushService().serviceChanged(service);
                    }
                }
            }
        }
    }
    

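    This mainly updates the instance's last heartbeat time and health status. When an unhealthy instance becomes healthy again, the change is handed to PushService.serviceChanged() so that it can be propagated.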
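
Putting the endpoint and the processor together, the server-side handling of one heartbeat can be sketched as follows. `BeatHandlerSketch` is a hypothetical, heavily simplified model: instances are keyed by "ip:port", the response code is an enum rather than the numeric codes, and the isMarked() check and the push notification from the listing above are left out.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of server-side heartbeat handling. */
final class BeatHandlerSketch {

    enum Code { OK, RESOURCE_NOT_FOUND }

    /** Minimal per-instance state tracked by this sketch. */
    static final class InstanceState {
        volatile long lastBeatMillis;
        volatile boolean healthy;
    }

    private final Map<String, InstanceState> instances = new ConcurrentHashMap<>();

    /**
     * @param key      instance key, for example "ip:port"
     * @param fullBeat true if the request carries the full beat body, false for a light beat
     */
    Code onBeat(String key, boolean fullBeat, long nowMillis) {
        InstanceState state = instances.get(key);
        if (state == null) {
            if (!fullBeat) {
                // A light beat has no data to rebuild the instance from:
                // report "not found" so that the client registers again.
                return Code.RESOURCE_NOT_FOUND;
            }
            // A full beat carries enough data to register the instance on the spot.
            state = instances.computeIfAbsent(key, k -> new InstanceState());
        }
        // Refresh the last heartbeat time and restore the healthy flag.
        state.lastBeatMillis = nowMillis;
        state.healthy = true;
        return Code.OK;
    }

    /** Simulates the health check marking an instance as unhealthy. */
    void markUnhealthy(String key) {
        InstanceState state = instances.get(key);
        if (state != null) {
            state.healthy = false;
        }
    }

    boolean isHealthy(String key) {
        InstanceState state = instances.get(key);
        return state != null && state.healthy;
    }
}
```

The RESOURCE_NOT_FOUND branch corresponds to the code 20404 case that the client-side BeatTask answers by registering the instance again.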



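    Thanks for reading. That's all for this article; to be continued…

    Readers on the same wavelength are welcome.

    Author's WeChat official account: Tarzan写bug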
