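Nacos core features
Service registration: a Nacos Client registers its service with the Nacos Server by sending a REST request that carries its own metadata, such as IP address and port. When the Nacos Server receives the registration request, it stores this metadata in a two-level in-memory Map.
Service heartbeat: after registering, the Nacos Client keeps a scheduled heartbeat running to tell the Nacos Server that the service is still available, which prevents the instance from being evicted. By default a heartbeat is sent every 5s.
Service synchronization: the nodes of a Nacos Server cluster synchronize service instances with each other to keep service information consistent.
Service discovery: when a service consumer (a Nacos Client) calls a provider's service, it sends a REST request to the Nacos Server to fetch the list of registered services and caches it locally. The client also starts a scheduled task that periodically pulls the latest registry information from the server and updates the local cache.
Service health check: the Nacos Server runs a scheduled task that checks the health of registered instances. An instance whose heartbeat has not been received for more than 15s has its healthy property set to false (clients will no longer discover it). An instance whose heartbeat has not been received for more than 30s is removed outright (a removed instance that resumes sending heartbeats is registered again).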
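The 15s/30s rule above can be sketched as follows. This is a minimal illustration, not Nacos source: the class `HealthCheckSketch`, its methods, and the flat `ip:port` keyed map are all hypothetical names chosen for the example, and time is passed in explicitly so the behaviour is easy to follow.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the heartbeat timeouts described above; not Nacos code. */
final class HealthCheckSketch {
    static final long UNHEALTHY_AFTER_MS = 15_000L; // no beat for 15s: mark unhealthy
    static final long REMOVE_AFTER_MS = 30_000L;    // no beat for 30s: remove the instance

    private static final class State {
        long lastBeatMs;
        boolean healthy = true;
        State(long lastBeatMs) { this.lastBeatMs = lastBeatMs; }
    }

    // Keyed by "ip:port" for a single service (an assumption made to keep the sketch small).
    private final Map<String, State> instances = new HashMap<>();

    /** A heartbeat registers an unknown instance, or refreshes a known one. */
    synchronized void beat(String instanceKey, long nowMs) {
        State s = instances.computeIfAbsent(instanceKey, k -> new State(nowMs));
        s.lastBeatMs = nowMs;
        s.healthy = true;
    }

    /** What the server's periodic check would do on each run. */
    synchronized void sweep(long nowMs) {
        instances.entrySet().removeIf(e -> nowMs - e.getValue().lastBeatMs > REMOVE_AFTER_MS);
        for (State s : instances.values()) {
            if (nowMs - s.lastBeatMs > UNHEALTHY_AFTER_MS) {
                s.healthy = false;
            }
        }
    }

    synchronized boolean contains(String instanceKey) {
        return instances.containsKey(instanceKey);
    }

    synchronized boolean isHealthy(String instanceKey) {
        State s = instances.get(instanceKey);
        return s != null && s.healthy;
    }
}
```

A beat at time 0 followed by a sweep at 16s leaves the instance registered but unhealthy; a sweep at 31s removes it; a later beat registers it again.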
II. Source code analysis:
1. The client side: service renewal, service lookup, service registration
Entry point: the NacosDiscoveryAutoConfiguration class:

    @Configuration
    @EnableConfigurationProperties
    @ConditionalOnNacosDiscoveryEnabled
    @ConditionalOnProperty(
        value = {"spring.cloud.service-registry.auto-registration.enabled"},
        matchIfMissing = true
    )
    @AutoConfigureAfter({AutoServiceRegistrationConfiguration.class, AutoServiceRegistrationAutoConfiguration.class})
    public class NacosDiscoveryAutoConfiguration {
        public NacosDiscoveryAutoConfiguration() {
        }

        @Bean
        public NacosServiceRegistry nacosServiceRegistry(NacosDiscoveryProperties nacosDiscoveryProperties) {
            return new NacosServiceRegistry(nacosDiscoveryProperties);
        }

        @Bean
        @ConditionalOnBean({AutoServiceRegistrationProperties.class})
        public NacosRegistration nacosRegistration(NacosDiscoveryProperties nacosDiscoveryProperties, ApplicationContext context) {
            return new NacosRegistration(nacosDiscoveryProperties, context);
        }

        @Bean
        @ConditionalOnBean({AutoServiceRegistrationProperties.class})
        public NacosAutoServiceRegistration nacosAutoServiceRegistration(NacosServiceRegistry registry, AutoServiceRegistrationProperties autoServiceRegistrationProperties, NacosRegistration registration) {
            return new NacosAutoServiceRegistration(registry, autoServiceRegistrationProperties, registration);
        }
    }
The three beans here ultimately exist so that NacosServiceRegistry's register method gets called. Inside that method, the call to look at is namingService.registerInstance(serviceId, instance).
Note that registerInstance is the method the client uses to register a service (it calls the registration interface exposed by the server).

    public class NacosServiceRegistry implements ServiceRegistry<Registration> {

        public void register(Registration registration) {
            if (StringUtils.isEmpty(registration.getServiceId())) {
                log.warn("No service to register for nacos client...");
            } else {
                // 1. Read the instance information to register from the configuration
                //    and convert it into an Instance object, which is used below
                String serviceId = registration.getServiceId();
                Instance instance = this.getNacosInstanceFromRegistration(registration);

                try {
                    // 2. Step into registerInstance
                    this.namingService.registerInstance(serviceId, instance);
                    log.info("nacos registry, {} {}:{} register finished", new Object[]{serviceId, instance.getIp(), instance.getPort()});
                } catch (Exception var5) {
                    log.error("nacos registry, {} register failed...{},", new Object[]{serviceId, registration.toString(), var5});
                }
            }
        }

        // ... other members omitted in this excerpt
    }
Continue into the registerInstance method:

    public class NacosNamingService implements NamingService {

        public void registerInstance(String serviceName, String groupName, Instance instance) throws NacosException {
            // instance is built from the registration info read from the configuration
            if (instance.isEphemeral()) {
                BeatInfo beatInfo = new BeatInfo();
                beatInfo.setServiceName(NamingUtils.getGroupedName(serviceName, groupName));
                beatInfo.setIp(instance.getIp());
                beatInfo.setPort(instance.getPort());
                beatInfo.setCluster(instance.getClusterName());
                beatInfo.setWeight(instance.getWeight());
                beatInfo.setMetadata(instance.getMetadata());
                beatInfo.setScheduled(false);
                long instanceInterval = instance.getInstanceHeartBeatInterval();
                beatInfo.setPeriod(instanceInterval == 0L ? DEFAULT_HEART_BEAT_INTERVAL : instanceInterval);
                // 1. Heartbeat renewal (scheduled task)
                this.beatReactor.addBeatInfo(NamingUtils.getGroupedName(serviceName, groupName), beatInfo);
            }
            // 2. Service registration
            this.serverProxy.registerService(NamingUtils.getGroupedName(serviceName, groupName), groupName, instance);
        }

        // 3. Get all instances of a service
        public List<Instance> getAllInstances(String serviceName, String groupName) throws NacosException {
            return this.getAllInstances(serviceName, groupName, new ArrayList());
        }
    }
1. First, how the client performs heartbeat renewal
Step into addBeatInfo: it records the beat info and schedules a BeatTask heartbeat task on a scheduled executor, with no initial delay. Next we look at what that task does.

    public void addBeatInfo(String serviceName, BeatInfo beatInfo) {
        LogUtils.NAMING_LOGGER.info("[BEAT] adding beat: {} to beat map.", beatInfo);
        this.dom2Beat.put(this.buildKey(serviceName, beatInfo.getIp(), beatInfo.getPort()), beatInfo);
        this.executorService.schedule(new BeatReactor.BeatTask(beatInfo), 0L, TimeUnit.MILLISECONDS);
        MetricsMonitor.getDom2BeatSizeMonitor().set((double)this.dom2Beat.size());
    }
Inside the task: as the code below shows, it calls the heartbeat renewal interface on the Nacos server and then schedules itself again, using the interval returned by the server if there is one and the configured period otherwise.

    class BeatTask implements Runnable {
        BeatInfo beatInfo;

        public BeatTask(BeatInfo beatInfo) {
            this.beatInfo = beatInfo;
        }

        public void run() {
            if (!this.beatInfo.isStopped()) {
                long result = BeatReactor.this.serverProxy.sendBeat(this.beatInfo);
                long nextTime = result > 0L ? result : this.beatInfo.getPeriod();
                BeatReactor.this.executorService.schedule(BeatReactor.this.new BeatTask(this.beatInfo), nextTime, TimeUnit.MILLISECONDS);
            }
        }
    }
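The self-rescheduling pattern used by BeatTask can be tried in isolation. The sketch below is an assumption-laden stand-in, not the Nacos client: `HeartbeatSketch`, `runDemo`, and the `LongSupplier` that plays the role of `serverProxy.sendBeat` are hypothetical names introduced only for this example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.LongSupplier;

/** Hypothetical sketch of a heartbeat task that re-schedules itself after every run. */
final class HeartbeatSketch {
    private final ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor(r -> {
        Thread t = new Thread(r, "beat-sketch");
        t.setDaemon(true);
        return t;
    });
    private final AtomicBoolean stopped = new AtomicBoolean(false);
    private final LongSupplier sendBeat;   // stand-in for the call to the server
    private final long defaultPeriodMs;

    HeartbeatSketch(LongSupplier sendBeat, long defaultPeriodMs) {
        this.sendBeat = sendBeat;
        this.defaultPeriodMs = defaultPeriodMs;
    }

    void start() {
        executor.schedule(this::beatOnce, 0L, TimeUnit.MILLISECONDS);
    }

    private void beatOnce() {
        if (stopped.get()) {
            return;
        }
        long next = defaultPeriodMs;
        try {
            long suggested = sendBeat.getAsLong();
            if (suggested > 0L) {
                next = suggested; // the server may suggest a different interval
            }
        } catch (RuntimeException e) {
            // A failed beat should not end the loop; retry after the default period.
        }
        try {
            executor.schedule(this::beatOnce, next, TimeUnit.MILLISECONDS);
        } catch (RejectedExecutionException e) {
            // The executor was shut down between the check and the schedule call.
        }
    }

    void stop() {
        stopped.set(true);
        executor.shutdownNow();
    }

    /** Runs until the requested number of beats was sent or 2s elapsed; returns the number sent. */
    static int runDemo(int beats, long periodMs) {
        AtomicInteger sent = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(beats);
        HeartbeatSketch hb = new HeartbeatSketch(() -> {
            sent.incrementAndGet();
            done.countDown();
            return 0L; // 0 means "no suggestion", so the default period applies
        }, periodMs);
        hb.start();
        try {
            done.await(2, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            hb.stop();
        }
        return sent.get();
    }
}
```

Scheduling one run at a time, rather than using a fixed-rate schedule, is what lets each response change the delay before the next beat.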
2. Back to where we were: how the client calls the service registration interface
As the code shows, the instance information is put into a map and then sent to the server's interface over HTTP (a POST request).

    public void registerService(String serviceName, String groupName, Instance instance) throws NacosException {
        LogUtils.NAMING_LOGGER.info("[REGISTER-SERVICE] {} registering service {} with instance: {}", new Object[]{this.namespaceId, serviceName, instance});
        Map<String, String> params = new HashMap(9);
        params.put("namespaceId", this.namespaceId);
        params.put("serviceName", serviceName);
        params.put("groupName", groupName);
        params.put("clusterName", instance.getClusterName());
        params.put("ip", instance.getIp());
        params.put("port", String.valueOf(instance.getPort()));
        params.put("weight", String.valueOf(instance.getWeight()));
        params.put("enable", String.valueOf(instance.isEnabled()));
        params.put("healthy", String.valueOf(instance.isHealthy()));
        params.put("ephemeral", String.valueOf(instance.isEphemeral()));
        params.put("metadata", JSON.toJSONString(instance.getMetadata()));
        this.reqAPI(UtilAndComs.NACOS_URL_INSTANCE, params, "POST");
    }
3. Next, the interface for getting all instances of a service: getAllInstances

    public List<Instance> getAllInstances(String serviceName, String groupName, List<String> clusters, boolean subscribe) throws NacosException {
        ServiceInfo serviceInfo;
        if (subscribe) {
            serviceInfo = this.hostReactor.getServiceInfo(NamingUtils.getGroupedName(serviceName, groupName), StringUtils.join(clusters, ","));
        } else {
            serviceInfo = this.hostReactor.getServiceInfoDirectlyFromServer(NamingUtils.getGroupedName(serviceName, groupName), StringUtils.join(clusters, ","));
        }

        List list;
        return (List)(serviceInfo != null && !CollectionUtils.isEmpty(list = serviceInfo.getHosts()) ? list : new ArrayList());
    }
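The main logic of this method comes down to three steps:
1) Look for the instances in the client's local cache (a map) first.
2) If nothing is cached, ask the server through its instance-list interface. In the code above, getServiceInfoDirectlyFromServer is the path that always goes straight to the server; it is taken when subscribe is false.
3) After the call, a delayed scheduled task is set up that keeps storing the fetched instance information in the client's local cache.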
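The three steps above can be sketched as a small cache-first lookup. Everything here is illustrative: `ServiceCacheSketch`, its method names, and the `Function` standing in for the HTTP call to the server are hypothetical, and the scheduled refresh is reduced to a `refresh` method that a timer would invoke.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical sketch of a cache-first instance lookup; not the Nacos HostReactor. */
final class ServiceCacheSketch {
    private final Map<String, List<String>> cache = new ConcurrentHashMap<>();
    private final Function<String, List<String>> fetchFromServer; // stand-in for the server call

    ServiceCacheSketch(Function<String, List<String>> fetchFromServer) {
        this.fetchFromServer = fetchFromServer;
    }

    /** subscribe=true: serve from the cache, filling it on a miss; subscribe=false: always ask the server. */
    List<String> getInstances(String serviceName, boolean subscribe) {
        if (!subscribe) {
            return fetchFromServer.apply(serviceName);
        }
        return cache.computeIfAbsent(serviceName, fetchFromServer);
    }

    /** What a periodic task would run to keep the cached entry current. */
    void refresh(String serviceName) {
        cache.put(serviceName, fetchFromServer.apply(serviceName));
    }
}
```

With subscribe set to true, only the first lookup for a service reaches the server; later lookups are answered from memory until `refresh` replaces the entry.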
2. How the server side works
The server source has to be downloaded separately and opened in IDEA.
We start with how the server handles instance-list pulls, service registration, and service renewal.
Open the package com.alibaba.nacos.naming.controllers and find the InstanceController class. The code below looks a lot like the controller-layer code we write every day.

    @RestController
    @RequestMapping(UtilsAndCommons.NACOS_NAMING_CONTEXT + "/instance")
    public class InstanceController {

        @Autowired
        private SwitchDomain switchDomain;

        @Autowired
        private PushService pushService;

        @Autowired
        private ServiceManager serviceManager;

        private DataSource pushDataSource = new DataSource() { /* body not shown in this excerpt */ };

        // 1. The instance registration interface exposed by the server
        @CanDistro
        @PostMapping
        public String register(HttpServletRequest request) throws Exception {
            String serviceName = WebUtils.required(request, CommonParams.SERVICE_NAME);
            String namespaceId = WebUtils.optional(request, CommonParams.NAMESPACE_ID, Constants.DEFAULT_NAMESPACE_ID);
            serviceManager.registerInstance(namespaceId, serviceName, parseInstance(request));
            return "ok";
        }

        // 2. The interface the server exposes for getting all instances
        @GetMapping("/list")
        public JSONObject list(HttpServletRequest request) throws Exception {
            String namespaceId = WebUtils.optional(request, CommonParams.NAMESPACE_ID, Constants.DEFAULT_NAMESPACE_ID);
            String serviceName = WebUtils.required(request, CommonParams.SERVICE_NAME);
            String agent = WebUtils.getUserAgent(request);
            String clusters = WebUtils.optional(request, "clusters", StringUtils.EMPTY);
            String clientIP = WebUtils.optional(request, "clientIP", StringUtils.EMPTY);
            Integer udpPort = Integer.parseInt(WebUtils.optional(request, "udpPort", "0"));
            String env = WebUtils.optional(request, "env", StringUtils.EMPTY);
            boolean isCheck = Boolean.parseBoolean(WebUtils.optional(request, "isCheck", "false"));
            String app = WebUtils.optional(request, "app", StringUtils.EMPTY);
            String tenant = WebUtils.optional(request, "tid", StringUtils.EMPTY);
            boolean healthyOnly = Boolean.parseBoolean(WebUtils.optional(request, "healthyOnly", "false"));
            return doSrvIPXT(namespaceId, serviceName, agent, clusters, clientIP, udpPort, env, isCheck, app, tenant, healthyOnly);
        }

        // 3. The service renewal (heartbeat) interface
        @CanDistro
        @PutMapping("/beat")
        public JSONObject beat(HttpServletRequest request) throws Exception {
            JSONObject result = new JSONObject();
            result.put("clientBeatInterval", switchDomain.getClientBeatInterval());
            String serviceName = WebUtils.required(request, CommonParams.SERVICE_NAME);
            String namespaceId = WebUtils.optional(request, CommonParams.NAMESPACE_ID, Constants.DEFAULT_NAMESPACE_ID);
            String beat = WebUtils.required(request, "beat");
            RsInfo clientBeat = JSON.parseObject(beat, RsInfo.class);

            if (!switchDomain.isDefaultInstanceEphemeral() && !clientBeat.isEphemeral()) {
                return result;
            }

            if (StringUtils.isBlank(clientBeat.getCluster())) {
                clientBeat.setCluster(UtilsAndCommons.DEFAULT_CLUSTER_NAME);
            }

            String clusterName = clientBeat.getCluster();

            if (Loggers.SRV_LOG.isDebugEnabled()) {
                Loggers.SRV_LOG.debug("[CLIENT-BEAT] full arguments: beat: {}, serviceName: {}", clientBeat, serviceName);
            }

            Instance instance = serviceManager.getInstance(namespaceId, serviceName, clientBeat.getCluster(), clientBeat.getIp(), clientBeat.getPort());

            if (instance == null) {
                instance = new Instance();
                instance.setPort(clientBeat.getPort());
                instance.setIp(clientBeat.getIp());
                instance.setWeight(clientBeat.getWeight());
                instance.setMetadata(clientBeat.getMetadata());
                instance.setClusterName(clusterName);
                instance.setServiceName(serviceName);
                instance.setInstanceId(instance.getInstanceId());
                instance.setEphemeral(clientBeat.isEphemeral());
                serviceManager.registerInstance(namespaceId, serviceName, instance);
            }

            Service service = serviceManager.getService(namespaceId, serviceName);

            if (service == null) {
                throw new NacosException(NacosException.SERVER_ERROR, "service not found: " + serviceName + "@" + namespaceId);
            }

            service.processClientBeat(clientBeat);
            result.put("clientBeatInterval", instance.getInstanceHeartBeatInterval());
            return result;
        }
    }
1. The registration interface
Step into registerInstance, the method the endpoint calls:

    public void registerInstance(String namespaceId, String serviceName, Instance instance) throws NacosException {
        // 1. Create an empty service if it does not exist yet
        createEmptyService(namespaceId, serviceName, instance.isEphemeral());

        Service service = getService(namespaceId, serviceName);

        if (service == null) {
            throw new NacosException(NacosException.INVALID_PARAM,
                "service not found, namespace: " + namespaceId + ", service: " + serviceName);
        }

        // 2. Add the new instance to the registry of the corresponding service
        addInstance(namespaceId, serviceName, instance.isEphemeral(), instance);
    }
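1.1. Two methods matter here. Look at createEmptyService first: the name says it creates an empty service, but it does more than that.
(1) If there is no registry entry yet, it creates one.
(2) It sets up a scheduled task: if an instance has not sent a heartbeat for more than 15 seconds, the instance's healthy property is set to false.
(3) If no heartbeat arrives for 30 seconds, the instance is removed (removal also goes through the server's own delete interface for instances). If a removed instance resumes its heartbeat, it is registered in the registry again.
1.2. Step into the other method, addInstance:

    public void addInstance(String namespaceId, String serviceName, boolean ephemeral, Instance... ips) throws NacosException {
        String key = KeyBuilder.buildInstanceListKey(namespaceId, serviceName, ephemeral);

        Service service = getService(namespaceId, serviceName);

        synchronized (service) {
            List<Instance> instanceList = addIpAddresses(service, ephemeral, ips);

            Instances instances = new Instances();
            instances.setInstanceList(instanceList);

            consistencyService.put(key, instances);
        }
    }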
Go straight into the put method. It belongs to an interface, so pick the implementation in DelegateConsistencyServiceImpl.

    @Service("consistencyDelegate")
    public class DelegateConsistencyServiceImpl implements ConsistencyService {

        @Autowired
        private PersistentConsistencyService persistentConsistencyService;

        @Autowired
        private EphemeralConsistencyService ephemeralConsistencyService;

        @Override
        public void put(String key, Record value) throws NacosException {
            mapConsistencyService(key).put(key, value);
        }

        private ConsistencyService mapConsistencyService(String key) {
            return KeyBuilder.matchEphemeralKey(key) ? ephemeralConsistencyService : persistentConsistencyService;
        }
    }
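As the code shows, put ends up calling mapConsistencyService, whose logic is:
(1) If the instance is ephemeral, use ephemeralConsistencyService (AP mode).
(2) If the instance is persistent, use persistentConsistencyService (CP mode). (A persistent instance here is not the same thing as Nacos's persistent configuration.)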
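The routing decision depends only on the key, so it can be illustrated without any Nacos classes. In the sketch below the key layout (`iplist.ephemeral.` versus `iplist.`) is an assumption made for the example and is not taken from KeyBuilder; `ConsistencyDelegateSketch`, `Store`, and `RecordingStore` are hypothetical names as well.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of routing writes by key; the key format is assumed, not Nacos's. */
final class ConsistencyDelegateSketch {
    static final String PERSISTENT_PREFIX = "iplist.";
    static final String EPHEMERAL_PREFIX = "iplist.ephemeral.";

    interface Store {
        void put(String key, String value);
    }

    static final class RecordingStore implements Store {
        final Map<String, String> data = new HashMap<>();

        @Override
        public void put(String key, String value) {
            data.put(key, value);
        }
    }

    final RecordingStore ap = new RecordingStore(); // stands in for the ephemeral (AP) service
    final RecordingStore cp = new RecordingStore(); // stands in for the persistent (CP) service

    /** The ephemeral flag is encoded into the key when the key is built. */
    static String buildKey(String namespaceId, String serviceName, boolean ephemeral) {
        return (ephemeral ? EPHEMERAL_PREFIX : PERSISTENT_PREFIX) + namespaceId + "##" + serviceName;
    }

    static boolean isEphemeralKey(String key) {
        return key.startsWith(EPHEMERAL_PREFIX);
    }

    /** Later writes need only the key to find the right store. */
    void put(String key, String value) {
        (isEphemeralKey(key) ? ap : cp).put(key, value);
    }
}
```

Encoding the flag in the key means callers such as addInstance do not have to pass the consistency mode around separately.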
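For the concepts of ephemeral and persistent instances in Nacos, see: https://blog.csdn.net/qq_38826019/article/details/109433231
Suppose we take the AP path (the default) and see how it is implemented.
Registering an instance in AP mode: (AP mode is implemented with Distro, a protocol Alibaba developed itself) (all nodes are peers) (AP mode is what is generally used)
ephemeralConsistencyService is an interface; find its implementation class DistroConsistencyServiceImpl and look at the logic of its put method.

    @Override
    public void put(String key, Record value) throws NacosException {
        // 1. Update the in-memory registry with the registered instance
        onPut(key, value);
        // 2. Synchronize the instance to the other nodes of the Nacos cluster
        taskDispatcher.addTask(key);
    }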
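Both methods are implemented with asynchronous programming plus a loop that drains queued tasks. Service synchronization happens in the second method; I won't go into how it is implemented here.
During synchronization, changes are sent as a batch sync task once enough of them have accumulated or once a fixed interval has elapsed.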
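The "enough changes or enough time" rule can be sketched as a small batcher. This is only an illustration under assumptions: `SyncBatcherSketch`, its thresholds, and its method names are hypothetical, and the threading and network send of the real dispatcher are left out so that the flush rule itself is visible.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of size-or-time batching for sync tasks; not the Nacos dispatcher. */
final class SyncBatcherSketch {
    private final int maxKeys;     // flush when this many keys are pending
    private final long maxWaitMs;  // or when this long has passed since the last flush
    private final List<String> pending = new ArrayList<>();
    private long lastFlushMs;

    SyncBatcherSketch(int maxKeys, long maxWaitMs, long nowMs) {
        this.maxKeys = maxKeys;
        this.maxWaitMs = maxWaitMs;
        this.lastFlushMs = nowMs;
    }

    /** Queues a changed key; returns the batch to send now, or an empty list if none is due. */
    synchronized List<String> add(String key, long nowMs) {
        pending.add(key);
        return flushIfDue(nowMs);
    }

    /** Called periodically by the worker loop even when no new key arrives. */
    synchronized List<String> tick(long nowMs) {
        return flushIfDue(nowMs);
    }

    private List<String> flushIfDue(long nowMs) {
        boolean full = pending.size() >= maxKeys;
        boolean timedOut = !pending.isEmpty() && nowMs - lastFlushMs >= maxWaitMs;
        if (!full && !timedOut) {
            return Collections.emptyList();
        }
        List<String> batch = new ArrayList<>(pending);
        pending.clear();
        lastFlushMs = nowMs;
        return batch;
    }
}
```

Batching trades a short delay for far fewer requests between cluster nodes when many instances change at once.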
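When the data is written to the registry, Nacos avoids read/write conflicts by applying the copy-on-write idea: it copies the in-memory registry data, applies the update to the copy, and then swaps the copy in for the old one. (Note: Eureka instead uses a multi-level cache to cope with read/write conflicts and frequent access.)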
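A minimal copy-on-write sketch follows, assuming a registry that maps a service name to a list of instance addresses. `CopyOnWriteRegistrySketch` and its methods are hypothetical, and the granularity at which Nacos itself copies may differ from the whole-map copy shown here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical copy-on-write registry: readers never lock, writers replace the whole map. */
final class CopyOnWriteRegistrySketch {
    // volatile so that readers always see the most recently published map
    private volatile Map<String, List<String>> registry = Collections.emptyMap();

    /** Lock-free read against whichever snapshot is current. */
    List<String> lookup(String service) {
        return registry.getOrDefault(service, Collections.emptyList());
    }

    /** Returns the current immutable snapshot. */
    Map<String, List<String>> snapshot() {
        return registry;
    }

    /** Copy, modify the copy, then publish it with a single reference swap. */
    synchronized void replaceInstances(String service, List<String> instances) {
        Map<String, List<String>> copy = new HashMap<>(registry);
        copy.put(service, Collections.unmodifiableList(new ArrayList<>(instances)));
        registry = Collections.unmodifiableMap(copy);
    }
}
```

A reader that obtained a snapshot before a write keeps seeing the old, fully consistent data, and the next read sees the new map; no reader ever observes a half-applied update.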
CP mode (uses a simple Raft protocol) (implementation class RaftConsistencyServiceImpl) (leader/follower)

    @Override
    public void put(String key, Record value) throws NacosException {
        try {
            raftCore.signalPublish(key, value);
        } catch (Exception e) {
            Loggers.RAFT.error("Raft put failed.", e);
            throw new NacosException(NacosException.SERVER_ERROR, "Raft put failed, key:" + key + ", value:" + value, e);
        }
    }

Continue into the signalPublish method:

    public void signalPublish(String key, Record value) throws Exception {

        if (!isLeader()) {
            JSONObject params = new JSONObject();
            params.put("key", key);
            params.put("value", value);
            Map<String, String> parameters = new HashMap<>(1);
            parameters.put("key", key);

            raftProxy.proxyPostLarge(getLeader().ip, API_PUB, params.toJSONString(), parameters);
            return;
        }

        try {
            OPERATE_LOCK.lock();
            long start = System.currentTimeMillis();
            final Datum datum = new Datum();
            datum.key = key;
            datum.value = value;
            if (getDatum(key) == null) {
                datum.timestamp.set(1L);
            } else {
                datum.timestamp.set(getDatum(key).timestamp.incrementAndGet());
            }

            JSONObject json = new JSONObject();
            json.put("datum", datum);
            json.put("source", peers.local());

            onPublish(datum, peers.local());

            final String content = JSON.toJSONString(json);

            final CountDownLatch latch = new CountDownLatch(peers.majorityCount());
            for (final String server : peers.allServersIncludeMyself()) {
                if (isLeader(server)) {
                    latch.countDown();
                    continue;
                }
                final String url = buildURL(server, API_ON_PUB);
                HttpClient.asyncHttpPostLarge(url, Arrays.asList("key=" + key), content, new AsyncCompletionHandler<Integer>() {
                    @Override
                    public Integer onCompleted(Response response) throws Exception {
                        if (response.getStatusCode() != HttpURLConnection.HTTP_OK) {
                            Loggers.RAFT.warn("[RAFT] failed to publish data to peer, datumId={}, peer={}, http code={}",
                                datum.key, server, response.getStatusCode());
                            return 1;
                        }
                        latch.countDown();
                        return 0;
                    }

                    @Override
                    public STATE onContentWriteCompleted() {
                        return STATE.CONTINUE;
                    }
                });
            }

            if (!latch.await(UtilsAndCommons.RAFT_PUBLISH_TIMEOUT, TimeUnit.MILLISECONDS)) {
                // only majority servers return success can we consider this update success
                Loggers.RAFT.error("data publish failed, caused failed to notify majority, key={}", key);
                throw new IllegalStateException("data publish failed, caused failed to notify majority, key=" + key);
            }

            long end = System.currentTimeMillis();
            Loggers.RAFT.info("signalPublish cost {} ms, key: {}", (end - start), key);
        } finally {
            OPERATE_LOCK.unlock();
        }
    }
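To summarize the logic of this method:
(1) Check whether the current node is the leader; if it is not, forward the request to the leader.
(2) If it is the leader, write the new instance information to memory and to disk (the disk file is written synchronously, the in-memory registry is updated asynchronously) (the disk write requires persistence to be configured; without data persistence nothing is written to disk).
(3) The CP guarantee for the write: a CountDownLatch implements a simplified Raft protocol in which success is returned to the client only after a majority of the cluster nodes have written the data. (The CountDownLatch is initialized to half the nodes plus one; once that many nodes have written successfully, success is returned to the client.) (This CP mode is a simplified Raft without a two-phase process; it relies directly on the CountDownLatch.)
Note that follower nodes cannot accept writes, so registrations have to be written on the leader.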
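The majority wait in step (3) can be reproduced with nothing but the JDK. The sketch below is a simplified stand-in, not RaftCore: `MajorityWriteSketch` and `publish` are hypothetical names, and each follower is modelled as a `BooleanSupplier` that reports whether its write succeeded.

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Hypothetical sketch of "succeed once a majority has acknowledged" using a CountDownLatch. */
final class MajorityWriteSketch {

    /** Returns true if the leader plus enough followers acknowledged within the timeout. */
    static boolean publish(List<BooleanSupplier> followers, long timeoutMs) {
        int clusterSize = followers.size() + 1;   // followers plus the leader itself
        int majority = clusterSize / 2 + 1;       // half the nodes plus one
        CountDownLatch latch = new CountDownLatch(majority);
        latch.countDown();                        // the leader's own local write counts

        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, followers.size()));
        try {
            for (BooleanSupplier follower : followers) {
                pool.execute(() -> {
                    if (follower.getAsBoolean()) {
                        latch.countDown();        // only successful writes are counted
                    }
                });
            }
            // Blocks until the count reaches zero or the timeout expires.
            return latch.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            pool.shutdownNow();
        }
    }
}
```

In a three-node cluster one successful follower is enough, because the leader's own write is the second acknowledgement; if both followers fail, the latch never reaches zero and the call times out with a failure.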
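2. The service renewal (heartbeat) interface
It first checks whether the instance exists. If it does not, the instance is registered; if it does, the instance's information in the registry is updated.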
3. The service pull interface
It reads the instance information directly from the in-memory registry.
III. CAP theorem
Here is the leader election flowchart for Nacos's CP mode: