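Eureka Client Analysis
The figure below shows the main flow of the Eureka client:
We walk through the source code following that figure.
1. Open the pom file: it declares the following Maven dependency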
<!-- Eureka client dependency -->
<dependency>
<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-starter-netflix-eureka-client</artifactId>
</dependency>
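2. Open the jar pulled in by that dependency and look at its spring.factories file. Following Spring Boot's auto-configuration mechanism, it lists the classes that are loaded automatically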
org.springframework.boot.autoconfigure.EnableAutoConfiguration=\
org.springframework.cloud.netflix.eureka.config.EurekaClientConfigServerAutoConfiguration,\
org.springframework.cloud.netflix.eureka.config.EurekaDiscoveryClientConfigServiceAutoConfiguration,\
org.springframework.cloud.netflix.eureka.EurekaClientAutoConfiguration,\
org.springframework.cloud.netflix.ribbon.eureka.RibbonEurekaAutoConfiguration,\
org.springframework.cloud.netflix.eureka.EurekaDiscoveryClientConfiguration
org.springframework.cloud.bootstrap.BootstrapConfiguration=\
org.springframework.cloud.netflix.eureka.config.EurekaDiscoveryClientConfigServiceBootstrapConfiguration
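3. The list of auto-configured classes above is long, so we have to guess which one matters. Judging by the names, we start with EurekaClientAutoConfiguration.
4. Look at the EurekaClientAutoConfiguration class
(4.1) Note the @AutoConfigureAfter annotation. It means that EurekaClientAutoConfiguration is applied only after the auto-configuration classes listed in the annotation have been processed.
(4.2) The @AutoConfigureAfter list includes org.springframework.cloud.netflix.eureka.EurekaDiscoveryClientConfiguration.
The purpose of that class is to create a Marker bean.
(4.3) Back in EurekaClientAutoConfiguration, the @ConditionalOnBean(EurekaDiscoveryClientConfiguration.Marker.class) annotation means the configuration is applied only when a Marker bean exists in the Spring context. This is how the two classes are tied together.
5. EurekaClientAutoConfiguration registers many beans, so we only look at the key ones. Here we look at the DiscoveryClient class.
(5.1) It has an EurekaClient injected, which is also registered in Spring by EurekaClientAutoConfiguration.
6. Look at the EurekaClient class
(6.1) Stepping through the calls, we eventually arrive at the constructor of com.netflix.discovery.DiscoveryClient.
7. Look at that constructor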
@Inject
DiscoveryClient(ApplicationInfoManager applicationInfoManager, EurekaClientConfig config, AbstractDiscoveryClientOptionalArgs args,
Provider<BackupRegistry> backupRegistryProvider) {
// ... code omitted
logger.info("Initializing Eureka in region {}", clientConfig.getRegion());
// ... code omitted
try {
// Create a scheduled thread pool (for delayed or periodic tasks) whose threads are daemon threads
scheduler = Executors.newScheduledThreadPool(2,
new ThreadFactoryBuilder()
.setNameFormat("DiscoveryClient-%d")
.setDaemon(true)
.build());
// Thread pool that runs the heartbeat, i.e. lease renewal
heartbeatExecutor = new ThreadPoolExecutor(
1, clientConfig.getHeartbeatExecutorThreadPoolSize(), 0, TimeUnit.SECONDS,
new SynchronousQueue<Runnable>(),
new ThreadFactoryBuilder()
.setNameFormat("DiscoveryClient-HeartbeatExecutor-%d")
.setDaemon(true)
.build()
); // use direct handoff
// Thread pool that periodically refreshes the local copy of the service registry
cacheRefreshExecutor = new ThreadPoolExecutor(
1, clientConfig.getCacheRefreshExecutorThreadPoolSize(), 0, TimeUnit.SECONDS,
new SynchronousQueue<Runnable>(),
new ThreadFactoryBuilder()
.setNameFormat("DiscoveryClient-CacheRefreshExecutor-%d")
.setDaemon(true)
.build()
); // use direct handoff
eurekaTransport = new EurekaTransport();
scheduleServerEndpointTask(eurekaTransport, args);
AzToRegionMapper azToRegionMapper;
if (clientConfig.shouldUseDnsForFetchingServiceUrls()) {
azToRegionMapper = new DNSBasedAzToRegionMapper(clientConfig);
} else {
azToRegionMapper = new PropertyBasedAzToRegionMapper(clientConfig);
}
if (null != remoteRegionsToFetch.get()) {
azToRegionMapper.setRegionsToFetch(remoteRegionsToFetch.get().split(","));
}
instanceRegionChecker = new InstanceRegionChecker(azToRegionMapper, clientConfig.getRegion());
} catch (Throwable e) {
throw new RuntimeException("Failed to initialize DiscoveryClient!", e);
}
if (clientConfig.shouldFetchRegistry() && !fetchRegistry(false)) {
fetchRegistryFromBackup();
}
// call and execute the pre registration handler before all background tasks (inc registration) is started
if (this.preRegistrationHandler != null) {
this.preRegistrationHandler.beforeRegistration();
}
if (clientConfig.shouldRegisterWithEureka() && clientConfig.shouldEnforceRegistrationAtInit()) {
try {
if (!register() ) {
throw new IllegalStateException("Registration error at startup. Invalid server response.");
}
} catch (Throwable th) {
logger.error("Registration error at startup: {}", th.getMessage());
throw new IllegalStateException(th);
}
}
// finally, init the schedule tasks (e.g. cluster resolvers, heartbeat, instanceInfo replicator, fetch
// The core call: starts the scheduled tasks (cluster resolvers, heartbeat, instance info replication, registry fetch)
initScheduledTasks();
try {
Monitors.registerObject(this);
} catch (Throwable e) {
logger.warn("Cannot register timers", e);
}
// This is a bit of hack to allow for existing code using DiscoveryManager.getInstance()
// to work with DI'd DiscoveryClient
DiscoveryManager.getInstance().setDiscoveryClient(this);
DiscoveryManager.getInstance().setEurekaClientConfig(config);
initTimestampMs = System.currentTimeMillis();
logger.info("Discovery Client initialized at timestamp {} with initial instances count: {}",
initTimestampMs, this.getApplications().size());
}
(7.1) The @Inject annotation means the constructor's parameters are supplied at runtime by the configured IoC container.
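The heartbeat and cache-refresh executors in the constructor are built on a SynchronousQueue, which the source comments call "direct handoff": a task is never queued, so it is either picked up by a thread right away or rejected. The sketch below illustrates that behavior with a pool of a single thread. The class and method names are made up for illustration and are not part of Eureka.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not Eureka code.
class DirectHandoffSketch {

    /** Returns true if a second task is rejected while the only worker is busy. */
    static boolean secondTaskRejected() {
        // Same shape as the Eureka executors, but limited to one thread.
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                1, 1, 0, TimeUnit.SECONDS, new SynchronousQueue<Runnable>());
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        try {
            // First task occupies the single worker until it is released.
            executor.submit(() -> {
                started.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            started.await();
            try {
                // No idle thread and no queue capacity: the default policy throws.
                executor.submit(() -> { });
                return false;
            } catch (RejectedExecutionException e) {
                return true;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            release.countDown();
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("second task rejected: " + secondTaskRejected());
    }
}
```

This is why TimedSupervisorTask, shown later, has a dedicated catch block for RejectedExecutionException.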
8. Scheduled tasks for the core features are started during initialization: initScheduledTasks()
/**
* Initializes all scheduled tasks.
*/
private void initScheduledTasks() {
// Whether registry fetching is enabled (the fetch-registry setting)
if (clientConfig.shouldFetchRegistry()) {
// Interval between registry refreshes
int registryFetchIntervalSeconds = clientConfig.getRegistryFetchIntervalSeconds();
int expBackOffBound = clientConfig.getCacheRefreshExecutorExponentialBackOffBound();
// Periodically refresh the service registry
scheduler.schedule(
new TimedSupervisorTask(
"cacheRefresh",
scheduler,
cacheRefreshExecutor,
registryFetchIntervalSeconds,
TimeUnit.SECONDS,
expBackOffBound,
// The task that contains the actual refresh logic
new CacheRefreshThread()
),
registryFetchIntervalSeconds, TimeUnit.SECONDS);
}
// If this service is allowed to register with Eureka
if (clientConfig.shouldRegisterWithEureka()) {
// Interval between lease renewals
int renewalIntervalInSecs = instanceInfo.getLeaseInfo().getRenewalIntervalInSecs();
int expBackOffBound = clientConfig.getHeartbeatExecutorExponentialBackOffBound();
// If no renewal interval is configured, the default of 30s shows up in this log line
logger.info("Starting heartbeat executor: " + "renew interval is: {}", renewalIntervalInSecs);
// Start the heartbeat, which renews the lease periodically
// Heartbeat timer
scheduler.schedule(
new TimedSupervisorTask(
"heartbeat",
scheduler,
heartbeatExecutor,
renewalIntervalInSecs,
TimeUnit.SECONDS,
expBackOffBound,
// The task that contains the actual lease-renewal logic
new HeartbeatThread()
),
renewalIntervalInSecs, TimeUnit.SECONDS);
// InstanceInfo replicator
instanceInfoReplicator = new InstanceInfoReplicator(
this,
instanceInfo,
clientConfig.getInstanceInfoReplicationIntervalSeconds(),
2); // burstSize
statusChangeListener = new ApplicationInfoManager.StatusChangeListener() {
@Override
public String getId() {
return "statusChangeListener";
}
@Override
public void notify(StatusChangeEvent statusChangeEvent) {
if (InstanceStatus.DOWN == statusChangeEvent.getStatus() ||
InstanceStatus.DOWN == statusChangeEvent.getPreviousStatus()) {
// log at warn level if DOWN was involved
logger.warn("Saw local status change event {}", statusChangeEvent);
} else {
logger.info("Saw local status change event {}", statusChangeEvent);
}
instanceInfoReplicator.onDemandUpdate();
}
};
if (clientConfig.shouldOnDemandUpdateStatusChange()) {
applicationInfoManager.registerStatusChangeListener(statusChangeListener);
}
// start() schedules the replicator, whose thread performs the service registration
instanceInfoReplicator.start(clientConfig.getInitialInstanceInfoReplicationIntervalSeconds());
} else {
logger.info("Not registering with Eureka server per configuration");
}
}
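9. The TimedSupervisorTask used in initScheduledTasks
- Both the lease-renewal thread and the periodic registry-refresh thread are started by wrapping the work in a new TimedSupervisorTask.
- Let us look at this class in detail. TimedSupervisorTask extends TimerTask and is therefore a Runnable; the part to read is its run method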
@Override
public void run() {
Future<?> future = null;
try {
// task is the object passed in when this TimedSupervisorTask was constructed
// For lease renewal it is new HeartbeatThread()
// For the registry refresh it is new CacheRefreshThread()
// Run it on the executor and keep the Future
future = executor.submit(task);
threadPoolLevelGauge.set((long) executor.getActiveCount());
// Wait for the worker thread, up to the timeout
future.get(timeoutMillis, TimeUnit.MILLISECONDS); // block until done or timeout
// Important: every run that finishes in time resets delay to the timeout value
delay.set(timeoutMillis);
threadPoolLevelGauge.set((long) executor.getActiveCount());
} catch (TimeoutException e) {
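// Reaching this block means the task did not finish within the allowed time
logger.warn("task supervisor timed out", e);
timeoutCounter.increment();
// Read the current delay
long currentDelay = delay.get();
// On a timeout, double the delay, but never beyond the maximum delay set by the caller
long newDelay = Math.min(maxDelay, currentDelay * 2);
// Store the new delay with CAS, since concurrent updates are possible
// CAS compares the value in memory with the expected value and writes the new value only if they match
// Here: if the value in memory still equals currentDelay, it is replaced by newDelay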
delay.compareAndSet(currentDelay, newDelay);
} catch (RejectedExecutionException e) {
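// Thrown when the executor cannot accept the task: with a SynchronousQueue there is no backlog, so this happens when all worker threads are busy or the pool is shut down; the rejection is logged and counted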
if (executor.isShutdown() || scheduler.isShutdown()) {
logger.warn("task supervisor shutting down, reject the task", e);
} else {
logger.warn("task supervisor rejected the task", e);
}
rejectedCounter.increment();
} catch (Throwable e) {
if (executor.isShutdown() || scheduler.isShutdown()) {
logger.warn("task supervisor shutting down, can't accept the task");
} else {
logger.warn("task supervisor threw an exception", e);
}
throwableCounter.increment();
} finally {
if (future != null) {
future.cancel(true);
}
if (!scheduler.isShutdown()) {
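// The key point: as long as the scheduler has not been shut down, the task is run again after a delay
// Neither lease renewal nor the registry refresh should run only once; both must keep running
// So the task reschedules itself here
// After a timeout, delay has already been doubled (capped at maxDelay); otherwise it equals the timeout
// Example: assume a 30s timeout and a 50s upper bound (maxDelay, which the constructor derives from expBackOffBound)
// If the last run did not time out, the next run starts 30s later
// If the last run timed out, the next run starts 50s later (30 * 2 exceeds the 50s upper bound, so 50s is used)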
scheduler.schedule(this, delay.get(), TimeUnit.MILLISECONDS);
}
}
}
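- scheduler.schedule(this, delay.get(), TimeUnit.MILLISECONDS); Judging by its documentation, schedule is a one-shot call, yet the tasks it runs here (lease renewal, registry refresh) repeat indefinitely. The key is in the run method of TimedSupervisorTask: once the task has finished, run calls schedule again so that the same task runs once more after the given delay. That delay depends on whether the most recent run timed out; if it did, the interval before the next run grows.
- The essence of the code
Overall, TimedSupervisorTask is a periodic task with a fixed interval. When a run times out, the catch block doubles the interval before the next run; under consecutive timeouts the interval keeps doubling until it reaches the upper bound set by the caller. As soon as a run completes in time, the catch block is skipped and delay is reset to its initial value. The delay is updated with CAS so that concurrent updates stay consistent.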
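The delay rule described above can be reduced to a small pure function. This is a minimal sketch, not the Eureka implementation: the class name, the method name, and the concrete numbers (30s timeout, 300s upper bound) are assumptions chosen for illustration.

```java
// Hypothetical helper that mirrors the delay rule of TimedSupervisorTask.run().
class BackoffDelaySketch {

    /**
     * Computes the delay before the next run.
     * A run that finished in time resets the delay to the timeout;
     * a run that timed out doubles the current delay, capped at maxDelay.
     */
    static long nextDelay(long currentDelay, long timeoutMillis, long maxDelay, boolean timedOut) {
        if (!timedOut) {
            return timeoutMillis;
        }
        return Math.min(maxDelay, currentDelay * 2);
    }

    public static void main(String[] args) {
        long timeout = 30_000L;   // assumed timeout
        long maxDelay = 300_000L; // assumed upper bound
        long delay = timeout;
        // Four timeouts in a row, followed by one successful run.
        boolean[] outcomes = {true, true, true, true, false};
        for (boolean timedOut : outcomes) {
            delay = nextDelay(delay, timeout, maxDelay, timedOut);
            System.out.println((timedOut ? "timeout" : "success") + " -> next delay " + delay + " ms");
        }
    }
}
```

With these numbers the delay grows 60s, 120s, 240s, is capped at 300s, and drops back to 30s after the successful run.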
10. CacheRefreshThread, the thread that periodically refreshes the service registry
- Code walkthrough
/**
* The task that fetches the registry information at specified intervals.
*
*/
class CacheRefreshThread implements Runnable {
public void run() {
// Entry point when the task runs on the cache-refresh thread pool
refreshRegistry();
}
}
// The method run by that thread
// Periodically refreshes the local service registry
@VisibleForTesting
void refreshRegistry() {
try {
boolean isFetchingRemoteRegionRegistries = isFetchingRemoteRegionRegistries();
boolean remoteRegionsModified = false;
// This makes sure that a dynamic change to remote regions to fetch is honored.
String latestRemoteRegions = clientConfig.fetchRegistryForRemoteRegions();
// This branch only handles remote regions; it is skipped when none are configured, so it can be ignored here
if (null != latestRemoteRegions) {
String currentRemoteRegions = remoteRegionsToFetch.get();
if (!latestRemoteRegions.equals(currentRemoteRegions)) {
// Both remoteRegionsToFetch and AzToRegionMapper.regionsToFetch need to be in sync
synchronized (instanceRegionChecker.getAzToRegionMapper()) {
if (remoteRegionsToFetch.compareAndSet(currentRemoteRegions, latestRemoteRegions)) {
String[] remoteRegions = latestRemoteRegions.split(",");
remoteRegionsRef.set(remoteRegions);
instanceRegionChecker.getAzToRegionMapper().setRegionsToFetch(remoteRegions);
remoteRegionsModified = true;
} else {
logger.info("Remote regions to fetch modified concurrently," +
" ignoring change from {} to {}", currentRemoteRegions, latestRemoteRegions);
}
}
} else {
// Just refresh mapping to reflect any DNS/Property change
instanceRegionChecker.getAzToRegionMapper().refreshMapping();
}
}
// This is the call that fetches the registry information
boolean success = fetchRegistry(remoteRegionsModified);
if (success) {
registrySize = localRegionApps.get().size();
lastSuccessfulRegistryFetchTimestamp = System.currentTimeMillis();
}
if (logger.isDebugEnabled()) {
StringBuilder allAppsHashCodes = new StringBuilder();
allAppsHashCodes.append("Local region apps hashcode: ");
allAppsHashCodes.append(localRegionApps.get().getAppsHashCode());
allAppsHashCodes.append(", is fetching remote regions? ");
allAppsHashCodes.append(isFetchingRemoteRegionRegistries);
for (Map.Entry<String, Applications> entry : remoteRegionVsApps.entrySet()) {
allAppsHashCodes.append(", Remote region: ");
allAppsHashCodes.append(entry.getKey());
allAppsHashCodes.append(" , apps hashcode: ");
allAppsHashCodes.append(entry.getValue().getAppsHashCode());
}
logger.debug("Completed cache refresh task for discovery. All Apps hash code is {} ",
allAppsHashCodes);
}
} catch (Throwable e) {
logger.error("Cannot fetch registry from server", e);
}
}
- In short, only one line in the registry-refresh code above really matters, the call that fetches the registry information:
boolean success = fetchRegistry(remoteRegionsModified);
- Let us analyze this method
/**
* Fetches the registry information.
*
* <p>
* This method tries to get only deltas after the first fetch unless there
* is an issue in reconciling eureka server and client registry information.
* </p>
*
* @param forceFullRegistryFetch Forces a full registry fetch.
*
* @return true if the registry was fetched
*/
private boolean fetchRegistry(boolean forceFullRegistryFetch) {
Stopwatch tracer = FETCH_REGISTRY_TIMER.start();
try {
// If the delta is disabled or if it is the first time, get all
// applications
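// Read the locally cached applications first; after the first fetch (when delta updates apply) the cache should already be populated
Applications applications = getApplications();
// Conditions for a full fetch:
// 1. delta updates are disabled
// 2. a single VIP address is configured for registry refresh
// 3. the caller explicitly forces a full fetch
// 4. there is no valid local cache yet (getApplications() returned null or an empty list, or its version is -1)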
if (clientConfig.shouldDisableDelta()
|| (!Strings.isNullOrEmpty(clientConfig.getRegistryRefreshSingleVipAddress()))
|| forceFullRegistryFetch
|| (applications == null)
|| (applications.getRegisteredApplications().size() == 0)
|| (applications.getVersion() == -1)) //Client application does not have latest library supporting delta
{
logger.info("Disable delta property : {}", clientConfig.shouldDisableDelta());
logger.info("Single vip registry refresh property : {}", clientConfig.getRegistryRefreshSingleVipAddress());
logger.info("Force full registry fetch : {}", forceFullRegistryFetch);
logger.info("Application is null : {}", (applications == null));
logger.info("Registered Applications size is zero : {}",
(applications.getRegisteredApplications().size() == 0));
logger.info("Application version is -1: {}", (applications.getVersion() == -1));
// Full fetch
getAndStoreFullRegistry();
} else {
// Delta fetch
getAndUpdateDelta(applications);
}
// Recompute and store the reconcile hash code
// It is used to check that the local service list matches the one held by the server
applications.setAppsHashCode(applications.getReconcileHashCode());
logTotalInstances();
} catch (Throwable e) {
logger.error(PREFIX + "{} - was unable to refresh its cache! status = {}", appPathIdentifier, e.getMessage(), e);
return false;
} finally {
if (tracer != null) {
tracer.stop();
}
}
// Notify about cache refresh before updating the instance remote status
// Publish the cache-refreshed event to all registered listeners; note that CloudEurekaClient overrides this method
onCacheRefreshed();
// Update remote status based on refreshed data held in the cache
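// The cache just refreshed from the Eureka Server contains the status of the current application instance
// The field lastRemoteInstanceStatus holds the status seen at the previous refresh
// This method compares the two; if they differ, it updates lastRemoteInstanceStatus and broadcasts the corresponding event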
updateInstanceRemoteStatus();
// registry was fetched successfully, so return true
return true;
}
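The choice between a full fetch and a delta fetch in fetchRegistry comes down to one boolean expression. The sketch below restates it as a standalone function under simplifying assumptions: the cached Applications object is represented only by its application count (null meaning "no cache at all") and its version, and all names are hypothetical.

```java
// Hypothetical restatement of the full-vs-delta decision in fetchRegistry().
class FetchModeSketch {

    /**
     * Returns true when a full fetch is needed.
     * cachedAppCount is null when no Applications object is cached at all.
     */
    static boolean needsFullFetch(boolean deltaDisabled, String singleVipAddress,
                                  boolean forceFull, Integer cachedAppCount, long version) {
        return deltaDisabled
                || (singleVipAddress != null && !singleVipAddress.isEmpty())
                || forceFull
                || cachedAppCount == null
                || cachedAppCount == 0
                || version == -1;
    }

    public static void main(String[] args) {
        // First fetch after startup: the cache is still empty, so a full fetch is needed.
        System.out.println(needsFullFetch(false, null, false, 0, 1));
        // Later refresh with a populated cache: a delta fetch is enough.
        System.out.println(needsFullFetch(false, null, false, 3, 1));
    }
}
```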
11. The full fetch method: getAndStoreFullRegistry
- The registry fetch method above relies on two important methods: the full fetch and the delta fetch
- We start with the full fetch
/**
* Gets the full registry information from the eureka server and stores it locally.
* When applying the full registry, the following flow is observed:
*
* if (update generation have not advanced (due to another thread))
* atomically set the registry to the new registry
* fi
*
* @return the full registry information.
* @throws Throwable
* on error.
*/
private void getAndStoreFullRegistry() throws Throwable {
long currentUpdateGeneration = fetchRegistryGeneration.get();
logger.info("Getting all instance registry info from the eureka server");
Applications apps = null;
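// The ternary below only matters when a single VIP address is configured; otherwise the call to read is
// eurekaTransport.queryClient.getApplications(remoteRegionsRef.get())
// It fetches the service list from the server
// Under the hood it sends an HTTP request to the Eureka Server through the Jersey client
// On the server side, Jersey plays a role similar to Spring MVC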
EurekaHttpResponse<Applications> httpResponse = clientConfig.getRegistryRefreshSingleVipAddress() == null
? eurekaTransport.queryClient.getApplications(remoteRegionsRef.get())
: eurekaTransport.queryClient.getVip(clientConfig.getRegistryRefreshSingleVipAddress(), remoteRegionsRef.get());
if (httpResponse.getStatusCode() == Status.OK.getStatusCode()) {
// The response body is the service list
apps = httpResponse.getEntity();
}
logger.info("The response status is {}", httpResponse.getStatusCode());
if (apps == null) {
logger.error("The application is null for some reason. Not storing this information");
} else if (fetchRegistryGeneration.compareAndSet(currentUpdateGeneration, currentUpdateGeneration + 1)) {
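// compareAndSet in this else-if is a CAS operation
// It guards against concurrent fetches: only the thread whose CAS succeeds replaces the local cache (localRegionApps) with the data it fetched from the Eureka Server
// localRegionApps is the local cache holding the service registry; it is an AtomicReference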
localRegionApps.set(this.filterAndShuffle(apps));
logger.debug("Got full registry with apps hashcode {}", apps.getAppsHashCode());
} else {
logger.warn("Not updating applications as another thread is updating it already");
}
}
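- The most important line above is eurekaTransport.queryClient.getApplications(remoteRegionsRef.get()), which is the interaction with the Eureka Server. It is comparable to calling a Spring MVC controller endpoint, except that the endpoint is implemented with Jersey.
- getApplications is implemented in the EurekaHttpClientDecorator class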
@Override
public EurekaHttpResponse<Applications> getApplications(final String... regions) {
return execute(new RequestExecutor<Applications>() {
@Override
public EurekaHttpResponse<Applications> execute(EurekaHttpClient delegate) {
// Request the service list
return delegate.getApplications(regions);
}
@Override
public RequestType getRequestType() {
// Request type: fetch the service list (GetApplications)
// RequestType is an enum listing the available request types
return RequestType.GetApplications;
}
});
}
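- delegate.getApplications(regions) is the call that requests the service registry from the server. It ends up in AbstractJerseyEurekaHttpClient, which contains the actual HTTP requests made with the Jersey client; on the server side, Jersey can be thought of as the counterpart of Spring MVC
// Full fetch endpoint: requests the complete service registry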
@Override
public EurekaHttpResponse<Applications> getApplications(String... regions) {
return getApplicationsInternal("apps/", regions);
}
// Delta endpoint: requests only the changes to the service registry
@Override
public EurekaHttpResponse<Applications> getDelta(String... regions) {
return getApplicationsInternal("apps/delta", regions);
}
- Now look at the getApplicationsInternal method
// All request and response handling happens in this method
private EurekaHttpResponse<Applications> getApplicationsInternal(String urlPath, String[] regions) {
ClientResponse response = null;
String regionsParamValue = null;
try {
// Build the REST request to the server
WebResource webResource = jerseyClient.resource(serviceUrl).path(urlPath);
if (regions != null && regions.length > 0) {
regionsParamValue = StringUtil.join(regions);
webResource = webResource.queryParam("regions", regionsParamValue);
}
Builder requestBuilder = webResource.getRequestBuilder();
addExtraHeaders(requestBuilder);
// Send the request and wrap the response in a ClientResponse
response = requestBuilder.accept(MediaType.APPLICATION_JSON_TYPE).get(ClientResponse.class);
Applications applications = null;
if (response.getStatus() == Status.OK.getStatusCode() && response.hasEntity()) {
// Read the response body, i.e. the service registry
applications = response.getEntity(Applications.class);
}
return anEurekaHttpResponse(response.getStatus(), Applications.class)
.headers(headersOf(response))
.entity(applications)
.build();
} finally {
if (logger.isDebugEnabled()) {
logger.debug("Jersey HTTP GET {}/{}?{}; statusCode={}",
serviceUrl, urlPath,
regionsParamValue == null ? "" : "regions=" + regionsParamValue,
response == null ? "N/A" : response.getStatus()
);
}
if (response != null) {
response.close();
}
}
}
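- The full registry is fetched by using the jersey-client API to send a REST request to the Eureka Server at http://localhost:8761/eureka/apps. This URL is the defaultZone address configured in the yml file with the path apps appended. The returned service list is then stored in a field that acts as the local cache.
- Requested from a browser, http://localhost:8761/eureka/apps returns an XML document with the full details (the client code above asks for JSON through its Accept header). After starting the server and registering a few clients, you can open the address in a browser to inspect it.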
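The generation check in getAndStoreFullRegistry (read fetchRegistryGeneration first, fetch, then compareAndSet before storing) is a general pattern for discarding stale results. Below is a minimal sketch of that pattern in isolation. The class and its methods are hypothetical, and a String stands in for the Applications object.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical cache that accepts a fetched value only if no other update happened in between.
class GenerationGuardSketch {
    private final AtomicLong generation = new AtomicLong(0);
    private final AtomicReference<String> cache = new AtomicReference<>("empty");

    /** Called before the remote fetch starts; remembers the generation that was current. */
    long beginFetch() {
        return generation.get();
    }

    /** Stores the fetched value only if the generation is unchanged since beginFetch(). */
    boolean store(long observedGeneration, String fetched) {
        if (generation.compareAndSet(observedGeneration, observedGeneration + 1)) {
            cache.set(fetched);
            return true;
        }
        return false; // another fetch finished first; this result is stale
    }

    String cached() {
        return cache.get();
    }

    public static void main(String[] args) {
        GenerationGuardSketch guard = new GenerationGuardSketch();
        long a = guard.beginFetch(); // two fetches start from the same generation
        long b = guard.beginFetch();
        System.out.println(guard.store(a, "registry-from-A")); // accepted
        System.out.println(guard.store(b, "registry-from-B")); // discarded as stale
        System.out.println(guard.cached());
    }
}
```

The same guard appears again in getAndUpdateDelta, where a failed CAS means the delta response is ignored.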
12. The delta fetch: getAndUpdateDelta
- Next we look at the delta fetch method
/**
* Get the delta registry information from the eureka server and update it locally.
* When applying the delta, the following flow is observed:
*
* if (update generation have not advanced (due to another thread))
* atomically try to: update application with the delta and get reconcileHashCode
* abort entire processing otherwise
* do reconciliation if reconcileHashCode clash
* fi
*
* @return the client response
* @throws Throwable on error
*/
private void getAndUpdateDelta(Applications applications) throws Throwable {
long currentUpdateGeneration = fetchRegistryGeneration.get();
Applications delta = null;
// Request the delta from the server
EurekaHttpResponse<Applications> httpResponse = eurekaTransport.queryClient.getDelta(remoteRegionsRef.get());
if (httpResponse.getStatusCode() == Status.OK.getStatusCode()) {
// Body of the delta response
delta = httpResponse.getEntity();
}
if (delta == null) {
logger.warn("The server does not allow the delta revision to be applied because it is not safe. "
+ "Hence got the full registry.");
// If no delta data came back, fall back to a full fetch
getAndStoreFullRegistry();
}
// CAS again
// If fetchRegistryGeneration changed in the meantime, another thread has done a similar update, so this response is discarded
else if (fetchRegistryGeneration.compareAndSet(currentUpdateGeneration, currentUpdateGeneration + 1)) {
logger.debug("Got delta update with apps hashcode {}", delta.getAppsHashCode());
String reconcileHashCode = "";
if (fetchRegistryUpdateLock.tryLock()) {
try {
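// Merge the delta from the Eureka Server into the local data; this method is explained below
updateDelta(delta);
// Compute the reconcile hash code from the local data after merging the delta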
reconcileHashCode = getReconcileHashCode(applications);
} finally {
fetchRegistryUpdateLock.unlock();
}
} else {
logger.warn("Cannot acquire update lock, aborting getAndUpdateDelta");
}
// There is a diff in number of instances for some reason
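// Along with the delta, the Eureka Server also returns its own reconcile hash code
// In theory, after any number of delta updates the local cache should still produce the same hash code as the server
// A mismatch means the local service list has drifted from the Eureka Server, and a full fetch is needed
if (!reconcileHashCode.equals(delta.getAppsHashCode()) || clientConfig.shouldLogDeltaDiff()) {
// If the hash code of the merged local data differs from the server's (or delta-diff logging is enabled), reconcile with a full fetch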
reconcileAndLogDifference(delta, reconcileHashCode); // this makes a remoteCall
}
} else {
logger.warn("Not updating application delta as another thread is updating it already");
logger.debug("Ignoring delta update with apps hashcode {}, as another thread is updating it already", delta.getAppsHashCode());
}
}
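To make the hash comparison in getAndUpdateDelta concrete, here is a simplified sketch of a reconcile hash: it counts instances per status and joins the counts in sorted order. The exact string format used by Eureka is an assumption here, and the class and method names are hypothetical; the point is only that client and server can compare a cheap summary instead of the full registry.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, simplified reconcile hash: "<STATUS>_<count>_" per status, sorted by status name.
class ReconcileHashSketch {

    static String reconcileHashCode(List<String> instanceStatuses) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String status : instanceStatuses) {
            counts.merge(status, 1, Integer::sum);
        }
        StringBuilder hash = new StringBuilder();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            hash.append(entry.getKey()).append('_').append(entry.getValue()).append('_');
        }
        return hash.toString();
    }

    /** A mismatch between the local and the server summary calls for a full fetch. */
    static boolean needsFullFetch(List<String> localStatuses, String serverHash) {
        return !reconcileHashCode(localStatuses).equals(serverHash);
    }

    public static void main(String[] args) {
        List<String> local = Arrays.asList("UP", "DOWN", "UP");
        System.out.println(reconcileHashCode(local));
        System.out.println(needsFullFetch(local, "UP_3_"));
    }
}
```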
13. updateDelta merges the delta into the local data
- One step of the delta fetch above merges the delta received from the server into the local data. Here it is in detail
/**
* Updates the delta information fetches from the eureka server into the
* local cache.
*
* @param delta
* the delta information received from eureka server in the last
* poll cycle.
*/
// The parameter is the complete delta received from the server
private void updateDelta(Applications delta) {
int deltaCount = 0;
// Iterate over every application in the delta
for (Application app : delta.getRegisteredApplications()) {
for (InstanceInfo instance : app.getInstances()) {
// Get the client's current service list
Applications applications = getApplications();
String instanceRegion = instanceRegionChecker.getInstanceRegion(instance);
if (!instanceRegionChecker.isLocalRegion(instanceRegion)) {
Applications remoteApps = remoteRegionVsApps.get(instanceRegion);
if (null == remoteApps) {
remoteApps = new Applications();
remoteRegionVsApps.put(instanceRegion, remoteApps);
}
applications = remoteApps;
}
++deltaCount;
// Handle added instances
if (ActionType.ADDED.equals(instance.getActionType())) {
Application existingApp = applications.getRegisteredApplications(instance.getAppName());
if (existingApp == null) {
applications.addApplication(app);
}
logger.debug("Added instance {} to the existing apps in region {}", instance.getId(), instanceRegion);
applications.getRegisteredApplications(instance.getAppName()).addInstance(instance);
}
// Handle modified instances
else if (ActionType.MODIFIED.equals(instance.getActionType())) {
Application existingApp = applications.getRegisteredApplications(instance.getAppName());
if (existingApp == null) {
applications.addApplication(app);
}
logger.debug("Modified instance {} to the existing apps ", instance.getId());
applications.getRegisteredApplications(instance.getAppName()).addInstance(instance);
}
// Handle deleted instances
else if (ActionType.DELETED.equals(instance.getActionType())) {
Application existingApp = applications.getRegisteredApplications(instance.getAppName());
if (existingApp == null) {
applications.addApplication(app);
}
logger.debug("Deleted instance {} to the existing apps ", instance.getId());
applications.getRegisteredApplications(instance.getAppName()).removeInstance(instance);
}
}
}
logger.debug("The total number of instances fetched by the delta processor : {}", deltaCount);
getApplications().setVersion(delta.getVersion());
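// Shuffle the instance order (optionally keeping only UP instances) so that callers do not always get instances in the same order
getApplications().shuffleInstances(clientConfig.shouldFilterOnlyUpInstances());
// The caches for remote regions get the same treatment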
for (Applications applications : remoteRegionVsApps.values()) {
applications.setVersion(delta.getVersion());
applications.shuffleInstances(clientConfig.shouldFilterOnlyUpInstances());
}
}
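The three branches of updateDelta boil down to "add or overwrite" for ADDED and MODIFIED and "remove" for DELETED. The sketch below shows that merge on a plain nested map. The data layout (app name to instance id to status) and all names are assumptions made for illustration; the real code works on Applications, Application and InstanceInfo objects.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical map-based cache: app name -> (instance id -> status).
class DeltaMergeSketch {
    enum Action { ADDED, MODIFIED, DELETED }

    /** Applies a single delta entry to the local cache. */
    static void apply(Map<String, Map<String, String>> cache,
                      String appName, String instanceId, String status, Action action) {
        // As in updateDelta, an unknown application is created first.
        Map<String, String> instances = cache.computeIfAbsent(appName, k -> new HashMap<>());
        if (action == Action.DELETED) {
            instances.remove(instanceId);
        } else {
            // ADDED and MODIFIED both end up storing the latest instance data.
            instances.put(instanceId, status);
        }
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> cache = new HashMap<>();
        apply(cache, "ORDER-SERVICE", "host-1:8080", "UP", Action.ADDED);
        apply(cache, "ORDER-SERVICE", "host-2:8080", "UP", Action.ADDED);
        apply(cache, "ORDER-SERVICE", "host-1:8080", "DOWN", Action.MODIFIED);
        apply(cache, "ORDER-SERVICE", "host-2:8080", "UP", Action.DELETED);
        System.out.println(cache);
    }
}
```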
13. Lease renewal
- By now the registry refresh, with its full and delta fetches, should be clear
- Now for the periodic lease renewal
- We follow the HeartbeatThread class
/**
* The heartbeat task that renews the lease in the given intervals.
*/
private class HeartbeatThread implements Runnable {
// Implements Runnable, so run() is executed on a thread of the heartbeat pool
public void run() {
// renew() performs the actual lease renewal
if (renew()) {
lastSuccessfulHeartbeatTimestamp = System.currentTimeMillis();
}
}
}
/**
* Renew with the eureka service by making the appropriate REST call
*/
boolean renew() {
EurekaHttpResponse<InstanceInfo> httpResponse;
try {
// Send a heartbeat for this instance of the given app name
httpResponse = eurekaTransport.registrationClient.sendHeartBeat(instanceInfo.getAppName(), instanceInfo.getId(), instanceInfo, null);
logger.debug(PREFIX + "{} - Heartbeat status: {}", appPathIdentifier, httpResponse.getStatusCode());
// 404 means the server does not know this instance, so it has to register again
if (httpResponse.getStatusCode() == 404) {
REREGISTER_COUNTER.increment();
logger.info(PREFIX + "{} - Re-registering apps/{}", appPathIdentifier, instanceInfo.getAppName());
long timestamp = instanceInfo.setIsDirtyWithTime();
// Re-register from within the renewal path; registration itself is covered later
boolean success = register();
if (success) {
instanceInfo.unsetIsDirty(timestamp);
}
return success;
}
return httpResponse.getStatusCode() == 200;
} catch (Throwable e) {
logger.error(PREFIX + "{} - was unable to send heartbeat!", appPathIdentifier, e);
return false;
}
}
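- A heartbeat is sent first to renew the lease. If the server answers 404, the client registers again (registration is covered later). Here is how the heartbeat is sent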
@Override
public EurekaHttpResponse<InstanceInfo> sendHeartBeat(String appName, String id, InstanceInfo info, InstanceStatus overriddenStatus) {
String urlPath = "apps/" + appName + '/' + id;
ClientResponse response = null;
try {
WebResource webResource = jerseyClient.resource(serviceUrl)
.path(urlPath)
.queryParam("status", info.getStatus().toString())
.queryParam("lastDirtyTimestamp", info.getLastDirtyTimestamp().toString());
if (overriddenStatus != null) {
webResource = webResource.queryParam("overriddenstatus", overriddenStatus.name());
}
Builder requestBuilder = webResource.getRequestBuilder();
addExtraHeaders(requestBuilder);
response = requestBuilder.put(ClientResponse.class);
EurekaHttpResponseBuilder<InstanceInfo> eurekaResponseBuilder = anEurekaHttpResponse(response.getStatus(), InstanceInfo.class).headers(headersOf(response));
if (response.hasEntity()) {
eurekaResponseBuilder.entity(response.getEntity(InstanceInfo.class));
}
return eurekaResponseBuilder.build();
} finally {
if (logger.isDebugEnabled()) {
logger.debug("Jersey HTTP PUT {}/{}; statusCode={}", serviceUrl, urlPath, response == null ? "N/A" : response.getStatus());
}
if (response != null) {
response.close();
}
}
}
- The approach is the same as for the full and delta registry fetches: the Jersey client calls an endpoint on the server to carry out the operation.
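The control flow of renew() can be summarized independently of the HTTP details: a 404 triggers a re-registration, a 200 counts as success, and anything else as failure. The following sketch captures just that flow. The heartbeat and the registration are passed in as functions, and both the class and the parameter names are hypothetical.

```java
import java.util.function.BooleanSupplier;
import java.util.function.IntSupplier;

// Hypothetical reduction of DiscoveryClient.renew() to its status-code handling.
class RenewSketch {

    /**
     * @param heartbeat sends the heartbeat and returns the HTTP status code
     * @param register  performs a registration and reports whether it succeeded
     */
    static boolean renew(IntSupplier heartbeat, BooleanSupplier register) {
        int status = heartbeat.getAsInt();
        if (status == 404) {
            // The server has no lease for this instance: register again.
            return register.getAsBoolean();
        }
        return status == 200;
    }

    public static void main(String[] args) {
        System.out.println(renew(() -> 200, () -> false)); // lease renewed
        System.out.println(renew(() -> 404, () -> true));  // unknown instance, re-registered
        System.out.println(renew(() -> 500, () -> true));  // server error, renewal failed
    }
}
```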
14. Service registration
/**
* Initializes all scheduled tasks.
*/
private void initScheduledTasks() {
if (clientConfig.shouldFetchRegistry()) {
// registry cache refresh timer
int registryFetchIntervalSeconds = clientConfig.getRegistryFetchIntervalSeconds();
int expBackOffBound = clientConfig.getCacheRefreshExecutorExponentialBackOffBound();
scheduler.schedule(
new TimedSupervisorTask(
"cacheRefresh",
scheduler,
cacheRefreshExecutor,
registryFetchIntervalSeconds,
TimeUnit.SECONDS,
expBackOffBound,
new CacheRefreshThread()
),
registryFetchIntervalSeconds, TimeUnit.SECONDS);
}
if (clientConfig.shouldRegisterWithEureka()) {
int renewalIntervalInSecs = instanceInfo.getLeaseInfo().getRenewalIntervalInSecs();
int expBackOffBound = clientConfig.getHeartbeatExecutorExponentialBackOffBound();
logger.info("Starting heartbeat executor: " + "renew interval is: {}", renewalIntervalInSecs);
// Heartbeat timer
scheduler.schedule(
new TimedSupervisorTask(
"heartbeat",
scheduler,
heartbeatExecutor,
renewalIntervalInSecs,
TimeUnit.SECONDS,
expBackOffBound,
new HeartbeatThread()
),
renewalIntervalInSecs, TimeUnit.SECONDS);
// InstanceInfo replicator
instanceInfoReplicator = new InstanceInfoReplicator(
this,
instanceInfo,
clientConfig.getInstanceInfoReplicationIntervalSeconds(),
2); // burstSize
statusChangeListener = new ApplicationInfoManager.StatusChangeListener() {
@Override
public String getId() {
return "statusChangeListener";
}
@Override
public void notify(StatusChangeEvent statusChangeEvent) {
if (InstanceStatus.DOWN == statusChangeEvent.getStatus() ||
InstanceStatus.DOWN == statusChangeEvent.getPreviousStatus()) {
// log at warn level if DOWN was involved
logger.warn("Saw local status change event {}", statusChangeEvent);
} else {
logger.info("Saw local status change event {}", statusChangeEvent);
}
instanceInfoReplicator.onDemandUpdate();
}
};
if (clientConfig.shouldOnDemandUpdateStatusChange()) {
applicationInfoManager.registerStatusChangeListener(statusChangeListener);
}
instanceInfoReplicator.start(clientConfig.getInitialInstanceInfoReplicationIntervalSeconds());
} else {
logger.info("Not registering with Eureka server per configuration");
}
}
- Service registration is also started from initScheduledTasks, through the call instanceInfoReplicator.start(clientConfig.getInitialInstanceInfoReplicationIntervalSeconds());
public void start(int initialDelayMs) {
if (started.compareAndSet(false, true)) {
instanceInfo.setIsDirty(); // for initial register
// Schedule this replicator; its run() performs the registration
Future next = scheduler.schedule(this, initialDelayMs, TimeUnit.SECONDS);
scheduledPeriodicRef.set(next);
}
}
public void run() {
try {
discoveryClient.refreshInstanceInfo();
Long dirtyTimestamp = instanceInfo.isDirtyWithTime();
if (dirtyTimestamp != null) {
// Register the service
discoveryClient.register();
instanceInfo.unsetIsDirty(dirtyTimestamp);
}
} catch (Throwable t) {
logger.warn("There was a problem with the instance info replicator", t);
} finally {
Future next = scheduler.schedule(this, replicationIntervalSeconds, TimeUnit.SECONDS);
scheduledPeriodicRef.set(next);
}
}
/**
* Register with the eureka service by making the appropriate REST call.
*/
boolean register() throws Throwable {
logger.info(PREFIX + "{}: registering service...", appPathIdentifier);
EurekaHttpResponse<Void> httpResponse;
try {
// Again a call to a server endpoint: register by sending this instance's information
httpResponse = eurekaTransport.registrationClient.register(instanceInfo);
} catch (Exception e) {
logger.warn(PREFIX + "{} - registration failed {}", appPathIdentifier, e.getMessage(), e);
throw e;
}
if (logger.isInfoEnabled()) {
logger.info(PREFIX + "{} - registration status: {}", appPathIdentifier, httpResponse.getStatusCode());
}
return httpResponse.getStatusCode() == 204;
}
@Override
public EurekaHttpResponse<Void> register(InstanceInfo info) {
String urlPath = "apps/" + info.getAppName();
ClientResponse response = null;
try {
Builder resourceBuilder = jerseyClient.resource(serviceUrl).path(urlPath).getRequestBuilder();
addExtraHeaders(resourceBuilder);
response = resourceBuilder
.header("Accept-Encoding", "gzip")
.type(MediaType.APPLICATION_JSON_TYPE)
.accept(MediaType.APPLICATION_JSON)
.post(ClientResponse.class, info);
return anEurekaHttpResponse(response.getStatus()).headers(headersOf(response)).build();
} finally {
if (logger.isDebugEnabled()) {
logger.debug("Jersey HTTP POST {}/{} with instance {}; statusCode={}", serviceUrl, urlPath, info.getId(),
response == null ? "N/A" : response.getStatus());
}
if (response != null) {
response.close();
}
}
}
- Registration, lease renewal, the full registry fetch, and the delta registry fetch above are all requests to the server, sent through the Jersey client.
Eureka Server: Jersey Endpoint Source Analysis
1. ApplicationResource, the Jersey resource class on the server
- Its addInstance method is the endpoint that receives registration requests from clients.
/**
* Registers information about a particular instance for an
* {@link com.netflix.discovery.shared.Application}.
*
* @param info
* {@link InstanceInfo} information of the instance.
* @param isReplication
* a header parameter containing information whether this is
* replicated from other nodes.
*/
// Registration endpoint: clients register with Eureka Server here
@POST
@Consumes({"application/json", "application/xml"})
public Response addInstance(InstanceInfo info,
@HeaderParam(PeerEurekaNode.HEADER_REPLICATION) String isReplication) {
logger.debug("Registering instance {} (replication={})", info.getId(), isReplication);
// validate that the instanceinfo contains all the necessary required fields
// Validate the parameters; if any check fails, return HTTP 400 immediately
if (isBlank(info.getId())) {
return Response.status(400).entity("Missing instanceId").build();
} else if (isBlank(info.getHostName())) {
return Response.status(400).entity("Missing hostname").build();
} else if (isBlank(info.getIPAddr())) {
return Response.status(400).entity("Missing ip address").build();
} else if (isBlank(info.getAppName())) {
return Response.status(400).entity("Missing appName").build();
} else if (!appName.equals(info.getAppName())) {
return Response.status(400).entity("Mismatched appName, expecting " + appName + " but was " + info.getAppName()).build();
} else if (info.getDataCenterInfo() == null) {
return Response.status(400).entity("Missing dataCenterInfo").build();
} else if (info.getDataCenterInfo().getName() == null) {
return Response.status(400).entity("Missing dataCenterInfo Name").build();
}
// handle cases where clients may be registering with bad DataCenterInfo with missing data
DataCenterInfo dataCenterInfo = info.getDataCenterInfo();
if (dataCenterInfo instanceof UniqueIdentifier) {
String dataCenterInfoId = ((UniqueIdentifier) dataCenterInfo).getId();
if (isBlank(dataCenterInfoId)) {
boolean experimental = "true".equalsIgnoreCase(serverConfig.getExperimental("registration.validation.dataCenterInfoId"));
if (experimental) {
String entity = "DataCenterInfo of type " + dataCenterInfo.getClass() + " must contain a valid id";
return Response.status(400).entity(entity).build();
} else if (dataCenterInfo instanceof AmazonInfo) {
AmazonInfo amazonInfo = (AmazonInfo) dataCenterInfo;
String effectiveId = amazonInfo.get(AmazonInfo.MetaDataKey.instanceId);
if (effectiveId == null) {
amazonInfo.getMetadata().put(AmazonInfo.MetaDataKey.instanceId.getName(), info.getId());
}
} else {
logger.warn("Registering DataCenterInfo of type {} without an appropriate id", dataCenterInfo.getClass());
}
}
}
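// The block above only patches up a DataCenterInfo that arrives without an id; it is off the main path, so skip it for now
// This call is the key step: it performs the actual registration.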
registry.register(info, "true".equals(isReplication));
return Response.status(204).build(); // 204 to be backwards compatible
}
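- In the method above, the one line that matters is registry.register(info, "true".equals(isReplication));, which performs the registration. Its first argument is the instance information sent by the client.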
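The order of the checks in addInstance can be condensed into a small stand-alone sketch. The class and the validate method are hypothetical simplifications (the real method works on InstanceInfo and returns a JAX-RS Response); only the order of the checks, the messages, and the 400/204 status codes mirror the source above.

```java
// Hypothetical, simplified model of the checks in addInstance; not Eureka code.
class AddInstanceValidationSketch {

    static boolean isBlank(String s) {
        return s == null || s.trim().isEmpty();
    }

    // Returns what the endpoint would answer with: "400 <reason>" for the
    // first failed check, or "204" when registration would proceed.
    static String validate(String expectedAppName, String id, String hostName,
                           String ipAddr, String appName, String dataCenterName) {
        if (isBlank(id)) return "400 Missing instanceId";
        if (isBlank(hostName)) return "400 Missing hostname";
        if (isBlank(ipAddr)) return "400 Missing ip address";
        if (isBlank(appName)) return "400 Missing appName";
        if (!expectedAppName.equals(appName)) return "400 Mismatched appName";
        if (dataCenterName == null) return "400 Missing dataCenterInfo Name";
        return "204"; // registry.register(...) would be called at this point
    }

    public static void main(String[] args) {
        // All argument values are made up for the example.
        System.out.println(validate("ORDER-SERVICE", "host-1:8080", "host-1",
                "10.0.0.5", "ORDER-SERVICE", "MyOwn"));
        System.out.println(validate("ORDER-SERVICE", "host-1:8080", null,
                "10.0.0.5", "ORDER-SERVICE", "MyOwn"));
    }
}
```

Because the checks are chained with else-if in the original, only the first failing field is reported to the client.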
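2. The register method in AbstractInstanceRegistry
- This method is where the registration call above ends up. The two-argument call goes through PeerAwareInstanceRegistryImpl, which works out the lease duration and then delegates to the three-argument method below.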
/**
* Registers a new instance with a given duration.
*
* @see com.netflix.eureka.lease.LeaseManager#register(java.lang.Object, int, boolean)
*/
public void register(InstanceInfo registrant, int leaseDuration, boolean isReplication) {
try {
// Acquire the read lock
read.lock();
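// Look up this app's instances in the local map. The registry is a two-level map: the outer key is appName and the inner key is instanceId. This is the data structure Eureka uses to store the registry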
Map<String, Lease<InstanceInfo>> gMap = registry.get(registrant.getAppName());
// Increment the registration counter used for monitoring
REGISTER.increment(isReplication);
if (gMap == null) {
// No map exists for this appName, so this is the first instance of the app to register
// Create a ConcurrentHashMap and put it into the registry
final ConcurrentHashMap<String, Lease<InstanceInfo>> gNewMap = new ConcurrentHashMap<String, Lease<InstanceInfo>>();
// putIfAbsent first checks whether the key is already present in the ConcurrentHashMap.
// If it is absent, the new mapping is added and null is returned.
// If it is present, the existing value is left untouched and returned.
gMap = registry.putIfAbsent(registrant.getAppName(), gNewMap);
if (gMap == null) {
// No mapping existed for this key, so gMap is the map we just created
gMap = gNewMap;
}
}
// Look up the Lease by instanceId
Lease<InstanceInfo> existingLease = gMap.get(registrant.getId());
// Retain the last dirty timestamp without overwriting it, if there is already a lease
// An existing Lease with a holder was found
if (existingLease != null && (existingLease.getHolder() != null)) {
// The instance already exists: compare it with the client's copy; whichever has the newer timestamp is the valid instance info.
// Timestamp of the copy held by the server
Long existingLastDirtyTimestamp = existingLease.getHolder().getLastDirtyTimestamp();
// Timestamp of the copy sent by the client
Long registrationLastDirtyTimestamp = registrant.getLastDirtyTimestamp();
logger.debug("Existing lease found (existing={}, provided={}", existingLastDirtyTimestamp, registrationLastDirtyTimestamp);
// this is a > instead of a >= because if the timestamps are equal, we still take the remote transmitted
// InstanceInfo instead of the server local copy.
if (existingLastDirtyTimestamp > registrationLastDirtyTimestamp) {
logger.warn("There is an existing lease and the existing lease's dirty timestamp {} is greater" +
" than the one that is being registered {}", existingLastDirtyTimestamp, registrationLastDirtyTimestamp);
logger.warn("Using the existing instanceInfo instead of the new instanceInfo as the registrant");
registrant = existingLease.getHolder();
}
} else {
// The lease does not exist and hence it is a new registration
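// Reached only when there is no existing lease; a resumed heartbeat or an expired entry does not come through here.
// For self-preservation: raise the expected renews per minute by 2 and recompute the minimum renews-per-minute threshold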
synchronized (lock) {
if (this.expectedNumberOfRenewsPerMin > 0) {
// Since the client wants to register, increase the threshold
// (1 for 30 seconds, 2 for a minute)
this.expectedNumberOfRenewsPerMin = this.expectedNumberOfRenewsPerMin + 2;
this.numberOfRenewsPerMinThreshold =
(int) (this.expectedNumberOfRenewsPerMin * serverConfig.getRenewalPercentThreshold());
}
}
logger.debug("No previous lease information found; it is new registration");
}
// Create a new Lease object to hold the instance information
Lease<InstanceInfo> lease = new Lease<InstanceInfo>(registrant, leaseDuration);
if (existingLease != null) {
// A Lease already existed: carry over its serviceUpTimestamp so the service-up time stays that of the first registration.
lease.setServiceUpTimestamp(existingLease.getServiceUpTimestamp());
}
// Put it into the local map
gMap.put(registrant.getId(), lease);
// Add to the recently-registered queue as a (timestamp, name) pair; this mainly feeds the statistics on the admin dashboard.
synchronized (recentRegisteredQueue) {
recentRegisteredQueue.add(new Pair<Long, String>(
System.currentTimeMillis(),
registrant.getAppName() + "(" + registrant.getId() + ")"));
}
// This is where the initial state transfer of overridden status happens
if (!InstanceStatus.UNKNOWN.equals(registrant.getOverriddenStatus())) {
logger.debug("Found overridden status {} for instance {}. Checking to see if needs to be add to the "
+ "overrides", registrant.getOverriddenStatus(), registrant.getId());
if (!overriddenInstanceStatusMap.containsKey(registrant.getId())) {
logger.info("Not found overridden id {} and hence adding it", registrant.getId());
overriddenInstanceStatusMap.put(registrant.getId(), registrant.getOverriddenStatus());
}
}
InstanceStatus overriddenStatusFromMap = overriddenInstanceStatusMap.get(registrant.getId());
if (overriddenStatusFromMap != null) {
logger.info("Storing overridden status {} from map", overriddenStatusFromMap);
registrant.setOverriddenStatus(overriddenStatusFromMap);
}
// Set the status based on the overridden status rules
InstanceStatus overriddenInstanceStatus = getOverriddenInstanceStatus(registrant, existingLease, isReplication);
registrant.setStatusWithoutDirty(overriddenInstanceStatus);
// If the lease is registered with UP status, set lease service up timestamp
// Check whether the status is UP
if (InstanceStatus.UP.equals(registrant.getStatus())) {
lease.serviceUp();
}
// Mark the action type as ADDED
registrant.setActionType(ActionType.ADDED);
// Change queue: records every change to an instance and is used for delta registry fetches
recentlyChangedQueue.add(new RecentlyChangedItem(lease));
registrant.setLastUpdatedTimestamp();
// Invalidate the cache: the arguments identify the keys, and the matching entries in the read-write cache are actively evicted
invalidateCache(registrant.getAppName(), registrant.getVIPAddress(), registrant.getSecureVipAddress());
logger.info("Registered instance {}/{} with status {} (replication={})",
registrant.getAppName(), registrant.getId(), registrant.getStatus(), isReplication);
} finally {
read.unlock();
}
}
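The two-level map that register reads and updates, together with the timestamp comparison and the renewal-threshold update, can be modeled in a few lines. This is a minimal sketch under stated assumptions, not Eureka's implementation: MiniRegistry is a hypothetical class, a plain lastDirtyTimestamp stands in for the whole InstanceInfo, a synchronized method replaces the read lock, and 0.85 is assumed as the value returned by getRenewalPercentThreshold().

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified model of AbstractInstanceRegistry.register; not Eureka code.
class MiniRegistry {
    // Assumed value of serverConfig.getRenewalPercentThreshold().
    static final double RENEWAL_PERCENT_THRESHOLD = 0.85;

    // Outer key: appName. Inner key: instanceId.
    // Value: lastDirtyTimestamp of the stored instance (stand-in for Lease<InstanceInfo>).
    final ConcurrentHashMap<String, Map<String, Long>> registry = new ConcurrentHashMap<>();
    int expectedRenewsPerMin;
    int renewsPerMinThreshold;

    MiniRegistry(int initialExpectedRenewsPerMin) {
        this.expectedRenewsPerMin = initialExpectedRenewsPerMin;
    }

    synchronized void register(String appName, String instanceId, long lastDirtyTimestamp) {
        // computeIfAbsent plays the role of the get / putIfAbsent sequence in the source.
        Map<String, Long> instances =
                registry.computeIfAbsent(appName, k -> new ConcurrentHashMap<>());
        Long existing = instances.get(instanceId);
        if (existing == null) {
            // New registration: this client is expected to renew twice a minute.
            if (expectedRenewsPerMin > 0) {
                expectedRenewsPerMin += 2;
                renewsPerMinThreshold = (int) (expectedRenewsPerMin * RENEWAL_PERCENT_THRESHOLD);
            }
            instances.put(instanceId, lastDirtyTimestamp);
        } else if (existing <= lastDirtyTimestamp) {
            // Equal or newer timestamp: the copy sent by the client wins.
            instances.put(instanceId, lastDirtyTimestamp);
        }
        // Otherwise the stored copy is newer and is kept.
    }

    public static void main(String[] args) {
        // App and instance names are made up for the example.
        MiniRegistry r = new MiniRegistry(2);
        r.register("ORDER-SERVICE", "host-1:8080", 100L);
        r.register("ORDER-SERVICE", "host-2:8080", 100L);
        r.register("ORDER-SERVICE", "host-1:8080", 50L); // stale copy, ignored
        System.out.println(r.registry);
        System.out.println(r.expectedRenewsPerMin + " / " + r.renewsPerMinThreshold);
    }
}
```

Re-registering an instance that is already present does not touch the renewal counters, which matches the else branch in the source being reached only for a brand-new lease.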
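- To follow the register method above, first understand the map that stores the registered instances. It is a two-level ConcurrentHashMap<String, Map<String, Lease<InstanceInfo>>>: the outer key is appName (the service name) and the inner key is instanceId (the instance name).
- An example of the map data (the names are hypothetical): an app ORDER-SERVICE running two instances appears as one outer entry whose inner map holds two leases, keyed by the two instance ids.
- The value in the inner map is a Lease object: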
package com.netflix.eureka.lease;
import com.netflix.eureka.registry.AbstractInstanceRegistry;
/**
* Describes a time-based availability of a {@link T}. Purpose is to avoid
* accumulation of instances in {@link AbstractInstanceRegistry} as result of ungraceful
* shutdowns that is not uncommon in AWS environments.
*
* If a lease elapses without renewals, it will eventually expire consequently
* marking the associated {@link T} for immediate eviction - this is similar to
* an explicit cancellation except that there is no communication between the
* {@link T} and {@link LeaseManager}.
*
* @author Karthik Ranganathan, Greg Kim
*/
public class Lease<T> {
enum Action {
Register, Cancel, Renew
};
public static final int DEFAULT_DURATION_IN_SECS = 90;
private T holder;
private long evictionTimestamp;
private long registrationTimestamp;
private long serviceUpTimestamp;
// Make it volatile so that the expiration task would see this quicker
private volatile long lastUpdateTimestamp;
private long duration;
public Lease(T r, int durationInSecs) {
holder = r;
registrationTimestamp = System.currentTimeMillis();
lastUpdateTimestamp = registrationTimestamp;
duration = (durationInSecs * 1000);
}
/**
* Renew the lease, use renewal duration if it was specified by the
* associated {@link T} during registration, otherwise default duration is
* {@link #DEFAULT_DURATION_IN_SECS}.
*/
public void renew() {
lastUpdateTimestamp = System.currentTimeMillis() + duration;
}
/**
* Cancels the lease by updating the eviction time.
*/
public void cancel() {
if (evictionTimestamp <= 0) {
evictionTimestamp = System.currentTimeMillis();
}
}
/**
* Mark the service as up. This will only take affect the first time called,
* subsequent calls will be ignored.
*/
public void serviceUp() {
if (serviceUpTimestamp == 0) {
serviceUpTimestamp = System.currentTimeMillis();
}
}
/**
* Set the leases service UP timestamp.
*/
public void setServiceUpTimestamp(long serviceUpTimestamp) {
this.serviceUpTimestamp = serviceUpTimestamp;
}
/**
* Checks if the lease of a given {@link com.netflix.appinfo.InstanceInfo} has expired or not.
*/
public boolean isExpired() {
return isExpired(0l);
}
/**
* Checks if the lease of a given {@link com.netflix.appinfo.InstanceInfo} has expired or not.
*
* Note that due to renew() doing the 'wrong" thing and setting lastUpdateTimestamp to +duration more than
* what it should be, the expiry will actually be 2 * duration. This is a minor bug and should only affect
* instances that ungracefully shutdown. Due to possible wide ranging impact to existing usage, this will
* not be fixed.
*
* @param additionalLeaseMs any additional lease time to add to the lease evaluation in ms.
*/
public boolean isExpired(long additionalLeaseMs) {
return (evictionTimestamp > 0 || System.currentTimeMillis() > (lastUpdateTimestamp + duration + additionalLeaseMs));
}
/**
* Gets the milliseconds since epoch when the lease was registered.
*
* @return the milliseconds since epoch when the lease was registered.
*/
public long getRegistrationTimestamp() {
return registrationTimestamp;
}
/**
* Gets the milliseconds since epoch when the lease was last renewed.
* Note that the value returned here is actually not the last lease renewal time but the renewal + duration.
*
* @return the milliseconds since epoch when the lease was last renewed.
*/
public long getLastRenewalTimestamp() {
return lastUpdateTimestamp;
}
/**
* Gets the milliseconds since epoch when the lease was evicted.
*
* @return the milliseconds since epoch when the lease was evicted.
*/
public long getEvictionTimestamp() {
return evictionTimestamp;
}
/**
* Gets the milliseconds since epoch when the service for the lease was marked as up.
*
* @return the milliseconds since epoch when the service for the lease was marked as up.
*/
public long getServiceUpTimestamp() {
return serviceUpTimestamp;
}
/**
* Returns the holder of the lease.
*/
public T getHolder() {
return holder;
}
}
There is actually a bug here, in the lease renewal method renew of this class:
public void renew() {
lastUpdateTimestamp = System.currentTimeMillis() + duration;
}
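- This method has a bug: duration should not be added here. Because it is, eviction is not judged against the default 90s but against 180s, that is, 2*duration.
- Field overview:
(1) DEFAULT_DURATION_IN_SECS: the lease expiration constant, 90 seconds by default; if no heartbeat arrives within that time, the server is meant to evict the instance automatically.
(2) holder: who the lease belongs to; currently this is the InstanceInfo, that is, the client instance information.
(3) evictionTimestamp: when the lease was evicted; it is set when the service goes offline (cancel).
(4) registrationTimestamp: when the lease was registered.
(5) serviceUpTimestamp: when the service came up; it is set the first time the instance registers with status UP.
(6) lastUpdateTimestamp: the last update time; it is refreshed on every renewal and is used when checking whether the instance has expired.
(7) duration: the lease duration, in milliseconds.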
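The effect of the extra "+ duration" is easiest to see with the clock made explicit. The class below is a hypothetical re-implementation of just the timing logic of Lease (the evictionTimestamp check and the holder are left out, and the current time is passed in as a parameter), so the arithmetic can be checked without waiting.

```java
// Hypothetical re-implementation of the Lease timing logic with an explicit clock;
// not Eureka code.
class MiniLease {
    static final int DEFAULT_DURATION_IN_SECS = 90;

    private final long durationMs;
    private long lastUpdateTimestamp;

    MiniLease(long nowMs, int durationInSecs) {
        this.durationMs = durationInSecs * 1000L;
        this.lastUpdateTimestamp = nowMs; // same as registrationTimestamp in the source
    }

    // Mirrors Lease.renew(): note the extra "+ duration".
    void renew(long nowMs) {
        lastUpdateTimestamp = nowMs + durationMs;
    }

    // Mirrors Lease.isExpired(0) without the evictionTimestamp check.
    boolean isExpired(long nowMs) {
        return nowMs > lastUpdateTimestamp + durationMs;
    }

    public static void main(String[] args) {
        MiniLease lease = new MiniLease(0L, DEFAULT_DURATION_IN_SECS);
        lease.renew(0L);                               // last heartbeat at t = 0
        System.out.println(lease.isExpired(90_001L));  // false: 90s have passed, not expired yet
        System.out.println(lease.isExpired(180_001L)); // true: expires only after 2 * duration
    }
}
```

In this model, a lease that has never been renewed expires after a single duration, because the constructor does not add the extra 90s; the doubling only applies once renew has run.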
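3. The server-side Jersey endpoint that returns registered instance information
- A client's full registry fetch goes through the getContainers method of ApplicationsResource, which returns all registered instance information.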
/**
* Get information about all {@link com.netflix.discovery.shared.Applications}.
*
* @param version the version of the request.
* @param acceptHeader the accept header to indicate whether to serve JSON or XML data.
* @param acceptEncoding the accept header to indicate whether to serve compressed or uncompressed data.
* @param eurekaAccept an eureka accept extension, see {@link com.netflix.appinfo.EurekaAccept}
* @param uriInfo the {@link java.net.URI} information of the request made.
* @param regionsStr A comma separated list of remote regions from which the instances will also be returned.
* The applications returned from the remote region can be limited to the applications
* returned by {@link EurekaServerConfig#getRemoteRegionAppWhitelist(String)}
*
* @return a response containing information about all {@link com.netflix.discovery.shared.Applications}
* from the {@link AbstractInstanceRegistry}.
*/
@GET
public Response getContainers(@PathParam("version") String version,
@HeaderParam(HEADER_ACCEPT) String acceptHeader,
@HeaderParam(HEADER_ACCEPT_ENCODING) String acceptEncoding,
@HeaderParam(EurekaAccept.HTTP_X_EUREKA_ACCEPT) String eurekaAccept,
@Context UriInfo uriInfo,
@Nullable @QueryParam("regions") String regionsStr) {
boolean isRemoteRegionRequested = null != regionsStr && !regionsStr.isEmpty();
String[] regions = null;
if (!isRemoteRegionRequested) {
EurekaMonitors.GET_ALL.increment();
} else {
regions = regionsStr.toLowerCase().split(",");
Arrays.sort(regions); // So we don't have different caches for same regions queried in different order.
EurekaMonitors.GET_ALL_WITH_REMOTE_REGIONS.increment();
}
// Check if the server allows the access to the registry. The server can
// restrict access if it is not
// ready to serve traffic depending on various reasons.
if (!registry.shouldAllowAccess(isRemoteRegionRequested)) {
return Response.status(Status.FORBIDDEN).build();
}
CurrentRequestVersion.set(Version.toEnum(version));
KeyType keyType = Key.KeyType.JSON;
String returnMediaType = MediaType.APPLICATION_JSON;
if (acceptHeader == null || !acceptHeader.contains(HEADER_JSON_VALUE)) {
keyType = Key.KeyType.XML;
returnMediaType = MediaType.APPLICATION_XML;
}
// Build the cache key for the requested registry data
Key cacheKey = new Key(Key.EntityType.Application,
ResponseCacheImpl.ALL_APPS,
keyType, CurrentRequestVersion.get(), EurekaAccept.fromString(eurekaAccept), regions
);
Response response;
if (acceptEncoding != null && acceptEncoding.contains(HEADER_GZIP_VALUE)) {
response = Response.ok(responseCache.getGZIP(cacheKey))
.header(HEADER_CONTENT_ENCODING, HEADER_GZIP_VALUE)
.header(HEADER_CONTENT_TYPE, returnMediaType)
.build();
} else {
// Fetch the registry data from the cache
response = Response.ok(responseCache.get(cacheKey))
.build();
}
return response;
}
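One detail in getContainers that is easy to miss is the normalization of the regions parameter before it becomes part of the cache key. The helper below is a hypothetical stand-alone version of those two lines (lower-casing and sorting); it only illustrates why two requests that list the same regions in a different order or case end up sharing one cache entry.

```java
import java.util.Arrays;

// Hypothetical helper showing why getContainers lower-cases and sorts the regions
// before building the cache key; not Eureka code.
class RegionKeySketch {

    static String[] normalizeRegions(String regionsStr) {
        if (regionsStr == null || regionsStr.isEmpty()) {
            return null; // no remote regions requested
        }
        String[] regions = regionsStr.toLowerCase().split(",");
        Arrays.sort(regions);
        return regions;
    }

    public static void main(String[] args) {
        // Region names are sample values for the example.
        String[] a = normalizeRegions("us-west-2,US-EAST-1");
        String[] b = normalizeRegions("us-east-1,us-west-2");
        System.out.println(Arrays.equals(a, b));
    }
}
```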
- The source behind responseCache.get(cacheKey) in response = Response.ok(responseCache.get(cacheKey)) is shown below
@VisibleForTesting
String get(final Key key, boolean useReadOnlyCache) {
// Fetch the registry data from the multi-level cache
Value payload = getValue(key, useReadOnlyCache);
if (payload == null || payload.getPayload().equals(EMPTY_PAYLOAD)) {
return null;
} else {
return payload.getPayload();
}
}
/**
* Get the payload in both compressed and uncompressed form.
*/
@VisibleForTesting
Value getValue(final Key key, boolean useReadOnlyCache) {
Value payload = null;
try {
if (useReadOnlyCache) {
// Try the read-only cache first (this step only reads)
final Value currentPayload = readOnlyCacheMap.get(key);
if (currentPayload != null) {
payload = currentPayload;
} else {
// Fall back to the read-write cache
payload = readWriteCacheMap.get(key);
// Populate the read-only cache with the value
readOnlyCacheMap.put(key, payload);
}
} else {
// The read-only cache is disabled: read straight from the read-write cache without updating the read-only cache
payload = readWriteCacheMap.get(key);
}
} catch (Throwable t) {
logger.error("Cannot get value for key : {}", key, t);
}
return payload;
}
ResponseCacheImpl(EurekaServerConfig serverConfig, ServerCodecs serverCodecs, AbstractInstanceRegistry registry) {
this.serverConfig = serverConfig;
this.serverCodecs = serverCodecs;
this.shouldUseReadOnlyResponseCache = serverConfig.shouldUseReadOnlyResponseCache();
this.registry = registry;
long responseCacheUpdateIntervalMs = serverConfig.getResponseCacheUpdateIntervalMs();
// Entries in the read-write cache expire automatically, 180s after being written by default
this.readWriteCacheMap =
CacheBuilder.newBuilder().initialCapacity(1000)
.expireAfterWrite(serverConfig.getResponseCacheAutoExpirationInSeconds(), TimeUnit.SECONDS)
.removalListener(new RemovalListener<Key, Value>() {
@Override
public void onRemoval(RemovalNotification<Key, Value> notification) {
Key removedKey = notification.getKey();
if (removedKey.hasRegions()) {
Key cloneWithNoRegions = removedKey.cloneWithoutRegions();
regionSpecificKeys.remove(cloneWithNoRegions, removedKey);
}
}
})
.build(new CacheLoader<Key, Value>() {
@Override
public Value load(Key key) throws Exception {
if (key.hasRegions()) {
Key cloneWithNoRegions = key.cloneWithoutRegions();
regionSpecificKeys.put(cloneWithNoRegions, key);
}
Value value = generatePayload(key);
return value;
}
});
if (shouldUseReadOnlyResponseCache) {
// By default, every 30s the read-only cache is refreshed from the read-write cache
timer.schedule(getCacheUpdateTask(),
new Date(((System.currentTimeMillis() / responseCacheUpdateIntervalMs) * responseCacheUpdateIntervalMs)
+ responseCacheUpdateIntervalMs),
responseCacheUpdateIntervalMs);
}
try {
Monitors.registerObject(this);
} catch (Throwable e) {
logger.warn("Cannot register the JMX monitor for the InstanceRegistry", e);
}
}
/*
* Generate pay load for the given key.
*/
// On a cache load, the data is taken straight from the registry and placed in the read-write cache
private Value generatePayload(Key key) {
Stopwatch tracer = null;
try {
String payload;
switch (key.getEntityType()) {
case Application:
boolean isRemoteRegionRequested = key.hasRegions();
if (ALL_APPS.equals(key.getName())) {
if (isRemoteRegionRequested) {
tracer = serializeAllAppsWithRemoteRegionTimer.start();
payload = getPayLoad(key, registry.getApplicationsFromMultipleRegions(key.getRegions()));
} else {
tracer = serializeAllAppsTimer.start();
payload = getPayLoad(key, registry.getApplications());
}
} else if (ALL_APPS_DELTA.equals(key.getName())) {
if (isRemoteRegionRequested) {
tracer = serializeDeltaAppsWithRemoteRegionTimer.start();
versionDeltaWithRegions.incrementAndGet();
versionDeltaWithRegionsLegacy.incrementAndGet();
payload = getPayLoad(key,
registry.getApplicationDeltasFromMultipleRegions(key.getRegions()));
} else {
tracer = serializeDeltaAppsTimer.start();
versionDelta.incrementAndGet();
versionDeltaLegacy.incrementAndGet();
payload = getPayLoad(key, registry.getApplicationDeltas());
}
} else {
tracer = serializeOneApptimer.start();
payload = getPayLoad(key, registry.getApplication(key.getName()));
}
break;
case VIP:
case SVIP:
tracer = serializeViptimer.start();
payload = getPayLoad(key, getApplicationsForVip(key, registry));
break;
default:
logger.error("Unidentified entity type: {} found in the cache key.", key.getEntityType());
payload = "";
break;
}
return new Value(payload);
} finally {
if (tracer != null) {
tracer.stop();
}
}
}
// Refresh the read-only cache from the read-write cache
private TimerTask getCacheUpdateTask() {
return new TimerTask() {
@Override
public void run() {
logger.debug("Updating the client cache from response cache");
for (Key key : readOnlyCacheMap.keySet()) {
if (logger.isDebugEnabled()) {
logger.debug("Updating the client cache from response cache for key : {} {} {} {}",
key.getEntityType(), key.getName(), key.getVersion(), key.getType());
}
try {
CurrentRequestVersion.set(key.getVersion());
Value cacheValue = readWriteCacheMap.get(key);
Value currentCacheValue = readOnlyCacheMap.get(key);
if (cacheValue != currentCacheValue) {
readOnlyCacheMap.put(key, cacheValue);
}
} catch (Throwable th) {
logger.error("Error while updating the client cache from response cache for key {}", key.toStringCompact(), th);
}
}
}
};
}
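The read path in getValue and the periodic sync in getCacheUpdateTask can be condensed into a small model. This is a sketch under stated assumptions rather than the real class: TwoLevelCacheSketch is hypothetical, plain ConcurrentHashMaps stand in for the Guava LoadingCache, the sync method is called by hand instead of by a Timer, and a Function plays the role of generatePayload.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical model of ResponseCacheImpl's two cache levels; plain maps stand in
// for the Guava LoadingCache and the Timer used by the real class.
class TwoLevelCacheSketch {
    private final Map<String, String> readOnlyCache = new ConcurrentHashMap<>();
    private final Map<String, String> readWriteCache = new ConcurrentHashMap<>();
    private final Function<String, String> registryLoader; // plays the role of generatePayload

    TwoLevelCacheSketch(Function<String, String> registryLoader) {
        this.registryLoader = registryLoader;
    }

    // Read path: read-only cache first, then the read-write cache,
    // which loads from the registry on a miss.
    String get(String key) {
        String value = readOnlyCache.get(key);
        if (value == null) {
            value = readWriteCache.computeIfAbsent(key, registryLoader);
            readOnlyCache.put(key, value);
        }
        return value;
    }

    // Called when the registry changes: only the read-write level is cleared.
    void invalidate(String key) {
        readWriteCache.remove(key);
    }

    // Body of the periodic sync task (scheduled every 30s by default in Eureka).
    void syncReadOnlyCache() {
        for (String key : readOnlyCache.keySet()) {
            String fresh = readWriteCache.computeIfAbsent(key, registryLoader);
            if (!fresh.equals(readOnlyCache.get(key))) {
                readOnlyCache.put(key, fresh);
            }
        }
    }

    public static void main(String[] args) {
        // A plain map stands in for the in-memory registry; "v1"/"v2" are sample payloads.
        Map<String, String> registry = new ConcurrentHashMap<>();
        registry.put("ALL_APPS", "v1");
        TwoLevelCacheSketch cache = new TwoLevelCacheSketch(registry::get);

        System.out.println(cache.get("ALL_APPS")); // v1
        registry.put("ALL_APPS", "v2");
        cache.invalidate("ALL_APPS");
        System.out.println(cache.get("ALL_APPS")); // still v1 until the next sync
        cache.syncReadOnlyCache();
        System.out.println(cache.get("ALL_APPS")); // v2
    }
}
```

The middle read is the interesting one: the registry already holds the new value, yet readers keep getting the old one until the sync runs, which is the delay clients observe in practice.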
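Summary:
- The essence of this code: the multi-level cache design
- When the registry is fetched:
(1) The read-only cache (ReadOnlyCacheMap) is queried first.
(2) If it has no entry, the read-write cache (ReadWriteCacheMap) is queried.
(3) If that also has no entry, the data is loaded from the actual in-memory registry (which always exists; it is Eureka Server's registry itself).
- When the registry changes:
(1) The in-memory registry is updated and the affected entries in the read-write cache (ReadWriteCacheMap) are invalidated.
(2) The read-only cache (ReadOnlyCacheMap) is not touched, so clients can keep reading the registry from it.
(3) By default, every 30s Eureka Server copies the values in ReadWriteCacheMap into ReadOnlyCacheMap.
(4) By default, entries in ReadWriteCacheMap expire after 180s.
(5) The next fetch after that loads the latest data from memory and fills both the read-write cache and the read-only cache.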
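- Advantages of the multi-level cache
- It keeps frequent read/write contention on the in-memory registry to a minimum.
- It lets the large volume of requests hitting Eureka Server be served quickly from memory, so performance is high.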
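- Puzzling behavior that the source code explains
- When an instance registers, goes offline, or fails, the in-memory registry is updated promptly, but clients do not necessarily see the change right away; it can take up to 30s. This is the multi-level cache at work: clients read the read-only cache first, and it is refreshed from the read-write cache only every 30s. A registry change invalidates the read-write entry, so the next sync (or the next read that misses the read-only cache) reloads the latest data from the in-memory registry and fills both caches. Only eventual consistency is guaranteed.
- An instance is not evicted after 90s. By default eviction takes 180s, that is, 2*duration. This comes from the Eureka bug described above: renew has already added duration (90s by default) to lastUpdateTimestamp, and the expiry check adds it once more.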