“不积跬步,无以至千里。”
之前说过,Eureka client第一次启动的时候,回去一次性全量拉取server端的注册表信息,缓存在本地,后续会通过一个定时调用的线程任务,每隔一定时间(默认30s),发动请求到server端,拉取增量的注册表。
这个所谓的“增量”,其实是说,跟上一次拉取相比,后续有变动的服务实例,而不是每次都全部拉取下来。
这个定时调度的线程池是通过DiscoveryClient构造方法中的一个initScheduledTasks()
方法初始化的
// finally, init the schedule tasks (e.g. cluster resolvers, heartbeat, instanceInfo replicator, fetch
initScheduledTasks();
private void initScheduledTasks() {
if (clientConfig.shouldFetchRegistry()) {
// registry cache refresh timer
int registryFetchIntervalSeconds = clientConfig.getRegistryFetchIntervalSeconds();
int expBackOffBound = clientConfig.getCacheRefreshExecutorExponentialBackOffBound();
cacheRefreshTask = new TimedSupervisorTask(
"cacheRefresh",
scheduler,
cacheRefreshExecutor,
registryFetchIntervalSeconds,
TimeUnit.SECONDS,
expBackOffBound,
new CacheRefreshThread()
);
scheduler.schedule(
cacheRefreshTask,
registryFetchIntervalSeconds, TimeUnit.SECONDS);
}
... ...
}
这个CacheRefreshThread
线程的run()
方法就是去拉取注册表
class CacheRefreshThread implements Runnable {
public void run() {
refreshRegistry();
}
}
@VisibleForTesting
void refreshRegistry() {
try {
boolean isFetchingRemoteRegionRegistries = isFetchingRemoteRegionRegistries();
boolean remoteRegionsModified = false;
// This makes sure that a dynamic change to remote regions to fetch is honored.
//无用逻辑... ...
//拉取注册表
boolean success = fetchRegistry(remoteRegionsModified);
if (success) {
registrySize = localRegionApps.get().size();
lastSuccessfulRegistryFetchTimestamp = System.currentTimeMillis();
}
这个拉取的方法之前看过,只不过这一次走的是拉取增量注册表的逻辑,所以会执行getAndUpdateDelta(applications)
这个方法
private boolean fetchRegistry(boolean forceFullRegistryFetch) {
Stopwatch tracer = FETCH_REGISTRY_TIMER.start();
try {
// If the delta is disabled or if it is the first time, get all
// applications
Applications applications = getApplications();
if (clientConfig.shouldDisableDelta()
|| (!Strings.isNullOrEmpty(clientConfig.getRegistryRefreshSingleVipAddress()))
|| forceFullRegistryFetch
|| (applications == null)
|| (applications.getRegisteredApplications().size() == 0)
|| (applications.getVersion() == -1)) //Client application does not have latest library supporting delta
{
logger.info("Disable delta property : {}", clientConfig.shouldDisableDelta());
logger.info("Single vip registry refresh property : {}", clientConfig.getRegistryRefreshSingleVipAddress());
logger.info("Force full registry fetch : {}", forceFullRegistryFetch);
logger.info("Application is null : {}", (applications == null));
logger.info("Registered Applications size is zero : {}",
(applications.getRegisteredApplications().size() == 0));
logger.info("Application version is -1: {}", (applications.getVersion() == -1));
getAndStoreFullRegistry();
} else {
//这次上面的判断全部是false,所以会执行拉取增量注册表的方法
getAndUpdateDelta(applications);
}
在getAndUpdateDelta方法里,会走EurekaHttpClient的getDelta()方法和接口,http://localhost:8080/v2/apps/delta,GET请求
private void getAndUpdateDelta(Applications applications) throws Throwable {
long currentUpdateGeneration = fetchRegistryGeneration.get();
Applications delta = null;
EurekaHttpResponse<Applications> httpResponse = eurekaTransport.queryClient.getDelta(remoteRegionsRef.get());
if (httpResponse.getStatusCode() == Status.OK.getStatusCode()) {
delta = httpResponse.getEntity();
}
... ...
@Override
public EurekaHttpResponse<Applications> getDelta(String... regions) {
return getApplicationsInternal("apps/delta", regions);
}
private EurekaHttpResponse<Applications> getApplicationsInternal(String urlPath, String[] regions) {
Response response = null;
try {
WebTarget webTarget = jerseyClient.target(serviceUrl).path(urlPath);
if (regions != null && regions.length > 0) {
webTarget = webTarget.queryParam("regions", StringUtil.join(regions));
}
Builder requestBuilder = webTarget.request();
addExtraProperties(requestBuilder);
addExtraHeaders(requestBuilder);
response = requestBuilder.accept(MediaType.APPLICATION_JSON_TYPE).get();
... ...
}
根据路径匹配,我们找到server端处理拉取增量请求的处理器:ApplicationsResource
,可以看到,是由getContainerDifferential(...)
方法来处理的
@Path("delta")
@GET
public Response getContainerDifferential(
@PathParam("version") String version,
@HeaderParam(HEADER_ACCEPT) String acceptHeader,
@HeaderParam(HEADER_ACCEPT_ENCODING) String acceptEncoding,
@HeaderParam(EurekaAccept.HTTP_X_EUREKA_ACCEPT) String eurekaAccept,
@Context UriInfo uriInfo, @Nullable @QueryParam("regions") String regionsStr) {
boolean isRemoteRegionRequested = null != regionsStr && !regionsStr.isEmpty();
// If the delta flag is disabled in discovery or if the lease expiration
// has been disabled, redirect clients to get all instances
if ((serverConfig.shouldDisableDelta()) || (!registry.shouldAllowAccess(isRemoteRegionRequested))) {
return Response.status(Status.FORBIDDEN).build();
}
String[] regions = null;
if (!isRemoteRegionRequested) {
EurekaMonitors.GET_ALL_DELTA.increment();
} else {
regions = regionsStr.toLowerCase().split(",");
Arrays.sort(regions); // So we don't have different caches for same regions queried in different order.
EurekaMonitors.GET_ALL_DELTA_WITH_REMOTE_REGIONS.increment();
}
CurrentRequestVersion.set(Version.toEnum(version));
KeyType keyType = Key.KeyType.JSON;
String returnMediaType = MediaType.APPLICATION_JSON;
if (acceptHeader == null || !acceptHeader.contains(HEADER_JSON_VALUE)) {
keyType = Key.KeyType.XML;
returnMediaType = MediaType.APPLICATION_XML;
}
Key cacheKey = new Key(Key.EntityType.Application,
ResponseCacheImpl.ALL_APPS_DELTA,
keyType, CurrentRequestVersion.get(), EurekaAccept.fromString(eurekaAccept), regions
);
final Response response;
if (acceptEncoding != null && acceptEncoding.contains(HEADER_GZIP_VALUE)) {
response = Response.ok(responseCache.getGZIP(cacheKey))
.header(HEADER_CONTENT_ENCODING, HEADER_GZIP_VALUE)
.header(HEADER_CONTENT_TYPE, returnMediaType)
.build();
} else {
response = Response.ok(responseCache.get(cacheKey)).build();
}
CurrentRequestVersion.remove();
return response;
}
可以看到,这次构建的cacheKey就是ALL_APPS_DELTA
,即拉取增量数据
Key cacheKey = new Key(Key.EntityType.Application,
ResponseCacheImpl.ALL_APPS_DELTA,
keyType, CurrentRequestVersion.get(), EurekaAccept.fromString(eurekaAccept), regions
);
依然是通过responseCache来获取缓存,
response = Response.ok(responseCache.get(cacheKey)).build();
这次走的是ALL_APPS_DELTA这个分支,调用的是registry的getApplicationDeltasFromMultipleRegions
方法,来获取增量注册表信息
private Value generatePayload(Key key) {
Stopwatch tracer = null;
try {
String payload;
switch (key.getEntityType()) {
case Application:
boolean isRemoteRegionRequested = key.hasRegions();
if (ALL_APPS.equals(key.getName())) {
if (isRemoteRegionRequested) {
tracer = serializeAllAppsWithRemoteRegionTimer.start();
payload = getPayLoad(key, registry.getApplicationsFromMultipleRegions(key.getRegions()));
} else {
tracer = serializeAllAppsTimer.start();
payload = getPayLoad(key, registry.getApplications());
}
} else if (ALL_APPS_DELTA.equals(key.getName())) {
if (isRemoteRegionRequested) {
tracer = serializeDeltaAppsWithRemoteRegionTimer.start();
versionDeltaWithRegions.incrementAndGet();
versionDeltaWithRegionsLegacy.incrementAndGet();
payload = getPayLoad(key,
registry.getApplicationDeltasFromMultipleRegions(key.getRegions()));
} else {
tracer = serializeDeltaAppsTimer.start();
versionDelta.incrementAndGet();
versionDeltaLegacy.incrementAndGet();
payload = getPayLoad(key, registry.getApplicationDeltas());
}
} else {
tracer = serializeOneApptimer.start();
payload = getPayLoad(key, registry.getApplication(key.getName()));
}
... ...
}
这个recentlyChangedQueue
,代表的含义是,最近有变化的服务实例,比如说,服务注册、下线什么的,在AbstractInstanceRegistry(注册表)
构造的时候,有一个定时调度的任务,默认是30秒一次,看一下服务实例的变更记录,是否在队列里停留了超过3分钟,如果超过了3分钟,就会从队列里将这个服务实例变更记录给移除掉。也就是说,这个queue,就保留最近3分钟的服务实例变更记录。
public Applications getApplicationDeltasFromMultipleRegions(String[] remoteRegions) {
if (null == remoteRegions) {
remoteRegions = allKnownRemoteRegions; // null means all remote regions.
}
boolean includeRemoteRegion = remoteRegions.length != 0;
if (includeRemoteRegion) {
GET_ALL_WITH_REMOTE_REGIONS_CACHE_MISS_DELTA.increment();
} else {
GET_ALL_CACHE_MISS_DELTA.increment();
}
Applications apps = new Applications();
apps.setVersion(responseCache.getVersionDeltaWithRegions().get());
Map<String, Application> applicationInstancesMap = new HashMap<String, Application>();
write.lock();
try {
Iterator<RecentlyChangedItem> iter = this.recentlyChangedQueue.iterator();
logger.debug("The number of elements in the delta queue is :{}", this.recentlyChangedQueue.size());
while (iter.hasNext()) {
//从queue中获取leaseInfo,即实例信息
Lease<InstanceInfo> lease = iter.next().getLeaseInfo();
InstanceInfo instanceInfo = lease.getHolder();
logger.debug("The instance id {} is found with status {} and actiontype {}",
instanceInfo.getId(), instanceInfo.getStatus().name(), instanceInfo.getActionType().name());
Application app = applicationInstancesMap.get(instanceInfo.getAppName());
if (app == null) {
app = new Application(instanceInfo.getAppName());
applicationInstancesMap.put(instanceInfo.getAppName(), app);
apps.addApplication(app);
}
app.addInstance(new InstanceInfo(decorateInstanceInfo(lease)));
}
//... ...
Applications allApps = getApplicationsFromMultipleRegions(remoteRegions);
apps.setAppsHashCode(allApps.getReconcileHashCode());
return apps;
} finally {
write.unlock();
}
}
protected AbstractInstanceRegistry(EurekaServerConfig serverConfig, EurekaClientConfig clientConfig, ServerCodecs serverCodecs) {
this.serverConfig = serverConfig;
this.clientConfig = clientConfig;
this.serverCodecs = serverCodecs;
this.recentCanceledQueue = new CircularQueue<Pair<Long, String>>(1000);
this.recentRegisteredQueue = new CircularQueue<Pair<Long, String>>(1000);
this.renewsLastMin = new MeasuredRate(1000 * 60 * 1);
//定时调度任务,定时清理recentlyChangedQueue中超过3分钟的数据,默认30s执行一次
this.deltaRetentionTimer.schedule(getDeltaRetentionTask(),
serverConfig.getDeltaRetentionTimerIntervalInMs(),
serverConfig.getDeltaRetentionTimerIntervalInMs());
}
@Override
public long getDeltaRetentionTimerIntervalInMs() {
return configInstance.getLongProperty(
namespace + "deltaRetentionTimerIntervalInMs", (30 * 1000))
.get();
}
private TimerTask getDeltaRetentionTask() {
return new TimerTask() {
@Override
public void run() {
Iterator<RecentlyChangedItem> it = recentlyChangedQueue.iterator();
while (it.hasNext()) {
if (it.next().getLastUpdateTime() <
//这个getRetentionTimeInMSInDeltaQueue()方法的默认值是3 * 60 * 1000ms,即180s
System.currentTimeMillis() - serverConfig.getRetentionTimeInMSInDeltaQueue()) {
//这里会移除超过3分钟的服务实例信息RecentlyChangedItem
it.remove();
} else {
break;
}
}
}
};
}
最后总结一下,eureka client通过把CacheRefreshThread
这个线程类放进一个调度线程池里,每次30秒,去抓取一次增量注册表,只返回最近3分钟内发生过变化的服务实例。
拉取下来之后,会调用DiscoveryClient的updateDelta
方法,与本地的注册表进行合并,通过ActionType进行不同的操作,增删改。
private void updateDelta(Applications delta) {
int deltaCount = 0;
for (Application app : delta.getRegisteredApplications()) {
for (InstanceInfo instance : app.getInstances()) {
Applications applications = getApplications();
String instanceRegion = instanceRegionChecker.getInstanceRegion(instance);
if (!instanceRegionChecker.isLocalRegion(instanceRegion)) {
Applications remoteApps = remoteRegionVsApps.get(instanceRegion);
if (null == remoteApps) {
remoteApps = new Applications();
remoteRegionVsApps.put(instanceRegion, remoteApps);
}
applications = remoteApps;
}
++deltaCount;
if (ActionType.ADDED.equals(instance.getActionType())) {
Application existingApp = applications.getRegisteredApplications(instance.getAppName());
if (existingApp == null) {
applications.addApplication(app);
}
logger.debug("Added instance {} to the existing apps in region {}", instance.getId(), instanceRegion);
applications.getRegisteredApplications(instance.getAppName()).addInstance(instance);
} else if (ActionType.MODIFIED.equals(instance.getActionType())) {
Application existingApp = applications.getRegisteredApplications(instance.getAppName());
if (existingApp == null) {
applications.addApplication(app);
}
logger.debug("Modified instance {} to the existing apps ", instance.getId());
applications.getRegisteredApplications(instance.getAppName()).addInstance(instance);
} else if (ActionType.DELETED.equals(instance.getActionType())) {
Application existingApp = applications.getRegisteredApplications(instance.getAppName());
if (existingApp != null) {
logger.debug("Deleted instance {} to the existing apps ", instance.getId());
existingApp.removeInstance(instance);
/*
* We find all instance list from application(The status of instance status is not only the status is UP but also other status)
* if instance list is empty, we remove the application.
*/
if (existingApp.getInstancesAsIsFromEureka().isEmpty()) {
applications.removeApplication(existingApp);
}
}
}
}
}
logger.debug("The total number of instances fetched by the delta processor : {}", deltaCount);
getApplications().setVersion(delta.getVersion());
getApplications().shuffleInstances(clientConfig.shouldFilterOnlyUpInstances());
for (Applications applications : remoteRegionVsApps.values()) {
applications.setVersion(delta.getVersion());
applications.shuffleInstances(clientConfig.shouldFilterOnlyUpInstances());
}
}
public enum ActionType {
ADDED, // Added in the discovery server
MODIFIED, // Changed in the discovery server
DELETED
// Deleted from the discovery server
}
对更新合并完以后的注册表,会计算一个hash值;从eureka server端增量拉取的时候也会带上一个eureka server端的全量注册表的hash值;此时会将eureka client端的合并完的注册表的hash值,跟eureka server端的全量注册表的hash值进行一个比对;如果说不一样的话,说明本地注册表跟server端不一样,此时就会重新从eureka server拉取全量的注册表到本地来更新到缓存里去。
reconcileHashCode
计算本地注册表hash,与服务端全量的delta.getAppsHashCode()
比对,发现不一致的话,调用reconcileAndLogDifference
方法全量拉取一次
// There is a diff in number of instances for some reason
if (!reconcileHashCode.equals(delta.getAppsHashCode()) || clientConfig.shouldLogDeltaDiff()) {
reconcileAndLogDifference(delta, reconcileHashCode); // this makes a remoteCall
}
private void reconcileAndLogDifference(Applications delta, String reconcileHashCode) throws Throwable {
logger.debug("The Reconcile hashcodes do not match, client : {}, server : {}. Getting the full registry",
reconcileHashCode, delta.getAppsHashCode());
RECONCILE_HASH_CODES_MISMATCH.increment();
long currentUpdateGeneration = fetchRegistryGeneration.get();
EurekaHttpResponse<Applications> httpResponse = clientConfig.getRegistryRefreshSingleVipAddress() == null
? eurekaTransport.queryClient.getApplications(remoteRegionsRef.get())
: eurekaTransport.queryClient.getVip(clientConfig.getRegistryRefreshSingleVipAddress(), remoteRegionsRef.get());
Applications serverApps = httpResponse.getEntity();
if (serverApps == null) {
logger.warn("Cannot fetch full registry from the server; reconciliation failure");
return;
}
... ...
}
你以为它的计算hash值是使用Object里的hashCode()方法吗?
NO!
服务的状态 + “_” + 实例数量,是这么来计算的,ok这个了解一下即可。这就是细节地方的代码了,属于学有余力的状态来研究看看的。
public void populateInstanceCountMap(Map<String, AtomicInteger> instanceCountMap) {
for (Application app : this.getRegisteredApplications()) {
for (InstanceInfo info : app.getInstancesAsIsFromEureka()) {
//可以看到,这里Map的key是服务实例的状态,value为当前状态下服务的实例个数
AtomicInteger instanceCount = instanceCountMap.computeIfAbsent(info.getStatus().name(),
k -> new AtomicInteger(0));
instanceCount.incrementAndGet();
}
}
}
服务状态种类如下
public enum InstanceStatus {
UP, // Ready to receive traffic
DOWN, // Do not send traffic- healthcheck callback failed
STARTING, // Just about starting- initializations to be done - do not
// send traffic
OUT_OF_SERVICE, // Intentionally shutdown for traffic
UNKNOWN;
... ...
}
public static String getReconcileHashCode(Map<String, AtomicInteger> instanceCountMap) {
StringBuilder reconcileHashCode = new StringBuilder(75);
for (Map.Entry<String, AtomicInteger> mapEntry : instanceCountMap.entrySet()) {
reconcileHashCode.append(mapEntry.getKey()).append(STATUS_DELIMITER).append(mapEntry.getValue().get())
.append(STATUS_DELIMITER);
}
return reconcileHashCode.toString();
}
到此为止,eureka client拉取注册表的逻辑代码就是这些,总体上来讲,server这端设计还是有一些值得借鉴的地方,比如多级缓存机制、hash比对的思想、最新变化的数据放入一个queue中,使用后台线程定时清理超过一定时间的数据,只保存最近一段时间的数据等等。
最后,附上一张手绘的服务发现的流程图。