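Eureka is one of the most commonly used microservice governance components. Its common functions are:
1. Service registration
2. Heartbeat renewal
3. Service list fetching
1. Service registration
After Spring Boot starts, it triggers the first service registration. During startup, the calls proceed in this order: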
refreshContext(context) --> refresh(context) --> finishRefresh() --> getLifecycleProcessor().onRefresh() --> startBeans(true) --> phases.get(key).start() --> doStart(this.lifecycleBeans, member.name, this.autoStartupOnly) --> bean.start()
One of these beans is EurekaAutoServiceRegistration, which implements SmartLifecycle, so look at its start method. It calls this.serviceRegistry.register(this.registration), which in turn calls reg.getApplicationInfoManager().setInstanceStatus(reg.getInstanceConfig().getInitialStatus());
Here reg.getInstanceConfig().getInitialStatus() is UP. The setInstanceStatus method is as follows:
/**
* Set the status of this instance. Application can use this to indicate
* whether it is ready to receive traffic. Setting the status here also notifies all registered listeners
* of a status change event.
*
* @param status Status of the instance
*/
public synchronized void setInstanceStatus(InstanceStatus status) {
InstanceStatus next = instanceStatusMapper.map(status);
if (next == null) {
return;
}
InstanceStatus prev = instanceInfo.setStatus(next);
if (prev != null) {
for (StatusChangeListener listener : listeners.values()) {
try {
// Publish a new StatusChangeEvent to each listener
listener.notify(new StatusChangeEvent(prev, next));
} catch (Exception e) {
logger.warn("failed to notify listener: {}", listener.getId(), e);
}
}
}
}
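The publish side shown above can be reduced to a small listener registry. The following is a minimal sketch, not Eureka's actual code: StatusPublisher, Listener, and onChange are hypothetical names chosen for illustration. It assumes, as the prev != null check suggests, that listeners are notified only when the status really changes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified stand-in for ApplicationInfoManager (not Eureka's real class).
class StatusPublisher {
    enum Status { STARTING, UP, DOWN }

    interface Listener {
        String getId();
        void onChange(Status previous, Status next);
    }

    private final Map<String, Listener> listeners = new ConcurrentHashMap<>();
    private Status current = Status.STARTING;

    void register(Listener listener) {
        listeners.put(listener.getId(), listener);
    }

    // Same shape as setInstanceStatus: swap the status, then notify every listener.
    synchronized void setStatus(Status next) {
        Status previous = current;
        if (previous == next) {
            return; // nothing changed, so no event is published
        }
        current = next;
        for (Listener listener : listeners.values()) {
            try {
                listener.onChange(previous, next);
            } catch (RuntimeException e) {
                // A failing listener must not prevent the others from being notified.
                System.err.println("failed to notify listener: " + listener.getId());
            }
        }
    }
}
```

A listener registered under the id "statusChangeListener" would then see exactly one STARTING to UP transition at startup, which is the event that eventually triggers registration.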
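Now that we have found where the instance status event is published, we need to find where it is listened for.
The spring-cloud-commons package defines a DiscoveryClient interface for working with a registry; each registry provides its own implementation.
The Eureka client package registers EurekaClientAutoConfiguration through spring.factories. Via @Bean it creates new EurekaDiscoveryClient(client, clientConfig), which is a DiscoveryClient as defined by Spring Cloud. EurekaDiscoveryClient depends on an EurekaClient, so EurekaClientAutoConfiguration also declares a @Bean that creates new CloudEurekaClient(manager, config, this.optionalArgs, this.context). CloudEurekaClient extends DiscoveryClient (the class defined by Eureka, not the Spring Cloud interface). The DiscoveryClient constructor looks like this: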
@Inject
DiscoveryClient(ApplicationInfoManager applicationInfoManager, EurekaClientConfig config, AbstractDiscoveryClientOptionalArgs args,
Provider<BackupRegistry> backupRegistryProvider) {
// ... (omitted)
try {
// default size of 2 - 1 each for heartbeat and cacheRefresh
// Scheduled thread pool
scheduler = Executors.newScheduledThreadPool(2,
new ThreadFactoryBuilder()
.setNameFormat("DiscoveryClient-%d")
.setDaemon(true)
.build());
// Heartbeat thread pool
heartbeatExecutor = new ThreadPoolExecutor(
1, clientConfig.getHeartbeatExecutorThreadPoolSize(), 0, TimeUnit.SECONDS,
new SynchronousQueue<Runnable>(),new ThreadFactoryBuilder()
.setNameFormat("DiscoveryClient-HeartbeatExecutor-%d")
.setDaemon(true)
.build()
); // use direct handoff
// Thread pool for fetching the service list
cacheRefreshExecutor = new ThreadPoolExecutor(
1, clientConfig.getCacheRefreshExecutorThreadPoolSize(), 0, TimeUnit.SECONDS,
new SynchronousQueue<Runnable>(),
new ThreadFactoryBuilder()
.setNameFormat("DiscoveryClient-CacheRefreshExecutor-%d")
.setDaemon(true)
.build()
); // use direct handoff
// ... (omitted)
// finally, init the schedule tasks (e.g. cluster resolvers, heartbeat, instanceInfo replicator, fetch
// Start the scheduled tasks
initScheduledTasks();
// ... (omitted)
}
The main logic is in the initScheduledTasks() method:
/**
* Initializes all scheduled tasks.
*/
private void initScheduledTasks() {
// Service list fetching
if (clientConfig.shouldFetchRegistry()) {
// registry cache refresh timer
int registryFetchIntervalSeconds = clientConfig.getRegistryFetchIntervalSeconds();
int expBackOffBound = clientConfig.getCacheRefreshExecutorExponentialBackOffBound();
scheduler.schedule(
new TimedSupervisorTask(
"cacheRefresh",
scheduler,
cacheRefreshExecutor,
registryFetchIntervalSeconds,
TimeUnit.SECONDS,
expBackOffBound,
new CacheRefreshThread()
),
registryFetchIntervalSeconds, TimeUnit.SECONDS);
}
// Heartbeat renewal
if (clientConfig.shouldRegisterWithEureka()) {
int renewalIntervalInSecs = instanceInfo.getLeaseInfo().getRenewalIntervalInSecs();
int expBackOffBound = clientConfig.getHeartbeatExecutorExponentialBackOffBound();
logger.info("Starting heartbeat executor: " + "renew interval is: {}", renewalIntervalInSecs);
// Heartbeat timer
scheduler.schedule(
new TimedSupervisorTask(
"heartbeat",
scheduler,
heartbeatExecutor,
renewalIntervalInSecs,
TimeUnit.SECONDS,
expBackOffBound,
new HeartbeatThread()
),
renewalIntervalInSecs, TimeUnit.SECONDS);
// InstanceInfo replicator
instanceInfoReplicator = new InstanceInfoReplicator(
this,
instanceInfo,
clientConfig.getInstanceInfoReplicationIntervalSeconds(),
2); // burstSize
// Create a listener for StatusChangeEvent
statusChangeListener = new ApplicationInfoManager.StatusChangeListener() {
@Override
public String getId() {
return "statusChangeListener";
}
@Override
// notify() runs when the event is received
public void notify(StatusChangeEvent statusChangeEvent) {
if (InstanceStatus.DOWN == statusChangeEvent.getStatus() ||
InstanceStatus.DOWN == statusChangeEvent.getPreviousStatus()) {
// log at warn level if DOWN was involved
logger.warn("Saw local status change event {}", statusChangeEvent);
} else {
logger.info("Saw local status change event {}", statusChangeEvent);
}
// Trigger registration through an on-demand update
instanceInfoReplicator.onDemandUpdate();
}
};
if (clientConfig.shouldOnDemandUpdateStatusChange()) {
applicationInfoManager.registerStatusChangeListener(statusChangeListener);
}
instanceInfoReplicator.start(clientConfig.getInitialInstanceInfoReplicationIntervalSeconds());
} else {
logger.info("Not registering with Eureka server per configuration");
}
}
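At this point the service has started and the Eureka registration is complete.
Summary: service registration is driven by a listener. EurekaAutoServiceRegistration implements the SmartLifecycle interface, so during Spring Boot startup (in refreshContext) its start method is called, and start publishes an instance status event. Through Spring Boot auto-configuration, the Eureka client object that implements the Spring Cloud DiscoveryClient interface is loaded. When the Eureka DiscoveryClient is constructed, it starts two scheduled tasks and sets up the event listener. Once the event is received, the register method is eventually called, which sends a REST HTTP request to the Eureka server.
2. Heartbeat renewal
The service provider maintains its heartbeat with a scheduled task. When the Eureka DiscoveryClient is constructed, the scheduler and the thread pool are defined: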
// default size of 2 - 1 each for heartbeat and cacheRefresh
// Scheduler for timed tasks
scheduler = Executors.newScheduledThreadPool(2,
new ThreadFactoryBuilder()
.setNameFormat("DiscoveryClient-%d")
.setDaemon(true)
.build());
// Thread pool created with ThreadPoolExecutor
heartbeatExecutor = new ThreadPoolExecutor(
1, clientConfig.getHeartbeatExecutorThreadPoolSize(), 0, TimeUnit.SECONDS,
new SynchronousQueue<Runnable>(),
new ThreadFactoryBuilder()
.setNameFormat("DiscoveryClient-HeartbeatExecutor-%d")
.setDaemon(true)
.build()
); // use direct handoff
initScheduledTasks() creates the scheduled task:
if (clientConfig.shouldRegisterWithEureka()) {
int renewalIntervalInSecs = instanceInfo.getLeaseInfo().getRenewalIntervalInSecs();
int expBackOffBound = clientConfig.getHeartbeatExecutorExponentialBackOffBound();
logger.info("Starting heartbeat executor: " + "renew interval is: {}", renewalIntervalInSecs);
// Heartbeat timer
// Scheduled task
scheduler.schedule(
// A task that implements Runnable
new TimedSupervisorTask(
"heartbeat",
scheduler,
// Thread pool that runs the task
heartbeatExecutor,
renewalIntervalInSecs,
TimeUnit.SECONDS,
expBackOffBound,
// A Runnable that defines the work to perform
new HeartbeatThread()
),
renewalIntervalInSecs, TimeUnit.SECONDS);
....................
....................
} else {
logger.info("Not registering with Eureka server per configuration");
}
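Note that scheduler.schedule(...) runs a task only once, so the repetition has to come from TimedSupervisorTask itself, which schedules its next run after each execution. The sketch below isolates only the delay calculation, under the assumption that a timed-out run doubles the delay up to interval * expBackOffBound and a successful run resets it to the normal interval. BackoffDelay is a hypothetical helper for illustration, not a class from Eureka.

```java
// Hypothetical helper that models how the next delay could be chosen after each run.
class BackoffDelay {
    private final long baseMillis; // the normal interval, e.g. the renewal interval
    private final long maxMillis;  // upper bound: baseMillis * expBackOffBound
    private long currentMillis;

    BackoffDelay(long baseMillis, int expBackOffBound) {
        this.baseMillis = baseMillis;
        this.maxMillis = baseMillis * expBackOffBound;
        this.currentMillis = baseMillis;
    }

    // A timeout doubles the delay (capped at maxMillis); a success resets it to the base.
    long next(boolean timedOut) {
        currentMillis = timedOut ? Math.min(maxMillis, currentMillis * 2) : baseMillis;
        return currentMillis;
    }
}
```

With a 30 second interval and a bound of 10, repeated timeouts would stretch the delay to 60, 120, 240 and finally 300 seconds, and the first success would bring it back to 30 seconds.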
HeartbeatThread implements Runnable and defines the run method:
private class HeartbeatThread implements Runnable {
public void run() {
// renew() sends a heartbeat to the Eureka server
if (renew()) {
lastSuccessfulHeartbeatTimestamp = System.currentTimeMillis();
}
}
}
=====================================
boolean renew() {
EurekaHttpResponse<InstanceInfo> httpResponse;
try {
// Ask the Eureka server to renew the lease
httpResponse = eurekaTransport.registrationClient.sendHeartBeat(instanceInfo.getAppName(), instanceInfo.getId(), instanceInfo, null);
logger.debug(PREFIX + "{} - Heartbeat status: {}", appPathIdentifier, httpResponse.getStatusCode());
if (httpResponse.getStatusCode() == Status.NOT_FOUND.getStatusCode()) {
REREGISTER_COUNTER.increment();
logger.info(PREFIX + "{} - Re-registering apps/{}", appPathIdentifier, instanceInfo.getAppName());
long timestamp = instanceInfo.setIsDirtyWithTime();
// A 404 on renewal means the instance must register again
boolean success = register();
if (success) {
instanceInfo.unsetIsDirty(timestamp);
}
return success;
}
return httpResponse.getStatusCode() == Status.OK.getStatusCode();
} catch (Throwable e) {
logger.error(PREFIX + "{} - was unable to send heartbeat!", appPathIdentifier, e);
return false;
}
}
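The renew-or-re-register decision can be exercised without a running server by hiding the HTTP calls behind an interface. This is a minimal sketch: RegistryTransport and HeartbeatSketch are hypothetical names, and treating 204 as a successful registration is an assumption made for this example.

```java
// Hypothetical transport abstraction standing in for eurekaTransport.registrationClient.
interface RegistryTransport {
    int sendHeartbeat(String appName, String instanceId); // returns an HTTP status code
    int register(String appName, String instanceId);      // returns an HTTP status code
}

class HeartbeatSketch {
    private final RegistryTransport transport;
    private final String appName;
    private final String instanceId;

    HeartbeatSketch(RegistryTransport transport, String appName, String instanceId) {
        this.transport = transport;
        this.appName = appName;
        this.instanceId = instanceId;
    }

    // Returns true if the lease was renewed, or re-created after a 404.
    boolean renew() {
        try {
            int status = transport.sendHeartbeat(appName, instanceId);
            if (status == 404) {
                // The server no longer knows this instance, so register again.
                // Assumption: 204 (No Content) signals a successful registration.
                return transport.register(appName, instanceId) == 204;
            }
            return status == 200;
        } catch (RuntimeException e) {
            // Any transport failure counts as a missed heartbeat.
            return false;
        }
    }
}
```

The point of the 404 branch is self-healing: if the server has evicted the instance (for example after a network partition), the next heartbeat restores the registration without restarting the service.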
3. Service list fetching
When the service starts, it fetches the full service list for the first time:
// fetchRegistry fetches the full registry
if (clientConfig.shouldFetchRegistry() && !fetchRegistry(false)) {
fetchRegistryFromBackup();
}
The fetchRegistry method:
private boolean fetchRegistry(boolean forceFullRegistryFetch) {
Stopwatch tracer = FETCH_REGISTRY_TIMER.start();
try {
// If the delta is disabled or if it is the first time, get all
// applications
Applications applications = getApplications();
if (clientConfig.shouldDisableDelta()
|| (!Strings.isNullOrEmpty(clientConfig.getRegistryRefreshSingleVipAddress()))
|| forceFullRegistryFetch
|| (applications == null)
|| (applications.getRegisteredApplications().size() == 0)
|| (applications.getVersion() == -1)) //Client application does not have latest library supporting delta
{
logger.info("Disable delta property : {}", clientConfig.shouldDisableDelta());
logger.info("Single vip registry refresh property : {}", clientConfig.getRegistryRefreshSingleVipAddress());
logger.info("Force full registry fetch : {}", forceFullRegistryFetch);
logger.info("Application is null : {}", (applications == null));
logger.info("Registered Applications size is zero : {}",
(applications.getRegisteredApplications().size() == 0));
logger.info("Application version is -1: {}", (applications.getVersion() ==-1));
// Fetch the full service list
getAndStoreFullRegistry();
} else {
// Fetch only the changes to the service list
getAndUpdateDelta(applications);
}
applications.setAppsHashCode(applications.getReconcileHashCode());
logTotalInstances();
} catch (Throwable e) {
logger.error(PREFIX + "{} - was unable to refresh its cache! status = {}", appPathIdentifier, e.getMessage(), e);
return false;
} finally {
if (tracer != null) {
tracer.stop();
}
}
// Notify about cache refresh before updating the instance remote status
onCacheRefreshed();
// Update remote status based on refreshed data held in the cache
updateInstanceRemoteStatus();
// registry was fetched successfully, so return true
return true;
}
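The condition that chooses between a full fetch and a delta fetch is easier to reason about as a pure function. The sketch below restates it with plain parameters; FetchDecision and its parameter names are hypothetical and only mirror the checks visible in fetchRegistry.

```java
// Hypothetical restatement of the full-vs-delta decision as a pure function.
class FetchDecision {
    static boolean needsFullFetch(boolean deltaDisabled,
                                  String singleVipAddress,
                                  boolean forceFull,
                                  Integer localAppCount, // null means no local cache yet
                                  long localVersion) {
        return deltaDisabled
                || (singleVipAddress != null && !singleVipAddress.isEmpty())
                || forceFull
                || localAppCount == null
                || localAppCount == 0
                || localVersion == -1;
    }
}
```

This makes it visible why the first fetch after startup is always a full one: the local cache holds no applications yet, so the size check alone forces the full path.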
The getAndStoreFullRegistry method:
private void getAndStoreFullRegistry() throws Throwable {
long currentUpdateGeneration = fetchRegistryGeneration.get();
logger.info("Getting all instance registry info from the eureka server");
Applications apps = null;
EurekaHttpResponse<Applications> httpResponse = clientConfig.getRegistryRefreshSingleVipAddress() == null
// The single VIP address is null unless configured, so the first fetch takes this branch
? eurekaTransport.queryClient.getApplications(remoteRegionsRef.get())
: eurekaTransport.queryClient.getVip(clientConfig.getRegistryRefreshSingleVipAddress(), remoteRegionsRef.get());
// On a 200 response, read the Applications from the response
if (httpResponse.getStatusCode() == Status.OK.getStatusCode()) {
apps = httpResponse.getEntity();
}
logger.info("The response status is {}", httpResponse.getStatusCode());
if (apps == null) {
logger.error("The application is null for some reason. Not storing this information");
} else if (fetchRegistryGeneration.compareAndSet(currentUpdateGeneration, currentUpdateGeneration + 1)) {
// localRegionApps is where the service list is stored
localRegionApps.set(this.filterAndShuffle(apps));
logger.debug("Got full registry with apps hashcode {}", apps.getAppsHashCode());
} else {
logger.warn("Not updating applications as another thread is updating it already");
}
}
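The fetchRegistryGeneration check is a compare-and-set guard: a fetch remembers the generation it started from and may store its result only if no other thread has advanced the generation in the meantime. A minimal sketch of that pattern follows; RegistryCache is a hypothetical class that uses a list of names in place of Applications.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical cache illustrating the generation guard; not Eureka's real data structure.
class RegistryCache {
    private final AtomicLong generation = new AtomicLong(0);
    private final AtomicReference<List<String>> apps = new AtomicReference<>(List.of());

    long currentGeneration() {
        return generation.get();
    }

    // Stores the fetched list only if nobody advanced the generation since the fetch began.
    boolean store(long generationAtStart, List<String> fetched) {
        if (generation.compareAndSet(generationAtStart, generationAtStart + 1)) {
            apps.set(List.copyOf(fetched));
            return true;
        }
        return false; // another fetch already updated the cache; this result is dropped
    }

    List<String> apps() {
        return apps.get();
    }
}
```

A caller reads currentGeneration() before the remote call and passes it to store() afterwards, so a slow response that arrives after a newer one cannot overwrite fresher data.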
Periodic service list fetching
initScheduledTasks() starts the scheduled task that fetches the list:
if (clientConfig.shouldFetchRegistry()) {
// registry cache refresh timer
int registryFetchIntervalSeconds = clientConfig.getRegistryFetchIntervalSeconds();
int expBackOffBound = clientConfig.getCacheRefreshExecutorExponentialBackOffBound();
scheduler.schedule(
new TimedSupervisorTask(
"cacheRefresh",
scheduler,
cacheRefreshExecutor,
registryFetchIntervalSeconds,
TimeUnit.SECONDS,
expBackOffBound,
// The Eureka client refreshes the service list
new CacheRefreshThread()
),
registryFetchIntervalSeconds, TimeUnit.SECONDS);
}
The method executed by CacheRefreshThread:
class CacheRefreshThread implements Runnable {
public void run() {
refreshRegistry();
}
}
=======================================
@VisibleForTesting
void refreshRegistry() {
try {
boolean isFetchingRemoteRegionRegistries = isFetchingRemoteRegionRegistries();
boolean remoteRegionsModified = false;
// This makes sure that a dynamic change to remote regions to fetch is honored.
String latestRemoteRegions = clientConfig.fetchRegistryForRemoteRegions();
if (null != latestRemoteRegions) {
String currentRemoteRegions = remoteRegionsToFetch.get();
if (!latestRemoteRegions.equals(currentRemoteRegions)) {
// Both remoteRegionsToFetch and AzToRegionMapper.regionsToFetch need to be in sync
synchronized (instanceRegionChecker.getAzToRegionMapper()) {
if (remoteRegionsToFetch.compareAndSet(currentRemoteRegions, latestRemoteRegions)) {
String[] remoteRegions = latestRemoteRegions.split(",");
remoteRegionsRef.set(remoteRegions);
instanceRegionChecker.getAzToRegionMapper().setRegionsToFetch(remoteRegions);
remoteRegionsModified = true;
} else {
logger.info("Remote regions to fetch modified concurrently," +
" ignoring change from {} to {}", currentRemoteRegions, latestRemoteRegions);
}
}
} else {
// Just refresh mapping to reflect any DNS/Property change
instanceRegionChecker.getAzToRegionMapper().refreshMapping();
}
}
// Fetch the service list
boolean success = fetchRegistry(remoteRegionsModified);
if (success) {
registrySize = localRegionApps.get().size();
lastSuccessfulRegistryFetchTimestamp = System.currentTimeMillis();
}
if (logger.isDebugEnabled()) {
StringBuilder allAppsHashCodes = new StringBuilder();
allAppsHashCodes.append("Local region apps hashcode: ");
allAppsHashCodes.append(localRegionApps.get().getAppsHashCode());
allAppsHashCodes.append(", is fetching remote regions? ");
allAppsHashCodes.append(isFetchingRemoteRegionRegistries);
for (Map.Entry<String, Applications> entry : remoteRegionVsApps.entrySet()) {
allAppsHashCodes.append(", Remote region: ");
allAppsHashCodes.append(entry.getKey());
allAppsHashCodes.append(" , apps hashcode: ");
allAppsHashCodes.append(entry.getValue().getAppsHashCode());
}
logger.debug("Completed cache refresh task for discovery. All Apps hash code is {} ",
allAppsHashCodes);
}
} catch (Throwable e) {
logger.error("Cannot fetch registry from server", e);
}
}
In the end this also goes through the fetchRegistry method.
The periodic fetch takes the delta update path:
/**
* Get the delta registry information from the eureka server and update it locally.
* When applying the delta, the following flow is observed:
*
* if (update generation have not advanced (due to another thread))
* atomically try to: update application with the delta and get reconcileHashCode
* abort entire processing otherwise
* do reconciliation if reconcileHashCode clash
* fi
*
* @return the client response
* @throws Throwable on error
*/
private void getAndUpdateDelta(Applications applications) throws Throwable {
long currentUpdateGeneration = fetchRegistryGeneration.get();
Applications delta = null;
// Fetch the incremental (delta) service list
EurekaHttpResponse<Applications> httpResponse = eurekaTransport.queryClient.getDelta(remoteRegionsRef.get());
if (httpResponse.getStatusCode() == Status.OK.getStatusCode()) {
delta = httpResponse.getEntity();
}
if (delta == null) {
logger.warn("The server does not allow the delta revision to be applied because it is not safe. "
+ "Hence got the full registry.");
// Fetch the full service list
getAndStoreFullRegistry();
}
// CAS guards against concurrent updates; the list is updated only if the set succeeds
else if (fetchRegistryGeneration.compareAndSet(currentUpdateGeneration, currentUpdateGeneration + 1)) {
logger.debug("Got delta update with apps hashcode {}", delta.getAppsHashCode());
String reconcileHashCode = "";
if (fetchRegistryUpdateLock.tryLock()) {
try {
// Apply the fetched delta to the local service list
updateDelta(delta);
reconcileHashCode = getReconcileHashCode(applications);
} finally {
fetchRegistryUpdateLock.unlock();
}
} else {
logger.warn("Cannot acquire update lock, aborting getAndUpdateDelta");
}
// There is a diff in number of instances for some reason
// Compare the fetched hash code with the local list's hash code; if they differ, fetch the full list again
if (!reconcileHashCode.equals(delta.getAppsHashCode()) || clientConfig.shouldLogDeltaDiff()) {
reconcileAndLogDifference(delta, reconcileHashCode); // this makes a remoteCall
}
} else {
logger.warn("Not updating application delta as another thread is updating it already");
logger.debug("Ignoring delta update with apps hashcode {}, as another thread is updating it already", delta.getAppsHashCode());
}
}
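The reconciliation step works because client and server can each summarize their view of the registry as a short string and compare the two. The sketch below assumes the summary is a per-status instance count such as "DOWN_1_UP_2_"; that exact format, and the ReconcileHash class itself, are assumptions made for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper; the hash format used here is an assumption for illustration.
class ReconcileHash {
    // Builds a string such as "DOWN_1_UP_2_" from the statuses of all known instances.
    static String of(List<String> statuses) {
        Map<String, Integer> counts = new TreeMap<>(); // sorted keys give a stable order
        for (String status : statuses) {
            counts.merge(status, 1, Integer::sum);
        }
        StringBuilder hash = new StringBuilder();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            hash.append(entry.getKey()).append('_').append(entry.getValue()).append('_');
        }
        return hash.toString();
    }

    // After applying a delta, a mismatch with the server's hash means the local view drifted.
    static boolean needsFullFetch(List<String> localStatuses, String serverHash) {
        return !of(localStatuses).equals(serverHash);
    }
}
```

Such a summary is cheap to compute and catches missed deltas, though two different registries with the same status counts would compare as equal; it is a drift detector, not a content hash.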
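Incremental and full service lists:
The Eureka server provides a read-only cache that serves full service list fetches. It also keeps a record of recent changes (services coming online, going offline, or changing health status), retained for 3 minutes by default, which serves the clients' incremental fetches.
Incremental and full fetches call the same method; only the URL differs (apps/delta versus apps/), as shown in the figure:
Pros and cons of the incremental service list:
Pro: reduces network bandwidth and latency
Con: increases the memory load on the service registry
Why doesn't the Eureka client fetch only the provider instances of the services it actually declares?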
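Reasons:
- Multi-version management: a service can have several versions or variants. By registering the provider instances of every version with Eureka, clients can easily find all available versions in the service list and pick a specific version to call as needed. For example, in a pre-release environment that shares one codebase, a fixed identifier in the configuration file can determine which version of the same service provider is called.
- Fault recovery and fault tolerance: if the client fetched only the provider instances declared in the service, it might overlook other available instances, which could create a single point of failure or reduce fault tolerance. By fetching all registered instances, the Eureka client can fall back to alternative instances and preserve high availability.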
The incremental update method updateDelta(Applications delta):
private void updateDelta(Applications delta) {
int deltaCount = 0;
// Iterate over the service list fetched from the server
for (Application app : delta.getRegisteredApplications()) {
for (InstanceInfo instance : app.getInstances()) {
// Get the locally cached service list
Applications applications = getApplications();
String instanceRegion = instanceRegionChecker.getInstanceRegion(instance);
// Non-local region: not considered for now
==================================================
if (!instanceRegionChecker.isLocalRegion(instanceRegion)) {
Applications remoteApps = remoteRegionVsApps.get(instanceRegion);
if (null == remoteApps) {
remoteApps = new Applications();
remoteRegionVsApps.put(instanceRegion, remoteApps);
}
applications = remoteApps;
}
==================================================
++deltaCount;
if (ActionType.ADDED.equals(instance.getActionType())) {
Application existingApp = applications.getRegisteredApplications(instance.getAppName());
if (existingApp == null) {
applications.addApplication(app);
}
logger.debug("Added instance {} to the existing apps in region {}", instance.getId(), instanceRegion);
applications.getRegisteredApplications(instance.getAppName()).addInstance(instance);
} else if (ActionType.MODIFIED.equals(instance.getActionType())) {
Application existingApp = applications.getRegisteredApplications(instance.getAppName());
if (existingApp == null) {
applications.addApplication(app);
}
logger.debug("Modified instance {} to the existing apps ", instance.getId());
applications.getRegisteredApplications(instance.getAppName()).addInstance(instance);
} else if (ActionType.DELETED.equals(instance.getActionType())) {
Application existingApp = applications.getRegisteredApplications(instance.getAppName());
if (existingApp != null) {
logger.debug("Deleted instance {} to the existing apps ", instance.getId());
existingApp.removeInstance(instance);
/*
* We find all instance list from application(The status of instance status is not only the status is UP but also other status)
* if instance list is empty, we remove the application.
*/
if (existingApp.getInstancesAsIsFromEureka().isEmpty()) {
applications.removeApplication(existingApp);
}
}
}
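// The branches above handle additions (service comes online), changes to instance info (such as the port), and deletions (service goes offline)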
========================================================
}
}
logger.debug("The total number of instances fetched by the delta processor : {}", deltaCount);
getApplications().setVersion(delta.getVersion());
// Shuffle the local instance list (optionally keeping only UP instances)
getApplications().shuffleInstances(clientConfig.shouldFilterOnlyUpInstances());
for (Applications applications : remoteRegionVsApps.values()) {
applications.setVersion(delta.getVersion());
applications.shuffleInstances(clientConfig.shouldFilterOnlyUpInstances());
}
}
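Stripped of regions and logging, updateDelta is an upsert-or-remove loop over the changed instances. The following minimal sketch applies the same three action types to a plain map from application name to instance ids; DeltaApplier and Change are hypothetical types, not Eureka classes.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model of applying a delta to the local service list.
class DeltaApplier {
    enum Action { ADDED, MODIFIED, DELETED }

    // One changed instance: which application it belongs to, its id, and what happened.
    static final class Change {
        final String app;
        final String instanceId;
        final Action action;

        Change(String app, String instanceId, Action action) {
            this.app = app;
            this.instanceId = instanceId;
            this.action = action;
        }
    }

    // Applies each change to the local map and returns the number of changes processed.
    static int apply(Map<String, Set<String>> local, List<Change> delta) {
        int count = 0;
        for (Change change : delta) {
            count++;
            if (change.action == Action.DELETED) {
                Set<String> instances = local.get(change.app);
                if (instances != null) {
                    instances.remove(change.instanceId);
                    if (instances.isEmpty()) {
                        local.remove(change.app); // last instance gone: drop the application
                    }
                }
            } else {
                // ADDED and MODIFIED both upsert the instance, creating the application if needed.
                local.computeIfAbsent(change.app, key -> new HashSet<>()).add(change.instanceId);
            }
        }
        return count;
    }
}
```

As in the original method, ADDED and MODIFIED share the same handling, and an application disappears from the local list once its last instance is deleted.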
This concludes the analysis of Eureka.