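Eureka Server Self-Preservation
Self-preservation happens on the server side. The overall logic for service registration and discovery, lease renewal, and eviction is as follows: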
PeerAwareInstanceRegistryImpl
private void scheduleRenewalThresholdUpdateTask() {
    timer.schedule(new TimerTask() {
        @Override
        public void run() {
            updateRenewalThreshold();
        }
    }, serverConfig.getRenewalThresholdUpdateIntervalMs(),
            serverConfig.getRenewalThresholdUpdateIntervalMs());
}

private void updateRenewalThreshold() {
    try {
        Applications apps = eurekaClient.getApplications();
        int count = 0;
        for (Application app : apps.getRegisteredApplications()) {
            for (InstanceInfo instance : app.getInstances()) {
                if (this.isRegisterable(instance)) {
                    ++count;
                }
            }
        }
        synchronized (lock) {
            // Update threshold only if the threshold is greater than the
            // current expected threshold or if self preservation is disabled.
            if ((count) > (serverConfig.getRenewalPercentThreshold() * expectedNumberOfClientsSendingRenews)
                    || (!this.isSelfPreservationModeEnabled())) {
                this.expectedNumberOfClientsSendingRenews = count;
                updateRenewsPerMinThreshold();
            }
        }
        logger.info("Current renewal threshold is : {}", numberOfRenewsPerMinThreshold);
    } catch (Throwable e) {
        logger.error("Cannot update renewal threshold", e);
    }
}
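A few defaults in EurekaServerConfigBean:
- renewalThresholdUpdateIntervalMs = 15 * MINUTES: interval between renewal-threshold updates
- renewalPercentThreshold = 0.85: renewal percent threshold
- enableSelfPreservation = true: self-preservation is enabled
Expected number of client instances sending renewals
expectedNumberOfClientsSendingRenews: exactly what the name says, the number of instances expected to send renewals. The value is updated when a service registers, when a service cancels, and by the renewal-threshold update task that runs every 15 minutes.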
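The three update paths can be pictured with a small sketch. It is a simplified, hypothetical model: the names ExpectedClientsSketch, onRegister, onCancel, and onScheduledRecount are made up for illustration and are not Eureka APIs. It assumes that a registration adds one to the counter and a cancellation subtracts one, and it copies the recount condition from updateRenewalThreshold() shown earlier.

```java
// Hypothetical model of how expectedNumberOfClientsSendingRenews changes.
// Not the Eureka source: the names and the +1/-1 behavior are assumptions.
class ExpectedClientsSketch {
    private final Object lock = new Object();
    private final double renewalPercentThreshold = 0.85; // default quoted in the text
    private int expectedNumberOfClientsSendingRenews = 0;

    // Assumed: a new registration raises the expected count by one.
    void onRegister() {
        synchronized (lock) {
            expectedNumberOfClientsSendingRenews++;
        }
    }

    // Assumed: a cancellation lowers the expected count by one, never below zero.
    void onCancel() {
        synchronized (lock) {
            if (expectedNumberOfClientsSendingRenews > 0) {
                expectedNumberOfClientsSendingRenews--;
            }
        }
    }

    // Same condition as updateRenewalThreshold(): accept the recount only if it is
    // above 85% of the current expectation, or if self-preservation is switched off.
    void onScheduledRecount(int count, boolean selfPreservationEnabled) {
        synchronized (lock) {
            if (count > renewalPercentThreshold * expectedNumberOfClientsSendingRenews
                    || !selfPreservationEnabled) {
                expectedNumberOfClientsSendingRenews = count;
            }
        }
    }

    int expected() {
        synchronized (lock) {
            return expectedNumberOfClientsSendingRenews;
        }
    }

    public static void main(String[] args) {
        ExpectedClientsSketch s = new ExpectedClientsSketch();
        s.onRegister();
        s.onRegister();
        s.onCancel();
        System.out.println(s.expected()); // 1
    }
}
```

The guard in onScheduledRecount is why, with self-preservation on, a sharp drop in the number of registered instances does not immediately lower the expectation.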
Renewal threshold per minute
Every update to expectedNumberOfClientsSendingRenews also recomputes the per-minute renewal threshold:
protected void updateRenewsPerMinThreshold() {
    this.numberOfRenewsPerMinThreshold = (int) (this.expectedNumberOfClientsSendingRenews
            * (60.0 / serverConfig.getExpectedClientRenewalIntervalSeconds())
            * serverConfig.getRenewalPercentThreshold());
}
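EurekaServerConfigBean
- expectedClientRenewalIntervalSeconds = 30: expected interval, in seconds, between renewals from a client instance
- renewalPercentThreshold = 0.85: renewal percent threshold
numberOfRenewsPerMinThreshold is therefore the per-minute renewal threshold: the expected instance count, times the renewals each instance should send per minute (60 / 30 = 2), times 0.85, truncated to an int.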
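To make the arithmetic concrete, here is a minimal sketch that evaluates the same formula with the defaults listed above. RenewalThresholdExample and its method are hypothetical helper names, not part of Eureka.

```java
// Hypothetical helper that evaluates the threshold formula; not part of Eureka.
class RenewalThresholdExample {
    static int renewsPerMinThreshold(int expectedClients,
                                     int renewalIntervalSeconds,
                                     double renewalPercentThreshold) {
        // Renewals each client is expected to send per minute, e.g. 60 / 30 = 2.
        double renewsPerClientPerMin = 60.0 / renewalIntervalSeconds;
        // The cast truncates toward zero, as in updateRenewsPerMinThreshold().
        return (int) (expectedClients * renewsPerClientPerMin * renewalPercentThreshold);
    }

    public static void main(String[] args) {
        System.out.println(renewsPerMinThreshold(10, 30, 0.85)); // 17
        System.out.println(renewsPerMinThreshold(7, 30, 0.85));  // 11 (11.9 truncated)
    }
}
```

With 10 instances the server expects 20 renewals per minute, and the threshold sits at 17.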
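So where does the actual number of renewals per minute come from?
Actual renewals per minute
expectedNumberOfClientsSendingRenews was covered above. For it to mean anything, clients must send renewal requests and the server must accept them.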
AbstractInstanceRegistry
protected AbstractInstanceRegistry(){
    this.renewsLastMin = new MeasuredRate(1000 * 60 * 1);
}

public boolean renew(String appName, String id, boolean isReplication) {
    // ........
    renewsLastMin.increment();
    leaseToRenew.renew();
    return true;
}
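When Eureka Server accepts a renewal, it bumps the per-minute renewal count with renewsLastMin.increment();
Under the hood this just increments an AtomicLong, as the code below shows: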
public class MeasuredRate {
    private static final Logger logger = LoggerFactory.getLogger(MeasuredRate.class);
    private final AtomicLong lastBucket = new AtomicLong(0);
    private final AtomicLong currentBucket = new AtomicLong(0);
    private final long sampleInterval;
    private final Timer timer;
    private volatile boolean isActive;

    /**
     * @param sampleInterval in milliseconds
     */
    public MeasuredRate(long sampleInterval) {
        this.sampleInterval = sampleInterval;
        this.timer = new Timer("Eureka-MeasureRateTimer", true);
        this.isActive = false;
    }

    public synchronized void start() {
        if (!isActive) {
            timer.schedule(new TimerTask() {
                @Override
                public void run() {
                    try {
                        // Zero out the current bucket.
                        lastBucket.set(currentBucket.getAndSet(0));
                    } catch (Throwable e) {
                        logger.error("Cannot reset the Measured Rate", e);
                    }
                }
            }, sampleInterval, sampleInterval);
            isActive = true;
        }
    }

    public synchronized void stop() {
        if (isActive) {
            timer.cancel();
            isActive = false;
        }
    }

    /**
     * Returns the count in the last sample interval.
     */
    public long getCount() {
        return lastBucket.get();
    }

    /**
     * Increments the count in the current sample interval.
     */
    public void increment() {
        currentBucket.incrementAndGet();
    }
}
renewsLastMin.increment(); increments currentBucket.
Every minute, the timer copies currentBucket into lastBucket and resets currentBucket to 0.
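The two-bucket behavior is easier to see without the timer. The sketch below is a simplified stand-in for MeasuredRate, not the Eureka class itself: TwoBucketCounter and rollover() are hypothetical names, and rollover() is called by hand where the real class relies on a TimerTask.

```java
import java.util.concurrent.atomic.AtomicLong;

// Simplified stand-in for MeasuredRate. The timer is replaced by an explicit
// rollover() call (a hypothetical method) so the behavior is deterministic.
class TwoBucketCounter {
    private final AtomicLong lastBucket = new AtomicLong(0);
    private final AtomicLong currentBucket = new AtomicLong(0);

    // Called once per renewal; only the current interval is touched.
    void increment() {
        currentBucket.incrementAndGet();
    }

    // What the timer does at the end of each sample interval:
    // publish the current count and start counting again from zero.
    void rollover() {
        lastBucket.set(currentBucket.getAndSet(0));
    }

    // Readers always see the count of the last completed interval.
    long getCount() {
        return lastBucket.get();
    }

    public static void main(String[] args) {
        TwoBucketCounter c = new TwoBucketCounter();
        c.increment();
        c.increment();
        System.out.println(c.getCount()); // 0, the interval has not completed yet
        c.rollover();
        System.out.println(c.getCount()); // 2
    }
}
```

One consequence is that getCount() lags by up to one interval: renewals arriving now are not visible until the next rollover.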
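We now have both the actual renewals per minute and the per-minute renewal threshold. Where are they used?
Eureka Server instance eviction task
AbstractInstanceRegistry$EvictionTask
Eviction task interval: evictionIntervalTimerInMs=60000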
class EvictionTask extends TimerTask {
    private final AtomicLong lastExecutionNanosRef = new AtomicLong(0L);

    @Override
    public void run() {
        try {
            long compensationTimeMs = getCompensationTimeMs();
            logger.info("Running the evict task with compensationTime {}ms", compensationTimeMs);
            evict(compensationTimeMs);
        } catch (Throwable e) {
            logger.error("Could not run the evict task", e);
        }
    }
    // getCompensationTimeMs() omitted
}
Pre-eviction check:
public void evict(long additionalLeaseMs) {
    if (!isLeaseExpirationEnabled()) { // is lease expiration enabled? This is the key check.
        logger.debug("DS: lease expiration is currently disabled.");
        return;
    }
    // .... actual eviction logic ....
}
When isLeaseExpirationEnabled returns true, eviction proceeds:
public boolean isLeaseExpirationEnabled() {
    if (!isSelfPreservationModeEnabled()) {
        return true;
    }
    return numberOfRenewsPerMinThreshold > 0 && getNumOfRenewsInLastMin() > numberOfRenewsPerMinThreshold;
}

public long getNumOfRenewsInLastMin() {
    return renewsLastMin.getCount(); // the previous minute's value of the counter incremented on each renewal
}
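- Self-preservation disabled: returns true, so eviction proceeds
- Renewals actually received in the last minute exceed the per-minute renewal threshold: eviction may proceed
In other words, if self-preservation is enabled and the last minute's renewal count does not exceed the per-minute renewal threshold, eviction is skipped and the registry is protected.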
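The decision can be condensed into a pure function to check the edge cases. This is a sketch: SelfPreservationCheck is a hypothetical class name, and the three inputs stand in for isSelfPreservationModeEnabled(), numberOfRenewsPerMinThreshold, and getNumOfRenewsInLastMin().

```java
// Hypothetical pure-function version of the isLeaseExpirationEnabled() decision.
class SelfPreservationCheck {
    static boolean isLeaseExpirationEnabled(boolean selfPreservationEnabled,
                                            int renewsPerMinThreshold,
                                            long renewsInLastMin) {
        if (!selfPreservationEnabled) {
            return true; // protection is off: always allow eviction
        }
        // Eviction is allowed only when renewals are strictly above a positive threshold.
        return renewsPerMinThreshold > 0 && renewsInLastMin > renewsPerMinThreshold;
    }

    public static void main(String[] args) {
        System.out.println(isLeaseExpirationEnabled(true, 17, 20)); // true, healthy
        System.out.println(isLeaseExpirationEnabled(true, 17, 10)); // false, protected
    }
}
```

Note that the comparison is strict: a renewal count exactly equal to the threshold still leaves the server in protection mode, and so does a threshold of 0.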
Eviction logic
public void evict(long additionalLeaseMs) {
    logger.debug("Running the evict task");
    if (!isLeaseExpirationEnabled()) {
        logger.debug("DS: lease expiration is currently disabled.");
        return;
    }
    // We collect first all expired items, to evict them in random order. For large eviction sets,
    // if we do not that, we might wipe out whole apps before self preservation kicks in. By randomizing it,
    // the impact should be evenly distributed across all applications.
    List<Lease<InstanceInfo>> expiredLeases = new ArrayList<>();
    for (Entry<String, Map<String, Lease<InstanceInfo>>> groupEntry : registry.entrySet()) {
        Map<String, Lease<InstanceInfo>> leaseMap = groupEntry.getValue();
        if (leaseMap != null) {
            for (Entry<String, Lease<InstanceInfo>> leaseEntry : leaseMap.entrySet()) {
                Lease<InstanceInfo> lease = leaseEntry.getValue();
                if (lease.isExpired(additionalLeaseMs) && lease.getHolder() != null) {
                    expiredLeases.add(lease);
                }
            }
        }
    }
    // To compensate for GC pauses or drifting local time, we need to use current registry size as a base for
    // triggering self-preservation. Without that we would wipe out full registry.
    int registrySize = (int) getLocalRegistrySize();
    int registrySizeThreshold = (int) (registrySize * serverConfig.getRenewalPercentThreshold());
    int evictionLimit = registrySize - registrySizeThreshold;
    int toEvict = Math.min(expiredLeases.size(), evictionLimit);
    if (toEvict > 0) {
        logger.info("Evicting {} items (expired={}, evictionLimit={})", toEvict, expiredLeases.size(), evictionLimit);
        Random random = new Random(System.currentTimeMillis());
        for (int i = 0; i < toEvict; i++) {
            // Pick a random item (Knuth shuffle algorithm)
            int next = i + random.nextInt(expiredLeases.size() - i);
            Collections.swap(expiredLeases, i, next);
            Lease<InstanceInfo> lease = expiredLeases.get(i);
            String appName = lease.getHolder().getAppName();
            String id = lease.getHolder().getId();
            EXPIRED.increment();
            logger.warn("DS: Registry: expired lease for {}/{}", appName, id);
            internalCancel(appName, id, false);
        }
    }
}
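- registry: Map<String, Map<String, Lease<InstanceInfo>>>, the collection of service instances. Registration, cancellation, renewal, eviction, and the fetch-all at startup all operate on this map; it is not persisted.
- lease.isExpired(additionalLeaseMs) decides whether the instance has expired:
  public boolean isExpired(long additionalLeaseMs) { return (evictionTimestamp > 0 || System.currentTimeMillis() > (lastUpdateTimestamp + duration + additionalLeaseMs)); }
- duration comes from the client instance configuration: leaseExpirationDurationInSeconds=90
- Client instance renewal interval: leaseRenewalIntervalInSeconds=30. Once leaseExpirationDurationInSeconds=90 seconds have passed since the server received the last renewal, the instance meets the eviction condition.
- Eviction also takes care not to evict its way into self-preservation: the number actually evicted is the smaller of the number of expired instances and the number that can still be evicted before the protection threshold is reached.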
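The cap described in the last item can be checked in isolation. The sketch below lifts the three lines of arithmetic out of evict(); EvictionLimitSketch is a hypothetical class name and the registry size is passed in as a plain number.

```java
// Hypothetical extraction of the eviction-cap arithmetic from evict().
class EvictionLimitSketch {
    static int toEvict(int registrySize, int expiredCount, double renewalPercentThreshold) {
        // Number of instances that must remain for the registry to stay at the threshold.
        int registrySizeThreshold = (int) (registrySize * renewalPercentThreshold);
        // At most this many instances may be removed in one run.
        int evictionLimit = registrySize - registrySizeThreshold;
        return Math.min(expiredCount, evictionLimit);
    }

    public static void main(String[] args) {
        // 100 registered, 40 expired, threshold 0.85: only 15 are evicted this run.
        System.out.println(toEvict(100, 40, 0.85)); // 15
        // 100 registered, 5 expired: all 5 are evicted.
        System.out.println(toEvict(100, 5, 0.85));  // 5
    }
}
```

With the default of 0.85, a single run can therefore remove at most about 15% of the registry, however many leases have expired.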
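Summary
Eureka Server self-preservation is essentially implemented through the instance eviction task.
When the actual number of renewals per minute does not exceed the per-minute renewal threshold, eviction stops. The expected instance count is updated in three cases (registration, cancellation, and the scheduled task), and each update recomputes the per-minute renewal threshold:
this.numberOfRenewsPerMinThreshold = (int) (this.expectedNumberOfClientsSendingRenews
        * (60.0 / serverConfig.getExpectedClientRenewalIntervalSeconds())
        * serverConfig.getRenewalPercentThreshold());
Each instance has its own lease expiration duration (passed from client to server, 90 s by default) and renewal interval (the client's scheduled renewal task, 30 s by default). The client sends a renewal every 30 seconds; if the server still has not received one after 90 s, the instance meets the eviction condition the next time the eviction task runs.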
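The 30 s / 90 s timing can be traced with a small lease model. This is a sketch under stated assumptions, not Eureka's Lease class: LeaseSketch is a hypothetical name, the clock is passed in explicitly so the example is deterministic, and renew() is assumed to record the arrival time of the renewal. The comments in the Netflix source note that the real renew() also adds the duration to the stored timestamp, so the effective window in practice may be longer than 90 s.

```java
// Hypothetical lease model with an injected clock; not Eureka's Lease class.
class LeaseSketch {
    private final long durationMs;
    private long lastUpdateTimestamp;
    private long evictionTimestamp = 0; // stays 0 until the lease is cancelled

    LeaseSketch(long durationSeconds, long nowMs) {
        this.durationMs = durationSeconds * 1000;
        this.lastUpdateTimestamp = nowMs;
    }

    // Assumption: a renewal simply records when it arrived.
    void renew(long nowMs) {
        lastUpdateTimestamp = nowMs;
    }

    // Same comparison as the isExpired(additionalLeaseMs) quoted in the text.
    boolean isExpired(long nowMs, long additionalLeaseMs) {
        return evictionTimestamp > 0
                || nowMs > lastUpdateTimestamp + durationMs + additionalLeaseMs;
    }

    public static void main(String[] args) {
        LeaseSketch lease = new LeaseSketch(90, 0); // registered at t = 0
        lease.renew(30_000);                        // one renewal at t = 30 s
        System.out.println(lease.isExpired(120_000, 0)); // false, exactly 90 s later
        System.out.println(lease.isExpired(120_001, 0)); // true, just past 90 s
    }
}
```

The additionalLeaseMs argument plays the role of the compensation time computed by the eviction task: it pushes the deadline out when the task itself ran late.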
By contrast, leader-based and master/slave designs have to suspend service while an election is in progress.