Hands-on: timeout and retry mechanisms of the Spring Cloud load-balancing components loadbalancer and Ribbon
For a load-balancing example, see the article above. Below we follow a service call step by step through the loadbalancer source code.
How RestTemplate retry works
For RPC calls made through RestTemplate, failure retry first requires adding Spring's retry component:
<dependency>
<groupId>org.springframework.retry</groupId>
<artifactId>spring-retry</artifactId>
</dependency>
Adding this dependency to a Spring Cloud project brings in the retry component.
RibbonAutoConfiguration
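Once the component is on the classpath (what matters is the class "org.springframework.retry.support.RetryTemplate"), the configuration that integrates Ribbon into Spring Cloud can register the following bean: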
@Configuration
// Loaded before the load-balancer auto-configuration classes, so that the LoadBalancedRetryPolicyFactory defined here takes effect instead of the default one
@AutoConfigureBefore({LoadBalancerAutoConfiguration.class, AsyncLoadBalancerAutoConfiguration.class})
public class RibbonAutoConfiguration {
@Bean
@ConditionalOnClass(name = "org.springframework.retry.support.RetryTemplate")
@ConditionalOnMissingBean
public LoadBalancedRetryPolicyFactory loadBalancedRetryPolicyFactory(SpringClientFactory clientFactory) {
return new RibbonLoadBalancedRetryPolicyFactory(clientFactory);
}
}
LoadBalancedRetryPolicyFactory
// Factory that creates load-balanced retry policies
public class RibbonLoadBalancedRetryPolicyFactory implements LoadBalancedRetryPolicyFactory {
private SpringClientFactory clientFactory;
public RibbonLoadBalancedRetryPolicyFactory(SpringClientFactory clientFactory) {
this.clientFactory = clientFactory;
}
@Override
public LoadBalancedRetryPolicy create(String serviceId, ServiceInstanceChooser loadBalanceChooser) {
RibbonLoadBalancerContext lbContext = this.clientFactory
.getLoadBalancerContext(serviceId);
// Create Ribbon's load-balanced retry policy
return new RibbonLoadBalancedRetryPolicy(serviceId, lbContext, loadBalanceChooser, clientFactory.getClientConfig(serviceId));
}
}
RibbonAutoConfiguration
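RibbonAutoConfiguration defines a retry policy factory, which is used to create Ribbon-specific load-balanced retry policies.
Spring Cloud's load-balancing support post-processes RestTemplate: it creates a ClientHttpRequestInterceptor implementation and adds it to every RestTemplate bean that needs load balancing: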
@Configuration
@ConditionalOnClass(RestTemplate.class)
public class LoadBalancerAutoConfiguration {
@LoadBalanced
@Autowired(required = false)
private List<RestTemplate> restTemplates = Collections.emptyList();
@Bean
// After all singleton beans are instantiated, customize each RestTemplate: add the load-balancing interceptor defined in this configuration to every RestTemplate that needs load balancing
public SmartInitializingSingleton loadBalancedRestTemplateInitializer(final List<RestTemplateCustomizer> customizers) {
return new SmartInitializingSingleton() {
@Override
public void afterSingletonsInstantiated() {
for (RestTemplate restTemplate : LoadBalancerAutoConfiguration.this.restTemplates) {
for (RestTemplateCustomizer customizer : customizers) {
customizer.customize(restTemplate);
}
}
}
};
}
@Configuration
@ConditionalOnClass(RetryTemplate.class)
public static class RetryAutoConfiguration {
......
@Bean
@ConditionalOnMissingBean
// Ribbon's configuration class has already defined a LoadBalancedRetryPolicyFactory, so this bean is not registered
public LoadBalancedRetryPolicyFactory loadBalancedRetryPolicyFactory() {
return new LoadBalancedRetryPolicyFactory.NeverRetryFactory();
}
......
}
@Configuration
@ConditionalOnClass(RetryTemplate.class)
public static class RetryInterceptorAutoConfiguration {
@Bean
@ConditionalOnMissingBean
// Build the retry interceptor for RestTemplate from the load-balancing components; it is an implementation of ClientHttpRequestInterceptor
public RetryLoadBalancerInterceptor ribbonInterceptor(
LoadBalancerClient loadBalancerClient, LoadBalancerRetryProperties properties,
LoadBalancedRetryPolicyFactory lbRetryPolicyFactory,
LoadBalancerRequestFactory requestFactory,
LoadBalancedBackOffPolicyFactory backOffPolicyFactory,
LoadBalancedRetryListenerFactory retryListenerFactory) {
return new RetryLoadBalancerInterceptor(loadBalancerClient, properties,
lbRetryPolicyFactory, requestFactory, backOffPolicyFactory, retryListenerFactory);
}
@Bean
@ConditionalOnMissingBean
// Define a RestTemplate customizer that adds the load-balancing interceptor to the given RestTemplate
public RestTemplateCustomizer restTemplateCustomizer(
final RetryLoadBalancerInterceptor loadBalancerInterceptor) {
return new RestTemplateCustomizer() {
@Override
public void customize(RestTemplate restTemplate) {
List<ClientHttpRequestInterceptor> list = new ArrayList<>(
restTemplate.getInterceptors());
list.add(loadBalancerInterceptor);
restTemplate.setInterceptors(list);
}
};
}
}
}
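After these steps, the load-balancing components are attached to RestTemplate in the form of a ClientHttpRequestInterceptor and take effect whenever an HTTP request is executed.
The ClientHttpRequestInterceptor (where load balancing takes effect)
Inside RestTemplate, ClientHttpRequestInterceptor instances work much like a FilterChain. Here is the code of RetryLoadBalancerInterceptor: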
public class RetryLoadBalancerInterceptor implements ClientHttpRequestInterceptor {
private LoadBalancedRetryPolicyFactory lbRetryPolicyFactory;
private RetryTemplate retryTemplate;
private LoadBalancerClient loadBalancer;
private LoadBalancerRetryProperties lbProperties;
private LoadBalancerRequestFactory requestFactory;
private LoadBalancedBackOffPolicyFactory backOffPolicyFactory;
private LoadBalancedRetryListenerFactory retryListenerFactory;
......
@Override
public ClientHttpResponse intercept(final HttpRequest request, final byte[] body,
final ClientHttpRequestExecution execution) throws IOException {
final URI originalUri = request.getURI();
final String serviceName = originalUri.getHost();
Assert.state(serviceName != null, "Request URI does not contain a valid hostname: " + originalUri);
// Create the retry policy through the factory; with Ribbon this is a RibbonLoadBalancedRetryPolicy
final LoadBalancedRetryPolicy retryPolicy = lbRetryPolicyFactory.create(serviceName,loadBalancer);
// Retry template; by default a new one is instantiated for every execution
RetryTemplate template = this.retryTemplate == null ? new RetryTemplate() : this.retryTemplate;
// Back-off policy; no back-off by default
BackOffPolicy backOffPolicy = backOffPolicyFactory.createBackOffPolicy(serviceName);
template.setBackOffPolicy(backOffPolicy == null ? new NoBackOffPolicy() : backOffPolicy);
// When retries are exhausted without success, rethrow the last exception
template.setThrowLastExceptionOnExhausted(true);
// Create retry listeners, which observe the retry lifecycle (open, close, error); none by default, customizable
RetryListener[] retryListeners = this.retryListenerFactory.createRetryListeners(serviceName);
if (retryListeners != null && retryListeners.length != 0) {
template.setListeners(retryListeners);
}
// Set the retry policy: wrap Ribbon's retry policy, the current request, the load balancer and the service name in an InterceptorRetryPolicy
// This adapts Spring Cloud's retry mechanism to Spring Retry and delegates RestTemplate's retry decisions to Ribbon's retry policy
template.setRetryPolicy(!lbProperties.isEnabled() || retryPolicy == null ?
new NeverRetryPolicy(): new InterceptorRetryPolicy(request, retryPolicy, loadBalancer, serviceName));
// Driven by the RetryPolicy, RetryTemplate keeps invoking RetryCallback#doWithRetry(context) while retrying is allowed
return template.execute(
new RetryCallback<ClientHttpResponse, IOException>() {
@Override
// The operation executed under retry, i.e. the HTTP request
public ClientHttpResponse doWithRetry(RetryContext context)throws IOException {
ServiceInstance serviceInstance = null;
if (context instanceof LoadBalancedRetryContext) {
LoadBalancedRetryContext lbContext = (LoadBalancedRetryContext) context;
serviceInstance = lbContext.getServiceInstance();
}
// On the first attempt no instance has been chosen yet, so ask the load balancer for one; later attempts read it from the LoadBalancedRetryContext
if (serviceInstance == null) {
serviceInstance = loadBalancer.choose(serviceName);
}
ClientHttpResponse response = RetryLoadBalancerInterceptor.this.loadBalancer.execute(
serviceName, serviceInstance,requestFactory.createRequest(request, body, execution));
int statusCode = response.getRawStatusCode();
if (retryPolicy != null && retryPolicy.retryableStatusCode(statusCode)) {
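// Copy the response body under a separate name so it does not shadow the request body parameter
byte[] bodyCopy = StreamUtils.copyToByteArray(response.getBody());
response.close();
throw new ClientHttpResponseStatusCodeException(serviceName, response, bodyCopy);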
}
return response;
}
},
new RibbonRecoveryCallback<ClientHttpResponse, ClientHttpResponse>() {
@Override
protected ClientHttpResponse createResponse(ClientHttpResponse response, URI uri) {
return response;
}
}
);
}
}
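The Spring Cloud retry policy, the request, the service name and the load balancer are wrapped into a Spring RetryPolicy and set on the RetryTemplate. RetryTemplate then uses that RetryPolicy to decide whether to invoke RetryCallback#doWithRetry(context) again.
RetryTemplate: retrying the request
Next, how RetryTemplate executes a retryable request: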
public class RetryTemplate implements RetryOperations {
private volatile RetryPolicy retryPolicy;
public final <T, E extends Throwable> T execute(RetryCallback<T, E> retryCallback,
RecoveryCallback<T> recoveryCallback) throws E {
return doExecute(retryCallback, recoveryCallback, null);
}
// The retry template's retry loop
protected <T, E extends Throwable> T doExecute(RetryCallback<T, E> retryCallback,
RecoveryCallback<T> recoveryCallback, RetryState state)
throws E, ExhaustedRetryException {
......
/*
* We allow the whole loop to be skipped if the policy or context already
* forbid the first try. This is used in the case of external retry to allow a
* recovery in handleRetryExhausted without the callback processing (which
* would throw an exception).
*/
while (canRetry(retryPolicy, context) && !context.isExhaustedOnly()) {
try {
if (this.logger.isDebugEnabled()) {
this.logger.debug("Retry: count=" + context.getRetryCount());
}
// Reset the last exception, so if we are successful
// the close interceptors will not think we failed...
lastException = null;
// Execute the HTTP request; a failed attempt throws an exception, which is caught below
return retryCallback.doWithRetry(context);
}
catch (Throwable e) {
lastException = e;
try {
// Register the caught exception with the retry policy
registerThrowable(retryPolicy, state, context, e);
}
catch (Exception ex) {
throw new TerminatedRetryException("Could not register throwable",
ex);
}
finally {
doOnErrorInterceptors(retryCallback, context, e);
}
......
}
}
......
}
// Spring Cloud's retry policy decides whether to continue
protected boolean canRetry(RetryPolicy retryPolicy, RetryContext context) {
return retryPolicy.canRetry(context);
}
// Register the exception
protected void registerThrowable(RetryPolicy retryPolicy, RetryState state,
RetryContext context, Throwable e) {
retryPolicy.registerThrowable(context, e);
registerContext(context, state);
}
}
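Each failed attempt throws an exception, which is caught and registered (processed), and then the loop continues. The registration step must therefore update the state that decides whether to retry, so that the next iteration evaluates canRetry against the latest retry context.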
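The interplay between the loop and the policy can be reduced to a few lines of plain Java. The following is only a minimal sketch, not Spring Retry's implementation: SketchRetryPolicy, MaxAttemptsPolicy and RetryLoopSketch are hypothetical names, and the policy simply counts failures.

```java
import java.util.function.Supplier;

/** Hypothetical policy contract, reduced to the two calls the loop depends on. */
interface SketchRetryPolicy {
    boolean canRetry();
    void registerThrowable(Throwable t);
}

/** Allows attempts until a fixed number of failures has been registered. */
final class MaxAttemptsPolicy implements SketchRetryPolicy {
    private final int maxAttempts;
    private int failures = 0;

    MaxAttemptsPolicy(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    @Override
    public boolean canRetry() {
        return failures < maxAttempts;
    }

    // Registering a failure is what changes the answer of canRetry().
    @Override
    public void registerThrowable(Throwable t) {
        failures++;
    }

    int failures() {
        return failures;
    }
}

final class RetryLoopSketch {
    /** Runs the callback until it succeeds or the policy refuses another attempt. */
    static <T> T execute(Supplier<T> callback, SketchRetryPolicy policy) {
        RuntimeException last = null;
        while (policy.canRetry()) {
            try {
                return callback.get();
            } catch (RuntimeException e) {
                last = e;
                policy.registerThrowable(e); // updates the retry state
            }
        }
        if (last != null) {
            throw last; // retries exhausted: surface the last failure
        }
        throw new IllegalStateException("policy forbade the first attempt");
    }
}
```

In this sketch the loop itself never counts anything; the only way it terminates on failure is that registerThrowable changes what canRetry returns, which is the same division of labour the RetryTemplate code relies on.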
First, the exception handling for a failed attempt:
public class InterceptorRetryPolicy implements RetryPolicy {
private LoadBalancedRetryPolicy policy;
......
@Override
public void registerThrowable(RetryContext context, Throwable throwable) {
LoadBalancedRetryContext lbContext = (LoadBalancedRetryContext) context;
//this is important as it registers the last exception in the context and also increases the retry count
lbContext.registerThrowable(throwable);
//let the policy know about the exception as well
policy.registerThrowable(lbContext, throwable);
}
}
public class RetryContextSupport extends AttributeAccessorSupport implements RetryContext {
// Record the last exception, which is thrown if the retries finally fail; also count the attempts, which is handy for logging
public void registerThrowable(Throwable throwable) {
this.lastException = throwable;
if (throwable != null)
count++;
}
}
public class RibbonLoadBalancedRetryPolicy implements LoadBalancedRetryPolicy {
private int sameServerCount = 0;
private int nextServerCount = 0;
private String serviceId;
@Override
public void registerThrowable(LoadBalancedRetryContext context, Throwable throwable) {
//if this is a circuit tripping exception then notify the load balancer
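// If this is a circuit-tripping exception such as SocketException, notify the load balancer so that it is less likely to pick this server again, since the server may be unreachable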
if (lbContext.getRetryHandler().isCircuitTrippingException(throwable)) {
updateServerInstanceStats(context);
}
//Check if we need to ask the load balancer for a new server.
//Do this before we increment the counters because the first call to this method
//is not a retry it is just an initial failure.
// If the same server cannot be retried but another server can, ask the load balancer for a new server and store it in the context
if(!canRetrySameServer(context) && canRetryNextServer(context)) {
context.setServiceInstance(loadBalanceChooser.choose(serviceId));
}
//This method is called regardless of whether we are retrying or making the first request.
//Since we do not count the initial request in the retry count we don't reset the counter
//until we actually equal the same server count limit. This will allow us to make the initial
//request plus the right number of retries.
// Update the retry count for the current server and the number of servers tried so far
if(sameServerCount >= lbContext.getRetryHandler().getMaxRetriesOnSameServer() && canRetry(context)) {
//reset same server since we are moving to a new server
sameServerCount = 0;
nextServerCount++;
if(!canRetryNextServer(context)) {
context.setExhaustedOnly();
}
} else {
sameServerCount++;
}
}
}
While handling the failure, RibbonLoadBalancedRetryPolicy checks whether the exception is a circuit-tripping exception. That logic comes first:
DefaultLoadBalancerRetryHandler
public class DefaultLoadBalancerRetryHandler implements RetryHandler {
private List<Class<? extends Throwable>> circuitRelated =
Lists.<Class<? extends Throwable>>newArrayList(SocketException.class, SocketTimeoutException.class);
/**
* @return true if {@link SocketException} or {@link SocketTimeoutException} is a cause in the Throwable.
*/
@Override
// Whether the exception is one that trips the circuit
public boolean isCircuitTrippingException(Throwable e) {
return Utils.isPresentAsCause(e, getCircuitRelatedExceptions());
}
protected List<Class<? extends Throwable>> getCircuitRelatedExceptions() {
return circuitRelated;
}
}
public class Utils {
public static boolean isPresentAsCause(Throwable throwableToSearchIn,
Collection<Class<? extends Throwable>> throwableToSearchFor) {
int infiniteLoopPreventionCounter = 10;
while (throwableToSearchIn != null && infiniteLoopPreventionCounter > 0) {
infiniteLoopPreventionCounter--;
for (Class<? extends Throwable> c: throwableToSearchFor) {
if (c.isAssignableFrom(throwableToSearchIn.getClass())) {
return true;
}
}
throwableToSearchIn = throwableToSearchIn.getCause();
}
return false;
}
}
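The first step checks whether the failure is a circuit-tripping exception: the exception itself, or any cause in its chain (up to 10 levels deep), must be an instance of SocketException or SocketTimeoutException.
SocketException has subclasses in java.net such as ConnectException, BindException, NoRouteToHostException and PortUnreachableException; SocketTimeoutException has no subclasses in the JDK. Any of these exceptions trips the circuit. They indicate that connecting to the server failed, so the server may be unavailable, and its state is updated so that later load-balancing decisions tend to avoid it and the service stays available.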
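As an illustration of the cause-chain check, here is a small self-contained sketch in plain Java. It mirrors the idea of Utils.isPresentAsCause but is not Ribbon's code; the class name CircuitTrippingSketch and its method are hypothetical.

```java
import java.net.SocketException;
import java.net.SocketTimeoutException;
import java.util.Arrays;
import java.util.List;

final class CircuitTrippingSketch {
    // The two exception types treated as circuit-related in this sketch.
    private static final List<Class<? extends Throwable>> CIRCUIT_RELATED =
            Arrays.<Class<? extends Throwable>>asList(
                    SocketException.class, SocketTimeoutException.class);

    /** Walks at most 10 levels of the cause chain looking for a circuit-related type. */
    static boolean isCircuitTripping(Throwable t) {
        int depthBudget = 10; // guards against cyclic cause chains
        while (t != null && depthBudget-- > 0) {
            for (Class<? extends Throwable> type : CIRCUIT_RELATED) {
                if (type.isInstance(t)) {
                    return true;
                }
            }
            t = t.getCause();
        }
        return false;
    }
}
```

A ConnectException wrapped in a RuntimeException is therefore still detected, because the check follows getCause(), while an unrelated exception such as IllegalArgumentException is not.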
When the exception does trip the circuit, the server instance's statistics are updated:
RibbonLoadBalancedRetryPolicy: the retry policy
public class RibbonLoadBalancedRetryPolicy implements LoadBalancedRetryPolicy {
private void updateServerInstanceStats(LoadBalancedRetryContext context) {
ServiceInstance serviceInstance = context.getServiceInstance();
if (serviceInstance instanceof RibbonServer) {
Server lbServer = ((RibbonServer)serviceInstance).getServer();
ServerStats serverStats = lbContext.getServerStats(lbServer);
// Increment the count of successive connection failures
serverStats.incrementSuccessiveConnectionFailureCount();
// Increment the total failure count
serverStats.addToFailureCount();
LOGGER.debug(lbServer.getHostPort() + " RetryCount: " + context.getRetryCount()
+ " Successive Failures: " + serverStats.getSuccessiveConnectionFailureCount()
+ " CircuitBreakerTripped:" + serverStats.isCircuitBreakerTripped());
}
}
}
// Ribbon's per-server statistics class
public class ServerStats {
private final DynamicIntProperty connectionFailureThreshold; // connection failure threshold, default 3
private final DynamicIntProperty circuitTrippedTimeoutFactor; // circuit-trip timeout factor in seconds, default 10
private final DynamicIntProperty maxCircuitTrippedTimeout; // maximum circuit-trip timeout, default 30 s
public ServerStats() {
connectionFailureThreshold = DynamicPropertyFactory.getInstance().getIntProperty(
"niws.loadbalancer.default.connectionFailureCountThreshold", 3);
circuitTrippedTimeoutFactor = DynamicPropertyFactory.getInstance().getIntProperty(
"niws.loadbalancer.default.circuitTripTimeoutFactorSeconds", 10);
maxCircuitTrippedTimeout = DynamicPropertyFactory.getInstance().getIntProperty(
"niws.loadbalancer.default.circuitTripMaxTimeoutSeconds", 30);
}
// Increment the successive connection failure count
public void incrementSuccessiveConnectionFailureCount() {
lastConnectionFailedTimestamp = System.currentTimeMillis();
successiveConnectionFailureCount.incrementAndGet();
// Accumulate the total blackout time
totalCircuitBreakerBlackOutPeriod.addAndGet(getCircuitBreakerBlackoutPeriod());
}
private long getCircuitBreakerBlackoutPeriod() {
// Successive connection failures so far
int failureCount = successiveConnectionFailureCount.get();
// Connection failure threshold
int threshold = connectionFailureThreshold.get();
if (failureCount < threshold) {
return 0;
}
// How far the failure count exceeds the threshold, capped at 16
int diff = (failureCount - threshold) > 16 ? 16 : (failureCount - threshold);
// Blackout time in seconds: 2^diff * circuit-trip timeout factor (10); this is how long the server is banned
int blackOutSeconds = (1 << diff) * circuitTrippedTimeoutFactor.get();
// The blackout time cannot exceed 30 s
if (blackOutSeconds > maxCircuitTrippedTimeout.get()) {
blackOutSeconds = maxCircuitTrippedTimeout.get();
}
return blackOutSeconds * 1000L;
}
// Whether the server is currently blacked out
public boolean isCircuitBreakerTripped(long currentTime) {
long circuitBreakerTimeout = getCircuitBreakerTimeout();
if (circuitBreakerTimeout <= 0) {
return false;
}
// Still tripped as long as the blackout deadline lies after the current time
return circuitBreakerTimeout > currentTime;
}
// The time at which the server's blackout ends
private long getCircuitBreakerTimeout() {
long blackOutPeriod = getCircuitBreakerBlackoutPeriod();
if (blackOutPeriod <= 0) {
return 0;
}
return lastConnectionFailedTimestamp + blackOutPeriod;
}
}
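Retry exception handling ties into Ribbon's server selection: once a server has accumulated 3 or more successive connection failures, it is blacked out for a period and skipped by the next load-balancing decisions. The successive-failure count is cleared only when a request succeeds after the blackout ends; until then, every further failure puts the server back into blackout.
Ribbon's default load-balancing rule is a composite one: it prefers servers that are in the same zone and available, where available means the server is not blacked out.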
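The blackout arithmetic from getCircuitBreakerBlackoutPeriod can be worked through with a small standalone sketch. The class BlackoutSketch and its methods are hypothetical names; the defaults (threshold 3, factor 10 s, maximum 30 s) are the ones quoted in the ServerStats listing.

```java
final class BlackoutSketch {
    /**
     * Blackout period in milliseconds for a given number of successive failures:
     * 0 below the threshold, otherwise 2^diff * factor seconds, capped at maxSeconds.
     */
    static long blackoutMillis(int successiveFailures, int threshold,
                               int factorSeconds, int maxSeconds) {
        if (successiveFailures < threshold) {
            return 0L;
        }
        int diff = Math.min(successiveFailures - threshold, 16); // same cap as in the listing
        int seconds = Math.min((1 << diff) * factorSeconds, maxSeconds);
        return seconds * 1000L;
    }

    /** Same calculation with the default values quoted in the listing. */
    static long withDefaults(int successiveFailures) {
        return blackoutMillis(successiveFailures, 3, 10, 30);
    }
}
```

With these defaults, 2 successive failures give no blackout, the 3rd gives 10 s, the 4th 20 s, and from the 5th on the 30 s cap applies.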
RibbonLoadBalancedRetryPolicy
After handling the exception, the next question is whether to keep retrying, and whether to retry the current server or move on to the next one:
public class RibbonLoadBalancedRetryPolicy implements LoadBalancedRetryPolicy {
public void registerThrowable(LoadBalancedRetryContext context, Throwable throwable) {
......
//Check if we need to ask the load balancer for a new server.
//Do this before we increment the counters because the first call to this method
//is not a retry it is just an initial failure.
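// If the current server cannot be retried but another one can, ask the load balancer for a new server. It is not guaranteed to be a different one; the same server may be chosen again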
if(!canRetrySameServer(context) && canRetryNextServer(context)) {
context.setServiceInstance(loadBalanceChooser.choose(serviceId));
}
......
}
// Whether the current server can be retried
public boolean canRetrySameServer(LoadBalancedRetryContext context) {
return sameServerCount < lbContext.getRetryHandler().getMaxRetriesOnSameServer() && canRetry(context);
}
// Whether the next server can be tried
public boolean canRetryNextServer(LoadBalancedRetryContext context) {
//this will be called after a failure occurs and we increment the counter
//so we check that the count is less than or equals to too make sure
//we try the next server the right number of times
return nextServerCount <= lbContext.getRetryHandler().getMaxRetriesOnNextServer() && canRetry(context);
}
public boolean canRetry(LoadBalancedRetryContext context) {
HttpMethod method = context.getRequest().getMethod();
return HttpMethod.GET == method || lbContext.isOkToRetryOnAllOperations();
}
}
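To see how the two counters bound the total number of attempts, the sketch below replays the counter logic of registerThrowable and canRetryNextServer in plain Java. It assumes that every attempt fails and that the request is retryable (canRetry(context) is true). RetryBudgetSketch and countAttempts are hypothetical names, not part of Spring Cloud or Ribbon.

```java
final class RetryBudgetSketch {
    private final int maxRetriesOnSameServer;
    private final int maxRetriesOnNextServer;
    private int sameServerCount = 0;
    private int nextServerCount = 0;
    private boolean exhausted = false;

    RetryBudgetSketch(int maxRetriesOnSameServer, int maxRetriesOnNextServer) {
        this.maxRetriesOnSameServer = maxRetriesOnSameServer;
        this.maxRetriesOnNextServer = maxRetriesOnNextServer;
    }

    // Same comparison as in the listing: "less than or equal" on purpose.
    private boolean canRetryNextServer() {
        return nextServerCount <= maxRetriesOnNextServer;
    }

    // Counter update performed after each failed attempt.
    private void registerFailure() {
        if (sameServerCount >= maxRetriesOnSameServer) {
            sameServerCount = 0;   // moving on to another server
            nextServerCount++;
            if (!canRetryNextServer()) {
                exhausted = true;  // no server budget left
            }
        } else {
            sameServerCount++;
        }
    }

    /** Number of attempts made when every single attempt fails. */
    int attemptsWhenEveryCallFails() {
        int attempts = 0;
        while (canRetryNextServer() && !exhausted) {
            attempts++;
            registerFailure();
        }
        return attempts;
    }

    static int countAttempts(int maxRetriesOnSameServer, int maxRetriesOnNextServer) {
        return new RetryBudgetSketch(maxRetriesOnSameServer, maxRetriesOnNextServer)
                .attemptsWhenEveryCallFails();
    }
}
```

Under these assumptions the total works out to (MaxAutoRetries + 1) * (MaxAutoRetriesNextServer + 1) attempts, for example 4 attempts when both limits are 1.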
These checks rely on the limits held by the load-balancer context (lbContext) and its RetryHandler. Here is where those key parameters get their values:
LoadBalancerContext
public class LoadBalancerContext implements IClientConfigAware {
// Maximum number of additional servers to try
protected int maxAutoRetriesNextServer = DefaultClientConfigImpl.DEFAULT_MAX_AUTO_RETRIES_NEXT_SERVER;
// Maximum number of retries on each server
protected int maxAutoRetries = DefaultClientConfigImpl.DEFAULT_MAX_AUTO_RETRIES;
protected RetryHandler defaultRetryHandler = new DefaultLoadBalancerRetryHandler();
/**
* Set necessary parameters from client configuration and register with Servo monitors.
*/
@Override
public void initWithNiwsConfig(IClientConfig clientConfig) {
......
maxAutoRetries = clientConfig.getPropertyAsInteger(CommonClientConfigKey.MaxAutoRetries, DefaultClientConfigImpl.DEFAULT_MAX_AUTO_RETRIES);
maxAutoRetriesNextServer = clientConfig.getPropertyAsInteger(CommonClientConfigKey.MaxAutoRetriesNextServer,maxAutoRetriesNextServer);
okToRetryOnAllOperations = clientConfig.getPropertyAsBoolean(CommonClientConfigKey.OkToRetryOnAllOperations, okToRetryOnAllOperations);
defaultRetryHandler = new DefaultLoadBalancerRetryHandler(clientConfig);
......
}
}
public class DefaultLoadBalancerRetryHandler implements RetryHandler {
......
public DefaultLoadBalancerRetryHandler(IClientConfig clientConfig) {
this.retrySameServer = clientConfig.get(CommonClientConfigKey.MaxAutoRetries, DefaultClientConfigImpl.DEFAULT_MAX_AUTO_RETRIES);
this.retryNextServer = clientConfig.get(CommonClientConfigKey.MaxAutoRetriesNextServer, DefaultClientConfigImpl.DEFAULT_MAX_AUTO_RETRIES_NEXT_SERVER);
this.retryEnabled = clientConfig.get(CommonClientConfigKey.OkToRetryOnAllOperations, false);
}
@Override
public int getMaxRetriesOnSameServer() {
return retrySameServer;
}
@Override
public int getMaxRetriesOnNextServer() {
return retryNextServer;
}
}
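As the code shows, these retry parameters all come from IClientConfig, Ribbon's client configuration interface. It exposes many settings for load balancing, failure retry and server connections, and its default implementation is DefaultClientConfigImpl. Here is how Spring Cloud defines the DefaultClientConfigImpl bean: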
@Configuration
public class RibbonClientConfiguration {
@Value("${ribbon.client.name}")
private String name = "client"; // replaced at runtime by the injected value of ribbon.client.name
@Bean
@ConditionalOnMissingBean
public IClientConfig ribbonClientConfig() {
DefaultClientConfigImpl config = new DefaultClientConfigImpl();
// Load Ribbon's load-balancing configuration from the configuration properties
config.loadProperties(this.name);
config.set(CommonClientConfigKey.ConnectTimeout, DEFAULT_CONNECT_TIMEOUT);
config.set(CommonClientConfigKey.ReadTimeout, DEFAULT_READ_TIMEOUT);
return config;
}
}
At startup Spring Cloud instantiates a DefaultClientConfigImpl as Ribbon's configuration bean and loads every property whose name starts with "ribbon" and ends with one of the key names defined in CommonClientConfigKey:
DefaultClientConfigImpl
public class DefaultClientConfigImpl implements IClientConfig {
public static final int DEFAULT_MAX_AUTO_RETRIES = 0;
public static final int DEFAULT_BACKOFF_INTERVAL = 0;
/**
* Load properties for a given client. It first loads the default values for all properties,
* and any properties already defined with Archaius ConfigurationManager.
*/
@Override
public void loadProperties(String restClientName){
enableDynamicProperties = true;
setClientName(restClientName);
loadDefaultValues();
......
}
public void loadDefaultValues() {
......
// Load the maximum number of retries on a single server; property key: ribbon.MaxAutoRetries
putDefaultIntegerProperty(CommonClientConfigKey.MaxAutoRetries, getDefaultMaxAutoRetries());
// Load the maximum number of additional servers to try; property key: ribbon.MaxAutoRetriesNextServer
putDefaultIntegerProperty(CommonClientConfigKey.MaxAutoRetriesNextServer, getDefaultMaxAutoRetriesNextServer());
// Load whether all request types may be retried; property key: ribbon.OkToRetryOnAllOperations
putDefaultBooleanProperty(CommonClientConfigKey.OkToRetryOnAllOperations, getDefaultOkToRetryOnAllOperations());
......
}
}
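All of these settings have default values, which apply when nothing is defined in the configuration file. For example, the number of retries on the same server defaults to 0 (DEFAULT_MAX_AUTO_RETRIES), and the number of additional servers to try defaults to 1 (DEFAULT_MAX_AUTO_RETRIES_NEXT_SERVER).
DefaultClientConfigImpl has many more configurable parameters, each with its own meaning. Interested readers can look them up; they are not covered further here.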
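The way a retry setting falls back to its default can be modelled with java.util.Properties. This is only a sketch of the lookup order as I understand it (a client-specific key of the form <client>.ribbon.<Key>, then the global ribbon.<Key>, then the built-in default); it does not use Archaius, and RibbonPropertyLookupSketch, resolveInt and the client names "orders" and "users" are hypothetical.

```java
import java.util.Properties;

final class RibbonPropertyLookupSketch {
    /**
     * Resolves an integer setting: client-specific key first,
     * then the global "ribbon." key, then the supplied default.
     */
    static int resolveInt(Properties props, String clientName, String key, int defaultValue) {
        String value = props.getProperty(clientName + ".ribbon." + key);
        if (value == null) {
            value = props.getProperty("ribbon." + key);
        }
        return value == null ? defaultValue : Integer.parseInt(value.trim());
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("ribbon.MaxAutoRetries", "1");        // global setting
        props.setProperty("orders.ribbon.MaxAutoRetries", "2"); // override for one client

        System.out.println(resolveInt(props, "orders", "MaxAutoRetries", 0));
        System.out.println(resolveInt(props, "users", "MaxAutoRetries", 0));
        System.out.println(resolveInt(props, "users", "MaxAutoRetriesNextServer", 1));
    }
}
```

In this model the "orders" client sees its own override, the "users" client falls back to the global value, and a key that is set nowhere yields the default passed by the caller.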
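That concludes the source-level walkthrough of RestTemplate-based failure retry. Ribbon's native retry mechanism works in a similar way. Ribbon also ships components that integrate HttpClient, so Spring's RestTemplate is not required; the parameters are configured the same way, through the DefaultClientConfigImpl properties. Most of that logic lives in AbstractLoadBalancerAwareClient#executeWithLoadBalancer(…), which interested readers can explore on their own.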
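Summary:
Failure retry goes hand in hand with exception handling. Ribbon applies circuit breaking to socket-related exceptions: 3 successive connection failures black out the server for a period, so that later attempts tend to avoid it and the service stays highly available.
Failure retry covers both repeated retries against a single server and retries across multiple servers.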