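1. Problem description
In local testing, timed-out calls to the southbound service are counted as failures, and the circuit breaker opens once the threshold is reached. On the server, however, this does not take effect.
2. Troubleshooting
Remote debugging showed that the cause was executionTimeoutEnabled being false on the server. This was puzzling, because the default is true and nothing explicitly set it to false. The only fix available was to pass this option in from code and set it to true. The code trace went as follows:
Hystrix decides whether timeout handling is enabled in com.netflix.hystrix.AbstractCommand#executeCommandAndObserve.
The executionTimeoutEnabled setting is read from the properties field, and its value comes from com.netflix.hystrix.HystrixCommandProperties#executionTimeoutEnabled.
com.netflix.hystrix.HystrixCommandProperties#executionTimeoutEnabled is in turn initialized in the constructor when the HystrixCommandProperties object is built: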
this.executionTimeoutEnabled = getProperty(propertyPrefix, key, "execution.timeout.enabled", builder.getExecutionTimeoutEnabled(), default_executionTimeoutEnabled);
// known that we're using deprecated HystrixPropertiesChainedServoProperty until ChainedDynamicProperty exists in Archaius
protected HystrixCommandProperties(HystrixCommandKey key, HystrixCommandProperties.Setter builder, String propertyPrefix) {
    this.key = key;
    this.circuitBreakerEnabled = getProperty(propertyPrefix, key, "circuitBreaker.enabled", builder.getCircuitBreakerEnabled(), default_circuitBreakerEnabled);
    this.circuitBreakerRequestVolumeThreshold = getProperty(propertyPrefix, key, "circuitBreaker.requestVolumeThreshold", builder.getCircuitBreakerRequestVolumeThreshold(), default_circuitBreakerRequestVolumeThreshold);
    this.circuitBreakerSleepWindowInMilliseconds = getProperty(propertyPrefix, key, "circuitBreaker.sleepWindowInMilliseconds", builder.getCircuitBreakerSleepWindowInMilliseconds(), default_circuitBreakerSleepWindowInMilliseconds);
    this.circuitBreakerErrorThresholdPercentage = getProperty(propertyPrefix, key, "circuitBreaker.errorThresholdPercentage", builder.getCircuitBreakerErrorThresholdPercentage(), default_circuitBreakerErrorThresholdPercentage);
    this.circuitBreakerForceOpen = getProperty(propertyPrefix, key, "circuitBreaker.forceOpen", builder.getCircuitBreakerForceOpen(), default_circuitBreakerForceOpen);
    this.circuitBreakerForceClosed = getProperty(propertyPrefix, key, "circuitBreaker.forceClosed", builder.getCircuitBreakerForceClosed(), default_circuitBreakerForceClosed);
    this.executionIsolationStrategy = getProperty(propertyPrefix, key, "execution.isolation.strategy", builder.getExecutionIsolationStrategy(), default_executionIsolationStrategy);
    //this property name is now misleading. //TODO figure out a good way to deprecate this property name
    this.executionTimeoutInMilliseconds = getProperty(propertyPrefix, key, "execution.isolation.thread.timeoutInMilliseconds", builder.getExecutionIsolationThreadTimeoutInMilliseconds(), default_executionTimeoutInMilliseconds);
    this.executionTimeoutEnabled = getProperty(propertyPrefix, key, "execution.timeout.enabled", builder.getExecutionTimeoutEnabled(), default_executionTimeoutEnabled);
    // ... (rest of the constructor omitted in this excerpt)
}
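To make the role of the builder argument in getProperty more concrete, here is a minimal sketch of the lookup order as I understand it from the Hystrix configuration documentation: a dynamic per-command property wins, then the value from the Setter, then the dynamic global default, then the hard-coded default. The class and method names are hypothetical, and a plain Map stands in for the dynamic property source; this is not the library's actual implementation.

```java
import java.util.Map;

// Hypothetical stand-in for the precedence applied by getProperty(...).
final class PropertyPrecedenceSketch {

    // Resolves a boolean setting such as "execution.timeout.enabled".
    static boolean resolve(Map<String, String> dynamicProps, String commandKey,
                           String name, Boolean builderValue, boolean hardCodedDefault) {
        // 1. Dynamic per-command property, e.g. hystrix.command.<key>.<name>
        String instance = dynamicProps.get("hystrix.command." + commandKey + "." + name);
        if (instance != null) {
            return Boolean.parseBoolean(instance);
        }
        // 2. Value passed in code through the Setter (null when not set)
        if (builderValue != null) {
            return builderValue;
        }
        // 3. Dynamic global default, e.g. hystrix.command.default.<name>
        String global = dynamicProps.get("hystrix.command.default." + name);
        if (global != null) {
            return Boolean.parseBoolean(global);
        }
        // 4. Hard-coded default from the library
        return hardCodedDefault;
    }
}
```

Under this ordering, a Setter value of true would override a global default of false coming from configuration, which is consistent with the fix described in these notes.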
From this we can see that the property is assigned when our code extends com.netflix.hystrix.HystrixObservableCommand and passes in a com.netflix.hystrix.HystrixCommandProperties.Setter object for initialization:
protected AbstractCommand(HystrixCommandGroupKey group, HystrixCommandKey key, HystrixThreadPoolKey threadPoolKey, HystrixCircuitBreaker circuitBreaker, HystrixThreadPool threadPool,
        HystrixCommandProperties.Setter commandPropertiesDefaults, HystrixThreadPoolProperties.Setter threadPoolPropertiesDefaults,
        HystrixCommandMetrics metrics, TryableSemaphore fallbackSemaphore, TryableSemaphore executionSemaphore,
        HystrixPropertiesStrategy propertiesStrategy, HystrixCommandExecutionHook executionHook) {
    this.commandGroup = initGroupKey(group);
    this.commandKey = initCommandKey(key, getClass());
    this.properties = initCommandProperties(this.commandKey, propertiesStrategy, commandPropertiesDefaults);
    // ... (rest of the constructor omitted in this excerpt)
}
So the timeout switch is turned on in code with setter.withExecutionTimeoutEnabled(true);.
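A minimal sketch of what this can look like in a command class, assuming Hystrix and RxJava 1.x are on the classpath. The class name SouthboundCommand, the group key "SouthboundGroup", the 1000 ms timeout, and the placeholder construct() body are all hypothetical and not taken from the original project.

```java
import com.netflix.hystrix.HystrixCommandGroupKey;
import com.netflix.hystrix.HystrixCommandProperties;
import com.netflix.hystrix.HystrixObservableCommand;
import rx.Observable;

// Hypothetical command wrapping a southbound service call.
public class SouthboundCommand extends HystrixObservableCommand<String> {

    public SouthboundCommand() {
        super(HystrixObservableCommand.Setter
                .withGroupKey(HystrixCommandGroupKey.Factory.asKey("SouthboundGroup"))
                .andCommandPropertiesDefaults(HystrixCommandProperties.Setter()
                        // Explicitly enable the timeout check instead of relying on the default
                        .withExecutionTimeoutEnabled(true)
                        // Assumed timeout value, for illustration only
                        .withExecutionTimeoutInMilliseconds(1000)));
    }

    @Override
    protected Observable<String> construct() {
        // Placeholder for the real southbound call
        return Observable.just("ok");
    }
}
```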
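How Hystrix detects timeouts
Hystrix implements the timeout check with a com.netflix.hystrix.util.HystrixTimer.TimerListener object created in com.netflix.hystrix.AbstractCommand.HystrixObservableTimeoutOperator#call. The listener is registered with a ScheduledExecutor, which starts a timer through java.util.concurrent.ScheduledThreadPoolExecutor#scheduleAtFixedRate, and the timer determines whether the service call has timed out. The timeout passed to the timer is read from the command's properties field.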
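The timer-based check can be sketched with the JDK alone. This is a simplified illustration, not Hystrix's code: the class name, the Status enum, and the use of Thread.sleep as a stand-in for the remote call are all assumptions. The timer tick and the finishing call race to change the status with compareAndSet, and whichever gets there first decides the outcome.

```java
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch of a scheduled tick that marks a call as timed out.
final class TimeoutSketch {

    enum Status { NOT_EXECUTED, COMPLETED, TIMED_OUT }

    // Simulates a call taking workMillis, guarded by a timer firing after timeoutMillis.
    static Status run(long workMillis, long timeoutMillis) {
        ScheduledThreadPoolExecutor timer = new ScheduledThreadPoolExecutor(1);
        AtomicReference<Status> status = new AtomicReference<>(Status.NOT_EXECUTED);
        // The tick only wins if the call has not completed yet.
        ScheduledFuture<?> tick = timer.scheduleAtFixedRate(
                () -> status.compareAndSet(Status.NOT_EXECUTED, Status.TIMED_OUT),
                timeoutMillis, timeoutMillis, TimeUnit.MILLISECONDS);
        try {
            Thread.sleep(workMillis); // stands in for the remote service call
            // Completion only wins if the timer has not fired yet.
            status.compareAndSet(Status.NOT_EXECUTED, Status.COMPLETED);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            tick.cancel(false);
            timer.shutdownNow();
        }
        return status.get();
    }
}
```

Calling TimeoutSketch.run(300, 50) lets the timer fire well before the simulated call finishes, while TimeoutSketch.run(10, 500) finishes long before the first tick.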
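Tracing the assignment of the properties field shows that it comes from com.netflix.hystrix.strategy.properties.HystrixPropertiesFactory#getCommandProperties.
com.netflix.hystrix.strategy.properties.HystrixPropertiesFactory#getCommandProperties looks in a cache first, and only builds the properties from the passed-in com.netflix.hystrix.HystrixCommandProperties.Setter object when the cache has no entry.
This leads to another problem: why properties set on a newly created HystrixCommand do not take effect. The reason is that the cached value takes priority. The officially provided way to update configuration dynamically is to load it through Archaius.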
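The cache-first behaviour can be illustrated with a small JDK-only sketch. The class and field names are hypothetical, and the real factory derives its cache key through the properties strategy rather than taking a plain string. The point is only that the first Setter value seen for a command key is the one that sticks.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a "first caller wins" properties cache.
final class PropertiesCacheSketch {

    // Minimal stand-in for a properties object built from a Setter.
    static final class Props {
        final boolean executionTimeoutEnabled;

        Props(boolean executionTimeoutEnabled) {
            this.executionTimeoutEnabled = executionTimeoutEnabled;
        }
    }

    private static final ConcurrentHashMap<String, Props> CACHE = new ConcurrentHashMap<>();

    // Returns the cached properties for the key, or builds them from the builder value on a miss.
    static Props getCommandProperties(String commandKey, boolean builderTimeoutEnabled) {
        Props cached = CACHE.get(commandKey);
        if (cached != null) {
            return cached; // later builder values are ignored
        }
        Props created = new Props(builderTimeoutEnabled);
        Props existing = CACHE.putIfAbsent(commandKey, created);
        return existing != null ? existing : created;
    }
}
```

In this sketch, a second call for the same key with a different builder value returns the object created by the first call.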
Official example of dynamically updating Hystrix configuration: https://github.com/Netflix/archaius/wiki/Users-Guide
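As a rough illustration of the dynamic route, the snippet below sets a per-command property through Archaius at runtime. It assumes archaius-core is on the classpath (Hystrix depends on it), and the command key SouthboundCommand is hypothetical; it is a sketch, not the configuration used in the original project.

```java
import com.netflix.config.ConfigurationManager;

// Hypothetical example of overriding a Hystrix command property at runtime via Archaius.
public class EnableTimeoutAtRuntime {

    public static void main(String[] args) {
        // Property name pattern: hystrix.command.<HystrixCommandKey>.execution.timeout.enabled
        // "SouthboundCommand" is an assumed command key.
        ConfigurationManager.getConfigInstance().setProperty(
                "hystrix.command.SouthboundCommand.execution.timeout.enabled", true);
    }
}
```

Because the cached properties object reads its values through dynamic properties, a change made this way should be picked up without recreating the command's cached properties.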