Hystrix熔断器的理解
Hystrix熔断器的理解
最近想要做一个滑动窗口的统计失败率触发告警,想用Hystrix来扩展下,发现Hystrix只要发现异常就会熔断器就会处于半开状态,同时返回fallback方法,而我的需求是达到失败率才会触发fallback方法,同时在HystrixCircuitBreaker提供对外的接口中没有close的状态返回,Hystrix缺少了一种自定义的异常既可以统计失败率却不会立即触发fallback方法。只能研究下自己去实现了,很多是在网上看到别人的一些转载过来。
参考:
https://blog.csdn.net/alex_xfboy/article/details/88720949
https://www.jianshu.com/p/6e9619cbdfc3?tdsourcetag=s_pctim_aiomsg
/* package */class HystrixCircuitBreakerImpl implements HystrixCircuitBreaker {
private final HystrixCommandProperties properties;
private final HystrixCommandMetrics metrics;
enum Status {
CLOSED, OPEN, HALF_OPEN;
}
private final AtomicReference<Status> status = new AtomicReference<Status>(Status.CLOSED);
private final AtomicLong circuitOpened = new AtomicLong(-1);
private final AtomicReference<Subscription> activeSubscription = new AtomicReference<Subscription>(null);
protected HystrixCircuitBreakerImpl(HystrixCommandKey key, HystrixCommandGroupKey commandGroup, final HystrixCommandProperties properties, HystrixCommandMetrics metrics) {
this.properties = properties;
this.metrics = metrics;
//On a timer, this will set the circuit between OPEN/CLOSED as command executions occur
Subscription s = subscribeToStream();
activeSubscription.set(s);
}
private Subscription subscribeToStream() {
/*
* This stream will recalculate the OPEN/CLOSED status on every onNext from the health stream
*/
return metrics.getHealthCountsStream()
.observe()
.subscribe(new Subscriber<HealthCounts>() {
@Override
public void onCompleted() {
}
@Override
public void onError(Throwable e) {
}
@Override
public void onNext(HealthCounts hc) {
// check if we are past the statisticalWindowVolumeThreshold
if (hc.getTotalRequests() < properties.circuitBreakerRequestVolumeThreshold().get()) {
// we are not past the minimum volume threshold for the stat window,
// so no change to circuit status.
// if it was CLOSED, it stays CLOSED
// if it was half-open, we need to wait for a successful command execution
// if it was open, we need to wait for sleep window to elapse
} else {
if (hc.getErrorPercentage() < properties.circuitBreakerErrorThresholdPercentage().get()) {
//we are not past the minimum error threshold for the stat window,
// so no change to circuit status.
// if it was CLOSED, it stays CLOSED
// if it was half-open, we need to wait for a successful command execution
// if it was open, we need to wait for sleep window to elapse
} else {
// our failure rate is too high, we need to set the state to OPEN
// 这里是完全打开断路器,就是我想拿到的状态
if (status.compareAndSet(Status.CLOSED, Status.OPEN)) {
circuitOpened.set(System.currentTimeMillis());
}
}
}
}
});
}
@Override
public void markSuccess() {
if (status.compareAndSet(Status.HALF_OPEN, Status.CLOSED)) {
//This thread wins the race to close the circuit - it resets the stream to start it over from 0
metrics.resetStream();
Subscription previousSubscription = activeSubscription.get();
if (previousSubscription != null) {
previousSubscription.unsubscribe();
}
Subscription newSubscription = subscribeToStream();
activeSubscription.set(newSubscription);
circuitOpened.set(-1L);
}
}
@Override
public void markNonSuccess() {
if (status.compareAndSet(Status.HALF_OPEN, Status.OPEN)) {
//This thread wins the race to re-open the circuit - it resets the start time for the sleep window
circuitOpened.set(System.currentTimeMillis());
}
}
@Override
public boolean isOpen() {
if (properties.circuitBreakerForceOpen().get()) {
return true;
}
if (properties.circuitBreakerForceClosed().get()) {
return false;
}
//这里其实还是半开状态
return circuitOpened.get() >= 0;
}
@Override
public boolean allowRequest() {
if (properties.circuitBreakerForceOpen().get()) {
return false;
}
if (properties.circuitBreakerForceClosed().get()) {
return true;
}
if (circuitOpened.get() == -1) {
return true;
} else {
if (status.get().equals(Status.HALF_OPEN)) {
return false;
} else {
return isAfterSleepWindow();
}
}
}
private boolean isAfterSleepWindow() {
final long circuitOpenTime = circuitOpened.get();
final long currentTime = System.currentTimeMillis();
final long sleepWindowTime = properties.circuitBreakerSleepWindowInMilliseconds().get();
return currentTime > circuitOpenTime + sleepWindowTime;
}
@Override
public boolean attemptExecution() {
if (properties.circuitBreakerForceOpen().get()) {
return false;
}
if (properties.circuitBreakerForceClosed().get()) {
return true;
}
if (circuitOpened.get() == -1) {
return true;
} else {
if (isAfterSleepWindow()) {
if (status.compareAndSet(Status.OPEN, Status.HALF_OPEN)) {
//only the first request after sleep window should execute
return true;
} else {
return false;
}
} else {
return false;
}
}
}
1.HystrixCommandAspect 入口解析
spring-cloud-netflix-core包中SPI接口定义了
org.springframework.boot.autoconfigure.EnableAutoConfiguration=\
org.springframework.cloud.netflix.hystrix.HystrixAutoConfiguration,\
org.springframework.cloud.netflix.hystrix.security.HystrixSecurityAutoConfiguration
org.springframework.cloud.client.circuitbreaker.EnableCircuitBreaker=\
org.springframework.cloud.netflix.hystrix.HystrixCircuitBreakerConfiguration
当项目中添加了EnableCircuitBreaker注解会初始化HystrixCircuitBreakerConfiguration类,这个类会初始化HystrixCommandAspect类,这个类在hytrix-javanica.jar中,当maven引入
<dependency>
<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-starter-netflix-hystrix</artifactId>
</dependency>
会自动引入
HystrixCommandAspect是AOP会拦截包含HystrixCommand和HystrixCollapser的注解的方法
@Aspect
public class HystrixCommandAspect {
@Pointcut("@annotation(com.netflix.hystrix.contrib.javanica.annotation.HystrixCommand)")
public void hystrixCommandAnnotationPointcut() {
}
// 请求的合并
@Pointcut("@annotation(com.netflix.hystrix.contrib.javanica.annotation.HystrixCollapser)")
public void hystrixCollapserAnnotationPointcut() {
}
@Around("hystrixCommandAnnotationPointcut() || hystrixCollapserAnnotationPointcut()")
public Object methodsAnnotatedWithHystrixCommand(ProceedingJoinPoint joinPoint) throws Throwable {
Method method = AopUtils.getMethodFromTarget(joinPoint);
Validate.notNull(method, "failed to get method from joinPoint: %s", new Object[]{joinPoint});
if (method.isAnnotationPresent(HystrixCommand.class) && method.isAnnotationPresent(HystrixCollapser.class)) {
throw new IllegalStateException("method cannot be annotated with HystrixCommand and HystrixCollapser annotations at the same time");
} else {
HystrixCommandAspect.MetaHolderFactory metaHolderFactory = (HystrixCommandAspect.MetaHolderFactory)META_HOLDER_FACTORY_MAP.get(HystrixCommandAspect.HystrixPointcutType.of(method));
//如果是@HystrixCommand,最终调用CommandMetaHolderFactory.create创建metaHolder
MetaHolder metaHolder = metaHolderFactory.create(joinPoint);
// 创建HystrixInvokable,只是一个空接口,没有任何方法,只是用来标记具备可执行的能力
HystrixInvokable invokable = HystrixCommandFactory.getInstance().create(metaHolder);
ExecutionType executionType = metaHolder.isCollapserAnnotationPresent() ? metaHolder.getCollapserExecutionType() : metaHolder.getExecutionType();
try {
Object result;
if (!metaHolder.isObservable()) {
// 利用工具CommandExecutor来执行
result = CommandExecutor.execute(invokable, executionType, metaHolder);
} else {
result = this.executeObservable(invokable, executionType, metaHolder);
}
return result;
} catch (HystrixBadRequestException var9) {
throw var9.getCause();
} catch (HystrixRuntimeException var10) {
throw this.getCauseOrDefault(var10, var10);
}
}
}
核心方法
//创建Command
public class HystrixCommandFactory {
private static final HystrixCommandFactory INSTANCE = new HystrixCommandFactory();
private HystrixCommandFactory() {
}
public static HystrixCommandFactory getInstance() {
return INSTANCE;
}
public HystrixInvokable create(MetaHolder metaHolder) {
Object executable;
if (metaHolder.isCollapserAnnotationPresent()) {
executable = new CommandCollapser(metaHolder);
} else if (metaHolder.isObservable()) {//如果切入的方法是Observable类型
executable = new GenericObservableCommand(HystrixCommandBuilderFactory.getInstance().create(metaHolder));
} else {//比如:public String hello()方法
executable = new GenericCommand(HystrixCommandBuilderFactory.getInstance().create(metaHolder));
}
return (HystrixInvokable)executable;
}
}
//新建GenericCommand:有点熟悉了吧
public class GenericCommand extends AbstractHystrixCommand<Object> {
private static final Logger LOGGER = LoggerFactory.getLogger(GenericCommand.class);
public GenericCommand(HystrixCommandBuilder builder) {
super(builder);
}
// 执行具体的方法,如:serviceName的hello
protected Object run() throws Exception {
LOGGER.debug("execute command: {}", this.getCommandKey().name());
return this.process(new AbstractHystrixCommand<Object>.Action() {
Object execute() {
return GenericCommand.this.getCommandAction().execute(GenericCommand.this.getExecutionType());
}
});
}
// 执行fallback方法,如:serviceName的error()
protected Object getFallback() {
final CommandAction commandAction = this.getFallbackAction();
if (commandAction != null) {
try {
return this.process(new AbstractHystrixCommand<Object>.Action() {
Object execute() {
MetaHolder metaHolder = commandAction.getMetaHolder();
Object[] args = CommonUtils.createArgsForFallback(metaHolder, GenericCommand.this.getExecutionException());
return commandAction.executeWithArgs(metaHolder.getFallbackExecutionType(), args);
}
});
} catch (Throwable var3) {
LOGGER.error(FallbackErrorMessageBuilder.create().append(commandAction, var3).build());
throw new FallbackInvocationException(var3.getCause());
}
} else {
return super.getFallback();
}
}
//实现了HystrixExecutable的方法
public R execute() {
try {
return this.queue().get();
} catch (Exception var2) {
throw Exceptions.sneakyThrow(this.decomposeException(var2));
}
}
}
//HystrixInvokable(GenericCommand)委托CommandExecutor 来执行
public class CommandExecutor {
// 全文的关键方法
public static Object execute(HystrixInvokable invokable, ExecutionType executionType, MetaHolder metaHolder) throws RuntimeException {
Validate.notNull(invokable);
Validate.notNull(metaHolder);
switch(executionType) {
case SYNCHRONOUS:
// 首先将 invokable 转换为 HystrixExecutable,再执行 HystrixCommand的execute() 方法
return castToExecutable(invokable, executionType).execute();
case ASYNCHRONOUS://如果切入的目标方法是Future返回类型时
HystrixExecutable executable = castToExecutable(invokable, executionType);
if (metaHolder.hasFallbackMethodCommand() && ExecutionType.ASYNCHRONOUS == metaHolder.getFallbackExecutionType()) {
return new FutureDecorator(executable.queue());
}
return executable.queue();
case OBSERVABLE:
HystrixObservable observable = castToObservable(invokable);
return ObservableExecutionMode.EAGER == metaHolder.getObservableExecutionMode() ? observable.observe() : observable.toObservable();
default:
throw new RuntimeException("unsupported execution type: " + executionType);
}
}
// HystrixExecutable 的 execute() 方法由 HystrixCommand.execute() 实现
private static HystrixExecutable castToExecutable(HystrixInvokable invokable, ExecutionType executionType) {
if (invokable instanceof HystrixExecutable) {
return (HystrixExecutable)invokable;
} else {
throw new RuntimeException("Command should implement " + HystrixExecutable.class.getCanonicalName() + " interface to execute in: " + executionType + " mode");
}
}
}
public interface HystrixExecutable<R> extends HystrixInvokable<R> {
R execute();
Future<R> queue();
Observable<R> observe();
}
//AbstractHystrixCommand继承public abstract class HystrixCommand<R> extends AbstractCommand<R>
public Future<R> queue() {
/*
* The Future returned by Observable.toBlocking().toFuture() does not implement the
* interruption of the execution thread when the "mayInterrupt" flag of Future.cancel(boolean) is set to true;
* thus, to comply with the contract of Future, we must wrap around it.
*/
// toObservable()这个是super父类方法开启RxJava的观察者
final Future<R> delegate = toObservable().toBlocking().toFuture();
final Future<R> f = new Future<R>() {
@Override
public boolean cancel(boolean mayInterruptIfRunning) {
if (delegate.isCancelled()) {
return false;
}
if (HystrixCommand.this.getProperties().executionIsolationThreadInterruptOnFutureCancel().get()) {
/*
* The only valid transition here is false -> true. If there are two futures, say f1 and f2, created by this command
* (which is super-weird, but has never been prohibited), and calls to f1.cancel(true) and to f2.cancel(false) are
* issued by different threads, it's unclear about what value would be used by the time mayInterruptOnCancel is checked.
* The most consistent way to deal with this scenario is to say that if *any* cancellation is invoked with interruption,
* than that interruption request cannot be taken back.
*/
interruptOnFutureCancel.compareAndSet(false, mayInterruptIfRunning);
}
final boolean res = delegate.cancel(interruptOnFutureCancel.get());
if (!isExecutionComplete() && interruptOnFutureCancel.get()) {
final Thread t = executionThread.get();
if (t != null && !t.equals(Thread.currentThread())) {
t.interrupt();
}
}
return res;
}
}
2.HystrixInvokable
HystrixInvokable 很关键,它的实现类是 GenericCommand 。接下来看看类图,三个抽象父类 AbstractHystrixCommand、HystrixCommand、AbstractCommand 帮助 GenericCommand 做了不少公共的事情,而 GenericCommand 负责执行具体的方法和fallback时的方法。
3.AbstractCommand.toObservable
由于最终入口都是 toObservable(),就从 AbstractCommand的 Observable toObservable() 方法开始。Hystrix 使用观察者模式,Observable 即被观察者,被观察者些状态变更时,观察者可以做出各项响应。
源码实现比较复杂,可以用一张图解释,这种方式是“军师型”,排兵布阵,先创造了各个处理者,然后创造被观察者,再设置Observable发生各种情况时由谁来处理,完全掌控全局。
解释下Action0、Func1这种对象,他们与Action、Func和Runnable、Callable类似,是一个可以被执行的实体。Action没有返回值,Action0…ActionN表示有0…N个参数,Action0就表示没有参数;Func有返值,0…N一样表示参数。
简单的的来看就是:创建一个Observable,然后绑定各种事件对应的处理者,各类doOnXXXX,表示发生XXX事件时做什么事情。
abstract class AbstractCommand<R> implements HystrixInvokableInfo<R>, HystrixObservable<R> {
public Observable<R> toObservable() {
final AbstractCommand<R> _cmd = this;
// 命令执行结束后的清理者
final Action0 terminateCommandCleanup = new Action0() {...};
// 取消订阅时处理者
final Action0 unsubscribeCommandCleanup = new Action0() {...};
// 重点:Hystrix 核心逻辑: 断路器、隔离
final Func0<Observable<R>> applyHystrixSemantics = new Func0<Observable<R>>() {...};
// 发射数据(OnNext表示发射数据)时的Hook
final Func1<R, R> wrapWithAllOnNextHooks = new Func1<R, R>() {...};
// 命令执行完成的Hook
final Action0 fireOnCompletedHook = new Action0() {...};
// 通过Observable.defer()创建一个Observable
return Observable.defer(new Func0<Observable<R>>() {
@Override
public Observable<R> call() {
final boolean requestCacheEnabled = isRequestCachingEnabled();
final String cacheKey = getCacheKey();
// 首先尝试从请求缓存中获取结果
if (requestCacheEnabled) {
HystrixCommandResponseFromCache<R> fromCache = (HystrixCommandResponseFromCache<R>) requestCache.get(cacheKey);
if (fromCache != null) {
isResponseFromCache = true;
return handleRequestCacheHitAndEmitValues(fromCache, _cmd);
}
}
// 使用上面的Func0:applyHystrixSemantics 来创建Observable
Observable<R> hystrixObservable =
Observable.defer(applyHystrixSemantics)
.map(wrapWithAllOnNextHooks);
Observable<R> afterCache;
// 如果启用请求缓存,将Observable包装成HystrixCachedObservable并进行相关处理
if (requestCacheEnabled && cacheKey != null) {
HystrixCachedObservable<R> toCache = HystrixCachedObservable.from(hystrixObservable, _cmd);
...
} else {
afterCache = hystrixObservable;
}
// 返回Observable
return afterCache
.doOnTerminate(terminateCommandCleanup)
.doOnUnsubscribe(unsubscribeCommandCleanup)
.doOnCompleted(fireOnCompletedHook);
}
});
}
}
applyHystrixSemantics变量:Hystrix 核心逻辑: 断路器、隔离
// applyHystrixSemantics 是一个Func0(理解为执行实体或处理者),表示没有参数,返回值是Observable。
final Func0<Observable<R>> applyHystrixSemantics = new Func0<Observable<R>>() {
// Func0 做的事情如下
@Override
public Observable<R> call() {
// 如果未订阅,返回一个"哑炮" Observable, 即一个不会发射任何数据的Observable
if (commandState.get().equals(CommandState.UNSUBSCRIBED)) {
return Observable.never();
}
// 调用applyHystrixSemantics()来创建Observable
return applyHystrixSemantics(_cmd);
}
};
applyHystrixSemantics方法
// Semantics 译为语义, 应用Hystrix语义很拗口,其实就是应用Hystrix的断路器、隔离特性
private Observable<R> applyHystrixSemantics(final AbstractCommand<R> _cmd) {
// 源码中有很多executionHook、eventNotifier的操作,这是Hystrix拓展性的一种体现。这里面啥事也没做,留了个口子,开发人员可以拓展
executionHook.onStart(_cmd);
// 判断断路器是否开启
if (circuitBreaker.attemptExecution()) {
// 获取执行信号
final TryableSemaphore executionSemaphore = getExecutionSemaphore();
final AtomicBoolean semaphoreHasBeenReleased = new AtomicBoolean(false);
final Action0 singleSemaphoreRelease = new Action0() {...};
final Action1<Throwable> markExceptionThrown = new Action1<Throwable>() {...};
// 判断是否信号量拒绝
if (executionSemaphore.tryAcquire()) {
try {
// 重点:处理隔离策略和Fallback策略
return executeCommandAndObserve(_cmd)
.doOnError(markExceptionThrown)
.doOnTerminate(singleSemaphoreRelease)
.doOnUnsubscribe(singleSemaphoreRelease);
} catch (RuntimeException e) {
return Observable.error(e);
}
} else {
return handleSemaphoreRejectionViaFallback();
}
}
// 开启了断路器,执行Fallback
else {
return handleShortCircuitViaFallback();
}
}
executeCommandAndObserve方法:处理隔离策略和Fallback策略
// 处理隔离策略和各种Fallback
private Observable<R> executeCommandAndObserve(final AbstractCommand<R> _cmd) {
final HystrixRequestContext currentRequestContext = HystrixRequestContext.getContextForCurrentThread();
final Action1<R> markEmits = new Action1<R>() {...};
final Action0 markOnCompleted = new Action0() {...};
// 利用Func1获取处理Fallback的 Observable
final Func1<Throwable, Observable<R>> handleFallback = new Func1<Throwable, Observable<R>>() {
@Override
public Observable<R> call(Throwable t) {
circuitBreaker.markNonSuccess();
Exception e = getExceptionFromThrowable(t);
executionResult = executionResult.setExecutionException(e);
// 拒绝处理
if (e instanceof RejectedExecutionException) {
return handleThreadPoolRejectionViaFallback(e);
// 超时处理
} else if (t instanceof HystrixTimeoutException) {
return handleTimeoutViaFallback();
} else if (t instanceof HystrixBadRequestException) {
return handleBadRequestByEmittingError(e);
} else {
...
return handleFailureViaFallback(e);
}
}
};
final Action1<Notification<? super R>> setRequestContext ...
Observable<R> execution;
// 利用特定的隔离策略来处理
if (properties.executionTimeoutEnabled().get()) {
execution = executeCommandWithSpecifiedIsolation(_cmd)
.lift(new HystrixObservableTimeoutOperator<R>(_cmd));
} else {
execution = executeCommandWithSpecifiedIsolation(_cmd);
}
return execution.doOnNext(markEmits)
.doOnCompleted(markOnCompleted)
// 绑定Fallback的处理者
.onErrorResumeNext(handleFallback)
.doOnEach(setRequestContext);
}
executeCommandWithSpecifiedIsolation方法:隔离处理
private Observable<R> executeCommandWithSpecifiedIsolation(final AbstractCommand<R> _cmd) {
// 线程池隔离
if (properties.executionIsolationStrategy().get() == ExecutionIsolationStrategy.THREAD) {
// 再次使用 Observable.defer(), 通过执行Func0来得到Observable
return Observable.defer(new Func0<Observable<R>>() {
@Override
public Observable<R> call() {
// 收集metric信息
metrics.markCommandStart(commandKey, threadPoolKey, ExecutionIsolationStrategy.THREAD);
...
try {
... // 获取真正的用户Task
return getUserExecutionObservable(_cmd);
} catch (Throwable ex) {
return Observable.error(ex);
}
...
}
// 绑定各种处理者
}).doOnTerminate(new Action0() {...})
.doOnUnsubscribe(new Action0() {...})
// 绑定超时处理者
.subscribeOn(threadPool.getScheduler(new Func0<Boolean>() {
@Override
public Boolean call() {
return properties.executionIsolationThreadInterruptOnTimeout().get() && _cmd.isCommandTimedOut.get() == TimedOutStatus.TIMED_OUT;
}
}));
}
// 信号量隔离,和线程池大同小异,全部省略了
else {
return Observable.defer(new Func0<Observable<R>>() {...}
}
}
getUserExecutionObservable()方法:执行用户任务
private Observable<R> getUserExecutionObservable(final AbstractCommand<R> _cmd) {
Observable<R> userObservable;
try {
userObservable = getExecutionObservable();
} catch (Throwable ex) {
// the run() method is a user provided implementation so can throw instead of using Observable.onError
// so we catch it here and turn it into Observable.error
userObservable = Observable.error(ex);
}
return userObservable
.lift(new ExecutionHookApplication(_cmd))
.lift(new DeprecatedOnRunHookApplication(_cmd));
}
@Override
final protected Observable<R> getExecutionObservable() {
return Observable.defer(new Func0<Observable<R>>() {
@Override
public Observable<R> call() {
try {
//run()是GenericCommand.run,执行切入的目标方法,前面讲过的
return Observable.just(run());
} catch (Throwable ex) {
return Observable.error(ex);
}
}
}).doOnSubscribe(new Action0() {
@Override
public void call() {
// Save thread on which we get subscribed so that we can interrupt it later if needed
executionThread.set(Thread.currentThread());
}
});
}
4.Observable的defer/just/lift
defer译为延迟,就是说:必须有观察者订阅时,Observable 才开始发射数据。而defer()的参数是个Func0,是一个会返回Observable的执行实体。
just是将一个对象映射为Observable类。
lift(Operator)是返回一个新的Subscriber ,对传递进来的Subscriber形成代理 ,对onNext()/onCompleted()/onError()方法进行拦截。
return Observable.defer(new Func0<Observable<R>>() {
@Override
public Observable<R> call() {
// 再一次使用Observable.defer()技能,这次用的是applyHystrixSemantics这个Func0
Observable<R> hystrixObservable =
Observable.defer(applyHystrixSemantics)
.map(wrapWithAllOnNextHooks);
... // 此处忽略了请求缓存处理,上面已有提及
Observable<R> afterCache;
...
// 为Observable绑定几个特定事件的处理者,这都是上面创建的Action0
return afterCache
.doOnTerminate(terminateCommandCleanup)
.doOnUnsubscribe(unsubscribeCommandCleanup)
.doOnCompleted(fireOnCompletedHook);
}
});
5.Hystrix熔断的原理流程图
其中椭圆框有包含关系是因为这几个类存在继承关系,即HealthCountsStream继承自BucketRollingCountersStream,而BucketRollingCounterStream继承自BucketCountersStream。这个图比较清晰的展现了在一个command执行完成之后,整个数据的流向。
BucketCounterStream原理图
BucketRollingCountStream原理图