What problems does Hystrix solve?
1 When a downstream dependency of a request fails, the whole request is affected.
For high-QPS requests, a single failing downstream dependency can also saturate all resources on a server within seconds.
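2 Where feasible, provide a fallback so that users are shielded from the failure.
How it works
1 Wrap every call to an external system (a "dependency") in a HystrixCommand or HystrixObservableCommand object, which typically executes on a separate thread (an example of the command pattern).
2 Time out calls that take longer than a threshold you define. There is a default, but for most dependencies you can customize the timeout through properties so that it sits slightly above the measured 99.5th-percentile latency of that dependency.
3 Maintain a small thread pool (or semaphore) for each dependency; if it becomes full, requests destined for that dependency are rejected immediately instead of being queued.
4 Measure successes, failures (exceptions thrown by the client), timeouts, and thread rejections. If the error percentage for a service passes a threshold, trip the circuit breaker, manually or automatically, to stop all requests to that service for a period of time.
5 Run fallback logic when a request fails, is rejected, times out, or is short-circuited.
6 Monitor metrics and configuration changes in near real time.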
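Items 1, 2, 3 and 5 above can be illustrated with nothing but the JDK. The following is a minimal sketch, not Hystrix code: the class name GuardedCall, the pool size of 2 and the method signature are assumptions made for the example.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical helper: runs a dependency call on a small dedicated pool,
// enforces a timeout, and falls back on rejection, timeout, or failure.
class GuardedCall {
    private static final ThreadFactory DAEMON = r -> {
        Thread t = new Thread(r);
        t.setDaemon(true); // do not keep the JVM alive
        return t;
    };

    // No queue: when both workers are busy, new work is rejected (item 3).
    private final ExecutorService pool = new ThreadPoolExecutor(
            2, 2, 1, TimeUnit.MINUTES, new SynchronousQueue<>(), DAEMON);

    <T> T call(Callable<T> dependency, long timeoutMillis, Supplier<T> fallback) {
        Future<T> future;
        try {
            future = pool.submit(dependency);      // item 1: separate thread
        } catch (RejectedExecutionException e) {
            return fallback.get();                 // item 5: rejected
        }
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS); // item 2
        } catch (TimeoutException e) {
            future.cancel(true);                   // interrupt the worker
            return fallback.get();                 // item 5: timed out
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback.get();
        } catch (ExecutionException e) {
            return fallback.get();                 // item 5: failed
        }
    }

    void shutdown() {
        pool.shutdownNow();
    }
}
```

The sketch leaves out the circuit breaker and metrics (items 4 and 6), which is where most of the real library's complexity lives.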
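Flow overview
Flow description:
1: Each call creates a new HystrixCommand, with the dependency call wrapped in its run() method.
2: Call execute() or queue() for a synchronous or asynchronous invocation.
3: Check whether the result is available in the request cache. If so, return it; otherwise go to step 4.
4: Check whether the circuit breaker is open. If it is open, go to step 8 (fallback); if it is closed, go to step 5.
5: Check whether the thread pool/queue/semaphore is full. If it is full, go to step 8 (fallback); otherwise continue with step 6.
6: Call the HystrixCommand's run() method to execute the dependency logic.
6a: If the dependency call fails, go to step 8 (fallback).
6b: If the dependency call times out, go to step 8 (fallback).
7: Update circuit-breaker health: every outcome (success, failure, rejection, timeout) is reported to the circuit breaker, which uses these statistics to decide its state.
8: getFallback(): the fallback logic.
Any of the following four conditions triggers a getFallback() call:
(1): run() throws an exception other than HystrixBadRequestException.
(2): The run() call times out.
(3): The circuit breaker is open and intercepts the call.
(4): The thread pool/queue/semaphore is full.
8a: A command that does not implement getFallback() throws an exception directly.
8b: If the fallback succeeds, its result is returned.
8c: If the fallback fails, an exception is thrown.
9: Return the successful result.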
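The ordering of steps 4, 6, 7, 8 and 9 can be sketched as follows. This is a deliberately simplified illustration, not how Hystrix implements it: MiniCommand, minRequests and errorPercent are hypothetical names, the counters never reset, and there is no rolling window or half-open state.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical, heavily simplified command showing only the order of the steps.
class MiniCommand {
    private final AtomicInteger total = new AtomicInteger();
    private final AtomicInteger failures = new AtomicInteger();
    private final int minRequests;   // assumed: minimum volume before the circuit may open
    private final int errorPercent;  // assumed: error-percentage threshold

    MiniCommand(int minRequests, int errorPercent) {
        this.minRequests = minRequests;
        this.errorPercent = errorPercent;
    }

    // Open once enough requests were seen and the failure share reached the threshold.
    // The two counters are read separately, so this is only approximate under concurrency.
    boolean circuitOpen() {
        int t = total.get();
        return t >= minRequests && failures.get() * 100 / t >= errorPercent;
    }

    <T> T execute(Supplier<T> run, Supplier<T> fallback) {
        if (circuitOpen()) {              // step 4: short-circuit
            return fallback.get();        // step 8
        }
        try {
            T result = run.get();         // step 6: run the dependency logic
            total.incrementAndGet();      // step 7: report success
            return result;                // step 9
        } catch (RuntimeException e) {
            total.incrementAndGet();      // step 7: report failure
            failures.incrementAndGet();
            return fallback.get();        // step 8
        }
    }
}
```

Once the circuit is open, run is no longer invoked at all, which is the point of step 4: a failing dependency stops receiving traffic.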
Call path (example: an annotation-based synchronous call whose return type is not a Future)
Entry point: HystrixCommandAspect
@Around("hystrixCommandAnnotationPointcut() || hystrixCollapserAnnotationPointcut()")
public Object methodsAnnotatedWithHystrixCommand(final ProceedingJoinPoint joinPoint) throws Throwable {
// part of the code is omitted here
// create the command that contains all execution logic (the core command logic)
HystrixInvokable invokable = HystrixCommandFactory.getInstance().create(metaHolder);
ExecutionType executionType = metaHolder.isCollapserAnnotationPresent() ?
metaHolder.getCollapserExecutionType() : metaHolder.getExecutionType();
Object result;
try {
if (!metaHolder.isObservable()) {
result = CommandExecutor.execute(invokable, executionType, metaHolder);
} else {
// commandAction execution logic (observable path)
result = executeObservable(invokable, executionType, metaHolder);
}
} catch (HystrixBadRequestException e) {
throw e.getCause();
} catch (HystrixRuntimeException e) {
throw hystrixRuntimeExceptionToThrowable(metaHolder, e);
}
return result;
}
Creating a command that wraps the dependency call
HystrixInvokable invokable = HystrixCommandFactory.getInstance().create(metaHolder);
// HystrixCommandFactory create()
public HystrixInvokable create(MetaHolder metaHolder) {
HystrixInvokable executable;
if (metaHolder.isCollapserAnnotationPresent()) {
executable = new CommandCollapser(metaHolder);
} else if (metaHolder.isObservable()) {
executable = new GenericObservableCommand(HystrixCommandBuilderFactory.getInstance().create(metaHolder));
} else {
// create the command
executable = new GenericCommand(HystrixCommandBuilderFactory.getInstance().create(metaHolder));
}
return executable;
}
HystrixCommandBuilderFactory.getInstance().create(metaHolder)
public <ResponseType> HystrixCommandBuilder create(MetaHolder metaHolder, Collection<HystrixCollapser.CollapsedRequest<ResponseType, Object>> collapsedRequests) {
validateMetaHolder(metaHolder);
return HystrixCommandBuilder.builder()
// Sets the builder to create specific Hystrix setter, for instance HystrixCommand.Setter
.setterBuilder(createGenericSetterBuilder(metaHolder))
// the core actions that invoke the user's method
.commandActions(createCommandActions(metaHolder))
// Sets collapsed requests.
.collapsedRequests(collapsedRequests)
// Sets CacheResult invocation context
.cacheResultInvocationContext(createCacheResultInvocationContext(metaHolder))
// Sets CacheRemove invocation context,
.cacheRemoveInvocationContext(createCacheRemoveInvocationContext(metaHolder))
// sets exceptions that should be ignored and wrapped to throw in {@link com.netflix.hystrix.exception.HystrixBadRequestException}. A HystrixBadRequestException does not trigger the fallback method.
.ignoreExceptions(metaHolder.getCommandIgnoreExceptions())
// future - async sync or observable
.executionType(metaHolder.getExecutionType())
.build();
}
Core user-method invocation actions
private CommandActions createCommandActions(MetaHolder metaHolder) {
// invokes the user's method
CommandAction commandAction = createCommandAction(metaHolder);
// invokes the user's fallback method
CommandAction fallbackAction = createFallbackAction(metaHolder);
return CommandActions.builder().commandAction(commandAction)
.fallbackAction(fallbackAction).build();
}
The action that invokes the user's method via reflection
CommandAction commandAction = createCommandAction(metaHolder);
private CommandAction createCommandAction(MetaHolder metaHolder) {
return new MethodExecutionAction(metaHolder.getObj(), metaHolder.getMethod(), metaHolder.getArgs(), metaHolder);
}
// MethodExecutionAction execute()
/**
* Invokes the method.
*
* @return result of execution
*/
private Object execute(Object o, Method m, Object... args) throws CommandActionExecutionException {
Object result = null;
try {
m.setAccessible(true); // suppress Java language access
if (isCompileWeaving() && metaHolder.getAjcMethod() != null) {
result = invokeAjcMethod(metaHolder.getAjcMethod(), o, metaHolder, args);
} else {
result = m.invoke(o, args);
}
} catch (IllegalAccessException e) {
propagateCause(e);
} catch (InvocationTargetException e) {
propagateCause(e);
}
return result;
}
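Reflective invocation wraps whatever the target method throws in an InvocationTargetException, so the caller has to unwrap it to see the original exception. The sketch below shows that pattern with the JDK only. ReflectiveCall, invokeByName and Greeter are hypothetical names, and the unwrapping shown is an assumption about the intent of propagateCause, whose body is not part of the excerpt above.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical helper: invoke a method reflectively and surface the user's own exception.
class ReflectiveCall {
    static Object invoke(Object target, Method method, Object... args) {
        try {
            method.setAccessible(true); // allow private methods, as in the excerpt
            return method.invoke(target, args);
        } catch (InvocationTargetException e) {
            Throwable cause = e.getCause(); // what the user method actually threw
            if (cause instanceof RuntimeException) {
                throw (RuntimeException) cause;
            }
            throw new IllegalStateException(cause);
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    // Convenience for the example: pick the first declared method with this name.
    static Object invokeByName(Object target, String name, Object... args) {
        for (Method m : target.getClass().getDeclaredMethods()) {
            if (m.getName().equals(name)) {
                return invoke(target, m, args);
            }
        }
        throw new IllegalArgumentException("no such method: " + name);
    }
}

// Sample target used only to exercise the helper.
class Greeter {
    private String greet(String name) {
        if (name.isEmpty()) {
            throw new IllegalArgumentException("empty name");
        }
        return "hello " + name;
    }
}
```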
Command creation and initialization
Creating the command's thread pool
AbstractCommand constructor
this.threadPool = initThreadPool(threadPool, this.threadPoolKey, threadPoolPropertiesDefaults);
private static HystrixThreadPool initThreadPool(HystrixThreadPool fromConstructor, HystrixThreadPoolKey threadPoolKey, HystrixThreadPoolProperties.Setter threadPoolPropertiesDefaults) {
if (fromConstructor == null) {
// fromConstructor is null when the command is created
// get the default implementation of HystrixThreadPool
return HystrixThreadPool.Factory.getInstance(threadPoolKey, threadPoolPropertiesDefaults);
} else {
return fromConstructor;
}
}
// getInstance
static HystrixThreadPool getInstance(HystrixThreadPoolKey threadPoolKey, HystrixThreadPoolProperties.Setter propertiesBuilder) {
// get the key to use instead of using the object itself so that if people forget to implement equals/hashcode things will still work
String key = threadPoolKey.name();
// this should find it for all but the first time
HystrixThreadPool previouslyCached = threadPools.get(key);
if (previouslyCached != null) {
return previouslyCached;
}
// if we get here this is the first time so we need to initialize
synchronized (HystrixThreadPool.class) {
if (!threadPools.containsKey(key)) {
threadPools.put(key, new HystrixThreadPoolDefault(threadPoolKey, propertiesBuilder));
}
}
return threadPools.get(key);
}
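The getInstance method above caches one pool per key, so every command that shares a thread-pool key shares the same threads. On Java 8 and later the same check-then-create idea can be written with ConcurrentHashMap.computeIfAbsent. This is only an illustrative sketch; PoolRegistry and the pool size are assumptions, not Hystrix code.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical registry: one pool per key, created lazily on first use.
class PoolRegistry {
    private static final ConcurrentHashMap<String, ExecutorService> POOLS =
            new ConcurrentHashMap<>();

    static ExecutorService forKey(String key) {
        // computeIfAbsent applies the factory at most once per key, atomically
        return POOLS.computeIfAbsent(key, k -> Executors.newFixedThreadPool(2, r -> {
            Thread t = new Thread(r, "pool-" + k);
            t.setDaemon(true);
            return t;
        }));
    }
}
```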
HystrixThreadPoolDefault: the thread pool created by default
public HystrixThreadPoolDefault(HystrixThreadPoolKey threadPoolKey, HystrixThreadPoolProperties.Setter propertiesDefaults) {
this.properties = HystrixPropertiesFactory.getThreadPoolProperties(threadPoolKey, propertiesDefaults);
HystrixConcurrencyStrategy concurrencyStrategy = HystrixPlugins.getInstance().getConcurrencyStrategy();
this.queueSize = properties.maxQueueSize().get();
this.metrics = HystrixThreadPoolMetrics.getInstance(threadPoolKey,
concurrencyStrategy.getThreadPool(threadPoolKey, properties),
properties);
this.threadPool = this.metrics.getThreadPool();
this.queue = this.threadPool.getQueue();
/* strategy: HystrixMetricsPublisherThreadPool */
HystrixMetricsPublisherFactory.createOrRetrievePublisherForThreadPool(threadPoolKey, this.metrics, this.properties);
}
this.threadPool = this.metrics.getThreadPool() is the executor returned by concurrencyStrategy.getThreadPool(threadPoolKey, properties)
concurrencyStrategy.getThreadPool(threadPoolKey, properties)
public ThreadPoolExecutor getThreadPool(final HystrixThreadPoolKey threadPoolKey, HystrixThreadPoolProperties threadPoolProperties) {
// "hystrix-" + threadPoolKey.name() + "-" + threadNumber.incrementAndGet()
final ThreadFactory threadFactory = getThreadFactory(threadPoolKey);
final boolean allowMaximumSizeToDivergeFromCoreSize = threadPoolProperties.getAllowMaximumSizeToDivergeFromCoreSize().get();
final int dynamicCoreSize = threadPoolProperties.coreSize().get();
final int keepAliveTime = threadPoolProperties.keepAliveTimeMinutes().get();
final int maxQueueSize = threadPoolProperties.maxQueueSize().get();
// maxQueueSize <= 0) ? new SynchronousQueue<Runnable>() : new LinkedBlockingQueue<Runnable>(maxQueueSize)
final BlockingQueue<Runnable> workQueue = getBlockingQueue(maxQueueSize);
if (allowMaximumSizeToDivergeFromCoreSize) {
final int dynamicMaximumSize = threadPoolProperties.maximumSize().get();
if (dynamicCoreSize > dynamicMaximumSize) {
logger.error("Hystrix ThreadPool configuration at startup for : " + threadPoolKey.name() + " is trying to set coreSize = " +
dynamicCoreSize + " and maximumSize = " + dynamicMaximumSize + ". Maximum size will be set to " +
dynamicCoreSize + ", the coreSize value, since it must be equal to or greater than the coreSize value");
return new ThreadPoolExecutor(dynamicCoreSize, dynamicCoreSize, keepAliveTime, TimeUnit.MINUTES, workQueue, threadFactory);
} else {
return new ThreadPoolExecutor(dynamicCoreSize, dynamicMaximumSize, keepAliveTime, TimeUnit.MINUTES, workQueue, threadFactory);
}
} else {
return new ThreadPoolExecutor(dynamicCoreSize, dynamicCoreSize, keepAliveTime, TimeUnit.MINUTES, workQueue, threadFactory);
}
}
// default property values (HystrixThreadPoolProperties)
static int default_coreSize = 10; // core size of thread pool
static int default_maximumSize = 10; // maximum size of thread pool
static int default_keepAliveTimeMinutes = 1; // minutes to keep a thread alive
static int default_maxQueueSize = -1; // size of queue (this can't be dynamically changed so we use 'queueSizeRejectionThreshold' to artificially limit and reject)
// -1 turns it off and makes us use SynchronousQueue
static boolean default_allow_maximum_size_to_diverge_from_core_size = false; //should the maximumSize config value get read and used in configuring the threadPool
//turning this on should be a conscious decision by the user, so we default it to false
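With the defaults above (maxQueueSize = -1, hence a SynchronousQueue, and maximumSize equal to coreSize), a task submitted while every worker is busy is rejected immediately instead of waiting in a queue. The following JDK-only sketch demonstrates that behaviour; DirectHandoffDemo is a hypothetical name and the pool size is a parameter so the demo can use a small value.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class DirectHandoffDemo {
    // Returns true if one more task is rejected while all `size` workers are busy.
    static boolean extraTaskRejected(int size) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                size, size, 1, TimeUnit.MINUTES, new SynchronousQueue<>());
        CountDownLatch release = new CountDownLatch(1);
        try {
            // Occupy every worker with a task that blocks until released.
            for (int i = 0; i < size; i++) {
                pool.execute(() -> {
                    try {
                        release.await();
                    } catch (InterruptedException ignored) {
                        // exit the task if interrupted
                    }
                });
            }
            try {
                pool.execute(() -> { });   // no idle worker and no queue capacity
                return false;
            } catch (RejectedExecutionException e) {
                return true;               // default AbortPolicy rejects the task
            }
        } finally {
            release.countDown();           // let the blocked tasks finish
            pool.shutdown();
        }
    }
}
```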
Creating the HystrixThreadPoolKey
this.threadPoolKey = initThreadPoolKey(threadPoolKey, this.commandGroup, this.properties.executionIsolationThreadPoolKeyOverride().get());
if (threadPoolKey == null) {
/* use HystrixCommandGroup if HystrixThreadPoolKey is null */
return HystrixThreadPoolKey.Factory.asKey(groupKey.name());
} else {
return threadPoolKey;
}
Core command execution logic
Invocation entry point
result = CommandExecutor.execute(invokable, executionType, metaHolder);
/**
* Calls a method of {@link HystrixExecutable} in accordance with specified execution type.
*
* @param invokable {@link HystrixInvokable}
* @param metaHolder {@link MetaHolder}
* @return the result of invocation of specific method.
* @throws RuntimeException
*/
public static Object execute(HystrixInvokable invokable, ExecutionType executionType, MetaHolder metaHolder) throws RuntimeException {
Validate.notNull(invokable);
Validate.notNull(metaHolder);
switch (executionType) {
case SYNCHRONOUS: {
// synchronous calls take this branch
return castToExecutable(invokable, executionType).execute();
}
case ASYNCHRONOUS: {
HystrixExecutable executable = castToExecutable(invokable, executionType);
if (metaHolder.hasFallbackMethodCommand()
&& ExecutionType.ASYNCHRONOUS == metaHolder.getFallbackExecutionType()) {
return new FutureDecorator(executable.queue());
}
return executable.queue();
}
case OBSERVABLE: {
HystrixObservable observable = castToObservable(invokable);
return ObservableExecutionMode.EAGER == metaHolder.getObservableExecutionMode() ? observable.observe() : observable.toObservable();
}
default:
throw new RuntimeException("unsupported execution type: " + executionType);
}
}
HystrixCommand's execute() method
/**
* Used for synchronous execution of command.
*
* @return R
* Result of {@link #run()} execution or a fallback from {@link #getFallback()} if the command fails for any reason.
* @throws HystrixRuntimeException
* if a failure occurs and a fallback cannot be retrieved
* @throws HystrixBadRequestException
* if invalid arguments or state were used representing a user failure, not a system failure
* @throws IllegalStateException
* if invoked more than once
*/
public R execute() {
try {
return queue().get();
} catch (Exception e) {
throw Exceptions.sneakyThrow(decomposeException(e));
}
}
Everything ultimately goes through the core method queue()
/**
* Used for asynchronous execution of command.
* <p>
* This will queue up the command on the thread pool and return an {@link Future} to get the result once it completes.
* <p>
* NOTE: If configured to not run in a separate thread, this will have the same effect as {@link #execute()} and will block.
* <p>
* We don't throw an exception but just flip to synchronous execution so code doesn't need to change in order to switch a command from running on a separate thread to the calling thread.
*
* @return {@code Future<R>} Result of {@link #run()} execution or a fallback from {@link #getFallback()} if the command fails for any reason.
* @throws HystrixRuntimeException
* if a fallback does not exist
* <p>
* <ul>
* <li>via {@code Future.get()} in {@link ExecutionException#getCause()} if a failure occurs</li>
* <li>or immediately if the command can not be queued (such as short-circuited, thread-pool/semaphore rejected)</li>
* </ul>
* @throws HystrixBadRequestException
* via {@code Future.get()} in {@link ExecutionException#getCause()} if invalid arguments or state were used representing a user failure, not a system failure
* @throws IllegalStateException
* if invoked more than once
*/
public Future<R> queue() {
/*
* The Future returned by Observable.toBlocking().toFuture() does not implement the
* interruption of the execution thread when the "mayInterrupt" flag of Future.cancel(boolean) is set to true;
* thus, to comply with the contract of Future, we must wrap around it.
*/
// all of the command's core logic lives here
final Future<R> delegate = toObservable().toBlocking().toFuture();
Core execution chain built by toObservable() (shown here: applyHystrixSemantics)
private Observable<R> applyHystrixSemantics(final AbstractCommand<R> _cmd) {
// mark that we're starting execution on the ExecutionHook
// if this hook throws an exception, then a fast-fail occurs with no fallback. No state is left inconsistent
executionHook.onStart(_cmd);
/* determine if we're allowed to execute */
if (circuitBreaker.allowRequest()) {
final TryableSemaphore executionSemaphore = getExecutionSemaphore();
final AtomicBoolean semaphoreHasBeenReleased = new AtomicBoolean(false);
final Action0 singleSemaphoreRelease = new Action0() {
@Override
public void call() {
if (semaphoreHasBeenReleased.compareAndSet(false, true)) {
executionSemaphore.release();
}
}
};
final Action1<Throwable> markExceptionThrown = new Action1<Throwable>() {
@Override
public void call(Throwable t) {
eventNotifier.markEvent(HystrixEventType.EXCEPTION_THROWN, commandKey);
}
};
// with thread-pool isolation the execution semaphore is a no-op, so tryAcquire() always returns true
if (executionSemaphore.tryAcquire()) {
try {
/* used to track userThreadExecutionTime */
executionResult = executionResult.setInvocationStartTime(System.currentTimeMillis());
return executeCommandAndObserve(_cmd)
.doOnError(markExceptionThrown)
.doOnTerminate(singleSemaphoreRelease)
.doOnUnsubscribe(singleSemaphoreRelease);
} catch (RuntimeException e) {
return Observable.error(e);
}
} else {
return handleSemaphoreRejectionViaFallback();
}
} else {
return handleShortCircuitViaFallback();
}
}
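The semaphore branch above never blocks: tryAcquire() either gets a permit or the call goes straight to the fallback, and the permit is released exactly once when the call ends. A minimal JDK-only sketch of that idea follows. SemaphoreGuard is a hypothetical name, and java.util.concurrent.Semaphore stands in for Hystrix's own TryableSemaphore.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Hypothetical guard: caps concurrent calls without ever making the caller wait.
class SemaphoreGuard {
    private final Semaphore permits;

    SemaphoreGuard(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    <T> T call(Supplier<T> run, Supplier<T> fallback) {
        if (!permits.tryAcquire()) {   // returns immediately, never blocks
            return fallback.get();     // semaphore rejection
        }
        try {
            return run.get();          // runs on the caller's thread
        } finally {
            permits.release();         // released once, on success or failure
        }
    }
}
```

Because the work runs on the calling thread, this style cannot interrupt a slow call the way thread-pool isolation can; it only limits how many callers may be inside at once.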
The executeCommandAndObserve method
private Observable<R> executeCommandAndObserve(final AbstractCommand<R> _cmd) {
final HystrixRequestContext currentRequestContext = HystrixRequestContext.getContextForCurrentThread();
final Action1<R> markEmits = new Action1<R>() {
@Override
public void call(R r) {
if (shouldOutputOnNextEvents()) {
executionResult = executionResult.addEvent(HystrixEventType.EMIT);
eventNotifier.markEvent(HystrixEventType.EMIT, commandKey);
}
// always true for HystrixCommand
if (commandIsScalar()) {
long latency = System.currentTimeMillis() - executionResult.getStartTimestamp();
eventNotifier.markCommandExecution(getCommandKey(), properties.executionIsolationStrategy().get(), (int) latency, executionResult.getOrderedList());
eventNotifier.markEvent(HystrixEventType.SUCCESS, commandKey);
executionResult = executionResult.addEvent((int) latency, HystrixEventType.SUCCESS);
// tell the circuit breaker that the execution succeeded
circuitBreaker.markSuccess();
}
}
};
final Action0 markOnCompleted = new Action0() {
@Override
public void call() {
if (!commandIsScalar()) {
long latency = System.currentTimeMillis() - executionResult.getStartTimestamp();
eventNotifier.markCommandExecution(getCommandKey(), properties.executionIsolationStrategy().get(), (int) latency, executionResult.getOrderedList());
eventNotifier.markEvent(HystrixEventType.SUCCESS, commandKey);
executionResult = executionResult.addEvent((int) latency, HystrixEventType.SUCCESS);
circuitBreaker.markSuccess();
}
}
};
// decides which fallback path to take for each kind of exception
final Func1<Throwable, Observable<R>> handleFallback = new Func1<Throwable, Observable<R>>() {
@Override
public Observable<R> call(Throwable t) {
// wrap the Throwable raised during execution as an Exception
Exception e = getExceptionFromThrowable(t);
// save the execution exception so it can be passed through to the fallback
executionResult = executionResult.setExecutionException(e);
if (e instanceof RejectedExecutionException) {
return handleThreadPoolRejectionViaFallback(e);
} else if (t instanceof HystrixTimeoutException) {
return handleTimeoutViaFallback();
} else if (t instanceof HystrixBadRequestException) {
return handleBadRequestByEmittingError(e);
} else {
/*
* Treat HystrixBadRequestException from ExecutionHook like a plain HystrixBadRequestException.
*/
if (e instanceof HystrixBadRequestException) {
eventNotifier.markEvent(HystrixEventType.BAD_REQUEST, commandKey);
return Observable.error(e);
}
return handleFailureViaFallback(e);
}
}
};
final Action1<Notification<? super R>> setRequestContext = new Action1<Notification<? super R>>() {
@Override
public void call(Notification<? super R> rNotification) {
setRequestContextIfNeeded(currentRequestContext);
}
};
Observable<R> execution;
// whether the execution timeout is enabled (default: true)
if (properties.executionTimeoutEnabled().get()) {
execution = executeCommandWithSpecifiedIsolation(_cmd)
.lift(new HystrixObservableTimeoutOperator<R>(_cmd));
} else {
execution = executeCommandWithSpecifiedIsolation(_cmd);
}
return execution.doOnNext(markEmits)
.doOnCompleted(markOnCompleted)
.onErrorResumeNext(handleFallback)
.doOnEach(setRequestContext);
}
1 Command execution: executeCommandWithSpecifiedIsolation(_cmd)
// thread-pool isolation
private Observable<R> executeCommandWithSpecifiedIsolation(final AbstractCommand<R> _cmd) {
if (properties.executionIsolationStrategy().get() == ExecutionIsolationStrategy.THREAD) {
// mark that we are executing in a thread (even if we end up being rejected we still were a THREAD execution and not SEMAPHORE)
return Observable.defer(new Func0<Observable<R>>() {
@Override
public Observable<R> call() {
executionResult = executionResult.setExecutionOccurred();
if (!commandState.compareAndSet(CommandState.OBSERVABLE_CHAIN_CREATED, CommandState.USER_CODE_EXECUTED)) {
return Observable.error(new IllegalStateException("execution attempted while in state : " + commandState.get().name()));
}
metrics.markCommandStart(commandKey, threadPoolKey, ExecutionIsolationStrategy.THREAD);
if (isCommandTimedOut.get() == TimedOutStatus.TIMED_OUT) {
// the command timed out in the wrapping thread so we will return immediately
// and not increment any of the counters below or other such logic
return Observable.error(new RuntimeException("timed out before executing run()"));
}
if (threadState.compareAndSet(ThreadState.NOT_USING_THREAD, ThreadState.STARTED)) {
//we have not been unsubscribed, so should proceed
HystrixCounters.incrementGlobalConcurrentThreads();
threadPool.markThreadExecution();
// store the command that is being run
endCurrentThreadExecutingCommand = Hystrix.startCurrentThreadExecutingCommand(getCommandKey());
executionResult = executionResult.setExecutedInThread();
/**
* If any of these hooks throw an exception, then it appears as if the actual execution threw an error
*/
try {
executionHook.onThreadStart(_cmd);
executionHook.onRunStart(_cmd);
executionHook.onExecutionStart(_cmd);
// obtain the observable that wraps the user's dependency call
return getUserExecutionObservable(_cmd);
} catch (Throwable ex) {
return Observable.error(ex);
}
} else {
//command has already been unsubscribed, so return immediately
return Observable.error(new RuntimeException("unsubscribed before executing run()"));
}
}
}).doOnTerminate(new Action0() {
@Override
public void call() {
if (threadState.compareAndSet(ThreadState.STARTED, ThreadState.TERMINAL)) {
handleThreadEnd(_cmd);
}
if (threadState.compareAndSet(ThreadState.NOT_USING_THREAD, ThreadState.TERMINAL)) {
//if it was never started and received terminal, then no need to clean up (I don't think this is possible)
}
//if it was unsubscribed, then other cleanup handled it
}
}).doOnUnsubscribe(new Action0() {
@Override
public void call() {
if (threadState.compareAndSet(ThreadState.STARTED, ThreadState.UNSUBSCRIBED)) {
handleThreadEnd(_cmd);
}
if (threadState.compareAndSet(ThreadState.NOT_USING_THREAD, ThreadState.UNSUBSCRIBED)) {
//if it was never started and was cancelled, then no need to clean up
}
//if it was terminal, then other cleanup handled it
}
// make the subscription run on a thread from the command's thread pool
}).subscribeOn(threadPool.getScheduler(new Func0<Boolean>() {
@Override
public Boolean call() {
return properties.executionIsolationThreadInterruptOnTimeout().get() && _cmd.isCommandTimedOut.get() == TimedOutStatus.TIMED_OUT;
}
}));
} else {
return Observable.defer(new Func0<Observable<R>>() {
@Override
public Observable<R> call() {
executionResult = executionResult.setExecutionOccurred();
if (!commandState.compareAndSet(CommandState.OBSERVABLE_CHAIN_CREATED, CommandState.USER_CODE_EXECUTED)) {
return Observable.error(new IllegalStateException("execution attempted while in state : " + commandState.get().name()));
}
metrics.markCommandStart(commandKey, threadPoolKey, ExecutionIsolationStrategy.SEMAPHORE);
// semaphore isolated
// store the command that is being run
endCurrentThreadExecutingCommand = Hystrix.startCurrentThreadExecutingCommand(getCommandKey());
try {
executionHook.onRunStart(_cmd);
executionHook.onExecutionStart(_cmd);
return getUserExecutionObservable(_cmd); //the getUserExecutionObservable method already wraps sync exceptions, so this shouldn't throw
} catch (Throwable ex) {
//If the above hooks throw, then use that as the result of the run method
return Observable.error(ex);
}
}
});
}
}
getUserExecutionObservable
private Observable<R> getUserExecutionObservable(final AbstractCommand<R> _cmd) {
Observable<R> userObservable;
try {
userObservable = getExecutionObservable();
} catch (Throwable ex) {
// the run() method is a user provided implementation so can throw instead of using Observable.onError
// so we catch it here and turn it into Observable.error
userObservable = Observable.error(ex);
}
return userObservable
.lift(new ExecutionHookApplication(_cmd))
.lift(new DeprecatedOnRunHookApplication(_cmd));
}
getExecutionObservable
HystrixCommand's getExecutionObservable
final protected Observable<R> getExecutionObservable() {
return Observable.defer(new Func0<Observable<R>>() {
@Override
public Observable<R> call() {
try {
return Observable.just(run());
} catch (Throwable ex) {
return Observable.error(ex);
}
}
}).doOnSubscribe(new Action0() {
@Override
public void call() {
// Save thread on which we get subscribed so that we can interrupt it later if needed
executionThread.set(Thread.currentThread());
}
});
}
The run() method
GenericCommand's run()
protected Object run() throws Exception {
LOGGER.debug("execute command: {}", getCommandKey().name());
return process(new Action() {
@Override
Object execute() {
return getCommandAction().execute(getExecutionType());
}
});
}
// commandActions holds the downstream call that was wrapped when the command was created
protected CommandAction getCommandAction() {
return commandActions.getCommandAction();
}
Timeout handling
execution = executeCommandWithSpecifiedIsolation(_cmd)
.lift(new HystrixObservableTimeoutOperator(_cmd));
public HystrixObservableTimeoutOperator(final AbstractCommand<R> originalCommand) {
this.originalCommand = originalCommand;
}
/*
* Define the action to perform on timeout outside of the TimerListener to it can capture the HystrixRequestContext
* of the calling thread which doesn't exist on the Timer thread.
*/
//
final HystrixContextRunnable timeoutRunnable = new HystrixContextRunnable(originalCommand.concurrencyStrategy, new Runnable() {
@Override
public void run() {
child.onError(new HystrixTimeoutException());
}
});
// create the delayed task (a TimerListener)
TimerListener listener = new TimerListener() {
@Override
public void tick() {
// if we can go from NOT_EXECUTED to TIMED_OUT then we do the timeout codepath
// otherwise it means we lost a race and the run() execution completed or did not start
if (originalCommand.isCommandTimedOut.compareAndSet(TimedOutStatus.NOT_EXECUTED, TimedOutStatus.TIMED_OUT)) {
// report timeout failure
originalCommand.eventNotifier.markEvent(HystrixEventType.TIMEOUT, originalCommand.commandKey);
// shut down the original request
s.unsubscribe();
timeoutRunnable.run();
//if it did not start, then we need to mark a command start for concurrency metrics, and then issue the timeout
}
}
@Override
public int getIntervalTimeInMilliseconds() {
return originalCommand.properties.executionTimeoutInMilliseconds().get();
}
};
// add the delayed task to the HystrixTimer thread pool
final Reference<TimerListener> tl = HystrixTimer.getInstance().addTimerListener(listener);
// set externally so execute/queue can see this
originalCommand.timeoutTimer.set(tl);
Adding the delayed task
public Reference<TimerListener> addTimerListener(final TimerListener listener) {
startThreadIfNeeded();
// add the listener
Runnable r = new Runnable() {
@Override
public void run() {
try {
listener.tick();
} catch (Exception e) {
logger.error("Failed while ticking TimerListener", e);
}
}
};
ScheduledFuture<?> f = executor.get().getThreadPool().scheduleAtFixedRate(r, listener.getIntervalTimeInMilliseconds(), listener.getIntervalTimeInMilliseconds(), TimeUnit.MILLISECONDS);
return new TimerReference(listener, f);
}
executor.get().getThreadPool()
// the work queue of a ScheduledThreadPoolExecutor is a delay queue (DelayedWorkQueue), which is unbounded
executor = new ScheduledThreadPoolExecutor(coreSize, threadFactory);
int coreSize = propertiesStrategy.getTimerThreadPoolProperties().getCorePoolSize().get();
// defaults to the number of available CPU cores
protected HystrixTimerThreadPoolProperties() {
this(new Setter().withCoreSize(Runtime.getRuntime().availableProcessors()));
}
// singleton, shared by all commands
private static HystrixTimer INSTANCE = new HystrixTimer();
private HystrixTimer() {
// private to prevent public instantiation
}
/**
* Retrieve the global instance.
*/
public static HystrixTimer getInstance() {
return INSTANCE;
}
A detailed explanation of how the timeout task is invoked:
http://www.ligen.pro/2019/05/01/Hystrix超时机制实现原理/#more
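The key idea in the timeout code is a race settled by a single compareAndSet: whichever side moves the status away from NOT_EXECUTED first, the timer tick or the finishing run, decides the outcome. The sketch below reproduces that race with JDK classes only. TimeoutRace and its signature are hypothetical, it schedules a one-shot task rather than a fixed-rate listener, and unlike Hystrix it does not interrupt or unsubscribe the slow work.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

class TimeoutRace {
    enum Status { NOT_EXECUTED, COMPLETED, TIMED_OUT }

    private static final ThreadFactory DAEMON = r -> {
        Thread t = new Thread(r);
        t.setDaemon(true);
        return t;
    };
    // One shared timer for all calls, similar in spirit to the shared HystrixTimer.
    private static final ScheduledExecutorService TIMER =
            Executors.newScheduledThreadPool(1, DAEMON);
    private static final ExecutorService WORKERS = Executors.newCachedThreadPool(DAEMON);

    static <T> T call(Supplier<T> work, long timeoutMillis, T onTimeout) {
        AtomicReference<Status> status = new AtomicReference<>(Status.NOT_EXECUTED);
        CompletableFuture<T> result = new CompletableFuture<>();
        // Timer side: wins only if the work has not completed yet.
        ScheduledFuture<?> tick = TIMER.schedule(() -> {
            if (status.compareAndSet(Status.NOT_EXECUTED, Status.TIMED_OUT)) {
                result.complete(onTimeout);
            }
        }, timeoutMillis, TimeUnit.MILLISECONDS);
        // Worker side: wins only if the timer has not fired yet.
        WORKERS.execute(() -> {
            try {
                T value = work.get();
                if (status.compareAndSet(Status.NOT_EXECUTED, Status.COMPLETED)) {
                    result.complete(value);
                }
            } catch (RuntimeException e) {
                if (status.compareAndSet(Status.NOT_EXECUTED, Status.COMPLETED)) {
                    result.completeExceptionally(e);
                }
            } finally {
                tick.cancel(false); // drop the pending timer task
            }
        });
        return result.join();
    }
}
```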
Exception handling
final Func1<Throwable, Observable<R>> handleFallback = new Func1<Throwable, Observable<R>>() {
@Override
public Observable<R> call(Throwable t) {
Exception e = getExceptionFromThrowable(t);
executionResult = executionResult.setExecutionException(e);
if (e instanceof RejectedExecutionException) {
// thread-pool rejection (semaphore rejection is handled earlier, in applyHystrixSemantics)
return handleThreadPoolRejectionViaFallback(e);
} else if (t instanceof HystrixTimeoutException) {
// the user's command timed out
return handleTimeoutViaFallback();
} else if (t instanceof HystrixBadRequestException) {
// a HystrixBadRequestException was thrown
return handleBadRequestByEmittingError(e);
} else {
/*
* Treat HystrixBadRequestException from ExecutionHook like a plain HystrixBadRequestException.
*/
if (e instanceof HystrixBadRequestException) {
eventNotifier.markEvent(HystrixEventType.BAD_REQUEST, commandKey);
return Observable.error(e);
}
// the user's command failed with any other exception
return handleFailureViaFallback(e);
}
}
};
Handling a thrown HystrixBadRequestException
return handleBadRequestByEmittingError(e);
private Observable<R> handleBadRequestByEmittingError(Exception underlying) {
Exception toEmit = underlying;
try {
long executionLatency = System.currentTimeMillis() - executionResult.getStartTimestamp();
eventNotifier.markEvent(HystrixEventType.BAD_REQUEST, commandKey);
executionResult = executionResult.addEvent((int) executionLatency, HystrixEventType.BAD_REQUEST);
Exception decorated = executionHook.onError(this, FailureType.BAD_REQUEST_EXCEPTION, underlying);
if (decorated instanceof HystrixBadRequestException) {
toEmit = decorated;
} else {
logger.warn("ExecutionHook.onError returned an exception that was not an instance of HystrixBadRequestException so will be ignored.", decorated);
}
} catch (Exception hookEx) {
logger.warn("Error calling HystrixCommandExecutionHook.onError", hookEx);
}
/*
* HystrixBadRequestException is treated differently and allowed to propagate without any stats tracking or fallback logic. Important: throwing HystrixBadRequestException does not count toward the circuit breaker's state and never trips the circuit.
*/
return Observable.error(toEmit);
}
Thread-pool rejection
private Observable<R> handleThreadPoolRejectionViaFallback(Exception underlying) {
eventNotifier.markEvent(HystrixEventType.THREAD_POOL_REJECTED, commandKey);
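// record the rejection in the thread-pool metrics; HystrixThreadPoolDefault.markThreadRejection()
// simply delegates to metrics.markThreadRejection()
threadPool.markThreadRejection();
// use a fallback instead (or throw exception if not implemented)
// getFallbackOrThrowException holds the main fallback-handling logic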
return getFallbackOrThrowException(this, HystrixEventType.THREAD_POOL_REJECTED, FailureType.REJECTED_THREAD_EXECUTION, "could not be queued for execution", underlying);
}
Handling a timed-out user command
private Observable<R> handleTimeoutViaFallback() {
return getFallbackOrThrowException(this, HystrixEventType.TIMEOUT, FailureType.TIMEOUT, "timed-out", new TimeoutException());
}
Handling a failed user command
private Observable<R> handleFailureViaFallback(Exception underlying) {
/**
* All other error handling
*/
logger.debug("Error executing HystrixCommand.run(). Proceeding to fallback logic ...", underlying);
// report failure
eventNotifier.markEvent(HystrixEventType.FAILURE, commandKey);
// record the exception
executionResult = executionResult.setException(underlying);
return getFallbackOrThrowException(this, HystrixEventType.FAILURE, FailureType.COMMAND_EXCEPTION, "failed", underlying);
}
Fallback execution when the circuit breaker is open
private Observable<R> handleShortCircuitViaFallback() {
// record that we are returning a short-circuited fallback
eventNotifier.markEvent(HystrixEventType.SHORT_CIRCUITED, commandKey);
// short-circuit and go directly to fallback (or throw an exception if no fallback implemented)
Exception shortCircuitException = new RuntimeException("Hystrix circuit short-circuited and is OPEN");
executionResult = executionResult.setExecutionException(shortCircuitException);
try {
return getFallbackOrThrowException(this, HystrixEventType.SHORT_CIRCUITED, FailureType.SHORTCIRCUIT,
"short-circuited", shortCircuitException);
} catch (Exception e) {
return Observable.error(e);
}
}
The getFallbackOrThrowException method (excerpt)
final TryableSemaphore fallbackSemaphore = getFallbackSemaphore();
final AtomicBoolean semaphoreHasBeenReleased = new AtomicBoolean(false);
final Action0 singleSemaphoreRelease = new Action0() {
@Override
public void call() {
if (semaphoreHasBeenReleased.compareAndSet(false, true)) {
fallbackSemaphore.release();
}
}
};
Observable<R> fallbackExecutionChain;
// acquire a permit: fallbacks are limited by a semaphore, default size 10
if (fallbackSemaphore.tryAcquire()) {
try {
if (isFallbackUserDefined()) {
executionHook.onFallbackStart(this);
fallbackExecutionChain = getFallbackObservable();
} else {
//same logic as above without the hook invocation
// obtain the fallback that corresponds to the downstream call
fallbackExecutionChain = getFallbackObservable();
}
} catch (Throwable ex) {
//If hook or user-fallback throws, then use that as the result of the fallback lookup
fallbackExecutionChain = Observable.error(ex);
}
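Passing the execution exception to the fallback
With the annotation style, a fallback method may declare one extra trailing parameter, Throwable e, after the command method's parameters. When the fallback runs, e is the exception thrown by the execution.
1 Looking up the fallback method
MethodProvider's doFind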
private FallbackMethod doFind(Class<?> enclosingType, Method commandMethod, boolean extended) {
String name = getFallbackName(enclosingType, commandMethod);
Class<?>[] fallbackParameterTypes = null;
if (isDefault()) {
fallbackParameterTypes = new Class[0];
} else {
// not a default fallback: use the command method's parameter types
fallbackParameterTypes = commandMethod.getParameterTypes();
}
if (extended && fallbackParameterTypes[fallbackParameterTypes.length - 1] == Throwable.class) {
fallbackParameterTypes = ArrayUtils.remove(fallbackParameterTypes, fallbackParameterTypes.length - 1);
}
// the extended parameter types are the command method's parameters plus a trailing Throwable.class
Class<?>[] extendedFallbackParameterTypes = Arrays.copyOf(fallbackParameterTypes,
fallbackParameterTypes.length + 1);
// the trailing Throwable parameter is added here
extendedFallbackParameterTypes[fallbackParameterTypes.length] = Throwable.class;
Optional<Method> exFallbackMethod = getMethod(enclosingType, name, extendedFallbackParameterTypes);
Optional<Method> fMethod = getMethod(enclosingType, name, fallbackParameterTypes);
// 如果写的fallback方法 不是 执行方法 最后加上throable 则 返回 和执行方法一样参数的fallback方法
Method method = exFallbackMethod.or(fMethod).orNull();
if (method == null) {
throw new FallbackDefinitionException("fallback method wasn't found: " + name + "(" + Arrays.toString(fallbackParameterTypes) + ")");
}
// return the fallback; the second constructor argument is boolean extended
return new FallbackMethod(method, exFallbackMethod.isPresent(), isDefault());
}
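Note two flags:
isExtendedFallback: true. It is set to true when the fallback's parameters are the command method's parameters plus Throwable.class.
isExtendedParentFallback: false (the default value); it is not set in this call path.
2 Passing the execution exception through
(1) Saving the execution exception
The first two lines of handleFallback in AbstractCommand: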
Exception e = getExceptionFromThrowable(t);
executionResult = executionResult.setExecutionException(e);
(2) Passing the exception when the fallback method runs
getFallback in GenericCommand
protected Object getFallback() {
final CommandAction commandAction = getFallbackAction();
if (commandAction != null) {
try {
return process(new Action() {
@Override
Object execute() {
MetaHolder metaHolder = commandAction.getMetaHolder();
// getExecutionException returns the exception thrown when execution failed; createArgsForFallback adds it to the fallback method's arguments
Object[] args = createArgsForFallback(metaHolder, getExecutionException());
return commandAction.executeWithArgs(metaHolder.getFallbackExecutionType(), args);
}
});
} catch (Throwable e) {
LOGGER.error(FallbackErrorMessageBuilder.create()
.append(commandAction, e).build());
throw new FallbackInvocationException(unwrapCause(e));
}
} else {
return super.getFallback();
}
}
Adding the execution exception to the fallback method's arguments
createArgsForFallback in CommonUtils
public static Object[] createArgsForFallback(MetaHolder metaHolder, Throwable exception) {
return createArgsForFallback(metaHolder.getArgs(), metaHolder, exception);
}
public static Object[] createArgsForFallback(Object[] args, MetaHolder metaHolder, Throwable exception) {
// true if the fallback method declares a trailing Throwable
if (metaHolder.isExtendedFallback()) {
// not set in this call path, so this is false
if (metaHolder.isExtendedParentFallback()) {
args[args.length - 1] = exception;
} else {
// append the exception as the last argument
args = Arrays.copyOf(args, args.length + 1);
args[args.length - 1] = exception;
}
} else {
if (metaHolder.isExtendedParentFallback()) {
args = ArrayUtils.remove(args, args.length - 1);
}
}
return args;
}
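The lookup and argument-building steps above can be sketched end to end with plain reflection. This is a simplified illustration, not hystrix-javanica code: FallbackLookupDemo, UserService and all method names are hypothetical, and the parent-fallback case (isExtendedParentFallback) is left out.

```java
import java.lang.reflect.Method;
import java.util.Arrays;

// Hypothetical demo class, not part of Hystrix or hystrix-javanica.
class FallbackLookupDemo {

    // Example service: the fallback declares the command's parameters plus a trailing Throwable.
    static class UserService {
        public String findUser(String id) {
            throw new IllegalStateException("downstream unavailable");
        }

        public String findUserFallback(String id, Throwable cause) {
            return "fallback for " + id + " after " + cause.getMessage();
        }
    }

    // Prefers the extended signature (command params + Throwable), else the same-parameter one.
    static Method findFallback(Class<?> type, String name, Class<?>[] commandParams) {
        Class<?>[] extended = Arrays.copyOf(commandParams, commandParams.length + 1);
        extended[commandParams.length] = Throwable.class;
        try {
            return type.getMethod(name, extended);
        } catch (NoSuchMethodException noExtended) {
            try {
                return type.getMethod(name, commandParams);
            } catch (NoSuchMethodException none) {
                throw new IllegalArgumentException("fallback method not found: " + name, none);
            }
        }
    }

    // Appends the execution exception only when the fallback takes one more parameter than the command.
    static Object[] argsFor(Method fallback, Object[] commandArgs, Throwable cause) {
        if (fallback.getParameterCount() == commandArgs.length + 1) {
            Object[] args = Arrays.copyOf(commandArgs, commandArgs.length + 1);
            args[args.length - 1] = cause;
            return args;
        }
        return commandArgs;
    }

    // Runs the command and, on failure, invokes the fallback with the failure passed through.
    static String callWithFallback(UserService service, String id) {
        try {
            return service.findUser(id);
        } catch (RuntimeException e) {
            Method fallback = findFallback(UserService.class, "findUserFallback",
                    new Class<?>[] { String.class });
            try {
                return (String) fallback.invoke(service, argsFor(fallback, new Object[] { id }, e));
            } catch (ReflectiveOperationException roe) {
                throw new IllegalStateException("fallback invocation failed", roe);
            }
        }
    }
}
```

Calling callWithFallback(new FallbackLookupDemo.UserService(), "42") returns "fallback for 42 after downstream unavailable", which shows the execution exception arriving as the fallback's last argument.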
Circuit breaker
1 Key interface methods of HystrixCircuitBreaker
1 allowRequest()
/**
* Every {@link HystrixCommand} requests asks this if it is allowed to proceed or not.
* <p>
* This takes into account the half-open logic which allows some requests through when determining if it should be closed again.
*
* @return boolean whether a request should be permitted
*/
public boolean allowRequest();
2 isOpen()
/**
* Whether the circuit is currently open (tripped).
*
* @return boolean state of circuit breaker
*/
public boolean isOpen();
3 markSuccess()
/**
* Invoked on successful executions from {@link HystrixCommand} as part of feedback mechanism when in a half-open state.
*/
/* package */void markSuccess();
4 allowSingleTest() (this one is a method of the default implementation class)
public boolean allowSingleTest() {
long timeCircuitOpenedOrWasLastTested = circuitOpenedOrLastTestedTime.get();
// 1) if the circuit is open
// 2) and its been longer than 'sleepWindow' since we opened the circuit
if (circuitOpen.get() && System.currentTimeMillis() > timeCircuitOpenedOrWasLastTested + properties.circuitBreakerSleepWindowInMilliseconds().get()) {
// We push the 'circuitOpenedTime' ahead by 'sleepWindow' since we have allowed one request to try.
// If it succeeds the circuit will be closed, otherwise another singleTest will be allowed at the end of the 'sleepWindow'.
if (circuitOpenedOrLastTestedTime.compareAndSet(timeCircuitOpenedOrWasLastTested, System.currentTimeMillis())) {
// if this returns true that means we set the time so we'll return true to allow the singleTest
// if it returned false it means another thread raced us and allowed the singleTest before we did
return true;
}
}
return false;
}
2 Inner classes of HystrixCircuitBreaker
1 Factory
public static class Factory {
// String is HystrixCommandKey.name() (we can't use HystrixCommandKey directly as we can't guarantee it implements hashcode/equals correctly)
private static ConcurrentHashMap<String, HystrixCircuitBreaker> circuitBreakersByCommand = new ConcurrentHashMap<String, HystrixCircuitBreaker>();
/**
* Get the {@link HystrixCircuitBreaker} instance for a given {@link HystrixCommandKey}.
* <p>
* This is thread-safe and ensures only 1 {@link HystrixCircuitBreaker} per {@link HystrixCommandKey}.
*
* @param key
* {@link HystrixCommandKey} of {@link HystrixCommand} instance requesting the {@link HystrixCircuitBreaker}
* @param group
* Pass-thru to {@link HystrixCircuitBreaker}
* @param properties
* Pass-thru to {@link HystrixCircuitBreaker}
* @param metrics
* Pass-thru to {@link HystrixCircuitBreaker}
* @return {@link HystrixCircuitBreaker} for {@link HystrixCommandKey}
*/
public static HystrixCircuitBreaker getInstance(HystrixCommandKey key, HystrixCommandGroupKey group, HystrixCommandProperties properties, HystrixCommandMetrics metrics) {
// this should find it for all but the first time
HystrixCircuitBreaker previouslyCached = circuitBreakersByCommand.get(key.name());
if (previouslyCached != null) {
return previouslyCached;
}
// if we get here this is the first time so we need to initialize
// Create and add to the map ... use putIfAbsent to atomically handle the possible race-condition of
// 2 threads hitting this point at the same time and let ConcurrentHashMap provide us our thread-safety
// If 2 threads hit here only one will get added and the other will get a non-null response instead.
HystrixCircuitBreaker cbForCommand = circuitBreakersByCommand.putIfAbsent(key.name(), new HystrixCircuitBreakerImpl(key, group, properties, metrics));
if (cbForCommand == null) {
// this means the putIfAbsent step just created a new one so let's retrieve and return it
return circuitBreakersByCommand.get(key.name());
} else {
// this means a race occurred and while attempting to 'put' another one got there before
// and we instead retrieved it and will now return it
return cbForCommand;
}
}
/**
* Get the {@link HystrixCircuitBreaker} instance for a given {@link HystrixCommandKey} or null if none exists.
*
* @param key
* {@link HystrixCommandKey} of {@link HystrixCommand} instance requesting the {@link HystrixCircuitBreaker}
* @return {@link HystrixCircuitBreaker} for {@link HystrixCommandKey}
*/
public static HystrixCircuitBreaker getInstance(HystrixCommandKey key) {
return circuitBreakersByCommand.get(key.name());
}
/**
* Clears all circuit breakers. If new requests come in instances will be recreated.
*/
/* package */static void reset() {
circuitBreakersByCommand.clear();
}
}
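The Factory's get-then-putIfAbsent pattern can be reduced to a small generic registry. The sketch below is an illustration under assumptions: PerKeyRegistry is a hypothetical class, and it returns the created instance directly instead of reading the map again.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical generic registry, not a Hystrix class: at most one instance is kept per key.
class PerKeyRegistry<V> {
    private final ConcurrentHashMap<String, V> instances = new ConcurrentHashMap<>();

    // Same get-then-putIfAbsent shape as the Factory above.
    V getInstance(String key, Function<String, V> creator) {
        V cached = instances.get(key);
        if (cached != null) {
            return cached;
        }
        V created = creator.apply(key);
        V raced = instances.putIfAbsent(key, created);
        // null means our instance was stored; otherwise another thread won and ours is discarded.
        return raced == null ? created : raced;
    }

    // Lookup only; returns null when nothing has been registered for the key.
    V getInstance(String key) {
        return instances.get(key);
    }

    // Clears all entries; later calls create fresh instances.
    void reset() {
        instances.clear();
    }
}
```

As in the Factory, a losing thread may construct an instance that is thrown away. On Java 8 and later, ConcurrentHashMap.computeIfAbsent is an alternative that runs the creator at most once per key.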
2 Implementation class HystrixCircuitBreakerImpl: the circuit breaker returned when circuitBreakerEnabled is true
/* track whether this circuit is open/closed at any given point in time (default to false==closed) */
private AtomicBoolean circuitOpen = new AtomicBoolean(false);
/* when the circuit was marked open or was last allowed to try a 'singleTest' */
private AtomicLong circuitOpenedOrLastTestedTime = new AtomicLong();
protected HystrixCircuitBreakerImpl(HystrixCommandKey key, HystrixCommandGroupKey commandGroup, HystrixCommandProperties properties, HystrixCommandMetrics metrics) {
this.properties = properties;
this.metrics = metrics;
}
public void markSuccess() {
if (circuitOpen.get()) {
if (circuitOpen.compareAndSet(true, false)) {
//win the thread race to reset metrics
//Unsubscribe from the current stream to reset the health counts stream. This only affects the health counts view,
//and all other metric consumers are unaffected by the reset
metrics.resetStream();
}
}
}
@Override
public boolean allowRequest() {
if (properties.circuitBreakerForceOpen().get()) {
// properties have asked us to force the circuit open so we will allow NO requests
return false;
}
if (properties.circuitBreakerForceClosed().get()) {
// we still want to allow isOpen() to perform it's calculations so we simulate normal behavior
isOpen();
// properties have asked us to ignore errors so we will ignore the results of isOpen and just allow all traffic through
return true;
}
return !isOpen() || allowSingleTest();
}
public boolean allowSingleTest() {
long timeCircuitOpenedOrWasLastTested = circuitOpenedOrLastTestedTime.get();
// 1) if the circuit is open
// 2) and it's been longer than 'sleepWindow' since we opened the circuit
if (circuitOpen.get() && System.currentTimeMillis() > timeCircuitOpenedOrWasLastTested + properties.circuitBreakerSleepWindowInMilliseconds().get()) {
// We push the 'circuitOpenedTime' ahead by 'sleepWindow' since we have allowed one request to try.
// If it succeeds the circuit will be closed, otherwise another singleTest will be allowed at the end of the 'sleepWindow'.
if (circuitOpenedOrLastTestedTime.compareAndSet(timeCircuitOpenedOrWasLastTested, System.currentTimeMillis())) {
// if this returns true that means we set the time so we'll return true to allow the singleTest
// if it returned false it means another thread raced us and allowed the singleTest before we did
return true;
}
}
return false;
}
@Override
public boolean isOpen() {
if (circuitOpen.get()) {
// if we're open we immediately return true and don't bother attempting to 'close' ourself as that is left to allowSingleTest and a subsequent successful test to close
return true;
}
// we're closed, so let's see if errors have made us so we should trip the circuit open
HealthCounts health = metrics.getHealthCounts();
// check if we are past the statisticalWindowVolumeThreshold
if (health.getTotalRequests() < properties.circuitBreakerRequestVolumeThreshold().get()) {
// we are not past the minimum volume threshold for the statisticalWindow so we'll return false immediately and not calculate anything
return false;
}
if (health.getErrorPercentage() < properties.circuitBreakerErrorThresholdPercentage().get()) {
return false;
} else {
// our failure rate is too high, trip the circuit
if (circuitOpen.compareAndSet(false, true)) {
// if the previousValue was false then we want to set the currentTime
circuitOpenedOrLastTestedTime.set(System.currentTimeMillis());
return true;
} else {
// How could previousValue be true? If another thread was going through this code at the same time a race-condition could have
// caused another thread to set it to true already even though we were in the process of doing the same
// In this case, we know the circuit is open, so let the other thread set the currentTime and report back that the circuit is open
return true;
}
}
}
}
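The interplay of isOpen, allowSingleTest and markSuccess is easier to follow in a stripped-down model. The sketch below is not Hystrix's implementation: SimpleCircuitBreaker is hypothetical, it counts requests since the last reset instead of using a rolling statistical window, and the clock is injected as an assumption so that time can be controlled.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

// Hypothetical simplified breaker; counts since the last reset rather than a rolling window.
class SimpleCircuitBreaker {
    private final int requestVolumeThreshold;
    private final int errorThresholdPercentage;
    private final long sleepWindowMillis;
    private final LongSupplier clock; // injected so callers can control time

    private final AtomicBoolean open = new AtomicBoolean(false);
    private final AtomicLong openedOrLastTestedTime = new AtomicLong();
    private final AtomicInteger total = new AtomicInteger();
    private final AtomicInteger errors = new AtomicInteger();

    SimpleCircuitBreaker(int requestVolumeThreshold, int errorThresholdPercentage,
                         long sleepWindowMillis, LongSupplier clock) {
        this.requestVolumeThreshold = requestVolumeThreshold;
        this.errorThresholdPercentage = errorThresholdPercentage;
        this.sleepWindowMillis = sleepWindowMillis;
        this.clock = clock;
    }

    void recordFailure() {
        total.incrementAndGet();
        errors.incrementAndGet();
    }

    void recordSuccess() {
        total.incrementAndGet();
        // A success while open (the single test request) closes the circuit and resets the counters.
        if (open.compareAndSet(true, false)) {
            total.set(0);
            errors.set(0);
        }
    }

    boolean allowRequest() {
        return !isOpen() || allowSingleTest();
    }

    boolean isOpen() {
        if (open.get()) {
            return true;
        }
        int seen = total.get();
        if (seen == 0 || seen < requestVolumeThreshold) {
            return false; // too little traffic to judge
        }
        if (errors.get() * 100 / seen < errorThresholdPercentage) {
            return false;
        }
        // Error rate too high: trip the circuit; only the thread that wins the CAS records the time.
        if (open.compareAndSet(false, true)) {
            openedOrLastTestedTime.set(clock.getAsLong());
        }
        return true;
    }

    private boolean allowSingleTest() {
        long last = openedOrLastTestedTime.get();
        long now = clock.getAsLong();
        // After the sleep window, exactly one caller wins the CAS and may probe the dependency.
        return open.get() && now > last + sleepWindowMillis
                && openedOrLastTestedTime.compareAndSet(last, now);
    }
}
```

With a volume threshold of 4 and an error threshold of 50, four recorded failures trip the breaker. Requests are then rejected until the sleep window has passed, after which one probe is let through, and a recorded success closes the circuit again.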
3 NoOpCircuitBreaker: the implementation used when the circuit breaker is disabled
@Override
public boolean allowRequest() {
return true;
}
@Override
public boolean isOpen() {
return false;
}
@Override
public void markSuccess() {
}
}
Call flow
[Figures: three call-flow diagrams; images unavailable]
Metrics and statistics
https://www.sczyh30.com/posts/%E9%AB%98%E5%8F%AF%E7%94%A8%E6%9E%B6%E6%9E%84/netflix-hystrix-1-5-sliding-window/
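Common fallback patterns
Fail Fast
Fail fast is the most basic way to execute a command: the command does not override the fallback logic. If any kind of failure occurs during execution, the exception is thrown directly.
Fail Silent
The fallback method returns null, an empty Map, an empty List, or a similar empty response.
Fallback: Static
The fallback method returns a static default value. Instead of the feature being dropped, as happens with fail silent, default behavior occurs. For example, if the application branches on a true/false result from the command and the command fails, the result defaults to true.
Fallback: Stubbed
A stubbed fallback is suitable when the command returns a compound object with multiple fields; fields that cannot be derived from the request are set to default values.
Fallback: Cache via Network
Sometimes, when a call to a dependency fails, a stale version of the data can be read from a cache service such as Redis. Because the fallback itself makes another remote call, wrap it in a separate Command with a different ThreadPoolKey so that it is isolated from the primary command's thread pool.
Primary + Secondary with Fallback
Sometimes a system has two behaviors: primary and secondary, or primary and failover. The primary and secondary logic involve different network calls and business logic, so each should be wrapped in its own Command and isolated by thread pool. To switch between them, invoke the primary and secondary commands from the run method of a facade HystrixCommand and select the path with a switch set in a configuration center. Because both underlying commands are already thread-pool isolated, the facade HystrixCommand can use semaphore isolation and avoid the unnecessary overhead of another thread pool.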
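The Fail Silent, Static and Stubbed patterns differ only in what the fallback returns. A plain-Java sketch of the three follows; FallbackPatterns, withFallback, UserProfile and the default values are hypothetical, and with Hystrix the fallback bodies would live in getFallback() of the respective command.

```java
import java.util.Collections;
import java.util.List;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical plain-Java helpers; they only illustrate what each fallback returns.
class FallbackPatterns {

    // Runs the primary call and, on any runtime failure, returns whatever the fallback produces.
    static <T> T withFallback(Supplier<T> primary, Function<Throwable, T> fallback) {
        try {
            return primary.get();
        } catch (RuntimeException e) {
            return fallback.apply(e);
        }
    }

    // Compound result used by the stubbed example.
    static class UserProfile {
        final String userId;
        final String countryCode;
        final boolean premium;

        UserProfile(String userId, String countryCode, boolean premium) {
            this.userId = userId;
            this.countryCode = countryCode;
            this.premium = premium;
        }
    }

    // Fail Silent: an empty list stands in for the missing data.
    static List<String> recommendations(Supplier<List<String>> remote) {
        return withFallback(remote, e -> Collections.<String>emptyList());
    }

    // Fallback: Static: a fixed default decides the branch when the call fails.
    static boolean featureEnabled(Supplier<Boolean> remote) {
        return withFallback(remote, e -> Boolean.TRUE);
    }

    // Fallback: Stubbed: keep what the request already knows, fill the rest with assumed defaults.
    static UserProfile profile(String userId, Supplier<UserProfile> remote) {
        return withFallback(remote, e -> new UserProfile(userId, "US", false));
    }
}
```

When the remote supplier throws, recommendations returns an empty list, featureEnabled returns true, and profile returns an object that keeps the requested userId while the remaining fields carry the stub defaults.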
Main parameters
https://blog.csdn.net/tongtong_use/article/details/78611225 (one statement in that post is questionable)