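The previous article covered some small pitfalls I ran into when integrating Hystrix. One of them, the way the Hystrix thread pool works, has been a persistent headache. This article presents an approach to overriding the Hystrix thread pool so that it behaves much like Tomcat's native thread pool.
The problem with the Hystrix thread pool
By default Hystrix uses Java's ThreadPoolExecutor. Tomcat creates threads as requests arrive until it reaches max and only then starts queuing. ThreadPoolExecutor does the opposite: once concurrency exceeds coreSize, requests go into the Hystrix queue first, and new threads are created only after the queue is full. As a result, in the vast majority of cases the Hystrix thread pool stays fixed at coreSize, which both wastes the configured thread capacity and copes poorly with bursts of concurrency.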
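The queue-first behaviour described above can be reproduced with the JDK alone. The following is a minimal sketch, not Hystrix code: the class name QueueFirstDemo and the numbers passed to it are made up for illustration. It submits tasks that block on a latch and then reads the pool size and queue length.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; only JDK APIs are used.
class QueueFirstDemo {
    /** Submits blocking tasks and returns {poolSize, queueSize} while all of them are still pending. */
    static int[] observe(int core, int max, int queueCapacity, int tasks) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                core, max, 1, TimeUnit.MINUTES, new LinkedBlockingQueue<Runnable>(queueCapacity));
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> awaitQuietly(release));
            }
            return new int[] {pool.getPoolSize(), pool.getQueue().size()};
        } finally {
            release.countDown(); // let the blocked tasks finish
            pool.shutdown();
        }
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

With core = 2, max = 10 and a queue capacity of 100, submitting 5 blocking tasks leaves the pool at 2 threads with 3 tasks waiting in the queue, even though 8 more threads are allowed.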
Source code analysis
Let's look at the class that creates the thread pool when Hystrix runs in thread pool isolation mode:
com.netflix.hystrix.strategy.concurrency.HystrixConcurrencyStrategy
When Hystrix creates a thread pool, it uses the default implementation, com.netflix.hystrix.strategy.concurrency.HystrixConcurrencyStrategyDefault, which inherits the following methods:
public abstract class HystrixConcurrencyStrategy {
private final static Logger logger = LoggerFactory.getLogger(HystrixConcurrencyStrategy.class);
public ThreadPoolExecutor getThreadPool(final HystrixThreadPoolKey threadPoolKey, HystrixProperty<Integer> corePoolSize, HystrixProperty<Integer> maximumPoolSize, HystrixProperty<Integer> keepAliveTime, TimeUnit unit, BlockingQueue<Runnable> workQueue) {
final ThreadFactory threadFactory = getThreadFactory(threadPoolKey);
final int dynamicCoreSize = corePoolSize.get();
final int dynamicMaximumSize = maximumPoolSize.get();
if (dynamicCoreSize > dynamicMaximumSize) {
logger.error("Hystrix ThreadPool configuration at startup for : " + threadPoolKey.name() + " is trying to set coreSize = " +
dynamicCoreSize + " and maximumSize = " + dynamicMaximumSize + ". Maximum size will be set to " +
dynamicCoreSize + ", the coreSize value, since it must be equal to or greater than the coreSize value");
return new ThreadPoolExecutor(dynamicCoreSize, dynamicCoreSize, keepAliveTime.get(), unit, workQueue, threadFactory);
} else {
return new ThreadPoolExecutor(dynamicCoreSize, dynamicMaximumSize, keepAliveTime.get(), unit, workQueue, threadFactory);
}
}
public ThreadPoolExecutor getThreadPool(final HystrixThreadPoolKey threadPoolKey, HystrixThreadPoolProperties threadPoolProperties) {
final ThreadFactory threadFactory = getThreadFactory(threadPoolKey);
final boolean allowMaximumSizeToDivergeFromCoreSize = threadPoolProperties.getAllowMaximumSizeToDivergeFromCoreSize().get();
final int dynamicCoreSize = threadPoolProperties.coreSize().get();
final int keepAliveTime = threadPoolProperties.keepAliveTimeMinutes().get();
final int maxQueueSize = threadPoolProperties.maxQueueSize().get();
final BlockingQueue<Runnable> workQueue = getBlockingQueue(maxQueueSize);
if (allowMaximumSizeToDivergeFromCoreSize) {
final int dynamicMaximumSize = threadPoolProperties.maximumSize().get();
if (dynamicCoreSize > dynamicMaximumSize) {
logger.error("Hystrix ThreadPool configuration at startup for : " + threadPoolKey.name() + " is trying to set coreSize = " +
dynamicCoreSize + " and maximumSize = " + dynamicMaximumSize + ". Maximum size will be set to " +
dynamicCoreSize + ", the coreSize value, since it must be equal to or greater than the coreSize value");
return new ThreadPoolExecutor(dynamicCoreSize, dynamicCoreSize, keepAliveTime, TimeUnit.MINUTES, workQueue, threadFactory);
} else {
return new ThreadPoolExecutor(dynamicCoreSize, dynamicMaximumSize, keepAliveTime, TimeUnit.MINUTES, workQueue, threadFactory);
}
} else {
return new ThreadPoolExecutor(dynamicCoreSize, dynamicCoreSize, keepAliveTime, TimeUnit.MINUTES, workQueue, threadFactory);
}
}
private static ThreadFactory getThreadFactory(final HystrixThreadPoolKey threadPoolKey) {
if (!PlatformSpecific.isAppEngineStandardEnvironment()) {
return new ThreadFactory() {
private final AtomicInteger threadNumber = new AtomicInteger(0);
@Override
public Thread newThread(Runnable r) {
Thread thread = new Thread(r, "hystrix-" + threadPoolKey.name() + "-" + threadNumber.incrementAndGet());
thread.setDaemon(true);
return thread;
}
};
} else {
return PlatformSpecific.getAppEngineThreadFactory();
}
}
public BlockingQueue<Runnable> getBlockingQueue(int maxQueueSize) {
/*
* We are using SynchronousQueue if maxQueueSize <= 0 (meaning a queue is not wanted).
* <p>
* SynchronousQueue will do a handoff from calling thread to worker thread and not allow queuing which is what we want.
* <p>
* Queuing results in added latency and would only occur when the thread-pool is full at which point there are latency issues
* and rejecting is the preferred solution.
*/
if (maxQueueSize <= 0) {
return new SynchronousQueue<Runnable>();
} else {
return new LinkedBlockingQueue<Runnable>(maxQueueSize);
}
}
public <T> Callable<T> wrapCallable(Callable<T> callable) {
return callable;
}
public <T> HystrixRequestVariable<T> getRequestVariable(final HystrixRequestVariableLifecycle<T> rv) {
return new HystrixLifecycleForwardingRequestVariable<T>(rv);
}
}
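As the code shows, getThreadPool creates the Hystrix thread pool, a plain ThreadPoolExecutor by default, and calls getBlockingQueue internally to create the work queue. (getBlockingQueue is declared as a separate public method, which makes it possible to override just the queue from outside.) When the configuration enables queuing (maxQueueSize > 0), a LinkedBlockingQueue is used; otherwise a SynchronousQueue is used.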
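To make the difference between the two queue types concrete, here is a small JDK-only sketch. QueueChoiceDemo is a hypothetical class: choose() copies the selection rule from getBlockingQueue quoted above, and accepted() counts how many tasks a queue takes when no worker is consuming from it.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;

// Hypothetical demo class; only JDK APIs are used.
class QueueChoiceDemo {
    /** Same rule as getBlockingQueue: no capacity means direct hand-off, otherwise a bounded queue. */
    static BlockingQueue<Runnable> choose(int maxQueueSize) {
        if (maxQueueSize <= 0) {
            return new SynchronousQueue<Runnable>();
        }
        return new LinkedBlockingQueue<Runnable>(maxQueueSize);
    }

    /** Counts how many no-op tasks the queue accepts while nothing takes from it. */
    static int accepted(BlockingQueue<Runnable> queue, int attempts) {
        int count = 0;
        for (int i = 0; i < attempts; i++) {
            if (queue.offer(() -> { })) {
                count++;
            }
        }
        return count;
    }
}
```

A SynchronousQueue accepts nothing unless a worker is already waiting, so every offer fails and the pool has to start a thread or reject. A LinkedBlockingQueue accepts tasks up to its capacity, which is exactly what keeps the pool from growing past coreSize.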
So where does HystrixConcurrencyStrategyDefault get created? Let's look at
com.netflix.hystrix.strategy.HystrixPlugins
Its main methods are:
public void registerConcurrencyStrategy(HystrixConcurrencyStrategy impl) {
if (!concurrencyStrategy.compareAndSet(null, impl)) {
throw new IllegalStateException("Another strategy was already registered.");
}
}
public HystrixConcurrencyStrategy getConcurrencyStrategy() {
if (concurrencyStrategy.get() == null) {
// check for an implementation from Archaius first
Object impl = getPluginImplementation(HystrixConcurrencyStrategy.class);
if (impl == null) {
// nothing set via Archaius so initialize with default
concurrencyStrategy.compareAndSet(null, HystrixConcurrencyStrategyDefault.getInstance());
// we don't return from here but call get() again in case of thread-race so the winner will always get returned
} else {
// we received an implementation from Archaius so use it
concurrencyStrategy.compareAndSet(null, (HystrixConcurrencyStrategy) impl);
}
}
return concurrencyStrategy.get();
}
private <T> T getPluginImplementation(Class<T> pluginClass) {
T p = getPluginImplementationViaProperties(pluginClass, dynamicProperties);
if (p != null) return p;
return findService(pluginClass, classLoader);
}
private static <T> T getPluginImplementationViaProperties(Class<T> pluginClass, HystrixDynamicProperties dynamicProperties) {
String classSimpleName = pluginClass.getSimpleName();
// Check Archaius for plugin class.
String propertyName = "hystrix.plugin." + classSimpleName + ".implementation";
String implementingClass = dynamicProperties.getString(propertyName, null).get();
if (implementingClass != null) {
try {
Class<?> cls = Class.forName(implementingClass);
// narrow the scope (cast) to the type we're expecting
cls = cls.asSubclass(pluginClass);
return (T) cls.newInstance();
} catch (ClassCastException e) {
throw new RuntimeException(classSimpleName + " implementation is not an instance of " + classSimpleName + ": " + implementingClass);
} catch (ClassNotFoundException e) {
throw new RuntimeException(classSimpleName + " implementation class not found: " + implementingClass, e);
} catch (InstantiationException e) {
throw new RuntimeException(classSimpleName + " implementation not able to be instantiated: " + implementingClass, e);
} catch (IllegalAccessException e) {
throw new RuntimeException(classSimpleName + " implementation not able to be accessed: " + implementingClass, e);
}
} else {
return null;
}
}
private static <T> T findService(
Class<T> spi,
ClassLoader classLoader) throws ServiceConfigurationError {
ServiceLoader<T> sl = ServiceLoader.load(spi,
classLoader);
for (T s : sl) {
if (s != null)
return s;
}
return null;
}
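The main logic is as follows. HystrixPlugins resolves the set of implementations Hystrix uses (HystrixConcurrencyStrategy, HystrixPropertiesStrategy, and so on) and offers methods for registering them from outside. A HystrixConcurrencyStrategy can come from several places. If the property hystrix.plugin.HystrixConcurrencyStrategy.implementation names a class, that class is used. Otherwise ServiceLoader (the Java SPI mechanism, loosely comparable to Spring IoC) is asked for an externally supplied instance. If none is found, HystrixConcurrencyStrategyDefault is used. In addition, registerConcurrencyStrategy lets you register a HystrixConcurrencyStrategy from outside to replace the default. It only succeeds while no strategy has been set yet; otherwise it throws IllegalStateException.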
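The lookup order and the "first one wins" rule can be modelled in a few lines. The sketch below is a simplified stand-in, not the Hystrix class: StrategyLookupSketch is a made-up name, and plain strings take the place of strategy instances and of the values that the property lookup and ServiceLoader would return.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical model of the resolution logic; strings stand in for strategy instances.
class StrategyLookupSketch {
    private final AtomicReference<String> chosen = new AtomicReference<>();

    /** Explicit registration succeeds only while nothing has been set or resolved yet. */
    void register(String impl) {
        if (!chosen.compareAndSet(null, impl)) {
            throw new IllegalStateException("Another strategy was already registered.");
        }
    }

    /** Order: already registered value, then property, then SPI, then the default. */
    String resolve(String fromProperty, String fromServiceLoader) {
        if (chosen.get() == null) {
            String impl = fromProperty != null ? fromProperty : fromServiceLoader;
            // compareAndSet keeps the winner stable if two threads race here
            chosen.compareAndSet(null, impl != null ? impl : "Default");
        }
        return chosen.get();
    }
}
```

One practical consequence follows from the model: an explicit registration has to happen before the first lookup, because the first lookup pins whatever it finds.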
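The override approach
The source shows that Hystrix's threading behaviour is largely determined by the ThreadPoolExecutor implementation. We only need to inject a custom HystrixConcurrencyStrategy to swap in a different ThreadPoolExecutor and thereby change how Hystrix schedules work.
Designing the thread pool and task queue
I modelled the design on the EagerThreadPool built into Alibaba's Dubbo, with some modifications, and wrote a custom thread pool, MyThreadPoolExecutor, and a custom task queue, MyThreadPoolTaskQueue.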
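/**
* Custom thread pool.
* While the maximum thread count has not been reached, threads are created first; tasks are queued only once it has been reached.
* Code adapted from Dubbo's EagerThreadPool.
*/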
public class MyThreadPoolExecutor extends ThreadPoolExecutor {
/**
* Total number of submitted tasks (queued or currently executing).
*/
private final AtomicInteger submittedTaskCount = new AtomicInteger(0);
public MyThreadPoolExecutor(int corePoolSize,
int maximumPoolSize,
long keepAliveTime,
TimeUnit unit,
int maxQueueSize,
ThreadFactory threadFactory) {
this(corePoolSize, maximumPoolSize, keepAliveTime, unit, new MyThreadPoolTaskQueue<>(maxQueueSize), threadFactory);
}
public MyThreadPoolExecutor(int corePoolSize,
int maximumPoolSize,
long keepAliveTime,
TimeUnit unit, MyThreadPoolTaskQueue<Runnable> workQueue,
ThreadFactory threadFactory) {
this(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue, threadFactory, new AbortPolicy());
}
public MyThreadPoolExecutor(int corePoolSize,
int maximumPoolSize,
long keepAliveTime,
TimeUnit unit,
int maxQueueSize,
ThreadFactory threadFactory,
RejectedExecutionHandler handler) {
this(corePoolSize, maximumPoolSize, keepAliveTime, unit, new MyThreadPoolTaskQueue<>(maxQueueSize), threadFactory, handler);
}
public MyThreadPoolExecutor(int corePoolSize,
int maximumPoolSize,
long keepAliveTime,
TimeUnit unit, MyThreadPoolTaskQueue<Runnable> workQueue,
ThreadFactory threadFactory,
RejectedExecutionHandler handler) {
super(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue, threadFactory, handler);
workQueue.bind(this);
}
/**
* Returns the total number of submitted tasks (queued or currently executing).
*
* @return the current submitted task count
*/
public int getSubmittedTaskCount() {
return submittedTaskCount.get();
}
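/**
* Decrements the task counter after a task has finished executing.
*
* @param r the task that has completed
* @param t the exception that caused termination, or null if the task completed normally
*/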
@Override
protected void afterExecute(Runnable r, Throwable t) {
submittedTaskCount.decrementAndGet();
}
@Override
public void execute(Runnable command) {
if (command == null) {
throw new NullPointerException();
}
// Increment the task count
submittedTaskCount.incrementAndGet();
try {
// Execute the task
super.execute(command);
} catch (RejectedExecutionException rx) {
// A rejection may mean the queue is full or that no new thread could be created. Since we cannot be sure the queue is full, retry enqueuing
final MyThreadPoolTaskQueue queue = (MyThreadPoolTaskQueue) super.getQueue();
try {
if (!queue.retryOffer(command, 0, TimeUnit.MILLISECONDS)) {
submittedTaskCount.decrementAndGet();
throw new RejectedExecutionException("Thread pool task queue is full; the task cannot be executed", rx);
}
} catch (InterruptedException x) {
submittedTaskCount.decrementAndGet();
throw new RejectedExecutionException(x);
}
} catch (Throwable t) {
submittedTaskCount.decrementAndGet();
throw t;
}
}
}
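The key change is the overridden execute method, which tracks the number of tasks currently submitted to the pool. MyThreadPoolTaskQueue uses that count when deciding whether to accept a task. In the stock ThreadPoolExecutor, once the thread count reaches coreSize the pool first calls the queue's offer method. If the offer succeeds, nothing more happens; only if it fails (queue full) does the pool try to create a new thread. So we need to override the queue so that offer deliberately fails while the pool's thread count is still below maxSize.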
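/**
* Queue dedicated to the custom thread pool.
* It makes the pool prefer creating threads over queuing tasks.
* Adapted from the TaskQueue behind Dubbo's EagerThreadPool, with better compatibility and fault tolerance.
*
* @param <T> the task type
*/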
public class MyThreadPoolTaskQueue<T extends Runnable> extends LinkedBlockingQueue<Runnable> {
Logger logger = LoggerFactory.getLogger(MyThreadPoolTaskQueue.class);
private MyThreadPoolExecutor executor;
public MyThreadPoolTaskQueue(int capacity) {
super(capacity);
}
public void bind(MyThreadPoolExecutor exec) {
if (executor == null) {
this.executor = exec;
} else {
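// Binding an executor twice is most likely a bug in the calling code; to avoid subtle defects, one queue must not be bound to more than one executor
throw new IllegalArgumentException("[MyThreadPoolTaskQueue] must not be bound to a [MyThreadPoolExecutor] more than once");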
}
}
/**
* Whether a thread pool has been bound to this queue.
*
* @return true if an executor is bound
*/
public boolean binded() {
return this.executor != null;
}
private boolean loged = false;
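/**
* Enqueues a task. The offer can fail in two cases:
* 1. The thread count is below the maximum, so a new thread should run the task instead.
* 2. The thread count has reached the maximum and the queue is full as well.
* For performance, no lock is taken here: under contention slightly more threads may be created than there are concurrent tasks, but never more than the maximum.
*
* @param runnable the task to enqueue
* @return true if the task was queued
*/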
@Override
public boolean offer(Runnable runnable) {
if (executor == null) {
if (!loged) {
logger.warn("[MyThreadPoolTaskQueue] has no bound thread pool; falling back to default queue behaviour");
loged = true;
}
return super.offer(runnable);
}
int currentPoolThreadSize = executor.getPoolSize();
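// Submitted tasks < current pool size: at least one worker is idle, so enqueue the task and let an idle worker pick it up
if (executor.getSubmittedTaskCount() < currentPoolThreadSize) {
return super.offer(runnable);
}
// Current pool size < maximum pool size: return false so that ThreadPoolExecutor tries to create a new thread for the task (the key step). Under concurrency the pool may reach the maximum at that instant, in which case thread creation fails
if (currentPoolThreadSize < executor.getMaximumPoolSize()) {
return false;
}
// Current pool size >= maximum pool size: enqueue the task. If the queue is full the offer fails, ThreadPoolExecutor's attempt to add a thread is bound to fail too, and RejectedExecutionException is thrown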
return super.offer(runnable);
}
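/**
* Retries enqueuing a task.
* When #offer(Runnable) returns false and no thread can be added, ThreadPoolExecutor throws RejectedExecutionException. The queue is not necessarily full at that point, so enqueuing is retried.
*
* @param o the task
* @param timeout how long to wait for free space
* @param unit the unit of timeout
* @return true if the task was queued
* @throws InterruptedException if interrupted while waiting
*/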
public boolean retryOffer(Runnable o, long timeout, TimeUnit unit) throws InterruptedException {
if (executor.isShutdown()) {
throw new RejectedExecutionException("Thread pool has been shut down");
}
return super.offer(o, timeout, unit);
}
}
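The key change here is the overridden offer method: while the pool's thread count is below maxSize and no worker is idle, the offer fails on purpose, forcing the pool to create a new thread.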
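The effect of a deliberately failing offer can be checked with a stripped-down, JDK-only version of the idea. This is a sketch under simplifying assumptions, not the classes above: EagerQueueSketch and GrowFirstQueue are hypothetical names, and the submitted-task counter (the idle-worker shortcut) is left out so that only the grow-first rule remains.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical, simplified model of the eager queue; only JDK APIs are used.
class EagerQueueSketch {
    /** Queue that refuses tasks while the pool can still grow, which forces a new worker. */
    static class GrowFirstQueue extends LinkedBlockingQueue<Runnable> {
        private volatile ThreadPoolExecutor pool;

        GrowFirstQueue(int capacity) {
            super(capacity);
        }

        @Override
        public boolean offer(Runnable task) {
            ThreadPoolExecutor p = pool;
            if (p != null && p.getPoolSize() < p.getMaximumPoolSize()) {
                return false; // pretend to be full so execute() tries to add a thread
            }
            return super.offer(task);
        }
    }

    /** Submits blocking tasks and returns {poolSize, queueSize} while all of them are still pending. */
    static int[] observe(int core, int max, int capacity, int tasks) {
        GrowFirstQueue queue = new GrowFirstQueue(capacity);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(core, max, 1, TimeUnit.MINUTES, queue);
        queue.pool = pool;
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> awaitQuietly(release));
            }
            return new int[] {pool.getPoolSize(), queue.size()};
        } finally {
            release.countDown(); // let the blocked tasks finish
            pool.shutdown();
        }
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

With core = 2, max = 4 and a queue capacity of 10, submitting 5 blocking tasks grows the pool to 4 threads and queues only the fifth task. A stock LinkedBlockingQueue with the same settings would keep 2 threads and queue 3 tasks.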
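Overriding HystrixConcurrencyStrategy
Given how HystrixConcurrencyStrategy is designed, there are two options. One is to override it directly and inject the subclass in place of HystrixConcurrencyStrategyDefault. The other is to follow implementations such as SleuthHystrixConcurrencyStrategy in Spring Cloud and wrap the existing default implementation. For compatibility with other frameworks, I chose the first option and replaced the underlying default implementation directly.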
/**
* Hystrix concurrency strategy backed by the custom thread pool.
* Loaded by HystrixPlugins via ServiceLoader (SPI).
*
*/
public class MyHystrixConcurrencyStrategy extends HystrixConcurrencyStrategy {
Logger logger = LoggerFactory.getLogger(MyHystrixConcurrencyStrategy.class);
@Override
public ThreadPoolExecutor getThreadPool(HystrixThreadPoolKey threadPoolKey,
HystrixProperty<Integer> corePoolSize,
HystrixProperty<Integer> maximumPoolSize,
HystrixProperty<Integer> keepAliveTime, TimeUnit unit,
BlockingQueue<Runnable> workQueue) {
final int dynamicCoreSize = corePoolSize.get();
final int dynamicMaximumSize = maximumPoolSize.get();
// Use the custom pool only if coreSize < maximumSize and the queue is a MyThreadPoolTaskQueue that is not bound yet (the constructor binds it and rejects a second binding). workQueue comes from getBlockingQueue(int maxQueueSize), which another delegate or AOP may have overridden, so check defensively
if (dynamicCoreSize < dynamicMaximumSize && workQueue instanceof MyThreadPoolTaskQueue && !((MyThreadPoolTaskQueue) workQueue).binded()) {
return new MyThreadPoolExecutor(dynamicCoreSize, dynamicMaximumSize, keepAliveTime.get(), unit, (MyThreadPoolTaskQueue) workQueue, this.getThreadFactory(threadPoolKey));
} else {
logger.warn("Hystrix thread pool [{}] is not configured suitably for the [custom thread pool]; using the default thread pool", threadPoolKey.name());
return super.getThreadPool(threadPoolKey, corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue);
}
}
@Override
public ThreadPoolExecutor getThreadPool(HystrixThreadPoolKey threadPoolKey,
HystrixThreadPoolProperties threadPoolProperties) {
final boolean allowMaximumSizeToDivergeFromCoreSize = threadPoolProperties.getAllowMaximumSizeToDivergeFromCoreSize().get();
final int maxQueueSize = threadPoolProperties.maxQueueSize().get();
final int dynamicMaximumSize = threadPoolProperties.maximumSize().get();
final int dynamicCoreSize = threadPoolProperties.coreSize().get();
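// Use the optimized pool only if the pool size may diverge from coreSize, the queue size is > 0, and maximumSize > coreSize; otherwise use the default pool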
if (allowMaximumSizeToDivergeFromCoreSize && maxQueueSize > 0 && dynamicCoreSize < dynamicMaximumSize) {
final BlockingQueue<Runnable> workQueue = this.getBlockingQueue(maxQueueSize);
// workQueue comes from getBlockingQueue(int maxQueueSize), which another delegate or AOP may have overridden, so check defensively
if (workQueue instanceof MyThreadPoolTaskQueue && !((MyThreadPoolTaskQueue) workQueue).binded()) {
final MyThreadPoolTaskQueue myThreadPoolTaskQueue = (MyThreadPoolTaskQueue) workQueue;
final int keepAliveTime = threadPoolProperties.keepAliveTimeMinutes().get();
final ThreadFactory threadFactory = this.getThreadFactory(threadPoolKey);
return new MyThreadPoolExecutor(dynamicCoreSize, dynamicMaximumSize, keepAliveTime, TimeUnit.MINUTES, myThreadPoolTaskQueue, threadFactory);
} else {
logger.warn("[Custom thread pool] queue was overridden or is already in use; the [custom thread pool] cannot take effect, using the Hystrix default thread pool");
return super.getThreadPool(threadPoolKey, threadPoolProperties);
}
} else {
logger.warn("Hystrix thread pool [{}] is not configured suitably for the [custom thread pool]; using the default thread pool", threadPoolKey.name());
return super.getThreadPool(threadPoolKey, threadPoolProperties);
}
}
private static ThreadFactory getThreadFactory(final HystrixThreadPoolKey threadPoolKey) {
if (!PlatformSpecific.isAppEngineStandardEnvironment()) {
return new ThreadFactory() {
private final AtomicInteger threadNumber = new AtomicInteger(0);
@Override
public Thread newThread(Runnable r) {
Thread thread = new Thread(r, "my-hystrix-" + threadPoolKey.name() + "-" + threadNumber.incrementAndGet());
thread.setDaemon(true);
return thread;
}
};
} else {
return PlatformSpecific.getAppEngineThreadFactory();
}
}
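/**
* Overrides queue creation.
*
* @param maxQueueSize the maximum queue size from the Hystrix configuration
* @return a SynchronousQueue if maxQueueSize <= 0, otherwise a MyThreadPoolTaskQueue
*/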
@Override
public BlockingQueue<Runnable> getBlockingQueue(int maxQueueSize) {
if (maxQueueSize <= 0) {
return new SynchronousQueue<>();
} else {
return new MyThreadPoolTaskQueue<>(maxQueueSize);
}
}
}
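The idea behind this class is: when the configuration does not allow the pool to resize dynamically (including cases such as coreSize equal to maxSize), keep using Hystrix's original thread pool and queue as far as possible; once dynamic sizing is enabled, use our custom thread pool and queue in place of the defaults.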
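For reference, the custom pool is only selected when three conditions in the code above hold at once: allowMaximumSizeToDivergeFromCoreSize is true, maxQueueSize is greater than 0, and coreSize is less than maximumSize. A configuration fragment that satisfies them might look like the following. The property names are Hystrix's standard thread pool properties; the values are purely illustrative assumptions, not recommendations from this article.

```properties
# Illustrative values only; tune for your own workload
hystrix.threadpool.default.coreSize=10
hystrix.threadpool.default.maximumSize=50
hystrix.threadpool.default.allowMaximumSizeToDivergeFromCoreSize=true
hystrix.threadpool.default.maxQueueSize=100
# Hystrix also rejects once the queue reaches this threshold, so keep it in line with maxQueueSize
hystrix.threadpool.default.queueSizeRejectionThreshold=100
```

If any of the three conditions is not met, the strategy logs a warning and falls back to the stock ThreadPoolExecutor.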
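Injecting MyHystrixConcurrencyStrategy
SPI is used for injection here (SPI itself is not covered in detail).
Under the resources folder, create a META-INF folder, create a services folder inside it, and add a file named com.netflix.hystrix.strategy.concurrency.HystrixConcurrencyStrategy
Its content is the fully qualified name of our class: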
com.rytech.hystrix.MyHystrixConcurrencyStrategy