Production Practice: Thread Pools and Asynchronous Task Orchestration

Latest recommended article published on 2024-05-20 15:11:39

winsonWu1996

Tags: java, programming language

Article link: https://blog.csdn.net/weixin_46522694/article/details/124603217

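Most servers we use today are **multi-processor, multi-core** machines with ample resources. To **make full use of server performance**, **decouple the calling thread from asynchronous threads**, and **improve response time**, **concurrent programming** is a natural choice. This article covers the **thread pool** provided by the `JDK` and uses a file upload example to show how to put it to work.
## **1. Introduction to Thread Pools**
The core thread pool implementation in the JDK is `ThreadPoolExecutor`. Its class hierarchy, viewed with ***IDEA Show Diagrams***, is as follows: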

![1649818738(1).png](https://p1-juejin.byteimg.com/tos-cn-i-k3u1fbpfcp/9be889ab859d41f781d9176354b0d899~tplv-k3u1fbpfcp-watermark.image?)

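- The top-level interface `Executor` provides a single method, `void execute(Runnable command)`, which decouples task definition from task execution. Users only need to define a `Runnable` task.
- `ExecutorService` extends `Executor`. On top of task execution it adds `<T> Future<T> submit(Callable<T> task)`, which returns a result, along with management features such as **batch execution of asynchronous tasks** and **starting and stopping the pool**.
- `AbstractExecutorService` implements `ExecutorService` and acts as a task template: it **wires together the task execution flow** so that concrete subclasses only need to focus on how a task is executed.
- `ThreadPoolExecutor` implements **task management**, **thread management**, and **pool lifecycle management**.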
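
The difference between `execute` and `submit` described in the list above can be sketched as follows. This is only an illustration: the class name `ExecutorBasicsDemo`, the helper `runBoth`, and the pool size are assumptions made for the example and do not come from the original article.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ExecutorBasicsDemo {

    /** Runs one task via execute() and one via submit(), returning the submit() result. */
    public static int runBoth(AtomicInteger counter) throws Exception {
        // A small fixed pool is enough for a demo; it is shut down in the finally block.
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // Executor.execute: fire-and-forget, the caller gets no handle to a result.
            pool.execute(counter::incrementAndGet);
            // ExecutorService.submit: returns a Future carrying the Callable's result.
            Future<Integer> future = pool.submit(() -> 6 * 7);
            return future.get(5, TimeUnit.SECONDS);
        } finally {
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.SECONDS);
        }
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger counter = new AtomicInteger();
        System.out.println("submit() result: " + runBoth(counter));
        System.out.println("execute() side effect: " + counter.get());
    }
}
```

The task passed to `execute` can only communicate through side effects (here, a counter), while `submit` hands back a `Future` from which the result can be read.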
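## **2. Task Execution Flow**
Next, let's walk through the thread pool's default execution flow in the source code:
```
...
    // Read ctl: the high 3 bits hold the run state, the low 29 bits hold the worker count
    int c = ctl.get();
    // Fewer workers than corePoolSize: try to create a thread
    if (workerCountOf(c) < corePoolSize) {
        // If the worker count and run state are as expected, a new worker is added
        if (addWorker(command, true))
            return;
        c = ctl.get();
    }
    // Worker count is at or above corePoolSize: check the run state and try to enqueue the task
    if (isRunning(c) && workQueue.offer(command)) {
        int recheck = ctl.get();
        // Re-check the run state; if it is no longer running (e.g. shutdownNow was called),
        // remove the task and invoke the rejection policy
        if (!isRunning(recheck) && remove(command))
            // Invoke the rejection policy
            reject(command);
        // If there are no workers, start one.
        // Edge case: all threads were reclaimed just as the task was enqueued.
        else if (workerCountOf(recheck) == 0)
            // Add a thread
            addWorker(null, false);
    }
    // Worker count is at or above corePoolSize and the queue is full: try to add a thread
    else if (!addWorker(command, false))
        // Adding the thread failed: invoke the rejection policy
        reject(command);
}
```
Flowchart: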

![image.png](https://p9-juejin.byteimg.com/tos-cn-i-k3u1fbpfcp/0c2cb876f8994411a8d94c9e24c2e2b4~tplv-k3u1fbpfcp-watermark.image?)

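Note that this is the thread pool's **default** execution flow, which suits **CPU-bound** applications. Quite a few middleware projects, such as `Tomcat`, `Netty`, and `Dubbo`, build their own variants on top of `ThreadPoolExecutor`. `Tomcat`, for example, changes the flow to **grow the pool to its maximum size first and only then enqueue tasks**, which **reduces the resources wasted while threads are blocked in IO-bound applications**.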
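
The "threads first, queue second" idea can be approximated with a queue whose `offer` refuses tasks while the pool can still grow, which pushes `execute` into its `addWorker(command, false)` branch. The sketch below is a simplified illustration and not Tomcat's actual code: `EagerPoolDemo`, `EagerQueue`, `forceOffer`, and `observe` are hypothetical names, and a real implementation would also need to account for idle threads before deciding to create a new one.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class EagerPoolDemo {

    /** Hypothetical queue that refuses tasks while the pool can still grow. */
    static class EagerQueue extends LinkedBlockingQueue<Runnable> {
        private transient volatile ThreadPoolExecutor owner;

        EagerQueue(int capacity) { super(capacity); }

        void setOwner(ThreadPoolExecutor owner) { this.owner = owner; }

        @Override
        public boolean offer(Runnable task) {
            ThreadPoolExecutor pool = owner;
            // Returning false makes execute() try to add a non-core thread instead of queueing.
            if (pool != null && pool.getPoolSize() < pool.getMaximumPoolSize()) {
                return false;
            }
            return super.offer(task);
        }

        /** Bypasses the eager check; used when thread creation lost a race. */
        boolean forceOffer(Runnable task) { return super.offer(task); }
    }

    public static ThreadPoolExecutor newEagerPool(int core, int max, int queueCapacity) {
        EagerQueue queue = new EagerQueue(queueCapacity);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(core, max, 60, TimeUnit.SECONDS, queue,
                (task, executor) -> {
                    // Last resort: queue the task if there is room, otherwise reject it.
                    if (executor.isShutdown() || !queue.forceOffer(task)) {
                        throw new RejectedExecutionException("pool and queue are both full");
                    }
                });
        queue.setOwner(pool);
        return pool;
    }

    /** Submits blocking tasks and returns {poolSize, queueSize} observed afterwards. */
    public static int[] observe(int core, int max, int tasks) {
        ThreadPoolExecutor pool = newEagerPool(core, max, 100);
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> {
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            return new int[] { pool.getPoolSize(), pool.getQueue().size() };
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] result = observe(1, 4, 6);
        System.out.println("pool size = " + result[0] + ", queued = " + result[1]);
    }
}
```

With one core thread, a maximum of four, and six blocking tasks, the pool grows to four threads and only two tasks wait in the queue. Under the default flow the same workload would leave one thread running and five tasks queued.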

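## **2. Customizing the Thread Pool**
### 2.1 Creating the Thread Pool
The JDK ships some ready-to-use thread pools, such as `FixedThreadPool` and `CachedThreadPool`, but their parameters are fixed and some of them use unbounded queues. Under very high concurrency, or when the program has a design flaw, they can easily cause an out-of-memory error or other unpredictable failures.

Here we create the thread pool with the following `ThreadPoolExecutor` constructor.
```
public ThreadPoolExecutor(int corePoolSize,
                          int maximumPoolSize,
                          long keepAliveTime,
                          TimeUnit unit,
                          BlockingQueue<Runnable> workQueue,
                          ThreadFactory threadFactory,
                          RejectedExecutionHandler handler)
```
The code that creates the thread pool is as follows: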

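```
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

/**
 * @author winsonWu
 * @Description: thread pool creating configuration
 * @date Date : 2021.04.13 16:00
 */
@Configuration
public class ThreadPoolCreator {

    /**
     * Core pool size
     */
    private static int corePoolSize = Runtime.getRuntime().availableProcessors() + 1;

    /**
     * Maximum pool size; set equal to the core pool size to avoid memory swapping
     */
    private static int maximumPoolSize = corePoolSize;

    /**
     * Maximum idle time
     */
    private static long keepAliveTime = 3;

    /**
     * Unit of the maximum idle time
     */
    private static TimeUnit unit = TimeUnit.MINUTES;

    /**
     * Use a bounded queue to avoid running out of memory
     */
    private static BlockingQueue<Runnable> workQueue = new LinkedBlockingQueue<>(500);

    /**
     * Thread factory. A naming thread factory makes it easier to tell business
     * modules apart and to troubleshoot production issues.
     * NamedThreadFactory is not a JDK class; it must be supplied by the project or a library.
     */
    private static ThreadFactory threadFactory = new NamedThreadFactory("taskResolver");

    /**
     * Rejection policy; choose or customize one according to the business scenario
     */
    private static RejectedExecutionHandler handler = new ThreadPoolExecutor.AbortPolicy();

    @Bean
    public ThreadPoolExecutor threadPoolExecutor(){
        return new ThreadPoolExecutor(
                corePoolSize,
                maximumPoolSize,
                keepAliveTime, unit,
                workQueue,
                threadFactory,
                handler);
    }
}
```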
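
`NamedThreadFactory` in the configuration above is not part of the JDK, and the article does not say which library it comes from. As a rough idea of what such a factory does, here is a minimal stand-in; the class name `SimpleNamedThreadFactory` and the `prefix-N` naming scheme are assumptions made for this sketch.

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a naming thread factory. */
public class SimpleNamedThreadFactory implements ThreadFactory {

    private final String prefix;
    private final AtomicInteger sequence = new AtomicInteger(1);

    public SimpleNamedThreadFactory(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public Thread newThread(Runnable task) {
        // Produces names such as "taskResolver-1", "taskResolver-2", ...
        Thread thread = new Thread(task, prefix + "-" + sequence.getAndIncrement());
        thread.setDaemon(false);
        return thread;
    }

    public static void main(String[] args) {
        ThreadFactory factory = new SimpleNamedThreadFactory("taskResolver");
        System.out.println(factory.newThread(() -> { }).getName());
    }
}
```

Thread names built this way show up in thread dumps and log lines, which is what makes it possible to attribute a stuck or busy thread to a specific pool.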
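### 2.2 **Core Pool Size Configuration**
Concurrent tasks generally fall into two categories: **CPU-bound tasks** and **IO-bound tasks**.

**CPU-bound tasks** need the CPU for complex, intensive computation. Creating too many threads for this kind of task causes frequent **context switching**, which lowers resource utilization and slows task processing. **IO-bound tasks** place far lighter demands on the CPU and may spend most of their time **blocked on IO**, so adding threads raises concurrency and lets the pool handle as many tasks as possible. A common rule of thumb:

```
CPU-bound: N + 1, but preferably no more than twice the number of cores
IO-bound:  2N + 1
N is the number of cores on the server.
```
In production, the actual settings should be decided by **load testing**.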
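
The rule of thumb above translates directly into code. The following sketch only restates those two formulas; the class and method names (`PoolSizing`, `cpuBound`, `ioBound`) are made up for the example, and the values are starting points to be validated by load testing rather than recommendations.

```java
public class PoolSizing {

    /** Starting point for CPU-bound work: N + 1. */
    public static int cpuBound(int cores) {
        return cores + 1;
    }

    /** Starting point for IO-bound work: 2N + 1. */
    public static int ioBound(int cores) {
        return 2 * cores + 1;
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("cores = " + cores);
        System.out.println("CPU-bound core pool size = " + cpuBound(cores));
        System.out.println("IO-bound core pool size  = " + ioBound(cores));
    }
}
```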

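### 2.3 **Blocking Queues**

A **blocking queue (BlockingQueue)** blocks a thread that tries to take an element while the queue is empty until the queue becomes non-empty, and blocks a thread that tries to put an element while the queue is full until a consumer frees up space. This makes it a natural fit for producer-consumer models such as a thread pool. Common blocking queues are listed below: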
![image.png](https://p9-juejin.byteimg.com/tos-cn-i-k3u1fbpfcp/80418d86a39a44d6977ba42520df21db~tplv-k3u1fbpfcp-watermark.image?)
`LinkedBlockingQueue` covers most scenarios, and it is what we use here as well.
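
One detail worth seeing in code: as the `execute` source earlier shows, the thread pool enqueues tasks with the non-blocking `offer`, so a full bounded queue does not block the submitter but leads to thread creation or rejection. The sketch below illustrates the non-blocking and timed variants on a bounded `LinkedBlockingQueue`; the class name and the capacity of 2 are arbitrary choices for the example.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class BoundedQueueDemo {

    /** Returns whether a third element is accepted by a queue bounded at two. */
    public static boolean thirdOfferAccepted() {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(2);
        queue.offer("a");
        queue.offer("b");
        // offer() never blocks: on a full queue it simply returns false.
        return queue.offer("c");
    }

    /** Returns what a timed poll yields on an empty queue. */
    public static String pollEmpty() throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(2);
        // poll(timeout) waits up to the timeout, then gives up and returns null.
        return queue.poll(100, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("third offer accepted: " + thirdOfferAccepted());
        System.out.println("poll on empty queue: " + pollEmpty());
    }
}
```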

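### 2.4 **Rejection Policies**
By default, when the blocking queue is full and the pool has reached its maximum size, the rejection policy is invoked. The JDK has four built-in rejection policies: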

![image.png](https://p1-juejin.byteimg.com/tos-cn-i-k3u1fbpfcp/9a385a73de894d8dbfaa26fa5a0fd29a~tplv-k3u1fbpfcp-watermark.image?)
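Let's use two scenarios as examples:
- Scenario 1: system monitoring. A `custom annotation` and `AOP` identify the endpoints to monitor and capture their inputs and outputs, and an asynchronous task then writes the logs to HDFS.

For log collection, losing a small amount of data has little business impact, so we can set the core pool size to 1 to reduce server resource usage and enlarge the blocking queue capacity according to load test results. The rejection policy can be `DiscardPolicy` or `AbortPolicy`. Note that `AbortPolicy` throws a `RejectedExecutionException` from the submitting call itself, for both `execute(...)` and `submit(...)`, so the submission should be wrapped in **try-catch**. Exceptions thrown inside a task are a separate matter: for `execute(...)`, handle them with **try-catch** in the task or with an **UncaughtExceptionHandler (not recommended)**; for `submit(...)`, retrieve them through `Future.get()`.
- Scenario 2: message queue consumption. A backlog of messages can be processed in batches by asynchronous threads.

In this scenario data must not be lost, so we use `CallerRunsPolicy` to have the calling thread run the message-handling logic. This does affect the work the calling thread is doing, so a better approach may be a **custom rejection policy** that persists the task or puts it on a queue. Let's first look at how `CallerRunsPolicy` is implemented: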
```
public static class CallerRunsPolicy implements RejectedExecutionHandler {
    /**
     * Creates a {@code CallerRunsPolicy}.
     */
    public CallerRunsPolicy() { }

    /**
     * Executes task r in the caller's thread, unless the executor
     * has been shut down, in which case the task is discarded.
     *
     * @param r the runnable task requested to be executed
     * @param e the executor attempting to execute this task
     */
    public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
        if (!e.isShutdown()) {
            r.run();
        }
    }
}
```
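As we can see, a custom rejection policy only needs to implement the `RejectedExecutionHandler` interface and override `rejectedExecution(Runnable r, ThreadPoolExecutor e)`. A custom example:
```
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;

/**
 * @Author winsonWu
 * @Description: persistence-based rejection policy
 * @date Date : 2022.04.14 9:38
 **/
public class DataBaseStoragePolicy implements RejectedExecutionHandler {
    @Override
    public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) {
        // TODO: persist the rejected task, or handle it in some other way
    }
}
```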
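
To make the skeleton above more concrete, here is one way such a policy could look. It is a sketch under loud assumptions: the "storage" is an in-memory queue standing in for a database table or message queue, and the names `FallbackStorePolicyDemo`, `FallbackStorePolicy`, and `parkedCount` are invented for the example. A real persistence policy would need a serializable description of the task rather than the `Runnable` itself.

```java
import java.util.Queue;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class FallbackStorePolicyDemo {

    /** Hypothetical policy: parks rejected tasks in a fallback store instead of dropping them. */
    static class FallbackStorePolicy implements RejectedExecutionHandler {
        // In-memory stand-in for a database table or message queue.
        private final Queue<Runnable> store = new ConcurrentLinkedQueue<>();

        @Override
        public void rejectedExecution(Runnable task, ThreadPoolExecutor executor) {
            store.add(task);
        }

        Queue<Runnable> store() {
            return store;
        }
    }

    /** Overloads a 1-thread, 1-slot pool and returns how many tasks were parked. */
    public static int parkedCount(int tasks) {
        FallbackStorePolicy policy = new FallbackStorePolicy();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0, TimeUnit.SECONDS, new ArrayBlockingQueue<>(1), policy);
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> {
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            return policy.store().size();
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("tasks parked by the policy: " + parkedCount(5));
    }
}
```

With five blocking tasks, one occupies the single worker, one fits in the queue, and the remaining three reach the policy. A separate job could later drain the store and resubmit the work once the pool has capacity again.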
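### 2.5 **Thread Warm-up and Reclamation**

- **Thread warm-up**

If we can foresee a burst of requests right after the server starts, as in the message backlog scenario from the rejection policy section, we can warm up the pool by creating the core threads in advance to improve response time. `ThreadPoolExecutor` has three related methods: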

```
// Starts one core thread
public boolean prestartCoreThread() {
    return workerCountOf(ctl.get()) < corePoolSize &&
        addWorker(null, true);
}
```
```
// Starts all core threads
public int prestartAllCoreThreads() {
    int n = 0;
    while (addWorker(null, true))
        ++n;
    return n;
}
```
```
// Ensures that at least one thread is started (package-private)
void ensurePrestart() {
    int wc = workerCountOf(ctl.get());
    if (wc < corePoolSize)
        addWorker(null, true);
    else if (wc == 0)
        addWorker(null, false);
}
```

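- **Thread reclamation**

By default, when `workerCount` exceeds `corePoolSize` and an idle thread has been idle for longer than `keepAliveTime`, the pool reclaims that thread automatically. Core threads can be reclaimed in the same way if `allowCoreThreadTimeOut` is enabled, which improves resource utilization.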
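
Warm-up and core-thread reclamation can both be observed through `getPoolSize()`. The following is a small sketch; the class name `WarmUpDemo`, the pool sizes, and the 50 ms keep-alive are arbitrary values chosen so the demo finishes quickly.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class WarmUpDemo {

    /** Returns {sizeBefore, started, sizeAfter} for a pool with the given core size. */
    public static int[] warmUp(int coreSize) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                coreSize, coreSize, 1, TimeUnit.MINUTES, new LinkedBlockingQueue<>(10));
        try {
            // Threads are created lazily, so a fresh pool has no threads yet.
            int before = pool.getPoolSize();
            int started = pool.prestartAllCoreThreads();
            return new int[] { before, started, pool.getPoolSize() };
        } finally {
            pool.shutdown();
        }
    }

    /** Returns true if idle core threads are reclaimed once allowCoreThreadTimeOut(true) is set. */
    public static boolean coreThreadsReclaimed() throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 50, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>(10));
        try {
            // Requires a keep-alive time greater than zero.
            pool.allowCoreThreadTimeOut(true);
            pool.prestartAllCoreThreads();
            // Poll with a generous deadline instead of relying on one exact sleep.
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            while (pool.getPoolSize() > 0 && System.nanoTime() < deadline) {
                Thread.sleep(20);
            }
            return pool.getPoolSize() == 0;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        int[] sizes = warmUp(3);
        System.out.println("before=" + sizes[0] + " started=" + sizes[1] + " after=" + sizes[2]);
        System.out.println("core threads reclaimed: " + coreThreadsReclaimed());
    }
}
```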
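### 2.6 **Thread Pool Monitoring**
`ThreadPoolExecutor` itself provides several state query methods for reading the pool's status. Let's modify the earlier bean definition to look at them; the relevant methods are noted in the code comments:
```
...
@Bean
public ThreadPoolExecutor threadPoolExecutor(){
    return new ThreadPoolExecutor(
            corePoolSize,
            maximumPoolSize,
            keepAliveTime, unit,
            workQueue,
            threadFactory,
            handler){

        // Hook that runs before each task
        @Override
        protected void beforeExecute(Thread t, Runnable r) {
            // Current number of threads in the pool
            System.out.println("Pool size: " + this.getPoolSize());
            // Core pool size
            System.out.println("Core pool size: " + this.getCorePoolSize());
            // Configured maximum pool size
            System.out.println("Maximum pool size: " + this.getMaximumPoolSize());
            // Largest number of threads that have ever been in the pool at the same time
            System.out.println("Largest pool size: " + this.getLargestPoolSize());
            // Approximate number of threads actively running tasks
            System.out.println("Active threads: " + this.getActiveCount());
            super.beforeExecute(t, r);
        }

        // Hook that runs after each task
        @Override
        protected void afterExecute(Runnable r, Throwable t) {
            super.afterExecute(r, t);
            // Approximate number of threads actively running tasks
            System.out.println("Active threads: " + this.getActiveCount());
            // Approximate total number of tasks ever scheduled
            System.out.println("Task count: " + this.getTaskCount());
        }

        // Hook that runs when the pool has terminated
        @Override
        protected void terminated() {
            // Number of completed tasks
            System.out.println("Completed tasks: " + this.getCompletedTaskCount());
            super.terminated();
        }
    };
}
```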
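
Printing from the hooks produces output for every task. An alternative, sketched here as an assumption rather than something the article prescribes, is to sample the same getters on demand (for example from a scheduled job or a health endpoint) and emit a single line. `PoolMetricsDemo` and `snapshot` are hypothetical names.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolMetricsDemo {

    /** Formats a one-line snapshot of the pool's state. */
    public static String snapshot(ThreadPoolExecutor pool) {
        return String.format(
                "core=%d max=%d size=%d active=%d queued=%d completed=%d",
                pool.getCorePoolSize(), pool.getMaximumPoolSize(), pool.getPoolSize(),
                pool.getActiveCount(), pool.getQueue().size(), pool.getCompletedTaskCount());
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 4, 1, TimeUnit.MINUTES, new LinkedBlockingQueue<>(10));
        System.out.println(snapshot(pool));
        pool.execute(() -> System.out.println("task ran"));
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println(snapshot(pool));
    }
}
```

The active, queued, and completed figures are approximations while tasks are in flight, so they are better suited to trend monitoring than to exact accounting.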

### 2.7 **Thread Pool Lifecycle**
A thread pool has five main states, defined in code as follows:
```
private static final int RUNNING = -1 << COUNT_BITS;
private static final int SHUTDOWN = 0 << COUNT_BITS;
private static final int STOP = 1 << COUNT_BITS;
private static final int TIDYING = 2 << COUNT_BITS;
private static final int TERMINATED = 3 << COUNT_BITS;
```
The state transitions are as follows:

![image.png](https://p3-juejin.byteimg.com/tos-cn-i-k3u1fbpfcp/18e44d0a7ef24b2e942d648fdf8a2ff1~tplv-k3u1fbpfcp-watermark.image?)
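Note that after `shutdown()` the pool stops accepting new tasks but still processes the tasks remaining in the blocking queue. `shutdownNow()` also stops accepting new tasks, but it additionally attempts to interrupt the tasks that are currently running and removes the tasks still waiting in the queue, returning them to the caller without executing them.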
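
The contrast between the two shutdown methods can be checked with a single-thread pool. This is a minimal sketch; `ShutdownDemo` and its helper methods are names invented for the example.

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ShutdownDemo {

    private static ThreadPoolExecutor singleThreadPool() {
        return new ThreadPoolExecutor(1, 1, 0, TimeUnit.SECONDS, new LinkedBlockingQueue<>(10));
    }

    /** shutdown(): returns how many of the submitted tasks still ran to completion. */
    public static int completedAfterShutdown(int tasks) throws InterruptedException {
        ThreadPoolExecutor pool = singleThreadPool();
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            pool.execute(completed::incrementAndGet);
        }
        // No new tasks are accepted, but the ones already queued still run.
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return completed.get();
    }

    /** shutdownNow(): returns how many queued tasks were handed back without running. */
    public static int drainedByShutdownNow(int queuedTasks) throws InterruptedException {
        ThreadPoolExecutor pool = singleThreadPool();
        CountDownLatch never = new CountDownLatch(1);
        // Occupies the only worker until it is interrupted.
        pool.execute(() -> {
            try {
                never.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        for (int i = 0; i < queuedTasks; i++) {
            pool.execute(() -> { });
        }
        // Interrupts the running worker and drains the queue.
        List<Runnable> notRun = pool.shutdownNow();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return notRun.size();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("completed after shutdown(): " + completedAfterShutdown(3));
        System.out.println("returned by shutdownNow(): " + drainedByShutdownNow(2));
    }
}
```

Note that `shutdownNow()` can only interrupt running tasks; a task that ignores interruption keeps running until it finishes on its own.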
## **3. Thread Pools in Practice**
### 3.1 **Basic Practice**
With the thread pool defined, let's try out the basic methods first:
```
@Resource
private ThreadPoolExecutor threadPoolExecutor;

@Test
public void testMultiThread() throws InterruptedException {
    // Warm up the pool by starting all core threads in advance
    threadPoolExecutor.prestartAllCoreThreads();
    StopWatch stopwatch = new StopWatch("thread pool test");
    stopwatch.start("execute");
    // execute(Runnable command)
    CountDownLatch forExecute = new CountDownLatch(1);
    threadPoolExecutor.execute(() -> {
        try {
            Thread.sleep(2000);
        } catch (InterruptedException e) {
            // Restore the interrupt flag instead of silently swallowing the interruption
            Thread.currentThread().interrupt();
            System.out.println("interrupted ignore");
        }
        System.out.println("execute(Runnable command) test");
        forExecute.countDown();
    });
    forExecute.await();
    stopwatch.stop();
    stopwatch.start("submit");
    // submit(...): because the lambda returns a value, it is treated as a Callable
    CountDownLatch forSubmit = new CountDownLatch(1);
    final Future<String> future = threadPoolExecutor.submit(() -> {
        System.out.println("submit(Runnable command) test");
        forSubmit.countDown();
        return "submit(Runnable command) test";
    });
    try {
        final String result = future.get();
        System.out.println("result: " + result);
    } catch (ExecutionException e) {
        // todo custom exception handling
    }
    forSubmit.await();
    stopwatch.stop();
    System.out.println(stopwatch.prettyPrint());
}
```
The result is as follows:

![1649913889(1).png](https://p9-juejin.byteimg.com/tos-cn-i-k3u1fbpfcp/bd7c8c25f5a64c9889da92cdd141be5d~tplv-k3u1fbpfcp-watermark.image?)

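The difference between `execute(Runnable command)` and `submit(...)` was covered earlier, so we won't repeat it here. Also note that we use `CountDownLatch` to coordinate the threads: `submit` only starts after `execute` has finished.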

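### 3.2 **File Upload Practice**
Next we use a file upload feature to demonstrate asynchronous task orchestration with `CompletableFuture`. The requirements are:

1. **Support batch file upload**
2. **Return a list of file names and file IDs once the uploads finish**
3. **Print a log to the console when an exception occurs (for demonstration; customize this in production)**

The implementation is as follows:

**Return object:**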
```
@Data
public class FileEntry implements Serializable {

    /**
     * File ID
     */
    private String fileId;

    /**
     * File name
     */
    private String fileName;

}
```
**File upload logic:**
```
/**
 * Test code; here we upload straight to a fixed directory
 * (FILE_LOCATION is a constant defined elsewhere in the class)
 * @param eachFile the file to upload
 * @return the entry describing the uploaded file
 */
private FileEntry createFileEntry(MultipartFile eachFile){
    // Generate the file ID
    String fileId = UUID.randomUUID().toString().replace("-", "");
    File desFile = new File(FILE_LOCATION + fileId + "_" + eachFile.getOriginalFilename());
    try {
        eachFile.transferTo(desFile);
    } catch (IOException e) {
        throw new BizException("File upload failed");
    }
    // Upload succeeded; build the return value
    FileEntry fileEntry = new FileEntry();
    fileEntry.setFileName(eachFile.getOriginalFilename());
    fileEntry.setFileId(fileId);
    return fileEntry;
}
```
**Main logic:**
```
/**
 * File upload
 * @param files the files to upload
 * @return entries for the files that were uploaded successfully
 */
public ArrayList<FileEntry> uploadFile(MultipartFile[] files){
    // Initialize the return value
    ArrayList<FileEntry> fileEntryList = new ArrayList<>(files.length);
    List<CompletableFuture<FileEntry>> futureList = new ArrayList<>(files.length);
    for (MultipartFile eachFile : files){
        // Run the upload logic on the thread pool defined earlier
        CompletableFuture<FileEntry> future = CompletableFuture.supplyAsync(() ->
                createFileEntry(eachFile), threadPoolExecutor);
        // Add to the list of futures
        futureList.add(future);
    }
    CompletableFuture<Void> fileUploadFuture = CompletableFuture
            .allOf(futureList.toArray(new CompletableFuture[0]))
            .whenComplete((v, t) -> futureList.forEach(future -> {
                // Collect only successful results; reading a failed future would throw here
                if (!future.isCompletedExceptionally()) {
                    fileEntryList.add(future.join());
                }
            }))
            .exceptionally(exception -> {
                // todo custom handling
                System.out.println("error occurred: " + exception.getMessage());
                return null;
            });
    // Block the calling thread until all files have been uploaded
    fileUploadFuture.join();
    // Return the entry list
    return fileEntryList;
}
```
That completes our simplified implementation. For more on `CompletableFuture`, see the references section.
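
The upload code above depends on Spring's `MultipartFile` and the pool bean, so it cannot be run in isolation. The following self-contained sketch shows the same fan-out-and-collect pattern with a simulated upload. Everything in it is an assumption made for illustration: `BatchUploadSketch`, the fake `upload` method, and the choice to handle failures per future with `exceptionally` so that one bad file does not affect the rest of the batch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class BatchUploadSketch {

    /** Hypothetical stand-in for a real upload; fails for empty names. */
    static String upload(String fileName) {
        if (fileName.isEmpty()) {
            throw new IllegalArgumentException("empty file name");
        }
        return "id-" + fileName;
    }

    /** Uploads all names in parallel and returns the IDs of the uploads that succeeded. */
    public static List<String> uploadAll(List<String> fileNames) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<CompletableFuture<String>> futures = new ArrayList<>();
            for (String name : fileNames) {
                futures.add(CompletableFuture
                        .supplyAsync(() -> upload(name), pool)
                        // Turn a failure into a null marker so one bad file does not fail the batch.
                        .exceptionally(error -> {
                            System.out.println("upload failed: " + error.getMessage());
                            return null;
                        }));
            }
            // Wait for every upload; failures were already handled per future above.
            CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();

            List<String> ids = new ArrayList<>();
            for (CompletableFuture<String> future : futures) {
                String id = future.join();
                if (id != null) {
                    ids.add(id);
                }
            }
            return ids;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(uploadAll(Arrays.asList("a.txt", "", "b.txt")));
    }
}
```

Because results are read from the futures in submission order after `allOf` completes, the returned list keeps the order of the input even though the uploads run concurrently.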

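## **4. References**
- [Seeing Clearly: Mastering Thread Pool Design and Principles Inside and Out](https://juejin.cn/post/6988792385488551944/)
- [A Dazzling Array: A Tour of Fancy CompletableFuture Techniques](https://juejin.cn/post/6996943652832411684/)
- [Java Thread Pool Implementation Principles and Their Practice at Meituan](https://tech.meituan.com/2020/04/02/java-pooling-pratice-in-meituan.html)
- [If the Native Thread Pool Is So Powerful, Why Does Tomcat Still Extend It?](https://www.shuzhiduo.com/A/l1dyZ3xxze/)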
