Java Concurrency in Practice: Using Thread Pools

锦衣素颜

Last modified: 2023-02-02 15:25:34

Category: Java Concurrency in Practice. Tags: java, multithreading, concurrent programming

First published: 2023-02-02 15:25:03

Article link: https://blog.csdn.net/qq_33924360/article/details/128820683

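This article discusses the implicit couplings between tasks and execution policies in thread pools, such as thread starvation deadlock and the effect of long-running tasks on responsiveness. It recommends sizing thread pools according to task type and available resources, and describes how to configure ThreadPoolExecutor, including managing the task queue, choosing a saturation policy, and supplying a custom thread factory. It also covers parallelizing recursive algorithms.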

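This chapter covers advanced options for configuring and tuning thread pools, analyzes the hazards to watch for when using the task execution framework, and presents some advanced examples of using Executor.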

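1. Implicit couplings between tasks and execution policies

Thread pools work best when tasks are homogeneous and independent.
Mixing long-running and short-running tasks risks "clogging" the pool unless it is very large.
Submitting tasks that depend on other tasks risks deadlock unless the pool is unbounded.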

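1.1 Thread starvation deadlock

In a thread pool, tasks that depend on other tasks can deadlock.
In a single-threaded Executor, a task that submits another task to the same Executor and then waits for its result will always deadlock: the second task sits in the work queue until the first completes, but the first cannot complete because it is waiting for the second.
The same thing can happen in a larger pool if every thread executing a task is blocked waiting for other tasks that are still in the work queue.
This is called thread starvation deadlock.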

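import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Example that deadlocks through thread starvation. */
public class ThreadDeadlock {
    // Single-threaded Executor
    ExecutorService exec = Executors.newSingleThreadExecutor();

    // LoadFileTask is not shown in the original; it is assumed to be a
    // Callable<String> that returns the contents of the named file.
    public class RenderPageTask implements Callable<String> {
        @Override
        public String call() throws Exception {
            Future<String> header, footer;
            header = exec.submit(new LoadFileTask("header.html"));
            footer = exec.submit(new LoadFileTask("footer.html"));
            // Will deadlock: this task occupies the only worker thread
            // while it waits for the results of its subtasks
            return header.get() + footer.get();
        }
    }
}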

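Whenever you submit tasks to an Executor that depend on other tasks, be aware that thread starvation deadlock is possible, and document any pool size limits or configuration constraints either in the code or in the configuration file where the Executor is configured.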
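The effect of pool size on dependent tasks can be observed with a small experiment. The following is a minimal sketch, not part of the original article: the class name StarvationDemo, the method outerTaskCompletes, and the timeout values are all assumptions made for illustration. An outer task submits an inner task to the same pool and waits for it; a timed get is used so that the program reports the stall instead of hanging.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class StarvationDemo {
    /**
     * Submits an outer task that submits an inner task to the SAME pool and
     * waits for it. Returns true if the outer task finished within the
     * timeout, false if it was still stuck (the symptom of starvation).
     */
    static boolean outerTaskCompletes(int poolSize, long timeoutMillis)
            throws InterruptedException, ExecutionException {
        ExecutorService exec = Executors.newFixedThreadPool(poolSize);
        try {
            Future<String> outer = exec.submit(() -> {
                Future<String> inner = exec.submit(() -> "inner");
                return "outer+" + inner.get(); // blocks a worker thread
            });
            try {
                outer.get(timeoutMillis, TimeUnit.MILLISECONDS);
                return true;
            } catch (TimeoutException e) {
                return false;
            }
        } finally {
            exec.shutdownNow(); // interrupt a stuck worker so the JVM can exit
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("pool of 1 completes: " + outerTaskCompletes(1, 500));
        System.out.println("pool of 2 completes: " + outerTaskCompletes(2, 500));
    }
}
```

With one thread the outer task holds the only worker while the inner task waits in the queue, so the call times out. With two threads the inner task gets its own worker and the outer task completes.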

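1.2 Long-running tasks

If tasks run for a long time, the responsiveness of the thread pool suffers, even when no deadlock occurs.
One mitigation is to use timed waits instead of unbounded waits: if a wait times out, mark the task as failed and either abort it or put it back on the queue for later execution, which frees the thread for tasks that may complete more quickly.
If a thread pool is frequently full of blocked tasks, this may also be a sign that the pool is too small.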
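One way to bound how long a slow task can tie up a worker thread is a timed Future.get followed by cancellation. The sketch below is an illustration under stated assumptions rather than the article's own code: the helper name callWithTimeout, the fallback parameter, and the sleep that simulates a slow task are all hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class TimedTasks {
    /**
     * Runs the task on the given pool and waits at most timeoutMillis for it.
     * On timeout the task is cancelled (interrupting its thread) and the
     * fallback value is returned instead.
     */
    static <T> T callWithTimeout(ExecutorService exec, Callable<T> task,
                                 long timeoutMillis, T fallback)
            throws InterruptedException, ExecutionException {
        Future<T> future = exec.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // free the worker thread for other tasks
            return fallback;
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService exec = Executors.newFixedThreadPool(2);
        try {
            String fast = callWithTimeout(exec, () -> "done", 1000, "timed out");
            String slow = callWithTimeout(exec, () -> {
                Thread.sleep(5_000); // simulated long-running task
                return "done";
            }, 200, "timed out");
            System.out.println(fast + " / " + slow);
        } finally {
            exec.shutdownNow();
        }
    }
}
```

Cancellation only frees the thread if the task responds to interruption, which the simulated task does because Thread.sleep throws InterruptedException.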

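2. Sizing thread pools

Sizing a thread pool requires understanding the computing environment, the resource budget, and the nature of the tasks: how many CPUs and how much memory the deployment system has, whether tasks are compute-intensive or I/O-intensive, and whether they need a scarce resource such as a JDBC connection.
If you have different categories of tasks with very different behaviors, consider using multiple thread pools so that each can be tuned to its own workload.
For compute-intensive tasks, a pool of N_cpu + 1 threads usually achieves optimal utilization.
For tasks that perform I/O or other blocking operations, the pool should be larger, because not all of the threads will be schedulable at all times.
To size the pool properly, estimate the ratio of waiting time to compute time for your tasks.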
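Given these definitions:

$$N_{cpu} = \text{number of CPUs}, \qquad U_{cpu} = \text{target CPU utilization},\ 0 \le U_{cpu} \le 1, \qquad \frac{W}{C} = \text{ratio of wait time to compute time}$$

the optimal pool size for keeping the processors at the desired utilization is:

$$N_{threads} = N_{cpu} \cdot U_{cpu} \cdot \left(1 + \frac{W}{C}\right)$$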
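The sizing rule, pool size = number of CPUs × target utilization × (1 + wait time / compute time), can be turned into a small helper. This is a sketch only: the class and method names are hypothetical, and the 90 ms / 10 ms workload in main is an assumed figure, not a measurement.

```java
class PoolSizing {
    /**
     * threads = nCpu * targetUtilization * (1 + waitTime / computeTime),
     * rounded to the nearest integer and never less than 1.
     */
    static int optimalPoolSize(int nCpu, double targetUtilization,
                               double waitTime, double computeTime) {
        double size = nCpu * targetUtilization * (1 + waitTime / computeTime);
        return Math.max(1, (int) Math.round(size));
    }

    public static void main(String[] args) {
        int nCpu = Runtime.getRuntime().availableProcessors();
        // Assumed workload: tasks wait 90 ms on I/O for every 10 ms of computation.
        System.out.println(optimalPoolSize(nCpu, 1.0, 90, 10));
        // Compute-bound work (no waiting) is commonly given nCpu + 1 threads.
        System.out.println(nCpu + 1);
    }
}
```

For example, 8 CPUs at a target utilization of 0.5 with a wait-to-compute ratio of 3 gives 8 × 0.5 × 4 = 16 threads.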

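3. Configuring ThreadPoolExecutor

General-purpose constructor

    public ThreadPoolExecutor(int corePoolSize,      // core pool size: threads kept in the pool even when idle
                              int maximumPoolSize,   // upper bound on the number of threads in the pool
                              long keepAliveTime,    // idle time after which threads beyond the core size are reclaimed
                              TimeUnit unit,         // time unit for keepAliveTime
                              BlockingQueue<Runnable> workQueue, // queue holding tasks that are waiting to execute
                              ThreadFactory threadFactory) {     // factory used to create new threads
        this(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue,
             threadFactory, defaultHandler);
    }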

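3.1 Managing queued tasks

ThreadPoolExecutor lets you supply a BlockingQueue to hold tasks awaiting execution. There are three basic approaches to task queueing: an unbounded queue, a bounded queue, and a synchronous handoff.
newFixedThreadPool and newSingleThreadExecutor both use an unbounded LinkedBlockingQueue: tasks queue up whenever all worker threads are busy, and the queue can grow without bound.
A more stable approach is a bounded queue, which helps prevent resource exhaustion. The queue size and the pool size must be tuned together: a large queue coupled with a small pool can reduce memory usage, CPU usage, and context switching, but at the cost of potentially constraining throughput.
For very large or unbounded pools, a SynchronousQueue can bypass queueing entirely. A handoff is more efficient because the task is handed directly to the thread that will execute it, rather than first being placed on a queue and then fetched from it by a worker thread.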
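The three queueing approaches correspond to three ways of calling the ThreadPoolExecutor constructor. The sketch below is illustrative: the class QueueChoices and its method names are hypothetical, while the first and third configurations mirror what Executors.newFixedThreadPool and Executors.newCachedThreadPool are documented to create.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class QueueChoices {
    // Unbounded queue: the configuration used by Executors.newFixedThreadPool(n).
    static ThreadPoolExecutor unbounded(int threads) {
        return new ThreadPoolExecutor(threads, threads, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>());
    }

    // Bounded queue: at most queueCapacity tasks wait; beyond that the
    // saturation policy decides what happens to new submissions.
    static ThreadPoolExecutor bounded(int threads, int queueCapacity) {
        return new ThreadPoolExecutor(threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity));
    }

    // Synchronous handoff: the configuration used by Executors.newCachedThreadPool().
    static ThreadPoolExecutor handoff() {
        return new ThreadPoolExecutor(0, Integer.MAX_VALUE, 60L, TimeUnit.SECONDS,
                new SynchronousQueue<>());
    }
}
```

A SynchronousQueue has no capacity at all, so a submission either finds an idle thread, causes a new one to be created, or is rejected when the pool is already at its maximum size.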

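The newCachedThreadPool factory method is a good default choice for an Executor, providing better queueing performance than a fixed-size thread pool. A fixed-size thread pool is a good choice when you need to limit the number of concurrent tasks for resource-management purposes, as in a server application that accepts requests from network clients and would otherwise be vulnerable to overload.

Bounding either the thread pool or the work queue is suitable only when tasks are independent. With tasks that depend on other tasks, a bounded thread pool or queue can cause thread starvation deadlock; use an unbounded pool configuration such as newCachedThreadPool instead.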

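3.2 Saturation policies

The saturation policy of a ThreadPoolExecutor can be changed by calling setRejectedExecutionHandler.
The JDK provides several policy implementations:

AbortPolicy: the default policy. execute throws the unchecked RejectedExecutionException, which the caller can catch and handle.
DiscardPolicy: silently discards the newly submitted task when the queue is full.
DiscardOldestPolicy: discards the task that would otherwise be executed next (the oldest one in the queue) and tries to resubmit the new task. It is a poor fit for a priority queue, because the task discarded would be the one with the highest priority.
CallerRunsPolicy: a form of throttling that neither discards tasks nor throws an exception, but pushes some of the work back to the caller to slow down the flow of new tasks. The newly submitted task is executed not in a pool thread but in the thread that called execute.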
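The caller-runs behavior can be observed with a deliberately tiny pool. This is a minimal sketch under stated assumptions: the class CallerRunsDemo and its method are hypothetical names, and the one-thread, one-slot configuration is chosen only so that the third submission saturates the pool.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

class CallerRunsDemo {
    /** Returns the name of the thread that ran the task rejected by a saturated pool. */
    static String threadThatRanOverflowTask() throws InterruptedException {
        // One worker thread plus one queue slot, so a third task saturates the pool.
        ThreadPoolExecutor exec = new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(1));
        exec.setRejectedExecutionHandler(new ThreadPoolExecutor.CallerRunsPolicy());

        CountDownLatch release = new CountDownLatch(1);
        AtomicReference<String> ranOn = new AtomicReference<>();
        Runnable blocker = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        exec.execute(blocker);   // occupies the single worker thread
        exec.execute(() -> { }); // fills the single queue slot
        // Pool and queue are full, so this task runs in the submitting thread.
        exec.execute(() -> ranOn.set(Thread.currentThread().getName()));

        release.countDown();
        exec.shutdown();
        exec.awaitTermination(5, TimeUnit.SECONDS);
        return ranOn.get();
    }

    public static void main(String[] args) throws InterruptedException {
        String ranOn = threadThatRanOverflowTask();
        System.out.println(ranOn.equals(Thread.currentThread().getName())); // true
    }
}
```

While the submitting thread is busy running the overflow task it cannot submit more work, which is how this policy slows the arrival of new tasks.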

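There is no predefined saturation policy that makes execute block when the work queue is full, but the same effect can be achieved by using a Semaphore to bound the rate at which tasks arrive: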

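import java.util.concurrent.Executor;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.Semaphore;

public class BoundedExecutor {
    private final Executor exec;
    private final Semaphore semaphore;

    public BoundedExecutor(Executor exec, int bound) {
        this.exec = exec;
        this.semaphore = new Semaphore(bound);
    }

    public void submitTask(final Runnable r) throws InterruptedException {
        // Blocks the submitter while 'bound' tasks are already queued or running
        semaphore.acquire();
        try {
            exec.execute(new Runnable() {
                @Override
                public void run() {
                    try {
                        r.run();
                    } finally {
                        // Release the permit once the task has finished
                        semaphore.release();
                    }
                }
            });
        } catch (RejectedExecutionException e) {
            // The task will never run, so give the permit back right away
            semaphore.release();
        }
    }
}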

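3.3 Thread factories

Whenever a thread pool needs to create a thread, it does so through a thread factory.
You can supply a custom thread factory to give pool threads extra behavior, for example a descriptive thread name or logging.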

import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

public class MyAppThread extends Thread {
    public static final String DEFAULT_NAME = "MyAppThread";
    private static volatile boolean debugLifecycle = false;
    // Total number of threads ever created, and number currently running
    private static final AtomicInteger created = new AtomicInteger();
    private static final AtomicInteger alive = new AtomicInteger();
    private static final Logger log = Logger.getAnonymousLogger();
    public MyAppThread(Runnable r) {
        this(r, DEFAULT_NAME);
    }
    public MyAppThread(Runnable runnable, String name) {
        super(runnable, name + "_" + created.incrementAndGet());
        setUncaughtExceptionHandler((t, e) -> log.log(Level.SEVERE, "UNCAUGHT in thread " + t.getName(), e));
    }
    @Override
    public void run() {
        boolean debug = MyAppThread.debugLifecycle;
        if (debug) {
            log.log(Level.FINE, "Created " + getName());
        }
        try {
            alive.incrementAndGet();
            super.run();
        } finally {
            alive.decrementAndGet();
            if (debug) log.log(Level.FINE, "Exiting " + getName());
        }
    }
    public static int getThreadsCreated() {
        return created.get();
    }
    public static int getThreadsAlive() {
        return alive.get();
    }
    public static boolean isDebugLifecycle() {
        return debugLifecycle;
    }
    public static void setDebugLifecycle(boolean debugLifecycle) {
        MyAppThread.debugLifecycle = debugLifecycle;
    }
}
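The listing above defines a custom thread class but not the factory that hands such threads to a pool. A factory could look like the following minimal sketch, which is an assumption rather than the article's code: NamedThreadFactory and the pool name "render" are hypothetical, and it creates plain Thread instances so that the example stands on its own.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

class NamedThreadFactory implements ThreadFactory {
    private final String poolName;
    private final AtomicInteger created = new AtomicInteger();

    NamedThreadFactory(String poolName) {
        this.poolName = poolName;
    }

    @Override
    public Thread newThread(Runnable r) {
        // Name each thread "<poolName>-<n>" so it is identifiable in thread dumps and logs.
        Thread t = new Thread(r, poolName + "-" + created.incrementAndGet());
        t.setUncaughtExceptionHandler((thread, e) ->
                System.err.println("Uncaught in " + thread.getName() + ": " + e));
        return t;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService exec = Executors.newFixedThreadPool(2, new NamedThreadFactory("render"));
        try {
            Callable<String> whoAmI = () -> Thread.currentThread().getName();
            System.out.println(exec.submit(whoAmI).get()); // render-1
        } finally {
            exec.shutdown();
        }
    }
}
```

To use a custom thread class instead, newThread would return an instance of that class, for example new MyAppThread(r, poolName).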

ThreadPoolExecutor also provides setters for the pool size and other parameters, so a pool can be reconfigured after it has been constructed.
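As a rough illustration of reconfiguring a pool after construction, the sketch below grows a pool through its setters and then wraps it so that other code cannot change the settings. The class Reconfigure, the method grow, and the sizes used are assumptions made for the example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class Reconfigure {
    /** Grows a pool after construction; the sizes passed in are illustrative. */
    static void grow(ThreadPoolExecutor exec, int core, int max) {
        // Raise the maximum first so that core never exceeds max along the way.
        exec.setMaximumPoolSize(max);
        exec.setCorePoolSize(core);
        exec.setKeepAliveTime(30, TimeUnit.SECONDS);
    }

    public static void main(String[] args) {
        ExecutorService service = Executors.newFixedThreadPool(2);
        // The concrete type returned by the factory is checked rather than assumed.
        if (service instanceof ThreadPoolExecutor) {
            ThreadPoolExecutor exec = (ThreadPoolExecutor) service;
            grow(exec, 4, 8);
            System.out.println(exec.getCorePoolSize() + "/" + exec.getMaximumPoolSize());
        }
        // Wrapping hides the setters from code that should not change the configuration.
        ExecutorService locked = Executors.unconfigurableExecutorService(service);
        locked.shutdown();
    }
}
```

Executors.unconfigurableExecutorService exposes only the ExecutorService methods, so the wrapped pool cannot be cast back to ThreadPoolExecutor and resized.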

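4. Extending ThreadPoolExecutor

ThreadPoolExecutor provides several hook methods that subclasses can override:

beforeExecute: called before a task runs. If it throws a RuntimeException, the task is not executed and afterExecute is not called.
afterExecute: called whether the task returns normally from run or completes by throwing an exception.
terminated: called when the thread pool completes its shutdown. It can be used to release resources the Executor allocated during its lifecycle, and also to send notifications, write logs, or finalize statistics gathering.

These hooks can be used to add logging, timing, monitoring, or statistics gathering.

Example: adding statistics to a thread pool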

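import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class TimingThreadPool extends ThreadPoolExecutor {
    // Start time of the current task; beforeExecute and afterExecute run on
    // the same worker thread, so a ThreadLocal carries the value between them
    private final ThreadLocal<Long> startTime = new ThreadLocal<>();
    private final Logger log = LoggerFactory.getLogger(getClass());
    private final AtomicLong numTasks = new AtomicLong();
    private final AtomicLong totalTime = new AtomicLong();
    public TimingThreadPool(int corePoolSize, int maximumPoolSize, long keepAliveTime, TimeUnit unit, BlockingQueue<Runnable> workQueue) {
        super(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue);
    }
    @Override
    protected void beforeExecute(Thread t, Runnable r) {
        super.beforeExecute(t, r);
        log.info("Thread:{} start:{}", t, r);
        startTime.set(System.nanoTime());
    }
    @Override
    protected void afterExecute(Runnable r, Throwable t) {
        try {
            long endTime = System.nanoTime();
            long taskTime = endTime - startTime.get();
            numTasks.incrementAndGet();
            totalTime.addAndGet(taskTime);
            // The parameter t is the Throwable that ended the task (or null),
            // so the worker thread is taken from Thread.currentThread()
            log.info("Thread:{} end:{} time:{}ns", Thread.currentThread(), r, taskTime);
        } finally {
            super.afterExecute(r, t);
        }
    }
    @Override
    protected void terminated() {
        try {
            long tasks = numTasks.get();
            // Avoid dividing by zero when the pool shuts down without running any task
            long avg = tasks == 0 ? 0 : totalTime.get() / tasks;
            log.info("Terminated: avg time:{}ns", avg);
        } finally {
            super.terminated();
        }
    }
}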

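5. Parallelizing recursive algorithms

The loops inside recursive algorithms can also be parallelized, provided the iterations are independent of each other and no iteration needs the results of the recursive iterations that follow it.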

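import java.util.Collection;
import java.util.List;
import java.util.concurrent.Executor;

// Node<T> is not shown in the original; it is assumed to expose
// T compute() and List<Node<T>> getChildren().
public class Recursive {

    public <T> void sequentialRecursive(List<Node<T>> nodes, Collection<T> results) {
        for (Node<T> node : nodes) {
            results.add(node.compute());
            sequentialRecursive(node.getChildren(), results);
        }
    }

    // Parallelized version: the traversal is still sequential, but each
    // compute() call runs as a task, so results must be a thread-safe collection
    public <T> void parallelRecursive(final Executor exec, List<Node<T>> nodes, Collection<T> results) {
        for (Node<T> node : nodes) {
            exec.execute(() -> results.add(node.compute()));
            parallelRecursive(exec, node.getChildren(), results);
        }
    }
}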

The caller can wait for all of the results as follows:

    public <T> Collection<T> getParallelResults(List<Node<T>> nodes) throws InterruptedException {
        // Create an Executor specific to this traversal, then use shutdown
        // and awaitTermination to wait for every submitted task to finish
        ExecutorService exec = Executors.newCachedThreadPool();
        Queue<T> resultQueue = new ConcurrentLinkedQueue<>();
        parallelRecursive(exec, nodes, resultQueue);
        exec.shutdown();
        exec.awaitTermination(Long.MAX_VALUE, TimeUnit.SECONDS);
        return resultQueue;
    }
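Put together, the traversal and the wait can be tried end to end. The following sketch makes several assumptions for the sake of a runnable example: the Node class here is a hypothetical stand-in whose compute() squares an integer, and the class and method names are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ParallelTreeWalk {
    /** Hypothetical tree node whose compute() squares its value. */
    static final class Node {
        final int value;
        final List<Node> children = new ArrayList<>();
        Node(int value) { this.value = value; }
        Node add(Node child) { children.add(child); return this; }
        int compute() { return value * value; }
    }

    // The traversal stays sequential; only the compute() calls run on the pool.
    static void parallelRecursive(Executor exec, List<Node> nodes, Collection<Integer> results) {
        for (Node node : nodes) {
            exec.execute(() -> results.add(node.compute()));
            parallelRecursive(exec, node.children, results);
        }
    }

    static int sumOfSquares(List<Node> roots) throws InterruptedException {
        ExecutorService exec = Executors.newCachedThreadPool();
        Queue<Integer> results = new ConcurrentLinkedQueue<>(); // thread-safe collector
        parallelRecursive(exec, roots, results);
        exec.shutdown(); // no more tasks will be submitted
        exec.awaitTermination(Long.MAX_VALUE, TimeUnit.SECONDS);
        return results.stream().mapToInt(Integer::intValue).sum();
    }

    public static void main(String[] args) throws InterruptedException {
        Node root = new Node(1).add(new Node(2).add(new Node(3))).add(new Node(4));
        System.out.println(sumOfSquares(Arrays.asList(root))); // 1 + 4 + 9 + 16 = 30
    }
}
```

The order in which results arrive in the queue is not deterministic, so the example reduces them with a sum, which does not depend on order.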