Parallel Streams vs. Sequential Streams and the Fork/Join Framework


Author: 失业找工作中

Category: JAVA8新特性 (Java 8 New Features)

Article link: https://blog.csdn.net/m0_37450089/article/details/81435507

I. The concept of parallel streams

A parallel stream is a stream that splits its content into multiple chunks and processes each chunk on a different thread.

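Java 8 makes parallelism much easier to use, so data can be processed in parallel with very little effort. The Stream API lets you switch declaratively between parallel and sequential streams with parallel() and sequential().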
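As a minimal sketch of that switch, the example below sums the same range once sequentially and once in parallel. The class and method names (StreamModeSwitch, sequentialSum, parallelSum) are made up for illustration and are not part of the original article.

```java
import java.util.stream.LongStream;

// Hypothetical demo class: the names are illustrative only.
class StreamModeSwitch {

    // Sums 1..n with the pipeline explicitly in sequential mode.
    static long sequentialSum(long n) {
        return LongStream.rangeClosed(1, n).sequential().sum();
    }

    // Sums 1..n after switching the same kind of pipeline to parallel mode.
    static long parallelSum(long n) {
        return LongStream.rangeClosed(1, n).parallel().sum();
    }

    public static void main(String[] args) {
        // Both calls produce the same result; only the execution mode differs.
        System.out.println(sequentialSum(1_000_000L));
        System.out.println(parallelSum(1_000_000L));
    }
}
```

The only difference between the two pipelines is the single call that selects the mode, which is what "declarative" means here.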

II. The Fork/Join framework

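When necessary, a large task is split (fork) into several smaller tasks, and splitting continues until the tasks cannot usefully be split any further. The results of the small tasks are then combined (join).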

How the Fork/Join framework differs from a traditional thread pool:

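It uses the work-stealing model: each worker thread has its own task queue. When a thread executes a task, it can split the task into smaller subtasks and push them onto its own queue. A thread that has run out of work then steals a task from the queue of another, randomly chosen thread and runs it.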

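Compared with an ordinary thread pool, the advantage of the Fork/Join framework lies in how it handles the tasks it contains. In an ordinary thread pool, if the task a thread is running cannot continue for some reason, that thread simply waits. In Fork/Join, if a subproblem cannot continue because it is waiting for another subproblem to finish, the thread handling it actively looks for other subproblems that have not yet run and executes them. This reduces the time threads spend waiting and improves performance.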
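To make the stealing behaviour a little more tangible, here is a rough sketch that runs a recursive range sum and then prints ForkJoinPool.getStealCount(), which the JDK documents as an estimate of how many tasks were stolen between workers. StealCountDemo and its nested RangeSum task are hypothetical names introduced only for this illustration, and the steal count will vary from run to run.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical demo class: names and threshold are illustrative assumptions.
class StealCountDemo {

    // Sums lo..hi (inclusive) by recursive halving.
    static class RangeSum extends RecursiveTask<Long> {
        private static final long THRESHOLD = 10_000L;
        private final long lo;
        private final long hi;

        RangeSum(long lo, long hi) {
            this.lo = lo;
            this.hi = hi;
        }

        @Override
        protected Long compute() {
            if (hi - lo <= THRESHOLD) {
                long sum = 0;
                for (long i = lo; i <= hi; i++) {
                    sum += i;
                }
                return sum;
            }
            long mid = lo + (hi - lo) / 2;
            RangeSum left = new RangeSum(lo, mid);
            left.fork(); // queued on this worker's deque; an idle worker may steal it
            long right = new RangeSum(mid + 1, hi).compute(); // run the other half here
            return right + left.join();
        }
    }

    // Runs the range sum 0..n on the given pool.
    static long sum(ForkJoinPool pool, long n) {
        return pool.invoke(new RangeSum(0, n));
    }

    public static void main(String[] args) {
        ForkJoinPool pool = new ForkJoinPool();
        System.out.println("sum = " + sum(pool, 10_000_000L));
        // An estimate only; the value differs between runs and machines.
        System.out.println("steals (estimate) = " + pool.getStealCount());
        pool.shutdown();
    }
}
```

On a multicore machine the steal count is usually greater than zero, which suggests that idle workers picked up subtasks forked by busy ones.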

Fork/Join implementation examples

1. Using the traditional Fork/Join API

The following task sums the numbers in a given range.

import java.util.concurrent.RecursiveTask;

/**
 * @author Dongguo
 * @date 2021/8/16 0016-7:10
 * @description: Computes the sum of the integers from start to end (inclusive)
 */

public class ForkJoinCalculate extends RecursiveTask<Long> {

    private static final long serialVersionUID = 13475679780L;

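    private final long start; // first value of the range (inclusive)
    private final long end;   // last value of the range (inclusive)

    private static final long THRESHOLD = 10000L; // ranges no longer than this are summed directly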

    public ForkJoinCalculate(long start, long end) {
        this.start = start;
        this.end = end;
    }

    @Override
    protected Long compute() {
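        long length = end - start;
        // Small enough: sum the range directly
        if (length <= THRESHOLD) {
            long sum = 0;

            for (long i = start; i <= end; i++) {
                sum += i;
            }
            return sum;
        } else {
            // Too large: split the range into two halves
            long middle = start + (end - start) / 2; // avoids overflow of start + end

            ForkJoinCalculate left = new ForkJoinCalculate(start, middle);
            left.fork(); // push the left half onto this worker's queue

            ForkJoinCalculate right = new ForkJoinCalculate(middle + 1, end);
            // Compute the right half in the current thread rather than forking it as well,
            // then join the left half and combine the two results
            long rightSum = right.compute();
            return rightSum + left.join();
        }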
        }
    }
}

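// Requires imports: java.time.Duration, java.time.Instant,
// java.util.concurrent.ForkJoinPool, java.util.concurrent.ForkJoinTask, org.junit.Test
@Test
public void test1() {
    // Record the start time
    //long start = System.currentTimeMillis();
    Instant start = Instant.now(); // Java 8 date/time API
    // Create a ForkJoinPool; the no-arg constructor uses one worker per available processor
    ForkJoinPool pool = new ForkJoinPool();
    ForkJoinTask<Long> task = new ForkJoinCalculate(0, 100000000L); // sum 0..100 million

    long sum = pool.invoke(task);
    System.out.println(sum);
    // Record the end time
    //long end = System.currentTimeMillis();
    Instant end = Instant.now();
    System.out.println("Elapsed time (ms): " + Duration.between(start, end).toMillis()); // 138 ms
}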

2. Using Java 8 parallel streams, JDK 1.8's simplified way of using the Fork/Join framework

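In the previous chapter we briefly mentioned that the Stream interface makes it very convenient to process its elements: you can turn a collection into a parallel stream by calling the parallelStream method on it. A parallel stream is a stream that splits its content into multiple chunks and processes each chunk on a different thread. This way, the workload of a given operation is automatically spread across all the cores of a multicore processor, keeping them all busy.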

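// Requires imports: java.util.stream.LongStream, org.junit.Test
@Test
public void test3() {
    long start = System.currentTimeMillis();

    // sum() returns a primitive long, so there is no need to box it into a Long
    long sum = LongStream.rangeClosed(0L, 100000000L)
            .parallel()
            .sum();

    System.out.println(sum);

    long end = System.currentTimeMillis();

    System.out.println("Elapsed time (ms): " + (end - start));
}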
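The paragraph above also mentions calling parallelStream on a collection, which the LongStream test does not show. A small sketch of that form follows; the class name ParallelStreamFromCollection and the sample words are invented for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class: names and data are illustrative only.
class ParallelStreamFromCollection {

    // Adds up the lengths of all words using a parallel stream over the list.
    static int totalLength(List<String> words) {
        return words.parallelStream()
                .mapToInt(String::length)
                .sum();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("fork", "join", "stream");
        System.out.println(totalLength(words)); // 4 + 4 + 6 = 14
    }
}
```

For a list this small the parallel version has no performance benefit; the point is only the shape of the call.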

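Parallel streams internally use the default (common) ForkJoinPool. Its size is derived from the number of processors reported by Runtime.getRuntime().availableProcessors(): by default the common pool's parallelism is that number minus one, because the thread that invokes the stream also takes part in the work.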

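You can, however, change the size of this pool through the system property java.util.concurrent.ForkJoinPool.common.parallelism. The property is read when the common pool is initialized, so it has to be set before the pool is first used, for example:

System.setProperty("java.util.concurrent.ForkJoinPool.common.parallelism","12");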

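This is a global setting, so it affects every parallel stream in the code. Conversely, there is currently no way to set this value for one particular parallel stream. In general, a ForkJoinPool sized to the number of processors is a sensible default, and we strongly recommend leaving it alone unless you have a very good reason to change it.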
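To see what values apply on a given machine, one option is to print the processor count next to the common pool's parallelism. The sketch below uses only Runtime.availableProcessors() and ForkJoinPool.getCommonPoolParallelism(); the class name CommonPoolInfo is hypothetical.

```java
import java.util.concurrent.ForkJoinPool;

// Hypothetical demo class: the name is illustrative only.
class CommonPoolInfo {

    // Number of processors the JVM reports as available.
    static int processors() {
        return Runtime.getRuntime().availableProcessors();
    }

    // Target parallelism of the common pool that parallel streams use.
    static int commonParallelism() {
        return ForkJoinPool.getCommonPoolParallelism();
    }

    public static void main(String[] args) {
        // The output depends on the machine and on whether the
        // java.util.concurrent.ForkJoinPool.common.parallelism property was set.
        System.out.println("available processors:    " + processors());
        System.out.println("common pool parallelism: " + commonParallelism());
    }
}
```

Running it with and without -Djava.util.concurrent.ForkJoinPool.common.parallelism=12 on the command line is a quick way to confirm that the property took effect.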

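Note that, in reality, calling parallel on a sequential stream does not imply any concrete change to the stream itself. Internally it just sets a boolean flag indicating that you want all operations after the call to parallel to run in parallel. Similarly, you only need to call sequential on a parallel stream to turn it into a sequential one. You might think that by combining these two methods you could control more finely which operations run in parallel and which run sequentially while the stream is traversed. For example, you could write: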

transactions.stream()
        .filter(...)
        .sequential()
        .map(...)
        .parallel()
        .reduce();

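But the last call to parallel or sequential applies to the whole pipeline. In this example the pipeline runs in parallel, because parallel is the last of the two to be called.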
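The flag can be observed directly with isParallel(). The following sketch builds two pipelines that differ only in which of the two methods is called last; LastCallWins and its method names are hypothetical, and the filter and map steps are placeholders.

```java
import java.util.stream.IntStream;

// Hypothetical demo class: names and pipeline steps are illustrative only.
class LastCallWins {

    // sequential() is called first, parallel() last.
    static boolean parallelCalledLast() {
        return IntStream.range(0, 100)
                .filter(i -> i % 2 == 0)
                .sequential()
                .map(i -> i * 2)
                .parallel()
                .isParallel();
    }

    // parallel() is called first, sequential() last.
    static boolean sequentialCalledLast() {
        return IntStream.range(0, 100)
                .filter(i -> i % 2 == 0)
                .parallel()
                .map(i -> i * 2)
                .sequential()
                .isParallel();
    }

    public static void main(String[] args) {
        System.out.println(parallelCalledLast());   // true
        System.out.println(sequentialCalledLast()); // false
    }
}
```

In both cases the earlier call has no lasting effect: the mode that is in force when the terminal operation starts is the one set by the last call.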

Measuring stream performance

Fork/Join vs. a plain for loop

@Test
public void test1() {
    // Record the start time
    //long start = System.currentTimeMillis();
    Instant start = Instant.now(); // Java 8 date/time API
    // Create a ForkJoinPool
    ForkJoinPool pool = new ForkJoinPool();
    ForkJoinTask<Long> task = new ForkJoinCalculate(0, 100000000L); // sum 0..100 million

    long sum = pool.invoke(task);
    System.out.println(sum);
    // Record the end time
    //long end = System.currentTimeMillis();
    Instant end = Instant.now();
    System.out.println("Elapsed time (ms): " + Duration.between(start, end).toMillis()); // 138 ms
}

@Test
public void test2() {
    Instant start = Instant.now();
    long sum = 0L;
    // A long counter and an inclusive bound match the range of ForkJoinCalculate(0, 100000000L)
    for (long i = 0; i <= 100000000L; i++) {
        sum += i;
    }
    System.out.println(sum);
    Instant end = Instant.now();
    System.out.println("Elapsed time (ms): " + Duration.between(start, end).toMillis()); // 34 ms
}

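At first glance, the plain for loop performs better than Fork/Join. There are two reasons:

1. A for loop is low-level code and comparatively efficient.

2. Fork/Join has to split the range. When the range is small enough that a single thread would finish almost immediately, Fork/Join still keeps splitting, executing, and combining, and all of those steps take time.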
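The timings in this article come from single runs, which also include JIT warm-up. A slightly more careful, though still rough, way to compare the two approaches is to run each task a few times untimed and then keep the best of several timed runs. The helper below is a sketch under that assumption; SimpleTimer, bestOfMillis, and loopSum are hypothetical names, and a dedicated benchmark harness would be more reliable than this.

```java
import java.util.function.LongSupplier;

// Hypothetical helper: a rough timing sketch, not a rigorous benchmark.
class SimpleTimer {

    // Results are written here so the JIT cannot discard the measured work entirely.
    static volatile long sink;

    // Runs the task `warmups` times untimed, then returns the fastest of
    // `runs` timed executions in milliseconds (runs should be at least 1).
    static long bestOfMillis(LongSupplier task, int warmups, int runs) {
        for (int i = 0; i < warmups; i++) {
            sink = task.getAsLong();
        }
        long best = Long.MAX_VALUE;
        for (int i = 0; i < runs; i++) {
            long t0 = System.nanoTime();
            sink = task.getAsLong();
            long elapsed = System.nanoTime() - t0;
            best = Math.min(best, elapsed);
        }
        return best / 1_000_000L;
    }

    // Plain loop summing 0..n (inclusive), as in the article's test2.
    static long loopSum(long n) {
        long sum = 0L;
        for (long i = 0; i <= n; i++) {
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        long ms = bestOfMillis(() -> loopSum(100_000_000L), 3, 5);
        System.out.println("plain loop, best of 5 (ms): " + ms);
    }
}
```

The same helper can wrap pool.invoke(new ForkJoinCalculate(...)) or a parallel stream, so that all variants are measured the same way.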

So let us increase the range and test again.

10 billion

@Test
public void test1() {
    // Record the start time
    //long start = System.currentTimeMillis();
    Instant start = Instant.now(); // Java 8 date/time API
    // Create a ForkJoinPool
    ForkJoinPool pool = new ForkJoinPool();
    ForkJoinTask<Long> task = new ForkJoinCalculate(0, 10000000000L); // sum 0..10 billion

    // The true sum exceeds Long.MAX_VALUE, so the printed value wraps around;
    // only the timing is meaningful at this size
    long sum = pool.invoke(task);
    System.out.println(sum);
    // Record the end time
    //long end = System.currentTimeMillis();
    Instant end = Instant.now();
    System.out.println("Elapsed time (ms): " + Duration.between(start, end).toMillis()); // 2016 ms
}

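@Test
public void test2() {
    Instant start = Instant.now();
    long sum = 0L;
    // The counter must be a long: an int can never reach 10000000000L,
    // so a loop with an int counter would not terminate
    for (long i = 0; i <= 10000000000L; i++) {
        sum += i; // wraps around past Long.MAX_VALUE; only the timing is meaningful
    }
    System.out.println(sum);
    Instant end = Instant.now();
    System.out.println("Elapsed time (ms): " + Duration.between(start, end).toMillis()); // 3135 ms
}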
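A side note on the sums printed for ranges this large: the sum 0 + 1 + ... + n equals n(n+1)/2, and for n = 10 billion that is roughly 5 × 10^19, which is more than a long can hold (about 9.22 × 10^18). The sketch below checks this with BigInteger for the range sizes used in this article; the class name SumOverflowCheck and its methods are hypothetical.

```java
import java.math.BigInteger;

// Hypothetical demo class: names are illustrative only.
class SumOverflowCheck {

    // Exact value of 0 + 1 + ... + n, computed as n(n+1)/2 without overflow.
    static BigInteger exactSum(long n) {
        BigInteger bn = BigInteger.valueOf(n);
        return bn.multiply(bn.add(BigInteger.ONE)).shiftRight(1);
    }

    // True if the exact sum can be represented in a signed 64-bit long.
    static boolean fitsInLong(long n) {
        return exactSum(n).bitLength() <= 63;
    }

    public static void main(String[] args) {
        long[] sizes = {100_000_000L, 10_000_000_000L, 50_000_000_000L, 100_000_000_000L};
        for (long n : sizes) {
            System.out.println(n + " -> " + exactSum(n) + " fits in long: " + fitsInLong(n));
        }
    }
}
```

Only the 100 million case fits. For the larger ranges the long arithmetic wraps around silently, so those runs are useful as timing comparisons but not as correct sums.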

50 billion

@Test
public void test1() {
    // Record the start time
    //long start = System.currentTimeMillis();
    Instant start = Instant.now(); // Java 8 date/time API
    // Create a ForkJoinPool
    ForkJoinPool pool = new ForkJoinPool();
    ForkJoinTask<Long> task = new ForkJoinCalculate(0, 50000000000L); // sum 0..50 billion

    // The true sum exceeds Long.MAX_VALUE, so the printed value wraps around;
    // only the timing is meaningful at this size
    long sum = pool.invoke(task);
    System.out.println(sum);
    // Record the end time
    //long end = System.currentTimeMillis();
    Instant end = Instant.now();
    System.out.println("Elapsed time (ms): " + Duration.between(start, end).toMillis()); // 9260 ms
}

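@Test
public void test2() {
    Instant start = Instant.now();
    long sum = 0L;
    // The counter must be a long: an int can never reach 50000000000L,
    // so a loop with an int counter would not terminate
    for (long i = 0; i <= 50000000000L; i++) {
        sum += i; // wraps around past Long.MAX_VALUE; only the timing is meaningful
    }
    System.out.println(sum);
    Instant end = Instant.now();
    System.out.println("Elapsed time (ms): " + Duration.between(start, end).toMillis()); // 15720 ms
}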

The larger the range, the more pronounced the benefit of Fork/Join's parallelism becomes.

Parallel streams vs. Fork/Join

@Test
public void test1() {
    // Record the start time
    //long start = System.currentTimeMillis();
    Instant start = Instant.now(); // Java 8 date/time API
    // Create a ForkJoinPool
    ForkJoinPool pool = new ForkJoinPool();
    ForkJoinTask<Long> task = new ForkJoinCalculate(0, 100000000000L); // sum 0..100 billion

    // The true sum exceeds Long.MAX_VALUE, so the printed value wraps around;
    // only the timing is meaningful at this size
    long sum = pool.invoke(task);
    System.out.println(sum);
    // Record the end time
    //long end = System.currentTimeMillis();
    Instant end = Instant.now();
    System.out.println("Elapsed time (ms): " + Duration.between(start, end).toMillis()); // 19229 ms
}

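@Test
public void test3() {
    long start = System.currentTimeMillis();

    // sum() returns a primitive long; at this size the value wraps around
    // past Long.MAX_VALUE, so only the timing is meaningful
    long sum = LongStream.rangeClosed(0L, 100000000000L)
            .parallel()
            .sum();

    System.out.println(sum);

    long end = System.currentTimeMillis();

    System.out.println("Elapsed time (ms): " + (end - start)); // 14146 ms
}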