Parallelizing Streams

goalidea

Category: Technical blog translations. Tags: java, programming languages

Article link: https://blog.csdn.net/goalidea/article/details/125447427

Optimizing stream computations

A very exciting feature of the Stream API is the fact that a stream is capable of processing data in parallel.

To process data in parallel with the Stream API, you only need to call parallel() on the stream.

example

int parallelSum = 
    IntStream.range(0, 10)
             .parallel()
             .sum();

result

parallelSum = 45

How parallelization is implemented

Parallelization is implemented in the Stream API by using recursive decomposition of the data the stream is processing. It is built on top of the Fork/Join Framework, added in JDK 7.

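Understanding data locality

Processing a stream in parallel essentially means splitting it across several CPU cores. Each core reads its data from the CPU cache; when the data is not in the cache, it first has to be loaded from main memory. Data locality describes how likely it is that the data needed next is already in the cache. Data is typically transferred from main memory to the cache one cache line at a time, commonly 64 bytes. Primitive values stored contiguously, as in an int[], can be used directly once their cache line is loaded. For objects, the core must first follow each reference to wherever the object lives on the heap, and then load that memory into the cache as well.

This pattern, known as pointer chasing, should be avoided.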
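As a rough illustration of the difference, the sketch below sums the same values from an int[] and from a List<Integer>. The class and method names are hypothetical and do not come from the original article. Both methods return the same result; what differs is the memory layout. The int[] holds its values contiguously, while the list holds references to Integer objects that may be scattered across the heap.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical demo class: same computation over two memory layouts.
class DataLocalityDemo {

    // Values are stored contiguously, so one cache line holds several of them.
    static long sumPrimitives(int[] values) {
        return Arrays.stream(values).parallel().asLongStream().sum();
    }

    // Each element is a reference that must be followed to reach the value.
    static long sumBoxed(List<Integer> values) {
        return values.parallelStream().mapToLong(Integer::longValue).sum();
    }

    public static void main(String[] args) {
        int[] primitives = IntStream.range(0, 1_000_000).toArray();
        List<Integer> boxed = IntStream.range(0, 1_000_000)
                                       .boxed()
                                       .collect(Collectors.toList());

        System.out.println("primitive sum = " + sumPrimitives(primitives));
        System.out.println("boxed sum     = " + sumBoxed(boxed));
    }
}
```

How much faster the primitive version runs depends on the JVM and the hardware, so measure it with a benchmarking tool rather than assuming a fixed ratio.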

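Splitting the source

The first step in processing a stream in parallel is splitting its data source. For the split to be efficient, two things matter:

Splitting the data structure should be simple and fast.
Splitting should be even: the resulting sub-streams should hold roughly the same amount of data.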
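To see what an even or uneven split looks like, the following sketch calls Spliterator.trySplit() once on an ArrayList and once on a LinkedList holding the same 1,000 elements. The class and method names are hypothetical. The ArrayList split is exactly in half because it is index-based; how the LinkedList splits is a detail of the JDK implementation, so the sketch only prints it and makes no claim about the exact numbers.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.Spliterator;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical demo class: inspects how a source splits for parallel processing.
class SplitDemo {

    // Splits the source once and returns { prefix size, remainder size }.
    static long[] splitOnce(List<Integer> source) {
        Spliterator<Integer> remainder = source.spliterator();
        Spliterator<Integer> prefix = remainder.trySplit(); // null if it cannot split
        long prefixSize = (prefix == null) ? 0 : prefix.estimateSize();
        return new long[] { prefixSize, remainder.estimateSize() };
    }

    public static void main(String[] args) {
        List<Integer> values = IntStream.range(0, 1_000)
                                        .boxed()
                                        .collect(Collectors.toList());

        long[] arrayListSplit = splitOnce(new ArrayList<>(values));
        long[] linkedListSplit = splitOnce(new LinkedList<>(values));

        // ArrayList splits by index: prints "ArrayList:  500 / 500"
        System.out.println("ArrayList:  " + arrayListSplit[0] + " / " + arrayListSplit[1]);
        // LinkedList has to walk its nodes to split; the sizes are implementation-dependent
        System.out.println("LinkedList: " + linkedListSplit[0] + " / " + linkedListSplit[1]);
    }
}
```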

Splitting and dispatching work

This is done by the Fork/Join Framework. The Fork/Join Framework handles a pool of threads, created when your application is launched, called the Common Fork/Join Pool. The number of threads in this pool is aligned with the number of cores your CPU has: by default, its parallelism is the number of available processors minus one, because the thread that starts the computation also takes part in it. Each thread in this pool has a waiting queue, in which the thread can store tasks.
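If you want to check these numbers on your own machine, a minimal sketch such as the following prints the processor count and the parallelism of the common pool. The class and method names are hypothetical; the two JDK calls, Runtime.availableProcessors() and ForkJoinPool.getCommonPoolParallelism(), are standard.

```java
import java.util.concurrent.ForkJoinPool;

// Hypothetical demo class: reports the sizing of the common Fork/Join pool.
class CommonPoolInfo {

    static int cores() {
        return Runtime.getRuntime().availableProcessors();
    }

    static int commonParallelism() {
        return ForkJoinPool.getCommonPoolParallelism();
    }

    public static void main(String[] args) {
        System.out.println("available processors    = " + cores());
        System.out.println("common pool parallelism = " + commonParallelism());
    }
}
```

The printed values depend on the machine and on JVM settings such as the java.util.concurrent.ForkJoinPool.common.parallelism system property.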

The first thread of the pool creates a first task. The execution of this task decides if the computation is small enough to be computed sequentially or is too big and should be split.

If it is split, then two subtasks are created and stored in the queue of that thread. The main task then waits for the two sub-tasks to complete. While waiting, it is also stored in this waiting queue.

If, instead, the computation is small enough and is carried out sequentially, then a result is produced. This result is a partial result of the whole computation. This task then returns the result to the main task that created it.

Once a task has the two results of the two subtasks it created, it can merge them to produce a result and return it to the main task that created it.
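The split, compute, and merge steps described above can be sketched with a RecursiveTask that sums an int array. This is only an illustration of the pattern, not the Stream API's actual internal code; the class name SumTask and the THRESHOLD value are assumptions chosen for the example.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;
import java.util.stream.IntStream;

// Hypothetical task: sums data[from, to) by recursive decomposition.
class SumTask extends RecursiveTask<Long> {

    // Assumed cut-off below which the work is done sequentially.
    private static final int THRESHOLD = 1_000;

    private final int[] data;
    private final int from;
    private final int to;

    SumTask(int[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        // Small enough: compute a partial result sequentially.
        if (to - from <= THRESHOLD) {
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += data[i];
            }
            return sum;
        }
        // Too big: split into two subtasks.
        int mid = (from + to) >>> 1;
        SumTask left = new SumTask(data, from, mid);
        SumTask right = new SumTask(data, mid, to);
        left.fork();                        // queue the left half for another thread
        long rightResult = right.compute(); // process the right half in this thread
        long leftResult = left.join();      // wait for the left half
        return leftResult + rightResult;    // merge the two partial results
    }

    static long parallelSum(int[] data) {
        return ForkJoinPool.commonPool().invoke(new SumTask(data, 0, data.length));
    }

    public static void main(String[] args) {
        int[] data = IntStream.range(0, 10_000).toArray();
        System.out.println("sum = " + parallelSum(data)); // prints "sum = 49995000"
    }
}
```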

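Processing the sub-streams

Processing a sub-stream is not the same as processing the whole stream. Two things make it different:

Accessing external state
Carrying state from one sub-stream to another

Both of them affect the performance of a parallel stream.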

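Accessing external state

In a parallel stream, your lambdas may run on threads other than the one that created the stream. Accessing state external to your stream is then done from another thread and may lead to race conditions.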

example1

Set<String> threadNames =
        IntStream.range(0, 100)
                 // .parallel()
                 .mapToObj(index -> Thread.currentThread().getName())
                 .collect(Collectors.toSet());

System.out.println("Thread names:");
threadNames.forEach(System.out::println);

result

Thread names:
main

result with the .parallel() line uncommented (the thread names vary from run to run)

Thread names:
ForkJoinPool.commonPool-worker-3
ForkJoinPool.commonPool-worker-4
ForkJoinPool.commonPool-worker-2
main
ForkJoinPool.commonPool-worker-5

example2

List<Integer> ints = new ArrayList<>();

IntStream.range(0, 1_000_000)
         .parallel()
         .forEach(ints::add);

System.out.println("ints.size() = " + ints.size());

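The external state ints is an ArrayList, which is not thread-safe, so the printed size differs from run to run and is usually smaller than 1,000,000. The code may even fail with an ArrayIndexOutOfBoundsException.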
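One common way to avoid this race, sketched here under the assumption that you simply want the elements in a list, is to let the stream build the list itself with collect() instead of mutating an external one. The class and method names are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical demo class: collects in parallel without shared mutable state.
class SafeCollectDemo {

    static List<Integer> collectInParallel(int count) {
        return IntStream.range(0, count)
                        .parallel()
                        .boxed()
                        // Each thread fills its own list; the lists are merged at the end.
                        .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> ints = collectInParallel(1_000_000);
        System.out.println("ints.size() = " + ints.size()); // prints "ints.size() = 1000000"
    }
}
```

Because no lambda touches state outside the stream, the result is the same on every run.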

Encounter order

In some cases, the order in which the data is processed matters in the Stream API. This is the case for the following methods.

limit(n)
skip(n)
findFirst()

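These three methods need to remember the order in which the elements of the stream are processed, and need to count the elements to produce a correct result.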

They are called stateful operations, because they need to carry an internal state to work.

Having such stateful operations leads to overheads in parallel streams. For instance, limit() needs an internal counter to work correctly. In parallel, this internal counter is shared among different threads. Sharing a mutable state between threads is costly and should be avoided.
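The sketch below contrasts order-dependent calls with their order-free counterparts on a parallel stream. The class and method names are hypothetical. findFirst() and limit() on an ordered stream must respect the encounter order, whereas findAny() and unordered().limit() are free to return whichever elements are available, which gives the implementation more room to avoid coordination.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical demo class: order-dependent vs. order-free operations.
class EncounterOrderDemo {

    // Ordered: always the first five elements, in encounter order.
    static List<Integer> firstFive() {
        return IntStream.range(0, 1_000).parallel().boxed()
                        .limit(5)
                        .collect(Collectors.toList());
    }

    // Unordered: some five elements, not necessarily the first ones.
    static List<Integer> anyFive() {
        return IntStream.range(0, 1_000).parallel().boxed()
                        .unordered()
                        .limit(5)
                        .collect(Collectors.toList());
    }

    // Must return the first element in encounter order.
    static int first() {
        return IntStream.range(0, 1_000).parallel().findFirst().getAsInt();
    }

    // May return any element of the stream.
    static int any() {
        return IntStream.range(0, 1_000).parallel().findAny().getAsInt();
    }

    public static void main(String[] args) {
        System.out.println("firstFive = " + firstFive()); // prints "firstFive = [0, 1, 2, 3, 4]"
        System.out.println("anyFive   = " + anyFive());   // content may vary between runs
        System.out.println("first     = " + first());     // prints "first     = 0"
        System.out.println("any       = " + any());       // value may vary between runs
    }
}
```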

Understanding the overhead of computing a stream in parallel

Computing a stream in parallel adds some computations to handle parallelism. These elements have a cost, and you need to know them to make sure that this cost will not be too high compared to the benefits of going parallel.

Your data needs to be split. Splitting can be cheap, and it can be expensive, depending on the data you process. Bad locality for your data will make the splitting expensive.
Splitting needs to be efficient. It needs to create evenly split sub-streams. Some sources can be evenly split easily, some others can not.
Once split, the implementation processes your data concurrently. You should avoid any access to any external mutable state, and also avoid having an internal shared mutable state.
Then the partial results have to be merged. There are results that can be easily merged. Merging a sum of integers is easy and cheap. Merging collections is also easy. Merging hashmaps is more costly.
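To make the merging step concrete, here is a sketch of three parallel reductions over the same range: a sum, a list, and a map. The class and method names are hypothetical. In each case the stream computes partial results per sub-stream and then merges them; for the list, the third argument of collect() is the merge function, written out explicitly here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical demo class: three kinds of partial results and how they merge.
class MergeDemo {

    // Merging partial sums is a single addition.
    static int sum(int n) {
        return IntStream.range(0, n).parallel().reduce(0, Integer::sum);
    }

    // Merging partial lists appends one list to the other.
    static List<Integer> toList(int n) {
        ArrayList<Integer> result = IntStream.range(0, n).parallel().boxed()
                .collect(() -> new ArrayList<Integer>(),       // one container per sub-stream
                         (list, x) -> list.add(x),             // accumulate an element
                         (left, right) -> left.addAll(right)); // merge two partial results
        return result;
    }

    // Merging partial maps has to combine the entries key by key.
    static Map<Integer, Long> countByLastDigit(int n) {
        return IntStream.range(0, n).parallel().boxed()
                .collect(Collectors.groupingBy(i -> i % 10, Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println("sum  = " + sum(100));                         // prints "sum  = 4950"
        System.out.println("size = " + toList(100).size());               // prints "size = 100"
        System.out.println("count for digit 3 = " + countByLastDigit(100).get(3));
    }
}
```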