Java 8 ParallelStream: Parallel Streams Do Not Necessarily Return Out-of-Order Results


lijunfeng722


Category: Java. Tags: Java, Java8, Stream

Article link: https://blog.csdn.net/u012364631/article/details/89020335


I used to assume that a parallel stream always returns its results out of order. That assumption is wrong.

Stream<String> s = Stream.of("1","2","3","4","5","6","7");
s.parallel().collect(Collectors.toList()); // always returns the elements in encounter order

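Whether the computation runs in parallel or sequentially, and whether the computation and collection steps preserve order, are two separate matters.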

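Whether the collected result is ordered has nothing to do with the stream being parallel or sequential. It depends on the Collector's Characteristics (together with whether the stream itself has an encounter order, as the isOrdered() check in collect() shows).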
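To make this concrete, here is a minimal sketch that collects the same range once sequentially and once in parallel with Collectors.toList() and compares the two lists. The class and method names (ParallelToListOrder, collectSequential, collectParallel) are made up for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ParallelToListOrder {
    // Collects 1..n with a sequential stream.
    static List<Integer> collectSequential(int n) {
        return IntStream.rangeClosed(1, n)
                .boxed()
                .collect(Collectors.toList());
    }

    // Collects 1..n with a parallel stream; toList() is not UNORDERED,
    // so the encounter order of the source is kept.
    static List<Integer> collectParallel(int n) {
        return IntStream.rangeClosed(1, n)
                .parallel()
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Both lists hold the same elements in the same order.
        System.out.println(collectParallel(100_000).equals(collectSequential(100_000))); // prints "true"
    }
}
```

The work may be split across several threads, but the partial lists are merged back in encounter order.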

    enum Characteristics {
        /**
         * Indicates that this collector is <em>concurrent</em>, meaning that
         * the result container can support the accumulator function being
         * called concurrently with the same result container from multiple
         * threads.
         *
         * <p>If a {@code CONCURRENT} collector is not also {@code UNORDERED},
         * then it should only be evaluated concurrently if applied to an
         * unordered data source. // i.e. a CONCURRENT collector that is not also UNORDERED is only evaluated concurrently on an unordered source
         */
        CONCURRENT, // marks the result container as thread-safe, e.g. ConcurrentHashMap

        /**
         * Indicates that the collection operation does not commit to preserving
         * the encounter order of input elements.  (This might be true if the
         * result container has no intrinsic order, such as a {@link Set}.)
         */
        UNORDERED,

        /**
         * Indicates that the finisher function is the identity function and
         * can be elided.  If set, it must be the case that an unchecked cast
         * from A to R will succeed.
         */
        IDENTITY_FINISH
    }

The collector returned by Collectors.toList(), however, declares only IDENTITY_FINISH, as the source of Collectors.toList() shows.

So s.parallel().collect(Collectors.toList()) is guaranteed to return the elements in encounter order.
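The characteristics can also be checked at runtime instead of reading the JDK source. The sketch below queries three built-in collectors; the class name CollectorCharacteristicsDemo and its helper methods are made up for illustration, and the type arguments passed to toConcurrentMap are arbitrary.

```java
import java.util.Set;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class CollectorCharacteristicsDemo {
    static Set<Collector.Characteristics> ofToList() {
        return Collectors.<String>toList().characteristics();
    }

    static Set<Collector.Characteristics> ofToSet() {
        return Collectors.<String>toSet().characteristics();
    }

    static Set<Collector.Characteristics> ofToConcurrentMap() {
        return Collectors.<String, String, String>toConcurrentMap(s -> s, s -> s).characteristics();
    }

    public static void main(String[] args) {
        // toList() declares only IDENTITY_FINISH, so it keeps encounter order.
        System.out.println("toList: " + ofToList());
        // toSet() additionally declares UNORDERED.
        System.out.println("toSet: " + ofToSet());
        // toConcurrentMap() declares CONCURRENT and UNORDERED as well.
        System.out.println("toConcurrentMap: " + ofToConcurrentMap());
    }
}
```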

It is also worth looking at the implementation of collect():

    public final <R, A> R collect(Collector<? super P_OUT, A, R> collector) {
        A container;
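        // taken only if the stream is parallel, the collector is CONCURRENT, and either the stream is unordered or the collector is UNORDERED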
        if (isParallel()//
                && (collector.characteristics().contains(Collector.Characteristics.CONCURRENT))
                && (!isOrdered() || collector.characteristics().contains(Collector.Characteristics.UNORDERED))) {
            container = collector.supplier().get();
            BiConsumer<A, ? super P_OUT> accumulator = collector.accumulator();
            forEach(u -> accumulator.accept(container, u)); // results collected on this path are unordered
        }
        else {
            container = evaluate(ReduceOps.makeRef(collector)); // results collected on this path keep the encounter order of an ordered stream, yet the evaluation can still be parallel
        }
        return collector.characteristics().contains(Collector.Characteristics.IDENTITY_FINISH)
               ? (R) container
               : collector.finisher().apply(container);
    }

    final <R> R evaluate(TerminalOp<E_OUT, R> terminalOp) {
        assert getOutputShape() == terminalOp.inputShape();
        if (linkedOrConsumed)
            throw new IllegalStateException(MSG_STREAM_LINKED);
        linkedOrConsumed = true;

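        return isParallel() // whether the stream is parallel decides whether evaluation is parallel (using the Spliterator); this is independent of whether the collector is CONCURRENT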
               ? terminalOp.evaluateParallel(this, sourceSpliterator(terminalOp.getOpFlags()))
               : terminalOp.evaluateSequential(this, sourceSpliterator(terminalOp.getOpFlags()));
    }

        @Override
        public <S> Void evaluateParallel(PipelineHelper<T> helper,
                                         Spliterator<S> spliterator) {
            if (ordered) // a parallel stream can still process elements in encounter order
                new ForEachOrderedTask<>(helper, spliterator, this).invoke();
            else
                new ForEachTask<>(helper, spliterator, helper.wrapSink(this)).invoke();
            return null;
        }
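The two branches of collect() shown above can be told apart from outside by counting how often the collector's supplier runs: the concurrent branch calls supplier().get() exactly once and shares that container, while the reduction branch creates a container per leaf task and merges them. The sketch below assumes a hypothetical helper collector (countingCollector) built with Collector.of; it is not part of the JDK.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collector;
import java.util.stream.IntStream;

class CollectBranchDemo {
    // Hypothetical helper: a queue collector that counts supplier invocations.
    // The container is thread-safe so it is valid on the concurrent branch too.
    static Collector<Integer, Queue<Integer>, Queue<Integer>> countingCollector(
            AtomicInteger supplierCalls, Collector.Characteristics... characteristics) {
        return Collector.<Integer, Queue<Integer>>of(
                () -> {
                    supplierCalls.incrementAndGet();
                    return new ConcurrentLinkedQueue<Integer>();
                },
                (queue, element) -> queue.add(element),
                (left, right) -> {
                    left.addAll(right);
                    return left;
                },
                characteristics);
    }

    // Runs a parallel collect over 0..9999 and returns how often the supplier ran.
    static int supplierCalls(Collector.Characteristics... characteristics) {
        AtomicInteger calls = new AtomicInteger();
        Queue<Integer> result = IntStream.range(0, 10_000)
                .boxed()
                .parallel()
                .collect(countingCollector(calls, characteristics));
        if (result.size() != 10_000) {
            throw new IllegalStateException("lost elements: " + result.size());
        }
        return calls.get();
    }

    public static void main(String[] args) {
        // CONCURRENT + UNORDERED on a parallel stream: one shared container.
        System.out.println("concurrent branch: " + supplierCalls(
                Collector.Characteristics.CONCURRENT, Collector.Characteristics.UNORDERED));
        // No extra characteristics: reduction branch, one container per leaf task.
        System.out.println("reduction branch: " + supplierCalls());
    }
}
```

The first count is 1 by construction of the branch; the second depends on how the source is split and is usually larger than 1.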

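In short, choosing a parallel or a sequential stream only decides whether the work runs in parallel; it is a separate matter from the collector's Characteristics.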

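You can have parallel computation with ordered or unordered collection.

You can also have sequential computation with ordered collection.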

See also, for the factors that do or do not affect result order: https://www.baeldung.com/java-stream-ordering

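Key points:

Intermediate operations do not affect the result order, except for sorted() and unordered() (the linked article also lists empty(), although Stream.empty() is a static factory rather than an intermediate operation).
Terminal operations: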

forEach logs the elements in the order they arrive from each thread: list.stream().parallel().forEach(e -> logger.log(Level.INFO, e));
forEachOrdered guarantees encounter order, even on a parallel stream: list.stream().parallel().forEachOrdered(e -> logger.log(Level.INFO, e));
The order after collect() depends on the specific collector: the collector returned by Collectors.toSet() is UNORDERED, whereas the one returned by toList() is not.
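The difference between the two terminal operations can be sketched as follows. The class name ForEachOrderDemo and its helper methods are made up for illustration, and a synchronized list stands in for the logger so the outcome can be inspected.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ForEachOrderDemo {
    // forEach on a parallel stream: elements arrive in whatever order threads deliver them.
    static List<Integer> viaForEach(List<Integer> source) {
        List<Integer> out = Collections.synchronizedList(new ArrayList<>());
        source.parallelStream().forEach(out::add);
        return out;
    }

    // forEachOrdered on a parallel stream: elements are handed over in encounter order.
    static List<Integer> viaForEachOrdered(List<Integer> source) {
        List<Integer> out = Collections.synchronizedList(new ArrayList<>());
        source.parallelStream().forEachOrdered(out::add);
        return out;
    }

    public static void main(String[] args) {
        List<Integer> source = IntStream.range(0, 10_000).boxed().collect(Collectors.toList());

        // Encounter order is kept, so the copy equals the source.
        System.out.println(viaForEachOrdered(source).equals(source)); // prints "true"

        // Same elements, but the order is not guaranteed; sorting restores it.
        List<Integer> unordered = new ArrayList<>(viaForEach(source));
        Collections.sort(unordered);
        System.out.println(unordered.equals(source)); // prints "true"
    }
}
```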
