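Java 8 ParallelStream: the order of the returned result

1. Preface

I used to assume that a parallel stream always returns its results out of order. That assumption is wrong.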

Stream<String> s = Stream.of("1", "2", "3", "4", "5", "6", "7");
List<String> result = s.parallel().collect(Collectors.toList()); // always returned in encounter order

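2. Source code

Whether the collected result is ordered has nothing to do with the stream being parallel or sequential. For an ordered source it depends only on the Characteristics of the Collector.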

    enum Characteristics {
        /**
         * Indicates that this collector is <em>concurrent</em>, meaning that
         * the result container can support the accumulator function being
         * called concurrently with the same result container from multiple
         * threads.
         *
         * <p>If a {@code CONCURRENT} collector is not also {@code UNORDERED},
         * then it should only be evaluated concurrently if applied to an
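         * unordered data source. // i.e. a CONCURRENT collector that is not UNORDERED is evaluated concurrently only on an unordered source
         */
        CONCURRENT, // marks the result container as thread-safe, e.g. ConcurrentHashMap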
 
        /**
         * Indicates that the collection operation does not commit to preserving
         * the encounter order of input elements.  (This might be true if the
         * result container has no intrinsic order, such as a {@link Set}.)
         */
        UNORDERED,
 
        /**
         * Indicates that the finisher function is the identity function and
         * can be elided.  If set, it must be the case that an unchecked cast
         * from A to R will succeed.
         */
        IDENTITY_FINISH
    }
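
To make the three flags concrete, here is a minimal sketch that prints the characteristics a few built-in collectors report. The class name CollectorCharacteristicsDemo and the helper of() are made up for this illustration; only Collector.characteristics() and the Collectors factory methods come from the JDK.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class CollectorCharacteristicsDemo {

    // Hypothetical helper: returns the characteristics a collector reports about itself.
    static Set<Collector.Characteristics> of(Collector<?, ?, ?> collector) {
        return collector.characteristics();
    }

    public static void main(String[] args) {
        // A collector backed by a ConcurrentMap, declared with explicit types.
        Collector<String, ?, ConcurrentMap<String, String>> concurrent =
                Collectors.toConcurrentMap(s -> s, s -> s);

        // toList(): identity finisher only, neither CONCURRENT nor UNORDERED.
        System.out.println("toList          -> " + of(Collectors.toList()));
        // toSet(): additionally UNORDERED, because a Set has no intrinsic order.
        System.out.println("toSet           -> " + of(Collectors.toSet()));
        // toConcurrentMap(): CONCURRENT and UNORDERED.
        System.out.println("toConcurrentMap -> " + of(concurrent));
    }
}
```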

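The collector returned by Collectors.toList() has only the IDENTITY_FINISH characteristic (it is neither CONCURRENT nor UNORDERED); see the source of Collectors.toList():

    /**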
     * Returns a {@code Collector} that accumulates the input elements into a
     * new {@code List}. There are no guarantees on the type, mutability,
     * serializability, or thread-safety of the {@code List} returned; if more
     * control over the returned {@code List} is required, use {@link #toCollection(Supplier)}.
     *
     * @param <T> the type of the input elements
     * @return a {@code Collector} which collects all the input elements into a
     * {@code List}, in encounter order
     */
    public static <T>
    Collector<T, ?, List<T>> toList() {
        return new CollectorImpl<>((Supplier<List<T>>) ArrayList::new, List::add,
                                   (left, right) -> { left.addAll(right); return left; },
                                   CH_ID);
    }

So s.parallel().collect(Collectors.toList()) always returns the elements in encounter order.
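
The claim is easy to check. The sketch below collects the same range once through a parallel stream and once through a sequential one, then compares the two lists element by element. The class name ParallelToListDemo and the method collectInParallel are made up for this illustration.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ParallelToListDemo {

    // Collects 0..n-1 into a List through a parallel stream.
    static List<Integer> collectInParallel(int n) {
        return IntStream.range(0, n)
                .parallel()
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> parallel = collectInParallel(100_000);
        List<Integer> sequential = IntStream.range(0, 100_000)
                .boxed()
                .collect(Collectors.toList());
        // List.equals compares position by position, so order matters here.
        System.out.println(parallel.equals(sequential)); // prints true
    }
}
```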

It is also worth looking at the implementation of collect():

    public final <R, A> R collect(Collector<? super P_OUT, A, R> collector) {
        A container;
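        // taken only if the stream is parallel, the collector is CONCURRENT,
        // and either the stream is unordered or the collector is UNORDERED
        if (isParallel()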
                && (collector.characteristics().contains(Collector.Characteristics.CONCURRENT))
                && (!isOrdered() || collector.characteristics().contains(Collector.Characteristics.UNORDERED))) {
            container = collector.supplier().get();
            BiConsumer<A, ? super P_OUT> accumulator = collector.accumulator();
            forEach(u -> accumulator.accept(container, u)); // a result collected this way is unordered
        }
        else {
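            container = evaluate(ReduceOps.makeRef(collector)); // a result collected this way keeps encounter order (unless the collector is UNORDERED), yet the work can still run in parallel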
        }
        return collector.characteristics().contains(Collector.Characteristics.IDENTITY_FINISH)
               ? (R) container
               : collector.finisher().apply(container);
    }

    // ReferencePipeline.Head
    @Override
    public void forEach(Consumer<? super E_OUT> action) {
        if (!isParallel()) {
            sourceStageSpliterator().forEachRemaining(action);
        }
        else {
            super.forEach(action);
        }
    }

    @Override
    public void forEach(Consumer<? super P_OUT> action) {
        evaluate(ForEachOps.makeRef(action, false));     // boolean ordered: false
    }


    public static <T> TerminalOp<T, Void> makeRef(Consumer<? super T> action,
                                                  boolean ordered) {
        Objects.requireNonNull(action);
        return new ForEachOp.OfRef<>(action, ordered);
    }
    final <R> R evaluate(TerminalOp<E_OUT, R> terminalOp) {
        assert getOutputShape() == terminalOp.inputShape();
        if (linkedOrConsumed)
            throw new IllegalStateException(MSG_STREAM_LINKED);
        linkedOrConsumed = true;
 
        return isParallel() // the stream's parallel flag alone decides whether evaluation is parallel (via the Spliterator); whether the collector is CONCURRENT plays no part
               ? terminalOp.evaluateParallel(this, sourceSpliterator(terminalOp.getOpFlags()))
               : terminalOp.evaluateSequential(this, sourceSpliterator(terminalOp.getOpFlags()));
    }
        @Override
        public <S> Void evaluateParallel(PipelineHelper<T> helper,
                                         Spliterator<S> spliterator) {
            if (ordered) // a parallel stream can still process elements in encounter order
                new ForEachOrderedTask<>(helper, spliterator, this).invoke();
            else
                new ForEachTask<>(helper, spliterator, helper.wrapSink(this)).invoke();
            return null;
        }
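
The branch in collect() can be restated as a small predicate. The following is only a sketch of that condition and does not reproduce any JDK internals: the class CollectPathSketch and the method usesConcurrentPath are hypothetical names, and the two boolean parameters stand in for isParallel() and isOrdered().

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collector;
import java.util.stream.Collector.Characteristics;
import java.util.stream.Collectors;

class CollectPathSketch {

    // Hypothetical helper mirroring the if-condition of collect():
    // true means "one shared container filled concurrently via forEach",
    // false means "ordinary reduction via evaluate(ReduceOps.makeRef(...))".
    static boolean usesConcurrentPath(boolean parallel, boolean streamOrdered,
                                      Collector<?, ?, ?> collector) {
        Set<Characteristics> ch = collector.characteristics();
        return parallel
                && ch.contains(Characteristics.CONCURRENT)
                && (!streamOrdered || ch.contains(Characteristics.UNORDERED));
    }

    public static void main(String[] args) {
        Collector<String, ?, List<String>> toList = Collectors.toList();
        Collector<String, ?, ConcurrentMap<Integer, List<String>>> byLength =
                Collectors.groupingByConcurrent(String::length);

        System.out.println(usesConcurrentPath(true, true, toList));    // false: not CONCURRENT
        System.out.println(usesConcurrentPath(true, true, byLength));  // true: CONCURRENT and UNORDERED
        System.out.println(usesConcurrentPath(false, true, byLength)); // false: stream is sequential
    }
}
```

With toList() the first branch is never taken, which is why the ordered reduction path always applies to it.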

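3. Summary

To be sure that order is maintained across the whole stream, you have to examine the stream's source (check its documentation), whether the stream is sequential or parallel, every intermediate operation, and the terminal operation, and ask of each whether it preserves order.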

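  1. Stream source: if the data source itself is unordered (a HashSet, for example), there is no encounter order to preserve, so discussing the order in which elements are processed is meaningless;

  2. Sequential vs. parallel streams
    Sequential stream: with an ordered source and no intermediate operation that affects order (such as sorting), the terminal operation processes the elements in the same order as the source. If there is such an operation, the terminal operation processes them in the order that results from applying the intermediate operations one after another.
    Parallel stream: even with an ordered source, the order in which the terminal operation processes the elements is not deterministic. forEachOrdered, however, guarantees that a parallel stream processes the elements in the source's encounter order.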

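    Note:
    The order in which elements are processed and the order of the final result are two different things: processing may happen in any order while the final result is still ordered. For example, with something like: List<…> result = inputList.parallelStream().map(…).filter(…).collect(Collectors.toList());
    the whole operation may benefit from parallel execution, but the resulting list is always in the right order, whether you use a parallel or a sequential stream.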

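  3. Intermediate operations
    Of the intermediate operations, only sorted() and unordered() affect the order of the result: sorted() imposes a new order and unordered() drops the ordering constraint. (Stream.empty() is a static factory, not an intermediate operation.)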

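  4. Terminal operations
    The order after collect() depends on the specific collector, and the other terminal operations each have their own guarantees, e.g.
    1) The collector returned by Collectors.toSet() is UNORDERED, while the one returned by toList() is not.
    2) forEach(): on a parallel stream, forEach logs the elements in the order they arrive from each thread: list.stream().parallel().forEach(e -> logger.log(Level.INFO, e));
    3) forEachOrdered(): forEachOrdered preserves the encounter order even on a parallel stream: list.stream().parallel().forEachOrdered(e -> logger.log(Level.INFO, e));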
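
The points in this list can be tried out with a small sketch. It records the order in which map() actually sees the elements of a parallel stream, and compares forEach with forEachOrdered. The class OrderingDemo and its three methods are made up for this illustration; the processing order printed by main will typically differ from run to run, while the collected list and the forEachOrdered list stay in encounter order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class OrderingDemo {

    // Parallel map + toList: 'processed' records the order in which map() ran,
    // the returned list is in encounter order regardless.
    static List<Integer> doubled(List<Integer> input, Queue<Integer> processed) {
        return input.parallelStream()
                .map(x -> { processed.add(x); return x * 2; })
                .collect(Collectors.toList());
    }

    // forEachOrdered visits the elements in encounter order even on a parallel stream.
    static List<Integer> viaForEachOrdered(List<Integer> input) {
        List<Integer> out = Collections.synchronizedList(new ArrayList<>());
        input.parallelStream().forEachOrdered(out::add);
        return out;
    }

    // forEach on a parallel stream visits every element, but in no fixed order.
    static List<Integer> viaForEach(List<Integer> input) {
        List<Integer> out = Collections.synchronizedList(new ArrayList<>());
        input.parallelStream().forEach(out::add);
        return out;
    }

    public static void main(String[] args) {
        List<Integer> input = IntStream.range(0, 20).boxed().collect(Collectors.toList());
        Queue<Integer> processed = new ConcurrentLinkedQueue<>();

        System.out.println("result order:     " + doubled(input, processed));
        System.out.println("processing order: " + processed);
        System.out.println("forEachOrdered:   " + viaForEachOrdered(input));
        System.out.println("forEach:          " + viaForEach(input));
    }
}
```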

4. References

  1. https://www.pianshen.com/article/7167346972/
  2. https://stackoverflow.com/questions/29216588/how-to-ensure-order-of-processing-in-java8-streams
  3. https://blog.csdn.net/weixin_38569499/article/details/87875183