I used to assume that a parallel stream always returns its results out of order. That is wrong.
Stream<String> s = Stream.of("1","2","3","4","5","6","7");
s.parallel().collect(Collectors.toList()); // always returns the elements in encounter order
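Parallel versus sequential execution is one thing; whether processing and collection preserve order is another.
Whether the result is ordered has nothing to do with the stream being parallel or sequential. For collect() it depends on the Collector's Characteristics (and on whether the stream itself is ordered).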
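To check the claim above, here is a minimal sketch that collects the same range once in parallel and once sequentially with Collectors.toList() and compares the two lists. The class and method names (ParallelToListDemo, collectInParallel, matchesSequential) are made up for this illustration.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ParallelToListDemo {
    // Collects 0..n-1 from a parallel stream with Collectors.toList().
    static List<Integer> collectInParallel(int n) {
        return IntStream.range(0, n)
                .parallel()
                .boxed()
                .collect(Collectors.toList());
    }

    // Compares the parallel result with the same range collected sequentially.
    static boolean matchesSequential(int n) {
        List<Integer> sequential = IntStream.range(0, n)
                .boxed()
                .collect(Collectors.toList());
        return collectInParallel(n).equals(sequential);
    }

    public static void main(String[] args) {
        // The work is split across threads, but the list keeps encounter order.
        System.out.println(matchesSequential(10_000));
    }
}
```

The elements may be processed on several threads in any order; only the final list is guaranteed to follow the encounter order of the source.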
enum Characteristics {
/**
* Indicates that this collector is <em>concurrent</em>, meaning that
* the result container can support the accumulator function being
* called concurrently with the same result container from multiple
* threads.
*
* <p>If a {@code CONCURRENT} collector is not also {@code UNORDERED},
* then it should only be evaluated concurrently if applied to an
* unordered data source. // i.e. a CONCURRENT collector that is not also UNORDERED is evaluated concurrently only on an unordered source
*/
CONCURRENT, // marks the result container as thread-safe, e.g. ConcurrentHashMap
/**
* Indicates that the collection operation does not commit to preserving
* the encounter order of input elements. (This might be true if the
* result container has no intrinsic order, such as a {@link Set}.)
*/
UNORDERED,
/**
* Indicates that the finisher function is the identity function and
* can be elided. If set, it must be the case that an unchecked cast
* from A to R will succeed.
*/
IDENTITY_FINISH
}
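The collector returned by Collectors.toList() has only the IDENTITY_FINISH characteristic (see the source of Collectors.toList()); it is neither CONCURRENT nor UNORDERED.
So s.parallel().collect(Collectors.toList()) always returns the elements in encounter order.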
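Rather than reading the JDK source, the characteristics can also be inspected at runtime through Collector.characteristics(). The sketch below is only an illustration; the class CharacteristicsDemo and its helper methods are hypothetical names, and toConcurrentMap() is included merely as an example of a collector that reports CONCURRENT.

```java
import java.util.stream.Collector;
import java.util.stream.Collectors;

class CharacteristicsDemo {
    // Renders the characteristics set of any collector as text.
    static String describe(Collector<?, ?, ?> collector) {
        return collector.characteristics().toString();
    }

    static String toListCharacteristics() {
        return describe(Collectors.<String>toList());
    }

    static String toSetCharacteristics() {
        return describe(Collectors.<String>toSet());
    }

    static String toConcurrentMapCharacteristics() {
        return describe(Collectors.<String, String, Integer>toConcurrentMap(s -> s, String::length));
    }

    public static void main(String[] args) {
        System.out.println("toList:          " + toListCharacteristics());
        System.out.println("toSet:           " + toSetCharacteristics());
        System.out.println("toConcurrentMap: " + toConcurrentMapCharacteristics());
    }
}
```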
It also helps to look at how collect() is implemented:
public final <R, A> R collect(Collector<? super P_OUT, A, R> collector) {
A container;
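// taken only if the stream is parallel, the collector is CONCURRENT, and either the stream is unordered or the collector is UNORDERED
if (isParallel()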
&& (collector.characteristics().contains(Collector.Characteristics.CONCURRENT))
&& (!isOrdered() || collector.characteristics().contains(Collector.Characteristics.UNORDERED))) {
container = collector.supplier().get();
BiConsumer<A, ? super P_OUT> accumulator = collector.accumulator();
forEach(u -> accumulator.accept(container, u)); // all threads add to one shared container, so the collected result is unordered
}
else {
container = evaluate(ReduceOps.makeRef(collector)); // this path preserves encounter order, yet the computation can still run in parallel
}
return collector.characteristics().contains(Collector.Characteristics.IDENTITY_FINISH)
? (R) container
: collector.finisher().apply(container);
}
final <R> R evaluate(TerminalOp<E_OUT, R> terminalOp) {
assert getOutputShape() == terminalOp.inputShape();
if (linkedOrConsumed)
throw new IllegalStateException(MSG_STREAM_LINKED);
linkedOrConsumed = true;
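return isParallel() // whether the stream is parallel decides whether evaluation runs in parallel (via the Spliterator); this is unrelated to whether the collector is CONCURRENT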
? terminalOp.evaluateParallel(this, sourceSpliterator(terminalOp.getOpFlags()))
: terminalOp.evaluateSequential(this, sourceSpliterator(terminalOp.getOpFlags()));
}
// ForEachOps.ForEachOp, the terminal operation behind forEach() and forEachOrdered():
@Override
public <S> Void evaluateParallel(PipelineHelper<T> helper,
Spliterator<S> spliterator) {
if (ordered) // a parallel stream can still process elements in encounter order
new ForEachOrderedTask<>(helper, spliterator, this).invoke();
else
new ForEachTask<>(helper, spliterator, helper.wrapSink(this)).invoke();
return null;
}
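The first branch of collect() above creates a single container and lets every thread accumulate into it, while the other branch goes through a reduction that builds partial containers and merges them. A rough way to observe this is to count how often the supplier is called. This is a sketch under assumptions: SharedContainerDemo and supplierCalls are hypothetical names, the collector is a hand-written one backed by ConcurrentLinkedQueue, and the behaviour it shows follows the JDK source quoted above rather than any documented guarantee.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;
import java.util.function.Supplier;
import java.util.stream.Collector;
import java.util.stream.IntStream;

class SharedContainerDemo {
    // Runs a parallel collect and returns how many containers the supplier created.
    static int supplierCalls(boolean concurrentAndUnordered) {
        AtomicInteger calls = new AtomicInteger();
        Supplier<Queue<Integer>> supplier = () -> {
            calls.incrementAndGet();
            return new ConcurrentLinkedQueue<>();
        };
        BiConsumer<Queue<Integer>, Integer> accumulator = Queue::add;
        BinaryOperator<Queue<Integer>> combiner = (a, b) -> {
            a.addAll(b);
            return a;
        };

        Collector<Integer, Queue<Integer>, Queue<Integer>> collector;
        if (concurrentAndUnordered) {
            // Eligible for the shared-container branch of collect().
            collector = Collector.of(supplier, accumulator, combiner,
                    Collector.Characteristics.CONCURRENT,
                    Collector.Characteristics.UNORDERED);
        } else {
            // No extra characteristics: goes through the reduction branch.
            collector = Collector.of(supplier, accumulator, combiner);
        }

        Queue<Integer> result = IntStream.range(0, 10_000).parallel().boxed().collect(collector);
        if (result.size() != 10_000) {
            throw new IllegalStateException("elements were lost");
        }
        return calls.get();
    }

    public static void main(String[] args) {
        System.out.println("CONCURRENT + UNORDERED: " + supplierCalls(true));
        System.out.println("plain collector:        " + supplierCalls(false));
    }
}
```

With the CONCURRENT and UNORDERED collector the supplier should run exactly once; with the plain collector the count depends on how the source was split.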
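In short, parallel versus sequential only decides whether the work runs in parallel; the collector's Characteristics are a separate matter.
The computation can be parallel with ordered or unordered collection,
or it can be sequential with ordered collection.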
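The combination "parallel computation, ordered or unordered collection" can be illustrated with groupingBy() and groupingByConcurrent() on the same parallel stream. This is a minimal sketch; GroupingOrderDemo and its helpers are names invented for the example. groupingBy() is neither CONCURRENT nor UNORDERED, so each group's list keeps encounter order, whereas groupingByConcurrent() is both, so only the contents of each group are dependable, not their order.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class GroupingOrderDemo {
    // Parallel computation, ordered collection.
    static Map<Boolean, List<Integer>> ordered(int n) {
        return IntStream.range(0, n).parallel().boxed()
                .collect(Collectors.groupingBy(i -> i % 2 == 0));
    }

    // Parallel computation, unordered collection into one shared concurrent map.
    static Map<Boolean, List<Integer>> concurrent(int n) {
        return IntStream.range(0, n).parallel().boxed()
                .collect(Collectors.groupingByConcurrent(i -> i % 2 == 0));
    }

    // True if every group's list is strictly ascending, i.e. in encounter order here.
    static boolean allGroupsAscending(Map<Boolean, List<Integer>> groups) {
        for (List<Integer> group : groups.values()) {
            for (int i = 1; i < group.size(); i++) {
                if (group.get(i - 1) >= group.get(i)) {
                    return false;
                }
            }
        }
        return true;
    }

    static int totalSize(Map<Boolean, List<Integer>> groups) {
        return groups.values().stream().mapToInt(List::size).sum();
    }

    public static void main(String[] args) {
        System.out.println("groupingBy ascending:           " + allGroupsAscending(ordered(10_000)));
        // May print true or false from run to run; no order is promised.
        System.out.println("groupingByConcurrent ascending: " + allGroupsAscending(concurrent(10_000)));
    }
}
```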
See also, on the factors that affect whether the result is ordered: https://www.baeldung.com/java-stream-ordering
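Key points:
- Intermediate operations other than sorted(), unordered() and empty() do not affect the order of the result.
- Terminal operations:
  - forEach() handles the elements in whatever order they arrive from each thread: list.stream().parallel().forEach(e -> logger.log(Level.INFO, e));
  - forEachOrdered() guarantees encounter order, even on a parallel stream: list.stream().parallel().forEachOrdered(e -> logger.log(Level.INFO, e));
  - The order after collect() depends on the collector: the one returned by Collectors.toSet() is UNORDERED, while the one returned by toList() is not.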
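The difference between forEach() and forEachOrdered() listed above can be tried out with a small sketch that records elements into a synchronized list instead of a logger. ForEachOrderDemo and its methods are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.IntStream;

class ForEachOrderDemo {
    // forEachOrdered(): elements are delivered in encounter order even though the stream is parallel.
    static List<Integer> viaForEachOrdered(int n) {
        List<Integer> out = Collections.synchronizedList(new ArrayList<>());
        IntStream.range(0, n).parallel().boxed().forEachOrdered(out::add);
        return out;
    }

    // forEach(): every element is delivered, but in whatever order the threads reach it.
    static List<Integer> viaForEach(int n) {
        List<Integer> out = Collections.synchronizedList(new ArrayList<>());
        IntStream.range(0, n).parallel().boxed().forEach(out::add);
        return out;
    }

    // True if the list is exactly 0, 1, ..., size-1.
    static boolean isInEncounterOrder(List<Integer> xs) {
        for (int i = 0; i < xs.size(); i++) {
            if (xs.get(i) != i) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("forEachOrdered in order: " + isInEncounterOrder(viaForEachOrdered(10_000)));
        // May print true or false from run to run; forEach() makes no ordering promise.
        System.out.println("forEach in order:        " + isInEncounterOrder(viaForEach(10_000)));
    }
}
```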
References:
- https://stackoverflow.com/questions/29710999/is-collect-guaranteed-to-be-ordered-on-parallel-streams
- Three kinds of order: encounter order / processing order / collecting order https://stackoverflow.com/questions/29709140/why-parallel-stream-get-collected-sequentially-in-java-8/29713386#29713386
- Using unordered() to mark a stream as unordered, which affects the processing order https://stackoverflow.com/questions/50625544/confusion-about-characteristics-unordered-in-java-8-in-action-book