Java Stream Parallel: How the Number of Tasks Is Determined
1. Measuring the number of tasks
Collection size | Number of tasks |
4 | 4 |
10 | 10 |
56 | 32 |
100 | 36 |
1000 | 32 |
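The code is shown below; change the value of num to get the different task counts.

    // Number of elements in the source list
    int num = 100;
    List<Integer> list = new ArrayList<>();
    for (int i = 0; i < num; i++) {
        list.add(0);
    }

    // Every element is 0 and the "identity" passed to reduce is 1, so each leaf
    // task contributes exactly 1 and the result equals the number of leaf tasks.
    Integer reduce = list.stream().parallel().reduce(1, Integer::sum, Integer::sum);
    System.out.println(reduce);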
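To reproduce the whole table in one run, the snippet can be wrapped in a small program. This is only a sketch: the class name `LeafTaskExperiment` and the method `countLeafTasks` are made up for illustration, and the printed counts depend on the parallelism of the common pool on the machine where it runs (the 8-core machine mentioned later in this post has parallelism 7).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;

class LeafTaskExperiment {

    // Hypothetical helper: returns the number of leaf tasks that a parallel
    // reduce creates for a list of `num` zeros.
    static int countLeafTasks(int num) {
        List<Integer> list = new ArrayList<>();
        for (int i = 0; i < num; i++) {
            list.add(0);
        }
        // 1 is deliberately not a real identity for sum: each leaf starts from it,
        // so the final sum counts the leaves.
        return list.stream().parallel().reduce(1, Integer::sum, Integer::sum);
    }

    public static void main(String[] args) {
        System.out.println("common pool parallelism = "
                + ForkJoinPool.getCommonPoolParallelism());
        for (int num : new int[] {4, 10, 56, 100, 1000}) {
            System.out.println(num + " -> " + countLeafTasks(num));
        }
    }
}
```

Very small lists are split down to single elements regardless of the core count, so `countLeafTasks(4)` should return 4 on any machine; the larger sizes vary with the parallelism.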
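2. The task-splitting algorithm
We already know that Java parallel streams are implemented on top of ForkJoinPool, which brings the classic divide-and-conquer algorithm to mind. That algorithm needs a threshold below which a task is no longer split, so I looked at the source, shown below: sizeThreshold is the threshold we are after. Following the code further shows that sizeThreshold is computed dynamically.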
java.util.stream.AbstractTask#compute
    Spliterator<P_IN> rs = spliterator, ls; // right, left spliterators
    long sizeEstimate = rs.estimateSize();
    long sizeThreshold = getTargetSize(sizeEstimate);
    boolean forkRight = false;
    @SuppressWarnings("unchecked") K task = (K) this;
    while (sizeEstimate > sizeThreshold && (ls = rs.trySplit()) != null) {
        K leftChild, rightChild, taskToFork;
        task.leftChild = leftChild = task.makeChild(ls);
        task.rightChild = rightChild = task.makeChild(rs);
        task.setPendingCount(1);
        if (forkRight) {
            forkRight = false;
            rs = ls;
            task = leftChild;
            taskToFork = rightChild;
        } else {
            forkRight = true;
            task = rightChild;
            taskToFork = leftChild;
        }
        taskToFork.fork();
        sizeEstimate = rs.estimateSize();
    }
    task.setLocalResult(task.doLeaf());
    task.tryComplete();
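sizeThreshold = collectionSize / (4 * (number of cores - 1)), using integer division and never going below 1. The factor (number of cores - 1) is the default parallelism of the common ForkJoinPool. For example, with 100 elements on an 8-core machine, the default threshold below which a task is no longer split is 100 / 28 = 3.
From this threshold, a 100-element collection works out to 36 tasks: the list is halved repeatedly until every piece has at most 3 elements. The formula aims at roughly 4 tasks per core, as the comment in the JDK source says: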
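To make the arithmetic concrete, here is a minimal sketch of the threshold formula applied to the collection sizes from the table. The class `TargetSizeSketch` and the method `targetSize` are hypothetical names, not JDK API, and the parallelism of 7 is an assumption that matches the 8-core example.

```java
import java.util.concurrent.ForkJoinPool;

class TargetSizeSketch {

    // Hypothetical helper mirroring the formula: size / (4 * parallelism),
    // with integer division and a lower bound of 1.
    static long targetSize(long size, int parallelism) {
        long est = size / (parallelism << 2);
        return est > 0L ? est : 1L;
    }

    public static void main(String[] args) {
        int assumedParallelism = 7; // assumption: 8 cores, so 8 - 1 = 7
        for (long size : new long[] {4, 10, 56, 100, 1000}) {
            System.out.println(size + " -> threshold "
                    + targetSize(size, assumedParallelism));
        }
        // The value on the current machine may differ from the assumed 7.
        System.out.println("parallelism on this machine: "
                + ForkJoinPool.getCommonPoolParallelism());
    }
}
```

With parallelism 7 the thresholds for 4, 10, 56, 100 and 1000 elements are 1, 1, 2, 3 and 35. The lower bound of 1 is why the two smallest collections end up with one task per element.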
    /**
     * Default target factor of leaf tasks for parallel decomposition.
     * To allow load balancing, we over-partition, currently to approximately
     * four tasks per processor, which enables others to help out
     * if leaf tasks are uneven or some processors are otherwise busy.
     */
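The task counts in the table can be reproduced without running a parallel stream at all, by applying the same loop condition as AbstractTask#compute to a real ArrayList spliterator. The sketch below is an illustration, not the JDK implementation: the class `LeafCountSketch` and its methods are hypothetical names, and the thresholds passed in (3 for 100 elements, and so on) assume the 8-core machine from the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Spliterator;

class LeafCountSketch {

    // Same condition as the while loop in AbstractTask#compute: keep splitting
    // while the size estimate exceeds the threshold and trySplit() succeeds.
    static int countLeaves(Spliterator<Integer> rs, long sizeThreshold) {
        Spliterator<Integer> ls;
        if (rs.estimateSize() > sizeThreshold && (ls = rs.trySplit()) != null) {
            // After trySplit(), ls covers the first half and rs the remainder.
            return countLeaves(ls, sizeThreshold) + countLeaves(rs, sizeThreshold);
        }
        return 1; // not split any further: this piece is one leaf task
    }

    // Convenience overload: builds an ArrayList of `size` zeros and counts its leaves.
    static int countLeaves(int size, long sizeThreshold) {
        return countLeaves(
                new ArrayList<>(Collections.nCopies(size, 0)).spliterator(),
                sizeThreshold);
    }

    public static void main(String[] args) {
        // Thresholds assume parallelism 7 (8 cores): size / 28, at least 1.
        System.out.println(countLeaves(4, 1));
        System.out.println(countLeaves(10, 1));
        System.out.println(countLeaves(56, 2));
        System.out.println(countLeaves(100, 3));
        System.out.println(countLeaves(1000, 35));
    }
}
```

For 100 elements and a threshold of 3, the list splits into 50/50, then 25/25, then 12/13, and each group of 25 ends up as 9 leaves, giving 4 * 9 = 36. The five calls in `main` print 4, 10, 32, 36 and 32, matching the table.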