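Overview
Spliterator is mainly used to customize how a parallel stream splits its work into tasks.
Example scenario
Write a simple method that counts the words in a String.
Original implementation
WordCounter class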
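// Immutable accumulator: counter holds the words found so far,
// lastSpace records whether the previous character was whitespace.
public class WordCounter {
    private final int counter;
    private final boolean lastSpace;

    public WordCounter(int counter, boolean lastSpace) {
        this.counter = counter;
        this.lastSpace = lastSpace;
    }

    // Processes one character and returns the resulting state.
    public WordCounter accumulate(Character c) {
        if (Character.isWhitespace(c)) {
            return lastSpace ? this : new WordCounter(counter, true);
        } else {
            // A non-whitespace character right after whitespace starts a new word.
            return lastSpace ? new WordCounter(counter + 1, false) : this;
        }
    }

    // Merges the partial results of two adjacent parts of the input.
    public WordCounter combine(WordCounter wordCounter) {
        return new WordCounter(counter + wordCounter.counter, wordCounter.lastSpace);
    }

    public int getCounter() {
        return counter;
    }
}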
Test
public static void main(String[] args) {
    final String SENTENCE = " Nel mezzo del cammin di nostra vita " +
            "mi ritrovai in una selva oscura" +
            " che la dritta via era smarrita ";
    long startTime = System.nanoTime();
    System.out.println("Found: " + countWordsIteratively(SENTENCE) + " words");
    System.out.println("Spend " + 1.0 * (System.nanoTime() - startTime) / 1_000_000 + " msecs");
}
Output:
Found: 19 words
Spend 0.5934 msecs
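The test calls countWordsIteratively, which is not listed here. A minimal sketch of what such a method might look like, assuming it walks the characters in a plain loop and applies the same rule as WordCounter.accumulate (the class name IterativeWordCount is hypothetical):

```java
class IterativeWordCount {
    // Assumed implementation: a word is counted whenever a non-whitespace
    // character follows whitespace or the start of the string.
    static int countWordsIteratively(String s) {
        int counter = 0;
        boolean lastSpace = true;
        for (char c : s.toCharArray()) {
            if (Character.isWhitespace(c)) {
                lastSpace = true;
            } else {
                if (lastSpace) {
                    counter++;
                }
                lastSpace = false;
            }
        }
        return counter;
    }

    public static void main(String[] args) {
        String sentence = " Nel mezzo del cammin di nostra vita "
                + "mi ritrovai in una selva oscura"
                + " che la dritta via era smarrita ";
        System.out.println("Found: " + countWordsIteratively(sentence) + " words");
    }
}
```

The two local variables play the same role as the counter and lastSpace fields of WordCounter, so the later stream versions can be checked against this loop.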
Improvement 1: use a stream
WordCounter class
import java.util.stream.Stream;

public class WordCounter {
    private final int counter;
    private final boolean lastSpace;

    public WordCounter(int counter, boolean lastSpace) {
        this.counter = counter;
        this.lastSpace = lastSpace;
    }

    // Processes one character and returns the resulting state.
    public WordCounter accumulate(Character c) {
        if (Character.isWhitespace(c)) {
            return lastSpace ? this : new WordCounter(counter, true);
        } else {
            // A non-whitespace character right after whitespace starts a new word.
            return lastSpace ? new WordCounter(counter + 1, false) : this;
        }
    }

    // Merges the partial results of two adjacent parts of the input.
    public WordCounter combine(WordCounter wordCounter) {
        return new WordCounter(counter + wordCounter.counter, wordCounter.lastSpace);
    }

    public int getCounter() {
        return counter;
    }

    // Reduces the characters to one WordCounter, starting from "no words yet, previous character was a space".
    public static int countWords(Stream<Character> stream) {
        WordCounter wordCounter = stream.reduce(new WordCounter(0, true),
                WordCounter::accumulate,
                WordCounter::combine);
        return wordCounter.getCounter();
    }
}
Test
public static void main(String[] args) {
    final String SENTENCE = " Nel mezzo del cammin di nostra vita " +
            "mi ritrovai in una selva oscura" +
            " che la dritta via era smarrita ";
    long startTime = System.nanoTime();
    Stream<Character> stream = IntStream.range(0, SENTENCE.length()).mapToObj(SENTENCE::charAt);
    System.out.println("Found: " + WordCounter.countWords(stream) + " words");
    System.out.println("Spend " + 1.0 * (System.nanoTime() - startTime) / 1_000_000 + " msecs");
}
Output:
Found: 19 words
Spend 74.5587 msecs
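Conclusion:
Compared with 0.5934 msecs without a stream, the stream version is much slower in this measurement. Each figure is a single cold run, so it most likely includes one-time costs such as loading the stream classes and JIT compilation, not only the counting itself.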
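Because the figures above come from one cold run each, a steadier comparison would repeat each call and discard the first runs. A rough sketch, assuming a few thousand warm-up calls are enough; the WarmedTiming class, its Counter copy of WordCounter and the averageMillis helper are all hypothetical, and a dedicated benchmarking harness would give more reliable numbers:

```java
import java.util.function.IntSupplier;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class WarmedTiming {
    // Same state machine as WordCounter, repeated so the sketch is self-contained.
    static final class Counter {
        final int counter;
        final boolean lastSpace;

        Counter(int counter, boolean lastSpace) {
            this.counter = counter;
            this.lastSpace = lastSpace;
        }

        Counter accumulate(Character c) {
            if (Character.isWhitespace(c)) {
                return lastSpace ? this : new Counter(counter, true);
            }
            return lastSpace ? new Counter(counter + 1, false) : this;
        }

        Counter combine(Counter other) {
            return new Counter(counter + other.counter, other.lastSpace);
        }
    }

    static int countWithStream(String s) {
        Stream<Character> stream = IntStream.range(0, s.length()).mapToObj(s::charAt);
        return stream.reduce(new Counter(0, true), Counter::accumulate, Counter::combine).counter;
    }

    static int countIteratively(String s) {
        Counter state = new Counter(0, true);
        for (int i = 0; i < s.length(); i++) {
            state = state.accumulate(s.charAt(i));
        }
        return state.counter;
    }

    // Results are written here so the JIT cannot discard the measured calls.
    static volatile int sink;

    // Runs the task warmupRuns times unmeasured, then returns the mean
    // duration of measuredRuns calls in milliseconds.
    static double averageMillis(IntSupplier task, int warmupRuns, int measuredRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            sink = task.getAsInt();
        }
        long start = System.nanoTime();
        for (int i = 0; i < measuredRuns; i++) {
            sink = task.getAsInt();
        }
        return (System.nanoTime() - start) / 1_000_000.0 / measuredRuns;
    }

    public static void main(String[] args) {
        String sentence = " Nel mezzo del cammin di nostra vita "
                + "mi ritrovai in una selva oscura"
                + " che la dritta via era smarrita ";
        System.out.println("iterative: "
                + averageMillis(() -> countIteratively(sentence), 10_000, 10_000) + " msecs");
        System.out.println("stream:    "
                + averageMillis(() -> countWithStream(sentence), 10_000, 10_000) + " msecs");
    }
}
```

The numbers depend on the machine and JVM, so none are quoted here; the point is only that both variants are measured under the same warmed-up conditions.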
Improvement 2: use a parallel stream
WordCounter class
Unchanged from the stream version above, including the static countWords method.
Test
public static void main(String[] args) {
    final String SENTENCE = " Nel mezzo del cammin di nostra vita " +
            "mi ritrovai in una selva oscura" +
            " che la dritta via era smarrita ";
    long startTime = System.nanoTime();
    Stream<Character> stream = IntStream.range(0, SENTENCE.length()).mapToObj(SENTENCE::charAt);
    System.out.println("Found: " + WordCounter.countWords(stream.parallel()) + " words");
    System.out.println("Spend " + 1.0 * (System.nanoTime() - startTime) / 1_000_000 + " msecs");
}
Output:
Found: 40 words
Spend 172.3421 msecs
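Conclusion:
Compared with 0.5934 msecs without a stream and 74.5587 msecs with a sequential stream, the parallel version is not only slower still, it also returns a wrong result!
The reason is that the stream operates on individual Characters, so when the work is divided for parallel execution the original String is split at arbitrary positions. Each part is reduced starting from the identity WordCounter(0, true), so when a word is cut in two, its second half looks like the start of a new word and the word is counted twice.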
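The double counting can be reproduced by hand without any threads. A small sketch, assuming each part is counted independently from the state (0, true) and the partial counts are then added, which mirrors what the parallel reduce does; ArbitrarySplitDemo and its method names are hypothetical:

```java
class ArbitrarySplitDemo {
    // Counts words with the same rule as WordCounter.accumulate.
    static int count(String s) {
        int counter = 0;
        boolean lastSpace = true;
        for (int i = 0; i < s.length(); i++) {
            if (Character.isWhitespace(s.charAt(i))) {
                lastSpace = true;
            } else {
                if (lastSpace) {
                    counter++;
                }
                lastSpace = false;
            }
        }
        return counter;
    }

    // Counts the two parts separately, each from a fresh state, and adds them.
    static int countWhenSplitAt(String s, int pos) {
        return count(s.substring(0, pos)) + count(s.substring(pos));
    }

    public static void main(String[] args) {
        String text = "Nel mezzo";
        int whole = count(text);                   // "Nel mezzo"
        int midWord = countWhenSplitAt(text, 6);   // "Nel me" + "zzo"
        int atSpace = countWhenSplitAt(text, 3);   // "Nel" + " mezzo"
        System.out.println(whole + " " + midWord + " " + atSpace); // prints "2 3 2"
    }
}
```

Splitting inside "mezzo" yields one word too many, while splitting at the space gives the right total, which is why the split position has to be controlled.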
Improvement 3: support the parallel stream with a Spliterator
WordCounter class
Unchanged from the version with the static countWords method shown earlier.
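WordCounterSpliterator class
import java.util.Spliterator;
import java.util.function.Consumer;

public class WordCounterSpliterator implements Spliterator<Character> {
    private final String string;
    private int currentChar = 0;

    public WordCounterSpliterator(String string) {
        this.string = string;
    }

    @Override
    public boolean tryAdvance(Consumer<? super Character> consumer) {
        // Nothing left: return false without calling the consumer,
        // which also avoids an out-of-bounds read on an empty string.
        if (currentChar >= string.length()) {
            return false;
        }
        consumer.accept(string.charAt(currentChar++));
        return true;
    }

    @Override
    public Spliterator<Character> trySplit() {
        int currentSize = string.length() - currentChar;
        // Too small to be worth splitting: process the rest sequentially.
        if (currentSize < 10) {
            return null;
        }
        // Start in the middle of the remaining text and move forward to the
        // next whitespace, so that a word is never cut in two.
        for (int splitPos = currentSize / 2 + currentChar; splitPos < string.length(); splitPos++) {
            if (Character.isWhitespace(string.charAt(splitPos))) {
                // Hand off the part before the split position; this spliterator keeps the rest.
                Spliterator<Character> spliterator = new WordCounterSpliterator(string.substring(currentChar, splitPos));
                currentChar = splitPos;
                return spliterator;
            }
        }
        return null;
    }

    @Override
    public long estimateSize() {
        return string.length() - currentChar;
    }

    @Override
    public int characteristics() {
        // The characteristics are bit flags, so they are combined with bitwise OR.
        return ORDERED | SIZED | SUBSIZED | NONNULL | IMMUTABLE;
    }
}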
Test
public static void main(String[] args) {
    final String SENTENCE = " Nel mezzo del cammin di nostra vita " +
            "mi ritrovai in una selva oscura" +
            " che la dritta via era smarrita ";
    long startTime = System.nanoTime();
    Spliterator<Character> spliterator = new WordCounterSpliterator(SENTENCE);
    // The second argument (true) already makes the stream parallel.
    Stream<Character> stream = StreamSupport.stream(spliterator, true);
    System.out.println("Found: " + WordCounter.countWords(stream) + " words");
    System.out.println("Spend " + 1.0 * (System.nanoTime() - startTime) / 1_000_000 + " msecs");
}
Output:
Found: 19 words
Spend 278.9621 msecs
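Conclusion:
Compared with 0.5934 msecs without a stream, 74.5587 msecs with a sequential stream and 172.3421 msecs with a plain parallel stream, the result is correct again, but this run is slower still. With an input this short, the cost of splitting and scheduling the tasks most likely outweighs any gain from parallelism.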
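To see where trySplit actually cuts the text, the two halves can be drained into strings. A sketch using a compact variant of the spliterator with the same splitting rule; SplitBoundaryDemo, WordSpliterator, drain and splitOnce are hypothetical names introduced only for this illustration:

```java
import java.util.Spliterator;
import java.util.function.Consumer;

class SplitBoundaryDemo {
    // Compact variant of WordCounterSpliterator: splits only at whitespace.
    static class WordSpliterator implements Spliterator<Character> {
        private final String string;
        private int currentChar = 0;

        WordSpliterator(String string) {
            this.string = string;
        }

        @Override
        public boolean tryAdvance(Consumer<? super Character> action) {
            if (currentChar >= string.length()) {
                return false;
            }
            action.accept(string.charAt(currentChar++));
            return true;
        }

        @Override
        public Spliterator<Character> trySplit() {
            int currentSize = string.length() - currentChar;
            if (currentSize < 10) {
                return null;
            }
            for (int splitPos = currentSize / 2 + currentChar; splitPos < string.length(); splitPos++) {
                if (Character.isWhitespace(string.charAt(splitPos))) {
                    Spliterator<Character> prefix =
                            new WordSpliterator(string.substring(currentChar, splitPos));
                    currentChar = splitPos;
                    return prefix;
                }
            }
            return null;
        }

        @Override
        public long estimateSize() {
            return string.length() - currentChar;
        }

        @Override
        public int characteristics() {
            return ORDERED | SIZED | SUBSIZED | NONNULL | IMMUTABLE;
        }
    }

    // Collects every remaining character of a spliterator into a String.
    static String drain(Spliterator<Character> s) {
        StringBuilder sb = new StringBuilder();
        s.forEachRemaining(c -> sb.append(c.charValue()));
        return sb.toString();
    }

    // Splits once and returns { split-off prefix, remainder }.
    static String[] splitOnce(String text) {
        WordSpliterator rest = new WordSpliterator(text);
        Spliterator<Character> prefix = rest.trySplit();
        return new String[] { prefix == null ? "" : drain(prefix), drain(rest) };
    }

    public static void main(String[] args) {
        String[] parts = splitOnce("mi ritrovai in una selva oscura");
        System.out.println("[" + parts[0] + "] [" + parts[1] + "]");
        // prints "[mi ritrovai in una] [ selva oscura]"
    }
}
```

The remainder begins with the whitespace character at the split position, so neither half starts or ends in the middle of a word and the partial counts add up correctly.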
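Summary
- Spliterator is mainly used to customize how a parallel stream splits its work into tasks;
- a custom Spliterator can fix wrong results that come from splitting the input at arbitrary positions, as in the word-count example;
- a Spliterator makes the code more complex;
- for logic this simple and input this small, going parallel only added time in these measurements, and the Spliterator-backed parallel stream added even more.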