Implementing a Custom Stream Collector in Java 8

Latest recommended article published 2024-08-16 23:03:31

梦想画家

Tags: custom Collector

Article link: https://blog.csdn.net/neweastsun/article/details/89435527


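In the previous article we saw that Java 8 Collectors provides many built-in implementations. Sometimes, however, a business requirement calls for something more specific. This article shows how to implement a custom Collector that computes the total length of all the words in a stream of strings.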

Requirements

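Suppose we have a stream of strings. Every String object has a length() method that returns the length of the word. We want to create a custom Collector that performs a reduction: it computes the sum of the lengths of all the words in the stream.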

Using the Collector.of() method

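To create a custom Collector, we need an implementation of the Collector interface. Instead of the traditional approach of writing an implementing class, we use the static factory method Collector.of() to create it.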

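This is not only more concise and more readable, it also lets us leave out the parts we do not need. The shortest overload of Collector.of() requires only three functions: a supplier, an accumulator, and a combiner.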

Result container supplier

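To implement a Collector we must provide a result container, the place where the accumulated value is stored. The following code supplies the result container: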

() -> new int[1]

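You may wonder why the initial value is new int[1] rather than a plain int variable. The reason is that the Collector interface needs a result container that can be updated mutably. The accumulator returns nothing, so it can only contribute by changing the container in place; a primitive int cannot be changed that way, while the single element of an int array can.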

Accumulating elements (accumulator)

Next, we need a function that adds an element to the result container. In our example, it adds the length of a word to the result container:

(result, item) -> result[0] += item.length()

This function is of type BiConsumer: it returns no value and only updates the result container mutably, that is, the first element of the array.
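To make the types of these two pieces explicit, here is a minimal sketch that runs the supplier and the accumulator by hand, roughly the way a sequential collect would. The class name AccumulatorSketch and the helper totalLength are hypothetical names introduced for illustration only.

```java
import java.util.function.BiConsumer;
import java.util.function.Supplier;

// Hypothetical class name, used only for this illustration.
public class AccumulatorSketch {

    // Drives the supplier and accumulator manually over the given words.
    static int totalLength(String... words) {
        Supplier<int[]> supplier = () -> new int[1];
        BiConsumer<int[], String> accumulator =
                (result, item) -> result[0] += item.length();

        int[] container = supplier.get();        // fresh container, holds 0
        for (String word : words) {
            accumulator.accept(container, word); // mutates container[0] in place
        }
        return container[0];
    }

    public static void main(String[] args) {
        System.out.println(totalLength("tommy", "is", "a")); // prints 8
    }
}
```

The accumulator never returns anything; the running total is visible only because the array element is modified in place.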

Combiner

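In a sequential reduction, the supplier and the accumulator are enough, but to support parallel execution we also need a combiner. The combiner is a function that defines how two partial results are merged.
In a parallel environment the stream is split into several parts, and each part is accumulated in parallel. When all parts are done, the partial results are merged with the combiner function. Here is our implementation: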

(result1, result2) -> {
  result1[0] += result2[0];
  return result1;
}
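As a small illustration of what the combiner does, the sketch below merges two partial containers by hand, as if each had been filled from a different part of a parallel stream. The class name CombinerSketch and the helper merge are hypothetical.

```java
import java.util.function.BinaryOperator;

// Hypothetical class name, used only for this illustration.
public class CombinerSketch {

    // Same combiner as in the article, with its functional type spelled out.
    static final BinaryOperator<int[]> COMBINER = (result1, result2) -> {
        result1[0] += result2[0];
        return result1;
    };

    // Wraps two partial sums in containers and merges them.
    static int merge(int left, int right) {
        int[] first = {left};
        int[] second = {right};
        return COMBINER.apply(first, second)[0];
    }

    public static void main(String[] args) {
        // e.g. one part accumulated 8 characters, the other 13
        System.out.println(merge(8, 13)); // prints 21
    }
}
```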

A minimal custom Collector

We now have all the required components. Put together, they form our Collector:

 wordStream.collect(Collector.of(
    ()-> new int[1],
    (result, item) -> result[0] += item.length(),
    (result1, result2) -> {
        result1[0] += result2[0];
        return result1;
    }
));
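The following sketch assigns the three-function Collector to an explicitly typed variable, which shows that its result type is the container type int[] itself. The class name MinimalCollectorSketch and the helper collectLengths are hypothetical.

```java
import java.util.stream.Collector;
import java.util.stream.Stream;

// Hypothetical class name, used only for this illustration.
public class MinimalCollectorSketch {

    // Collects with the three-function form and returns the raw container.
    static int[] collectLengths(Stream<String> wordStream) {
        // Element type String, container type int[], result type int[].
        Collector<String, int[], int[]> lengths = Collector.of(
            () -> new int[1],
            (result, item) -> result[0] += item.length(),
            (result1, result2) -> {
                result1[0] += result2[0];
                return result1;
            }
        );
        return wordStream.collect(lengths);
    }

    public static void main(String[] args) {
        int[] container = collectLengths(
                Stream.of("tommy", "is", "a", "java", "developer"));
        System.out.println(container[0]); // prints 21
    }
}
```

The caller has to unwrap the array to get at the number.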

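This solution has one small problem: it returns the result container itself, which is of type int[]. What we actually want is the total length of the strings, not the container.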

One final transformation

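This is easy to fix: we add one more function (the finisher) that maps the result container to the type we need. Here we only need the first element of the array: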

total -> total[0]

The complete code is:

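    import java.util.Arrays;
    import java.util.List;
    import java.util.stream.Collector;
    import java.util.stream.Stream;
    // @Test is the JUnit annotation; its import depends on the JUnit version in use.

    private List<String> wordList = Arrays.asList("tommy", "is", "a", "java", "developer");

    @Test
    public void wordCountTest() {
        Stream<String> wordStream = wordList.stream();

        // Despite its name, wordCnt holds the total length of all words.
        int wordCnt = wordStream.collect(Collector.of(
            () -> new int[1],                              // supplier: mutable container
            (result, item) -> result[0] += item.length(),  // accumulator: add one word's length
            (result1, result2) -> {                        // combiner: merge two partial sums
                result1[0] += result2[0];
                return result1;
            },
            total -> total[0]                              // finisher: unwrap the container
        ));

        System.out.println("wordCnt = " + wordCnt);        // prints wordCnt = 21
    }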

If we assign our custom Collector to a variable, the code simplifies to:

int wordCount = wordStream.collect(totalWordCountCollector);
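The article does not show the declaration of totalWordCountCollector. One plausible declaration, assuming the four-function form from the complete example, is sketched below; the class name NamedCollectorSketch and the helper totalLength are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;

// Hypothetical class name, used only for this illustration.
public class NamedCollectorSketch {

    // Element type String, container type int[], result type Integer.
    static final Collector<String, int[], Integer> totalWordCountCollector =
        Collector.of(
            () -> new int[1],
            (result, item) -> result[0] += item.length(),
            (result1, result2) -> {
                result1[0] += result2[0];
                return result1;
            },
            total -> total[0]
        );

    // The collector holds no state of its own, so it can be reused across streams.
    static int totalLength(List<String> words) {
        return words.stream().collect(totalWordCountCollector);
    }

    public static void main(String[] args) {
        List<String> wordList = Arrays.asList("tommy", "is", "a", "java", "developer");
        System.out.println(totalLength(wordList)); // prints 21
    }
}
```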

Optimization characteristics

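Finally, let's look at the optimization characteristics: a custom Collector can declare different kinds of optimization hints.
With Collector.of(), Characteristics can be appended at the end of the argument list as varargs: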

Collector.of(  
  // supplier,
  // accumulator,
  // combiner,
  // finisher, 
  Collector.Characteristics.CONCURRENT,
  Collector.Characteristics.IDENTITY_FINISH,
  // ...
);
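For the word-length Collector in this article, a minimal sketch might declare only UNORDERED, since a sum does not depend on encounter order. This choice is an assumption made for illustration, not something the article prescribes: CONCURRENT would not be safe here because a plain int[] is not thread-safe, and IDENTITY_FINISH does not apply because the finisher converts int[] to Integer. The class name CharacteristicsSketch and the method unorderedTotalLength are hypothetical.

```java
import java.util.stream.Collector;
import java.util.stream.Stream;

// Hypothetical class name, used only for this illustration.
public class CharacteristicsSketch {

    // Four functions followed by the characteristics varargs.
    static Collector<String, int[], Integer> unorderedTotalLength() {
        return Collector.of(
            () -> new int[1],
            (result, item) -> result[0] += item.length(),
            (result1, result2) -> {
                result1[0] += result2[0];
                return result1;
            },
            total -> total[0],
            Collector.Characteristics.UNORDERED
        );
    }

    public static void main(String[] args) {
        Collector<String, int[], Integer> collector = unorderedTotalLength();
        // The declared characteristics can be inspected on the collector.
        System.out.println(collector.characteristics()
                .contains(Collector.Characteristics.UNORDERED)); // prints true
        System.out.println(Stream.of("tommy", "is", "a").collect(collector)); // prints 8
    }
}
```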

Three Characteristics are available: