Java Stream API groupingBy() (translated)

乌蹄踏雪

Article link: https://blog.csdn.net/Wutitaxue/article/details/120142419

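Summary (auto-generated by CSDN): This article walks through the groupingBy() collector of the Java Stream API: basic grouping, custom Map implementations, custom collections, counting, string joining, filtering, averaging, summing, summary statistics, reduction, selecting the maximum/minimum element, and composing downstream collectors. Each use case is illustrated with an example.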

groupingBy() provides functionality similar to SQL's GROUP BY clause.
It is used as follows:

.collect(groupingBy(...));

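To use it, we need to specify the property by which the grouping is performed. We do this by providing an implementation of a functional interface, usually by passing a lambda expression or a method reference.
For example, to group strings by their length, we can pass String::length to groupingBy():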

List<String> strings = List.of("a", "bb", "cc", "ddd"); 
Map<Integer, List<String>> result = strings.stream() 
  .collect(groupingBy(String::length)); 
System.out.println(result); // {1=[a], 2=[bb, cc], 3=[ddd]}
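The snippets in this article appear to assume static imports of the Collectors methods. As a minimal sketch, here is the same basic grouping as a complete class with explicit imports; the class and method names are made up for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical wrapper class, only here to make the snippet compile on its own.
class GroupingByBasics {

    // Groups the given strings by their length.
    static Map<Integer, List<String>> groupByLength(List<String> strings) {
        return strings.stream()
            .collect(Collectors.groupingBy(String::length));
    }

    public static void main(String[] args) {
        List<String> strings = List.of("a", "bb", "cc", "ddd");
        System.out.println(groupByLength(strings)); // {1=[a], 2=[bb, cc], 3=[ddd]}
    }
}
```

The single-argument overload makes no promise about the type of the returned Map or of the per-group List, which is what the overloads in the following sections address.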

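However, the collector can do much more than the simple grouping shown above.

Grouping into a custom Map implementation

If you need a specific Map implementation, use the groupingBy() overload that accepts a map factory: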

List<String> strings = List.of("a", "bb", "cc", "ddd");
TreeMap<Integer, List<String>> result = strings.stream()
  .collect(groupingBy(String::length, TreeMap::new, toList()));
System.out.println(result); // {1=[a], 2=[bb, cc], 3=[ddd]}
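The map factory can be any supplier, not only a constructor reference. The sketch below, with hypothetical class and method names, passes a TreeMap with a reversed comparator so that the longest strings come first.

```java
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical wrapper class for illustration.
class GroupingIntoSortedMap {

    // Groups strings by length into a TreeMap whose keys are sorted in descending order.
    static TreeMap<Integer, List<String>> groupByLengthDescending(List<String> strings) {
        return strings.stream()
            .collect(Collectors.groupingBy(
                String::length,
                () -> new TreeMap<Integer, List<String>>(Comparator.reverseOrder()),
                Collectors.toList()));
    }

    public static void main(String[] args) {
        List<String> strings = List.of("a", "bb", "cc", "ddd");
        System.out.println(groupByLengthDescending(strings)); // {3=[ddd], 2=[bb, cc], 1=[a]}
    }
}
```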

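Providing a custom Collection

If you need to store the grouped elements in a custom collection, use the toCollection() collector.

For example, to collect the elements of each group into a TreeSet, it is as simple as: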

groupingBy(String::length, toCollection(TreeSet::new))

List<String> strings = List.of("a", "bb", "cc", "ddd");
Map<Integer, TreeSet<String>> result = strings.stream()
  .collect(groupingBy(String::length, toCollection(TreeSet::new)));
System.out.println(result); // {1=[a], 2=[bb, cc], 3=[ddd]}
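The choice of collection changes the content of each group, not just its type. A small sketch with unsorted input that contains a duplicate shows the TreeSet sorting and de-duplicating each group; the class and method names are invented for this example.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeSet;
import java.util.stream.Collectors;

// Hypothetical wrapper class for illustration.
class GroupingIntoTreeSet {

    // Groups strings by length; each group is a TreeSet, so it is sorted and free of duplicates.
    static Map<Integer, TreeSet<String>> groupByLength(List<String> strings) {
        return strings.stream()
            .collect(Collectors.groupingBy(String::length, Collectors.toCollection(TreeSet::new)));
    }

    public static void main(String[] args) {
        // "bb" appears twice and the input is not sorted.
        List<String> strings = List.of("cc", "bb", "a", "bb", "ddd");
        System.out.println(groupByLength(strings).get(2)); // [bb, cc]
    }
}
```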

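Counting the elements of each group

If you only need to know how many elements each group contains, use the counting() collector as the downstream: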

groupingBy(String::length, counting())

List<String> strings = List.of("a", "bb", "cc", "ddd");
Map<Integer, Long> result = strings.stream()
  .collect(groupingBy(String::length, counting()));
System.out.println(result); // {1=1, 2=2, 3=1}

Turning each group into a String

If you need to group the elements and build a single String representation for each group, use joining():

groupingBy(String::length, joining(",", "[", "]"))

List<String> strings = List.of("a", "bb", "cc", "ddd");

Map<Integer, String> result = strings.stream()
  .collect(groupingBy(String::length, joining(",", "[", "]")));
System.out.println(result); // {1=[a], 2=[bb,cc], 3=[ddd]}
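joining() only accepts CharSequence elements, so grouping anything other than strings needs a conversion first. One way to do this, sketched here with a hypothetical list of integers grouped by parity, is to wrap joining() in mapping().

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical wrapper class for illustration.
class JoiningNonStrings {

    // Groups numbers by parity (true = even) and joins each group into one String.
    static Map<Boolean, String> joinByParity(List<Integer> numbers) {
        return numbers.stream()
            .collect(Collectors.groupingBy(
                n -> n % 2 == 0,
                // Convert each Integer to a String before handing it to joining().
                Collectors.mapping(Object::toString, Collectors.joining(",", "[", "]"))));
    }

    public static void main(String[] args) {
        Map<Boolean, String> result = joinByParity(List.of(1, 2, 3, 4, 5));
        System.out.println(result.get(true));  // [2,4]
        System.out.println(result.get(false)); // [1,3,5]
    }
}
```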

Grouping and filtering entries

To exclude some entries from the grouped result, use the filtering() collector (available since Java 9):

groupingBy(String::length, filtering(s -> !s.contains("c"), toList()))

List<String> strings = List.of("a", "bb", "cc", "ddd");
Map<Integer, List<String>> result = strings.stream()
  .collect(groupingBy(String::length, filtering(s -> !s.contains("c"), toList())));
System.out.println(result); // {1=[a], 2=[bb], 3=[ddd]}
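Filtering inside the downstream is not the same as calling filter() on the stream before grouping: the downstream version keeps a key even when all of its elements are rejected. The following sketch contrasts the two, using made-up class and method names.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical wrapper class for illustration.
class FilteringVsFilter {

    // Filters inside each group: a group whose elements are all rejected stays as an empty list.
    static Map<Integer, List<String>> filterDownstream(List<String> strings) {
        return strings.stream()
            .collect(Collectors.groupingBy(String::length,
                Collectors.filtering(s -> !s.contains("c"), Collectors.toList())));
    }

    // Filters before grouping: a group whose elements are all rejected never gets a key.
    static Map<Integer, List<String>> filterUpstream(List<String> strings) {
        return strings.stream()
            .filter(s -> !s.contains("c"))
            .collect(Collectors.groupingBy(String::length));
    }

    public static void main(String[] args) {
        List<String> strings = List.of("a", "cc", "ddd");
        System.out.println(filterDownstream(strings)); // {1=[a], 2=[], 3=[ddd]}
        System.out.println(filterUpstream(strings));   // {1=[a], 3=[ddd]}
    }
}
```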

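Grouping and averaging each group

If you need the average of some property of the entries in each group, there are a few convenient collectors:

averagingInt()
averagingLong()
averagingDouble()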

List<String> strings = List.of("a", "bb", "cc", "ddd");
Map<Integer, Double> result = strings.stream()
  .collect(groupingBy(String::length, averagingInt(String::hashCode)));
System.out.println(result); // {1=97.0, 2=3152.0, 3=99300.0}

String::hashCode is used here only as a placeholder.
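For a slightly more meaningful property than a hash code, the sketch below groups words by their first letter and averages the word length per group. The word list and the names used are assumptions made purely for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical wrapper class for illustration.
class AveragePerGroup {

    // Groups words by first letter and computes the average word length of each group.
    static Map<Character, Double> averageLengthByFirstLetter(List<String> words) {
        return words.stream()
            .collect(Collectors.groupingBy(w -> w.charAt(0), Collectors.averagingInt(String::length)));
    }

    public static void main(String[] args) {
        List<String> words = List.of("apple", "avocado", "banana");
        Map<Character, Double> result = averageLengthByFirstLetter(words);
        System.out.println(result.get('a')); // 6.0 (average of 5 and 7)
        System.out.println(result.get('b')); // 6.0
    }
}
```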

Grouping and summing each group

If you want to compute the sum of a property of the grouped entries:

summingInt()
summingLong()
summingDouble()

List<String> strings = List.of("a", "bb", "cc", "ddd");
Map<Integer, Integer> result = strings.stream()
  .collect(groupingBy(String::length, summingInt(String::hashCode)));
System.out.println(result); // {1=97, 2=6304, 3=99300}

String::hashCode is used here only as a placeholder.

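Grouping and computing summary statistics per group

If you want to group and then derive summary statistics from a property of the grouped items, there are collectors for that as well: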

summarizingInt()
summarizingLong()
summarizingDouble()

List<String> strings = List.of("a", "bb", "cc", "ddd");
Map<Integer, IntSummaryStatistics> result = strings.stream()
  .collect(groupingBy(String::length, summarizingInt(String::hashCode)));
System.out.println(result);

{1=IntSummaryStatistics{count=1, sum=97, min=97, average=97.000000, max=97},
 2=IntSummaryStatistics{count=2, sum=6304, min=3136, average=3152.000000, max=3168},
 3=IntSummaryStatistics{count=1, sum=99300, min=99300, average=99300.000000, max=99300}}

String::hashCode is used here only as a placeholder.
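In practice the statistics object is usually read through its getters rather than printed. A minimal sketch, again grouping a hypothetical word list by first letter and summarizing word lengths:

```java
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical wrapper class for illustration.
class SummaryPerGroup {

    // Groups words by first letter and collects length statistics for each group.
    static Map<Character, IntSummaryStatistics> lengthStatsByFirstLetter(List<String> words) {
        return words.stream()
            .collect(Collectors.groupingBy(w -> w.charAt(0), Collectors.summarizingInt(String::length)));
    }

    public static void main(String[] args) {
        IntSummaryStatistics stats =
            lengthStatsByFirstLetter(List.of("apple", "avocado", "banana")).get('a');
        System.out.println(stats.getCount());   // 2
        System.out.println(stats.getMin());     // 5
        System.out.println(stats.getMax());     // 7
        System.out.println(stats.getAverage()); // 6.0
    }
}
```

A single summarizing collector yields count, sum, min, max and average in one pass, so it can replace separate counting(), summingInt() and averagingInt() passes over the same data.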

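Grouping and reducing

To perform a reduction on the grouped elements, use the reducing() collector (toStringList() is a helper, not shown here, that converts a String into a list of its characters):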

List<String> strings = List.of("a", "bb", "cc", "ddd");
Map<Integer, List<Character>> result = strings.stream()
  .map(toStringList())
  .collect(groupingBy(List::size, reducing(List.of(), (l1, l2) -> Stream.concat(l1.stream(), l2.stream())
    .collect(Collectors.toList()))));
System.out.println(result); // {1=[a], 2=[b, b, c, c], 3=[d, d, d]}
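Because the toStringList() helper is not shown in the article, here is a self-contained sketch of the same reduction. The toCharacters() and concat() methods are hypothetical stand-ins, not the author's code.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical wrapper class for illustration.
class GroupingWithReducing {

    // Assumed stand-in for the article's toStringList(): splits a String into its characters.
    static List<Character> toCharacters(String s) {
        return s.chars().mapToObj(c -> (char) c).collect(Collectors.toList());
    }

    // Concatenates two lists into a new list; used as the reduction operator.
    static List<Character> concat(List<Character> l1, List<Character> l2) {
        return Stream.concat(l1.stream(), l2.stream()).collect(Collectors.toList());
    }

    // Groups the character lists by size and merges all lists of a group into one.
    static Map<Integer, List<Character>> charactersByLength(List<String> strings) {
        return strings.stream()
            .map(GroupingWithReducing::toCharacters)
            .collect(Collectors.groupingBy(List::size,
                Collectors.reducing(List.<Character>of(), GroupingWithReducing::concat)));
    }

    public static void main(String[] args) {
        System.out.println(charactersByLength(List.of("a", "bb", "cc", "ddd")));
        // {1=[a], 2=[b, b, c, c], 3=[d, d, d]}
    }
}
```

The empty list serves as the identity value of the reduction, so the operator must never mutate its arguments; concat() builds a new list each time for that reason.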

Grouping and selecting the max/min element

To pick the maximum or minimum element of each group, use the maxBy()/minBy() collectors:

groupingBy(String::length, Collectors.maxBy(Comparator.comparing(String::toUpperCase)))

List<String> strings = List.of("a", "bb", "cc", "ddd");
Map<Integer, Optional<String>> result = strings.stream()
  .collect(groupingBy(String::length, Collectors.maxBy(Comparator.comparing(String::toUpperCase))));
System.out.println(result); // {1=Optional[a], 2=Optional[cc], 3=Optional[ddd]}

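The fact that the collector returns an Optional is a bit inconvenient here: a group always contains at least one element, so the Optional only adds accidental complexity.
Unfortunately, there is nothing we can do to the collector itself to prevent that. We can, however, recreate the same functionality with the reducing() collector.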
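The article does not show the reducing() variant. As one possible workaround, and note that this is a different technique from the one mentioned above, maxBy() can be wrapped in collectingAndThen() to unwrap the Optional. The names below are hypothetical.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

// Hypothetical wrapper class for illustration.
class MaxPerGroupWithoutOptional {

    // Compares strings by their uppercase form, as in the example above.
    static final Comparator<String> BY_UPPER_CASE = Comparator.comparing(s -> s.toUpperCase());

    // Picks the max element of each group and unwraps the Optional produced by maxBy().
    static Map<Integer, String> maxByLength(List<String> strings) {
        return strings.stream()
            .collect(Collectors.groupingBy(String::length,
                Collectors.collectingAndThen(
                    Collectors.maxBy(BY_UPPER_CASE),
                    // Safe here: groupingBy() only creates a group once it has an element.
                    Optional::get)));
    }

    public static void main(String[] args) {
        System.out.println(maxByLength(List.of("a", "bb", "cc", "ddd"))); // {1=a, 2=cc, 3=ddd}
    }
}
```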

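Composing downstream collectors

The full power of the collector is unleashed once we start combining multiple collectors to define complex downstream grouping operations, which begin to resemble standard Stream API pipelines. The possibilities are endless.
Example #1
Given a list of strings, we want a map from string length to the uppercased strings of that length, keeping only strings longer than one character and collecting them into a TreeSet.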

var result = strings.stream()
  .collect(
    groupingBy(String::length,
      mapping(String::toUpperCase,
        filtering(s -> s.length() > 1,
          toCollection(TreeSet::new)))));
//result
{1=[], 2=[BB, CC], 3=[DDD]}

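Example #2
Given a list of strings, group them by length, convert each one to a list of characters, flatten the resulting lists, keep only the distinct elements of non-zero length, and finally reduce them with string concatenation.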

var result = strings.stream()
  .collect(
    groupingBy(String::length,
      mapping(toStringList(),
        flatMapping(s -> s.stream().distinct(),
          filtering(s -> s.length() > 0,
            mapping(String::toUpperCase,
              reducing("", (s, s2) -> s + s2)))))
    ));
//result 
{1=A, 2=BC, 3=D}
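Example #2 relies on a toStringList() helper that the article does not show. The sketch below assumes it returns a list of single-character strings, since the later stages call String methods on the elements. It also splits the nested collectors into named stages, which is a readability choice of this sketch and not the author's layout.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collector;
import java.util.stream.Collectors;

// Hypothetical wrapper class for illustration.
class ComposedDownstream {

    // Assumed helper: splits a String into single-character strings.
    static List<String> toStringList(String s) {
        return s.chars().mapToObj(c -> String.valueOf((char) c)).collect(Collectors.toList());
    }

    static Map<Integer, String> distinctUppercaseLetters(List<String> strings) {
        // Innermost stage: uppercase each element and concatenate everything.
        Collector<String, ?, String> concatUppercase =
            Collectors.mapping(s -> s.toUpperCase(), Collectors.reducing("", (a, b) -> a + b));
        // Keep only non-empty elements.
        Collector<String, ?, String> nonEmptyOnly =
            Collectors.filtering(s -> s.length() > 0, concatUppercase);
        // Flatten each list, keeping the distinct elements of that list.
        Collector<List<String>, ?, String> distinctLetters =
            Collectors.flatMapping(l -> l.stream().distinct(), nonEmptyOnly);
        // Outermost stage: turn each String into a list of single-character strings.
        Collector<String, ?, String> perGroup =
            Collectors.mapping(ComposedDownstream::toStringList, distinctLetters);

        return strings.stream().collect(Collectors.groupingBy(String::length, perGroup));
    }

    public static void main(String[] args) {
        System.out.println(distinctUppercaseLetters(List.of("a", "bb", "cc", "ddd")));
        // {1=A, 2=BC, 3=D}
    }
}
```

Note that distinct() is applied per source string, not per group, so two different strings that share a letter would still contribute that letter twice.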