Java 8 Streams Usage Summary: Collector in Detail

姠惢荇者

Last modified: 2022-06-26 22:45:50

Category: Notes. Tags: Stream, Java

First published: 2022-06-26 16:31:49

Article link: https://blog.csdn.net/hou_ge/article/details/125460426

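1. Introduction

In "Java 8 Streams Usage Summary: Common Stream Operations" we covered the common Stream operations and touched on collect(), but only used Collectors.toList() to turn a Stream into a List. collect() can do much more: its argument is any object that implements the Collector interface, and Collectors acts as a factory class for Collector, providing many built-in Collector implementations. This article walks through how to use them.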

2. The Collector interface

As mentioned above, the argument to collect() is an object that implements the Collector interface. Stream actually declares two overloads of collect:

<R, A> R collect(Collector<? super T, A, R> collector);

<R> R collect(Supplier<R> supplier,
                  BiConsumer<R, ? super T> accumulator,
                  BiConsumer<R, R> combiner);

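The type parameters T, A and R of the first overload are the three type parameters of the Collector interface. The three parameters of the second overload mirror three methods of the Collector interface: supplier(), accumulator() and combiner(). One difference is that here the combiner is a BiConsumer<R, R> that merges the second container into the first, whereas Collector.combiner() returns a BinaryOperator<A>. Their meaning becomes clear once we look at the Collector interface itself.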
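The second overload is easier to picture with a small example. The following is a minimal sketch (the class and method names are made up for illustration) that collects a few strings into an ArrayList by passing the supplier, accumulator and combiner directly, without a Collector object:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class ThreeArgCollectDemo {
    // Collects the names into an ArrayList using the three-argument collect overload.
    static List<String> collectNames() {
        return Stream.of("T-Shirt", "cloth", "shoe")
                .collect(ArrayList::new,      // supplier: creates an empty result container
                         ArrayList::add,      // accumulator: adds one element to a container
                         ArrayList::addAll);  // combiner: merges the second container into the first
    }

    public static void main(String[] args) {
        System.out.println(collectNames());
    }
}
```

The three arguments play the same roles as supplier(), accumulator() and combiner() in the Collector interface; there is no finisher, so the container itself is the result.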

2.1. Definition of the Collector interface

public interface Collector<T, A, R> {
}

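Collector is a generic interface with three type parameters, T, A and R:

T is the type of the Stream elements, for example Production, String or Integer.
A is the type of the accumulator. The source of Stream's collect method even calls it a container: partial results produced during the collect operation are stored in this accumulator (container).
R is the type of the final result returned by collect.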

2.2. Methods of the Collector interface

The Collector interface declares five methods. The comments below describe what each one does:

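public interface Collector<T, A, R> {

	/**
	 * Returns a Supplier that creates a new container of type A. The container is
	 * passed to the accumulator() function and holds the accumulated results.
	 */
	Supplier<A> supplier();
	/**
	 * The accumulator is the key method: it folds Stream elements into a container
	 * created by supplier(). In a sequential stream one container receives all elements;
	 * in a parallel stream each container receives only part of them.
	 */
	BiConsumer<A, T> accumulator();
	/**
	 * Merges partial results in a parallel stream. Each subtask runs the accumulator on
	 * its share of the data, and the partial containers are then merged (divide and
	 * conquer, the Fork/Join idea). This is why its type parameter is the same A that
	 * supplier() produces.
	 */
	BinaryOperator<A> combiner();
	/**
	 * Applied after all accumulation is done to transform the container into the final
	 * result, for example converting an int to a long. It is the last Collector method
	 * invoked during a Stream collect operation.
	 */
	Function<A, R> finisher();
	/**
	 * Declares the characteristics of this Collector: any combination of CONCURRENT,
	 * UNORDERED and IDENTITY_FINISH.
	 */
	Set<Characteristics> characteristics();
}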

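Execution flow of the Collector methods in a sequential stream:
[Figure: Collector execution flow in a sequential stream]
In a sequential stream the combiner is not used, because there are no partial results from subtasks to merge; a single thread does all the work.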

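Execution flow of the Collector methods in a parallel stream:
[Figure: Collector execution flow in a parallel stream]
Compared with the sequential case, once the forked subtasks have finished, combiner() is called to merge their partial results, and only then is finisher() called to produce the final result.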
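The two flows can be imitated by calling the Collector methods by hand. This is only a single-threaded sketch of the idea, not how the JDK schedules the work internally; the class name and the helpers runSequential and runSplit are hypothetical:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class CollectorLifecycleDemo {
    // Sequential flow: one container, every element accumulated into it, then the finisher.
    static <T, A, R> R runSequential(Collector<? super T, A, R> c, List<T> items) {
        A container = c.supplier().get();
        for (T item : items) {
            c.accumulator().accept(container, item);
        }
        return c.finisher().apply(container);
    }

    // Parallel-style flow, simulated on one thread: each half of the input gets its own
    // container, the combiner merges the two containers, then the finisher runs once.
    static <T, A, R> R runSplit(Collector<? super T, A, R> c, List<T> items) {
        int mid = items.size() / 2;
        A left = c.supplier().get();
        for (T item : items.subList(0, mid)) {
            c.accumulator().accept(left, item);
        }
        A right = c.supplier().get();
        for (T item : items.subList(mid, items.size())) {
            c.accumulator().accept(right, item);
        }
        A merged = c.combiner().apply(left, right);
        return c.finisher().apply(merged);
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("T-Shirt", "cloth", "shoe", "hat");
        System.out.println(runSequential(Collectors.joining(","), names));
        System.out.println(runSplit(Collectors.joining(","), names));
    }
}
```

Both helpers produce the same string for the same input, which is the contract a combiner has to honour: merging partial containers must give the same result as accumulating everything into one.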

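2.3. Creating Collector objects

The Collector interface also declares two static of() methods for creating Collector objects. Under the hood both return an instance of Collectors.CollectorImpl, the implementation class nested in the Collectors factory class. The JDK source is: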

public static<T, R> Collector<T, R, R> of(Supplier<R> supplier,
                                              BiConsumer<R, T> accumulator,
                                              BinaryOperator<R> combiner,
                                              Characteristics... characteristics) {
        Objects.requireNonNull(supplier);
        Objects.requireNonNull(accumulator);
        Objects.requireNonNull(combiner);
        Objects.requireNonNull(characteristics);
        Set<Characteristics> cs = (characteristics.length == 0)
                                  ? Collectors.CH_ID
                                  : Collections.unmodifiableSet(EnumSet.of(Collector.Characteristics.IDENTITY_FINISH,
                                                                           characteristics));
        return new Collectors.CollectorImpl<>(supplier, accumulator, combiner, cs);
    }

	public static<T, A, R> Collector<T, A, R> of(Supplier<A> supplier,
                                                 BiConsumer<A, T> accumulator,
                                                 BinaryOperator<A> combiner,
                                                 Function<A, R> finisher,
                                                 Characteristics... characteristics) {
        Objects.requireNonNull(supplier);
        Objects.requireNonNull(accumulator);
        Objects.requireNonNull(combiner);
        Objects.requireNonNull(finisher);
        Objects.requireNonNull(characteristics);
        Set<Characteristics> cs = Collectors.CH_NOID;
        if (characteristics.length > 0) {
            cs = EnumSet.noneOf(Characteristics.class);
            Collections.addAll(cs, characteristics);
            cs = Collections.unmodifiableSet(cs);
        }
        return new Collectors.CollectorImpl<>(supplier, accumulator, combiner, finisher, cs);
    }
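The source above shows how of() is implemented but not how it is called. A minimal usage sketch follows; the class CollectorOfDemo and the two factory methods are hypothetical names chosen for this example:

```java
import java.util.TreeSet;
import java.util.stream.Collector;
import java.util.stream.Stream;

class CollectorOfDemo {
    // First overload (no finisher): the container is the result,
    // so of() adds IDENTITY_FINISH automatically.
    static Collector<String, TreeSet<String>, TreeSet<String>> toSortedSet() {
        return Collector.of(
                TreeSet::new,
                TreeSet::add,
                (left, right) -> { left.addAll(right); return left; });
    }

    // Second overload (with finisher): accumulate into a StringBuilder,
    // then convert it to a String at the end.
    static Collector<String, StringBuilder, String> concatUpperCase() {
        return Collector.of(
                StringBuilder::new,
                (sb, s) -> sb.append(s.toUpperCase()),
                (left, right) -> left.append(right),
                StringBuilder::toString);
    }

    public static void main(String[] args) {
        System.out.println(Stream.of("shoe", "cloth", "hat").collect(toSortedSet()));
        System.out.println(Stream.of("shoe", "hat").collect(concatUpperCase()));
        System.out.println(toSortedSet().characteristics());
        System.out.println(concatUpperCase().characteristics());
    }
}
```

With no characteristics passed, the first collector reports IDENTITY_FINISH and the second reports an empty set, which matches the CH_ID and CH_NOID defaults in the source above.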

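3. The Collectors class

Collectors is the factory class for the Collector interface and provides many convenience methods. The following sections go through them by category.

3.1. The Collectors.averaging methods (computing an average)

averagingInt(ToIntFunction<? super T> mapper): maps each Stream element T to an int and computes the average.
averagingLong(ToLongFunction<? super T> mapper): maps each Stream element T to a long and computes the average.
averagingDouble(ToDoubleFunction<? super T> mapper): maps each Stream element T to a double and computes the average.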

public static void averaging(){
    Stream<Production> stream = initData().stream();
    // compute the average price
    Double average = stream.collect(Collectors.averagingDouble(Production::getPrice));
    System.out.println("average:" + average);
}
// builds the sample data; all later examples use this method
public static List<Production> initData(){
    List<Production> list = new ArrayList<>();
    list.add(new Production("T-Shirt",43.34d));
    list.add(new Production("cloth",99.99d));
    list.add(new Production("shoe",123.8d));
    list.add(new Production("hat",26.5d));
    list.add(new Production("cloth",199.99d));
    list.add(new Production("shoe",32.5d));
    return list;
}
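The examples rely on a Production class whose source is not shown in this article. A minimal sketch, assuming it only needs a name, a price, the two getters used by the examples and a readable toString() (the field layout and the toString format are guesses):

```java
// Assumed shape of the Production class used by the examples; not the author's original.
class Production {
    private final String name;
    private final double price;

    public Production(String name, double price) {
        this.name = name;
        this.price = price;
    }

    public String getName() {
        return name;
    }

    public double getPrice() {
        return price;
    }

    @Override
    public String toString() {
        return "Production{name='" + name + "', price=" + price + "}";
    }

    public static void main(String[] args) {
        System.out.println(new Production("hat", 26.5d));
    }
}
```

Because equals() and hashCode() are not overridden in this sketch, two Production objects with the same name and price still count as different elements when collected into a Set.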

3.2. The Collectors.collectingAndThen method

collectingAndThen wraps another Collector and applies one more transformation to the result that Collector produces.

public static void collectingAndThen(){
    Stream<Production> stream = initData().stream();
    // sum the prices, then check whether the total exceeds 800 yuan
    Boolean overBudget = stream.collect(Collectors.collectingAndThen(
            Collectors.summingDouble(Production::getPrice),
            p -> p > 800
    ));
    System.out.println("over budget:" + overBudget);
}

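3.3. The Collectors.counting method

The Collector created by counting() returns the number of elements in the Stream, or 0 if the Stream is empty. In a collect operation it is equivalent to Stream's count() method, but because counting() returns a Collector, it can also be passed to other Collectors methods.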

public static void  counting() {
    Stream<Production> stream = initData().stream();
     Long count = stream.collect(Collectors.counting());
     System.out.println("count:" + count);
 }
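Where counting() differs from Stream.count() is that it can serve as a downstream collector. A small sketch, using plain product names instead of Production objects to stay self-contained (the class and method names are hypothetical):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class CountingDownstreamDemo {
    // Counts how many times each name occurs by using counting() inside groupingBy.
    static Map<String, Long> countByName(List<String> names) {
        return names.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("T-Shirt", "cloth", "shoe", "hat", "cloth", "shoe");
        System.out.println(countByName(names));
    }
}
```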

3.4. The Collectors.mapping method

<T, U, A, R> Collector<T, ?, R> mapping(Function<? super T, ? extends U> mapper, Collector<? super U, A, R> downstream)

The mapper function first transforms each Stream element of type T into type U; the downstream collector then collects the resulting elements of type U.

public static void  mapping() {
     Stream<Production> stream = initData().stream();
     // map each product to its price (Double), then sum the prices
     Double total = stream.collect(Collectors.mapping(
         Production::getPrice,
             Collectors.summingDouble(Double::doubleValue)
     ));
     System.out.println("total:" + total);
 }

3.5. The Collectors.joining method

joining concatenates the Stream elements into a single String. It has the following three overloads:

joining(): concatenates the elements with nothing between them.
joining(CharSequence delimiter): concatenates the elements, separating them with delimiter.
joining(CharSequence delimiter, CharSequence prefix, CharSequence suffix): concatenates the elements, separating them with delimiter, and wraps the result in prefix and suffix.

public static void  joining() {
    Stream<Production> stream = initData().stream();
     // take the product names and join them; str is "<T-Shirt,cloth,shoe,hat,cloth,shoe>"
     String str = stream.collect(
             Collectors.mapping(
                     Production::getName,
                     Collectors.joining(",","<",">")
             )
     );
     System.out.println("str:" + str);
 }

3.6. The Collectors.summing methods

summingInt(ToIntFunction<? super T> mapper): maps each Stream element T to an int and sums the values.
summingDouble(ToDoubleFunction<? super T> mapper): maps each Stream element T to a double and sums the values.
summingLong(ToLongFunction<? super T> mapper): maps each Stream element T to a long and sums the values.

public static void  summing() {
    Stream<Production> stream = initData().stream();
    // total price of all products
    Double total = stream.collect(Collectors.summingDouble(Production::getPrice));
    System.out.println("total:" + total);
}

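3.7. Finding the maximum and minimum with Collectors

Collectors provides Collectors that return the largest and the smallest element of a Stream.

maxBy(Comparator<? super T> comparator): returns the largest element according to the Comparator, wrapped in an Optional.
minBy(Comparator<? super T> comparator): returns the smallest element according to the Comparator, wrapped in an Optional.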

public static void maxMin() {
     Stream<Production> stream = initData().stream();
     Stream<Production> stream2 = initData().stream();
     // compare by price; casting the price difference to int would treat prices
     // that differ by less than 1 as equal, so use Comparator.comparingDouble instead
     Comparator<Production> byPrice = Comparator.comparingDouble(Production::getPrice);
     // most expensive product
     Optional<Production> optional = stream.collect(Collectors.maxBy(byPrice));
     System.out.println("name:" + optional.get().getName() + "-price:" + optional.get().getPrice());
     // cheapest product
     Optional<Production> optional2 = stream2.collect(Collectors.minBy(byPrice));
     System.out.println("name:" + optional2.get().getName() + "-price:" + optional2.get().getPrice());
 }

3.8. The Collectors.summarizing methods

The averaging, summing and counting methods seen so far each create a Collector for a single purpose. The Collector created by a summarizing method combines averaging, summing and counting and additionally reports the minimum and maximum. There are three variants:

summarizingInt(ToIntFunction<? super T> mapper): maps each Stream element to an int and summarizes the values; the result type is IntSummaryStatistics.
summarizingLong(ToLongFunction<? super T> mapper): maps each Stream element to a long and summarizes the values; the result type is LongSummaryStatistics.
summarizingDouble(ToDoubleFunction<? super T> mapper): maps each Stream element to a double and summarizes the values; the result type is DoubleSummaryStatistics.

public static void  summarizing() {
   Stream<Production> stream = initData().stream();
    // holds count, sum, min, average and max at once; prints: DoubleSummaryStatistics{count=6, sum=526.120000, min=26.500000, average=87.686667, max=199.990000}
    DoubleSummaryStatistics statistics = stream.collect(Collectors.summarizingDouble(Production::getPrice));
    System.out.println("statistics:" + statistics);
}

3.9. Collecting into other containers

After any number of intermediate operations, a Stream can be collected into another container such as a Set, List or Map.

toSet(): collects the Stream elements into a Set.
toList(): collects the Stream elements into a List.
toMap(): collects the Stream elements into a Map. Collectors provides three overloads of toMap:
1. toMap(Function<? super T, ? extends K> keyMapper, Function<? super T, ? extends U> valueMapper): takes two Functions; the first maps an element to its key, the second maps it to its value. If two elements produce the same key, an IllegalStateException is thrown.
2. toMap(Function<? super T, ? extends K> keyMapper, Function<? super T, ? extends U> valueMapper, BinaryOperator<U> mergeFunction): compared with the first overload, adds a BinaryOperator that merges the two values when a key collision occurs. It can be used to get a grouping-like result.
3. toMap(Function<? super T, ? extends K> keyMapper, Function<? super T, ? extends U> valueMapper, BinaryOperator<U> mergeFunction, Supplier<M> mapSupplier): compared with the second overload, adds a Supplier that creates the Map to be returned (the first two overloads return a HashMap in the JDK implementation).
toCollection(Supplier<C> collectionFactory): collects the Stream elements into the Collection created by collectionFactory.
toConcurrentMap(Function<? super T, ? extends K> keyMapper, Function<? super T, ? extends U> valueMapper): like the two-argument toMap(), but returns a thread-safe ConcurrentMap (a ConcurrentHashMap).
toConcurrentMap(Function<? super T, ? extends K> keyMapper, Function<? super T, ? extends U> valueMapper, BinaryOperator<U> mergeFunction): like the three-argument toMap(), but returns a thread-safe ConcurrentMap (a ConcurrentHashMap).
toConcurrentMap(Function<? super T, ? extends K> keyMapper, Function<? super T, ? extends U> valueMapper, BinaryOperator<U> mergeFunction, Supplier<M> mapSupplier): like the four-argument toMap(), but mapSupplier must create a ConcurrentMap, which is what gets returned.

public static void toCollection() {
     Stream<Production> stream = initData().stream();
     Stream<Production> stream2 = initData().stream();
     // collect the product prices into a List
     List<Double> list = stream.map(Production::getPrice).collect(Collectors.toList());
     System.out.println("list:" + list);
     // group the products by name into a TreeMap: the key is the product name,
     // the value is the List of products with that name
     TreeMap<String,List<Production>> treeMap = stream2.collect(Collectors.toMap(
             Production::getName,
             Arrays::asList,
             (p1,p2) ->{
                 // key collision: merge the two lists into a new one
                 List<Production> mergeList = new ArrayList<>();
                 mergeList.addAll(p1);
                 mergeList.addAll(p2);
                 return mergeList;
             },
             TreeMap::new
     ));
     System.out.println("treeMap:" + treeMap);
 }
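The role of the merge function is easiest to see next to the two-argument overload, which rejects duplicate keys. The sketch below uses a small nested stand-in for Production (an assumed shape, since the real class is not shown) and hypothetical method names:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class ToMapMergeDemo {
    // Minimal stand-in for the article's Production class (assumed shape).
    static final class Production {
        private final String name;
        private final double price;
        Production(String name, double price) { this.name = name; this.price = price; }
        String getName() { return name; }
        double getPrice() { return price; }
    }

    static List<Production> sample() {
        return Arrays.asList(
                new Production("cloth", 99.99d),
                new Production("shoe", 123.8d),
                new Production("cloth", 199.99d));
    }

    // Two-argument toMap: "cloth" occurs twice, so collecting throws IllegalStateException.
    static boolean twoArgToMapFailsOnDuplicates() {
        try {
            sample().stream().collect(Collectors.toMap(Production::getName, Production::getPrice));
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Four-argument toMap: on a key collision keep the higher price; results go into a TreeMap.
    static Map<String, Double> maxPriceByName() {
        return sample().stream().collect(Collectors.toMap(
                Production::getName, Production::getPrice, Double::max, TreeMap::new));
    }

    public static void main(String[] args) {
        System.out.println(twoArgToMapFailsOnDuplicates());
        System.out.println(maxPriceByName());
    }
}
```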

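3.10. The Collectors.partitioningBy method

partitioningBy splits the Stream elements into two parts and returns them as a Map<Boolean, ?>: the key true holds the elements that satisfy the predicate, the key false holds the rest. It has two overloads:

partitioningBy(Predicate<? super T> predicate): splits the elements according to the Predicate; the result type is Map<Boolean, List<T>>.
partitioningBy(Predicate<? super T> predicate, Collector<? super T, A, D> downstream): compared with the first overload, adds a downstream collector, which makes this overload considerably more flexible: the elements of each partition can be processed further by another Collector.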

public static void collectPartition(){
    Stream<Production> stream = initData().stream();
    Stream<Production> stream2 = initData().stream();
    // partition by price: key=true holds products priced above 100, key=false those priced at 100 or below;
    // Collectors.toSet() collects each partition into a Set instead of the default List
    Map<Boolean,Set<Production>> map = stream.collect(
            Collectors.partitioningBy(
                    item -> item.getPrice() > 100,
                    Collectors.toSet()
            )
    );
    System.out.println(map);
    // same partition (key=true: price > 100, key=false: price <= 100),
    // but each partition is reduced to the sum of its prices, giving a Map<Boolean,Double>
    Map<Boolean,Double> map2 = stream2.collect(
            Collectors.partitioningBy(
                    item -> item.getPrice() > 100,
                    Collectors.summingDouble(Production::getPrice)
            )
    );
    System.out.println(map2);
}
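The single-argument overload is not covered by the example above. A short sketch with plain Double prices (the class and method names are hypothetical) shows it, along with the fact that both keys are present in the result even when one partition is empty:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class PartitioningDemo {
    // Splits prices into those above the threshold (true) and the rest (false).
    static Map<Boolean, List<Double>> splitAt(List<Double> prices, double threshold) {
        return prices.stream().collect(Collectors.partitioningBy(p -> p > threshold));
    }

    public static void main(String[] args) {
        System.out.println(splitAt(Arrays.asList(43.34d, 99.99d, 123.8d), 100d));
        // Nothing is above the threshold, yet the key true still maps to an empty list.
        System.out.println(splitAt(Arrays.asList(1.0d), 100d));
    }
}
```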

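3.11. The Collectors.groupingBy method

groupingBy is similar to GROUP BY in a relational database: it groups the Stream elements according to a classifier function. Collectors provides the following overloads:

groupingBy(Function<? super T, ? extends K> classifier): groups the elements by the classifier function; the result type is Map<K, List<T>>.
groupingBy(Function<? super T, ? extends K> classifier, Collector<? super T, A, D> downstream): compared with the first overload, adds a downstream collector that is applied to the elements of each group, so the values of the resulting Map have type D instead of List<T>.
groupingBy(Function<? super T, ? extends K> classifier, Supplier<M> mapFactory, Collector<? super T, A, D> downstream): compared with the second overload, adds a Supplier that creates the Map to be returned. The first two overloads return a HashMap; with this one the caller can choose another Map implementation.

In addition, Collectors provides three groupingByConcurrent overloads that return a thread-safe ConcurrentMap (a ConcurrentHashMap by default). They are used in the same way as the three overloads described in this section.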

public static void collectGroup() {
        Stream<Production> stream = initData().stream();
        Stream<Production> stream2 = initData().stream();
        Stream<Production> stream3 = initData().stream();
        // total price per product name
        Map<String,Double> map = stream.collect(Collectors.groupingBy(
                Production::getName, Collectors.summingDouble(Production::getPrice)
        ));
        System.out.println(map);

        // group the products by name (a Map of Lists)
        Map<String,List<Production>> map2 = stream2.collect(Collectors.groupingBy(
                Production::getName
        ));
        System.out.println(map2);

        // group the products by name into a TreeMap whose values are Sets
        TreeMap<String,Set<Production>> map3 = stream3.collect(Collectors.groupingBy(
                Production::getName, // classifier: group by name
                TreeMap::new, // return a TreeMap instead of the default HashMap
                Collectors.toSet() // collect each group into a Set instead of the default List
        ));
        System.out.println(map3);
    }
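groupingByConcurrent is mentioned above without an example. A minimal sketch on a parallel stream of product names (the class and method names are hypothetical):

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;
import java.util.stream.Collectors;

class GroupingByConcurrentDemo {
    // Counts occurrences per name; all worker threads accumulate into one shared ConcurrentMap.
    static ConcurrentMap<String, Long> countByName(List<String> names) {
        return names.parallelStream()
                .collect(Collectors.groupingByConcurrent(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("T-Shirt", "cloth", "shoe", "hat", "cloth", "shoe");
        System.out.println(countByName(names));
    }
}
```

The counts are the same as with groupingBy; the difference is that the concurrent variant is unordered and avoids merging per-thread maps, which tends to matter only for large parallel streams.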

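3.12. The Collectors.reducing method

Much like Stream's reduce operation, reducing creates a Collector that performs a reduction over the Stream elements. Collectors provides three overloads:

reducing(BinaryOperator<T> op): reduces the elements with the given BinaryOperator. The result is an Optional of the element type, which is empty when the Stream is empty.
reducing(T identity, BinaryOperator<T> op): works like the overload above, but takes an identity value that serves as the starting value of the reduction and is returned as-is when the Stream is empty. It should be an identity for op, that is, op.apply(identity, x) should equal x.
reducing(U identity, Function<? super T, ? extends U> mapper, BinaryOperator<U> op): the first two overloads can only return the element type or an Optional of it. This overload first maps each element to type U with the mapper function, so the result type can differ from the element type.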

public static void reducing() {
        Stream<Production> stream = initData().stream();
        Stream<Production> stream2 = initData().stream();
        Stream<Production> stream3 = initData().stream();

        // find the most expensive product
        Optional<Production> production = stream.collect(Collectors.reducing(
                (p1,p2) -> p1.getPrice() > p2.getPrice() ? p1:p2
        ));
        System.out.println("production:" + production.get());
        // find the most expensive product, starting from an initial value;
        // note that this seed takes part in the comparison and, at a price of 2000,
        // wins over every product in the sample data
        Production production2 = stream2.collect(Collectors.reducing(
                new Production("kuzi",2000),
                (p1,p2) -> p1.getPrice() > p2.getPrice() ? p1:p2
        ));
        System.out.println("production2:" + production2);
        // total price of all products
        Double total = stream3.collect(Collectors.reducing(
                0d,
                Production::getPrice,
                (p1,p2) -> p1 + p2
        ));
        System.out.println("total:" + total);
    }
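How the first two overloads behave on an empty Stream can be checked with a short sketch on plain Double values (the class and method names are hypothetical):

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

class ReducingIdentityDemo {
    // No identity: the result is an Optional, empty when there are no elements.
    static Optional<Double> maxPrice(List<Double> prices) {
        return prices.stream().collect(Collectors.reducing(Double::max));
    }

    // With an identity: 0 is a true identity for addition and is returned for an empty list.
    static Double totalPrice(List<Double> prices) {
        return prices.stream().collect(Collectors.reducing(0d, Double::sum));
    }

    public static void main(String[] args) {
        List<Double> prices = Arrays.asList(1.0d, 3.0d, 2.0d);
        List<Double> none = Collections.emptyList();
        System.out.println(maxPrice(prices));
        System.out.println(maxPrice(none));
        System.out.println(totalPrice(prices));
        System.out.println(totalPrice(none));
    }
}
```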

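4. Writing a custom Collector

The static methods of Collectors cover almost every need for a Collector. If none of them fits, you can write your own, and doing so is also a good way to get more familiar with how a Collector works.

Here we implement something similar to toList(). First we implement the Collector interface: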

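/**
 * A custom Collector that gathers the elements into a List
 */
public class QriverCollector<T> implements Collector<T, List<T>, List<T>> {
    /**
     * Container
     * @return a supplier that creates an empty ArrayList
     */
    @Override
    public Supplier<List<T>> supplier() {
        return ArrayList::new;
    }

    /**
     * Accumulator
     * @return a function that adds one element to the container
     */
    @Override
    public BiConsumer<List<T>,T> accumulator() {
        return List::add;
    }

    /**
     * Combiner
     * @return a function that merges two partial containers (parallel streams only)
     */
    @Override
    public BinaryOperator<List<T>> combiner() {
        return (p,l) ->{
            p.addAll(l);
            return p;
        };
    }

    @Override
    public Function<List<T>, List<T>> finisher() {
        return Function.identity();
    }

    @Override
    public Set<Characteristics> characteristics() {
        // ArrayList is not thread-safe, so CONCURRENT must not be declared; UNORDERED is
        // left out as well so that the result keeps the encounter order, like toList()
        return EnumSet.of(Characteristics.IDENTITY_FINISH);
    }
}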

It is then used like this:

public static void custom() {
    Stream<Production> stream = initData().stream();
    List<Production> list = stream.collect(new QriverCollector<>());
    System.out.println("list:" + list);
}
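The characteristics a collector declares matter most on parallel streams. A collector that accumulates into an ArrayList is not thread-safe, so it should declare IDENTITY_FINISH only: declaring CONCURRENT would let several threads add to one shared ArrayList. The sketch below builds such a list collector with Collector.of instead of a dedicated class (the names ListCollectorParallelDemo, toArrayList and collectInParallel are hypothetical) and runs it on a parallel stream:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.IntStream;

class ListCollectorParallelDemo {
    // A toList-like collector; Collector.of adds IDENTITY_FINISH and nothing else.
    static <T> Collector<T, List<T>, List<T>> toArrayList() {
        return Collector.of(
                ArrayList::new,
                List::add,
                (left, right) -> { left.addAll(right); return left; });
    }

    // Each worker thread fills its own ArrayList; the combiner merges them in encounter order.
    static List<Integer> collectInParallel(int n) {
        return IntStream.range(0, n).boxed().parallel().collect(toArrayList());
    }

    public static void main(String[] args) {
        System.out.println(collectInParallel(10));
        System.out.println(ListCollectorParallelDemo.<Integer>toArrayList().characteristics());
    }
}
```

Because neither CONCURRENT nor UNORDERED is declared, the parallel result contains every element exactly once and in the original order.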

That covers the main uses of Collector. A later article will look at parallel streams.
