Java Functional Programming [3] [Stream Terminal Operations] [Part 2]: The collect() Method [1], Usage in Detail

Java编程乐园

Last modified 2025-04-27 21:16:35

Category: Functional Programming. Tags: java

First published 2025-01-12 16:39:56

Article link: https://blog.csdn.net/weixin_42369079/article/details/144007266

collect(), Collector, and Collectors: the concepts and how the three relate

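collect() is a terminal operation of the Java Stream API. It gathers the elements of a stream into a new collection or another result object. It is usually called with a collector (Collector) argument, which describes the result container and how elements are accumulated into it. That argument must be an implementation of the Collector interface.
Collector is an interface; the collector passed to collect() is a concrete implementation of it.
Collectors is a utility class introduced in Java 8 in the java.util.stream package. Its static factory methods create ready-made collectors (Collector instances) that can be passed to collect(), which is especially handy in functional-style code built on the Java Stream API. Each factory method implements a particular strategy, the most common being accumulation into a List, Set, or Map.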

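All predefined collector implementations can be found in the Collectors class. A common practice is to use the following static import so that calls to these methods read more cleanly: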

	import static java.util.stream.Collectors.*;
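
As a minimal sketch of what the static import buys you, the class below (the name StaticImportDemo and its sample words are made up for illustration) calls toList() and joining() without the Collectors. prefix:

```java
import static java.util.stream.Collectors.joining;
import static java.util.stream.Collectors.toList;

import java.util.List;
import java.util.stream.Stream;

class StaticImportDemo {
    // Collects the words into a List using the statically imported toList()
    static List<String> words() {
        return Stream.of("Hello", "world", "Java").collect(toList());
    }

    // Joins the words with a comma using the statically imported joining()
    static String joined() {
        return Stream.of("Hello", "world", "Java").collect(joining(","));
    }

    public static void main(String[] args) {
        System.out.println(words());  // [Hello, world, Java]
        System.out.println(joined()); // Hello,world,Java
    }
}
```

Importing the individual methods instead of Collectors.* keeps it obvious where each name comes from; either style works.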

[Figure: list of the static methods in Collectors that create collectors]

The Stream terminal method collect() in detail

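collect() is a terminal operation on a stream. It gathers the elements according to some strategy; the simplest and most common is to put them into a mutable container such as a List, Set, or Map.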

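Usage examples from the Javadoc of the Collectors class
Collectors has a very wide range of uses. The following comment comes from the source file of the Collectors class in the Java core library and shows several typical ones.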

/**
 * Implementations of {@link Collector} that implement various useful reduction
 * operations, such as accumulating elements into collections, summarizing
 * elements according to various criteria, etc.
 *
 * <p>The following are examples of using the predefined collectors to perform
 * common mutable reduction tasks:
 *
 * <pre>{@code
 *     // Accumulate names into a List
 *     List<String> list = people.stream().map(Person::getName).collect(Collectors.toList());
 *
 *     // Accumulate names into a TreeSet
 *     Set<String> set = people.stream().map(Person::getName).collect(Collectors.toCollection(TreeSet::new));
 *
 *     // Convert elements to strings and concatenate them, separated by commas
 *     String joined = things.stream()
 *                           .map(Object::toString)
 *                           .collect(Collectors.joining(", "));
 *
 *     // Compute sum of salaries of employee
 *     int total = employees.stream()
 *                          .collect(Collectors.summingInt(Employee::getSalary)));
 *
 *     // Group employees by department
 *     Map<Department, List<Employee>> byDept
 *         = employees.stream()
 *                    .collect(Collectors.groupingBy(Employee::getDepartment));
 *
 *     // Compute sum of salaries by department
 *     Map<Department, Integer> totalByDept
 *         = employees.stream()
 *                    .collect(Collectors.groupingBy(Employee::getDepartment,
 *                                                   Collectors.summingInt(Employee::getSalary)));
 *
 *     // Partition students into passing and failing
 *     Map<Boolean, List<Student>> passingFailing =
 *         students.stream()
 *                 .collect(Collectors.partitioningBy(s -> s.getGrade() >= PASS_THRESHOLD));
 *
 * }</pre>
 *
 * @since 1.8
 */
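
The Javadoc snippets above are fragments. A runnable sketch of two of them, grouping with a downstream sum and partitioning, could look like the following. The Employee class, the department names, and the salaries are hypothetical stand-ins, not part of the JDK.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupingDemo {
    // Hypothetical stand-in for the Employee type named in the Javadoc
    static class Employee {
        private final String department;
        private final int salary;

        Employee(String department, int salary) {
            this.department = department;
            this.salary = salary;
        }

        String getDepartment() { return department; }
        int getSalary() { return salary; }
    }

    // Made-up sample data
    static final List<Employee> EMPLOYEES = Arrays.asList(
            new Employee("R&D", 3000),
            new Employee("R&D", 2000),
            new Employee("Sales", 1500));

    // Sum of salaries per department
    static Map<String, Integer> totalByDept() {
        return EMPLOYEES.stream()
                .collect(Collectors.groupingBy(Employee::getDepartment,
                        Collectors.summingInt(Employee::getSalary)));
    }

    // true -> salary at or above the threshold, false -> below it
    static Map<Boolean, List<Employee>> partitionBySalary(int threshold) {
        return EMPLOYEES.stream()
                .collect(Collectors.partitioningBy(e -> e.getSalary() >= threshold));
    }

    public static void main(String[] args) {
        System.out.println(totalByDept().get("R&D"));                  // 5000
        System.out.println(partitionBySalary(2000).get(true).size()); // 2
    }
}
```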

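Identity collectors: gathering stream elements into collections with toList, toSet, toMap, and so on
A stream does not store data, so once processing is finished the elements have to be gathered into a new collection. toList, toSet, and toMap are the common choices; toCollection, toConcurrentMap, and others cover more specialized cases.
"Identity" here means that collect() places the elements into the collection unchanged. toList(), for example, only puts the stream's elements into a List container and does not modify the elements themselves.
First, an example: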

	public static void collect恒等处理() {
		Stream.of("Hello", "world", "Java").collect(Collectors.toList());
		List<Integer> list = Arrays.asList(1, 6, 3, 4, 6, 7, 9, 6, 20);
		List<Integer> listNew = list.stream().filter(x -> x % 2 == 0).collect(Collectors.toList());
		Set<Integer> set = list.stream().filter(x -> x % 2 == 0).collect(Collectors.toSet());
	}

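Choosing the collection container
The code above uses the containers that the library picks by default, which covers most needs. But because the return type is an interface, we do not know which concrete container class the library actually chose.
When a specific container type is required, use Collectors.toCollection(Supplier<C> collectionFactory).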

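// Use toCollection() to choose the concrete container type.
// A stream can be consumed only once, so each collect() gets a fresh stream.
List<String> source = Arrays.asList("Hello", "world", "Java");
ArrayList<String> arrayList = source.stream().collect(Collectors.toCollection(ArrayList::new));
HashSet<String> hashSet = source.stream().collect(Collectors.toCollection(HashSet::new));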
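
To make the effect of the chosen container visible, here is a small sketch (class name and data are illustrative only) that collects the same numbers into a TreeSet and a LinkedList:

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.TreeSet;
import java.util.stream.Collectors;

class ToCollectionDemo {
    static final List<Integer> SOURCE = Arrays.asList(1, 6, 3, 4, 6, 7, 9, 6, 20);

    // A TreeSet drops duplicates and keeps the elements sorted
    static TreeSet<Integer> sortedDistinct() {
        return SOURCE.stream().collect(Collectors.toCollection(TreeSet::new));
    }

    // A LinkedList keeps duplicates and the encounter order
    static LinkedList<Integer> evens() {
        return SOURCE.stream()
                .filter(x -> x % 2 == 0)
                .collect(Collectors.toCollection(LinkedList::new));
    }

    public static void main(String[] args) {
        System.out.println(sortedDistinct()); // [1, 3, 4, 6, 7, 9, 20]
        System.out.println(evens());          // [6, 4, 6, 6, 20]
    }
}
```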

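The Collectors.toMap() collector

Collecting into a Map.
Collectors.toMap() can also be counted as an identity collector, but it is somewhat more involved.

The toMap family (toMap, toUnmodifiableMap since Java 10, and toConcurrentMap) creates a Collector that produces a map, an unmodifiable map, or a concurrent map. The keyMapper and valueMapper functions are applied to every collected element to produce one key/value entry of the resulting map.

Collectors.toMap() collects the elements of a stream into a Map instance. Both of its parameters are functional-interface instances:
1. keyMapper
2. valueMapper

keyMapper extracts the map key from a stream element, and valueMapper extracts the value associated with that key.

Here is a Collectors.toMap example (Version 1):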

	/*** Version 1 ***/
	public static void testToMap() {
		// Key: the string's length (Integer); value: the string itself
		Map<Integer, String> sMap = Stream.of("World", "me", "you")
			.collect(Collectors.toMap(String::length, Function.identity()));
		sMap.forEach((k,v) ->System.out.println(k + ":"+v) );
    }

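How Collectors.toMap() handles duplicate-key conflicts
By default, when two elements produce the same key during collection, toMap() throws an IllegalStateException. You can pass a mergeFunction to combine the values that share a key. The result is by default a HashMap (for toMap()) or a ConcurrentHashMap (for toConcurrentMap()); you can pass a mapSupplier to get the map implementation you want instead.

Let's modify the example above to demonstrate this: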

	/*** Version 1, modified to demonstrate the exception ***/
	public static void testToMap() {
		Map<Integer, String> sMap = Stream.of("World", "me", "you","Hello")
			.collect(Collectors.toMap( String::length,Function.identity() ));
		sMap.forEach((k,v) ->System.out.println(k + ":"+v) );  
    }

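Running it throws an IllegalStateException that reports a duplicate key, because "World" and "Hello" both have length 5 and therefore map to the same key.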
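
A minimal sketch of observing that failure in code, assuming nothing beyond the JDK (the class name DuplicateKeyDemo is made up): the collector is run inside a try block and the exception is caught. The exact exception message differs between JDK versions, so the sketch only prints it.

```java
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class DuplicateKeyDemo {
    // Returns true if the two-argument toMap() rejects the duplicate key 5
    static boolean throwsOnDuplicateKey() {
        try {
            Stream.of("World", "me", "you", "Hello")
                    .collect(Collectors.toMap(String::length, Function.identity()));
            return false;
        } catch (IllegalStateException e) {
            // The message text depends on the JDK version
            System.out.println("Caught: " + e.getMessage());
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(throwsOnDuplicateKey()); // true
    }
}
```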

toMap() also has overloads that let the caller decide what happens when a key repeats. Here are the signatures of two of them:

		// Overload 1: toMap() with two parameters
	    Collector<T, ?, Map<K,U>> toMap(Function<? super T, ? extends K> keyMapper,
                                    Function<? super T, ? extends U> valueMapper)
         // Overload 2: toMap() with three parameters; the third is the merge function mergeFunction
        Collector<T, ?, Map<K,U>> toMap(Function<? super T, ? extends K> keyMapper,
                                    Function<? super T, ? extends U> valueMapper,
                                    BinaryOperator<U> mergeFunction)

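Below is how the Collectors class in the Java library defines the three-argument toMap() collector; it is worth studying.
The three parameters mean:
First parameter (keyMapper): produces the key
Second parameter (valueMapper): produces the value
Third parameter (mergeFunction): the merge rule applied when keys collide
Source of the three-argument toMap() collector in the Java library: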

public static <T, K, U>
    Collector<T, ?, Map<K,U>> toMap(Function<? super T, ? extends K> keyMapper,
                                    Function<? super T, ? extends U> valueMapper,
                                    BinaryOperator<U> mergeFunction) {
    return toMap(keyMapper, valueMapper, mergeFunction, HashMap::new);
}
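
The source above delegates to a four-argument toMap() whose last parameter is the map supplier. As a sketch (the class name and the "/" separator are arbitrary choices for this example), passing TreeMap::new instead of HashMap::new yields a map sorted by key:

```java
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ToMapSupplierDemo {
    // Key: word length; value: the words of that length joined with "/";
    // the fourth argument selects TreeMap as the result container
    static TreeMap<Integer, String> byLength() {
        return Stream.of("World", "me", "you", "Hello")
                .collect(Collectors.toMap(
                        String::length,
                        Function.identity(),
                        (o, n) -> o + "/" + n,
                        TreeMap::new));
    }

    public static void main(String[] args) {
        System.out.println(byLength()); // {2=me, 3=you, 5=World/Hello}
    }
}
```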

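There are several ways to deal with collisions when multiple elements map to the same key. The two-argument toMap() simply uses a merge function that throws unconditionally, but you can easily write a more flexible merge policy with the three-argument toMap(). For example, suppose you have a stream of Person objects and want to build an "address book" that maps names (key) to addresses (value), but two people may share a name. You can handle these collisions gracefully as follows and produce a Map from each name to a concatenated list of addresses: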

     Map<String, String> addressBook = people.stream()
     	.collect(toMap(Person::getName,
			Person::getAddress,
			(s, a) -> s + ", " + a)); // merge the addresses of people with the same name
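
The address-book fragment can be turned into something runnable along these lines. The Person class here and the names and cities are hypothetical sample data invented for the sketch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class AddressBookDemo {
    // Hypothetical Person type with just the two getters the example needs
    static class Person {
        private final String name;
        private final String address;

        Person(String name, String address) {
            this.name = name;
            this.address = address;
        }

        String getName() { return name; }
        String getAddress() { return address; }
    }

    // Two of the three sample people share the name "Li Wei"
    static final List<Person> PEOPLE = Arrays.asList(
            new Person("Li Wei", "Beijing"),
            new Person("Wang Fang", "Shanghai"),
            new Person("Li Wei", "Shenzhen"));

    // Name -> address; addresses of people with the same name are concatenated
    static Map<String, String> addressBook() {
        return PEOPLE.stream()
                .collect(Collectors.toMap(Person::getName,
                        Person::getAddress,
                        (s, a) -> s + ", " + a));
    }

    public static void main(String[] args) {
        System.out.println(addressBook().get("Li Wei"));    // Beijing, Shenzhen
        System.out.println(addressBook().get("Wang Fang")); // Shanghai
    }
}
```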

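So the goal is to handle a duplicate key with some policy instead of throwing an exception.
For example, we can rework the earlier code and pass in a merge function, mergeFun, whose policy is to overwrite the existing value with the new one, so that for a repeated key the later element wins.
The updated program, Version 2: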

	/*** Version 2 ***/
	public static void testToMap() {
		BinaryOperator<String> mergeFun = (o, n) -> n; // o is the existing value, n is the new value
		/***
		Map<Integer, String> sMap = Stream.of("World", "me", "you","Hello")
			.collect(Collectors.toMap( String::length,Function.identity() ));
		***/
		Map<Integer, String> sMap = Stream.of("World", "me", "you","Hello")
			.collect(Collectors.toMap( String::length,Function.identity(),mergeFun ));
		sMap.forEach((k,v) ->System.out.println(k + ":"+v) );
    }

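Comparing the two versions: Version 1 throws an IllegalStateException on the duplicate key 5, while Version 2 completes and the map holds 2:me, 3:you, and 5:Hello, because the later value "Hello" replaced "World".
Note: the toMap() discussed above accumulates into an ordinary HashMap, which is not thread-safe. It still gives correct results on a parallel stream, because each thread fills its own map and the partial maps are merged afterwards, but that merging can be expensive. toConcurrentMap() accumulates into a single shared, thread-safe ConcurrentMap instead and is mainly intended for parallel streams.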
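
A small sketch of toConcurrentMap() on a parallel stream follows. The class name is made up, and the merge function (keep the alphabetically smaller word) is chosen for this example because it is commutative, so the result does not depend on the order in which threads insert values.

```java
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ToConcurrentMapDemo {
    // Key: word length; on a collision keep the alphabetically smaller word
    static ConcurrentMap<Integer, String> byLengthParallel() {
        return Stream.of("World", "me", "you", "Hello")
                .parallel()
                .collect(Collectors.toConcurrentMap(
                        String::length,
                        Function.identity(),
                        (a, b) -> a.compareTo(b) <= 0 ? a : b));
    }

    public static void main(String[] args) {
        ConcurrentMap<Integer, String> map = byLengthParallel();
        System.out.println(map.get(5)); // Hello
        System.out.println(map.size()); // 3
    }
}
```

With an order-dependent merge function such as (o, n) -> n, the winner for key 5 could vary between runs on a parallel stream, which is why the sketch avoids it.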

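The more general three-argument collect() method
What we have seen so far is the most basic use of the terminal operation collect(): gathering the elements of a stream into a collection, for example: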

	Stream.of("Hello", "world", "Java").collect(Collectors.toList());

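In fact, collect() also has a more general form that takes three arguments. On Stream<T> its signature is:

	<R> R collect(Supplier<R> supplier, BiConsumer<R, ? super T> accumulator, BiConsumer<R, R> combiner)

The first argument is a supplier that creates the initially empty result container. The second is an accumulator that adds one element to that container. The third is a combiner that merges one partial container into another; it normally only comes into play on parallel streams. The primitive streams offer the same method with a primitive accumulator: on IntStream, for example, the accumulator is an ObjIntConsumer<R>.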
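
As a sketch of the three roles on an object stream (class and method names are illustrative), the first method below collects into an ArrayList and the second concatenates into a StringBuilder. Both follow the pattern shown in the Javadoc of Stream.collect.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class ThreeArgCollectDemo {
    // supplier: new ArrayList; accumulator: add one element; combiner: addAll
    static List<String> toList(Stream<String> stream) {
        return stream.collect(ArrayList::new, ArrayList::add, ArrayList::addAll);
    }

    // supplier: new StringBuilder; accumulator and combiner: append
    static String concat(Stream<String> stream) {
        return stream.collect(StringBuilder::new, StringBuilder::append, StringBuilder::append)
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(toList(Stream.of("a", "b", "c")));            // [a, b, c]
        // On a parallel stream the combiner merges the partial results in order
        System.out.println(concat(Stream.of("a", "b", "c").parallel())); // abc
    }
}
```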

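An example of the three-argument collect():
Converting between strings and streams: a String can be turned into a stream in two ways, with chars() or with codePoints(). Both are default methods of the java.lang.CharSequence interface and return an IntStream. This example uses chars().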

        String str = "中国新能源电池技术领先世界".chars() // turn the string into an IntStream
            .collect(StringBuffer::new,
                 StringBuffer::appendCodePoint,
                 StringBuffer::append) // collect the stream back into a StringBuffer
            .toString();
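
The same round trip can be sketched with codePoints(), the other conversion method mentioned above. The sample string is an assumption of this example: it contains one emoji outside the Basic Multilingual Plane, written as a surrogate pair, to show that chars() counts UTF-16 code units while codePoints() counts whole characters. StringBuilder is used here in place of StringBuffer since no synchronization is needed.

```java
class CodePointsDemo {
    // "A", one emoji outside the BMP (a surrogate pair), then "B"
    static final String SAMPLE = "A\uD83D\uDE00B";

    // Turns the text into a stream of code points and collects it back
    static String roundTrip(String text) {
        return text.codePoints()
                .collect(StringBuilder::new,
                         StringBuilder::appendCodePoint,
                         StringBuilder::append)
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(SAMPLE.chars().count());           // 4 UTF-16 code units
        System.out.println(SAMPLE.codePoints().count());      // 3 code points
        System.out.println(roundTrip(SAMPLE).equals(SAMPLE)); // true
    }
}
```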

Joining with Collectors.joining()

Collectors.joining() has three overloads, one for each way of joining. Their signatures are:

	Collector<CharSequence, ?, String> joining()
	Collector<CharSequence, ?, String> joining(CharSequence delimiter)
	Collector<CharSequence, ?, String> joining(CharSequence delimiter, CharSequence prefix, CharSequence suffix)

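1. The no-argument joining() returns a collector that concatenates all elements of the stream with nothing ("") between them.

2. joining(CharSequence delimiter) takes a character sequence as the delimiter and returns a collector that concatenates all elements of the stream with that delimiter between them.

3. joining(CharSequence delimiter, CharSequence prefix, CharSequence suffix) takes the delimiter first, then a prefix, then a suffix. It concatenates all elements with the delimiter and wraps the result in the prefix and suffix. Suppose the stream holds the four elements ["A","B","C","D"] and we pass "-" as the delimiter, "[" as the prefix, and "]" as the suffix. The result is [A-B-C-D].

With a collector from Collectors.joining(), several strings can be concatenated without a for loop. The following example uses all three overloads.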

	public static void testToJoining2() {
		// Concatenate strings with Collectors.joining()
		List<String> list = Arrays.asList("苏州", "上海", "宁波","杭州");
		String joinedA = list.stream().collect(Collectors.joining());
		String joinedB = list.stream().collect(Collectors.joining("|"));
		String joinedC = list.stream().collect(Collectors.joining(",", "{", "}"));

		List<String> listM = Arrays.asList("张龙", "赵虎", "王五");
		String message = listM.stream().collect(Collectors.joining("、", "在中国小说中，", "都是练武之人。"));
		System.out.println("JoinedA："+joinedA);
		System.out.println("JoinedB："+joinedB);
		System.out.println("JoinedC："+joinedC);
		System.out.println(message); // prints: 在中国小说中，张龙、赵虎、王五都是练武之人。 (In Chinese novels, Zhang Long, Zhao Hu, and Wang Wu are all martial artists.)
	}

Program output: the three joined strings are 苏州上海宁波杭州, 苏州|上海|宁波|杭州, and {苏州,上海,宁波,杭州}, followed by the message line.

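Collectors for reduction (reducing)

Collectors.reducing() has 3 overloads:
1. reducing() with one parameter

public static <T> Collector<T, ?, Optional<T>> reducing(BinaryOperator<T> op)
// Combines the elements with the BinaryOperator alone; the result is an Optional

2. reducing() with two parameters

public static <T> Collector<T, ?, T> reducing(T identity, BinaryOperator<T> op)
// Starts from the initial value identity, then combines the elements with the BinaryOperator

3. reducing() with three parameters

public static <T, U> Collector<T, ?, U> reducing(U identity, Function<? super T, ? extends U> mapper, BinaryOperator<U> op)
// Starts from identity, maps each element with the Function, then combines the mapped values with the BinaryOperator

reducing() is very useful. Every overload takes a BinaryOperator, a functional interface that receives two operands of the same type and returns a result of that type; in pseudocode, (T,T) -> T.

In the JDK 8 source, several collectors in the Collectors class, such as counting(), maxBy(), and minBy(), are implemented with reducing(). Their source code is: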

    public static <T> Collector<T, ?, Long>  counting() {
        return reducing(0L, e -> 1L, Long::sum);
    }

    public static <T> Collector<T, ?, Optional<T>>  minBy(Comparator<? super T> comparator) {
        return reducing(BinaryOperator.minBy(comparator));
    }

    public static <T> Collector<T, ?, Optional<T>> maxBy(Comparator<? super T> comparator) {
        return reducing(BinaryOperator.maxBy(comparator));
    }
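
A quick way to convince yourself of that equivalence is to run the specialized collector and the corresponding reducing() call side by side. This is only a sketch with made-up names; the numbers are the ones used in the reducing() example of this article.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.BinaryOperator;
import java.util.stream.Collectors;

class ReducingEquivalentsDemo {
    static final List<Integer> NUMBERS = Arrays.asList(3, 8, 6, 7, 9, 20);

    // Counting written out by hand: map every element to 1L and add up
    static long countViaReducing() {
        return NUMBERS.stream()
                .collect(Collectors.reducing(0L, (Integer e) -> 1L, Long::sum));
    }

    static long countViaCounting() {
        return NUMBERS.stream().collect(Collectors.counting());
    }

    // Maximum written out by hand with BinaryOperator.maxBy
    static int maxViaReducing() {
        return NUMBERS.stream()
                .collect(Collectors.reducing(
                        BinaryOperator.maxBy(Comparator.<Integer>naturalOrder())))
                .get();
    }

    static int maxViaMaxBy() {
        return NUMBERS.stream()
                .collect(Collectors.maxBy(Comparator.<Integer>naturalOrder()))
                .get();
    }

    public static void main(String[] args) {
        System.out.println(countViaReducing() + " " + countViaCounting()); // 6 6
        System.out.println(maxViaReducing() + " " + maxViaMaxBy());        // 20 20
    }
}
```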

What the reducing() collectors do can also be done with the terminal operation reduce() of java.util.stream.Stream.
Here is an example that uses reducing():

	public static void collectReducing() {
		List<Integer> list = Arrays.asList( 3, 8, 6, 7, 9, 20);
		Optional<Integer> sum = list.stream().collect(Collectors.reducing(Integer::sum));
		System.out.println("sum:" + sum.get());
		int total = list.stream().collect(Collectors.reducing(0,Integer::sum));
		System.out.println("total:" + total);
		
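		// crtList() is the helper defined in the CollectTest class later in this article
		total = crtList().stream().collect(Collectors.reducing(0,Person::getSalary,Integer::sum));
		System.out.println("Total salary:" + total);

		/*** The same results with reduce() ***/
		System.out.println("-------Using reduce()-------");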
		sum = list.stream().reduce(Integer::sum);
		System.out.println("sum:" + sum.get());
		total = list.stream().reduce(0,Integer::sum);
		System.out.println("total:" + total);
		total = crtList().stream().map(Person::getSalary).reduce(0, Integer::sum);
		System.out.println("Total salary:" + total);
	}

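Collectors for aggregation and summary statistics (summarizing, counting, and averaging)

collect() can also reduce the elements of a Stream to statistical values.
Collectors offers collectors for the average (averagingInt/Long/Double), the count (counting), the minimum (minBy), the maximum (maxBy), the sum (summingInt/Long/Double), and combined summary statistics (summarizingInt/Long/Double).
An example of maxBy: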

	public static void collectMaxBy() {
		Optional<Integer> collectMaxBy = Stream.of(9, 2, 8, 4)
	        .collect(Collectors.maxBy(Comparator.comparing(Function.identity())));
		System.out.println("collectMaxBy:" + collectMaxBy.get());

		List<String> strList = Arrays.asList("World", "me", "you","Hello!");
		Optional<String> result = strList.stream()
			.collect(Collectors.maxBy(Comparator.naturalOrder()));
		System.out.println("result:" + result.get());
	}

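This prints collectMaxBy:9 and result:you. Strings are compared in their natural (lexicographic) order, in which uppercase letters sort before lowercase ones, so "you" is the maximum.
Now an example of Collectors.summingInt(): to compute the total salary that the Shanghai subsidiary has to pay its employees each month, Collectors.summingInt() can be used like this: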

public void calculateSum() {
    Integer salarySum = getAllEmployees().stream()
            .filter(employee -> "上海公司".equals(employee.getSubCompany()))
            .collect(Collectors.summingInt(Employee::getSalary));
    System.out.println(salarySum);
}

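Summary statistics over an object stream with Collectors.summarizingInt/Long/Double

If you want the count, sum, average, maximum, and minimum of the elements of an object stream, Collectors.summarizing(Int/Long/Double) is made for exactly that. It computes all of these results in one pass and returns an (Int/Long/Double)SummaryStatistics.
Here is an example: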

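import java.util.ArrayList;
import java.util.DoubleSummaryStatistics;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

// Person is not shown in this article; it is assumed to have a
// (name, age, salary) constructor and an int getSalary() getter.
public class CollectTest {
	public static List<Person> crtList() {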
		List<Person> personList = new ArrayList<>();
	    personList.add(new Person("刘明", 28, 2000));
	    personList.add(new Person("李伟", 44, 4060));
	    personList.add(new Person("王振国", 55, 5050));
	    personList.add(new Person("赵云" , 66, 6080));
	    personList.add(new Person("张三", 33, 3300));
	    personList.add(new Person("钱玄同", 23, 1080));
	    return personList;
	}
    
	public static void test汇总统计() {
		// Count the employees
		Long count = crtList().stream().collect(Collectors.counting());
		// Average salary
		Double average = crtList().stream().collect(Collectors.averagingDouble(Person::getSalary));
		// Highest salary
		Optional<Integer> max = crtList().stream().map(Person::getSalary).collect(Collectors.maxBy(Integer::compare));
		// Sum of salaries
		Integer sum = crtList().stream().collect(Collectors.summingInt(Person::getSalary));
		// All statistics in one pass
		DoubleSummaryStatistics collect = crtList().stream().collect(Collectors.summarizingDouble(Person::getSalary));

		System.out.println("Number of employees: " + count);
		System.out.println("Highest salary: " + max.get());
		System.out.println("Average salary: " + average);
		System.out.println("Total salary: " + sum);
		System.out.println("All salary statistics: " + collect);
	}

	public static void main(String[] args) {
		test汇总统计();
	}
}
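
The example above prints the statistics object as a whole. As a supplementary sketch, the individual values can be read through its getters. To stay self-contained, this version works on a plain list holding the six salary figures from crtList() rather than on Person objects, and the class name is made up.

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.stream.Collectors;

class SummaryStatsDemo {
    // The six salaries used in crtList() above
    static final List<Integer> SALARIES = Arrays.asList(2000, 4060, 5050, 6080, 3300, 1080);

    // One pass over the data yields count, sum, min, max, and average
    static IntSummaryStatistics stats() {
        return SALARIES.stream().collect(Collectors.summarizingInt(Integer::intValue));
    }

    public static void main(String[] args) {
        IntSummaryStatistics s = stats();
        System.out.println(s.getCount());   // 6
        System.out.println(s.getSum());     // 21570
        System.out.println(s.getMin());     // 1080
        System.out.println(s.getMax());     // 6080
        System.out.println(s.getAverage()); // 3595.0
    }
}
```

summarizingInt keeps the sum as a long and min/max as int, which avoids the conversion to double that summarizingDouble performs in the example above.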