Stream API Source Code Analysis, Part 1 (The Collector)

朴实搬砖人

Category: StreamAPI. Tags: JAVA

Article link: https://blog.csdn.net/qq_33145973/article/details/102671425

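Preface

Parts one to three of the earlier "Java 8 Stream API usage" series used the collect(toList()) call. The Collector analyzed in this article is inseparable from that call.

1. What is collect()?

collect() is an eagerly evaluated operation (a terminal operation). Its argument is a Collector. Collector is an interface that has to be implemented; the JDK API already provides a number of general-purpose implementations, and you only need to write your own when those do not meet your requirements.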

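        // Requires: import java.util.*; import java.util.stream.Collectors;
        // A stream can be consumed only once, so every collect() call below starts from a fresh stream.
        List<String> words = Arrays.asList("hello", "world", "helloworld");

        List<String> list = words.stream().collect(Collectors.toList());

        // Three-argument form: supplier, accumulator, combiner
        List<String> list1 = words.stream().collect(
                () -> new ArrayList<String>(),
                (theList, item) -> theList.add(item),
                (theList1, theList2) -> theList1.addAll(theList2));

        List<String> list2 = words.stream().collect(LinkedList::new, LinkedList::add, LinkedList::addAll);
        list.forEach(System.out::println);

        // Choose the result container yourself with toCollection()
        List<String> list3 = words.stream().collect(Collectors.toCollection(ArrayList::new));
        Set<String> set = words.stream().collect(Collectors.toCollection(TreeSet::new));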

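The Collectors class offers three ways to build a collection directly with collect(): toList(), toSet(), and toCollection(). For more detail, see the Collectors static factory class.

2. The Collector

The Collector is the key to understanding the functional way of thinking. For a fuller description, see the Collector interface Javadoc (if the points listed below are still unclear after reading them, be sure to read the Javadoc).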

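1. It is an interface describing a mutable reduction operation: it accumulates input elements into a mutable result container and, once all elements have been processed, optionally transforms the accumulated result into a final representation. It can be executed sequentially or in parallel.
2. Collectors provides implementations of many common reductions; Collectors is essentially a factory.
3. A Collector is specified by four functions that work together to accumulate entries into a mutable result container and optionally perform a final transformation on the result.
4. To ensure that sequential and parallel executions produce equivalent results, the collector functions must satisfy two constraints: identity and associativity.
5. Identity means that for any partially accumulated result a, combining it with an empty result container must produce an equivalent result: a must be equivalent to combiner.apply(a, supplier.get()).
6. The defining trait of functional programming is that it expresses what to do rather than how to do it.
For example, a combiner: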

(list1,list2)->{list1.addAll(list2);return list1;}
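
The identity and associativity constraints from points 4 and 5 can be checked by hand for a list-based collector. The following is a minimal sketch, not JDK code: the class name CombinerLaws and its helper methods are made up for illustration, and the second check follows the Collector Javadoc's description of associativity (splitting the computation must give an equivalent result).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;
import java.util.function.Supplier;

public class CombinerLaws {
    // Hypothetical list-based collector functions, mirroring the combiner shown above.
    static final Supplier<List<String>> SUPPLIER = ArrayList::new;
    static final BiConsumer<List<String>, String> ACCUMULATOR = List::add;
    static final BinaryOperator<List<String>> COMBINER = (list1, list2) -> {
        list1.addAll(list2);
        return list1;
    };

    // Identity: combining a partial result with an empty container changes nothing.
    static boolean identityHolds() {
        List<String> expected = SUPPLIER.get();
        ACCUMULATOR.accept(expected, "hello");

        List<String> a = SUPPLIER.get();
        ACCUMULATOR.accept(a, "hello");
        a = COMBINER.apply(a, SUPPLIER.get());
        return expected.equals(a);
    }

    // Associativity: accumulating everything into one container must be equivalent
    // to accumulating into two containers and combining them.
    static boolean splittingHolds() {
        List<String> whole = SUPPLIER.get();
        ACCUMULATOR.accept(whole, "t1");
        ACCUMULATOR.accept(whole, "t2");

        List<String> left = SUPPLIER.get();
        ACCUMULATOR.accept(left, "t1");
        List<String> right = SUPPLIER.get();
        ACCUMULATOR.accept(right, "t2");
        return whole.equals(COMBINER.apply(left, right));
    }

    public static void main(String[] args) {
        System.out.println("identity holds: " + identityHolds());
        System.out.println("splitting holds: " + splittingHolds());
    }
}
```

Both checks pass for lists because addAll appends in order; a combiner that dropped or reordered elements would fail them, and a parallel stream would then give results that differ from a sequential one.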

The core functions of a Collector, as listed in the Collector Javadoc:

1. creation of a new result container (supplier())
2. incorporating a new data element into a result container (accumulator())
3. combining two result containers into one (combiner())
4. performing an optional final transform on the container (finisher())

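The three type parameters of Collector:
public interface Collector<T, A, R>

@param T the type of input elements to the reduction operation
@param A the mutable accumulation type of the reduction operation (often hidden as an implementation detail)
@param R the result type of the reduction operation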
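
To make the three type parameters concrete, here is a small sketch built with the JDK's Collector.of factory. The class name TypeParametersDemo and the method concat() are invented for this example; the choice of StringBuilder as the accumulation type is just one convenient assumption.

```java
import java.util.stream.Collector;
import java.util.stream.Stream;

public class TypeParametersDemo {
    // T = String (input element), A = StringBuilder (mutable accumulation type),
    // R = String (final result type).
    static Collector<String, StringBuilder, String> concat() {
        return Collector.of(
                StringBuilder::new,                     // supplier:    () -> A
                (builder, item) -> builder.append(item), // accumulator: (A, T) -> void
                (left, right) -> left.append(right),     // combiner:    (A, A) -> A
                StringBuilder::toString);                // finisher:    A -> R
    }

    static String joinAll(String... items) {
        return Stream.of(items).collect(concat());
    }

    public static void main(String[] args) {
        System.out.println(joinAll("hello", "world")); // prints "helloworld"
    }
}
```

The caller only ever sees T and R; the StringBuilder stays inside the collector, which is what the Javadoc means by A being "often hidden as an implementation detail".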

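The four functions and one set of characteristics in the Collector interface:
1. supplier: creates a new result container.
2. accumulator: folds one input element into a result container.
3. combiner: merges two partial result containers into one. It is needed for parallel execution, where, for example, four threads produce four partial results 1, 2, 3, 4 that then have to be merged; a sequential stream normally does not call it.
4. finisher: converts the container produced by the accumulator (or by the combiner) into the final result type.
5. characteristics: the set of characteristics that tells the stream how this collector may be evaluated.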
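
The order in which these functions are used can be sketched by driving a collector by hand. This is a simplified illustration, not the actual stream implementation: the class ManualCollect and both methods are hypothetical, and the two-part version only imitates what two worker threads would do.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiConsumer;
import java.util.stream.Collector;
import java.util.stream.Collectors;

public class ManualCollect {
    // Sequential case: one container, accumulate every element, then finish.
    static <T, A, R> R collectSequentially(List<T> input, Collector<? super T, A, R> collector) {
        A container = collector.supplier().get();
        BiConsumer<A, ? super T> accumulator = collector.accumulator();
        for (T item : input) {
            accumulator.accept(container, item);
        }
        return collector.finisher().apply(container);
    }

    // Imitates two workers: each half gets its own container, then the combiner merges them.
    static <T, A, R> R collectInTwoParts(List<T> input, Collector<? super T, A, R> collector) {
        int middle = input.size() / 2;
        BiConsumer<A, ? super T> accumulator = collector.accumulator();
        A left = collector.supplier().get();
        A right = collector.supplier().get();
        for (T item : input.subList(0, middle)) {
            accumulator.accept(left, item);
        }
        for (T item : input.subList(middle, input.size())) {
            accumulator.accept(right, item);
        }
        A merged = collector.combiner().apply(left, right);
        return collector.finisher().apply(merged);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("hello", "world", "helloworld", "java");
        System.out.println(collectSequentially(words, Collectors.joining(",")));
        System.out.println(collectInTwoParts(words, Collectors.joining(",")));
    }
}
```

Both calls print hello,world,helloworld,java, which is the equivalence between sequential and split execution that the identity and associativity constraints are meant to guarantee.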

The nested enum of Collector (it sets the characteristics of the reduction so that the stream can behave as required):

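    /**
     * Characteristics indicating properties of a {@code Collector}, which can
     * be used to optimize reduction implementations.
     */
    enum Characteristics {
        /**
         * Indicates that this collector is concurrent, meaning that the result
         * container can support the accumulator function being called
         * concurrently with the same result container from multiple threads.
         *
         * <p>If a {@code CONCURRENT} collector is not also {@code UNORDERED},
         * then it should only be evaluated concurrently if applied to an
         * unordered data source.
         */
        CONCURRENT,

        /**
         * Indicates that the collection operation does not commit to preserving
         * the encounter order of input elements. (This might be true if the
         * result container has no intrinsic order, such as a {@link Set}.)
         */
        UNORDERED,

        /**
         * Indicates that the finisher function is the identity function and
         * can be elided. If set, it must be the case that an unchecked cast
         * from A to R will succeed.
         */
        IDENTITY_FINISH
    }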
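
One way to get a feel for these flags is to ask some built-in collectors which characteristics they report. The sketch below is only an illustration: the class name CharacteristicsDemo is made up, and the exact sets returned are what OpenJDK implementations report rather than something the Javadoc fully promises (it does document toSet() as unordered and toConcurrentMap() as concurrent and unordered).

```java
import java.util.Set;
import java.util.stream.Collector.Characteristics;
import java.util.stream.Collectors;

public class CharacteristicsDemo {
    static Set<Characteristics> ofToList() {
        return Collectors.<String>toList().characteristics();
    }

    static Set<Characteristics> ofToSet() {
        return Collectors.<String>toSet().characteristics();
    }

    static Set<Characteristics> ofToConcurrentMap() {
        // Explicit type arguments keep the lambdas simple; key and value are the element itself.
        return Collectors.<String, String, String>toConcurrentMap(s -> s, s -> s).characteristics();
    }

    public static void main(String[] args) {
        System.out.println("toList:          " + ofToList());
        System.out.println("toSet:           " + ofToSet());
        System.out.println("toConcurrentMap: " + ofToConcurrentMap());
    }
}
```

A collector that reports CONCURRENT must use a thread-safe container such as ConcurrentHashMap, because a parallel stream is then allowed to call the accumulator on one shared container from several threads.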

3. Writing a custom Collector

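import java.util.*;
import java.util.function.*;
import java.util.stream.Collector;

// Collects elements into a Set, then converts the Set into a sorted Map of element -> element.
public class MySetCollector2<T> implements Collector<T, Set<T>, Map<T, T>> {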

    @Override
    public Supplier<Set<T>> supplier() {
        System.out.println("supplier invoked!");
        return HashSet<T>::new;
    }

    @Override
    public BiConsumer<Set<T>, T> accumulator() {
        System.out.println("accumulator invoked!");

        return (set, item) -> {
            System.out.println("accumulator: " +set+", "+ Thread.currentThread().getName());
            set.add(item);
        };
    }

    @Override
    public BinaryOperator<Set<T>> combiner() {
        System.out.println("combiner invoked!");
        return (set1, set2) -> {
            set1.addAll(set2);
            return set1;
        };
    }

    @Override
    public Function<Set<T>, Map<T, T>> finisher() {
        System.out.println("finisher invoked!");
        return set -> {
            Map<T, T> map = new TreeMap<>();
            set.stream().forEach(item -> map.put(item, item));
            return map;
        };
    }

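    @Override
    public Set<Characteristics> characteristics() {
        System.out.println("characteristics invoked!");
        // CONCURRENT is deliberately not declared: HashSet is not thread-safe, and with CONCURRENT
        // a parallel stream would share one HashSet across threads, which can throw
        // ConcurrentModificationException in the accumulator above.
        return Collections.unmodifiableSet(EnumSet.of(Characteristics.UNORDERED));
    }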

    public static void main(String[] args) {

        System.out.println(Runtime.getRuntime().availableProcessors());

        for(int i = 0; i < 99; ++i) {

            List<String> list = Arrays.asList("hello", "world", "welcome", "hello", "a",
                    "b", "c", "d", "e", "f", "g");
            Set<String> set = new HashSet<>();
            set.addAll(list);
            System.out.println("set: " + set);
            Map<String, String> map = set.parallelStream().collect(new MySetCollector2<>());
            System.out.println(map);
        }

    }
}

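To be honest, I rarely write custom collectors at work; the collectors that ship with the JDK cover almost every requirement, so the goal here is simply to understand how they work. A good starting point is to remember that a Collector consists of four functional interfaces plus one set of characteristics: each function performs a different task, and the characteristics tell the Stream how the collector may be run. Seen that way, it is probably easier to grasp.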
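
As an illustration of how far the built-in collectors go, the element-to-element sorted map produced by the custom collector above can also be obtained with the JDK's four-argument Collectors.toMap. This is a sketch under the assumption that the elements are Strings; the class name BuiltInEquivalent and the method toSelfMap are invented for the example.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

public class BuiltInEquivalent {
    // Maps every element to itself and returns the result sorted by key.
    static Map<String, String> toSelfMap(Collection<String> items) {
        return items.stream().collect(Collectors.toMap(
                Function.identity(),       // key: the element itself
                Function.identity(),       // value: the element itself
                (first, second) -> first,  // duplicate keys: keep the first value
                TreeMap::new));            // map factory: a sorted TreeMap
    }

    public static void main(String[] args) {
        System.out.println(toSelfMap(Arrays.asList("hello", "world", "welcome", "hello", "a")));
        // prints {a=a, hello=hello, welcome=welcome, world=world}
    }
}
```

The merge function takes over the de-duplication that the intermediate Set performed in the custom version, so no hand-written supplier, accumulator, combiner, or finisher is needed.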

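As a bonus, here is another example that helps in understanding collectors:
"JDK 8 Collector: the parallel-stream trap and how it works" (jdk8-Collector收集器之并行流陷阱与原理)

Summary

This part introduced and analyzed the Collector. Understanding the Collector is an extremely important step in understanding stream computation, and I hope this post helps more people.
The next part is Stream API Source Code Analysis, Part 2 (the Collectors factory class).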
