Java 8 Study Notes 2: Using Streams and Collectors

dchangjian

Category: java8

Article link: https://blog.csdn.net/dcj199411/article/details/80100924

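1. Defining methods in interfaces

In Java 8, an interface can define default methods and static methods. A default method is called on instances of classes that implement the interface; a static method is called directly on the interface itself.

Notes:

Multiple interfaces whose default methods have the same signature.

If a class implements several interfaces that declare default methods with the same signature, the compiler reports an error, and the class must override that default method.

After overriding it, to call one particular interface's default method, write InterfaceName.super.methodName().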
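The conflict rule above can be sketched as follows. The names Walker, Swimmer and Duck are hypothetical and exist only for this illustration; this is a minimal sketch, not code from the original notes.

```java
// Hypothetical interfaces used only for illustration.
interface Walker {
    default String move() { return "walk"; }
}

interface Swimmer {
    default String move() { return "swim"; }
}

// Without the override below this class would not compile, because
// Walker.move() and Swimmer.move() have the same signature.
class Duck implements Walker, Swimmer {
    @Override
    public String move() {
        // InterfaceName.super.method() picks one specific default implementation.
        return Walker.super.move() + " and " + Swimmer.super.move();
    }
}

class DefaultMethodConflictDemo {
    public static void main(String[] args) {
        System.out.println(new Duck().move()); // prints "walk and swim"
    }
}
```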

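Class implementations take priority over interface defaults

Suppose class C implements interface I and overrides its default method M, and class S extends C and also implements I. When M is called on an instance of S, the method that actually runs is the M defined in class C.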
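A small sketch of this rule, with the hypothetical names Greeter, BaseGreeter and SubGreeter standing in for I, C and S:

```java
// Greeter plays the role of interface I.
interface Greeter {
    default String greet() { return "default greet from Greeter"; }
}

// BaseGreeter plays the role of class C: it overrides the default method.
class BaseGreeter implements Greeter {
    @Override
    public String greet() { return "greet from BaseGreeter"; }
}

// SubGreeter plays the role of class S: it extends C and implements I again.
class SubGreeter extends BaseGreeter implements Greeter { }

class ClassWinsDemo {
    static String result() {
        return new SubGreeter().greet();
    }

    public static void main(String[] args) {
        System.out.println(result()); // prints "greet from BaseGreeter"
    }
}
```

The method inherited from the superclass wins over the interface default, so no override is needed in SubGreeter.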

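2. Streams (Stream)

A stream pipeline consists of three parts:
1. A source (for example an array, a collection, a generator function, or an I/O channel).
2. Zero or more intermediate operations, each of which transforms the stream into another stream.
3. A terminal operation.

Streams are lazy: the elements of the stream are consumed only when the terminal operation is initiated. In other words, the logic inside the intermediate operations runs only once the stream reaches a terminal operation.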
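To make the laziness visible, here is a minimal sketch that records the order of events in a list. The class name LazyStreamDemo and the log list are assumptions made for this demo.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyStreamDemo {
    // Returns a log of what happened, in order.
    static List<String> run() {
        List<String> log = new ArrayList<>();
        Stream<String> upper = Stream.of("a", "b")
                .map(s -> {
                    log.add("map " + s);
                    return s.toUpperCase();
                });
        // The pipeline exists, but no element has been mapped yet.
        log.add("pipeline built");
        // The terminal operation triggers the intermediate operations.
        List<String> out = upper.collect(Collectors.toList());
        log.add("result " + out);
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The log shows "pipeline built" before any "map ..." entry, which is the lazy behaviour described above.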

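2.1 Ways to obtain a Stream

The stream method of the Collection interface (a default method that returns a Stream instance)
The of method of the Stream interface
The static stream method of the Arrays class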
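The three ways listed above, side by side. The class name StreamSourcesDemo and the sample data are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class StreamSourcesDemo {
    // 1. Collection.stream()
    static long fromCollection() {
        List<String> words = Arrays.asList("a", "b", "c");
        return words.stream().count();
    }

    // 2. Stream.of(...)
    static List<Integer> fromOf() {
        return Stream.of(1, 2, 3).collect(Collectors.toList());
    }

    // 3. Arrays.stream(...); for an int[] this returns an IntStream
    static int fromArray() {
        int[] nums = {1, 2, 3};
        return Arrays.stream(nums).sum();
    }

    public static void main(String[] args) {
        System.out.println(fromCollection()); // 3
        System.out.println(fromOf());         // [1, 2, 3]
        System.out.println(fromArray());      // 6
    }
}
```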

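2.2 Common Stream methods

map: takes a Function that describes a mapping; it is an intermediate operation.
distinct: removes duplicate elements from the stream.
generate: takes a Supplier and returns a Stream.
findFirst (findAny): returns an Optional.
iterate: takes a seed and a UnaryOperator (a special case of Function) and returns a Stream. It is normally combined with Stream.limit; otherwise the stream is infinite.
Example: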

int res = Stream.iterate(1, item -> item + 2).limit(6).filter(item -> item > 2).mapToInt(item -> item * 2).skip(2).limit(2).sum();
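The same pipeline, split across lines with the elements that survive each step written as comments. The class name IterateDemo is only a wrapper for this walkthrough.

```java
import java.util.stream.Stream;

class IterateDemo {
    static int compute() {
        return Stream.iterate(1, item -> item + 2) // 1, 3, 5, 7, ... (infinite)
                .limit(6)                          // 1, 3, 5, 7, 9, 11
                .filter(item -> item > 2)          // 3, 5, 7, 9, 11
                .mapToInt(item -> item * 2)        // 6, 10, 14, 18, 22 (now an IntStream)
                .skip(2)                           // 14, 18, 22
                .limit(2)                          // 14, 18
                .sum();                            // 14 + 18 = 32
    }

    public static void main(String[] args) {
        System.out.println(compute()); // prints 32
    }
}
```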

flatMap: takes a Function that accepts a T and returns a Stream of R; it is an intermediate operation. It flattens the resulting R streams into a single Stream.
Example:

  Stream<List<Integer>> stream3 = Stream.of(Arrays.asList(1,2), Arrays.asList(3,4), Arrays.asList(5));
  stream3.flatMap(theList -> theList.stream()).map(i -> i * i).forEach(System.out::println);

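IntStream.summaryStatistics: a terminal operation that returns an IntSummaryStatistics object, from which we can read the number of elements in the stream, the minimum, the maximum, the average, and so on.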
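A short sketch of summaryStatistics; the class name SummaryStatsDemo and the sample numbers are arbitrary.

```java
import java.util.IntSummaryStatistics;
import java.util.stream.IntStream;

class SummaryStatsDemo {
    static IntSummaryStatistics stats() {
        // A single pass over the stream collects all of the statistics.
        return IntStream.of(4, 8, 15, 16, 23, 42).summaryStatistics();
    }

    public static void main(String[] args) {
        IntSummaryStatistics s = stats();
        System.out.println(s.getCount());   // 6
        System.out.println(s.getMin());     // 4
        System.out.println(s.getMax());     // 42
        System.out.println(s.getSum());     // 108
        System.out.println(s.getAverage()); // 18.0
    }
}
```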
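Collectors.groupingBy: returns a Collector that groups the elements of a stream. When it is passed to Stream.collect, collect returns a Map. An overload of groupingBy accepts a downstream collector, which makes two-level grouping possible and yields a Map<K1, Map<K2, List<T>>>.

Example: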

Student s1 = new Student("zhangsan", 20);
Student s2 = new Student("lisi", 20);
Student s3 = new Student("wangwu", 30);
Student s4 = new Student("zhaoliu", 40);

List<Student> list = Arrays.asList(s1,s2,s3,s4);
Map<Integer, List<Student>> map = list.stream().collect(Collectors.groupingBy(Student::getAge));
System.out.println(map);

System.out.println("====================");

Map<Integer, Long> map2 = list.stream().collect(Collectors.groupingBy(Student::getAge, Collectors.counting()));
System.out.println(map2);

System.out.println("====================");
// Two-level grouping: first by age, then by name
Map<Integer, Map<String, List<Student>>> map3 = list.stream().
                collect(Collectors.groupingBy(Student::getAge, Collectors.groupingBy(Student::getName)));

Collectors.partitioningBy(Predicate<? super T> predicate): partitions the elements of a stream. When it is passed to Stream.collect, collect returns a Map<Boolean, List<T>>. Partitioning is a special case of grouping in which there are always exactly two groups.
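A minimal partitioning sketch. It uses plain integers rather than the Student class from the earlier example, and the class name PartitionDemo is hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class PartitionDemo {
    static Map<Boolean, List<Integer>> partition() {
        List<Integer> ages = Arrays.asList(20, 20, 30, 40);
        // true  -> elements that satisfy the predicate
        // false -> elements that do not
        return ages.stream().collect(Collectors.partitioningBy(age -> age >= 30));
    }

    public static void main(String[] args) {
        Map<Boolean, List<Integer>> parts = partition();
        System.out.println(parts.get(true));  // [30, 40]
        System.out.println(parts.get(false)); // [20, 20]
    }
}
```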

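2.3 Stream summary and notes

Streams versus collections: a collection is concerned with the data and how it is stored, whereas a stream is concerned with computation over the data.
What streams have in common with iterators: a stream cannot be reused or consumed a second time. Each method in a chained stream call returns a new Stream object.
Streams can short-circuit: with a short-circuiting operation such as findFirst, no further elements are processed once a matching value has been found.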
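The single-use rule can be observed directly. This sketch assumes the OpenJDK implementation, which throws IllegalStateException when it detects reuse; the class name StreamReuseDemo is hypothetical.

```java
import java.util.stream.Stream;

class StreamReuseDemo {
    // Returns true if the second terminal operation on the same stream is rejected.
    static boolean secondUseFails() {
        Stream<String> s = Stream.of("a", "b");
        s.forEach(x -> { });     // the first terminal operation consumes the stream
        try {
            s.forEach(x -> { }); // attempt to reuse the same stream object
            return false;
        } catch (IllegalStateException e) {
            return true;         // the stream has already been operated upon or closed
        }
    }

    public static void main(String[] args) {
        System.out.println(secondUseFails());
    }
}
```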

In the following code, the lambda passed to map prints only item: hello, even though the word world also has length 5. Why?

List<String> list = Arrays.asList("hello", "world", "hello world");
        list.stream().map(item -> {
            System.out.println("item: " + item);
            return item.toUpperCase();
        }).filter(item -> item.length() == 5).findFirst().ifPresent(System.out::println);
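The reason is that elements travel through the pipeline one at a time: hello goes through map, then filter, and findFirst is satisfied immediately, so world is never pulled from the source. The sketch below records which elements reach map; the class name ShortCircuitDemo and the seen list are assumptions for this demo.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class ShortCircuitDemo {
    static List<String> run() {
        List<String> seen = new ArrayList<>();
        Optional<String> first = Arrays.asList("hello", "world", "hello world").stream()
                .map(item -> {
                    seen.add(item); // record every element that reaches map
                    return item.toUpperCase();
                })
                .filter(item -> item.length() == 5)
                .findFirst();       // short-circuits after the first match
        seen.add("-> " + first.orElse("none"));
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(run()); // [hello, -> HELLO]
    }
}
```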

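3. Collectors (Collector)

3.1 Collector concepts

Stream and Collector are closely related.
A few concepts to make clear:
1. collect is the collecting operation; it is an abstract method declared by the Stream interface.
2. A Collector is passed as the argument to the collect method.
3. Collector itself is an interface. It represents a mutable reduction operation that accumulates input elements into a mutable result container and, once all elements have been processed, optionally transforms the accumulated result into a final representation. It can be executed sequentially or in parallel.
4. The Collectors class provides common reduction implementations of Collector; it is a factory class.
5. To ensure that sequential and parallel execution produce equivalent results, the collector functions must satisfy two constraints: identity and associativity.
6. Identity: for any partially accumulated result a, a must be equivalent to combiner.apply(a, supplier.get()).
7. Associativity: r1 and r2 in the following code must be equivalent.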

A a1 = supplier.get();
accumulator.accept(a1, t1);
accumulator.accept(a1, t2);
R r1 = finisher.apply(a1);  // result without splitting

A a2 = supplier.get();
accumulator.accept(a2, t1);
A a3 = supplier.get();
accumulator.accept(a3, t2);
R r2 = finisher.apply(combiner.apply(a2, a3));  // result with splitting
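As a rough illustration, the two constraints can be checked against a concrete collector such as Collectors.toList(). The helper class CollectorLawsDemo and its method names are hypothetical, and checking a couple of sample values is of course not a proof that a collector satisfies the constraints in general.

```java
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class CollectorLawsDemo {
    // Identity: combining a partial result with an empty container changes nothing.
    static <T, A, R> boolean identityHolds(Collector<T, A, R> c, T t1) {
        A plain = c.supplier().get();
        c.accumulator().accept(plain, t1);
        A combined = c.supplier().get();
        c.accumulator().accept(combined, t1);
        combined = c.combiner().apply(combined, c.supplier().get());
        return c.finisher().apply(plain).equals(c.finisher().apply(combined));
    }

    // Associativity: splitting the input and combining gives an equivalent result.
    static <T, A, R> boolean associativityHolds(Collector<T, A, R> c, T t1, T t2) {
        A a1 = c.supplier().get();
        c.accumulator().accept(a1, t1);
        c.accumulator().accept(a1, t2);
        R r1 = c.finisher().apply(a1); // result without splitting

        A a2 = c.supplier().get();
        c.accumulator().accept(a2, t1);
        A a3 = c.supplier().get();
        c.accumulator().accept(a3, t2);
        R r2 = c.finisher().apply(c.combiner().apply(a2, a3)); // result with splitting
        return r1.equals(r2);
    }

    static boolean toListObeysBoth() {
        Collector<String, ?, List<String>> toList = Collectors.toList();
        return identityHolds(toList, "x") && associativityHolds(toList, "x", "y");
    }

    public static void main(String[] args) {
        System.out.println(toListObeysBoth());
    }
}
```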

First, look at the two collect methods declared by the Stream interface:

<R, A> R collect(Collector<? super T, A, R> collector);

<R> R collect(Supplier<R> supplier,
                  BiConsumer<R, ? super T> accumulator,
                  BiConsumer<R, R> combiner);
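For comparison, here is one way the two overloads might be used to build the same list. The three-argument call follows the pattern shown in the JDK Javadoc for collect; the class name TwoCollectFormsDemo is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class TwoCollectFormsDemo {
    // collect(Collector): the three functions are bundled inside the Collector.
    static List<String> viaCollector() {
        return Stream.of("hello", "world").collect(Collectors.toList());
    }

    // collect(supplier, accumulator, combiner): the functions are passed one by one.
    static List<String> viaThreeArguments() {
        Stream<String> stringStream = Stream.of("hello", "world");
        List<String> asList = stringStream.collect(
                ArrayList::new,     // supplier: creates the result container
                ArrayList::add,     // accumulator: adds one element to the container
                ArrayList::addAll); // combiner: merges two containers
        return asList;
    }

    public static void main(String[] args) {
        System.out.println(viaCollector());      // [hello, world]
        System.out.println(viaThreeArguments()); // [hello, world]
    }
}
```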

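Definition of the Collector interface:

// T: the type of each element in the stream
// A: the mutable accumulation type, for example ArrayList
// R: the result type of the reduction operation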
public interface Collector<T, A, R> {
    /**
     * A function that creates and returns a new mutable result container.
     * @return a function which returns a new, mutable result container
     */
    Supplier<A> supplier();
    /**
     * A function that folds(reduce) a value into a mutable result container.
     * @return a function which folds a value into a mutable result container
     */
    // Accumulates T into A
    BiConsumer<A, T> accumulator();

    /**
     * A function that accepts two partial results and merges them.  The
     * combiner function may fold state from one argument into the other and
     * return that, or may return a new result container.
     *
     * @return a function which combines two partial results into a combined
     * result
     */
    // (A, A) -> A
    BinaryOperator<A> combiner();

    /**
     * Perform the final transformation from the intermediate accumulation type
     * {@code A} to the final result type {@code R}.
     *
     * <p>If the characteristic {@code IDENTITY_TRANSFORM} is
     * set, this function may be presumed to be an identity transform with an
     * unchecked cast from {@code A} to {@code R}.
     *
     * @return a function which transforms the intermediate result to the final
     * result
     */
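    Function<A, R> finisher();

    // Returns the set of characteristics of this collector (see the enum below)
    Set<Characteristics> characteristics();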

    enum Characteristics {
        /**
         * Indicates that this collector is <em>concurrent</em>, meaning that
         * the result container can support the accumulator function being
         * called concurrently with the same result container from multiple
         * threads.
         *
         * <p>If a {@code CONCURRENT} collector is not also {@code UNORDERED},
         * then it should only be evaluated concurrently if applied to an
         * unordered data source.
         */
        CONCURRENT,

        /**
         * Indicates that the collection operation does not commit to preserving
         * the encounter order of input elements.  (This might be true if the
         * result container has no intrinsic order, such as a {@link Set}.)
         */
        UNORDERED,

        /**
         * Indicates that the finisher function is the identity function and
         * can be elided.  If set, it must be the case that an unchecked cast
         * from A to R will succeed.
         */
        IDENTITY_FINISH
    }
}

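The interface's Javadoc already explains this clearly.

supplier: returns a function that creates a mutable result container (for example an ArrayList).
accumulator: folds each new element, that is each element of the stream, into the mutable result container, for example via ArrayList.add.
combiner: takes two partial results and merges them; it is used in parallel streams.
finisher: converts the intermediate accumulation type into the result type (an optional operation).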
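The four functions can also be supplied directly through the static factory Collector.of, which is part of the JDK. The sketch below concatenates strings with a StringBuilder as the intermediate type A and String as the result type R, so the finisher has real work to do. The class name CollectorOfDemo is hypothetical.

```java
import java.util.stream.Collector;
import java.util.stream.Stream;

class CollectorOfDemo {
    static Collector<String, StringBuilder, String> concat() {
        return Collector.of(
                StringBuilder::new,                  // supplier: new mutable container
                (sb, s) -> sb.append(s),             // accumulator: fold one element in
                (left, right) -> left.append(right), // combiner: merge two partial results
                StringBuilder::toString);            // finisher: A (StringBuilder) -> R (String)
    }

    static String run() {
        return Stream.of("a", "b", "c").collect(concat());
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "abc"
    }
}
```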

3.2 A custom collector

A custom collector that returns a Set:

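import java.util.Arrays;
import java.util.Collections;
import java.util.EnumSet;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collector;

public class MySetCollector<T> implements Collector<T, Set<T>, Set<T>> {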

    // returns a function that creates a mutable result container
    @Override
    public Supplier<Set<T>> supplier() {
        System.out.println("supplier invoked!");
        return HashSet::new;
    }

    // fold a value into the result container
    @Override
    public BiConsumer<Set<T>, T> accumulator() {
        System.out.println("accumulator invoked!");
        //return Set::add;
        return (set, item) -> set.add(item);
    }

    // combines two partial results by merging them
    @Override
    public BinaryOperator<Set<T>> combiner() {
        System.out.println("combiner invoked!");
        return (set1, set2) -> {
            set1.addAll(set2);
            return set1;
        };
    }

    // a function which transforms the intermediate result into the final result
    // if characteristics() reports IDENTITY_FINISH, this method will not be invoked.
    @Override
    public Function<Set<T>, Set<T>> finisher() {
        System.out.println("finisher invoked!");
        return Function.identity();
    }


    @Override
    public Set<Characteristics> characteristics() {
        System.out.println("characteristics invoked!");
        // IDENTITY_FINISH: the intermediate result must be castable to the result type, otherwise the unchecked cast fails with an exception
        // IDENTITY_FINISH implies that the finisher() method will not be invoked!
        return Collections.unmodifiableSet(EnumSet.of(Characteristics.IDENTITY_FINISH, Characteristics.UNORDERED));
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("hello", "world", "hello world", "world", "hello");
        Set<String> set = list.stream().collect(new MySetCollector<>());
        System.out.println(set);
    }
}

Given the concepts introduced in section 3.1, the code is easy to follow.
Running the main method produces the following output:

supplier invoked!
accumulator invoked!
combiner invoked!
characteristics invoked!
characteristics invoked!
[world, hello, hello world]

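The finisher function was not executed, because the collector reports the IDENTITY_FINISH characteristic: the intermediate result type and the final result type are the same, so the container is simply cast to the result type.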
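The effect of the flag can be checked by counting how often the finisher function is applied, with and without IDENTITY_FINISH. This sketch assumes the OpenJDK ReferencePipeline implementation, which skips the finisher when the flag is present; the class name FinisherDemo and its helpers are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collector;
import java.util.stream.Stream;

class FinisherDemo {
    // Builds a Set collector; calls counts how often the finisher function is applied.
    static Collector<String, Set<String>, Set<String>> setCollector(
            AtomicInteger calls, Collector.Characteristics... flags) {
        return Collector.of(
                HashSet::new,
                (set, item) -> set.add(item),
                (a, b) -> { a.addAll(b); return a; },
                set -> { calls.incrementAndGet(); return set; },
                flags);
    }

    static int finisherCalls(Collector.Characteristics... flags) {
        AtomicInteger calls = new AtomicInteger();
        Stream.of("hello", "world", "hello").collect(setCollector(calls, flags));
        return calls.get();
    }

    public static void main(String[] args) {
        // Without the flag the finisher is applied once to the final container.
        System.out.println(finisherCalls());
        // With the flag the container is cast to the result type instead.
        System.out.println(finisherCalls(Collector.Characteristics.IDENTITY_FINISH));
    }
}
```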

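How to approach the source code:

1. Start from the declaration of the collect method in the Stream interface and find the class that implements it, ReferencePipeline.
2. Read the definition of the collect method in ReferencePipeline.
[Figure: definition of ReferencePipeline.collect]
3. Starting from the program's output, pick one point of interest and work backwards through the execution. Setting breakpoints at the key places and debugging makes the flow very clear.

That is why the program prints the output shown above.

This post is getting long, so I will stop here and continue in a later article.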
