A Detailed Guide to Java 8 Stream Processing

羽觞醉月11

Last modified 2022-03-06 19:25:13

Tags: java, programming languages, backend, stream

First published 2022-01-14 00:38:08

Article link: https://blog.csdn.net/qq_44494578/article/details/122485882


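I. Stream overview and use cases

First, a Java 8 stream is an entirely different concept from InputStream and OutputStream. A stream enhances iteration over collections so that aggregate operations (filtering, sorting, statistics, grouping) and bulk data operations can be expressed more efficiently. It provides a higher-level abstraction for computing over Java collections, in a declarative style similar to querying a database with SQL.

Combined with lambda expressions, streams also make code much quicker to write and easier to read.

Let's start with an example.

Create a list of students and fill it with sample data. Real business code is certainly more complex than this; it is only a simple illustration. We will pick out all the male students from the list in two different ways.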

// Approach 1: find all male students with a loop ("男" means male)
public void test1(){
    List<Student> boyList = new ArrayList<>();
    for (Student stu : studentList) {
        if (stu.getSex().equals("男")) {
            boyList.add(stu);
        }
    }
}
// Approach 2: find all male students with a stream
public void test2(){
    List<Student> boys = studentList.stream()
        .filter(a -> a.getSex().equals("男"))
        .collect(Collectors.toList());
}

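Seen this way, the two approaches seem to take about the same amount of work. Is that really so? Not quite.

This is only a single, simple condition. What if the condition gets more complex: not just sex, but also age, score, and other fields? Filtering with nested if statements then makes the code bloated, hard to read, and error-prone. Streams handle this kind of problem very conveniently:

// Improved: select male students older than 18 with a score above 90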
public void test2(){
    List<Student> collect = studentList.stream()
            .filter(a -> a.getSex().equals("男"))
            .filter(b->b.getAge()>18)
            .filter(c->c.getScore()>90)
            .collect(Collectors.toList());
}
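
The snippets above assume a Student class and a studentList that the article does not show. As a minimal, self-contained sketch of the combined filter, the following defines a hypothetical Student class (its fields and constructor are assumptions) and uses the English string "male" where the article's data uses "男":

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class StudentFilterDemo {

    // Hypothetical Student class; the article does not show its definition.
    static class Student {
        private final String name;
        private final String sex;
        private final int age;
        private final int score;

        Student(String name, String sex, int age, int score) {
            this.name = name;
            this.sex = sex;
            this.age = age;
            this.score = score;
        }

        String getName() { return name; }
        String getSex() { return sex; }
        int getAge() { return age; }
        int getScore() { return score; }
    }

    // Names of male students older than 18 with a score above 90.
    static List<String> topAdultBoys(List<Student> students) {
        return students.stream()
                .filter(s -> "male".equals(s.getSex()))
                .filter(s -> s.getAge() > 18)
                .filter(s -> s.getScore() > 90)
                .map(Student::getName)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Student> studentList = Arrays.asList(
                new Student("Li", "male", 19, 95),
                new Student("Wang", "female", 20, 98),
                new Student("Zhao", "male", 17, 99),
                new Student("Chen", "male", 21, 80));
        System.out.println(topAdultBoys(studentList)); // [Li]
    }
}
```

Each condition sits in its own filter() call, so adding or removing a condition is a one-line change.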

Now consider this example: compute the average score of the students at each age.

// Average score per age, approach 1: traditional loops
public void test1(){
    // Group by age
    Map<Integer, List<Student>> maps=new HashMap<>();
    for (Student stu : studentList) {
        List<Student> students = maps.computeIfAbsent(stu.getAge(), key -> new ArrayList<>());
        students.add(stu);
    }

    for (Map.Entry<Integer, List<Student>> entry: maps.entrySet()){
        int scores=0;
        for (Student student : entry.getValue()) {
            scores+=student.getScore();
        }
        // Cast to double so integer division does not truncate the average
        System.out.println(entry.getKey() + ": " + (double) scores / entry.getValue().size());
    }
}

// Average score per age, approach 2: stream
public void test2(){
    studentList.stream()
            .collect(Collectors.groupingBy(a->a.getAge(),       // group by age
                    Collectors.averagingInt(a->a.getScore())))  // average the scores
            .forEach((k, v)->System.out.println(k+": "+v));     // print
}

As this example shows, the difference is clear: the stream version is markedly cleaner and more readable than the traditional one.

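II. How streams execute

1. Characteristics of stream operations

A stream does not store data.
A stream does not modify its data source.
A stream cannot be reused.

2. Types of stream operations

All the operations on a stream, chained together, form a pipeline. A pipeline contains two kinds of operations:

Intermediate operations: calling an intermediate operation returns a new stream. Chaining several of them builds the execution pipeline of the stream. Note that these operations are not executed when they are added; they run only once a terminal operation is invoked.
Terminal operations: calling a terminal operation executes all the preceding intermediate operations, produces the result, and ends the use of the stream.

Execution order: each element in turn is passed through the intermediate operations and the terminal operation, rather than the whole stream being traversed by one method before the next method starts. (Stateful operations such as sorted() are the exception, since they must buffer elements first.)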
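
The element-by-element order described above can be observed by logging from each stage. This is a small sketch with an illustrative class name; it relies on a sequential stream, where writing to a local list from the lambdas is safe:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class ExecutionOrderDemo {

    // Records the order in which the pipeline stages see each element.
    static List<String> trace() {
        List<String> log = new ArrayList<>();
        Stream.of("a", "b", "c")
                .filter(s -> { log.add("filter " + s); return true; })
                .map(s -> { log.add("map " + s); return s.toUpperCase(); })
                .forEach(s -> log.add("forEach " + s));
        return log;
    }

    public static void main(String[] args) {
        // Prints: filter a, map a, forEach A, filter b, map b, forEach B, ...
        trace().forEach(System.out::println);
    }
}
```

Element "a" travels through filter, map, and forEach before "b" enters the pipeline at all.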

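Stateless: processing an element does not depend on the elements seen before it (for example filter() and map()).

Stateful: the operation has to keep state from earlier elements, and may need to see all elements before it can continue (for example distinct() and sorted()).

Non-short-circuiting: every element must be processed to obtain the final result.

Short-circuiting: the final result can be obtained as soon as certain matching elements are encountered, just as in A || B, where B need not be evaluated once A is true.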
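
To make these four terms concrete, here is a hedged sketch (class and method names are illustrative) that counts how many elements a short-circuiting findFirst() actually inspects on a sequential stream, and shows that the stateful sorted() consumes every upstream element before anything reaches the terminal operation:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

class ShortCircuitDemo {

    // Short-circuiting: findFirst() stops pulling elements after the first match.
    static int inspectedBeforeFirstMatch() {
        AtomicInteger inspected = new AtomicInteger();
        Stream.of(1, 2, 3, 4, 5, 6)
                .filter(n -> { inspected.incrementAndGet(); return n % 2 == 0; })
                .findFirst();
        return inspected.get();
    }

    // Stateful: sorted() buffers all elements before passing any downstream.
    static List<String> sortedTrace() {
        List<String> log = new ArrayList<>();
        Stream.of(3, 1, 2)
                .peek(n -> log.add("peek " + n))
                .sorted()
                .forEach(n -> log.add("forEach " + n));
        return log;
    }

    public static void main(String[] args) {
        System.out.println(inspectedBeforeFirstMatch()); // 2 (elements 1 and 2)
        System.out.println(sortedTrace());
        // [peek 3, peek 1, peek 2, forEach 1, forEach 2, forEach 3]
    }
}
```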

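3. Stream execution flow diagram:

In the diagram, the two yellow regions in the middle are intermediate nodes, also called lazy nodes. They do not run as soon as they are added to the stream; they run only when a terminal node, the red region, is reached. There can be several intermediate nodes but only one terminal node, and it must come last.

With only an intermediate operation and no terminal operation, the print statement never runs:

public static void main(String[] args) {
    // Intermediate node -> lazy node
    studentList.stream().filter(a->{
        System.out.println("hello");
        return true;
    });
}

// No output

Adding the terminal operation .toArray() below triggers the preceding intermediate operation:

public static void main(String[] args) {
    // Intermediate node -> lazy node
    studentList.stream().filter(a->{
        System.out.println("hello");
        return true;
    }).toArray();
}

// Output: hello (printed once for each element in studentList)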

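4. Streams cannot be reused

Calling a second operation on a stream object that has already been operated on fails with an error like this:

java.lang.IllegalStateException: stream has already been operated upon or closed

A stream can only be executed along a single sequential path; it cannot fork. Once filter1 has been applied to a stream, applying filter2 to that same stream object does not work.

To make the program run correctly, apply each operation to the stream returned by the previous one, or obtain a fresh stream from the data source.

Chained this way, the pipeline runs without problems.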
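
As a minimal sketch of the situation described in this subsection (the class and method names are illustrative, not taken from the article's code), the first method applies two filters to the same stream object and captures the resulting exception, while the second chains them correctly:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class StreamReuseDemo {

    // Applies two operations to the same stream object and returns the error message.
    static String reuseErrorMessage() {
        Stream<Integer> stream = Stream.of(1, 2, 3);
        stream.filter(n -> n > 1);         // first operation: the stream is now linked
        try {
            stream.filter(n -> n > 2);     // second operation on the same object fails
            return "no error";
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    // Each filter is applied to the stream returned by the previous call.
    static List<Integer> chained() {
        return Stream.of(1, 2, 3)
                .filter(n -> n > 1)
                .filter(n -> n > 2)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(reuseErrorMessage());
        System.out.println(chained()); // [3]
    }
}
```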

III. Usage

(1) Creating streams

Use the stream() and parallelStream() methods of Collection:

List<String> list = new ArrayList<>();
Stream<String> stream = list.stream(); // obtain a sequential stream
Stream<String> parallelStream = list.parallelStream(); // obtain a parallel stream

Use the stream() method of Arrays to turn an array into a stream:

Integer[] nums = new Integer[10];
Stream<Integer> stream = Arrays.stream(nums);

Use the static methods of Stream: of(), iterate(), generate()

Stream<Integer> stream = Stream.of(1,2,3,4,5,6);
 
Stream<Integer> stream2 = Stream.iterate(0, (x) -> x + 2).limit(6);
stream2.forEach(System.out::println); // 0 2 4 6 8 10
 
Stream<Double> stream3 = Stream.generate(Math::random).limit(2);
stream3.forEach(System.out::println);

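Use BufferedReader.lines() to turn each line of a file into a stream element:

// try-with-resources closes the reader after the stream has been consumed
try (BufferedReader reader = new BufferedReader(new FileReader("F:\\test_stream.txt"))) {
    Stream<String> lineStream = reader.lines();
    lineStream.forEach(System.out::println);
}

Use Pattern.splitAsStream() to split a string into a stream: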

Pattern pattern = Pattern.compile(",");
Stream<String> stringStream = pattern.splitAsStream("a,b,c,d");
stringStream.forEach(System.out::println);

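(2) Intermediate operations

filter(): returns a new stream containing only the elements that satisfy the predicate.
distinct(): removes duplicate elements.
skip(n): discards the first n elements and returns a stream of the rest; if the stream has n or fewer elements, an empty stream is returned.
limit(n): returns a stream truncated to at most the first n elements.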

Stream<Integer> stream = Stream.of(6, 4, 6, 7, 3, 9, 8, 10, 12, 14, 14);
stream.filter(s -> s > 5) //6 6 7 9 8 10 12 14 14
      .distinct() //6 7 9 8 10 12 14
      .skip(2) //9 8 10 12 14
      .limit(2) //9 8
      .forEach(System.out::println);

map(): takes a function as its argument, applies it to every element of the stream, and returns a new stream of the mapped results.

Stream<Integer> stream = Stream.of(1, 2, 3);
stream.map(a->a * a).forEach(System.out::print); // 149

flatMap(): maps each element of the stream to a stream of its own, then flattens all of those streams into a single new stream that contains all of their elements.

List<Integer> num1 = Arrays.asList(1, 2, 3);
List<Integer> num2 = Arrays.asList(4, 5, 6);
List<Integer> num3 = Arrays.asList(7, 8, 9);
List<List<Integer>> lists = Arrays.asList(num1, num2, num3);
Stream<Integer> outputStream = lists.stream().flatMap(l -> l.stream());
List<Integer> flatMapResult = outputStream.sorted().collect(Collectors.toList());
System.out.println(flatMapResult);
// [1, 2, 3, 4, 5, 6, 7, 8, 9]

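sorted(): natural ordering; the elements must implement the Comparable interface.
sorted(Comparator com): custom ordering with a user-supplied Comparator.

List<String> list = Arrays.asList("aa", "ff", "dd");
// String already implements the Comparable interface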
list.stream().sorted().forEach(System.out::println);// aa dd ff
 
Student s1 = new Student("aa", 10);
Student s2 = new Student("bb", 20);
Student s3 = new Student("aa", 30);
Student s4 = new Student("dd", 40);
List<Student> studentList = Arrays.asList(s1, s2, s3, s4);
 
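// Custom ordering: by name ascending, then by age ascending when names are equal
studentList.stream().sorted(
        (o1, o2) -> {
            if (o1.getName().equals(o2.getName())) {
                // Integer.compare avoids the overflow that subtraction can cause
                return Integer.compare(o1.getAge(), o2.getAge());
            } else {
                return o1.getName().compareTo(o2.getName());
            }
        }
).forEach(System.out::println);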

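peek(): performs an action on each element as it passes through and returns a new stream that still contains the original elements.
It is similar to map(), but map() takes a Function whose return value replaces the element, while peek() takes a Consumer that returns nothing.

Stream<Integer> stream = Stream.of(1, 2, 3);
// forEach is used as the terminal operation because, since Java 9, count() may skip peek() when the size is known from the source
stream.peek(System.out::print).forEach(e -> {}); // 123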

(3) Terminal operations

forEach(): internal iteration
forEachOrdered(): internal iteration in the encounter order of the stream

List<String> strs = Arrays.asList("a", "b", "c");
strs.stream().forEachOrdered(System.out::print);//abc
System.out.println();
strs.stream().forEach(System.out::print);//abc
System.out.println();
strs.parallelStream().forEachOrdered(System.out::print);//abc
System.out.println();
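strs.parallelStream().forEach(System.out::print);// e.g. bca (order not guaranteed)

Look at the first and second outputs first. They use stream(), which is a sequential stream, so the program runs serially and the elements are visited in the order in which they were added to the list.

The third and fourth outputs use parallelStream(), a parallel stream, which processes elements in parallel during internal iteration. forEachOrdered in the third call strictly follows the encounter order, whereas forEach on a parallel stream gives no ordering guarantee.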

toArray()

// Convert a stream of strings to a String array
List<String> list = Arrays.asList("A", "B", "C", "D");
String[] strArray = list.stream().toArray(String[]::new);

// Convert a stream of integers to an Integer array
List<Integer> intList = Arrays.asList(1,3,2,4);
Integer[] intArray = intList.stream().toArray(Integer[]::new);

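reduce(): repeatedly combines the elements of the stream to produce a single value.

List<Integer> list = Arrays.asList(1,2,3,4);
// The first argument of reduce() is the initial value. In the lambda, x is the accumulated result (initially that value) and y is the current element.
// For each element y, x + y is computed and becomes the new x; after the last element, x is the sum.
Integer sum = list.stream().reduce(0, (x, y) -> x + y);
System.out.println(sum); // 10

// Without an initial value, reduce() returns an Optional, which is empty for an empty stream
Integer sum2 = list.stream().reduce((x, y) -> x + y).get();
System.out.println(sum2); // 10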

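allMatch(): checks whether all elements match the predicate
noneMatch(): checks whether no elements match the predicate
anyMatch(): checks whether at least one element matches the predicate
findFirst(): returns the first element
findAny(): returns any element of the stream
count(): returns the number of elements in the stream
max(): returns the maximum element according to the given Comparator
min(): returns the minimum element according to the given Comparator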

List<Integer> list = Arrays.asList(1, 2, 3, 4, 5);
 
boolean allMatch = list.stream().allMatch(e -> e > 10); //false
boolean noneMatch = list.stream().noneMatch(e -> e > 10); //true
boolean anyMatch = list.stream().anyMatch(e -> e > 4);  //true
 
Integer findFirst = list.stream().findFirst().get(); //1
Integer findAny = list.stream().findAny().get(); // typically 1 on a sequential stream, but any element may be returned
 
long count = list.stream().count(); //5
Integer max = list.stream().max(Integer::compareTo).get(); //5
Integer min = list.stream().min(Integer::compareTo).get(); //1

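collect(): converts the stream into another form. It takes an implementation of the Collector interface that describes how to accumulate the elements of the stream into a result.

// The first list has no duplicate elements; the second list has duplicates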
List<String> list = Arrays.asList("A", "B", "C", "D");
List<String> dlist = Arrays.asList("A", "A", "C", "D");

1. Collectors.toList()

List<String> listResult = list.stream().collect(Collectors.toList());
System.out.println(listResult); // [A, B, C, D]

2. Collectors.toSet()

Set<String> setResult = list.stream().collect(Collectors.toSet());
System.out.println(setResult);  // [A, B, C, D]
Set<String> dsetResult = dlist.stream().collect(Collectors.toSet());
System.out.println(dsetResult); // [A, C, D]

3. Collectors.toCollection()

The toList and toSet collectors above do not let you choose the concrete collection type. If you need a specific implementation, use toCollection():

List<String> custListResult = list.stream().collect(Collectors.toCollection(LinkedList::new));
System.out.println(custListResult.getClass()); // class java.util.LinkedList

4. Collectors.toMap()

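toMap takes two arguments: the first is the keyMapper and the second is the valueMapper.

If the stream produces duplicate keys, the conversion throws an IllegalStateException. Passing a third argument, mergeFunction, to toMap resolves the conflict.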

Map<String, Integer> mapResult = list.stream()
                .collect(Collectors.toMap(Function.identity(), String::length));
System.out.println(mapResult);		// {A=1, B=1, C=1, D=1}
Map<String, Integer> dMapResult = dlist.stream()
    .collect(Collectors.toMap(Function.identity(), String::length, (item, identicalItem) -> item));
System.out.println(dMapResult);		// {A=1, C=1, D=1}
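
The duplicate-key behavior can be checked directly. The following sketch (class and method names are illustrative) shows the two-argument toMap failing on a list with a repeated element, and a merge function used to count occurrences instead of discarding the duplicate:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class ToMapDuplicateDemo {

    // True if the two-argument toMap rejects the input because of duplicate keys.
    static boolean failsWithoutMerge(List<String> items) {
        try {
            items.stream().collect(Collectors.toMap(Function.identity(), String::length));
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // The merge function adds the values of colliding keys, giving a count per element.
    static Map<String, Integer> countOccurrences(List<String> items) {
        return items.stream()
                .collect(Collectors.toMap(Function.identity(), s -> 1, Integer::sum));
    }

    public static void main(String[] args) {
        List<String> dlist = Arrays.asList("A", "A", "C", "D");
        System.out.println(failsWithoutMerge(dlist)); // true
        System.out.println(countOccurrences(dlist));  // A maps to 2, C and D to 1
    }
}
```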

5. Collectors.collectingAndThen()

collectingAndThen lets us apply one more transformation to the collected result.

List<String> collectAndThenResult = list.stream()
                .collect(Collectors.collectingAndThen(Collectors.toList(), l -> {return new ArrayList<>(l);}));
System.out.println(collectAndThenResult);	// [A, B, C, D]
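
One common use of the finishing step, shown here as a sketch with illustrative names, is to wrap the collected list so that callers cannot modify it:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

class CollectingAndThenDemo {

    // Collects into a list, then wraps it in an unmodifiable view.
    static List<String> readOnlyCopy(List<String> items) {
        return items.stream()
                .collect(Collectors.collectingAndThen(
                        Collectors.toList(), Collections::unmodifiableList));
    }

    public static void main(String[] args) {
        List<String> result = readOnlyCopy(Arrays.asList("A", "B", "C", "D"));
        System.out.println(result); // [A, B, C, D]
        try {
            result.add("E");
        } catch (UnsupportedOperationException e) {
            System.out.println("the result is read-only");
        }
    }
}
```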

6. Collectors.joining(): joining concatenates the elements of the stream:

String joinResult = list.stream().collect(Collectors.joining());
System.out.println(joinResult);	// ABCD
String joinResult1 = list.stream().collect(Collectors.joining(" "));
System.out.println(joinResult1);// A B C D
String joinResult2 = list.stream().collect(Collectors.joining(" ", "E","F"));
System.out.println(joinResult2);// EA B C DF

7. Collectors.counting()

counting counts the number of elements in the stream.
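
A short sketch of counting(), both on its own and as the downstream collector of groupingBy, where it is most often useful (the class and method names are illustrative):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class CountingDemo {

    // Total number of elements, as a Long.
    static long total(List<String> items) {
        return items.stream().collect(Collectors.counting());
    }

    // Number of occurrences of each distinct element.
    static Map<String, Long> frequency(List<String> items) {
        return items.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> dlist = Arrays.asList("A", "A", "C", "D");
        System.out.println(total(dlist));     // 4
        System.out.println(frequency(dlist)); // A maps to 2, C and D to 1
    }
}
```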

8. Collectors.summarizingDouble/Long/Int()

summarizingDouble/Long/Int computes summary statistics for the elements of the stream and returns a statistics object:

IntSummaryStatistics intResult = list.stream()
                .collect(Collectors.summarizingInt(String::length));
System.out.println(intResult);	// IntSummaryStatistics{count=4, sum=4, min=1, average=1.000000, max=1}

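9. Collectors.averagingDouble/Long/Int()

averagingDouble/Long/Int() computes the average of the values extracted from the stream elements.

10. Collectors.summingDouble/Long/Int()

summingDouble/Long/Int() computes the sum of the values extracted from the stream elements.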
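
Both collectors take a function that extracts a number from each element. A minimal sketch using string lengths (the sample words and names are illustrative):

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class AveragingSummingDemo {

    // averagingInt always produces a Double, even for int inputs.
    static double averageLength(List<String> words) {
        return words.stream().collect(Collectors.averagingInt(String::length));
    }

    // summingInt produces an Integer.
    static int totalLength(List<String> words) {
        return words.stream().collect(Collectors.summingInt(String::length));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "bb", "ccc");
        System.out.println(averageLength(words)); // 2.0
        System.out.println(totalLength(words));   // 6
    }
}
```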

11. Collectors.maxBy()/minBy()

maxBy()/minBy() return the maximum or minimum element of the stream according to the supplied Comparator, wrapped in an Optional:

Optional<String> maxByResult = list.stream().collect(Collectors.maxBy(Comparator.naturalOrder()));
System.out.println(maxByResult);	// Optional[D]

12. Collectors.groupingBy()

groupingBy groups the elements by some property and returns a Map:

Map<Integer, Set<String>> groupByResult = list.stream()
                .collect(Collectors.groupingBy(String::length, Collectors.toSet()));
System.out.println(groupByResult);	// {1=[A, B, C, D]}

13. Collectors.partitioningBy()

partitioningBy is a special case of groupingBy. It returns a Map keyed by boolean values, splitting the stream into two parts: the elements that satisfy the partitioningBy predicate and those that do not.

Map<Boolean, List<String>> partitionResult = list.stream()
                .collect(Collectors.partitioningBy(s -> s.length() > 3));
System.out.println(partitionResult); // {false=[A, B, C, D], true=[]}