Java 8 (Part 4 of 6) - Stream API: Stream Operations on Collections

Jomurphys

Last modified 2022-06-27 21:12:10

First published 2021-09-09 14:45:25

Article link: https://blog.csdn.net/HugMua/article/details/120144528

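Stream enhances what Collection objects can do, providing convenient and efficient operations on their elements; the process resembles workers handling items on an assembly line. Working with a Stream takes three steps: creating the stream, applying intermediate operations, and applying a terminal operation. The earlier steps execute only once a terminal operation is added. Elements are generally processed one at a time: each element passes through all the operations before the next element is handled, except where a stateful operation such as sorted needs to see every element first.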
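
To make the lazy, element-by-element behavior concrete, here is a minimal sketch. The class name LazyPipelineDemo and the helper trace are made up for illustration, and the sketch assumes a sequential stream; it simply records the order in which each stage sees each element.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class LazyPipelineDemo {
    // Records which stage touched which element, in order.
    static List<String> trace(boolean withTerminalOp) {
        List<String> log = new ArrayList<>();
        Stream<Integer> pipeline = Stream.of(1, 2, 3)
                .filter(n -> { log.add("filter " + n); return n != 2; })
                .map(n -> { log.add("map " + n); return n * 10; });
        if (withTerminalOp) {
            // Only now do filter and map actually run.
            pipeline.forEach(n -> log.add("forEach " + n));
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(trace(false)); // []
        System.out.println(trace(true));
        // [filter 1, map 1, forEach 10, filter 2, filter 3, map 3, forEach 30]
    }
}
```

Without a terminal operation the log stays empty. With forEach, element 1 travels through filter, map, and forEach before element 2 is even filtered.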

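Stateless: processing an element is not affected by the elements that came before it.
Stateful: the operation can only continue once it has seen all the elements (or must remember earlier ones).
Non-short-circuiting: every element must be processed to obtain the result.
Short-circuiting: the final result can be obtained as soon as certain matching elements are encountered.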

Intermediate operation	Stateless	unordered, filter, map, mapToInt, mapToLong, mapToDouble, flatMap, flatMapToInt, flatMapToLong, flatMapToDouble, peek
Intermediate operation	Stateful	distinct, sorted, limit, skip
Terminal operation	Non-short-circuiting	forEach, forEachOrdered, toArray, reduce, collect, max, min, count
Terminal operation	Short-circuiting	anyMatch, allMatch, noneMatch, findFirst, findAny
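
The difference between these categories can be observed directly. The following is a small sketch under the assumption of a sequential stream; ShortCircuitDemo and its method names are hypothetical. It counts how many elements a short-circuiting anyMatch inspects, and shows that the stateful sorted holds back every element until it has seen them all.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

class ShortCircuitDemo {
    // anyMatch stops at the first match, so later elements are never inspected.
    static int inspectedByAnyMatch() {
        AtomicInteger seen = new AtomicInteger();
        Stream.of(1, 4, 23, 54)
                .peek(n -> seen.incrementAndGet())
                .anyMatch(n -> n > 3);
        return seen.get();
    }

    // sorted is stateful: nothing flows downstream until all elements have arrived.
    static List<String> sortedBarrier() {
        List<String> log = new ArrayList<>();
        Stream.of(3, 1, 2)
                .peek(n -> log.add("before " + n))
                .sorted()
                .forEach(n -> log.add("after " + n));
        return log;
    }

    public static void main(String[] args) {
        System.out.println(inspectedByAnyMatch()); // 2
        System.out.println(sortedBarrier());
        // [before 3, before 1, before 2, after 1, after 2, after 3]
    }
}
```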

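Creating a Stream

A stream is obtained from a data source, such as a collection, an array, another stream, or a file.

Like an iterator, a stream can be traversed only once. A parallel stream splits its content into several chunks and processes each chunk on a different thread.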
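
The single-use rule is easy to verify. In this sketch (SingleUseDemo is a made-up name) a second terminal operation on the same stream fails with an IllegalStateException:

```java
import java.util.stream.Stream;

class SingleUseDemo {
    // Returns true if consuming the same stream a second time is rejected.
    static boolean secondUseFails() {
        Stream<String> letters = Stream.of("a", "b");
        letters.forEach(s -> { });      // first traversal consumes the stream
        try {
            letters.forEach(s -> { });  // second traversal is not allowed
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(secondUseFails()); // true
    }
}
```

If the data is needed again, create a fresh stream from the source rather than keeping the old Stream object around.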

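Source	Method	Purpose	Notes
Collection instance	stream ()	Returns a sequential stream	Functionally the same as parallelStream(); the difference is single-threaded versus multi-threaded performance.
Collection instance	parallelStream()	Returns a parallel stream (many pitfalls, use sparingly)	Functionally the same as stream(); the difference is single-threaded versus multi-threaded performance.
Arrays (static)	stream ( T [ ] ) stream ( int [ ] ) stream ( double [ ] ) stream ( long [ ] )	Converts an array into a stream
Stream (static)	of ( T... )	Creates a stream from the given values	Value-based stream
	empty ()	Creates an empty stream	Value-based stream
	iterate ()	Applies a function repeatedly to each newly generated value	Function-generated stream
	generate ()	Generates values from a supplier function	Function-generated stream
Files (static)	lines ()	Converts a file into a stream	Each element is one line of the file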

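//Needs: import java.util.*; import java.util.stream.*; import java.nio.file.*;
//Collection
List<String> list = new ArrayList<>();
list.stream();
list.parallelStream();

Set<String> set = new HashSet<>();
set.stream();

Map<String, Integer> map = new HashMap<>();
map.keySet().stream();
map.values().stream();
map.entrySet().stream();

//Arrays
int[] aa = {1,2,3,4};
Arrays.stream(aa,1,3).forEach(System.out::println);    //prints 2, 3 (start index inclusive, end index exclusive)

//Stream
Stream<String> stream4 = Stream.of("aaa","bbb","ccc");    //an array can also be passed, since varargs are an array
Stream<int[]> stream5 = Stream.of(aa);    //but not a primitive array: the whole array becomes a single element instead of a stream of its elements
Stream<Integer> stream6 = Stream.iterate(0, n -> n + 2);    //first element is 0, each following element adds 2
Stream<Double> stream7 = Stream.generate(Math::random);    //elements are random doubles between 0 and 1
Stream<Integer> stream8 = Stream.generate(() -> 1);    //every element is 1

//Files
Stream<String> stream9 = Files.lines(Paths.get("data.txt"));    //may throw IOException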
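
Two practical points follow from the snippets above: streams built with iterate() or generate() are infinite and usually need a limit(), and a stream from Files.lines() holds an open file, so it is best closed with try-with-resources. The sketch below illustrates both; the class name, method names, and the temporary file are assumptions for the example, not part of the original code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class StreamSourcesDemo {
    // iterate() is infinite, so limit() bounds it before collecting.
    static List<Integer> firstEvens(int howMany) {
        return Stream.iterate(0, n -> n + 2)
                .limit(howMany)
                .collect(Collectors.toList());
    }

    // Writes three lines to a temporary file and counts them via Files.lines().
    static long lineCountOfTempFile() {
        try {
            Path tmp = Files.createTempFile("stream-demo", ".txt");
            Files.write(tmp, Arrays.asList("a", "b", "c"));
            try (Stream<String> lines = Files.lines(tmp)) { // closes the file handle
                return lines.count();
            } finally {
                Files.delete(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstEvens(5));          // [0, 2, 4, 6, 8]
        System.out.println(lineCountOfTempFile());  // 3
    }
}
```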

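Concatenating Streams

Stream.concat is a static method that joins two streams into one; once they have been concatenated, the original streams can no longer be operated on.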

Stream<String> stream1 = Stream.of("张三");
Stream<String> stream2 = Stream.of("李四");
Stream<String> stream3 = Stream.concat(stream1, stream2);   //after concatenation, stream1 and stream2 can no longer be used
stream3.forEach(System.out::println);   //prints: 张三, 李四

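Intermediate Operations

Each intermediate operation does its processing and returns a new stream for the following operations to use.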

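Intermediate operation	Purpose	Description
filter (Predicate predicate)	Filter	Keeps the elements that satisfy the given condition
distinct ()	Deduplicate	Removes duplicate elements, compared with equals()
limit (long maxSize)	Truncate	Takes the first few elements, capping the maximum count
skip (long n)	Skip	Discards the first n elements; returns an empty stream if there are fewer than n
map (Function mapper)	Transform	Applies the function to each element, which may change the element type; if the function itself returns a stream, the result is a stream of streams
flatMap (Function mapper)	Flatten	Maps each element to a stream (a stream of streams), then joins those streams into a single stream
peek (Consumer action)	Peek	Performs an action on each element as it passes through, without changing the stream; mainly useful for debugging
sorted () sorted (Comparator comparator)	Sort	Natural ordering, or ordering by the given comparator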

ArrayList<String> list1 = new ArrayList<>();
Collections.addAll(list1, "张无忌", "赵敏", "张三丰", "周芷若", "张峰");

//Filtering: filter()
list1.stream()
        .filter(s -> s.startsWith("张")) //keep elements that start with "张"
        .filter(s -> s.length() == 3) //keep elements of length 3
        .forEach(System.out::println);  //prints: 张无忌, 张三丰
//Truncating: limit()
list1.stream().limit(3).forEach(System.out::println);    //prints: 张无忌, 赵敏, 张三丰
//Skipping: skip()
list1.stream().skip(3).forEach(System.out::println);     //prints: 周芷若, 张峰
//Transforming: map()
List<String> list2 = Arrays.asList("1", "2", "3");
Stream<Integer> stream = list2.stream().map(Integer::parseInt); //element type changes from String to Integer
//Sorting: sorted()
Stream.of(21, 45, 3, 64, 234, 1).sorted().forEach(System.out::println); //natural ordering: 1, 3, 21, 45, 64, 234
Stream.of(21, 45, 3, 64, 234, 1).sorted((o1, o2) -> o2 - o1).forEach(System.out::println);   //comparator ordering: the same values printed in descending order
//Deduplicating: distinct()
Stream.of(1, 3, 2, 2, 1, 3).distinct().forEach(System.out::println);    //prints: 1, 3, 2
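
The snippets above do not exercise flatMap or peek, so here is a short sketch of both. FlatMapPeekDemo and its sample sentences are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FlatMapPeekDemo {
    // flatMap: each sentence becomes a stream of words, and those streams are joined into one.
    static List<String> words(List<String> sentences) {
        return sentences.stream()
                .flatMap(sentence -> Arrays.stream(sentence.split(" ")))
                .collect(Collectors.toList());
    }

    // peek: observe the elements between two stages without changing them.
    static List<Integer> seenBeforeDoubling() {
        List<Integer> seen = new ArrayList<>();
        List<Integer> doubled = Stream.of(1, 2, 3)
                .peek(seen::add)
                .map(n -> n * 2)
                .collect(Collectors.toList());
        System.out.println(doubled); // [2, 4, 6]
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(words(Arrays.asList("to be", "or not"))); // [to, be, or, not]
        System.out.println(seenBeforeDoubling());                    // [1, 2, 3]
    }
}
```

With map instead of flatMap, words would produce a Stream<Stream<String>>, which is rarely what is wanted.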

Terminal Operations

A Stream must end with a terminal operation; otherwise nothing flows through it and none of the operations are executed.

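Category	Terminal operation	Purpose
Collect	collect ()	Gathers all the elements; Collectors provides many ready-made collectors
Match	allMatch (Predicate predicate)	Returns true if every element satisfies the predicate, otherwise false
	noneMatch (Predicate predicate)	Returns true if no element satisfies the predicate, otherwise false
	anyMatch (Predicate predicate)	Returns true if at least one element satisfies the predicate, otherwise false
Find	findFirst ()	Returns the first element (as an Optional)
Find	findAny ()	With stream(), typically returns the first element; with parallelStream(), may return any element
Count	count ()	Returns the number of elements
Extremum	max (Comparator comparator)	Returns the maximum
Extremum	min (Comparator comparator)	Returns the minimum
Reduce	reduce ()	Reduces all the values in the stream to a single value; count, min, and max are special cases of reduction
Iterate	forEach ()	Consumes each element of the final data
Iterate	forEachOrdered ()	Consumes each element of the final data in encounter order, even on a parallel stream
Array	toArray ()	Converts the stream's elements into an array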

List<String> list = Arrays.asList("A", "B", "C", "D");
//Iterating: forEach()
list.stream().forEach(System.out::println); //prints: A, B, C, D
//Counting: count()
System.out.println(list.stream().count());  //prints: 4
//Matching: allMatch(), anyMatch(), noneMatch()
boolean b = Stream.of(1, 4, 23, 54, 12, 6, 33)
//                .allMatch(num -> num > 3) //do all elements match?
//                .anyMatch(num -> num > 3) //does any element match?
        .noneMatch(num -> num > 3); //do no elements match?
//Finding: findFirst()
Optional<Integer> first = Stream.of(33, 11, 22, 5).findFirst();
int num = first.get(); //33
//Extremes: max(), min()
Optional<Integer> max = Stream.of(33, 11, 22, 5).max(Integer::compareTo);   //natural ordering
Optional<Integer> min = Stream.of(33, 11, 22, 5).min((o1, o2) -> o2 - o1);   //reversed comparator, so the "minimum" is the largest value
System.out.println(max.get());  //33
System.out.println(min.get());  //33
//Reducing: reduce()
//First argument: the identity (initial) value   Second argument: how to combine values
//Step 1: a is the identity value, b is the first element of the stream; combine them
//Step 2: a is the previous result, b is the second element of the stream; combine them
//and so on...
Integer reduce = Stream.of(1, 2, 3, 4).reduce(0, (a, b) -> {
    System.out.println("a=" + a + ",b=" + b);
    return a + b;
});
System.out.println(reduce); //10
//Converting to an array: toArray()
Object[] objects = Stream.of("哈哈", "呵呵").toArray(); //an Object array is inconvenient to work with
String[] strings = Stream.of("哈哈", "呵呵").toArray(String[]::new);    //a String array
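
The examples above call get() on the returned Optional, which throws NoSuchElementException when the stream is empty. A slightly more defensive sketch follows; the class and method names are hypothetical. It also shows the one-argument reduce, which returns an Optional because there is no identity value to fall back on.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

class TerminalOpsDemo {
    // findFirst() on an empty stream yields an empty Optional; orElse supplies a default.
    static int firstOrDefault(List<Integer> numbers, int fallback) {
        return numbers.stream().findFirst().orElse(fallback);
    }

    // reduce without an identity value: empty input gives Optional.empty().
    static Optional<Integer> sum(List<Integer> numbers) {
        return numbers.stream().reduce(Integer::sum);
    }

    public static void main(String[] args) {
        List<Integer> empty = Collections.emptyList();
        System.out.println(firstOrDefault(Arrays.asList(33, 11), -1)); // 33
        System.out.println(firstOrDefault(empty, -1));                 // -1
        System.out.println(sum(Arrays.asList(1, 2, 3, 4)).get());      // 10
        System.out.println(sum(empty).isPresent());                    // false
    }
}
```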

Collecting Data with collect()

Argument	Description
Collectors.toList () Collectors.toSet () Collectors.toCollection ()	Collects all elements of the stream into a List, a Set, or a Collection of your choice
Collectors.counting ()	Number of elements
Collectors.maxBy ()	Maximum
Collectors.minBy ()	Minimum
Collectors.summingInt () Collectors.summingDouble () Collectors.summingLong ()	Sum
Collectors.averagingInt () Collectors.averagingDouble () Collectors.averagingLong ()	Average
Collectors.groupingBy () Collectors.partitioningBy ()	Grouping and partitioning	groupingBy groups by an attribute of the elements (any number of groups); partitioningBy splits by a predicate (only two groups, true and false)
Collectors.joining ()	Concatenation

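Stream<Integer> stream = Stream.of(1, 4, 2, 3, 4);
//Collect into a List
List<Integer> list = stream.collect(Collectors.toList());
//Collect into a HashSet, which removes duplicates; a stream cannot be reused, so a new one is created here
HashSet<Integer> hashSet = Stream.of(1, 4, 2, 3, 4).collect(Collectors.toCollection(HashSet::new));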

List<Person> list = Arrays.asList(new Person("张三", 18), new Person("李四", 20), new Person("王五", 22), new Person("赵六", 31));
//Element count: Collectors.counting()
Long number = list.stream().collect(Collectors.counting());
System.out.println("Number of elements: " + number);  //Number of elements: 4
//Maximum: Collectors.maxBy()
Optional<Person> max = list.stream().collect(Collectors.maxBy((o1, o2) -> o1.age - o2.age));
System.out.println("Maximum age: " + max.get());    //Maximum age: Person{name='赵六', age=31}
//Minimum: Collectors.minBy()
Optional<Person> min = list.stream().collect(Collectors.minBy((o1, o2) -> o1.age - o2.age));
System.out.println("Minimum age: " + min.get());    //Minimum age: Person{name='张三', age=18}
//Sum: Collectors.summingInt()
Integer totalAge = list.stream().collect(Collectors.summingInt(value -> value.age));
System.out.println("Total age: " + totalAge);    //Total age: 91
//Average: Collectors.averagingInt()
Double averageAge = list.stream().collect(Collectors.averagingInt(value -> value.age));
System.out.println("Average age: " + averageAge);  //Average age: 22.75
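
The Person class used above is not shown in the article, so the sketch below assumes a minimal version with name and age (the pinyin names are stand-ins for the sample data). It shows two optional refinements: Comparator.comparingInt in place of the subtraction-based comparator, and Collectors.summarizingInt, which computes count, sum, min, max, and average in one pass.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

class CollectorStatsDemo {
    // Hypothetical minimal Person; the article's own class is not shown.
    static class Person {
        final String name;
        final int age;
        Person(String name, int age) { this.name = name; this.age = age; }
        String getName() { return name; }
        int getAge() { return age; }
    }

    static final List<Person> PEOPLE = Arrays.asList(
            new Person("Zhang San", 18), new Person("Li Si", 20),
            new Person("Wang Wu", 22), new Person("Zhao Liu", 31));

    // maxBy with a key-extracting comparator instead of o1.age - o2.age.
    static String oldestName() {
        Optional<Person> oldest = PEOPLE.stream()
                .collect(Collectors.maxBy(Comparator.comparingInt(Person::getAge)));
        return oldest.map(Person::getName).orElse("nobody");
    }

    // One collector that gathers count, sum, min, max, and average together.
    static IntSummaryStatistics ageStats() {
        return PEOPLE.stream().collect(Collectors.summarizingInt(Person::getAge));
    }

    public static void main(String[] args) {
        System.out.println(oldestName());            // Zhao Liu
        IntSummaryStatistics stats = ageStats();
        System.out.println(stats.getSum());          // 91
        System.out.println(stats.getAverage());      // 22.75
    }
}
```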

//Grouping: Collectors.groupingBy()
List<Person> list2 = Arrays.asList(new Person("张三", 18), new Person("张三", 31), new Person("李四", 19), new Person("李四", 33));
//People with the same name form one group
Map<String, List<Person>> nameGroup = list2.stream().collect(Collectors.groupingBy(Person::getName));
/*
    People named [李四]: 李四19, 李四33
    People named [张三]: 张三18, 张三31
 */
nameGroup.forEach((name, people) -> System.out.println("People named [" + name + "]: " + people));
//Age below 20 goes into the "young" group, everyone else into "middle-aged"
Map<String, List<Person>> ageGroup = list2.stream().collect(Collectors.groupingBy(person -> {
    if (person.getAge() < 20)
        return "young";
    else
        return "middle-aged";
}));
/*
    People in age group [young]: 张三18, 李四19
    People in age group [middle-aged]: 张三31, 李四33
 */
ageGroup.forEach((age, people) -> System.out.println("People in age group [" + age + "]: " + people));
//Multi-level grouping
/*
    Among people named [李四]:
        People in age group [young]: 李四19
        People in age group [middle-aged]: 李四33
    Among people named [张三]:
        People in age group [young]: 张三18
        People in age group [middle-aged]: 张三31
 */
Map<String, Map<String, List<Person>>> map = list2.stream().collect(Collectors.groupingBy(Person::getName, Collectors.groupingBy(person -> {    //the second argument can be any collector from Collectors
    if (person.getAge() < 20)
        return "young";
    else
        return "middle-aged";
})));
map.forEach((key, value) -> {
    System.out.println("Among people named [" + key + "]:");
    value.forEach((k, v) -> {
        System.out.println("\tPeople in age group [" + k + "]: " + v);
    });
});
//Partitioning: partitioningBy(); its overload with a downstream collector also supports multi-level grouping
Map<Boolean, List<Person>> map2 = list2.stream().collect(Collectors.partitioningBy(person -> person.age > 20));
/*
    Partition [false] contains: 张三18, 李四19
    Partition [true] contains: 张三31, 李四33
 */
map2.forEach((aBoolean, people) -> System.out.println("Partition [" + aBoolean + "] contains: " + people));
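
As noted in the comments above, the second argument of groupingBy and partitioningBy can be any collector, not only another grouping. The sketch below assumes the same hypothetical minimal Person (with pinyin stand-in names) and uses counting() and mapping() as downstream collectors; a TreeMap is requested only so that the printed key order is predictable.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class GroupingDownstreamDemo {
    // Hypothetical minimal Person; the article's own class is not shown.
    static class Person {
        final String name;
        final int age;
        Person(String name, int age) { this.name = name; this.age = age; }
        String getName() { return name; }
        int getAge() { return age; }
    }

    static final List<Person> PEOPLE = Arrays.asList(
            new Person("Zhang San", 18), new Person("Zhang San", 31),
            new Person("Li Si", 19), new Person("Li Si", 33));

    // Group by name, but keep only the number of people in each group.
    static Map<String, Long> countByName() {
        return PEOPLE.stream().collect(
                Collectors.groupingBy(Person::getName, TreeMap::new, Collectors.counting()));
    }

    // Partition by age > 20, keeping just the ages rather than whole Person objects.
    static Map<Boolean, List<Integer>> agesByOver20() {
        return PEOPLE.stream().collect(
                Collectors.partitioningBy((Person p) -> p.getAge() > 20,
                        Collectors.mapping(Person::getAge, Collectors.toList())));
    }

    public static void main(String[] args) {
        System.out.println(countByName());            // {Li Si=2, Zhang San=2}
        System.out.println(agesByOver20().get(true));  // [31, 33]
        System.out.println(agesByOver20().get(false)); // [18, 19]
    }
}
```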

//Concatenating: joining()
//No arguments: plain concatenation
//One argument: a delimiter
//Three arguments: delimiter, prefix, suffix
System.out.println("--------------------------");
String join1 = list.stream().map(person -> person.getName()).collect(Collectors.joining()); //张三李四王五赵六
String join2 = list.stream().map(person -> person.getName()).collect(Collectors.joining("-"));  //张三-李四-王五-赵六
String join3 = list.stream().map(person -> person.getName()).collect(Collectors.joining("-","prefix","suffix"));    //prefix张三-李四-王五-赵六suffix

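Numeric Streams

Computing over the elements of a Stream incurs boxing and unboxing costs, so numeric streams were introduced to reduce that overhead.

Method	Description
mapToInt (T -> int) mapToDouble (T -> double) mapToLong (T -> long)	Converts a Stream into a numeric stream	IntStream DoubleStream LongStream
boxed ()	Converts a numeric stream back into a Stream
rangeClosed (int, int) rangeClosed (long, long)	Closed interval, both endpoints included: rangeClosed(1, 100) covers [1, 100] and yields 1...100
range (int, int) range (long, long)	Half-open interval, end value excluded: range(1, 100) covers [1, 100) and yields 1...99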

IntStream intStream = Stream.of(1, 2, 3, 4).mapToInt(num -> num.intValue());//Integer::intValue works as well
Stream<Integer> stream = intStream.boxed();
//Sum of the numbers from 1 to 100
int sum = IntStream.rangeClosed(1, 100).sum();
System.out.println(sum);    //5050
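
The difference between range and rangeClosed, and the use of mapToInt to avoid boxing when summing, can be checked with a few lines. NumericStreamDemo and its helpers are names chosen for this sketch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.IntStream;

class NumericStreamDemo {
    // range excludes the end value.
    static int[] halfOpen() {
        return IntStream.range(1, 5).toArray();
    }

    // rangeClosed includes the end value.
    static int[] closed() {
        return IntStream.rangeClosed(1, 5).toArray();
    }

    // mapToInt yields an IntStream, whose sum() works on primitive ints.
    static int totalLength(List<String> words) {
        return words.stream().mapToInt(String::length).sum();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(halfOpen())); // [1, 2, 3, 4]
        System.out.println(Arrays.toString(closed()));   // [1, 2, 3, 4, 5]
        System.out.println(totalLength(Arrays.asList("aaa", "bb", "c"))); // 6
    }
}
```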

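Parallel Streams (many pitfalls, use sparingly)

Parallel processing splits one large task into many small ones to take full advantage of multi-core CPUs; under the hood it is built on the Fork/Join framework. It is not always the fastest option, however: boxing overhead, for example, can cancel out the gains, and operations such as forEach on a parallel stream do not guarantee the order of the elements.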

public static void main(String[] args) {
    //Create parallel streams
    List<Integer> list = Arrays.asList(1, 2, 3, 4, 5);
    Stream<Integer> parallelStream1 = list.parallelStream();//obtain a parallel stream directly
    Stream<Integer> parallelStream2 = list.stream().parallel();//convert a sequential stream into a parallel one

    //Show which threads do the work
    parallelStream1.filter(integer -> {
        System.out.println(Thread.currentThread());
        return integer > 10;
    }).count();

    //Performance comparison (the times in the comments are the author's measurements, in milliseconds)
    System.out.println("for loop, elapsed ms: " + getTime(num -> {    //elapsed: 1066
        long sum = 0;    //long, because the total overflows an int
        for (int x = 0; x < num; x++) {
            sum += x;
        }
    }));
    System.out.println("sequential stream, elapsed ms: " +getTime(num ->{    //elapsed: 263
        LongStream.rangeClosed(0,num).reduce(0,Long::sum);
    }));
    System.out.println("parallel stream, elapsed ms: " +getTime(num ->{    //elapsed: 140
        LongStream.rangeClosed(0,num).parallel().reduce(0,Long::sum);
    }));
}

//Measures how long it takes to sum the numbers up to 500 million with a for loop, a sequential stream, and a parallel stream
public static String getTime(Consumer<Long> consumer) {
    long start = System.currentTimeMillis();
    consumer.accept(5_0000_0000L);
    long end = System.currentTimeMillis();
    return "" + (end - start);
}
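
On the ordering caveat mentioned earlier: when order matters on a parallel stream, forEachOrdered is one option. The sketch below (ParallelOrderDemo is a hypothetical name) relies on the documented behavior that forEachOrdered processes elements one at a time in encounter order, at the cost of giving up some of the parallel speed-up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.IntStream;

class ParallelOrderDemo {
    // forEachOrdered handles elements one at a time in encounter order,
    // so appending to a plain list gives 1, 2, ..., 100 even on a parallel stream.
    static List<Integer> viaForEachOrdered() {
        List<Integer> out = new ArrayList<>();
        IntStream.rangeClosed(1, 100).parallel().forEachOrdered(out::add);
        return out;
    }

    // Reference result produced without any parallelism.
    static List<Integer> sequential() {
        List<Integer> out = new ArrayList<>();
        for (int i = 1; i <= 100; i++) {
            out.add(i);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(viaForEachOrdered().equals(sequential())); // true
    }
}
```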

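Ways to achieve thread safety:

① Use a synchronized lock

② Use a thread-safe collection

③ Use boxed() to convert to a Stream, then gather the results with collect() or toArray() instead of adding to a shared collection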

//Not thread-safe
List<Integer> list1 = new ArrayList<>();
IntStream.rangeClosed(1, 1000).parallel().forEach(list1::add);
System.out.println("Not thread-safe: "+list1.size());    //usually not 1000 (elements get lost, and an exception is also possible)
//Option 1: synchronized block
List<Integer> list2 = new ArrayList<>();
Object obj = new Object();
IntStream.rangeClosed(1, 1000).parallel().forEach(value -> {
    synchronized (obj) {
        list2.add(value);
    }
});
System.out.println("Option 1: "+list2.size());    //1000
//Option 2: thread-safe collections
Vector<Integer> vector = new Vector<>();    //Vector is a synchronized collection
List<Integer> synchronizedList = Collections.synchronizedList(new ArrayList<>());   //Collections returns a thread-safe wrapper around the List
IntStream.rangeClosed(1, 1000).parallel().forEach(vector::add);
IntStream.rangeClosed(1, 1000).parallel().forEach(synchronizedList::add);
System.out.println("Option 2, Vector: "+vector.size());    //1000
System.out.println("Option 2, Collections: "+synchronizedList.size());    //1000
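
The third option has no example above, so here is one possible sketch of it (the class name is made up). Instead of adding to a shared list from several threads, the parallel stream is boxed and the library assembles the result through collect(), which needs no locking in user code and keeps encounter order.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ParallelCollectDemo {
    // Option 3: let collect() build the list; no shared mutable state is touched by the lambda.
    static List<Integer> collectInParallel() {
        return IntStream.rangeClosed(1, 1000)
                .parallel()
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> result = collectInParallel();
        System.out.println(result.size());   // 1000
        System.out.println(result.get(0));   // 1
        System.out.println(result.get(999)); // 1000
    }
}
```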