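Java 8: Collecting Data with Streams (Reduction and Summarization, Grouping, Multilevel Grouping, Grouping with Statistics, Grouping with Mapping)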

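This article covers how Java 8 collectors (Collectors) process stream data: reduction (maximum, minimum, count, sum, average), grouping (by property, by condition, multilevel grouping), and partitioning (by condition, multilevel partitioning), and how to use these predefined collectors for data processing and analysis.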

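[Q&A] What is a collector for?

A collector is passed to collect and defines how the stream's elements are accumulated into the result.

[Q&A] What is a predefined collector?

One that can be created from the factory methods provided by the Collectors class.

[Q&A] What are predefined collectors used for?

1. Reducing and summarizing elements into a single value
2. Grouping elements
3. Partitioning elements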


Reduction and summarization

Maximum (max), minimum (min)

// Combined forms (each statement below is a standalone alternative)
Dish dish = menu.stream().collect(Collectors.reducing(BinaryOperator.maxBy(Comparator.comparingInt(Dish::getCalories)))).get();
Comparator<Dish> comparator = Comparator.comparingInt(Dish::getCalories);
Optional<Dish> res = menu.stream().collect(Collectors.maxBy(comparator));
Dish dish = res.get();

// Preferred
Dish dish = menu.stream().max(Comparator.comparingInt(Dish::getCalories)).get();
Dish dish = menu.stream().min(Comparator.comparingInt(Dish::getCalories)).get();

Three comparator factories are available: Comparator.comparingInt, Comparator.comparingDouble, and Comparator.comparingLong.
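The snippets above call get() directly on the Optional, which throws NoSuchElementException when the stream is empty. The following is a minimal sketch of handling that case with orElse. The simplified Dish class, the sample calorie values, and the helper names maxName and minName are assumptions made for illustration; they are not taken from the post's own Dish definition.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

final class MaxMinSketch {
    // Hypothetical stand-in for the post's Dish class (name and calories only).
    static final class Dish {
        private final String name;
        private final int calories;
        Dish(String name, int calories) { this.name = name; this.calories = calories; }
        String getName() { return name; }
        int getCalories() { return calories; }
    }

    // Assumed sample data, chosen to be consistent with the outputs quoted in the post.
    static final List<Dish> MENU = Arrays.asList(
            new Dish("pork", 800), new Dish("beef", 700), new Dish("chicken", 400),
            new Dish("french fries", 530), new Dish("rice", 350), new Dish("season fruit", 120),
            new Dish("pizza", 550), new Dish("prawns", 400), new Dish("salmon", 450));

    static final Comparator<Dish> BY_CALORIES = Comparator.comparingInt(Dish::getCalories);

    // Name of the highest-calorie dish, or "(none)" for an empty menu.
    static String maxName(List<Dish> menu) {
        Optional<Dish> max = menu.stream().max(BY_CALORIES);
        return max.map(Dish::getName).orElse("(none)");
    }

    // Name of the lowest-calorie dish, or "(none)" for an empty menu.
    static String minName(List<Dish> menu) {
        return menu.stream().min(BY_CALORIES).map(Dish::getName).orElse("(none)");
    }

    public static void main(String[] args) {
        System.out.println(maxName(MENU));                           // pork
        System.out.println(minName(MENU));                           // season fruit
        System.out.println(maxName(Collections.<Dish>emptyList()));  // (none)
    }
}
```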

Count (count)

long howManyDishes = menu.stream().collect(Collectors.counting());

// Preferred
long howManyDishes = menu.stream().count();

Sum (summingInt, summingLong, summingDouble, sum)

int total    = Dish.menu.stream().collect(Collectors.summingInt(Dish::getCalories));
Long total   = Dish.menu.stream().collect(Collectors.summingLong(Dish::getCalories));
Double total = Dish.menu.stream().collect(Collectors.summingDouble(Dish::getCalories));

// Preferred
int total    = menu.stream().mapToInt(Dish::getCalories).sum();

Three primitive mappings are available: mapToInt, mapToLong, and mapToDouble.

Average (averagingInt, averagingLong, averagingDouble, average)

double avg = Dish.menu.stream().collect(Collectors.averagingInt(Dish::getCalories));
Double avg = Dish.menu.stream().collect(Collectors.averagingLong(Dish::getCalories));
Double avg = Dish.menu.stream().collect(Collectors.averagingDouble(Dish::getCalories));

// Preferred
double avg = Dish.menu.stream().mapToInt(Dish::getCalories).average().getAsDouble();
Double avg = Dish.menu.stream().mapToInt(Dish::getCalories).average().getAsDouble();

Summary statistics (summarizingInt, summarizingLong, summarizingDouble)

// Example result: IntSummaryStatistics{count=9, sum=4300, min=120, average=477.777778, max=800}
IntSummaryStatistics statistics     = Dish.menu.stream().collect(Collectors.summarizingInt(Dish::getCalories));
LongSummaryStatistics statistics    = Dish.menu.stream().collect(Collectors.summarizingLong(Dish::getCalories));
DoubleSummaryStatistics statistics  = Dish.menu.stream().collect(Collectors.summarizingDouble(Dish::getCalories));
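As a runnable illustration of the summarizing collectors, the sketch below computes count, sum, min, max, and average in a single pass and reads them back through the IntSummaryStatistics getters. The simplified Dish class and the calorie values are assumptions, picked so the totals match the example result quoted above.

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.stream.Collectors;

final class SummarySketch {
    // Hypothetical stand-in for the post's Dish class (name and calories only).
    static final class Dish {
        private final String name;
        private final int calories;
        Dish(String name, int calories) { this.name = name; this.calories = calories; }
        int getCalories() { return calories; }
    }

    // Assumed sample data (nine dishes, 4300 calories in total).
    static final List<Dish> MENU = Arrays.asList(
            new Dish("pork", 800), new Dish("beef", 700), new Dish("chicken", 400),
            new Dish("french fries", 530), new Dish("rice", 350), new Dish("season fruit", 120),
            new Dish("pizza", 550), new Dish("prawns", 400), new Dish("salmon", 450));

    // One traversal of the stream yields all five statistics.
    static IntSummaryStatistics calorieStats(List<Dish> menu) {
        return menu.stream().collect(Collectors.summarizingInt(Dish::getCalories));
    }

    public static void main(String[] args) {
        IntSummaryStatistics stats = calorieStats(MENU);
        System.out.println(stats.getCount());   // 9
        System.out.println(stats.getSum());     // 4300
        System.out.println(stats.getMin());     // 120
        System.out.println(stats.getMax());     // 800
        System.out.println(stats.getAverage()); // roughly 477.78
    }
}
```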

Joining strings (joining)

Internally, joining uses a StringBuilder to append the generated strings one by one.

String shortMenu = menu.stream().map(Dish::getName).collect(Collectors.joining());
// Result: porkbeefchickenfrench friesriceseason fruitpizzaprawnssalmon

String shortMenu = menu.stream().map(Dish::getName).collect(Collectors.joining(", "));
// Result: pork, beef, chicken, french fries, rice, season fruit, pizza, prawns, salmon
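Besides the no-argument and delimiter-only forms, Collectors.joining also has a three-argument overload that takes a delimiter, a prefix, and a suffix. A minimal sketch, using a plain list of names instead of the post's menu (the class and method names are hypothetical):

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

final class JoiningSketch {
    // Joins the names with ", " and wraps the result in square brackets.
    static String joinNames(List<String> names) {
        return names.stream().collect(Collectors.joining(", ", "[", "]"));
    }

    public static void main(String[] args) {
        System.out.println(joinNames(Arrays.asList("pork", "beef", "chicken"))); // [pork, beef, chicken]
        // With no elements, only the prefix and suffix are emitted.
        System.out.println(joinNames(Collections.<String>emptyList()));          // []
    }
}
```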

Generalized reduction and summarization (reducing)

In fact, all the collectors discussed so far are just special cases of a reduction process that can be defined with the reducing factory method.

// Reduction computing the total calories of the menu
int totalCalories = menu.stream().collect(Collectors.reducing(0, Dish::getCalories, (i, j) -> i + j));
int totalCalories = menu.stream().collect(Collectors.reducing(0, Dish::getCalories, Integer::sum));
int totalCalories = menu.stream().map(Dish::getCalories).reduce(Integer::sum).get();
int totalCalories = menu.stream().mapToInt(Dish::getCalories).sum();
// We prefer the last solution: it is the most concise and probably the most readable. It also performs best, because IntStream lets us avoid auto-unboxing operations.

// Joining strings with reducing
String shortMenu = menu.stream().collect(Collectors.reducing("", Dish::getName, (s1, s2) -> s1 + s2 ));
String shortMenu = menu.stream().map(Dish::getName).collect(Collectors.reducing((s1, s2) -> s1 + s2)).get();
String shortMenu = menu.stream().map(Dish::getName).collect(Collectors.joining());
// In practice, for both readability and performance, we always recommend the joining collector.
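To make the "special case of reducing" idea concrete, here is a small sketch that expresses counting as a reduction: every element is mapped to 1L and the ones are summed. It uses a plain list of strings rather than the post's menu, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class ReducingSketch {
    // Counting written as a reduction: identity 0L, map each element to 1L, combine with Long::sum.
    static long countViaReducing(List<String> items) {
        return items.stream().collect(Collectors.reducing(0L, item -> 1L, Long::sum));
    }

    // The predefined collector that the reduction above mirrors.
    static long countViaCounting(List<String> items) {
        return items.stream().collect(Collectors.counting());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("pork", "beef", "chicken");
        System.out.println(countViaReducing(names)); // 3
        System.out.println(countViaCounting(names)); // 3
    }
}
```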


Grouping (groupingBy)

Simple grouping (groupingBy)

Grouping by property

// Group the dishes by type
Map<Dish.Type, List<Dish>> dishesByType = menu.stream().collect(Collectors.groupingBy(Dish::getType));
// {FISH=[prawns, salmon], OTHER=[french fries, rice, season fruit, pizza], MEAT=[pork, beef, chicken]}

Grouping by condition

// Group the dishes by a custom condition:
// 400 calories or fewer is "low calorie" (DIET)
// more than 400 and up to 700 calories is "normal" (NORMAL)
// more than 700 calories is "high calorie" (FAT)
public enum CaloricLevel {DIET, NORMAL, FAT}
Map<CaloricLevel, List<Dish>> dishesByCaloricLevel = Dish.menu.stream().collect(Collectors.groupingBy(dish -> {
    if (dish.getCalories() <= 400) {
        return CaloricLevel.DIET;
    } else if (dish.getCalories() <= 700) {
        return CaloricLevel.NORMAL;
    } else {
        return CaloricLevel.FAT;
    }
}));
// {FAT=[pork], DIET=[chicken, rice, season fruit, prawns], NORMAL=[beef, french fries, pizza, salmon]}
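The same classification lambda is written out several times in this post. One possible way to avoid the repetition, sketched below, is to move it into a named method and pass it as a method reference wherever a classifier is expected. The helper levelOf, the simplified Dish class, and the sample data are assumptions for illustration, not part of the original code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class CaloricLevelSketch {
    enum CaloricLevel { DIET, NORMAL, FAT }

    // Hypothetical stand-in for the post's Dish class (name and calories only).
    static final class Dish {
        private final String name;
        private final int calories;
        Dish(String name, int calories) { this.name = name; this.calories = calories; }
        int getCalories() { return calories; }
        @Override public String toString() { return name; }
    }

    // Assumed sample data, chosen to be consistent with the outputs quoted in the post.
    static final List<Dish> MENU = Arrays.asList(
            new Dish("pork", 800), new Dish("beef", 700), new Dish("chicken", 400),
            new Dish("french fries", 530), new Dish("rice", 350), new Dish("season fruit", 120),
            new Dish("pizza", 550), new Dish("prawns", 400), new Dish("salmon", 450));

    // Hypothetical helper holding the thresholds in one place.
    static CaloricLevel levelOf(Dish dish) {
        if (dish.getCalories() <= 400) {
            return CaloricLevel.DIET;
        } else if (dish.getCalories() <= 700) {
            return CaloricLevel.NORMAL;
        }
        return CaloricLevel.FAT;
    }

    static Map<CaloricLevel, List<Dish>> byLevel(List<Dish> menu) {
        return menu.stream().collect(Collectors.groupingBy(CaloricLevelSketch::levelOf));
    }

    public static void main(String[] args) {
        Map<CaloricLevel, List<Dish>> grouped = byLevel(MENU);
        System.out.println(grouped.get(CaloricLevel.DIET));   // [chicken, rice, season fruit, prawns]
        System.out.println(grouped.get(CaloricLevel.NORMAL)); // [beef, french fries, pizza, salmon]
        System.out.println(grouped.get(CaloricLevel.FAT));    // [pork]
    }
}
```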

[Practice notes] Java 8: grouping a List result set into a Map result set

Multilevel grouping (groupingBy + groupingBy)

Map<Dish.Type, Map<CaloricLevel, List<Dish>>> dishesByTypeCaloric = Dish.menu.stream().collect(Collectors.groupingBy(Dish::getType, Collectors.groupingBy(dish -> {
    if (dish.getCalories() <= 400) {
        return CaloricLevel.DIET;
    } else if (dish.getCalories() <= 700) {
        return CaloricLevel.NORMAL;
    } else {
        return CaloricLevel.FAT;
    }
})));

// {
// MEAT = {DIET =[chicken], NORMAL =[beef],FAT =[pork]},
// FISH = {DIET =[prawns], NORMAL =[salmon]},
// OTHER = {DIET =[rice, season fruit],NORMAL =[french fries, pizza]}
// }

Grouping with statistics (groupingBy + counting, summingInt, averagingInt, maxBy, minBy, summarizingInt)

// Group by type and count the dishes in each group
Map<Dish.Type, Long> countByType = Dish.menu.stream().collect(Collectors.groupingBy(Dish::getType, Collectors.counting()));
// {MEAT=3, FISH=2, OTHER=4}

// Group by type and sum the calories in each group
Map<Dish.Type, Integer> sumByType = Dish.menu.stream().collect(Collectors.groupingBy(Dish::getType, Collectors.summingInt(Dish::getCalories)));

// Group by type and average the calories in each group
Map<Dish.Type, Double> avgByType = Dish.menu.stream().collect(Collectors.groupingBy(Dish::getType, Collectors.averagingInt(Dish::getCalories)));

// Group by type and find the max (or min) of each group
Map<Dish.Type, Optional<Dish>> maxOptByType = Dish.menu.stream().collect(Collectors.groupingBy(Dish::getType, Collectors.maxBy(Comparator.comparingInt(Dish::getCalories))));
// {FISH=Optional[salmon], OTHER=Optional[pizza], MEAT=Optional[pork]}
Map<Dish.Type, Dish> maxByType = Dish.menu.stream().collect(Collectors.groupingBy(Dish::getType, Collectors.collectingAndThen(Collectors.maxBy(Comparator.comparingInt(Dish::getCalories)), Optional::get)));
// {FISH=salmon, OTHER=pizza, MEAT=pork}

// Group by type and compute summary statistics for each group
Map<Dish.Type, IntSummaryStatistics> summarizing = Dish.menu.stream().collect(Collectors.groupingBy(Dish::getType, Collectors.summarizingInt(Dish::getCalories)));
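The downstream collectors above can be tried end to end with the sketch below. It also shows one alternative to collectingAndThen(maxBy(...), Optional::get) that is not in the original post: Collectors.toMap with BinaryOperator.maxBy as the merge function, which never produces an Optional. The simplified Dish class, the sample data, and the method names are assumptions.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Collectors;

final class GroupStatsSketch {
    enum Type { MEAT, FISH, OTHER }

    // Hypothetical stand-in for the post's Dish class.
    static final class Dish {
        private final String name;
        private final int calories;
        private final Type type;
        Dish(String name, int calories, Type type) { this.name = name; this.calories = calories; this.type = type; }
        String getName() { return name; }
        int getCalories() { return calories; }
        Type getType() { return type; }
        @Override public String toString() { return name; }
    }

    // Assumed sample data, chosen to be consistent with the outputs quoted in the post.
    static final List<Dish> MENU = Arrays.asList(
            new Dish("pork", 800, Type.MEAT), new Dish("beef", 700, Type.MEAT),
            new Dish("chicken", 400, Type.MEAT), new Dish("french fries", 530, Type.OTHER),
            new Dish("rice", 350, Type.OTHER), new Dish("season fruit", 120, Type.OTHER),
            new Dish("pizza", 550, Type.OTHER), new Dish("prawns", 400, Type.FISH),
            new Dish("salmon", 450, Type.FISH));

    static final Comparator<Dish> BY_CALORIES = Comparator.comparingInt(Dish::getCalories);

    static Map<Type, Long> countByType() {
        return MENU.stream().collect(Collectors.groupingBy(Dish::getType, Collectors.counting()));
    }

    static Map<Type, Integer> caloriesByType() {
        return MENU.stream().collect(Collectors.groupingBy(Dish::getType, Collectors.summingInt(Dish::getCalories)));
    }

    // The post's approach: unwrap the Optional produced by maxBy.
    static Map<Type, Dish> maxByTypeUnwrapped() {
        return MENU.stream().collect(Collectors.groupingBy(Dish::getType,
                Collectors.collectingAndThen(Collectors.maxBy(BY_CALORIES), Optional::get)));
    }

    // Alternative (not from the post): keep the larger dish whenever two share a key.
    static Map<Type, Dish> maxByTypeViaToMap() {
        return MENU.stream().collect(
                Collectors.toMap(Dish::getType, Function.identity(), BinaryOperator.maxBy(BY_CALORIES)));
    }

    public static void main(String[] args) {
        System.out.println(countByType().get(Type.MEAT));       // 3
        System.out.println(caloriesByType().get(Type.OTHER));   // 1550
        System.out.println(maxByTypeUnwrapped().get(Type.FISH)); // salmon
        System.out.println(maxByTypeViaToMap().get(Type.FISH));  // salmon
    }
}
```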

Grouping with mapping (mapping)

// Group by type, map each element to a new value, and store the results in the chosen data structure
Map<Dish.Type, Set<CaloricLevel>> levelsByType = Dish.menu.stream().collect(Collectors.groupingBy(Dish::getType, Collectors.mapping(dish -> {
    if (dish.getCalories() <= 400) {
        return CaloricLevel.DIET;
    } else if (dish.getCalories() <= 700) {
        return CaloricLevel.NORMAL;
    } else {
        return CaloricLevel.FAT;
    }
}, Collectors.toSet())));
// Variant 2: close with }, Collectors.toCollection(HashSet::new)))); to choose the Set implementation explicitly
// {OTHER=[DIET, NORMAL], MEAT=[DIET, NORMAL, FAT], FISH=[DIET, NORMAL]}
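A smaller runnable sketch of mapping as a downstream collector: each dish is mapped to its name, and the names are collected per type, first into a List and then into a TreeSet via toCollection to get sorted output. The simplified Dish class, the sample data, and the method names are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

final class MappingSketch {
    enum Type { MEAT, FISH, OTHER }

    // Hypothetical stand-in for the post's Dish class (name and type only).
    static final class Dish {
        private final String name;
        private final Type type;
        Dish(String name, Type type) { this.name = name; this.type = type; }
        String getName() { return name; }
        Type getType() { return type; }
    }

    // Assumed sample data.
    static final List<Dish> MENU = Arrays.asList(
            new Dish("pork", Type.MEAT), new Dish("beef", Type.MEAT), new Dish("chicken", Type.MEAT),
            new Dish("french fries", Type.OTHER), new Dish("rice", Type.OTHER),
            new Dish("season fruit", Type.OTHER), new Dish("pizza", Type.OTHER),
            new Dish("prawns", Type.FISH), new Dish("salmon", Type.FISH));

    // Names per type, in encounter order.
    static Map<Type, List<String>> namesByType() {
        return MENU.stream().collect(
                Collectors.groupingBy(Dish::getType, Collectors.mapping(Dish::getName, Collectors.toList())));
    }

    // Names per type, sorted alphabetically because the container is a TreeSet.
    static Map<Type, Set<String>> sortedNamesByType() {
        return MENU.stream().collect(
                Collectors.groupingBy(Dish::getType,
                        Collectors.mapping(Dish::getName, Collectors.toCollection(TreeSet::new))));
    }

    public static void main(String[] args) {
        System.out.println(namesByType().get(Type.OTHER));       // [french fries, rice, season fruit, pizza]
        System.out.println(sortedNamesByType().get(Type.OTHER)); // [french fries, pizza, rice, season fruit]
    }
}
```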


Partitioning (partitioningBy)

Partitioning is a special case of grouping: the elements are split into two groups, one for true and one for false.

Simple partitioning (partitioningBy)

// Split the menu into vegetarian and non-vegetarian dishes
Map<Boolean, List<Dish>> partitionedMenu = menu.stream().collect(Collectors.partitioningBy(Dish::isVegetarian));
// { false=[pork, beef, chicken, prawns, salmon], true=[french fries, rice, season fruit, pizza] }
List<Dish> vegetarianDishes = partitionedMenu.get(true);
Multilevel partitioning (partitioningBy + partitioningBy)

// partitioningBy as the downstream collector
Map<Boolean, Map<Boolean, List<Dish>>> collect = menu.stream().collect(Collectors.partitioningBy(Dish::isVegetarian, Collectors.partitioningBy(d -> d.getCalories() > 500)));
// { false={false=[chicken, prawns, salmon], true=[pork, beef]}, true={false=[rice, season fruit], true=[french fries, pizza]} }

Partitioning + grouping (partitioningBy + groupingBy)

// groupingBy as the downstream collector
Map<Boolean, Map<Dish.Type, List<Dish>>> vegetarianDishesByType = menu.stream().collect(Collectors.partitioningBy(Dish::isVegetarian, Collectors.groupingBy(Dish::getType)));
// {false={FISH=[prawns, salmon], MEAT=[pork, beef, chicken]}, true={OTHER=[french fries, rice, season fruit, pizza]}}

Partitioning with statistics (partitioningBy)

// max
Map<Boolean, Dish> max = menu.stream().collect(Collectors.partitioningBy(Dish::isVegetarian, Collectors.collectingAndThen(Collectors.maxBy(Comparator.comparingInt(Dish::getCalories)), Optional::get)));
// {false=pork, true=pizza}

// count
Map<Boolean, Long> count = menu.stream().collect(Collectors.partitioningBy(Dish::isVegetarian, Collectors.counting()));
// {false=5, true=4}

Simpler filter-based alternatives to partitioning

// Instead of simple partitioning (yields the true side only)
List<Dish> res = menu.stream().filter(Dish::isVegetarian).collect(Collectors.toList());
// Instead of multilevel partitioning
List<Dish> res = Dish.menu.stream().filter(Dish::isVegetarian).filter(d -> d.getCalories() > 500).collect(Collectors.toList());
// Instead of partitioning + grouping
Map<Dish.Type, List<Dish>> res = Dish.menu.stream().filter(Dish::isVegetarian).collect(Collectors.groupingBy(Dish::getType));
// Instead of partitioning with statistics
IntSummaryStatistics res = Dish.menu.stream().filter(Dish::isVegetarian).collect(Collectors.summarizingInt(Dish::getCalories));

Partitioning numbers into primes and non-primes

// Test whether a candidate number is prime
public boolean isPrime(int candidate) {
    return IntStream.range(2, candidate).noneMatch(i -> candidate % i == 0);
}
// A simple optimization: only test divisors up to the square root of the candidate
public boolean isPrime(int candidate) {
    return IntStream.rangeClosed(2,  (int) Math.sqrt((double) candidate)).noneMatch(i -> candidate % i == 0);
}

// Suppose you want a method that takes an int n and partitions the first n natural numbers into primes and non-primes.
public Map<Boolean, List<Integer>> partitionPrimes(int n) {
    return IntStream.rangeClosed(2, n).boxed().collect(Collectors.partitioningBy(candidate -> isPrime(candidate)));
}
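The prime-partitioning methods above can be assembled into a self-contained sketch as follows. The class name is hypothetical, and the guard for candidates below 2 is an addition not present in the original: without it, isPrime would report 0 and 1 as prime, which goes unnoticed above only because partitionPrimes starts at 2.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class PrimesSketch {
    // Trial division up to the square root of the candidate.
    static boolean isPrime(int candidate) {
        if (candidate < 2) {
            return false; // assumption: guard added for this sketch
        }
        int root = (int) Math.sqrt((double) candidate);
        return IntStream.rangeClosed(2, root).noneMatch(i -> candidate % i == 0);
    }

    // Partitions the integers 2..n into primes (true) and non-primes (false).
    static Map<Boolean, List<Integer>> partitionPrimes(int n) {
        return IntStream.rangeClosed(2, n).boxed()
                .collect(Collectors.partitioningBy(PrimesSketch::isPrime));
    }

    public static void main(String[] args) {
        Map<Boolean, List<Integer>> result = partitionPrimes(10);
        System.out.println(result.get(true));  // [2, 3, 5, 7]
        System.out.println(result.get(false)); // [4, 6, 8, 9, 10]
    }
}
```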

-----------------------------------------------------------------------------Reading notes excerpted from the book Java 8 in Action (Java 8实战), by [UK] Raoul-Gabriel Urma, [Italy] Mario Fusco, [UK] Alan M
