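The Collector class
1 Introduction to Collector
Let's start with the Collector class. The elements of the stream have type T. The Supplier supplier first creates a mutable container of the intermediate type A. The accumulator (a BiConsumer, so the method invoked is accept, not apply) is then applied to every element of the stream and folds it into that container. When the stream runs in parallel, the combiner merges partial containers. Finally, the finisher Function converts A into the result type R.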
static class CollectorImpl<T, A, R> implements Collector<T, A, R> {
private final Supplier<A> supplier;
private final BiConsumer<A, T> accumulator;
private final BinaryOperator<A> combiner;
private final Function<A, R> finisher;
private final Set<Characteristics> characteristics;
CollectorImpl(Supplier<A> supplier,
BiConsumer<A, T> accumulator,
BinaryOperator<A> combiner,
Function<A,R> finisher,
Set<Characteristics> characteristics) {
this.supplier = supplier;
this.accumulator = accumulator;
this.combiner = combiner;
this.finisher = finisher;
this.characteristics = characteristics;
}
CollectorImpl(Supplier<A> supplier,
BiConsumer<A, T> accumulator,
BinaryOperator<A> combiner,
Set<Characteristics> characteristics) {
this(supplier, accumulator, combiner, castingIdentity(), characteristics);
}
@Override
public BiConsumer<A, T> accumulator() {
return accumulator;
}
@Override
public Supplier<A> supplier() {
return supplier;
}
@Override
public BinaryOperator<A> combiner() {
return combiner;
}
@Override
public Function<A, R> finisher() {
return finisher;
}
@Override
public Set<Characteristics> characteristics() {
return characteristics;
}
}
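To make the four roles concrete, here is a minimal sketch (not the JDK's own code) that builds a collector with Collector.of and then drives it by hand, the way a sequential collect() would. The class name CollectorStagesDemo and the helpers concatCollector and runByHand are made up for illustration, and Arrays.asList stands in for the Guava Lists.newArrayList used elsewhere in this post.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;

class CollectorStagesDemo {
    // Hypothetical collector for illustration: T = String, A = StringBuilder, R = String.
    static Collector<String, StringBuilder, String> concatCollector() {
        return Collector.of(
                () -> new StringBuilder(),            // supplier: creates the container A
                (sb, s) -> sb.append(s),              // accumulator: folds one T into A
                (left, right) -> left.append(right),  // combiner: merges two partial containers
                sb -> sb.toString());                 // finisher: converts A into R
    }

    // Performs by hand what a sequential collect() does with these functions.
    static String runByHand(List<String> items) {
        Collector<String, StringBuilder, String> c = concatCollector();
        StringBuilder container = c.supplier().get();
        for (String item : items) {
            c.accumulator().accept(container, item);
        }
        return c.finisher().apply(container);
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("a", "b", "c");
        System.out.println(runByHand(items));                          // abc
        System.out.println(items.stream().collect(concatCollector())); // abc
    }
}
```

The combiner is never called in runByHand because there is only one container; it matters only when a parallel stream produces several partial results.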
2 Reduction (the reduce family)
2.1 maxBy:
Optional<Integer> reMax = Lists.newArrayList(1, 2, 3, 4).stream().collect(Collectors.maxBy((Integer::compareTo)));
reMax.ifPresent((Integer e) -> {
System.out.println(e);
});
System.out.println("==============================");
public static <T> Collector<T, ?, Optional<T>>
maxBy(Comparator<? super T> comparator) {
return reducing(BinaryOperator.maxBy(comparator));
}
/**
* Returns a {@link BinaryOperator} which returns the greater of two elements
* according to the specified {@code Comparator}.
*
* @param <T> the type of the input arguments of the comparator
* @param comparator a {@code Comparator} for comparing the two values
* @return a {@code BinaryOperator} which returns the greater of its operands,
* according to the supplied {@code Comparator}
* @throws NullPointerException if the argument is null
*/
public static <T> BinaryOperator<T> maxBy(Comparator<? super T> comparator) {
Objects.requireNonNull(comparator);
return (a, b) -> comparator.compare(a, b) >= 0 ? a : b;
}
public static <T> Collector<T, ?, Optional<T>>
reducing(BinaryOperator<T> op) {
class OptionalBox implements Consumer<T> {
T value = null;
boolean present = false;
@Override
public void accept(T t) {
if (present) {
value = op.apply(value, t);
}
else {
value = t;
present = true;
}
}
}
return new CollectorImpl<T, OptionalBox, Optional<T>>(
OptionalBox::new, OptionalBox::accept,
(a, b) -> { if (b.present) a.accept(b.value); return a; },
a -> Optional.ofNullable(a.value), CH_NOID);
}
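Each element of the original stream has type T. The accumulator folds the elements into an OptionalBox container, and the finisher (not the combiner) then converts the box into an Optional<T>.
For the first element, 1, the Supplier OptionalBox::new runs first and creates an empty box. accept then sees present == false, stores 1 in value and sets present to true. Every later element is merged into value through op.apply(value, t).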
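The sequence just described can be replayed by hand. The sketch below is only an illustration under a few assumptions: MaxByTraceDemo, runSequentially and maxOf are hypothetical names, and the generic helper exists only so that the hidden container type (the ? in the collector's signature) can be captured.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class MaxByTraceDemo {
    // Hypothetical helper: runs supplier -> accumulator (per element) -> finisher, in that order.
    static <T, A, R> R runSequentially(Collector<T, A, R> collector, List<T> items) {
        A container = collector.supplier().get();          // e.g. OptionalBox::new
        for (T item : items) {
            collector.accumulator().accept(container, item); // e.g. OptionalBox::accept
        }
        return collector.finisher().apply(container);      // box -> Optional
    }

    static Optional<Integer> maxOf(List<Integer> values) {
        Collector<Integer, ?, Optional<Integer>> maxBy = Collectors.maxBy(Integer::compareTo);
        return runSequentially(maxBy, values);
    }

    public static void main(String[] args) {
        System.out.println(maxOf(Arrays.asList(1, 2, 3, 4)));        // Optional[4]
        System.out.println(maxOf(Collections.<Integer>emptyList())); // Optional.empty
    }
}
```

With an empty input the accumulator never runs, so the box stays empty and the finisher yields Optional.empty.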
2.2 minBy
It works the same way as maxBy, with the comparison reversed.
Lists.newArrayList(1, 2, 3, 4).stream().collect(Collectors.minBy(Integer::compareTo))
.ifPresent(System.out::println);
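minBy is not limited to natural ordering. As a small sketch, assuming we want the shortest string in a list, a comparator built with Comparator.comparingInt can be passed instead; MinByDemo and shortest are hypothetical names used only here.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

class MinByDemo {
    // Returns the shortest word; on a tie the earlier element wins in a sequential stream.
    static Optional<String> shortest(List<String> words) {
        return words.stream()
                .collect(Collectors.minBy(Comparator.comparingInt(String::length)));
    }

    public static void main(String[] args) {
        System.out.println(shortest(Arrays.asList("pear", "fig", "kiwi"))); // Optional[fig]
    }
}
```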
2.3 summarizingInt summarizingLong summarizingDouble
IntSummaryStatistics summarizingInt = Lists.newArrayList(1, 2, 3, 4).stream().collect(Collectors.summarizingInt(a -> a));
System.out.println(summarizingInt);
System.out.println("===================================================");
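The IntSummaryStatistics object returned by summarizingInt carries count, sum, min, max and average in one pass. The sketch below reads them through the getters; SummarizingDemo and summarize are hypothetical names, and Arrays.asList replaces the Guava helper to keep the example self-contained.

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.stream.Collectors;

class SummarizingDemo {
    static IntSummaryStatistics summarize(List<Integer> values) {
        return values.stream().collect(Collectors.summarizingInt(Integer::intValue));
    }

    public static void main(String[] args) {
        IntSummaryStatistics stats = summarize(Arrays.asList(1, 2, 3, 4));
        System.out.println(stats.getCount());   // 4
        System.out.println(stats.getSum());     // 10
        System.out.println(stats.getMin());     // 1
        System.out.println(stats.getMax());     // 4
        System.out.println(stats.getAverage()); // 2.5
    }
}
```

summarizingLong and summarizingDouble follow the same pattern and return LongSummaryStatistics and DoubleSummaryStatistics respectively.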
2.4 General reduction with reducing
The reductions above are really just special cases of reducing,
so you can define whatever logic you need yourself. reducing has three overloads; let's look at how each one is used.
- one
/**
 * Returns a {@code Collector} which performs a reduction of its
 * input elements under a specified {@code BinaryOperator} using the
 * provided identity.
 *
 * @apiNote
 * The {@code reducing()} collectors are most useful when used in a
 * multi-level reduction, downstream of {@code groupingBy} or
 * {@code partitioningBy}. To perform a simple reduction on a stream,
 * use {@link Stream#reduce(Object, BinaryOperator)}} instead.
 *
 * @param <T> element type for the input and output of the reduction
 * @param identity the identity value for the reduction (also, the value
 * that is returned when there are no input elements)
 * @param op a {@code BinaryOperator<T>} used to reduce the input elements
 * @return a {@code Collector} which implements the reduction operation
 *
 * @see #reducing(BinaryOperator)
 * @see #reducing(Object, Function, BinaryOperator)
 */
public static <T> Collector<T, ?, T>
reducing(T identity, BinaryOperator<T> op) {
return new CollectorImpl<>(
boxSupplier(identity),
(a, t) -> { a[0] = op.apply(a[0], t); },
(a, b) -> { a[0] = op.apply(a[0], b[0]); return a; },
a -> a[0],
CH_NOID);
}
The given BinaryOperator is applied to each input element in turn.
T identity: the starting value of the reduction, which is also the value returned when there are no input elements. Two examples illustrate this: summing numbers and concatenating strings.
Computing a sum:
Integer sumResult = Lists.newArrayList(1, 2, 3, 4).stream().collect(Collectors.reducing(0, (a, b) -> a + b));
System.out.println(String.format("sum result = %d", sumResult));
System.out.println("======================================================");
sum result = 10
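Concatenating strings (the operator below keeps mutable state in num, so it produces this result only on a sequential stream):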
BinaryOperator<String> binaryOperator = new BinaryOperator<String>() {
String combine = "-";
int num = 0;
@Override
public String apply(String s, String s2) {
if (num == 0) {
num++;
return s + s2;
} else {
return s + combine + s2;
}
}
};
String customReduce = Lists.newArrayList("a", "b", "c", "d").stream().collect(Collectors.reducing("", binaryOperator));
System.out.println(String.format("custom reduce string values = %s", customReduce));
custom reduce string values = a-b-c-d
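The anonymous BinaryOperator above depends on a mutable counter, which is fragile once the stream is parallel. As a stateless alternative (a different technique from reducing, swapped in here only for comparison), the standard Collectors.joining collector yields the same string. JoiningDemo and joinWithDash are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class JoiningDemo {
    // Stateless: the separator is handled by the collector, not by a counter.
    static String joinWithDash(List<String> parts) {
        return parts.stream().collect(Collectors.joining("-"));
    }

    // The same collector also works on a parallel stream over an ordered source.
    static String joinWithDashParallel(List<String> parts) {
        return parts.parallelStream().collect(Collectors.joining("-"));
    }

    public static void main(String[] args) {
        List<String> parts = Arrays.asList("a", "b", "c", "d");
        System.out.println(joinWithDash(parts));         // a-b-c-d
        System.out.println(joinWithDashParallel(parts)); // a-b-c-d
    }
}
```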
- two
public static <T> Collector<T, ?, Optional<T>>
reducing(BinaryOperator<T> op) {
class OptionalBox implements Consumer<T> {
T value = null;
boolean present = false;
@Override
public void accept(T t) {
if (present) {
value = op.apply(value, t);
}
else {
value = t;
present = true;
}
}
}
return new CollectorImpl<T, OptionalBox, Optional<T>>(
OptionalBox::new, OptionalBox::accept,
(a, b) -> { if (b.present) a.accept(b.value); return a; },
a -> Optional.ofNullable(a.value), CH_NOID);
}
For example:
Optional<String> stringCombine = Lists.newArrayList("a", "b", "c", "d").stream().collect(Collectors.reducing((s1, s2) -> s1 + "-" + s2));
stringCombine.ifPresent(val -> {
System.out.println(String.format("string combine = %s", val));
});
strings.stream().collect(Collectors.reducing((s1, s2) -> s1 + "-" + s2))
.ifPresent(val -> {
System.out.println(String.format("string combine = %s", val));
});
- three
public static <T, U>
Collector<T, ?, U> reducing(U identity,
Function<? super T, ? extends U> mapper,
BinaryOperator<U> op) {
return new CollectorImpl<>(
boxSupplier(identity),
(a, t) -> { a[0] = op.apply(a[0], mapper.apply(t)); },
(a, b) -> { a[0] = op.apply(a[0], b[0]); return a; },
a -> a[0], CH_NOID);
}
This overload first maps each element of the original stream to another type and then reduces over the mapped values. For example:
Integer result = Lists.newArrayList("a", "b", "c", "d")
.stream()
.collect(Collectors.reducing(0, (String s) -> s.charAt(0) - 'a', (a, b) -> a + b));
System.out.println(result);
The output is 6.
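The three-argument reducing is essentially a map step followed by a reduce step. The sketch below, with the hypothetical names ReducingMapperDemo, totalLength and totalLengthMapReduce, sums string lengths both ways to show that the two forms agree.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class ReducingMapperDemo {
    // identity = 0, mapper = String::length, op = Integer::sum
    static int totalLength(List<String> words) {
        return words.stream()
                .collect(Collectors.reducing(0, String::length, Integer::sum));
    }

    // The same computation written as map(...) followed by Stream.reduce(...).
    static int totalLengthMapReduce(List<String> words) {
        return words.stream().map(String::length).reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("ab", "cde", "f");
        System.out.println(totalLength(words));          // 6
        System.out.println(totalLengthMapReduce(words)); // 6
    }
}
```

On a plain stream the map/reduce form is usually simpler; the collector form is most useful as a downstream of groupingBy or partitioningBy, as the JDK's own @apiNote points out.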
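3 Aggregation: grouping (groupingBy)
A very common task is grouping the elements of a collection by one or more properties and aggregating each group. Java 8's groupingBy makes this quick to implement.
3.1 groupingBy with a downstream Collector
// The input to group
List<Integer> list = Lists.newArrayList(1,1,2,2,3,3,3,3,3);
// Group by value and count the elements in each group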
Map<Integer, Long> countResultMap = list.stream().collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
System.out.println(countResultMap);
System.out.println("===================================================");
The result:
{1=2, 2=2, 3=5}
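The first argument of groupingBy is a Function that acts as the classifier: the value it returns becomes the key of the Map, and all elements are grouped by that value. The second argument, a Collector, produces the value of each Map entry. For example:
Collectors.groupingBy(Function.identity(), Collectors.counting())
Here Function.identity() makes each element its own key, and the value is the number of elements in that group.
Collectors.groupingBy(Function.identity(), Collectors.toList())
Here Function.identity() again makes each element its own key, and the value is the list of elements in that group.
Next, a slightly more complex example with a custom classifier Function: values below 5 form one group, values from 5 up to (but not including) 10 a second, and values of 10 or more a third: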
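A small sketch of the toList variant follows. It also shows that the one-argument groupingBy(classifier) gives the same grouping, since it uses toList() as its downstream. GroupingToListDemo, groupExplicit and groupDefault are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class GroupingToListDemo {
    // Explicit downstream collector.
    static Map<Integer, List<Integer>> groupExplicit(List<Integer> values) {
        return values.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.toList()));
    }

    // One-argument overload: the downstream defaults to toList().
    static Map<Integer, List<Integer>> groupDefault(List<Integer> values) {
        return values.stream().collect(Collectors.groupingBy(Function.identity()));
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(1, 1, 2, 3, 3);
        System.out.println(groupExplicit(values).get(3));                      // [3, 3]
        System.out.println(groupExplicit(values).equals(groupDefault(values))); // true
    }
}
```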
Map<String, List<Integer>> customFunctionGroupby = Lists.newArrayList(1, 1, 2, 2, 3, 3, 3, 3, 3, 8, 8, 8, 8, 8, 12, 12, 11, 111)
.stream()
.collect(Collectors
.groupingBy(val -> {
if (val < 5) {
return "LOW";
} else if (val < 10) {
return "MEDIAN";
} else {
return "HIGH";
}
}, Collectors.toList()));
System.out.println(customFunctionGroupby);
Output:
{HIGH=[12, 12, 11, 111], LOW=[1, 1, 2, 2, 3, 3, 3, 3, 3], MEDIAN=[8, 8, 8, 8, 8]}
3.2 Transforming the grouped result with collectingAndThen (converting a collector's result to another type)
Map<Integer, Integer> collectingAndThen = Lists.newArrayList(1, 1, 2, 2, 3, 3, 3, 3, 3, 8, 8, 8, 8, 8, 12, 12, 11, 111)
.stream()
.collect(Collectors.groupingBy(Function.identity(), Collectors.collectingAndThen(Collectors.summingInt((Integer a) -> a), a -> a)));
//.collect(Collectors.groupingBy(Function.identity(), Collectors.collectingAndThen(Collectors.maxBy(Integer::compareTo), Optional::get)));
System.out.println(collectingAndThen);
Output: {1=2, 2=4, 3=15, 8=40, 11=11, 12=24, 111=111}
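collectingAndThen becomes more useful when the finishing function really changes the type. The sketch below follows the commented-out maxBy variant: maxBy produces an Optional per group, and Optional::get unwraps it, which is safe here because groupingBy only creates a group once it has at least one element. The parity classifier and the names CollectingAndThenDemo and maxPerParity are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

class CollectingAndThenDemo {
    // Groups by parity and keeps the largest value of each group, unwrapped from its Optional.
    static Map<String, Integer> maxPerParity(List<Integer> values) {
        return values.stream().collect(Collectors.groupingBy(
                (Integer v) -> v % 2 == 0 ? "EVEN" : "ODD",
                Collectors.collectingAndThen(
                        Collectors.maxBy(Integer::compareTo),
                        Optional::get)));
    }

    public static void main(String[] args) {
        Map<String, Integer> result = maxPerParity(Arrays.asList(1, 2, 3, 8, 11, 12));
        System.out.println(result.get("EVEN")); // 12
        System.out.println(result.get("ODD"));  // 11
    }
}
```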
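3.3 groupingBy with mapping
Another collector often combined with groupingBy is mapping. It takes two arguments: a function that transforms each element of the stream, and a collector that gathers the transformed results. Its purpose is to apply a mapping function to each input element before accumulation, so that a collector accepting one type of element can be adapted to a different type of object.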
Map<Integer, Set<String>> groupByMapping = Lists.newArrayList(1, 1, 2, 2, 3, 3, 3, 3, 3, 8, 8, 8, 8, 8, 12, 12, 11, 111)
.stream()
.collect(Collectors.groupingBy(Function.identity(), Collectors.mapping((Integer val) -> {
if (val < 5) {
return "LOW";
} else if (val < 10) {
return "MEDIAN";
} else {
return "HIGH";
}
}, Collectors.toSet())));
System.out.println(groupByMapping);
Output: {1=[LOW], 2=[LOW], 3=[LOW], 8=[MEDIAN], 11=[HIGH], 12=[HIGH], 111=[HIGH]}
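As a second, smaller sketch of the same idea, the example below groups words by length and uses mapping to upper-case each word before toSet() collects it, so duplicates collapse. MappingDemo and upperCaseByLength are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class MappingDemo {
    // Key: word length. Value: the distinct upper-cased words of that length.
    static Map<Integer, Set<String>> upperCaseByLength(List<String> words) {
        return words.stream().collect(Collectors.groupingBy(
                String::length,
                Collectors.mapping((String w) -> w.toUpperCase(), Collectors.toSet())));
    }

    public static void main(String[] args) {
        Map<Integer, Set<String>> result = upperCaseByLength(Arrays.asList("a", "bb", "cc", "a"));
        System.out.println(result.get(1));        // [A]
        System.out.println(result.get(2).size()); // 2
    }
}
```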
3.4 Multi-level grouping
Multi-level grouping is simply a groupingBy nested inside another groupingBy.
Map<String, Map<Integer, Long>> twoLayerGroupBy = Lists.newArrayList(1, 1, 2, 2, 3, 3, 3, 3, 3, 8, 8, 8, 8, 8, 12, 12, 11, 111)
.stream()
.collect(Collectors.groupingBy(val -> {
if (val < 5) {
return "LOW";
} else if (val < 10) {
return "MEDIAN";
} else {
return "HIGH";
}
}, Collectors.groupingBy(Function.identity(), Collectors.counting())));
System.out.println(twoLayerGroupBy);
{HIGH={11=1, 12=2, 111=1}, LOW={1=2, 2=2, 3=5}, MEDIAN={8=5}}
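4 Partitioning with partitioningBy
Partitioning is similar to groupingBy, except that the keys of the resulting map are the Boolean values true and false. Everything else works the same way as with groupingBy.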
Map<Boolean, List<Integer>> partitoningByTest = Lists.newArrayList(1, 1, 2, 2, 3, 3, 3, 3, 3, 8, 8, 8, 8, 8, 12, 12, 11, 111)
.stream()
.collect(Collectors.partitioningBy((Integer v) -> v > 5, Collectors.toList()));
System.out.println(partitoningByTest);
Output: {false=[1, 1, 2, 2, 3, 3, 3, 3, 3], true=[8, 8, 8, 8, 8, 12, 12, 11, 111]}
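Since partitioningBy accepts a downstream collector just like groupingBy, the list can be swapped for a count. The sketch below is only an illustration (PartitionCountDemo and countAboveFive are hypothetical names). It also shows that both keys are present in the result even when one side is empty.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class PartitionCountDemo {
    // Counts how many values are greater than 5 and how many are not.
    static Map<Boolean, Long> countAboveFive(List<Integer> values) {
        return values.stream()
                .collect(Collectors.partitioningBy((Integer v) -> v > 5, Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(countAboveFive(Arrays.asList(1, 2, 3, 8, 12)));       // {false=3, true=2}
        System.out.println(countAboveFive(Collections.<Integer>emptyList()));    // {false=0, true=0}
    }
}
```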
5 Custom collectors
The Collector interface:
public interface Collector<T, A, R> {
/**
* A function that creates and returns a new mutable result container.
*
* @return a function which returns a new, mutable result container
*/
Supplier<A> supplier();
/**
* A function that folds a value into a mutable result container.
*
* @return a function which folds a value into a mutable result container
*/
BiConsumer<A, T> accumulator();
/**
* A function that accepts two partial results and merges them. The
* combiner function may fold state from one argument into the other and
* return that, or may return a new result container.
*
* @return a function which combines two partial results into a combined
* result
*/
BinaryOperator<A> combiner();
/**
* Perform the final transformation from the intermediate accumulation type
* {@code A} to the final result type {@code R}.
*
* <p>If the characteristic {@code IDENTITY_FINISH} is
* set, this function may be presumed to be an identity transform with an
* unchecked cast from {@code A} to {@code R}.
*
* @return a function which transforms the intermediate result to the final
* result
*/
Function<A, R> finisher();
/**
* Returns a {@code Set} of {@code Collector.Characteristics} indicating
* the characteristics of this Collector. This set should be immutable.
*
* @return an immutable set of collector characteristics
*/
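Set<Characteristics> characteristics();
// The static of(...) factory methods and the Characteristics enum are omitted here.
}
T in the interface is the type of the stream elements being collected.
A is the type of the accumulator (the intermediate container).
R is the type of the result the collector finally returns.
With the interface in mind, let's implement something similar to Collectors.toList.
First, define the collector class ListCollector: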
static class ListCollector implements Collector<Integer, List<Integer>, List<Integer>> {
@Override
public Supplier<List<Integer>> supplier() {
return () -> Lists.newArrayList();
}
@Override
public BiConsumer<List<Integer>, Integer> accumulator() {
return List::add;
}
@Override
public BinaryOperator<List<Integer>> combiner() {
return (List<Integer> l1, List<Integer> l2) ->{
l1.addAll(l2);
return l1;
};
}
@Override
public Function<List<Integer>, List<Integer>> finisher() {
return Function.identity();
}
@Override
public Set<Characteristics> characteristics() {
return Sets.newHashSet(Characteristics.IDENTITY_FINISH);
}
}
The collector is then used as follows:
// Using the custom Collector implementation.
ListCollector listCollector = new ListCollector();
Map<Integer, List<Integer>> customCollector = Lists.newArrayList(1, 1, 2, 2, 3, 3, 3, 3, 3, 8, 8, 8, 8, 8, 12, 12, 11, 111)
.stream()
.collect(Collectors.groupingBy(Function.identity(), listCollector));
System.out.println(customCollector);
Output:
{1=[1, 1], 2=[2, 2], 3=[3, 3, 3, 3, 3], 8=[8, 8, 8, 8, 8], 11=[11], 12=[12, 12], 111=[111]}
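For a collector this simple, a dedicated class is not strictly needed. As a sketch of an alternative, the static factory Collector.of can assemble the same behaviour from lambdas, and a type parameter makes it work for any element type. The four-argument overload used here takes no finisher and marks the collector IDENTITY_FINISH itself. GenericListCollectorDemo and toListCollector are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;

class GenericListCollectorDemo {
    // Generic counterpart of ListCollector, built from lambdas instead of a class.
    static <T> Collector<T, List<T>, List<T>> toListCollector() {
        return Collector.of(
                ArrayList::new,                                   // supplier
                List::add,                                        // accumulator
                (left, right) -> { left.addAll(right); return left; }); // combiner
    }

    public static void main(String[] args) {
        // Sequential use.
        List<String> letters = Arrays.asList("a", "b").stream()
                .collect(GenericListCollectorDemo.<String>toListCollector());
        System.out.println(letters); // [a, b]

        // Parallel use: the combiner merges partial lists, and encounter order is kept.
        List<Integer> numbers = Arrays.asList(3, 1, 2).parallelStream()
                .collect(GenericListCollectorDemo.<Integer>toListCollector());
        System.out.println(numbers); // [3, 1, 2]
    }
}
```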