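What is a stream
- Definition: a stream is a sequence of elements from a source that supports data processing operations.
- Sequence of elements: like a collection, a stream provides an interface to an ordered sequence of elements.
- Source: a stream needs a source that supplies its data, such as a collection, an array, or an I/O resource.
- Data processing operations: filter, map, reduce, find, match, and so on.
- Characteristics of streams:
- Pipelining: an intermediate stream operation returns another stream, so operations can be chained.
- Internal iteration: unlike a collection, which the caller iterates over externally, a stream performs the iteration behind the scenes.
- An example of both characteristics: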
public static void main(String[] args) {
List<Integer> integerList = Arrays.asList(new Integer[]{1,2,3,4});
List<Integer> collect = integerList.stream().filter(item -> item <= 3)
.map(item -> item * 2)
.limit(2)
.collect(Collectors.toList());
}
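- Point 1: the source is integerList. Calling stream() produces the sequence of elements, filter sees the elements in their order, and the iteration is internal.
- Point 2: until collect is called, nothing runs; the chained operations only describe the pipeline. Element traversal starts only when collect is invoked (collect is what opens the tap).
- Point 3: internal iteration is lazy and can short-circuit, so with limit(2) the stream stops once two elements have passed and does not traverse the remaining ones.
Source of collect: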
@Override
@SuppressWarnings("unchecked")
public final <R, A> R collect(Collector<? super P_OUT, A, R> collector) {
A container;
if (isParallel()
&& (collector.characteristics().contains(Collector.Characteristics.CONCURRENT))
&& (!isOrdered() || collector.characteristics().contains(Collector.Characteristics.UNORDERED))) {
container = collector.supplier().get();
BiConsumer<A, ? super P_OUT> accumulator = collector.accumulator();
forEach(u -> accumulator.accept(container, u));
}
else {
container = evaluate(ReduceOps.makeRef(collector));
}
return collector.characteristics().contains(Collector.Characteristics.IDENTITY_FINISH)
? (R) container
: collector.finisher().apply(container);
}
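As the source shows, in the concurrent branch collect calls the stream's own forEach to feed every element into the container through the accumulator; otherwise it evaluates a reduce operation built from the collector. Either way the filled container is returned, after passing through the finisher unless IDENTITY_FINISH is set.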
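Points 2 and 3 above can be observed directly. The following is a minimal sketch, not part of the original example: the class name LazyStreamDemo and the visited list are made up for illustration, and the filter lambda records each element it is asked about so that the laziness becomes visible.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyStreamDemo {
    // Hypothetical helper: records every element that actually reaches the filter stage.
    static final List<Integer> visited = new ArrayList<>();

    // Builds the same pipeline as the example above, without a terminal operation.
    static Stream<Integer> buildPipeline() {
        visited.clear();
        return Arrays.asList(1, 2, 3, 4).stream()
                .filter(item -> { visited.add(item); return item <= 3; })
                .map(item -> item * 2)
                .limit(2);
    }

    public static void main(String[] args) {
        Stream<Integer> pipeline = buildPipeline();
        // Nothing has been traversed yet: the pipeline is only a description.
        System.out.println("visited before collect: " + visited);
        List<Integer> result = pipeline.collect(Collectors.toList());
        System.out.println("result: " + result);
        // limit(2) short-circuits, so elements 3 and 4 are never visited.
        System.out.println("visited after collect: " + visited);
    }
}
```

On a sequential stream the visited list stays empty until collect runs, and afterwards it holds only the elements needed to satisfy limit(2).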
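Streams and collections
- An illustration
- A movie on DVD is a collection: it holds every frame of the movie. A streamed movie is a stream: it can start playing without holding all the frames.
- The underlying difference is when elements are computed: every element of a collection must be computed before it can be added to the collection, whereas a stream computes elements on demand.
A stream can be traversed only once
Creating a new stream for each traversal works any number of times: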
List<String> stringList = Arrays.asList("1","2","3");
stringList.stream().forEach(System.out::println);
stringList.stream().forEach(System.out::println);
The same stream instance can be consumed only once:
List<String> stringList = Arrays.asList("1","2","3");
Stream<String> stream = stringList.stream();
stream.forEach(System.out::println);
stream.forEach(System.out::println);
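The second forEach throws java.lang.IllegalStateException: stream has already been operated upon or closed
- A conceptual view
- A stream is data at the same location at different moments in time; a collection is data at different addresses at the same moment.
- A stream is a pipe that delivers water over time; a collection is a bucket.
- Differences in iteration
- A collection uses explicit, external iteration.
- A stream uses implicit, internal iteration.
- The three steps of using a stream
- Turn the collection into a stream with stream(). By analogy this supplies the driving force, like carrying the water up high in the figure above.
- Chain intermediate operations. By analogy this adds filter screens and pipe sections.
- Call a terminal operation, which opens the tap.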
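The difference between external and internal iteration, and the three steps listed above, can be sketched side by side. The class and method names below are assumptions chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class IterationStyles {
    // External iteration: the caller writes the loop and manages the result list.
    static List<String> upperCaseExternal(List<String> words) {
        List<String> out = new ArrayList<>();
        for (String w : words) {
            if (w.length() > 1) {
                out.add(w.toUpperCase());
            }
        }
        return out;
    }

    // Internal iteration: the stream library drives the traversal.
    static List<String> upperCaseInternal(List<String> words) {
        return words.stream()                    // step 1: obtain a stream from the source
                .filter(w -> w.length() > 1)     // step 2: chain intermediate operations
                .map(String::toUpperCase)
                .collect(Collectors.toList());   // step 3: terminal operation starts the flow
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "ab", "cde");
        System.out.println(upperCaseExternal(words));
        System.out.println(upperCaseInternal(words));
    }
}
```

Both methods produce the same list; only the party responsible for the iteration differs.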
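Stream operation API
The second of the three steps above, the operation chain, is also the most important one. The stream operations are described in detail below.
- Filtering and slicing
- filter: takes a predicate. It acts like a screen: only elements that satisfy the predicate pass through.
- distinct: removes duplicate elements (as determined by equals).
- limit(n): truncates the stream, letting at most the first n elements through and dropping the rest.
- skip(n): the complement of limit; discards the first n elements and keeps the rest.
- Mapping
- map: applies a function to every element that passes through and replaces the element with the result.
- flatMap: maps every element to a stream and flattens all of those streams into a single stream.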
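The filtering, slicing, and mapping operations above might be used as follows. This is a small sketch with made-up method names and sample data, not code from the original notes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class SliceAndMapDemo {
    // filter + distinct: keep even numbers, each of them only once.
    static List<Integer> evenDistinct(List<Integer> nums) {
        return nums.stream().filter(n -> n % 2 == 0).distinct().collect(Collectors.toList());
    }

    // skip + limit: drop the first `offset` elements, then keep at most `size` elements.
    static List<Integer> page(List<Integer> nums, int offset, int size) {
        return nums.stream().skip(offset).limit(size).collect(Collectors.toList());
    }

    // map: replace each word with its length.
    static List<Integer> lengths(List<String> words) {
        return words.stream().map(String::length).collect(Collectors.toList());
    }

    // flatMap: turn each word into a stream of its letters and merge those streams.
    static List<String> letters(List<String> words) {
        return words.stream()
                .flatMap(w -> w.chars().mapToObj(c -> String.valueOf((char) c)))
                .distinct()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(evenDistinct(Arrays.asList(1, 2, 2, 4, 4, 6, 7)));
        System.out.println(page(Arrays.asList(1, 2, 3, 4, 5, 6), 2, 3));
        System.out.println(lengths(Arrays.asList("ab", "cde")));
        System.out.println(letters(Arrays.asList("ab", "bc")));
    }
}
```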
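- Finding and matching
- anyMatch: takes a predicate and tells whether at least one element of the stream satisfies it.
- allMatch: takes a predicate and tells whether every element of the stream satisfies it.
- noneMatch: takes a predicate and tells whether no element of the stream satisfies it.
- findAny: takes no arguments and returns an arbitrary element of the stream as an Optional. Because any element will do, the implementation is free to short-circuit and take the cheapest path, which matters mostly for parallel streams.
- findFirst: takes no arguments and returns the first element in the stream's encounter order as an Optional.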
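A brief sketch of the finding and matching operations follows. The helper names and the sample numbers are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class FindMatchDemo {
    static boolean anyAbove(List<Integer> nums, int threshold) {
        return nums.stream().anyMatch(n -> n > threshold);
    }

    static boolean allPositive(List<Integer> nums) {
        return nums.stream().allMatch(n -> n > 0);
    }

    static boolean noneNegative(List<Integer> nums) {
        return nums.stream().noneMatch(n -> n < 0);
    }

    // findFirst respects encounter order, so the result is deterministic for a list.
    static Optional<Integer> firstEven(List<Integer> nums) {
        return nums.stream().filter(n -> n % 2 == 0).findFirst();
    }

    // findAny may return any matching element; only its presence should be relied on.
    static Optional<Integer> anyEven(List<Integer> nums) {
        return nums.stream().filter(n -> n % 2 == 0).findAny();
    }

    public static void main(String[] args) {
        List<Integer> nums = Arrays.asList(3, 8, 12, 5);
        System.out.println(anyAbove(nums, 10));
        System.out.println(allPositive(nums));
        System.out.println(noneNegative(nums));
        System.out.println(firstEven(nums));
        System.out.println(anyEven(nums).isPresent());
    }
}
```

Both find operations return an Optional because a stream may contain no matching element at all.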
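- Reduction
- reduce(a, lambda): starts from the initial value a, applies the lambda to the running result and each arriving element, and returns the final result.
- reduce(lambda): with no initial value the result is an Optional, because the stream may be empty and then there is nothing to return.
- Examples: computing the maximum, the minimum, and the sum.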
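The maximum, minimum, and sum mentioned above could be written with reduce roughly like this. The class and method names are illustrative assumptions.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

class ReduceDemo {
    // With an initial value the result is always defined, even for an empty stream.
    static int sum(List<Integer> nums) {
        return nums.stream().reduce(0, Integer::sum);
    }

    // Without an initial value the result is an Optional, empty when the stream is empty.
    static Optional<Integer> max(List<Integer> nums) {
        return nums.stream().reduce(Integer::max);
    }

    static Optional<Integer> min(List<Integer> nums) {
        return nums.stream().reduce(Integer::min);
    }

    public static void main(String[] args) {
        List<Integer> nums = Arrays.asList(4, 1, 9);
        System.out.println(sum(nums));
        System.out.println(max(nums));
        System.out.println(min(nums));
        System.out.println(max(Collections.<Integer>emptyList()));
    }
}
```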
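- Classifying the operations
- By state: an operation is stateless or stateful. Stateless operations such as map and filter take each element and produce zero or one result, without needing to remember earlier elements. Stateful operations include reduce, sum, max, sorted, and distinct.
- By state size, for stateful operations: reduce, sum, and max have bounded state, since a single int or double variable is enough. sorted and distinct have unbounded state, since they must remember the elements that have already passed, so the state grows with the stream.
- By operation type: intermediate or terminal. An operation that returns a stream is intermediate; one that returns anything else is terminal.
- Numeric streams are specialized streams that avoid boxing overhead
- Primitive streams: IntStream, DoubleStream, and LongStream carry primitive values directly, so no boxing takes place.
- Unboxing (object stream to primitive stream): mapToInt, mapToLong, mapToDouble
- Boxing (primitive stream to object stream): boxed
- Numeric operations: primitive streams provide max, min, sum, and similar methods. max and min return an OptionalInt, OptionalLong, or OptionalDouble; sum returns the primitive value directly.
- Numeric ranges: IntStream and LongStream offer range(start, end) and rangeClosed(start, end). range excludes the end value and rangeClosed includes it.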
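The numeric stream operations listed above can be tried out with a sketch like the following, in which the method names and sample values are chosen only for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.OptionalInt;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class NumericStreamDemo {
    // mapToInt switches to an IntStream, so sum() works on primitives without boxing.
    static int totalLength(List<String> words) {
        return words.stream().mapToInt(String::length).sum();
    }

    // max() returns an OptionalInt because the stream may be empty.
    static OptionalInt longest(List<String> words) {
        return words.stream().mapToInt(String::length).max();
    }

    // range excludes the end value; boxed() converts back to Stream<Integer>.
    static List<Integer> boxedRange(int start, int endExclusive) {
        return IntStream.range(start, endExclusive).boxed().collect(Collectors.toList());
    }

    // rangeClosed includes the end value.
    static int sumClosed(int start, int endInclusive) {
        return IntStream.rangeClosed(start, endInclusive).sum();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("ab", "cde");
        System.out.println(totalLength(words));
        System.out.println(longest(words));
        System.out.println(boxedRange(1, 4));
        System.out.println(sumClosed(1, 4));
    }
}
```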
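- Stream creation API
- Stream.of(): takes a variable number of arguments. Stream.empty() creates an empty stream.
- Arrays.stream(): turns an array into a stream of its elements.
- Files.lines(): returns the lines of a file as a stream. Files has several static methods that produce streams from NIO resources.
- Stream.iterate(): takes an initial value and a lambda that computes the next element from the previous one. It produces an infinite, unbounded stream, which can be bounded with limit. (This is another difference between collections and streams: a collection would have to compute all the required elements up front.)
- Stream.generate(): takes a Supplier lambda and produces elements on demand; this stream is also infinite.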
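Each of the creation methods above is shown once in the sketch below. The class name, the method names, and the temporary file used for Files.lines are assumptions made for the sake of a self-contained example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class StreamSourcesDemo {
    static List<String> fromValues() {
        return Stream.of("a", "b", "c").collect(Collectors.toList());
    }

    static int fromArray() {
        return Arrays.stream(new int[] {1, 2, 3}).sum();
    }

    // iterate is infinite, so limit is needed to make the pipeline terminate.
    static List<Integer> powersOfTwo(int count) {
        return Stream.iterate(1, n -> n * 2).limit(count).collect(Collectors.toList());
    }

    // generate calls the Supplier on demand; it is infinite as well.
    static List<String> repeated(String value, int count) {
        return Stream.generate(() -> value).limit(count).collect(Collectors.toList());
    }

    // Writes the content to a temporary file and counts its lines with Files.lines.
    static long countLines(List<String> content) {
        try {
            Path tmp = Files.createTempFile("stream-demo", ".txt");
            try {
                Files.write(tmp, content);
                // Files.lines holds the file open, so the stream is closed explicitly.
                try (Stream<String> lines = Files.lines(tmp)) {
                    return lines.count();
                }
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(fromValues());
        System.out.println(fromArray());
        System.out.println(powersOfTwo(5));
        System.out.println(repeated("x", 3));
        System.out.println(countLines(Arrays.asList("one", "two")));
    }
}
```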
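Terminal operations: collecting data
- Terminology
- collect(): a reduction operation that consumes every element of the stream and returns a summary result.
- Collector: the interface of the argument that collect accepts.
- Introduction to Collector
- Collectors perform advanced reductions: the Collector determines how the reduction over the stream is carried out and accumulates the result in a data structure.
- The predefined collectors cover three kinds of functionality: reducing and summarizing to a single value, partitioning elements, and grouping elements. They are detailed below.
- Reduction (static methods in Collectors)
- counting(): counts the elements in the stream
public static void main(String[] args) {
Long collect = Stream.generate(Apple::new).limit(100).collect(Collectors.counting());
System.out.println(collect);
}
// Source of counting()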
public static <T> Collector<T, ?, Long>
counting() {
return reducing(0L, e -> 1L, Long::sum);
}
// Source of reducing()
public static <T, U>
Collector<T, ?, U> reducing(U identity,
Function<? super T, ? extends U> mapper,
BinaryOperator<U> op) {
return new CollectorImpl<>(
boxSupplier(identity),
(a, t) -> { a[0] = op.apply(a[0], mapper.apply(t)); },
(a, b) -> { a[0] = op.apply(a[0], b[0]); return a; },
a -> a[0], CH_NOID);
}
private static <T> Supplier<T[]> boxSupplier(T identity) {
return () -> (T[]) new Object[] { identity };
}
//CollectorImpl
CollectorImpl(Supplier<A> supplier,
BiConsumer<A, T> accumulator,
BinaryOperator<A> combiner,
Function<A,R> finisher,
Set<Characteristics> characteristics) {
this.supplier = supplier;
this.accumulator = accumulator;
this.combiner = combiner;
this.finisher = finisher;
this.characteristics = characteristics;
}
// The Collector interface
/*
*
* @param <T> the type of input elements to the reduction operation
T: the type of each element in the stream
* @param <A> the mutable accumulation type of the reduction operation (often
* hidden as an implementation detail)
A: the type of the mutable accumulation container
* @param <R> the result type of the reduction operation
R: the type of the result after the reduction
*/
public interface Collector<T, A, R> {
/**
* A function that creates and returns a new mutable result container.
* Supplies a mutable result container.
* @return a function which returns a new, mutable result container
*/
Supplier<A> supplier();
/**
* A function that folds a value into a mutable result container.
* Puts an element into the mutable result container.
* @return a function which folds a value into a mutable result container
*/
BiConsumer<A, T> accumulator();
/**
* A function that accepts two partial results and merges them. The
* combiner function may fold state from one argument into the other and
* return that, or may return a new result container.
* Merges two partial results.
* @return a function which combines two partial results into a combined
* result
*/
BinaryOperator<A> combiner();
/**
* Perform the final transformation from the intermediate accumulation type
* {@code A} to the final result type {@code R}.
*
* <p>If the characteristic {@code IDENTITY_TRANSFORM} is
* set, this function may be presumed to be an identity transform with an
* unchecked cast from {@code A} to {@code R}.
* The function converts the result from A to R.
* @return a function which transforms the intermediate result to the final
* result
*/
Function<A, R> finisher();
/**
* Returns a {@code Set} of {@code Collector.Characteristics} indicating
* the characteristics of this Collector. This set should be immutable.
*
* @return an immutable set of collector characteristics
*/
Set<Characteristics> characteristics();
/**
* Returns a new {@code Collector} described by the given {@code supplier},
* {@code accumulator}, and {@code combiner} functions. The resulting
* {@code Collector} has the {@code Collector.Characteristics.IDENTITY_FINISH}
* characteristic.
*
* @param supplier The supplier function for the new collector
* @param accumulator The accumulator function for the new collector
* @param combiner The combiner function for the new collector
* @param characteristics The collector characteristics for the new
* collector
* @param <T> The type of input elements for the new collector
* @param <R> The type of intermediate accumulation result, and final result,
* for the new collector
* @throws NullPointerException if any argument is null
* @return the new {@code Collector}
*/
public static<T, R> Collector<T, R, R> of(Supplier<R> supplier,
BiConsumer<R, T> accumulator,
BinaryOperator<R> combiner,
Characteristics... characteristics) {
Objects.requireNonNull(supplier);
Objects.requireNonNull(accumulator);
Objects.requireNonNull(combiner);
Objects.requireNonNull(characteristics);
Set<Characteristics> cs = (characteristics.length == 0)
? Collectors.CH_ID
: Collections.unmodifiableSet(EnumSet.of(Collector.Characteristics.IDENTITY_FINISH,
characteristics));
return new Collectors.CollectorImpl<>(supplier, accumulator, combiner, cs);
}
/**
* Returns a new {@code Collector} described by the given {@code supplier},
* {@code accumulator}, {@code combiner}, and {@code finisher} functions.
*
* @param supplier The supplier function for the new collector
* @param accumulator The accumulator function for the new collector
* @param combiner The combiner function for the new collector
* @param finisher The finisher function for the new collector
* @param characteristics The collector characteristics for the new
* collector
* @param <T> The type of input elements for the new collector
* @param <A> The intermediate accumulation type of the new collector
* @param <R> The final result type of the new collector
* @throws NullPointerException if any argument is null
* @return the new {@code Collector}
*/
public static<T, A, R> Collector<T, A, R> of(Supplier<A> supplier,
BiConsumer<A, T> accumulator,
BinaryOperator<A> combiner,
Function<A, R> finisher,
Characteristics... characteristics) {
Objects.requireNonNull(supplier);
Objects.requireNonNull(accumulator);
Objects.requireNonNull(combiner);
Objects.requireNonNull(finisher);
Objects.requireNonNull(characteristics);
Set<Characteristics> cs = Collectors.CH_NOID;
if (characteristics.length > 0) {
cs = EnumSet.noneOf(Characteristics.class);
Collections.addAll(cs, characteristics);
cs = Collections.unmodifiableSet(cs);
}
return new Collectors.CollectorImpl<>(supplier, accumulator, combiner, finisher, cs);
}
/**
* Characteristics indicating properties of a {@code Collector}, which can
* be used to optimize reduction implementations.
*/
enum Characteristics {
/**
* Indicates that this collector is <em>concurrent</em>, meaning that
* the result container can support the accumulator function being
* called concurrently with the same result container from multiple
* threads.
*
* <p>If a {@code CONCURRENT} collector is not also {@code UNORDERED},
* then it should only be evaluated concurrently if applied to an
* unordered data source.
*/
CONCURRENT,
/**
* Indicates that the collection operation does not commit to preserving
* the encounter order of input elements. (This might be true if the
* result container has no intrinsic order, such as a {@link Set}.)
*/
UNORDERED,
/**
* Indicates that the finisher function is the identity function and
* can be elided. If set, it must be the case that an unchecked cast
* from A to R will succeed.
*/
IDENTITY_FINISH
}
}
// Source of stream.collect
public final <R, A> R collect(Collector<? super P_OUT, A, R> collector) {
A container;
if (isParallel()
&& (collector.characteristics().contains(Collector.Characteristics.CONCURRENT))
&& (!isOrdered() || collector.characteristics().contains(Collector.Characteristics.UNORDERED))) {
container = collector.supplier().get();
BiConsumer<A, ? super P_OUT> accumulator = collector.accumulator();
forEach(u -> accumulator.accept(container, u));
}
else {
container = evaluate(ReduceOps.makeRef(collector));
}
return collector.characteristics().contains(Collector.Characteristics.IDENTITY_FINISH)
? (R) container
: collector.finisher().apply(container);
}
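- maxBy(Comparator), minBy(Comparator): reduce the stream to its maximum or minimum element as determined by the given Comparator. The result is an Optional, since the stream may be empty.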
Optional<Apple> collect1 = Stream.generate(Apple::new).limit(100).collect(Collectors.maxBy(Comparator.comparing(Apple::getWeight)));
System.out.println(collect1.get());
minBy(Comparator<? super T> comparator) {
return reducing(BinaryOperator.minBy(comparator));
}
public static <T> Collector<T, ?, Optional<T>>
reducing(BinaryOperator<T> op) {
class OptionalBox implements Consumer<T> {
T value = null;
boolean present = false;
@Override
public void accept(T t) {
if (present) {
value = op.apply(value, t);
}
else {
value = t;
present = true;
}
}
}
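- summingInt(x -> int), averagingInt(x -> int): take a lambda that maps each element to an int, then sum or average the mapped values.
System.out.println(Stream.generate(Apple::new).limit(100).collect(Collectors.summingInt(Apple::getWeight)));
- summarizingInt: gathers the count, sum, minimum, average, and maximum of the mapped values into an IntSummaryStatistics object.
System.out.println(Stream.generate(Apple::new).limit(10).collect(Collectors.summarizingInt(Apple::getWeight)));
// Printed result
IntSummaryStatistics{count=10, sum=392, min=1, average=39.200000, max=88}
- joining: concatenates the elements into one String. The elements must be CharSequence values, hence the map(Apple::toString) step. The overload joining(CharSequence) takes a delimiter.
System.out.println(Stream.generate(Apple::new).map(Apple::toString).limit(10).collect(joining("-")));
- reducing: takes three arguments. The first is the starting value of the reduction, which is also the result when the stream is empty. The second is a function that extracts the value to reduce from each element. The third is a BinaryOperator that combines two values of the same type into one.
System.out.println(Stream.generate(Apple::new).limit(10).collect(reducing(0, Apple::getWeight, (i, j) -> i + j)));
// With a single argument, reducing can be seen as the special case where the first element is the starting value and the mapper is the identity function; the result is an Optional
System.out.println(Stream.generate(Apple::new).limit(10).collect(reducing((d1,d2)->d1.getWeight()>d2.getWeight()?d1:d2)));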
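The Apple class is not shown in these notes, so the reduction collectors are repeated here with fixed data to make the results predictable. Fruit is a hypothetical stand-in for Apple, and the names and weights are invented for this sketch.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

class ReductionCollectorsDemo {
    // Hypothetical stand-in for the Apple class used in the text.
    static final class Fruit {
        final String name;
        final int weight;
        Fruit(String name, int weight) { this.name = name; this.weight = weight; }
        String getName() { return name; }
        int getWeight() { return weight; }
    }

    static final List<Fruit> FRUITS = Arrays.asList(
            new Fruit("apple", 150), new Fruit("pear", 120), new Fruit("plum", 30));

    static long count() {
        return FRUITS.stream().collect(Collectors.counting());
    }

    static Optional<Fruit> heaviest() {
        return FRUITS.stream().collect(Collectors.maxBy(Comparator.comparingInt(Fruit::getWeight)));
    }

    static int totalWeight() {
        return FRUITS.stream().collect(Collectors.summingInt(Fruit::getWeight));
    }

    static double averageWeight() {
        return FRUITS.stream().collect(Collectors.averagingInt(Fruit::getWeight));
    }

    static IntSummaryStatistics stats() {
        return FRUITS.stream().collect(Collectors.summarizingInt(Fruit::getWeight));
    }

    // joining needs CharSequence elements, so the names are extracted first.
    static String names() {
        return FRUITS.stream().map(Fruit::getName).collect(Collectors.joining("-"));
    }

    // The general three-argument reducing form, equivalent to summingInt here.
    static int totalByReducing() {
        return FRUITS.stream().collect(Collectors.reducing(0, Fruit::getWeight, Integer::sum));
    }

    public static void main(String[] args) {
        System.out.println(count());
        System.out.println(heaviest().map(Fruit::getName).orElse("none"));
        System.out.println(totalWeight());
        System.out.println(averageWeight());
        System.out.println(stats());
        System.out.println(names());
        System.out.println(totalByReducing());
    }
}
```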
- Grouping collectors
@Before
public void initial(){
stream = Stream.of(new Person("赵云",24,"china"),
new Person("张飞",10,"anhui"),
new Person("诸葛亮",50,"sichuan"),
new Person("关羽",55,"sichuan"),
new Person("刘备",55,"sichuan")
);
}
- groupingBy(Function<? super T, ? extends K> classifier): takes a classifier Function; each element of the stream goes into the group whose key the classifier returns for it.
@Test
public void testOneGroupBy(){
Map<String, List<Person>> collect = stream.collect(groupingBy(Person::getArea));
Assert.assertEquals(3,collect.size());
System.out.println(collect);
}
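- groupingBy(Function<? super T, ? extends K> classifier, Collector<? super T, A, D> downstream): takes a classifier and collects the elements of each resulting group with the downstream Collector. The example below passes a second groupingBy as the downstream collector, giving a two-level grouping.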
@Test
public void testSecondGroupBy(){
Map<String, Map<Integer, List<Person>>> collect = stream.collect(groupingBy(Person::getArea, groupingBy(Person::getAge)));
Assert.assertEquals(3,collect.size());
System.out.println(collect);
}
- Passing counting() as the second argument counts the elements in each group
@Test
public void testSecondCounting(){
Map<String, Long> collect = stream.collect(groupingBy(Person::getArea, Collectors.counting()));
Assert.assertEquals(Long.valueOf(3),collect.get("sichuan"));
System.out.println(collect);
}
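The three grouping tests above depend on JUnit and on a Person class that the notes do not show. The sketch below restates them as a standalone program; the Person class here is an assumed minimal version with the same constructor shape and getters, and the names are romanized.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupingDemo {
    // Assumed minimal Person class; the original one is not shown in the text.
    static final class Person {
        final String name;
        final int age;
        final String area;
        Person(String name, int age, String area) { this.name = name; this.age = age; this.area = area; }
        int getAge() { return age; }
        String getArea() { return area; }
    }

    static final List<Person> PEOPLE = Arrays.asList(
            new Person("Zhao Yun", 24, "china"),
            new Person("Zhang Fei", 10, "anhui"),
            new Person("Zhuge Liang", 50, "sichuan"),
            new Person("Guan Yu", 55, "sichuan"),
            new Person("Liu Bei", 55, "sichuan"));

    // One-level grouping: area -> people in that area.
    static Map<String, List<Person>> byArea() {
        return PEOPLE.stream().collect(Collectors.groupingBy(Person::getArea));
    }

    // Two-level grouping: area -> age -> people.
    static Map<String, Map<Integer, List<Person>>> byAreaThenAge() {
        return PEOPLE.stream().collect(
                Collectors.groupingBy(Person::getArea, Collectors.groupingBy(Person::getAge)));
    }

    // Grouping with a counting downstream collector: area -> number of people.
    static Map<String, Long> countByArea() {
        return PEOPLE.stream().collect(
                Collectors.groupingBy(Person::getArea, Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(byArea().keySet());
        System.out.println(byAreaThenAge().get("sichuan").keySet());
        System.out.println(countByArea());
    }
}
```

A list is used as the source instead of a shared Stream field, so each method can open its own stream and none is consumed twice.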
Understanding the Collector interface source
CollectorImpl(Supplier<A> supplier,
BiConsumer<A, T> accumulator,
BinaryOperator<A> combiner,
Function<A,R> finisher,
Set<Characteristics> characteristics)
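- T: the type of the elements in the stream
- A: the type of the accumulator, the object in which partial results are accumulated
- R: the type of the object that the collection produces
For example, a ToListCollector class would have the following signature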
public class ToListCollector<T> implements Collector<T,List<T>,List<T>>
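- T: the elements in the stream have type T
- List<T>: the type of the accumulator that holds the partial results
- List<T>: the type of the collector's result
Understanding the methods declared by the Collector interface
- supplier: creates a new result container, that is, an empty accumulator instance for the collection process to use. For ToListCollector the supplier looks like this
Supplier<List<T>> supplier = () -> new ArrayList<T>();
- accumulator: returns the function that performs the reduction. It defines how the accumulator holding the first n-1 elements is combined with the n-th element. Its shape is (accumulator, element) -> void, because the accumulator is updated in place: running the function changes its internal state to reflect the element just traversed. For ToListCollector the accumulator looks like this
BiConsumer<List<T>, T> accumulator = (list, t) -> list.add(t);
- finisher: applies the final transformation to the result container, converting the accumulator into the final result. For ToListCollector it looks like this
public Function<List<T>, List<T>> finisher() {
return Function.identity();
}
- combiner: merges two result containers. It returns a function used by the reduction that defines how the accumulators produced by the sub-parts of the stream are merged when those sub-parts are processed in parallel.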
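The collect process in detail
- The original stream is split recursively into substreams until the condition that decides whether a stream needs further splitting becomes false.
- The substreams are processed in parallel, each with the sequential reduction algorithm.
- Finally, the function returned by the collector's combiner method merges all the partial results pairwise.
The characteristics method
- Returns a set of hints that define whether the stream can be reduced in parallel and which optimizations may be applied.
- UNORDERED: the result of the reduction is not affected by the order in which the stream's items are traversed and accumulated.
- CONCURRENT: the accumulator can be called from multiple threads at the same time, so the collector can reduce a stream in parallel.
- IDENTITY_FINISH: the finisher is the identity function, so calling it can be skipped.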
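Putting the five methods together, a complete ToListCollector might look like the sketch below. It is one possible implementation under stated assumptions, not the JDK's own toList collector: only IDENTITY_FINISH is declared, because an ArrayList is not safe for concurrent accumulation, and the main method is added purely as a demonstration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collector;
import java.util.stream.IntStream;

class ToListCollector<T> implements Collector<T, List<T>, List<T>> {
    // Creates the empty accumulator.
    @Override
    public Supplier<List<T>> supplier() {
        return ArrayList::new;
    }

    // Folds one element into the accumulator, updating it in place.
    @Override
    public BiConsumer<List<T>, T> accumulator() {
        return List::add;
    }

    // Merges the accumulators of two substreams; used only for parallel streams.
    @Override
    public BinaryOperator<List<T>> combiner() {
        return (left, right) -> {
            left.addAll(right);
            return left;
        };
    }

    // The accumulator already is the result, so no conversion is needed.
    @Override
    public Function<List<T>, List<T>> finisher() {
        return Function.identity();
    }

    // Assumption: not CONCURRENT and not UNORDERED, since ArrayList is neither
    // thread-safe nor order-insensitive.
    @Override
    public Set<Characteristics> characteristics() {
        return Collections.unmodifiableSet(EnumSet.of(Characteristics.IDENTITY_FINISH));
    }

    public static void main(String[] args) {
        // On a parallel stream the substreams are accumulated separately and then
        // merged pairwise by the combiner, preserving encounter order.
        List<Integer> numbers = IntStream.rangeClosed(1, 20).boxed().parallel()
                .collect(new ToListCollector<Integer>());
        System.out.println(numbers);
    }
}
```

Because the collector is neither CONCURRENT nor UNORDERED, collect takes the evaluate(ReduceOps.makeRef(collector)) branch of the source shown earlier, and the combiner is what joins the partial lists.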