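0 Introduction to Stream
Overall, a Stream feels like an evolved version of Iterator. The Java 8 source describes it this way:
A sequence of elements supporting sequential and parallel aggregate operations
It makes complex operations on collections, such as traversal, filtering, mapping, reduction, and slicing, convenient. Intermediate operations return a new Stream and never modify the source data. They are also lazy: nothing is computed until a terminal operation runs, at which point the intermediate operations are applied together in as few passes as possible.
Compared with traditional manipulation of objects and data, Stream focuses on computation over the flow of elements, which is close to the functional programming style.
How much of an improvement it really is, see for yourself.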
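To make the laziness concrete, here is a minimal sketch. The class name LazyDemo and the helper inspectedBy are made up for illustration. It records which elements the filter actually looks at when only a few results are requested.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class LazyDemo {
    /**
     * Builds filter -> map over 1..5, then asks for only `limit` results.
     * Returns the elements the filter actually inspected.
     */
    static List<Integer> inspectedBy(int limit) {
        List<Integer> seen = new ArrayList<>();
        Stream<Integer> pipeline = Stream.of(1, 2, 3, 4, 5)
                .filter(i -> { seen.add(i); return i % 2 == 1; }) // keep odd numbers
                .map(i -> i * 10);
        // No terminal operation has run yet, so the filter has not been called.
        if (!seen.isEmpty()) {
            throw new IllegalStateException("intermediate operations ran eagerly");
        }
        // The terminal operation pulls elements through the pipeline one at a time.
        pipeline.limit(limit).collect(Collectors.toList());
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(inspectedBy(2)); // [1, 2, 3]
    }
}
```

With limit(2), the sequential pipeline stops as soon as the second odd number has passed through, so 4 and 5 are never inspected.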
Given this input:
List<Integer> list = Arrays.asList(1, null, 3, 1, null, 4, 5, null, 2, 0);
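Compute the sum of the non-null odd numbers in the input, counting equal odd numbers only once.
- Before lambdas existed, you would probably have done something like this:
int s = 0;
Set<Integer> set = new HashSet<>(list); // deduplicate first
for (Integer i : set) {
    // (i & 1) == 1 is true for odd numbers
    if (i != null && (i & 1) == 1) {
        s += i;
    }
}
System.out.println(s);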
- With Stream and lambdas, one statement is enough:
int sum = list.stream().filter(e -> e != null && (e & 1) == 1).distinct().mapToInt(i -> i).sum();
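As a quick sanity check, the sketch below wraps both approaches in methods and runs them on the input above. The class and method names are hypothetical; for this input the distinct odd values are 1, 3 and 5.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;

final class DistinctOddSum {
    /** Pre-Java-8 style: deduplicate with a Set, then loop. */
    static int withLoop(List<Integer> input) {
        int sum = 0;
        for (Integer i : new HashSet<Integer>(input)) {
            if (i != null && (i & 1) == 1) {
                sum += i;
            }
        }
        return sum;
    }

    /** Stream style: drop nulls, keep odd values, deduplicate, sum. */
    static int withStream(List<Integer> input) {
        return input.stream()
                .filter(Objects::nonNull)
                .filter(e -> (e & 1) == 1)
                .distinct()
                .mapToInt(Integer::intValue)
                .sum();
    }

    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(1, null, 3, 1, null, 4, 5, null, 2, 0);
        System.out.println(withLoop(list));   // 9
        System.out.println(withStream(list)); // 9
    }
}
```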
1 Obtaining a Stream
Since Java 1.8, interfaces can also contain methods marked default.
java.util.Collection<E> declares the following:
public interface Collection<E> extends Iterable<E> {
default Stream<E> stream() {
return StreamSupport.stream(spliterator(), false);
}
default Stream<E> parallelStream() {
return StreamSupport.stream(spliterator(), true);
}
}
java.util.Arrays declares the following:
public static <T> Stream<T> stream(T[] array) {
return stream(array, 0, array.length);
}
public static IntStream stream(int[] array) {
return stream(array, 0, array.length);
}
Example
List<String> strs = Arrays.asList("apache", "spark");
Stream<String> stringStream = strs.stream();
IntStream intStream = Arrays.stream(new int[] { 1, 25, 4, 2 });
Stream<String> stream = Stream.of("hello", "world");
Stream<String> stream2 = Stream.of("haha");
Stream<HouseInfo> stream3 = Stream.of(new HouseInfo[] { new HouseInfo(), new HouseInfo() });
Stream<Integer> stream4 = Stream.iterate(1, i -> 2 * i + 1);
Stream<Double> stream5 = Stream.generate(() -> Math.random());
Note: Stream.iterate() and Stream.generate() produce infinite streams, so you normally have to bound them yourself with limit().
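A small sketch of bounding both kinds of infinite stream with limit(). The class and method names are made up for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class InfiniteStreams {
    /** First n terms of 1, 3, 7, 15, ... (each term is 2 * previous + 1). */
    static List<Integer> firstTerms(int n) {
        return Stream.iterate(1, i -> 2 * i + 1)
                .limit(n) // without this the stream never ends
                .collect(Collectors.toList());
    }

    /** n copies of a value; generate() keeps calling the Supplier until limited. */
    static List<String> repeat(String value, int n) {
        return Stream.generate(() -> value)
                .limit(n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstTerms(5));   // [1, 3, 7, 15, 31]
        System.out.println(repeat("ha", 3)); // [ha, ha, ha]
    }
}
```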
2 Transforming a Stream
Filtering and slicing
This part is fairly straightforward; one example is enough.
Stream<String> stream = Stream.of(
null, "apache", null, "apache", "apache",
"github", "docker", "java",
"hadoop", "linux", "spark", "alifafa");
stream
.filter(e -> e != null && e.contains("a"))
.distinct()
.limit(3)
.forEach(System.out::println);
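Slicing usually also involves skip(), which the example above does not use. Here is a minimal sketch of paging with skip() and limit(); the helper name page and its parameters are assumptions for illustration only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

final class PageDemo {
    /** Returns page `pageIndex` (0-based) of at most `pageSize` distinct non-null items. */
    static List<String> page(List<String> items, int pageIndex, int pageSize) {
        return items.stream()
                .filter(Objects::nonNull)
                .distinct()
                .skip((long) pageIndex * pageSize) // drop the earlier pages
                .limit(pageSize)                   // keep one page
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList(
                null, "apache", null, "apache", "github", "docker", "java", "hadoop");
        System.out.println(page(items, 0, 2)); // [apache, github]
        System.out.println(page(items, 1, 2)); // [docker, java]
        System.out.println(page(items, 2, 2)); // [hadoop]
    }
}
```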
map/flatMap
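Stream's map is declared as follows:
<R> Stream<R> map(Function<? super T, ? extends R> mapper);
In other words, it takes one input (T, the element currently being processed) and produces an output of another type (R).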
Stream.of(null, "apache", null, "apache", "apache",
"hadoop", "linux", "spark", "alifafa")
.filter(e -> e != null && e.length() > 0)
.map(str -> str.charAt(0))
.forEach(System.out::println);
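The example above only covers map. As a rough sketch of flatMap: where map produces exactly one output per element, flatMap lets each element expand into a stream of zero or more outputs, which are then flattened into one stream. The class and method names here are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class FlatMapDemo {
    /** Splits every line on spaces and flattens the pieces into one list of words. */
    static List<String> words(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Arrays.stream(line.split(" "))) // one line -> many words
                .collect(Collectors.toList());
    }

    /** Plain map for comparison: one word -> one length. */
    static List<Integer> lengths(List<String> words) {
        return words.stream().map(String::length).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> w = words(Arrays.asList("apache spark", "hello world"));
        System.out.println(w);          // [apache, spark, hello, world]
        System.out.println(lengths(w)); // [6, 5, 5, 5]
    }
}
```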
sorted
Sorting is also fairly intuitive. There are two variants:
Stream<T> sorted();
Stream<T> sorted(Comparator<? super T> comparator);
Example:
List<HouseInfo> houseInfos = Lists.newArrayList(
new HouseInfo(1, "恒大星级公寓", 100, 1),
new HouseInfo(2, "汇智湖畔", 999, 2),
new HouseInfo(3, "张江汤臣豪园", 100, 1),
new HouseInfo(4, "保利星苑", 23, 10),
new HouseInfo(5, "北顾小区", 66, 23),
new HouseInfo(6, "北杰公寓", null, 55),
new HouseInfo(7, "保利星苑", 77, 66),
new HouseInfo(8, "保利星苑", 111, 12)
);
// Null keys sort last; returning 0 for nulls would break the Comparator contract.
Comparator<Integer> nullsLast = Comparator.nullsLast(Comparator.<Integer>naturalOrder());
houseInfos.stream()
    // order by distance, then by browse count
    .sorted(Comparator.comparing(HouseInfo::getDistance, nullsLast)
        .thenComparing(HouseInfo::getBrowseCount, nullsLast))
    // sorted() is lazy, so a terminal operation is needed to see the result
    .forEach(System.out::println);
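The HouseInfo class itself is not shown in this article, so the sketch below uses a trimmed-down, hypothetical stand-in (House) to show a null-safe two-level sort in a runnable form. The field names and sample values are assumptions.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

final class SortDemo {
    /** Hypothetical stand-in for HouseInfo; only the fields the sort needs. */
    static final class House {
        final int id;
        final Integer browseCount;
        final Integer distance;
        House(int id, Integer browseCount, Integer distance) {
            this.id = id;
            this.browseCount = browseCount;
            this.distance = distance;
        }
        Integer getBrowseCount() { return browseCount; }
        Integer getDistance() { return distance; }
    }

    /** Natural Integer order, with null keys placed after all non-null keys. */
    static final Comparator<Integer> NULLS_LAST =
            Comparator.nullsLast(Comparator.<Integer>naturalOrder());

    /** Made-up sample data containing null keys. */
    static List<House> sample() {
        return Arrays.asList(
                new House(1, 100, 1), new House(2, null, 1), new House(3, 50, null),
                new House(4, 10, 1), new House(5, 5, 0));
    }

    /** Ids ordered by distance, then by browse count. */
    static List<Integer> sortedIds(List<House> houses) {
        return houses.stream()
                .sorted(Comparator.comparing(House::getDistance, NULLS_LAST)
                        .thenComparing(House::getBrowseCount, NULLS_LAST))
                .map(h -> h.id)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(sortedIds(sample())); // [5, 4, 1, 2, 3]
    }
}
```

Treating null as "equal to everything" makes a comparator inconsistent, which is why the sketch gives nulls a definite position instead.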
3 Terminating/consuming a Stream
Match tests and basic statistics
List<Integer> list = Arrays.asList(1, 2, 3, 4, 5);
System.out.println(list.stream().allMatch(e -> e > 0));
System.out.println(list.stream().anyMatch(e -> (e & 1) == 0));
System.out.println(list.stream().noneMatch(e -> e < 0));
Optional<Integer> optional = list.stream().filter(e -> e >= 4).findFirst();
optional.ifPresent(System.out::println);
System.out.println(list.stream().filter(e -> e >= 4).count());
System.out.println(list.stream().min(Integer::compareTo));
System.out.println(list.stream().max(Integer::compareTo));
System.out.println(list.stream().mapToInt(i -> i).max());
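One thing worth noticing in the code above: min(), max() and findFirst() return an Optional, so printing them directly shows the wrapper rather than the bare value. A small sketch (names are hypothetical) of what that looks like and how to unwrap it safely:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

final class TerminalOpsDemo {
    /** Largest element, or `fallback` when the list is empty. */
    static int maxOrDefault(List<Integer> list, int fallback) {
        return list.stream().max(Integer::compareTo).orElse(fallback);
    }

    /** The text you get when printing the Optional returned by min(). */
    static String minAsText(List<Integer> list) {
        Optional<Integer> min = list.stream().min(Integer::compareTo);
        return String.valueOf(min);
    }

    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(1, 2, 3, 4, 5);
        List<Integer> empty = Collections.emptyList();
        System.out.println(minAsText(list));         // Optional[1]
        System.out.println(minAsText(empty));        // Optional.empty
        System.out.println(maxOrDefault(empty, -1)); // -1
        // On an empty stream allMatch and noneMatch are true, anyMatch is false.
        System.out.println(empty.stream().allMatch(e -> e > 0)); // true
        System.out.println(empty.stream().anyMatch(e -> e > 0)); // false
    }
}
```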
reduce
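Chinese texts translate this term variously as "规约" (reduction) or "汇聚" (aggregation).
Either way, it gathers up the data left in the stream after a series of transformations, repeatedly applying a reduce function while doing so.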
reduce() has three overloads:
Optional<T> reduce(BinaryOperator<T> accumulator);
T reduce(T identity, BinaryOperator<T> accumulator);
<U> U reduce(U identity,
    BiFunction<U, ? super T, U> accumulator,
    BinaryOperator<U> combiner);
Example:
Integer reduce = Stream.iterate(1, i -> i + 1)
.limit(10)
.reduce(0, (i, j) -> i + j);
Optional<Integer> reduce2 = Stream.iterate(1, i -> i + 1)
.limit(10)
.reduce((i, j) -> i + j);
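The three-argument overload is not exercised above. As a rough sketch: it is useful when the result type differs from the element type, with the combiner merging partial results (for example from a parallel stream). The class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

final class ReduceDemo {
    /** Sum of 1..n with the identity-based overload. */
    static int sumTo(int n) {
        return Stream.iterate(1, i -> i + 1)
                .limit(n)
                .reduce(0, Integer::sum);
    }

    /** Total length of all words: elements are String, the result is Integer. */
    static int totalLength(List<String> words) {
        return words.parallelStream()
                .reduce(0,
                        (acc, w) -> acc + w.length(), // fold one word into a partial sum
                        Integer::sum);                // merge two partial sums
    }

    public static void main(String[] args) {
        System.out.println(sumTo(10)); // 55
        System.out.println(totalLength(Arrays.asList("apache", "spark", "java"))); // 15
    }
}
```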
collect
This operation is easy to understand: as the name suggests, it collects the elements of the Stream into one place.
<R> R collect(Supplier<R> supplier,
BiConsumer<R, ? super T> accumulator,
BiConsumer<R, R> combiner);
<R, A> R collect(Collector<? super T, A, R> collector);
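The key parts of the Collector interface (it is not a functional interface, so it cannot be written as a single lambda) are:
public interface Collector<T, A, R> {
    /**
     * Creates a new mutable result container.
     */
    Supplier<A> supplier();
    /**
     * Folds one element into a result container.
     */
    BiConsumer<A, T> accumulator();
    /**
     * Merges two partial result containers into one.
     */
    BinaryOperator<A> combiner();
    /**
     * Converts the container into the final result.
     */
    Function<A, R> finisher();
    /**
     * Describes properties of this collector (CONCURRENT, UNORDERED, IDENTITY_FINISH).
     */
    Set<Characteristics> characteristics();
}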
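Although a Collector cannot be a single lambda, the static factory Collector.of lets you assemble one from four lambdas, one per method above. A minimal sketch follows; the concat collector is a made-up example (Collectors.joining() already covers this case).

```java
import java.util.stream.Collector;
import java.util.stream.Stream;

final class CollectorOfDemo {
    /** A collector that concatenates strings into one String. */
    static Collector<String, StringBuilder, String> concat() {
        return Collector.of(
                StringBuilder::new,                  // supplier: new container
                (sb, s) -> sb.append(s),             // accumulator: add one element
                (left, right) -> left.append(right), // combiner: merge two containers
                StringBuilder::toString);            // finisher: container -> result
    }

    public static void main(String[] args) {
        String joined = Stream.of("a", "b", "c").collect(concat());
        System.out.println(joined); // abc
    }
}
```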
First, an example of the three-argument collect() method. Barring special cases, I bet that once you have seen it you will never want to use it...
List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);
ArrayList<Integer> ret1 = numbers.stream()
.map(i -> i * 2)
.collect(
() -> new ArrayList<Integer>(),
(list, e) -> list.add(e),
(list1, list2) -> list1.addAll(list2)
);
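/***
 * <pre>
 * The three arguments to collect() are:
 * 1. () -> new ArrayList<Integer>()
 *      creates a new collection to hold the result
 * 2. (list, e) -> list.add(e)
 *      list: the new collection created by argument 1
 *      e: the Stream element currently being processed
 *      adds the element to the newly created collection
 * 3. (list1, list2) -> list1.addAll(list2)
 *      merges two collections (needed to combine partial results, e.g. in a parallel stream)
 * </pre>
 ***/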
ret1.forEach(System.out::println);
Without lambdas, the equivalent code would look like this...
List<Integer> ret3 = numbers.stream()
.map(i -> i * 2)
.collect(new Supplier<List<Integer>>() {
@Override
public List<Integer> get() {
return new ArrayList<>();
}
}, new BiConsumer<List<Integer>, Integer>() {
@Override
public void accept(List<Integer> list, Integer e) {
list.add(e);
}
}, new BiConsumer<List<Integer>, List<Integer>>() {
@Override
public void accept(List<Integer> list1, List<Integer> list2) {
list1.addAll(list2);
}
});
ret3.forEach(System.out::println);
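Feeling a little queasy yet...?
Likewise, when you call Spark's API from Java without lambdas, the code is even uglier than the above...
While I'm at it, a free plug: take a look at yours truly's Spark HelloWorld implemented in several versions: http://blog.csdn.net/hylexus/article/details/52606540, which shows just how much happier life is with lambdas...
That said, once you understand the three-argument collect method, you can use constructor references and method references to make the code more concise: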
ArrayList<Integer> ret2 = numbers.stream()
.map(i -> i * 2)
.collect(
ArrayList::new,
List::add,
List::addAll
);
ret2.forEach(System.out::println);
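The third argument only earns its keep when the stream is split up. As a rough sketch (class and method names are hypothetical), the same collect call on a parallel stream fills several partial lists and then merges them with addAll, and for an ordered source the result still comes out in encounter order:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ParallelCollectDemo {
    /** Doubles every number on a parallel stream and collects into an ArrayList. */
    static ArrayList<Integer> doubled(List<Integer> numbers) {
        return numbers.parallelStream()
                .map(i -> i * 2)
                .collect(
                        ArrayList::new,     // one container per chunk of work
                        ArrayList::add,     // add an element to a chunk's container
                        ArrayList::addAll); // merge the containers of two chunks
    }

    public static void main(String[] args) {
        System.out.println(doubled(Arrays.asList(1, 2, 3, 4, 5))); // [2, 4, 6, 8, 10]
    }
}
```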
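Using the Collectors utility (advanced statistics)
Both the three-argument and the one-argument collect() methods above are quite complex. The one-argument version is the one used most often, but implementing its Collector yourself is still painful.
Fortunately, the Collectors for the common collect operations are all provided in java.util.stream.Collectors. A very powerful tool...
All of the following examples operate on this list: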
List<HouseInfo> houseInfos = Lists.newArrayList(
new HouseInfo(1, "恒大星级公寓", 100, 1),
new HouseInfo(2, "汇智湖畔", 999, 2),
new HouseInfo(3, "张江汤臣豪园", 100, 1),
new HouseInfo(4, "保利星苑", 111, 10),
new HouseInfo(5, "北顾小区", 66, 23),
new HouseInfo(6, "北杰公寓", 77, 55),
new HouseInfo(7, "保利星苑", 77, 66),
new HouseInfo(8, "保利星苑", 111, 12)
);
OK, time to show off ^_^ ...
List<String> ret1 = houseInfos.stream()
.map(HouseInfo::getHouseName).collect(Collectors.toList());
ret1.forEach(System.out::println);
Set<String> ret2 = houseInfos.stream()
.map(HouseInfo::getHouseName).collect(Collectors.toSet());
ret2.forEach(System.out::println);
String names = houseInfos.stream()
.map(HouseInfo::getHouseName).collect(Collectors.joining("_^_"));
System.out.println(names);
ArrayList<String> collect = houseInfos.stream()
.map(HouseInfo::getHouseName)
.collect(Collectors.toCollection(ArrayList::new));
Optional<HouseInfo> ret3 = houseInfos.stream()
.filter(h -> h.getBrowseCount() != null)
.collect(Collectors.maxBy((h1, h2) -> Integer.compare(h1.getBrowseCount(), h2.getBrowseCount())));
System.out.println(ret3.get());
Optional<Integer> ret4 = houseInfos.stream()
.filter(h -> h.getBrowseCount() != null)
.map(HouseInfo::getBrowseCount)
.collect(Collectors.maxBy(Integer::compare));
System.out.println(ret4.get());
Long total = houseInfos.stream().collect(Collectors.counting());
System.out.println(total);
Integer ret5 = houseInfos.stream()
.filter(h -> h.getBrowseCount() != null)
.collect(Collectors.summingInt(HouseInfo::getBrowseCount));
System.out.println(ret5);
Integer ret6 = houseInfos.stream()
.filter(h -> h.getBrowseCount() != null)
.map(HouseInfo::getBrowseCount).collect(Collectors.summingInt(i -> i));
System.out.println(ret6);
int ret7 = houseInfos.stream()
.filter(h -> h.getBrowseCount() != null)
.mapToInt(HouseInfo::getBrowseCount)
.sum();
System.out.println(ret7);
Double ret8 = houseInfos.stream()
.filter(h -> h.getBrowseCount() != null)
.collect(Collectors.averagingDouble(HouseInfo::getBrowseCount));
System.out.println(ret8);
OptionalDouble ret9 = houseInfos.stream()
.filter(h -> h.getBrowseCount() != null)
.mapToDouble(HouseInfo::getBrowseCount)
.average();
System.out.println(ret9.getAsDouble());
DoubleSummaryStatistics statistics = houseInfos.stream()
.filter(h -> h.getBrowseCount() != null)
.collect(Collectors.summarizingDouble(HouseInfo::getBrowseCount));
System.out.println("avg:" + statistics.getAverage());
System.out.println("max:" + statistics.getMax());
System.out.println("sum:" + statistics.getSum());
Map<Integer, List<HouseInfo>> ret10 = houseInfos.stream()
.filter(h -> h.getBrowseCount() != null)
.collect(Collectors.groupingBy(HouseInfo::getBrowseCount));
ret10.forEach((count, house) -> {
System.out.println("BrowseCount:" + count + " " + house);
});
Map<Integer, Map<String, List<HouseInfo>>> ret11 = houseInfos.stream()
    .filter(h -> h.getBrowseCount() != null && h.getDistance() != null)
    .collect(Collectors.groupingBy(
        HouseInfo::getBrowseCount,
        Collectors.groupingBy((HouseInfo h) -> {
            if (h.getDistance() <= 10)
                return "fairly near";
            else if (h.getDistance() <= 20)
                return "near";
            return "fairly far";
        })));
ret11.forEach((count, v) -> {
    System.out.println("Browse count: " + count);
    v.forEach((desc, houses) -> {
        System.out.println("\t" + desc);
        houses.forEach(h -> System.out.println("\t\t" + h));
    });
});
/****
 * <pre>
 * Browse count: 66
    fairly far
        HouseInfo [houseId=5, houseName=北顾小区, browseCount=66, distance=23]
Browse count: 100
    fairly near
        HouseInfo [houseId=1, houseName=恒大星级公寓, browseCount=100, distance=1]
        HouseInfo [houseId=3, houseName=张江汤臣豪园, browseCount=100, distance=1]
Browse count: 999
    fairly near
        HouseInfo [houseId=2, houseName=汇智湖畔, browseCount=999, distance=2]
Browse count: 77
    fairly far
        HouseInfo [houseId=6, houseName=北杰公寓, browseCount=77, distance=55]
        HouseInfo [houseId=7, houseName=保利星苑, browseCount=77, distance=66]
Browse count: 111
    near
        HouseInfo [houseId=8, houseName=保利星苑, browseCount=111, distance=12]
    fairly near
        HouseInfo [houseId=4, houseName=保利星苑, browseCount=111, distance=10]
 *
 * </pre>
 *
 ****/
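groupingBy also accepts a downstream collector other than another groupingBy. The sketch below uses counting() and mapping() on a hypothetical, trimmed-down House class, since HouseInfo itself is not shown in this article. The names and sample values are placeholders.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class GroupingDemo {
    /** Hypothetical stand-in for HouseInfo with just a name and a distance. */
    static final class House {
        private final String name;
        private final Integer distance;
        House(String name, Integer distance) {
            this.name = name;
            this.distance = distance;
        }
        String getName() { return name; }
        Integer getDistance() { return distance; }
    }

    /** Made-up sample data. */
    static List<House> sample() {
        return Arrays.asList(
                new House("Poly", 10), new House("Hengda", 1),
                new House("Poly", 66), new House("Poly", 12));
    }

    /** How many houses share each name. */
    static Map<String, Long> countByName(List<House> houses) {
        return houses.stream()
                .collect(Collectors.groupingBy(House::getName, Collectors.counting()));
    }

    /** The distances recorded under each name, in encounter order. */
    static Map<String, List<Integer>> distancesByName(List<House> houses) {
        return houses.stream()
                .collect(Collectors.groupingBy(House::getName,
                        Collectors.mapping(House::getDistance, Collectors.toList())));
    }

    public static void main(String[] args) {
        System.out.println(countByName(sample()).get("Poly"));     // 3
        System.out.println(distancesByName(sample()).get("Poly")); // [10, 66, 12]
    }
}
```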
Map<Boolean, List<HouseInfo>> ret12 = houseInfos.stream()
.filter(h -> h.getDistance() != null)
.collect(Collectors.partitioningBy(h -> h.getDistance() <= 20));
/****
 * <pre>
 * fairly far
        HouseInfo [houseId=5, houseName=北顾小区, browseCount=66, distance=23]
        HouseInfo [houseId=6, houseName=北杰公寓, browseCount=77, distance=55]
        HouseInfo [houseId=7, houseName=保利星苑, browseCount=77, distance=66]
fairly near
        HouseInfo [houseId=1, houseName=恒大星级公寓, browseCount=100, distance=1]
        HouseInfo [houseId=2, houseName=汇智湖畔, browseCount=999, distance=2]
        HouseInfo [houseId=3, houseName=张江汤臣豪园, browseCount=100, distance=1]
        HouseInfo [houseId=4, houseName=保利星苑, browseCount=111, distance=10]
        HouseInfo [houseId=8, houseName=保利星苑, browseCount=111, distance=12]
 *
 * </pre>
 ****/
ret12.forEach((t, houses) -> {
    System.out.println(t ? "fairly near" : "fairly far");
    houses.forEach(h -> System.out.println("\t\t" + h));
});
Map<Boolean, Map<Boolean, List<HouseInfo>>> ret13 = houseInfos.stream()
.filter(h -> h.getDistance() != null && h.getBrowseCount() != null)
.collect(
Collectors.partitioningBy(h -> h.getDistance() <= 20,
Collectors.partitioningBy(h -> h.getBrowseCount() >= 70))
);
/*****
 * <pre>
 * fairly far
    fewer views
        HouseInfo [houseId=5, houseName=北顾小区, browseCount=66, distance=23]
    more views
        HouseInfo [houseId=6, houseName=北杰公寓, browseCount=77, distance=55]
        HouseInfo [houseId=7, houseName=保利星苑, browseCount=77, distance=66]
fairly near
    fewer views
    more views
        HouseInfo [houseId=1, houseName=恒大星级公寓, browseCount=100, distance=1]
        HouseInfo [houseId=2, houseName=汇智湖畔, browseCount=999, distance=2]
        HouseInfo [houseId=3, houseName=张江汤臣豪园, browseCount=100, distance=1]
        HouseInfo [houseId=4, houseName=保利星苑, browseCount=111, distance=10]
        HouseInfo [houseId=8, houseName=保利星苑, browseCount=111, distance=12]
 * </pre>
 ****/
ret13.forEach((less, value) -> {
    System.out.println(less ? "fairly near" : "fairly far");
    value.forEach((moreCount, houses) -> {
        System.out.println(moreCount ? "\tmore views" : "\tfewer views");
        houses.forEach(h -> System.out.println("\t\t" + h));
    });
});
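Unlike groupingBy, partitioningBy always produces both the true and the false key, even when one side is empty. A small sketch using plain distances (the class and method names are made up) with counting() as the downstream collector:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class PartitionDemo {
    /** Counts how many distances are within `threshold` (key true) and beyond it (key false). */
    static Map<Boolean, Long> countNear(List<Integer> distances, int threshold) {
        return distances.stream()
                .collect(Collectors.partitioningBy(d -> d <= threshold, Collectors.counting()));
    }

    public static void main(String[] args) {
        // The distances of the eight houses in the list above.
        List<Integer> distances = Arrays.asList(1, 2, 1, 10, 23, 55, 66, 12);
        Map<Boolean, Long> counts = countNear(distances, 20);
        System.out.println(counts.get(true));  // 5
        System.out.println(counts.get(false)); // 3
    }
}
```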