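Continuing from the previous post: part one covered some basic operations, such as common string joining and type conversion. This post looks at some other stream operations:
- filter: as the name suggests, filters a stream; elements that match the condition are kept
- distinct: removes duplicates
- limit: keeps at most the first n elements of the result
- skip: skips the first n elements
- count: counts the elements
- group by: groups the elements
Looking at these operations, doesn't it feel like writing SQL? That's right: the simple "SQL-like" operations we used to write can now be done with streams. Let's walk through the common SQL-style collection operations with examples.
Think of a List as a table in a relational database and each field as a column in the table schema. Here is a class to test with: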
class Foo{
private Integer id;
private Integer age;
private Byte gender;
private String name;
private String dept;
}
The corresponding table could be pictured like this:
create table foo(
id int auto_increment primary key,
age int,
gender tinyint,
name varchar(30),
dept varchar(20)
)
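Below is a complete example. To keep the code short, all the later tests build on it.
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import org.junit.Before;
import org.junit.Test;
// JSON is assumed to be fastjson's com.alibaba.fastjson.JSON (used by Foo.toString)
import com.alibaba.fastjson.JSON;
public class StreamSqlTest {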
private List<Foo> list = new ArrayList<>();
@Before
public void beforeTest() {
for (int i = 0; i < 10; i++) {
Foo foo = new Foo(i, i % 4, (byte) (i % 2), "name" + i % 8, "dept" + i % 7);
list.add(foo);
}
}
@Test
public void testFilter() {
System.out.println("before filter:");
printList(list);
List<Foo> foos = list.stream().filter(foo -> foo.getAge() > 2).collect(Collectors.toList());
System.out.println("after filter");
printList(foos);
}
private static final void printList(List<Foo> list) {
System.out.println("list size:" + list.size());
for (int i = 0; i < list.size(); i++) {
System.out.println(list.get(i));
}
}
}
class Foo {
public Foo(int id, int age, byte gender, String name, String dept) {
this.id = id;
this.age = age;
this.gender = gender;
this.name = name;
this.dept = dept;
}
private Integer id;
private Integer age;
private Byte gender;
private String name;
private String dept;
@Override
public String toString() {
return JSON.toJSONString(this);
}
public Integer getId() {
return id;
}
public void setId(Integer id) {
this.id = id;
}
public Integer getAge() {
return age;
}
public void setAge(Integer age) {
this.age = age;
}
public Byte getGender() {
return gender;
}
public void setGender(Byte gender) {
this.gender = gender;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
public String getDept() {
return dept;
}
public void setDept(String dept) {
this.dept = dept;
}
}
In testFilter above, filter keeps the elements with age > 2. Here is the full output:
before filter:
list size:10
{"age":0,"dept":"dept0","gender":0,"id":0,"name":"name0"}
{"age":1,"dept":"dept1","gender":1,"id":1,"name":"name1"}
{"age":2,"dept":"dept2","gender":0,"id":2,"name":"name2"}
{"age":3,"dept":"dept3","gender":1,"id":3,"name":"name3"}
{"age":0,"dept":"dept4","gender":0,"id":4,"name":"name4"}
{"age":1,"dept":"dept5","gender":1,"id":5,"name":"name5"}
{"age":2,"dept":"dept6","gender":0,"id":6,"name":"name6"}
{"age":3,"dept":"dept0","gender":1,"id":7,"name":"name7"}
{"age":0,"dept":"dept1","gender":0,"id":8,"name":"name0"}
{"age":1,"dept":"dept2","gender":1,"id":9,"name":"name1"}
after filter
list size:2
{"age":3,"dept":"dept3","gender":1,"id":3,"name":"name3"}
{"age":3,"dept":"dept0","gender":1,"id":7,"name":"name7"}
This has the same effect as writing select * from foo where age>2 in SQL.
- count
Now let's look at a count example:
long count = list.stream().filter(foo -> foo.getAge() > 2).count();
The count example is simple; the code above has the same effect as the following SQL:
select count(1) from foo where age>2
- distinct:
Stream version:
list.stream().map(foo->foo.getDept()).distinct().collect(Collectors.toList());
SQL version:
select distinct dept from foo;
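As a small runnable sketch of the distinct example: distinct() compares elements with equals(), and the Foo class in this post does not override equals(), so calling distinct() directly on the Foo objects would not collapse anything. Mapping to the dept string first, as the one-liner does, is what makes it behave like select distinct dept. The class and method names here (DistinctSketch, distinctDepts) are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class DistinctSketch {
    // Keeps the first occurrence of each value, in encounter order.
    static List<String> distinctDepts(List<String> depts) {
        return depts.stream().distinct().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> depts = Arrays.asList("dept0", "dept1", "dept0", "dept2", "dept1");
        System.out.println(distinctDepts(depts)); // prints [dept0, dept1, dept2]
    }
}
```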
- skip&limit:
Stream version:
list.stream().skip(2).limit(5).collect(Collectors.toList());
SQL version:
select * from foo limit 2,5
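To make the mapping concrete, here is a minimal sketch in which skip plays the role of the offset and limit the role of the row count in limit 2,5. It also shows that the order of the two calls matters. The names SkipLimitSketch and page are hypothetical, and plain integers stand in for Foo rows.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class SkipLimitSketch {
    // skip(offset) drops the first rows, limit(size) then caps what is left.
    static List<Integer> page(List<Integer> rows, long offset, long size) {
        return rows.stream().skip(offset).limit(size).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> rows = IntStream.range(0, 10).boxed().collect(Collectors.toList());

        System.out.println(page(rows, 2, 5)); // prints [2, 3, 4, 5, 6]

        // Reversing the calls caps first and skips afterwards, so fewer rows survive.
        List<Integer> reversed = rows.stream().limit(5).skip(2).collect(Collectors.toList());
        System.out.println(reversed); // prints [2, 3, 4]
    }
}
```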
- group by
Stream version, grouping Foo objects by department:
list.stream().collect(Collectors.groupingBy((Foo f) -> f.getDept()));
SQL version:
select id,dept from foo group by dept
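Note that, strictly speaking, the two return different structures. A SQL group by is usually combined with aggregate functions, whereas the stream group by partitions the elements of the collection by key. Think about working with a semi-automatic ORM framework such as MyBatis: what a query brings back from the database is usually a List. The old approach was to loop over that list and group it by hand with a Map, something like this:
List<Foo> list = getFooListFromDB();
Map<String, List<Foo>> result = new HashMap<>();
for (Foo f : list) {
    String key = f.getDept();
    List<Foo> group = result.get(key);
    if (group == null) {
        // first element for this dept: create its bucket
        group = new ArrayList<>();
        result.put(key, group);
    }
    group.add(f);
}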
I'm surely not the only one who has written code like this. With the JDK 1.8 Stream API, the code can be much more concise and readable.
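For comparison with the hand-written loop, here is a minimal sketch of the same grouping done with Collectors.groupingBy, plus a variant with a counting downstream collector that comes closer to a SQL group by with an aggregate. Row is a hypothetical stand-in for Foo reduced to two fields, and TreeMap is used only so the printed order is predictable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class GroupBySketch {
    // Hypothetical stand-in for Foo, reduced to the two fields used here.
    static class Row {
        final Integer id;
        final String dept;
        Row(Integer id, String dept) { this.id = id; this.dept = dept; }
        Integer getId() { return id; }
        String getDept() { return dept; }
    }

    // Same shape as the manual loop: dept -> ids of the rows in that dept.
    static Map<String, List<Integer>> idsByDept(List<Row> rows) {
        return rows.stream().collect(Collectors.groupingBy(
                Row::getDept, TreeMap::new,
                Collectors.mapping(Row::getId, Collectors.toList())));
    }

    // Roughly "select dept, count(*) from foo group by dept".
    static Map<String, Long> countByDept(List<Row> rows) {
        return rows.stream().collect(Collectors.groupingBy(
                Row::getDept, TreeMap::new, Collectors.counting()));
    }

    // Same data pattern as the test class in this post: dept is "dept" + i % 7.
    static List<Row> sampleRows() {
        List<Row> rows = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            rows.add(new Row(i, "dept" + i % 7));
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(idsByDept(sampleRows()));
        System.out.println(countByDept(sampleRows()));
    }
}
```

With the sample data, dept0, dept1 and dept2 each hold two rows and the remaining four departments hold one.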
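- toMap
Have you often run into this requirement: I have a batch of ids, need to call an interface to fetch the data for them, and then assign the data to the corresponding instances? Here is the "ordinary" way to write it: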
List<Integer> ids = getIdsFromSomeWhere();
for(Integer id:ids){
RemoteObject obj = remoteCall(id);
// do some thing here for remote object
}
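I have seen this pattern a lot at work. Network and disk I/O are generally considered expensive, so whatever can be fetched in a batch usually should be; after all, CPU speed and network or disk I/O speed are several orders of magnitude apart.
Now look at this version: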
List<Integer> ids = getIdsFromSomeWhere();
StringBuilder sb = new StringBuilder();
for (Integer id : ids) {
    sb.append(id).append(",");
}
String idStr = sb.toString();
// drop the trailing comma (assumes ids is not empty)
idStr = idStr.substring(0, idStr.length() - 1);
List<RemoteObject> list = batchGetRemoteObject(idStr);
Map<Integer, RemoteObject> map = new HashMap<>();
for (RemoteObject r : list) {
    map.put(r.getId(), r);
}
for (Integer id : ids) {
    // do something here
    RemoteObject r = map.get(id);
}
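The above is pseudocode: it merges the remote calls into a single call and then converts the list into a map, so that the later for loop can fetch the matching RemoteObject through the map's get method. This approach is actually quite good; its only drawback is that the code is too "wordy". Compare it with the following: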
List<Integer> ids = getIdsFromSomeWhere();
String idStr = ids.stream().map(String::valueOf).collect(Collectors.joining(","));
List<RemoteObject> list = batchGetRemoteObject(idStr);
Map<Integer, RemoteObject> map = list.stream().collect(Collectors.toMap(RemoteObject::getId, Function.identity()));
// for loop do some thing
toMap can be seen as a special case of group by, except that toMap requires the keys to be unique: with the two-argument form used above, duplicate keys cause a java.lang.IllegalStateException.
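The duplicate-key behaviour is easy to reproduce. The sketch below shows the exception and one possible way around it: the three-argument overload of Collectors.toMap, which takes a merge function that decides which value wins. Item is a hypothetical stand-in for RemoteObject, and keeping the first value is only one choice among several.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class ToMapSketch {
    // Hypothetical stand-in for RemoteObject.
    static class Item {
        final Integer id;
        final String name;
        Item(Integer id, String name) { this.id = id; this.name = name; }
        Integer getId() { return id; }
        String getName() { return name; }
    }

    // Two-argument toMap: throws IllegalStateException on a duplicate key.
    static Map<Integer, Item> indexById(List<Item> items) {
        return items.stream().collect(Collectors.toMap(Item::getId, Function.identity()));
    }

    // Three-argument toMap: the merge function resolves duplicates (here: keep the first).
    static Map<Integer, Item> indexByIdKeepFirst(List<Item> items) {
        return items.stream().collect(
                Collectors.toMap(Item::getId, Function.identity(), (first, second) -> first));
    }

    public static void main(String[] args) {
        List<Item> duplicated = Arrays.asList(new Item(1, "a"), new Item(1, "b"));

        try {
            indexById(duplicated);
        } catch (IllegalStateException e) {
            System.out.println("duplicate key rejected");
        }

        System.out.println(indexByIdKeepFirst(duplicated).get(1).getName()); // prints a
    }
}
```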