前言
上一篇文章Spring Data JPA 百万级数据量动态分页查询的优化中,提到了在JPA动态分页中,如何去掉总量统计来解决分页查询慢的问题,该方案虽然看起来无懈可击,但是有一个缺陷:
- 无法根据当前接口的返回数据判断是否有下一页
虽然在一些不需要页码导航条的场景中,这个缺陷也算不上缺陷,用户只需要滚动翻页就好了,若当前页没有数据则则停止滚动,提示无更多数据。
但是对于那些需要页码导航栏的场景来说,这个问题是致命的,因为前端拿到的接口数据中仅包含当前页的数据,它缺失了总页数和总量,所以前端不知道是否还有下一页,页码导航栏就只能显示到当前页码。这样造成的问题就是:
- 假设按钮【下一页】始终可以点击:
- 初次查询时,页码显示 1
- 点击下一页时,页码显示 1 2
- 继续点击下一页时,页码显示1 2 3
- …
- 直到最后一页,页码显示 1 2 3…n
- 假设按钮【下一页】要根据最大页码和当前页码来比较从而判断是否可以点击:
- 初次查询时,页码显示 1
- 永远无法点击下一页
看出来没,在第一个场景中,页码永远是变化的,用户每点击一次下一页,页码导航条都在增长。在第二个场景中,永远只能查到第一页数据。
本文的目标就是解决这个问题,本文的部分代码建立在上一篇文章的基础上,如果你还没看过建议看一下。
百度搜索的分页特征
百度是我们最常用的网站了吧,每次从百度搜索东西,如果留意的话,你应该会注意到,它的分页导航条有个特点:
- 没有总页数,没有搜索结果总数,但是页码是可变的
以下是每次点击一个页数,分页导航条的变换特征:
…
…
可以看出来,从第1页~第6页,每次点击一个页码,分页导航条都不会变,末页页码始终为10,从第7页开始,每点击下一个页码,末页的页码都会+1。点亮的页码最终会停留在第6个页码的位置上。
动态分页导航栏的实现
那么如果要实现一个如同百度动态页码导航栏的分页接口,该如何实现呢?
- 多查询若干页数据,返回一个动态的总量
大体思想就是,每次分页查询时对输入的页码参数做一个累加,看看后面的某一页是否有数据,如果有,那么按照后面这一页计算总量和总页数,虽然这里的总量并不是真实的数据总量,但是这个总量对应的页区间的页码都是可以返回数据的,随着用户逐渐往后翻页,总页数和总量都会逐渐增长,直到返回末页。通过这个方案实现一个动态翻页的效果。
基本原理正如下面流程图所示:
基于上述流程图的代码实现:
@Component
public class PageTotalCounter {
private final CriteriaNoCountDao criteriaNoCountDao;
public PageTotalCounter(CriteriaNoCountDao criteriaNoCountDao) {
this.criteriaNoCountDao = criteriaNoCountDao;
}
/**
* Find a suitable total for method {@link CriteriaNoCountDao#findAll(Specification, Pageable, Class)}
*/
public <T> long getTotal(Pageable sourceRequest, Specification<T> specification, Slice<T> sourcePage, Integer extensionSize,
Class<T> domainClass) {
this.checkParameter(sourceRequest, specification, sourcePage, extensionSize);
long total = sourceRequest.getPageSize() * (sourceRequest.getPageNumber() + 1);
if (sourcePage.getNumberOfElements() == sourceRequest.getPageSize()) {
Pageable totalRequest = PageRequest.of(sourceRequest.getPageNumber() + extensionSize, sourceRequest.getPageSize());
total = this.calculateTotal(totalRequest, sourceRequest.getPageNumber(), sourcePage.getNumberOfElements(), specification, domainClass);
} else if (sourcePage.getNumberOfElements() > 0 && sourcePage.getNumberOfElements() < sourceRequest.getPageSize()) {
total = sourceRequest.getPageSize() * (sourceRequest.getPageNumber() - 1) + sourcePage.getNumberOfElements();
} else if (sourcePage.getNumberOfElements() == 0) {
total = sourceRequest.getPageSize() * (sourceRequest.getPageNumber() - 1);
}
return total;
}
private <T> void checkParameter(Pageable sourceRequest, Specification<T> specification, Slice<T> sourcePage,
Integer extensionSize) {
Assert.notNull(sourceRequest, "sourceRequest must not be null");
Assert.notNull(specification, "specification must not be null");
Assert.notNull(sourcePage, "extensionSize must not be null");
if (extensionSize == null || extensionSize <= 0) {
throw new IllegalArgumentException("extensionSize must have a positive value");
}
}
private <T> long calculateTotal(Pageable totalRequest, int sourcePageNo, int numberOfElements, Specification<T> specification,
Class<T> domainClass) {
Slice<T> totalPage = criteriaNoCountDao.findAll(specification, totalRequest, domainClass);
if (totalPage.getNumberOfElements() == totalRequest.getPageSize()) {
return totalRequest.getPageSize() * totalRequest.getPageNumber();
} else if (totalPage.getNumberOfElements() > 0
&& totalPage.getNumberOfElements() < totalRequest.getPageSize()) {
return totalRequest.getPageSize() * totalRequest.getPageNumber() + totalPage.getNumberOfElements();
} else if (totalPage.getNumberOfElements() == 0) {
if (totalPage.getNumber() - 1 == sourcePageNo) {
return numberOfElements;
}
totalRequest = PageRequest.of(totalPage.getNumber() - 1, totalPage.getSize());
return calculateTotal(totalRequest, sourcePageNo, numberOfElements, specification, domainClass);
}
return 0;
}
}
方案好不要用,要经过验证才知道。
实战检验
我们在controller中增加两个接口的访问入口:
@GetMapping("slice")
public Page<Authors> findSlice(Integer pageNo, Integer pageSize) {
AuthorsRequest queryRequest = new AuthorsRequest();
queryRequest.setPageNo(pageNo);
queryRequest.setPageSize(pageSize);
return authorsQueryService.dynamicSliceQuery(queryRequest);
}
@GetMapping("page")
public Page<Authors> findPage(Integer pageNo, Integer pageSize) {
AuthorsRequest queryRequest = new AuthorsRequest();
queryRequest.setPageNo(pageNo);
queryRequest.setPageSize(pageSize);
return authorsQueryService.dynamicQuery(queryRequest);
}
这两个方法,findSlice
将返回一个无总量分页的Page对象,findPage
将返回一个动态总量分页对象。
对应的Service层代码如下:
public Page<Authors> dynamicSliceQuery(AuthorsRequest authorsRequest) {
Pageable pageable = PageRequest.of(authorsRequest.getPageNo(), authorsRequest.getPageSize());
Specification<Authors> specification = this.specification4Query(authorsRequest);
return criteriaNoCountDao.findAll(specification, pageable, Authors.class);
}
public Page<Authors> dynamicQuery(AuthorsRequest authorsRequest) {
Pageable pageable = PageRequest.of(authorsRequest.getPageNo(), authorsRequest.getPageSize());
Specification<Authors> specification = this.specification4Query(authorsRequest);
Slice<Authors> slice = criteriaNoCountDao.findAll(specification, pageable, Authors.class);
long total = totalCounter.getTotal(pageable, specification, slice, 5, Authors.class);
return new PageImpl<>(slice.getContent(), pageable, total);
}
private Specification<Authors> specification4Query(AuthorsRequest authorsRequest) {
return (root, query, criteriaBuilder) -> {
List<Predicate> list = new ArrayList<>();
if (authorsRequest.getFirstName() != null && !authorsRequest.getFirstName().trim().isEmpty()) {
list.add(criteriaBuilder.like(root.get("firstName").as(String.class),
authorsRequest.getFirstName() + "%"));
}
if (authorsRequest.getLastName() != null && !authorsRequest.getLastName().trim().isEmpty()) {
list.add(criteriaBuilder.equal(root.get("lastName").as(String.class), authorsRequest.getLastName()));
}
if (authorsRequest.getEmail() != null && !authorsRequest.getEmail().trim().isEmpty()) {
list.add(criteriaBuilder.equal(root.get("email").as(String.class), authorsRequest.getEmail()));
}
Predicate[] p = new Predicate[list.size()];
return criteriaBuilder.and(list.toArray(p));
};
}
这两个方法共同点都是先通过CriteriaNoCountDao#findAll
查询一个无总量分页对象,二者唯一区别就是后者dynamicQuery
中多了一次总量计算的过程,其中参数5
是自定义的,它用来对分页导航条做拓展,数字越大,分页导航条越长。
启动SpringBoot项目,分别访问slice
和page
接口,二者返回结果如下表格中的图片所示:
http://localhost:8080/authors/slice?pageNo=0&pageSize=10 | http://localhost:8080/authors/page?pageNo=0&pageSize=10 |
---|---|
http://localhost:8080/authors/slice?pageNo=1&pageSize=10 | http://localhost:8080/authors/page?pageNo=1&pageSize=10 |
---|---|
显然:
- 从两个表格的第一列可以看出,slice接口的
last
属性始终为true
,意味着当前页是最后一页,且总量totalElements
和总页数totalPages
是完全按照当前页pageNo
作为最后一页而计算的。 - 从两个表格的第二列可以看出,page接口的
last
属性始终为false
,意味着有下一页,且总量totalElements
和总页数totalPages
是按照参数pageNo+5
的作为最后一页计算的。
因此我们这个方案是有效的,能够解决无总量分页无法适配页码导航条的问题。
性能分析
上一篇文章中提到的无总量分页方案,本质是为了解决分页查询时的效率问题,如果本文的总量计算涉及的数据库查询带来了性能损耗,你可以接受吗?
根据之前的流程图,如果当前页数据小于页大小,是不会额外访问数据库的,这种情况下可认为没有性能损耗。
如果当前页数据量等于页大小,那么一定会有性能损耗,那这个损耗有多大呢?
肯定是取决于你传递给PageTotalCounter#getTotal
方法的extensionSize
参数,乐观情况下,若实际的总页数大于extensionSize
则只会产生一次数据库查询,悲观情况下,若实际的总页数小于extensionSize
,则最多会产生extensionSize
次的数据库查询,这些额外的查询中最多有一次查询会命中数据,最多有extensionSize
次查询都不会命中数据,由于不会命中数据的查询其耗时是很低的,所以即使做了多次的数据库访问也不会造成太大的性能损耗,总体是可以接受的
为了论证这个观点,我们做一个简单的测试,当前数据集总量为1000,先将dynamicQuery
方法加入时间统计:
public Page<Authors> dynamicQuery(AuthorsRequest authorsRequest) {
Pageable pageable = PageRequest.of(authorsRequest.getPageNo(), authorsRequest.getPageSize());
Specification<Authors> specification = this.specification4Query(authorsRequest);
long t1 = System.currentTimeMillis();
Slice<Authors> slice = criteriaNoCountDao.findAll(specification, pageable, Authors.class);
long t2 = System.currentTimeMillis();
long total = totalCounter.getTotal(pageable, specification, slice, 5, Authors.class);
long t3 = System.currentTimeMillis();
System.out.println("query cost time : " + (t2-t1));
System.out.println("total cost time : " + (t3-t2));
return new PageImpl<>(slice.getContent(), pageable, total);
}
然后请求 http://localhost:8080/authors/page?pageNo=0&pageSize=1000,由于设置了页大小为1000,所以只会有一页,这种情况下性能是最低的,查看控制台的打印:
Hibernate: select authors0_.id as id1_0_, authors0_.added as added2_0_, authors0_.birthdate as birthdat3_0_, authors0_.email as email4_0_, authors0_.first_name as first_na5_0_, authors0_.last_name as last_nam6_0_ from authors authors0_ where 1=1 limit ?
Hibernate: select authors0_.id as id1_0_, authors0_.added as added2_0_, authors0_.birthdate as birthdat3_0_, authors0_.email as email4_0_, authors0_.first_name as first_na5_0_, authors0_.last_name as last_nam6_0_ from authors authors0_ where 1=1 limit ?, ?
Hibernate: select authors0_.id as id1_0_, authors0_.added as added2_0_, authors0_.birthdate as birthdat3_0_, authors0_.email as email4_0_, authors0_.first_name as first_na5_0_, authors0_.last_name as last_nam6_0_ from authors authors0_ where 1=1 limit ?, ?
Hibernate: select authors0_.id as id1_0_, authors0_.added as added2_0_, authors0_.birthdate as birthdat3_0_, authors0_.email as email4_0_, authors0_.first_name as first_na5_0_, authors0_.last_name as last_nam6_0_ from authors authors0_ where 1=1 limit ?, ?
Hibernate: select authors0_.id as id1_0_, authors0_.added as added2_0_, authors0_.birthdate as birthdat3_0_, authors0_.email as email4_0_, authors0_.first_name as first_na5_0_, authors0_.last_name as last_nam6_0_ from authors authors0_ where 1=1 limit ?, ?
Hibernate: select authors0_.id as id1_0_, authors0_.added as added2_0_, authors0_.birthdate as birthdat3_0_, authors0_.email as email4_0_, authors0_.first_name as first_na5_0_, authors0_.last_name as last_nam6_0_ from authors authors0_ where 1=1 limit ?, ?
query cost time : 216
total cost time : 172
请求更换为:http://localhost:8080/authors/page?pageNo=99&pageSize=10,页大小为10,查询第100页,所以页只会有一页,此时性能也最低,查看控制台的打印:
Hibernate: select authors0_.id as id1_0_, authors0_.added as added2_0_, authors0_.birthdate as birthdat3_0_, authors0_.email as email4_0_, authors0_.first_name as first_na5_0_, authors0_.last_name as last_nam6_0_ from authors authors0_ where 1=1 limit ?, ?
Hibernate: select authors0_.id as id1_0_, authors0_.added as added2_0_, authors0_.birthdate as birthdat3_0_, authors0_.email as email4_0_, authors0_.first_name as first_na5_0_, authors0_.last_name as last_nam6_0_ from authors authors0_ where 1=1 limit ?, ?
Hibernate: select authors0_.id as id1_0_, authors0_.added as added2_0_, authors0_.birthdate as birthdat3_0_, authors0_.email as email4_0_, authors0_.first_name as first_na5_0_, authors0_.last_name as last_nam6_0_ from authors authors0_ where 1=1 limit ?, ?
Hibernate: select authors0_.id as id1_0_, authors0_.added as added2_0_, authors0_.birthdate as birthdat3_0_, authors0_.email as email4_0_, authors0_.first_name as first_na5_0_, authors0_.last_name as last_nam6_0_ from authors authors0_ where 1=1 limit ?, ?
Hibernate: select authors0_.id as id1_0_, authors0_.added as added2_0_, authors0_.birthdate as birthdat3_0_, authors0_.email as email4_0_, authors0_.first_name as first_na5_0_, authors0_.last_name as last_nam6_0_ from authors authors0_ where 1=1 limit ?, ?
Hibernate: select authors0_.id as id1_0_, authors0_.added as added2_0_, authors0_.birthdate as birthdat3_0_, authors0_.email as email4_0_, authors0_.first_name as first_na5_0_, authors0_.last_name as last_nam6_0_ from authors authors0_ where 1=1 limit ?, ?
query cost time : 196
total cost time : 159
可以看出统计总量这段逻辑,即使走了5次数据库访问,其总体耗时还是低于第一次分页查询的耗时。这里尝试实验的数据量比较小,如果上了百万级,性能损耗会增加,真到了那个量,不建议用关系型数据库做分页搜索,要速度还是得上ES。