java distinct 用法_java-使用JPA和Hibernate时DISTINCT如何工作

正如我在本文中所解释的,根据底层的JPQL或Criteria API查询类型,JPA中的DISTINCT具有两种含义。

标量查询

对于返回标量投影的标量查询,例如以下查询:

List publicationYears = entityManager

.createQuery(

"select distinct year(p.createdOn) " +

"from Post p " +

"order by year(p.createdOn)", Integer.class)

.getResultList();

LOGGER.info("Publication years: {}", publicationYears);

应该将DISTINCT关键字传递给基础SQL语句,因为我们希望DB引擎在返回结果集之前过滤重复项:

SELECT DISTINCT

extract(YEAR FROM p.created_on) AS col_0_0_

FROM

post p

ORDER BY

extract(YEAR FROM p.created_on)

-- Publication years: [2016, 2018]

实体查询

对于实体查询,DISTINCT具有不同的含义。

在不使用DISTINCT的情况下,进行如下查询:

List posts = entityManager

.createQuery(

"select p " +

"from Post p " +

"left join fetch p.comments " +

"where p.title = :title", Post.class)

.setParameter(

"title",

"High-Performance Java Persistence eBook has been released!"

)

.getResultList();

LOGGER.info(

"Fetched the following Post entity identifiers: {}",

posts.stream().map(Post::getId).collect(Collectors.toList())

);

将要加入DISTINCT和Post表格,如下所示:

SELECT p.id AS id1_0_0_,

pc.id AS id1_1_1_,

p.created_on AS created_2_0_0_,

p.title AS title3_0_0_,

pc.post_id AS post_id3_1_1_,

pc.review AS review2_1_1_,

pc.post_id AS post_id3_1_0__

FROM post p

LEFT OUTER JOIN

post_comment pc ON p.id=pc.post_id

WHERE

p.title='High-Performance Java Persistence eBook has been released!'

-- Fetched the following Post entity identifiers: [1, 1]

但是父级DISTINCT记录在结果集中为每个关联的Post行重复。 因此,Post实体的List将包含重复的Post实体引用。

为了消除DISTINCT实体引用,我们需要使用Post:

List posts = entityManager

.createQuery(

"select distinct p " +

"from Post p " +

"left join fetch p.comments " +

"where p.title = :title", Post.class)

.setParameter(

"title",

"High-Performance Java Persistence eBook has been released!"

)

.getResultList();

LOGGER.info(

"Fetched the following Post entity identifiers: {}",

posts.stream().map(Post::getId).collect(Collectors.toList())

);

但是然后DISTINCT也传递给SQL查询,这是完全不希望的:

SELECT DISTINCT

p.id AS id1_0_0_,

pc.id AS id1_1_1_,

p.created_on AS created_2_0_0_,

p.title AS title3_0_0_,

pc.post_id AS post_id3_1_1_,

pc.review AS review2_1_1_,

pc.post_id AS post_id3_1_0__

FROM post p

LEFT OUTER JOIN

post_comment pc ON p.id=pc.post_id

WHERE

p.title='High-Performance Java Persistence eBook has been released!'

-- Fetched the following Post entity identifiers: [1]

通过将DISTINCT传递给SQL查询,EXECUTION PLAN将执行一个额外的Sort阶段,这会增加开销而不会带来任何值,因为父子组合始终会由于子PK列而返回唯一记录:

Unique (cost=23.71..23.72 rows=1 width=1068) (actual time=0.131..0.132 rows=2 loops=1)

-> Sort (cost=23.71..23.71 rows=1 width=1068) (actual time=0.131..0.131 rows=2 loops=1)

Sort Key: p.id, pc.id, p.created_on, pc.post_id, pc.review

Sort Method: quicksort Memory: 25kB

-> Hash Right Join (cost=11.76..23.70 rows=1 width=1068) (actual time=0.054..0.058 rows=2 loops=1)

Hash Cond: (pc.post_id = p.id)

-> Seq Scan on post_comment pc (cost=0.00..11.40 rows=140 width=532) (actual time=0.010..0.010 rows=2 loops=1)

-> Hash (cost=11.75..11.75 rows=1 width=528) (actual time=0.027..0.027 rows=1 loops=1)

Buckets: 1024 Batches: 1 Memory Usage: 9kB

-> Seq Scan on post p (cost=0.00..11.75 rows=1 width=528) (actual time=0.017..0.018 rows=1 loops=1)

Filter: ((title)::text = 'High-Performance Java Persistence eBook has been released!'::text)

Rows Removed by Filter: 3

Planning time: 0.227 ms

Execution time: 0.179 ms

具有HINT_PASS_DISTINCT_THROUGH的实体查询

为了从执行计划中消除排序阶段,我们需要使用DISTINCT JPA查询提示:

List posts = entityManager

.createQuery(

"select distinct p " +

"from Post p " +

"left join fetch p.comments " +

"where p.title = :title", Post.class)

.setParameter(

"title",

"High-Performance Java Persistence eBook has been released!"

)

.setHint(QueryHints.HINT_PASS_DISTINCT_THROUGH, false)

.getResultList();

LOGGER.info(

"Fetched the following Post entity identifiers: {}",

posts.stream().map(Post::getId).collect(Collectors.toList())

);

现在,SQL查询将不包含DISTINCT,但是Post实体引用重复项将被删除:

SELECT

p.id AS id1_0_0_,

pc.id AS id1_1_1_,

p.created_on AS created_2_0_0_,

p.title AS title3_0_0_,

pc.post_id AS post_id3_1_1_,

pc.review AS review2_1_1_,

pc.post_id AS post_id3_1_0__

FROM post p

LEFT OUTER JOIN

post_comment pc ON p.id=pc.post_id

WHERE

p.title='High-Performance Java Persistence eBook has been released!'

-- Fetched the following Post entity identifiers: [1]

执行计划将确认我们这次不再具有额外的排序阶段:

Hash Right Join (cost=11.76..23.70 rows=1 width=1068) (actual time=0.066..0.069 rows=2 loops=1)

Hash Cond: (pc.post_id = p.id)

-> Seq Scan on post_comment pc (cost=0.00..11.40 rows=140 width=532) (actual time=0.011..0.011 rows=2 loops=1)

-> Hash (cost=11.75..11.75 rows=1 width=528) (actual time=0.041..0.041 rows=1 loops=1)

Buckets: 1024 Batches: 1 Memory Usage: 9kB

-> Seq Scan on post p (cost=0.00..11.75 rows=1 width=528) (actual time=0.036..0.037 rows=1 loops=1)

Filter: ((title)::text = 'High-Performance Java Persistence eBook has been released!'::text)

Rows Removed by Filter: 3

Planning time: 1.184 ms

Execution time: 0.160 ms

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值