Java DISTINCT usage: how DISTINCT works when using JPA and Hibernate

意呆

Article link: https://blog.csdn.net/weixin_33347467/article/details/115020383

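As I explained in this article, DISTINCT has two meanings in JPA, depending on the underlying JPQL or Criteria API query type.

Scalar queries

For scalar queries, which return a scalar projection, such as the following query: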

List<Integer> publicationYears = entityManager
    .createQuery(
        "select distinct year(p.createdOn) " +
        "from Post p " +
        "order by year(p.createdOn)", Integer.class)
    .getResultList();

LOGGER.info("Publication years: {}", publicationYears);

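the DISTINCT keyword should be passed to the underlying SQL statement, because we want the database engine to filter out duplicates before returning the result set: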

SELECT DISTINCT
    extract(YEAR FROM p.created_on) AS col_0_0_
FROM
    post p
ORDER BY
    extract(YEAR FROM p.created_on)

-- Publication years: [2016, 2018]
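To make the scalar case concrete, here is a minimal in-memory sketch of what the SQL above computes: project each creation date to its year, drop duplicates, and sort. It is only an analogy in plain Java, not how the database or Hibernate executes the query; the class name `ScalarDistinctSketch`, the method `distinctYears`, and the sample dates are assumptions made up for illustration.

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of JPA or Hibernate.
final class ScalarDistinctSketch {

    // In-memory analogue of:
    //   select distinct year(p.createdOn) from Post p order by year(p.createdOn)
    static List<Integer> distinctYears(List<LocalDate> createdOnValues) {
        return createdOnValues.stream()
            .map(LocalDate::getYear) // scalar projection: year(p.createdOn)
            .distinct()              // DISTINCT: drop duplicate years
            .sorted()                // ORDER BY the projected year
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Made-up creation dates standing in for the post table rows.
        List<LocalDate> createdOn = Arrays.asList(
            LocalDate.of(2016, 8, 30),
            LocalDate.of(2018, 1, 1),
            LocalDate.of(2016, 11, 2),
            LocalDate.of(2018, 5, 20)
        );
        // prints "Publication years: [2016, 2018]"
        System.out.println("Publication years: " + distinctYears(createdOn));
    }
}
```

In the real query this work is best left to the database, which is why DISTINCT has to reach the SQL statement for scalar projections.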

Entity queries

For entity queries, DISTINCT has a different meaning.

Without DISTINCT, a query like the following one:

List<Post> posts = entityManager
    .createQuery(
        "select p " +
        "from Post p " +
        "left join fetch p.comments " +
        "where p.title = :title", Post.class)
    .setParameter(
        "title",
        "High-Performance Java Persistence eBook has been released!"
    )
    .getResultList();

LOGGER.info(
    "Fetched the following Post entity identifiers: {}",
    posts.stream().map(Post::getId).collect(Collectors.toList())
);

is going to JOIN the post and post_comment tables like this:

SELECT p.id AS id1_0_0_,
       pc.id AS id1_1_1_,
       p.created_on AS created_2_0_0_,
       p.title AS title3_0_0_,
       pc.post_id AS post_id3_1_1_,
       pc.review AS review2_1_1_,
       pc.post_id AS post_id3_1_0__
FROM post p
LEFT OUTER JOIN
     post_comment pc ON p.id=pc.post_id
WHERE
     p.title='High-Performance Java Persistence eBook has been released!'

-- Fetched the following Post entity identifiers: [1, 1]

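But the parent post record is duplicated in the result set for each associated post_comment row. For this reason, the List of Post entities will contain duplicate Post entity references.

To eliminate the duplicate Post entity references, we need to use DISTINCT: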
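The duplication can be pictured with a small plain-Java sketch. It assumes a simplified model in which every joined row is a `{postId, commentId}` pair and a `HashMap` stands in for the persistence context; `JoinFetchDuplicatesSketch`, `mapRows`, and the ids are hypothetical, and NULL comments from the LEFT JOIN are ignored. It is not Hibernate's actual result-set mapping code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of mapping joined rows to parent entities.
final class JoinFetchDuplicatesSketch {

    static final class Post {
        final long id;
        final List<Long> commentIds = new ArrayList<>();
        Post(long id) { this.id = id; }
    }

    // Each row is {postId, commentId}, as produced by the post / post_comment join.
    static List<Post> mapRows(List<long[]> rows) {
        // Stand-in for the persistence context: one Post instance per identifier.
        Map<Long, Post> byId = new HashMap<>();
        List<Post> result = new ArrayList<>();
        for (long[] row : rows) {
            Post post = byId.computeIfAbsent(row[0], Post::new);
            post.commentIds.add(row[1]);
            result.add(post); // one list entry per row, so the parent repeats
        }
        return result;
    }

    public static void main(String[] args) {
        // One post (id 1) with two comments (made-up ids 10 and 11).
        List<Post> posts = mapRows(Arrays.asList(
            new long[] {1L, 10L},
            new long[] {1L, 11L}
        ));
        List<Long> ids = new ArrayList<>();
        for (Post p : posts) {
            ids.add(p.id);
        }
        System.out.println("Fetched ids: " + ids);                              // [1, 1]
        System.out.println("Same reference: " + (posts.get(0) == posts.get(1))); // true
    }
}
```

Both list entries point to the same Post object, so the duplicates are duplicate references rather than duplicate entity copies.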

List<Post> posts = entityManager
    .createQuery(
        "select distinct p " +
        "from Post p " +
        "left join fetch p.comments " +
        "where p.title = :title", Post.class)
    .setParameter(
        "title",
        "High-Performance Java Persistence eBook has been released!"
    )
    .getResultList();

LOGGER.info(
    "Fetched the following Post entity identifiers: {}",
    posts.stream().map(Post::getId).collect(Collectors.toList())
);

But then DISTINCT is also passed to the SQL query, which is not desirable at all:

SELECT DISTINCT
       p.id AS id1_0_0_,
       pc.id AS id1_1_1_,
       p.created_on AS created_2_0_0_,
       p.title AS title3_0_0_,
       pc.post_id AS post_id3_1_1_,
       pc.review AS review2_1_1_,
       pc.post_id AS post_id3_1_0__
FROM post p
LEFT OUTER JOIN
     post_comment pc ON p.id=pc.post_id
WHERE
     p.title='High-Performance Java Persistence eBook has been released!'

-- Fetched the following Post entity identifiers: [1]

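By passing DISTINCT to the SQL query, the execution plan is going to execute an extra Sort phase, which adds overhead without bringing any value, since the parent-child combinations always return unique records because of the child PK column: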

Unique  (cost=23.71..23.72 rows=1 width=1068) (actual time=0.131..0.132 rows=2 loops=1)
  ->  Sort  (cost=23.71..23.71 rows=1 width=1068) (actual time=0.131..0.131 rows=2 loops=1)
        Sort Key: p.id, pc.id, p.created_on, pc.post_id, pc.review
        Sort Method: quicksort  Memory: 25kB
        ->  Hash Right Join  (cost=11.76..23.70 rows=1 width=1068) (actual time=0.054..0.058 rows=2 loops=1)
              Hash Cond: (pc.post_id = p.id)
              ->  Seq Scan on post_comment pc  (cost=0.00..11.40 rows=140 width=532) (actual time=0.010..0.010 rows=2 loops=1)
              ->  Hash  (cost=11.75..11.75 rows=1 width=528) (actual time=0.027..0.027 rows=1 loops=1)
                    Buckets: 1024  Batches: 1  Memory Usage: 9kB
                    ->  Seq Scan on post p  (cost=0.00..11.75 rows=1 width=528) (actual time=0.017..0.018 rows=1 loops=1)
                          Filter: ((title)::text = 'High-Performance Java Persistence eBook has been released!'::text)
                          Rows Removed by Filter: 3
Planning time: 0.227 ms
Execution time: 0.179 ms
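The reasoning that the SQL-level DISTINCT cannot remove anything here can be checked with a tiny plain-Java sketch. It assumes each joined row is reduced to a `(postId, commentId)` pair, with `commentId` playing the role of the child primary key; the class `RowLevelDistinctSketch`, its methods, and the ids are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of row-level versus parent-level uniqueness.
final class RowLevelDistinctSketch {

    // Number of rows left after a row-level DISTINCT over (postId, commentId).
    static long distinctRowCount(List<List<Long>> rows) {
        return rows.stream().distinct().count();
    }

    // Number of distinct parent identifiers present in the joined rows.
    static long distinctParentCount(List<List<Long>> rows) {
        return rows.stream().map(row -> row.get(0)).distinct().count();
    }

    public static void main(String[] args) {
        // Post 1 joined with its two comments (made-up ids 10 and 11).
        List<List<Long>> rows = Arrays.asList(
            Arrays.asList(1L, 10L),
            Arrays.asList(1L, 11L)
        );
        // The child PK differs on every row, so no row is a duplicate.
        System.out.println("Rows after DISTINCT: " + distinctRowCount(rows));  // 2
        System.out.println("Distinct parents: " + distinctParentCount(rows)); // 1
    }
}
```

Row-level DISTINCT still yields two rows, matching the `rows=2` reported by the Unique node in the plan, while only one parent exists. The deduplication that matters therefore has to happen on the entity references, not on the SQL rows.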

Entity queries with HINT_PASS_DISTINCT_THROUGH

To eliminate the Sort phase from the execution plan, we need to use the HINT_PASS_DISTINCT_THROUGH JPA query hint:

List<Post> posts = entityManager
    .createQuery(
        "select distinct p " +
        "from Post p " +
        "left join fetch p.comments " +
        "where p.title = :title", Post.class)
    .setParameter(
        "title",
        "High-Performance Java Persistence eBook has been released!"
    )
    .setHint(QueryHints.HINT_PASS_DISTINCT_THROUGH, false)
    .getResultList();

LOGGER.info(
    "Fetched the following Post entity identifiers: {}",
    posts.stream().map(Post::getId).collect(Collectors.toList())
);

Now the SQL query will not contain DISTINCT, but the duplicate Post entity references will still be removed:

SELECT
       p.id AS id1_0_0_,
       pc.id AS id1_1_1_,
       p.created_on AS created_2_0_0_,
       p.title AS title3_0_0_,
       pc.post_id AS post_id3_1_1_,
       pc.review AS review2_1_1_,
       pc.post_id AS post_id3_1_0__
FROM post p
LEFT OUTER JOIN
     post_comment pc ON p.id=pc.post_id
WHERE
     p.title='High-Performance Java Persistence eBook has been released!'

-- Fetched the following Post entity identifiers: [1]

And the execution plan confirms that this time there is no extra Sort phase:

Hash Right Join  (cost=11.76..23.70 rows=1 width=1068) (actual time=0.066..0.069 rows=2 loops=1)
  Hash Cond: (pc.post_id = p.id)
  ->  Seq Scan on post_comment pc  (cost=0.00..11.40 rows=140 width=532) (actual time=0.011..0.011 rows=2 loops=1)
  ->  Hash  (cost=11.75..11.75 rows=1 width=528) (actual time=0.041..0.041 rows=1 loops=1)
        Buckets: 1024  Batches: 1  Memory Usage: 9kB
        ->  Seq Scan on post p  (cost=0.00..11.75 rows=1 width=528) (actual time=0.036..0.037 rows=1 loops=1)
              Filter: ((title)::text = 'High-Performance Java Persistence eBook has been released!'::text)
              Rows Removed by Filter: 3
Planning time: 1.184 ms
Execution time: 0.160 ms
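With the hint set to false, the duplicate removal happens on the Java side after the rows are read. The following is a rough sketch of that idea only, assuming deduplication by object reference with first-occurrence order preserved; `EntityReferenceDedupSketch` and `distinctByReference` are hypothetical names, and Hibernate's internal implementation is not shown in this article.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not Hibernate's own code.
final class EntityReferenceDedupSketch {

    // Keeps the first occurrence of every reference, comparing with ==, not equals().
    static <T> List<T> distinctByReference(List<T> entities) {
        Set<T> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        List<T> result = new ArrayList<>();
        for (T entity : entities) {
            if (seen.add(entity)) { // add() returns false for an already seen reference
                result.add(entity);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Object post1 = new Object(); // stands in for the Post with id 1
        Object post2 = new Object(); // stands in for some other Post
        List<Object> fetched = Arrays.asList(post1, post1, post2);
        System.out.println(distinctByReference(fetched).size()); // 2
    }
}
```

Comparing by reference is sufficient under the assumption that, within one persistence context, every row for the same identifier resolves to the same entity instance.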
