循序渐进丨MogDB 数据库查询重写规则uniquecheck详解

云和恩墨

于 2024-07-29 17:02:36 发布

阅读量1

点赞数

文章标签：数据库 oracle

原文链接：https://mp.weixin.qq.com/s?__biz=MjM5MzExMTU2OQ==&mid=2650637426&idx=1&sn=af335c3fde34150a6017822ff65c56c9&chksm=bfc0d40d975b27fe53f35a5fedd4bfa37c65ba42924ced6980c3d17f72b8f18b1e4727fe3248&scene=126&sessionid=0

版权

在 MogDB 和 openGauss 中，参数rewrite_rule用于控制查询重写，本文则介绍查询重写规则uniquecheck。uniquecheck表示提升无agg的where子查询。

参考示例

现在有如下例子：

orcl=> explain analyze select count(*)
  from test01 t1
 where t1.data_object_id =
       (select data_object_id
          from test02 t2
         where t1.object_id = t2.object_id);
                                                                            QUERY PLAN                                                                             
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=369500968.48..369500968.49 rows=1 width=20) (actual time=122494.785..122494.785 rows=1 loops=1)
   ->  Seq Scan on test01 t1  (cost=0.00..369500411.84 rows=222656 width=12) (actual time=0.912..121981.180 rows=4696064 loops=1)
         Filter: (data_object_id = (SubPlan 1))
         Rows Removed by Filter: 39841280
         SubPlan 1
           ->  Index Scan using idx_test02_objectid on test02 t2  (cost=0.00..8.27 rows=1 width=6) (actual time=94019.753..99841.735 rows=44537344 loops=44537344)
                 Index Cond: (t1.object_id = object_id)
 Total runtime: 122494.942 ms
(8 rows)

从执行计划上可以看到：

test01 t1是大表，有4000多万行数据；
子查询test02 t2没有被提升（Oracle的说法是没有被展开），走了filter，被扫描了4000多万次；
整个SQL查询耗时122秒，性能较差。

现在设置rewrite_rule=uniquecheck，我们再来看一下执行计划：

orcl=> explain analyze select /*+ set(rewrite_rule uniquecheck) */  count(*)
  from test01 t1
 where t1.data_object_id =
       (select data_object_id
          from test02 t2
         where t1.object_id = t2.object_id);
                                                                 QUERY PLAN                                                                  
---------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1467229.99..1467230.00 rows=1 width=8) (actual time=10758.577..10758.577 rows=1 loops=1)
   ->  Hash Join  (cost=5660.88..1467225.97 rows=1608 width=0) (actual time=66.574..10424.858 rows=4696064 loops=1)
         Hash Cond: ((t1.object_id = subquery."?column?") AND (t1.data_object_id = subquery.data_object_id))
         ->  Seq Scan on test01 t1  (cost=0.00..1227776.53 rows=44531153 width=12) (actual time=0.003..3561.282 rows=44537344 loops=1)
         ->  Hash  (cost=4356.08..4356.08 rows=86987 width=12) (actual time=65.462..65.462 rows=9172 loops=1)
                Buckets: 131072  Batches: 1  Memory Usage: 1427kB
               ->  Subquery Scan on subquery  (cost=2616.34..4356.08 rows=86987 width=12) (actual time=33.027..54.531 rows=86987 loops=1)
                     ->  HashAggregate  (cost=2616.34..3486.21 rows=86987 width=12) (actual time=33.025..47.518 rows=86987 loops=1)
                           Group By Key: t2.object_id
                           Unique Check Required
                           ->  Seq Scan on test02 t2  (cost=0.00..2398.87 rows=86987 width=12) (actual time=0.005..9.467 rows=86987 loops=1)
 Total runtime: 10758.989 ms
(12 rows)

设置rewrite_rule=uniquecheck之后，子查询被提升了，t1和t2走了hash join，整个SQL耗时10秒，性能较好。

执行计划中Unique Check Required表示检查子查询JOIN列(t2.object_id)是否唯一，如果子查询JOIN列(t2.object_id)不唯一，SQL会报错：

orcl=> insert into test02 select * from test02;
INSERT 0 86987
orcl=> explain analyze select /*+ set(rewrite_rule uniquecheck) */  count(*)
  from test01 t1
 where t1.data_object_id =
       (select data_object_id
          from test02 t2
         where t1.object_id = t2.object_id);
ERROR:  more than one row returned by a subquery used as an expression

如果子查询有agg函数，无需设置查询重写参数，可以自动提升：

orcl=> explain analyze select count(*)
  from test01 t1
 where t1.data_object_id =
       (select max(data_object_id)
          from test02 t2
         where t1.object_id = t2.object_id); 
                                                                   QUERY PLAN                                                                   
------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1470319.09..1470319.10 rows=1 width=8) (actual time=10345.027..10345.027 rows=1 loops=1)
   ->  Hash Join  (cost=8749.93..1470315.02 rows=1628 width=0) (actual time=122.192..10044.758 rows=4696064 loops=1)
         Hash Cond: ((t1.object_id = subquery."?column?") AND (t1.data_object_id = subquery.max))
         ->  Seq Scan on test01 t1  (cost=0.00..1227776.53 rows=44531153 width=12) (actual time=0.007..3708.561 rows=44537344 loops=1)
         ->  Hash  (cost=7428.51..7428.51 rows=88095 width=38) (actual time=120.853..120.853 rows=9172 loops=1)
                Buckets: 131072  Batches: 1  Memory Usage: 1427kB
               ->  Subquery Scan on subquery  (cost=5666.61..7428.51 rows=88095 width=38) (actual time=79.485..110.452 rows=86987 loops=1)
                     ->  HashAggregate  (cost=5666.61..6547.56 rows=88095 width=44) (actual time=79.483..103.440 rows=86987 loops=1)
                           Group By Key: t2.object_id
                           ->  Seq Scan on test02 t2  (cost=0.00..4796.74 rows=173974 width=12) (actual time=0.007..20.196 rows=173974 loops=1)
 Total runtime: 10345.491 ms
(11 rows)

关于作者

罗炳森，云和恩墨数据库架构师，15年SQL优化&数据库性能优化经验，擅长超大型，超复杂SQL&存储过程优化；国内首位专注于SQL优化实战案例博客作者；5年+数据库/ETL/BI/大数据培训经验，累计为中国培养了1000多名DBA，2000多名ETL/BI/大数据人才；2015年出版《Oracle查询优化改写技巧与案例》一书，累计销量达6万余册；2018年出版《SQL优化核心思想》，被翻译为繁体版在台湾发行。

数据驱动，成就未来，云和恩墨，不负所托！

云和恩墨创立于2011年，是业界领先的“智能的数据技术提供商”。公司总部位于北京，在国内外35个地区设有本地办公室并开展业务。

云和恩墨以“数据驱动，成就未来”为使命，致力于将创新的数据技术产品和解决方案带给全球的企业和组织，帮助客户构建安全、高效、敏捷且经济的数据环境，持续增强客户在数据洞察和决策上的竞争优势，实现数据驱动的业务创新和升级发展。

自成立以来，云和恩墨专注于数据技术领域，根据不断变化的市场需求，创新研发了系列软件产品，涵盖数据库、数据库存储、数据库云管和数据智能分析等领域。这些产品已经在集团型、大中型、高成长型客户以及行业云场景中得到广泛应用，证明了我们的技术和商业竞争力，展现了公司在数据技术端到端解决方案方面的优势。

在云化、数字化和智能化的时代背景下，云和恩墨始终以正和多赢为目标，感恩每一位客户和合作伙伴的信任与支持，“利他先行”，坚持投入于数据技术核心能力，为构建数据驱动的智能未来而不懈努力。

我们期待与您携手，共同探索数据力量，迎接智能未来。

云和恩墨

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
循序渐进丨MogDB 数据库查询重写规则uniquecheck详解

在 MogDB 和 openGauss 中，参数rewrite_rule用于控制查询重写，本文则介绍查询重写规则uniquecheck。uniquecheck表示提升无agg的where子查询。参考示例现在有如下例子：orcl=> explain analyze select count(*) from test01 t1 where t1.data_object_id = ...
复制链接

扫一扫