lightdb22.4-新增优化器提示cardinality 和ordered_predicates

最新推荐文章于 2024-01-05 19:18:07 发布

紫无之紫

最新推荐文章于 2024-01-05 19:18:07 发布

阅读量410

点赞数

分类专栏： lightdb 数据库文章标签：数据库 postgres lightdb hint

本文链接：https://blog.csdn.net/qq_17713935/article/details/127924170

版权

数据库同时被 2 个专栏收录

53 篇文章 0 订阅

订阅专栏

lightdb

51 篇文章 1 订阅

订阅专栏

介绍了 LightDB 22.4 版本中新引入的两个优化器提示功能：cardinality 和 ordered_predicates。前者用于设定表的预估行数以影响查询计划的选择，后者则用于强制保留 SQL 中指定的约束顺序。

摘要由CSDN通过智能技术生成

lightdb 22.4 新增优化器提示cardinality 和ordered_predicates

lightdb 在22.4 新增了如下两个优化器提示：

cardinality
ordered hint

下面对这两个hint 进行介绍

1. cardinality hint

cardinality hint 用于强制设置表或多表查询结果的预估行数，从而影响优化器选择执行路径，hint的效果体现在对explain 中rows值的印象。

具体请看如下案例：

1.0 前置准备

lightdb@postgres=# create table t1 (key1 int , key2 int);
CREATE TABLE
lightdb@postgres=# create table t2 (key1 int , key2 int);
CREATE TABLE

1.1 影响join算法选择

通过使用cardinality设置t1表预估行数为10，导致走了hashjoin：


lightdb@postgres=# explain select * from t1, t2 where t1.key1 = t2.key1;
                            QUERY PLAN                            
------------------------------------------------------------------
 Merge Join  (cost=317.01..711.38 rows=25538 width=16)
   Merge Cond: (t1.key1 = t2.key1)
   ->  Sort  (cost=158.51..164.16 rows=2260 width=8)
         Sort Key: t1.key1
         ->  Seq Scan on t1  (cost=0.00..32.60 rows=2260 width=8)
   ->  Sort  (cost=158.51..164.16 rows=2260 width=8)
         Sort Key: t2.key1
         ->  Seq Scan on t2  (cost=0.00..32.60 rows=2260 width=8)
(8 rows)

lightdb@postgres=# explain select /*+cardinality(t1 10)*/* from t1, t2 where t1.key1 = t2.key1;
                               QUERY PLAN                               
------------------------------------------------------------------------
 Hash Join  (cost=32.73..74.93 rows=113 width=16)
   Hash Cond: (t2.key1 = t1.key1)
   ->  Seq Scan on t2 @"lt#0"  (cost=0.00..32.60 rows=2260 width=8)
   ->  Hash  (cost=32.60..32.60 rows=10 width=8)
         ->  Seq Scan on t1 @"lt#0"  (cost=0.00..32.60 rows=10 width=8)
(5 rows)

lightdb@postgres=#

1.2 影响内外表选择

对应hashjoin，可以通过把rows 设小，可以强制把设置的表作为内表构建hash表。

lightdb@postgres=# explain select /*+cardinality(t1 20) cardinality(t2 20)*/* from t1, t2 where t1.key1 = t2.key1;
                               QUERY PLAN                               
------------------------------------------------------------------------
 Hash Join  (cost=32.85..65.57 rows=2 width=16)
   Hash Cond: (t1.key1 = t2.key1)
   ->  Seq Scan on t1 @"lt#0"  (cost=0.00..32.60 rows=20 width=8)
   ->  Hash  (cost=32.60..32.60 rows=20 width=8)
         ->  Seq Scan on t2 @"lt#0"  (cost=0.00..32.60 rows=20 width=8)
(5 rows)

lightdb@postgres=# explain select /*+cardinality(t1 10) cardinality(t2 20)*/* from t1, t2 where t1.key1 = t2.key1;
                               QUERY PLAN                               
------------------------------------------------------------------------
 Hash Join  (cost=32.73..65.41 rows=1 width=16)
   Hash Cond: (t2.key1 = t1.key1)
   ->  Seq Scan on t2 @"lt#0"  (cost=0.00..32.60 rows=20 width=8)
   ->  Hash  (cost=32.60..32.60 rows=10 width=8)
         ->  Seq Scan on t1 @"lt#0"  (cost=0.00..32.60 rows=10 width=8)
(5 rows)

lightdb@postgres=# explain select /*+cardinality(t1 20) cardinality(t2 10)*/* from t1, t2 where t1.key1 = t2.key1;
                               QUERY PLAN                               
------------------------------------------------------------------------
 Hash Join  (cost=32.73..65.41 rows=1 width=16)
   Hash Cond: (t1.key1 = t2.key1)
   ->  Seq Scan on t1 @"lt#0"  (cost=0.00..32.60 rows=20 width=8)
   ->  Hash  (cost=32.60..32.60 rows=10 width=8)
         ->  Seq Scan on t2 @"lt#0"  (cost=0.00..32.60 rows=10 width=8)
(5 rows)

lightdb@postgres=#

1.3 影响子查询

lightdb@postgres=# explain select t1.key1 from t1, (select distinct * from t2 ) x where t1.key1 = x.key1 and x.key2>10;
                                 QUERY PLAN                                  
-----------------------------------------------------------------------------
 Hash Join  (cost=49.23..174.72 rows=2509 width=4)
   Hash Cond: (t1.key1 = x.key1)
   ->  Seq Scan on t1  (cost=0.00..32.60 rows=2260 width=4)
   ->  Hash  (cost=46.45..46.45 rows=222 width=4)
         ->  Subquery Scan on x  (cost=42.02..46.45 rows=222 width=4)
               ->  HashAggregate  (cost=42.02..44.23 rows=222 width=8)
                     Group Key: t2.key1, t2.key2
                     ->  Seq Scan on t2  (cost=0.00..38.25 rows=753 width=8)
                           Filter: (key2 > 10)
(9 rows)

lightdb@postgres=# explain select /*+cardinality(x 100)*/t1.key1 from t1, (select distinct * from t2 ) x where t1.key1 = x.key1 and x.key2>10;
                                     QUERY PLAN                              
        
-----------------------------------------------------------------------------
--------
 Hash Join  (cost=47.70..125.50 rows=1130 width=4)
   Hash Cond: (t1.key1 = x.key1)
   ->  Seq Scan on t1 @"lt#0"  (cost=0.00..32.60 rows=2260 width=4)
   ->  Hash  (cost=46.45..46.45 rows=100 width=4)
         ->  Subquery Scan on x @"lt#0"  (cost=42.02..46.45 rows=100 width=4)
               ->  HashAggregate  (cost=42.02..44.23 rows=222 width=8)
                     Group Key: t2.key1, t2.key2
                     ->  Seq Scan on t2 @"lt#1"  (cost=0.00..38.25 rows=753 w
idth=8)
                           Filter: (key2 > 10)
(9 rows)

lightdb@postgres=#

1.4 影响多表

lightdb@postgres=# explain select * from t1 ,t2 ,t1 a where t1.key1=t2.key1 and t2.key1=a.key1;
                                  QUERY PLAN                                 
 
-----------------------------------------------------------------------------
-
 Merge Join  (cost=475.52..5209.87 rows=288579 width=24)
   Merge Cond: (a.key1 = t1.key1)
   ->  Sort  (cost=158.51..164.16 rows=2260 width=8)
         Sort Key: a.key1
         ->  Seq Scan on t1 a  (cost=0.00..32.60 rows=2260 width=8)
   ->  Materialize  (cost=317.01..775.23 rows=25538 width=16)
         ->  Merge Join  (cost=317.01..711.38 rows=25538 width=16)
               Merge Cond: (t1.key1 = t2.key1)
               ->  Sort  (cost=158.51..164.16 rows=2260 width=8)
                     Sort Key: t1.key1
                     ->  Seq Scan on t1  (cost=0.00..32.60 rows=2260 width=8)
               ->  Sort  (cost=158.51..164.16 rows=2260 width=8)
                     Sort Key: t2.key1
                     ->  Seq Scan on t2  (cost=0.00..32.60 rows=2260 width=8)
(14 rows)

lightdb@postgres=# explain select /*+cardinality(t1 t2 100)*/* from t1 ,t2 ,t1 a where t1.key1=t2.key1 and t2.key1=a.key1;
                                      QUERY PLAN                             
         
-----------------------------------------------------------------------------
---------
 Hash Join  (cost=712.63..790.43 rows=1130 width=24)
   Hash Cond: (a.key1 = t1.key1)
   ->  Seq Scan on t1 a @"lt#0"  (cost=0.00..32.60 rows=2260 width=8)
   ->  Hash  (cost=711.38..711.38 rows=100 width=16)
         ->  Merge Join  (cost=317.01..711.38 rows=100 width=16)
               Merge Cond: (t1.key1 = t2.key1)
               ->  Sort  (cost=158.51..164.16 rows=2260 width=8)
                     Sort Key: t1.key1
                     ->  Seq Scan on t1 @"lt#0"  (cost=0.00..32.60 rows=2260 
width=8)
               ->  Sort  (cost=158.51..164.16 rows=2260 width=8)
                     Sort Key: t2.key1
                     ->  Seq Scan on t2 @"lt#0"  (cost=0.00..32.60 rows=2260 
width=8)
(12 rows)

lightdb@postgres=#

note

cardinality hint是rows hint 的别名，在22.4 我们对rows hint 进行了扩展，支持对单表进行设置。

2. ordered_predicates hint

ordered_predicates 用于强制优化器保留约束的顺序，对索引条件无效, 不考虑连接条件。目前具有如下限制：

对于涉及等价推导的约束不起效，包括实际参与推导，和推导参数的约束。

具体案例如下：

2.0 前置准备

chuhx@postgres=# create table test1 (key1 int, key2 int, key3 int);
CREATE TABLE
chuhx@postgres=# create table test2 (key1 int, key2 int, key3 int);
CREATE TABLE
chuhx@postgres=#

2.1 对于where 或on或having后约束不组合

where

lightdb@postgres=# EXPLAIN (COSTS false) select * from test1 where mod(key2, 10) = 2 and key1 = 1;
                   QUERY PLAN                   
------------------------------------------------
 Seq Scan on test1
   Filter: ((key1 = 1) AND (mod(key2, 10) = 2))
(2 rows)

lightdb@postgres=# EXPLAIN (COSTS false) select/*+ordered_predicates*/ * from test1 where mod(key2, 10) = 2 and key1 = 1;  
                   QUERY PLAN                   
------------------------------------------------
 Seq Scan on test1 @"lt#0"
   Filter: ((mod(key2, 10) = 2) AND (key1 = 1))
(2 rows)

lightdb@postgres=# EXPLAIN (COSTS false) select * from test1 where mod(key2, 10) = 2 and key1 > 1;
                   QUERY PLAN                   
------------------------------------------------
 Seq Scan on test1
   Filter: ((key1 > 1) AND (mod(key2, 10) = 2))
(2 rows)

lightdb@postgres=# EXPLAIN (COSTS false) select/*+ordered_predicates*/ * from test1 where mod(key2, 10) = 2 and key1 > 1;
                   QUERY PLAN                   
------------------------------------------------
 Seq Scan on test1 @"lt#0"
   Filter: ((mod(key2, 10) = 2) AND (key1 > 1))
(2 rows)

on

lightdb@postgres=# EXPLAIN (COSTS false) select * from test1 a join test2 b on a.key1=b.key1 and mod(a.key2, 10) = 2 and  a.key3 = 10; 
                         QUERY PLAN                          
-------------------------------------------------------------
 Hash Join
   Hash Cond: (b.key1 = a.key1)
   ->  Seq Scan on test2 b
   ->  Hash
         ->  Seq Scan on test1 a
               Filter: ((key3 = 10) AND (mod(key2, 10) = 2))
(6 rows)

lightdb@postgres=# EXPLAIN (COSTS false) select/*+ordered_predicates*/ * from test1 a join test2 b on a.key1=b.key1 and mod(a.key2, 10) = 2 and  a.key3 = 10; 
                         QUERY PLAN                          
-------------------------------------------------------------
 Hash Join
   Hash Cond: (b.key1 = a.key1)
   ->  Seq Scan on test2 b @"lt#0"
   ->  Hash
         ->  Seq Scan on test1 a @"lt#0"
               Filter: ((mod(key2, 10) = 2) AND (key3 = 10))
(6 rows)

having

lightdb@postgres=# EXPLAIN (COSTS false) select key1 from test1 group by key1, key2 having mod(key2, 10) = 2 and key1 = 1;
                         QUERY PLAN                         
------------------------------------------------------------
 Group
   Group Key: key1, key2
   ->  Sort
         Sort Key: key2
         ->  Seq Scan on test1
               Filter: ((key1 = 1) AND (mod(key2, 10) = 2))
(6 rows)

lightdb@postgres=# EXPLAIN (COSTS false) select/*+ordered_predicates*/ key1 from test1 group by key1, key2 having mod(key2, 10) = 2 and key1 = 1;
                         QUERY PLAN                         
------------------------------------------------------------
 Group
   Group Key: key1, key2
   ->  Sort
         Sort Key: key2
         ->  Seq Scan on test1 @"lt#0"
               Filter: ((mod(key2, 10) = 2) AND (key1 = 1))
(6 rows)

2.2 on 和where 结合（无等价推导， having 类同where，不再举例）

lightdb@postgres=# EXPLAIN (COSTS false) select * from test1 a join test2 b on a.key1=b.key1 and mod(a.key2, 10) = 2 where mod(b.key2, 10) = 2 and a.key3 = 10; 
                      QUERY PLAN                       
-------------------------------------------------------
 Nested Loop
   Join Filter: (a.key1 = b.key1)
   ->  Seq Scan on test1 a
         Filter: ((key3 = 10) AND (mod(key2, 10) = 2))
   ->  Seq Scan on test2 b
         Filter: (mod(key2, 10) = 2)
(6 rows)

lightdb@postgres=# EXPLAIN (COSTS false) select/*+ordered_predicates*/ * from test1 a join test2 b on a.key1=b.key1 and mod(a.key2, 10) = 2 where mod(b.key2, 10) = 2 and a.key3 = 10; 
                      QUERY PLAN                       
-------------------------------------------------------
 Nested Loop
   Join Filter: (a.key1 = b.key1)
   ->  Seq Scan on test1 a @"lt#0"
         Filter: ((mod(key2, 10) = 2) AND (key3 = 10))
   ->  Seq Scan on test2 b @"lt#0"
         Filter: (mod(key2, 10) = 2)
(6 rows)

2.3 不起效情况

有等价类推导a.key2=b.key2 && b.key2 = 1 导致 b 表上的 ((key2 = 1) AND (mod(key2, 10) = 2)) 不能保持sql 中定义顺序。

lightdb@postgres=# EXPLAIN (COSTS false) select * from test1 a left join test2 b on a.key1=b.key1 and a.key2=b.key2 where mod(b.key2, 10) = 2 and b.key2 = 1; 
e                      QUERY PLAN                      
y2 where mod(b.key2, 10) = 2 and b.key2 = 1; 
------------------------------------------------------
 Nested Loop
   Join Filter: (a.key1 = b.key1)
   ->  Seq Scan on test2 b
         Filter: ((key2 = 1) AND (mod(key2, 10) = 2))
   ->  Seq Scan on test1 a
         Filter: (key2 = 1)
(6 rows)

lightdb@postgres=# EXPLAIN (COSTS false) select/*+ordered_predicates*/ * from test1 a left join test2 b on a.key1=b.key1 and a.key2=b.key2 where mod(b.key2, 10) = 2 and b.key2 = 1; 
                      QUERY PLAN                      
------------------------------------------------------
 Nested Loop
   Join Filter: (a.key1 = b.key1)
   ->  Seq Scan on test2 b @"lt#0"
         Filter: ((key2 = 1) AND (mod(key2, 10) = 2))
   ->  Seq Scan on test1 a @"lt#0"
         Filter: (key2 = 1)
(6 rows)

有等价类推导a.key2=b.key2和a.key2 = 1 导致 a 表上((key2 = 1) AND (mod(key2, 10) = 2)) 不能保留sql中顺序。

这边单独说明是因为在代码内部与上述不起效的原理不同（由于外连接）。

lightdb@postgres=# EXPLAIN (COSTS false) select * from test1 a left join test2 b on a.key1=b.key1 and a.key2=b.key2 where mod(a.key2, 10) = 2 and a.key2 = 1;
y2 where mod(a.key2, 10) = 2 and a.key2 = 1;
                        QUERY PLAN                        
----------------------------------------------------------
 Nested Loop Left Join
   Join Filter: ((a.key2 = b.key2) AND (a.key1 = b.key1))
   ->  Seq Scan on test1 a
         Filter: ((key2 = 1) AND (mod(key2, 10) = 2))
   ->  Seq Scan on test2 b
         Filter: (key2 = 1)
(6 rows)

lightdb@postgres=# EXPLAIN (COSTS false) select /*+ordered_predicates*/* from test1 a left join test2 b on a.key1=b.key1 and a.key2=b.key2 where mod(a.key2, 10) = 2 and a.key2 = 1;
                        QUERY PLAN                        
----------------------------------------------------------
 Nested Loop Left Join
   Join Filter: ((a.key2 = b.key2) AND (a.key1 = b.key1))
   ->  Seq Scan on test1 a @"lt#0"
         Filter: ((key2 = 1) AND (mod(key2, 10) = 2))
   ->  Seq Scan on test2 b @"lt#0"
         Filter: (key2 = 1)
(6 rows)