Oracle-空值写法导致优化器估算不准确案例

前言:

        最近遇到一个Oracle的语句优化案例,nvl函数在对空值的不同写法null和''进行判断时,优化器给出的估算数据会不准确,进而导致生成的执行计划错误。

     

测试表

        测试表test、testf、testf1三张表的数据量分别为500W、600W、600W,表结构相同,具体列如下。

Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 ID                                                 NUMBER(10)
 ID1                                                NUMBER(10)
 NAME                                               VARCHAR2(32)
 NAME1                                              VARCHAR2(32)
 NAME2                                              VARCHAR2(32)

测试案例:

        使用null的空值写法,可以看到test表的估算行E-Rows正确为5250K,执行计划采用正确的hash join进行表关联,TEST大表作为被驱动表,执行结果返回9999。

select count(1)
from test.test a
where nvl(null,a.id)=a.id and a.NAME1='bbbbbbbbbbbbbbbb'
and exists (select 1 from testf b where b.id=a.id and b.name1='c' )
and exists (select 1 from testf1 c where c.id=a.id and c.name1='c' );

        使用''的空值写法,可以看到TEST出现了错误的估算E-ROWS行数为1,这个错误的估算也导致优化器使用了错误的连接方式和驱动表,将TEST这张大表作为了驱动表,并采用NESTED LOOP SEMI的连接方式,执行结果返回9999。

select count(1)
from test.test a
where nvl('',a.id)=a.id and a.NAME1='bbbbbbbbbbbbbbbb'
and exists (select 1 from testf b where b.id=a.id and b.name1='c' )
and exists (select 1 from testf1 c where c.id=a.id and c.name1='c' );

案例分析:

        对比空值不同写法的执行计划,可以看到采用''空值写法nvl执行计划过滤条件为"A"."ID"=TO_NUMBER(TO_CHAR("A"."ID")),null空值写法nvl的执行计划过滤条件为("A"."ID" IS NOT NULL),从过滤的条件可以推测''空值写法,优化器将它视为字符类型,而ID列原来为number类型,因此内部做了一个to_char再to_number的类型转化。

        通过10053进一步分析优化器生成执行计划的过程。

---启用10053事件(级别2)
ALTER SESSION SET EVENTS='10053 trace name context forever, level 2';
---执行sql
xxxxxxxx
---关闭10053事件:
ALTER SESSION SET EVENTS '10053 trace name context off';
---trace文件路径
select tracefile from v$process where addr in (select paddr from v$session where sid in (select sid from v$mystat));

        通过10053可以看到正常的执行计划,估算的返回行数为5250001,由于ID列没有null值,所以估算的返回行数计算主要取决于过滤条件name1,"bbbbbbbbbbbbbbbb"所在的值落在频率直方图桶ENDPOINT_NUMBER:5260002,所以估算的返回行数量为((high_value-low_value)/bucket_total_num)*table rows=(5260002-10001)/5260002*

5260002=5250001行。

        name1列的频率直方图信息,有两个桶ENDPOINT_NUMBER:100001,值b,两个桶ENDPOINT_NUMBER:5260002,值bbbbbbbbbbbbbbbb,桶的行总数5260002。

        再看错误的执行计划,估算的返回行数为1,因为name1列的选择率很高,所以ID列的选择率计算应该很低,才会导致估算的数据返回为1行数据。ID列数据没有倾斜并且NDV唯一值很高,所以并没有收集直方图信息,优化器有可能是直接根据Density(Density=1/NDV)数据的密度去计算返回的行数,当前Density为:0.000000。

        使用相同结构不同数据ID的表进行nvl('',id)=id测试,可以看到依然会出现内部的字段类型转化TO_NUMBER(TO_CHAR("A"."ID"),并且估算的行数不准确,从估算的行数推测计算为table rows*Density=6400124*0.00001=64。

select count(1)
from testf1 a
where  nvl('',a.id)=a.id 

        将表id列转成varchar类型再进行测试,修改后表结构如下。

Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 ID                                                 VARCHAR2(32)
 ID1                                                NUMBER(10)
 NAME                                               VARCHAR2(32)
 NAME1                                              VARCHAR2(32)
 NAME2                                              VARCHAR2(32)

        再重新执行''写法的SQL语句,可以看到这次ID列没有再出现内部类型转化,而是转化为A.ID IS NOT NULL,这与null的写法一致,优化器也可以正确的估算出TEST表的行数。

        总结nvl函数空值写法null和'',如果函数判断的列类型为number,''写法会导致内部出现TO_NUMBER(TO_CHAR())的类型转化,并且优化器的估算行数会出现不准确的问题,进而导致生成的执行计划错误。

问题规避修复:

        1 对于nvl函数空值的写法,采用null写法,不采用''写法。

        2 如果采用''的写法,nvl的函数判断列类型需要为varchar2。

        3 SQL采用绑定变量写法,''值会被直接捕获为null值,避免执行计划估算错误。

variable B1 number;
exec :B1:='';
​
select count(1)
from test.test a
where nvl(:B1,a.id)=a.id and a.NAME1='bbbbbbbbbbbbbbbb'
and exists (select 1 from testf b where b.id=a.id and b.name1='c' )
and exists (select 1 from testf1 c where c.id=a.id and c.name1='c' );
​
​
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  3ux3n7625yghb, child number 0
-------------------------------------
select count(1) from test.test a where nvl(:B1,a.id)=a.id and
a.NAME1='bbbbbbbbbbbbbbbb' and exists (select 1 from testf b where
b.id=a.id and b.name1='c' ) and exists (select 1 from testf1 c where
c.id=a.id and c.name1='c' )
​
Plan hash value: 3351379063
​
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation             | Name   | Starts | E-Rows |E-Bytes|E-Temp | Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |        |      1 |        |       |       | 39840 (100)|          |      1 |00:00:01.91 |     227K|  94378 |       |       |          |
|   1 |  SORT AGGREGATE       |        |      1 |      1 |    67 |       |            |          |      1 |00:00:01.91 |     227K|  94378 |       |       |          |
|*  2 |   HASH JOIN RIGHT SEMI|        |      1 |  20542 |  1344K|  1336K| 39840   (1)| 00:07:59 |   9999 |00:00:01.91 |     227K|  94378 |  3247K|  3056K| 5147K (0)|
|*  3 |    TABLE ACCESS FULL  | TESTF  |      1 |  40184 |   863K|       | 13875   (1)| 00:02:47 |  40184 |00:00:00.14 |   92665 |  50272 |       |       |          |
|*  4 |    HASH JOIN          |        |      1 |  40183 |  1765K|       | 25790   (1)| 00:05:10 |   9999 |00:00:01.77 |     134K|  44106 |  2293K|  2293K| 1939K (0)|
|   5 |     SORT UNIQUE       |        |      1 |  40184 |   863K|       | 13841   (1)| 00:02:47 |   9999 |00:00:00.23 |   50165 |   1625 |   549K|   549K|  487K (0)|
|*  6 |      TABLE ACCESS FULL| TESTF1 |      1 |  40184 |   863K|       | 13841   (1)| 00:02:47 |  40184 |00:00:00.22 |   50165 |   1625 |       |       |          |
|*  7 |     TABLE ACCESS FULL | TEST   |      1 |   5250K|   115M|       | 11664   (1)| 00:02:20 |   5250K|00:00:00.75 |   84778 |  42481 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
​
​
Peeked Binds (identified by position):
--------------------------------------
​
   1 - (NUMBER): (null)  --->''被直接捕获为null值
​
Predicate Information (identified by operation id):
---------------------------------------------------
​
   2 - access("B"."ID"="A"."ID")
   3 - filter("B"."NAME1"='c')
   4 - access("C"."ID"="A"."ID")
   6 - filter("C"."NAME1"='c')
   7 - filter(("A"."NAME1"='bbbbbbbbbbbbbbbb' AND "A"."ID"=NVL(:B1,"A"."ID")))

        4 基数反馈_optimizer_use_feedback功能可以一定程度上,临时规避这个问题。测试''写法语句执行两次,可以看到在执行第二次的时候,TEST表的估算返回被基数反馈修正为5250K,提示cardinality feedback used for this statement。

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  ghm9u7b1my66h, child number 1
-------------------------------------
select count(1) from test.test a where nvl('',a.id)=a.id and
a.NAME1='bbbbbbbbbbbbbbbb' and exists (select 1 from testf b where
b.id=a.id and b.name1='c' ) and exists (select 1 from testf1 c where
c.id=a.id and c.name1='c' )
​
Plan hash value: 3351379063
​
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation             | Name   | Starts | E-Rows |E-Bytes|E-Temp | Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |        |      1 |        |       |       | 39825 (100)|          |      1 |00:00:02.49 |     227K|  92753 |       |       |          |
|   1 |  SORT AGGREGATE       |        |      1 |      1 |    67 |       |            |          |      1 |00:00:02.49 |     227K|  92753 |       |       |          |
|*  2 |   HASH JOIN RIGHT SEMI|        |      1 |  20542 |  1344K|  1336K| 39825   (1)| 00:07:58 |   9999 |00:00:02.49 |     227K|  92753 |  3247K|  3056K| 5117K (0)|
|*  3 |    TABLE ACCESS FULL  | TESTF  |      1 |  40184 |   863K|       | 13875   (1)| 00:02:47 |  40184 |00:00:00.14 |   92665 |  50272 |       |       |          |
|*  4 |    HASH JOIN          |        |      1 |  40184 |  1765K|       | 25776   (1)| 00:05:10 |   9999 |00:00:02.33 |     134K|  42481 |  2293K|  2293K| 1956K (0)|
|   5 |     SORT UNIQUE       |        |      1 |  40184 |   863K|       | 13841   (1)| 00:02:47 |   9999 |00:00:00.16 |   50165 |      0 |   549K|   549K|  487K (0)|
|*  6 |      TABLE ACCESS FULL| TESTF1 |      1 |  40184 |   863K|       | 13841   (1)| 00:02:47 |  40184 |00:00:00.15 |   50165 |      0 |       |       |          |
|*  7 |     TABLE ACCESS FULL | TEST   |      1 |   5250K|   115M|       | 11650   (1)| 00:02:20 |   5250K|00:00:01.35 |   84778 |  42481 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
​
Predicate Information (identified by operation id):
---------------------------------------------------
​
   2 - access("B"."ID"="A"."ID")
   3 - filter("B"."NAME1"='c')
   4 - access("C"."ID"="A"."ID")
   6 - filter("C"."NAME1"='c')
   7 - filter(("A"."ID"=TO_NUMBER(TO_CHAR("A"."ID")) AND "A"."NAME1"='bbbbbbbbbbbbbbbb'))
​
Note
-----
   - cardinality feedback used for this statement   ---基数反馈修正了估算错误。

        5 可能潜在的bug,目前在Oracle MOS上并没有发现有相关Bug的描述,并且在19.16版本上面测试的相同的语句,估算错误的现象依然会出现,因此目前无法证明是Bug导致的问题还是优化器自己的算法所决定。

---19.16版本测试
select count(1) from test.test a where nvl('',a.id)=a.id and
a.NAME1='bbbbbbbbbbbbbbbb' and exists (select 1 from testf b where
b.id=a.id and b.name1='c' ) and exists (select 1 from testf1 c where
c.id=a.id and c.name1='c' )
​
Plan hash value: 4194194725
​
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation            | Name   | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers | Reads  | Writes |  OMem |  1Mem | Used-Mem | Used-Tmp|
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |        |      1 |        |       | 11801 (100)|          |      1 |00:00:04.27 |   92992 |    102K|   9030 |       |       |          |         |
|   1 |  SORT AGGREGATE      |        |      1 |      1 |    67 |            |          |      1 |00:00:04.27 |   92992 |    102K|   9030 |       |       |          |         |
|*  2 |   HASH JOIN SEMI     |        |      1 |      1 |    67 | 11801   (1)| 00:00:01 |   9999 |00:00:04.27 |   92992 |    102K|   9030 |  2402K|  2402K| 1882K (0)|         |
|*  3 |    HASH JOIN SEMI    |        |      1 |      1 |    45 | 11734   (1)| 00:00:01 |   9999 |00:00:04.26 |   92912 |    101K|   9030 |   251M|    23M|   82M (1)|      76M|
|*  4 |     TABLE ACCESS FULL| TEST   |      1 |      1 |    23 | 11668   (1)| 00:00:01 |   5250K|00:00:01.31 |   42745 |  42741 |      0 |       |       |          |         |
|*  5 |     TABLE ACCESS FULL| TESTF1 |      1 |  40184 |   863K|    66   (0)| 00:00:01 |  40184 |00:00:00.44 |   50167 |  50162 |      0 |       |       |          |         |
|*  6 |    TABLE ACCESS FULL | TESTF  |      1 |  40184 |   863K|    67   (0)| 00:00:01 |  10802 |00:00:00.01 |      80 |    104 |      0 |       |       |          |         |
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
​
​
Predicate Information (identified by operation id):
---------------------------------------------------
​
   2 - access("B"."ID"="A"."ID")
   3 - access("C"."ID"="A"."ID")
   4 - filter(("A"."ID"=TO_NUMBER(TO_CHAR("A"."ID")) AND "A"."NAME1"='bbbbbbbbbbbbbbbb'))
   5 - filter("C"."NAME1"='c')
   6 - filter("B"."NAME1"='c')

Tip:欢迎关注公众号:勇敢牛牛的笔记,超100+的原创内容,每周不定期更新数据库技术文章。

  • 14
    点赞
  • 21
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值