关于添加rownum条件count stopkey执行计划对于cost成本估算的影响

Posted on by xiaoyu

监控系统发现一个sql语句的相应时间特别长,消耗的IO资源也非常高,但是cost成本却非常低

SQL> SELECT A.ATTRVALUE, B.TYPENAME
  2        FROM tbcs.GROUP_SUBS_MEMBER_ATTR A, tbcs.RESOURCE_TYPE B
  3       WHERE A.ATTRID = 'res_type'
  4         AND A.REGION = 23
  5         AND B.ITEMID = A.ATTRVALUE
  6         AND ROWNUM = 1;

Elapsed: 00:03:18.62

Execution Plan
----------------------------------------------------------
Plan hash value: 1650466411

-----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                   | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                        |     1 |    55 |     7   (0)| 00:00:01 |       |       |
|*  1 |  COUNT STOPKEY               |                        |       |       |            |          |       |       |
|   2 |   TABLE ACCESS BY INDEX ROWID| IM_RES_TYPE            |     1 |    36 |     1   (0)| 00:00:01 |       |       |
|   3 |    NESTED LOOPS              |                        |     4 |   165 |     7   (0)| 00:00:01 |       |       |
|   4 |     PARTITION RANGE SINGLE   |                        |     4 |    76 |     3   (0)| 00:00:01 |     4 |     4 |
|*  5 |      TABLE ACCESS FULL       | GROUP_SUBS_MEMBER_ATTR |     4 |    76 |     3   (0)| 00:00:01 |     4 |     4 |
|*  6 |     INDEX UNIQUE SCAN        | PK_IMRESTYPE           |     1 |       |     0   (0)| 00:00:01 |       |       |
-----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM=1)
   5 - filter("A"."ATTRID"='res_type' AND "A"."ATTRVALUE" IS NOT NULL AND "A"."REGION"=23)
   6 - access("RES_TYPE_ID"="A"."ATTRVALUE")

Statistics
----------------------------------------------------------
          8  recursive calls
          0  db block gets
     469240  consistent gets
     469186  physical reads
          0  redo size
        619  bytes sent via SQL*Net to client
        520  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

那么问题是为什么cost的成本这么低,但是逻辑读确如此之高。

首先这里简单的分析下这个执行计划为什么会有如此高的逻辑读,先来看下面的查询:

SQL>      select count(*)
  2         from tbcs.GROUP_SUBS_MEMBER_ATTR A
  3        where A.ATTRID = 'res_type'
  4          AND A.REGION = 23
  5          and "A"."ATTRVALUE" IS NOT NULL
  6  ;

  COUNT(*)
----------
        14

而且这部分数据都在rownum 7千万以上的位置。

SQL> select cn from (select a.*, rownum cn from tbcs.GROUP_SUBS_MEMBER_ATTR A where region=23)b where b. ATTRID = 'res_type'  and ATTRVALUE IS NOT NULL

        CN
----------
  72251394
  72253121
  72261116
  72287094
  72292151
  72296915
  72296922
  72304758
  72333694
  72334266
  72334281
  72334924
  72336096
  72336103

14 rows selected.

虽然sql语句加上了rownum=1的限制,但是由于tbcs.GROUP_SUBS_MEMBER_ATTR A中(“A”.”ATTRID”=’res_type’ AND “A”.”ATTRVALUE” IS NOT NULL AND “A”.”REGION”=23)的满足这个条件的第一行数据在分区段的较后面的block中(全表扫描也是有顺序的)。 此时这个单分区全表扫描 count stopeky的时候也要遍历这个分区的大部分block才能找到满足(“A”.”ATTRID”=’res_type’ AND “A”.”ATTRVALUE” IS NOT NULL AND “A”.”REGION”=23)的第一行数据,然后将这行数据同时与IM_RES_TYPE表做nested loop,当发现这行数据满足”RES_TYPE_ID”=”A”.”ATTRVALUE”即终止查询。 如果不满足再去单分区全扫描找第二行满足谓词条件5的数据行,然后再去和IM_RES_TYPE表比对,一直到找到符合条件的数据行为止。

如果全表扫描前几次IO能够扫描的block刚好能够满足谓词条件,则加上rownum限制条件确实是能够减少大部分的IO消耗。

sys@CRMDB1>select attrid from tbcs.GROUP_SUBS_MEMBER_ATTR A where region=23 and  ATTRVALUE IS NOT NULL rownum=1;

ATTRID
--------------------------------
Flag1

1 row selected.

sys@CRMDB1>select * from tbcs.GROUP_SUBS_MEMBER_ATTR A where region=23 and attrid='Flag1' and ATTRVALUE IS NOT NULL and rownum=1;

1 row selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 484799315

------------------------------------------------------------------------------------------------------------------
| Id  | Operation               | Name                   | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |                        |     1 |    41 |     2   (0)| 00:00:01 |       |       |
|*  1 |  COUNT STOPKEY          |                        |       |       |            |          |       |       |
|   2 |   PARTITION RANGE SINGLE|                        |     2 |    82 |     2   (0)| 00:00:01 |     4 |     4 |
|*  3 |    TABLE ACCESS FULL    | GROUP_SUBS_MEMBER_ATTR |     2 |    82 |     2   (0)| 00:00:01 |     4 |     4 |
------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM=1)
   3 - filter("ATTRID"='Flag1' AND "A"."ATTRVALUE" IS NOT NULL AND "REGION"=23)

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          3  consistent gets
          0  physical reads
          0  redo size
        968  bytes sent via SQL*Net to client
        521  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

由于满足谓词5的数据都在分区段的后面的block中,所以这个sql会消耗特别多的IO资源。

下面我们来解释下为什么在单分区全表扫描然后count stopkey时cbo的评估的cost成本只有3,先来看下表的统计信息:

关于GROUP_SUBS_MEMBER_ATTR表的统计信息:

Table                                  Number                        Empty    Chain Average Global         Sample Date
Name                                  of Rows          Blocks       Blocks    Count Row Len Stats            Size MM-DD-YYYY
------------------------------ -------------- --------------- ------------ -------- ------- ------ -------------- ----------
GROUP_SUBS_MEMBER_ATTR            344,752,080       22,395,19            0        0      41 YES        17,237,604 09-10-2014


Column                             Distinct              Number       Number         Sample Date
Name                                 Values     Density Buckets        Nulls           Size MM-DD-YYYY
------------------------------ ------------ ----------- ------- ------------ -------------- ----------
REGION                                    4   .25000000       1            0     17,237,604 09-10-2014
ATTRID                                  166   .00602410       1            0     17,237,604 09-10-2014
ATTRVALUE                           189,682   .00000527       1   60,824,860     14,196,361 09-10-2014


Index                                      Leaf       Distinct         Number      AV      Av      Cluster Date
Name                           BLV         Blks           Keys        of Rows     LEA    Data       Factor MM-DD-YYYY
------------------------------ --- ------------ -------------- -------------- ------- ------- ------------ ----------
PK_CM_GROUP_MEMBERATTR           3    3,462,820    341,230,660    341,230,660       1       1  108,506,400 09-10-2014

index                          Column                          Col Column
Name                           Name                            Pos Details
------------------------------ ------------------------------ ---- ------------------
PK_CM_GROUP_MEMBERATTR         GRPSUBSMEMOID                     1 NUMBER(18,0) NOT NULL
                               ATTRID                            2 VARCHAR2(32) NOT NULL
                               REGION                            3 NUMBER(5,0) NOT NULL

**********************************************************
Partition Level
**********************************************************

Partition              Number                        Empty Average    Chain Average Global Date
Name                  of Rows          Blocks       Blocks   Space    Count Row Len Stats  MM-DD-YYYY
-------------- -------------- --------------- ------------ ------- -------- ------- ------ ----------
P_R_20            107,562,800        7,002,86            0       0        0      41 YES    10-10-2014
P_R_21             65,051,340        4,220,31            0       0        0      41 YES    07-07-2014
P_R_22             89,764,040        5,816,18            0       0        0      41 YES    08-07-2014
P_R_23             77,962,200        5,069,40            0       0        0      41 YES    07-26-2014
P_R_99                      0            0,00            0       0        0       0 YES    09-15-2013

对于attrid的num_distinct是166,num_nulls是0,attrvalue的num_distinct是189682,num_nulls是60824860,而且他们的number buckets都是1,也就是没有直方图,此时cbo要根据(“A”.”ATTRID”=’res_type’ AND “A”.”ATTRVALUE” IS NOT NULL AND “A”.”REGION”=23)谓词条件取出满足条件的数据,由于根据统计信息得出大部分的数据都是满足于这个条件的,cbo会认为只需要全表扫描的前几次IO都能够取出满足条件的数据。(关于选择率和cost成本的计算太过于复杂,不属于本篇blog讨论的范围)

单独分离出来这部分查询cbo估算的执行成本也很低,但是消耗的逻辑读还是特别高。

sys@CRMDB1>select * from tbcs.GROUP_SUBS_MEMBER_ATTR A where rownum=1 and region=23 and ATTRID = 'res_type'  AND "A"."ATTRVALUE" IS NOT NULL

1 row selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 484799315

------------------------------------------------------------------------------------------------------------------
| Id  | Operation               | Name                   | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |                        |     1 |    41 |     2   (0)| 00:00:01 |       |       |
|*  1 |  COUNT STOPKEY          |                        |       |       |            |          |       |       |
|   2 |   PARTITION RANGE SINGLE|                        |     2 |    82 |     2   (0)| 00:00:01 |     4 |     4 |
|*  3 |    TABLE ACCESS FULL    | GROUP_SUBS_MEMBER_ATTR |     2 |    82 |     2   (0)| 00:00:01 |     4 |     4 |
------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM=1)
   3 - filter("ATTRID"='Flag1' AND "A"."ATTRVALUE" IS NOT NULL AND "REGION"=23)


Statistics
----------------------------------------------------------
          8  recursive calls
          0  db block gets
     469266  consistent gets
     468242  physical reads
          0  redo size
        984  bytes sent via SQL*Net to client
        521  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

这里由于(“ATTRID”=’Flag1′ AND “A”.”ATTRVALUE” IS NOT NULL AND “REGION”=23)的数据具有较大的倾斜性且没有直方图,导致了cbo全表扫描count stopkey时错误的估算了成本。

下面xiaoyu新建一个表来测试有直方图和没直方图时估算类似rownum=1 and column1=A这类查询的成本区别。

SQL> select owner,count(*) from tab01 group by owner;

OWNER                COUNT(*)
------------------------------ ----------
OWBSYS_AUDIT                   24
MDSYS                    4022
QWE                 1
PUBLIC                  67990
OUTLN                      20
CTXSYS                    778
OLAPSYS                  1442
FLOWS_FILES                26
OWBSYS                  4
SYSTEM                   1236
ORACLE_OCM                 16
EXFSYS                    624
APEX_030200              5122
DBSNMP                    114
ORDSYS                   5026
ORDPLUGINS                 20
SYSMAN                   7108
APPQOSSYS                  10
XDB                  2336
ORDDATA                   514
XIAOYU                    154
SYS                 75434
WMSYS                     666

23 rows selected.

SQL> exec dbms_stats.gather_table_stats(ownname=>USER,tabname=>'TAB01',method_opt=>'for all columns size 1');

PL/SQL procedure successfully completed.
SQL> set autotrace traceonly;
SQL> select * from tab01 where rownum=1 and owner='SYS';


Execution Plan
----------------------------------------------------------
Plan hash value: 4034257318

----------------------------------------------------------------------------------------------
| Id  | Operation        | Name  | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |       |     1 |    98 |     2   (0)| 00:00:01 |       |       |
|*  1 |  COUNT STOPKEY       |       |       |       |        |      |       |       |
|   2 |   PARTITION RANGE ALL|       |     2 |   196 |     2   (0)| 00:00:01 |     1 |    40 |
|*  3 |    TABLE ACCESS FULL | TAB01 |     2 |   196 |     2   (0)| 00:00:01 |     1 |    40 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM=1)
   3 - filter("OWNER"='SYS')


Statistics
----------------------------------------------------------
      1  recursive calls
      0  db block gets
      6  consistent gets
     45  physical reads
      0  redo size
       1605  bytes sent via SQL*Net to client
    523  bytes received via SQL*Net from client
      2  SQL*Net roundtrips to/from client
      0  sorts (memory)
      0  sorts (disk)
      1  rows processed

SQL> select * from tab01 where rownum=1 and owner='XIAOYU';


Execution Plan
----------------------------------------------------------
Plan hash value: 4034257318

----------------------------------------------------------------------------------------------
| Id  | Operation        | Name  | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |       |     1 |    98 |     2   (0)| 00:00:01 |       |       |
|*  1 |  COUNT STOPKEY       |       |       |       |        |      |       |       |
|   2 |   PARTITION RANGE ALL|       |     2 |   196 |     2   (0)| 00:00:01 |     1 |    40 |
|*  3 |    TABLE ACCESS FULL | TAB01 |     2 |   196 |     2   (0)| 00:00:01 |     1 |    40 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM=1)
   3 - filter("OWNER"='XIAOYU')


Statistics
----------------------------------------------------------
      1  recursive calls
      0  db block gets
       2576  consistent gets
       2673  physical reads
      0  redo size
       1615  bytes sent via SQL*Net to client
    523  bytes received via SQL*Net from client
      2  SQL*Net roundtrips to/from client
      0  sorts (memory)
      0  sorts (disk)
      1  rows processed

SQL> select * from tab01 where rownum=1 and owner='IMP';

no rows selected


Execution Plan
----------------------------------------------------------
Plan hash value: 4034257318

----------------------------------------------------------------------------------------------
| Id  | Operation        | Name  | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |       |     1 |    98 |     2   (0)| 00:00:01 |       |       |
|*  1 |  COUNT STOPKEY       |       |       |       |        |      |       |       |
|   2 |   PARTITION RANGE ALL|       |     2 |   196 |     2   (0)| 00:00:01 |     1 |    40 |
|*  3 |    TABLE ACCESS FULL | TAB01 |     2 |   196 |     2   (0)| 00:00:01 |     1 |    40 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM=1)
   3 - filter("OWNER"='IMP')


Statistics
----------------------------------------------------------
      1  recursive calls
      0  db block gets
       3359  consistent gets
       3263  physical reads
      0  redo size
       1343  bytes sent via SQL*Net to client
    512  bytes received via SQL*Net from client
      1  SQL*Net roundtrips to/from client
      0  sorts (memory)
      0  sorts (disk)
      0  rows processed

这里看出来在num bucket等于1时,全表扫描然后count stop key成本估算都是2,但是由于数据的分布问题实际的逻辑读是有量变的。

SQL> exec dbms_stats.gather_table_stats(ownname=>USER,tabname=>'TAB01',method_opt=>'for all columns size auto');

PL/SQL procedure successfully completed.

SQL> select * from tab01 where rownum=1 and owner='SYS';


Execution Plan
----------------------------------------------------------
Plan hash value: 4034257318

----------------------------------------------------------------------------------------------
| Id  | Operation        | Name  | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |       |     1 |    98 |     2   (0)| 00:00:01 |       |       |
|*  1 |  COUNT STOPKEY       |       |       |       |        |      |       |       |
|   2 |   PARTITION RANGE ALL|       |     2 |   196 |     2   (0)| 00:00:01 |     1 |    40 |
|*  3 |    TABLE ACCESS FULL | TAB01 |     2 |   196 |     2   (0)| 00:00:01 |     1 |    40 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM=1)
   3 - filter("OWNER"='SYS')


Statistics
----------------------------------------------------------
      0  recursive calls
      0  db block gets
      6  consistent gets
     45  physical reads
      0  redo size
       1605  bytes sent via SQL*Net to client
    523  bytes received via SQL*Net from client
      2  SQL*Net roundtrips to/from client
      0  sorts (memory)
      0  sorts (disk)
      1  rows processed

SQL> select * from tab01 where rownum=1 and owner='XIAOYU';


Execution Plan
----------------------------------------------------------
Plan hash value: 4034257318

----------------------------------------------------------------------------------------------
| Id  | Operation        | Name  | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |       |     1 |    98 |    25   (0)| 00:00:01 |       |       |
|*  1 |  COUNT STOPKEY       |       |       |       |        |      |       |       |
|   2 |   PARTITION RANGE ALL|       |     2 |   196 |    25   (0)| 00:00:01 |     1 |    40 |
|*  3 |    TABLE ACCESS FULL | TAB01 |     2 |   196 |    25   (0)| 00:00:01 |     1 |    40 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM=1)
   3 - filter("OWNER"='XIAOYU')


Statistics
----------------------------------------------------------
      0  recursive calls
      0  db block gets
       2576  consistent gets
       2673  physical reads
      0  redo size
       1615  bytes sent via SQL*Net to client
    523  bytes received via SQL*Net from client
      2  SQL*Net roundtrips to/from client
      0  sorts (memory)
      0  sorts (disk)
      1  rows processed

SQL> select * from tab01 where rownum=1 and owner='IMP';

no rows selected


Execution Plan
----------------------------------------------------------
Plan hash value: 4034257318

----------------------------------------------------------------------------------------------
| Id  | Operation        | Name  | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |       |     1 |    98 |   161   (0)| 00:00:02 |       |       |
|*  1 |  COUNT STOPKEY       |       |       |       |        |      |       |       |
|   2 |   PARTITION RANGE ALL|       |     2 |   196 |   161   (0)| 00:00:02 |     1 |    40 |
|*  3 |    TABLE ACCESS FULL | TAB01 |     2 |   196 |   161   (0)| 00:00:02 |     1 |    40 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM=1)
   3 - filter("OWNER"='IMP')


Statistics
----------------------------------------------------------
      0  recursive calls
      0  db block gets
       3359  consistent gets
       3263  physical reads
      0  redo size
       1343  bytes sent via SQL*Net to client
    512  bytes received via SQL*Net from client
      1  SQL*Net roundtrips to/from client
      0  sorts (memory)
      0  sorts (disk)
      0  rows processed

如果收集该列的直方图,此时优化器清楚的知道谓词条件的数据分布,对于这类查询优化器就能够较准确的评估cost的成本。

优化这个sql并不难,有一个思路就是利用小表IM_RES_TYPE去做驱动表驱动大表GROUP_SUBS_MEMBER_ATTR,可以利用attrvalue和attrid的等值条件创建索引来避免对大表的单分区全扫描。

直接创建(ATTRVALUE+ATTRID)的索引。

explain plan for
SELECT A.ATTRVALUE, B.TYPENAME
  FROM tbcs.GROUP_SUBS_MEMBER_ATTR A, tbcs.RESOURCE_TYPE B
 WHERE A.ATTRID = 'res_type'
   AND A.REGION = 23
   AND B.ITEMID = A.ATTRVALUE
   AND ROWNUM = 1;

Plan hash value: 1650466411
 
-----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                   | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                        |     1 |    55 |     7   (0)| 00:00:01 |       |       |
|*  1 |  COUNT STOPKEY               |                        |       |       |            |          |       |       |
|   2 |   TABLE ACCESS BY INDEX ROWID| IM_RES_TYPE            |     1 |    36 |     1   (0)| 00:00:01 |       |       |
|   3 |    NESTED LOOPS              |                        |     4 |   165 |     7   (0)| 00:00:01 |       |       |
|   4 |     PARTITION RANGE SINGLE   |                        |     4 |    76 |     3   (0)| 00:00:01 |     4 |     4 |
|*  5 |      TABLE ACCESS FULL       | GROUP_SUBS_MEMBER_ATTR |     4 |    76 |     3   (0)| 00:00:01 |     4 |     4 |
|*  6 |     INDEX UNIQUE SCAN        | PK_IMRESTYPE           |     1 |       |     0   (0)| 00:00:01 |       |       |
-----------------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter(ROWNUM=1)
   5 - filter("A"."ATTRID"='res_type' AND "A"."ATTRVALUE" IS NOT NULL AND "A"."REGION"=23)
   6 - access("RES_TYPE_ID"="A"."ATTRVALUE")

这里默认情况下cbo根据cost成本估算并不会选择先用tbcs.RESOURCE_TYPE B去做驱动表做nested loop关联,这是因为两个sql的执行成本太接近,优化器既有可能选择全表扫描count stopkey,也有可能选择索引扫描count stopkey,但是消耗的IO资源是存在量变的。

SQL> SELECT /*+leading(B A)*/A.ATTRVALUE, B.TYPENAME
  2    FROM tbcs.GROUP_SUBS_MEMBER_ATTR A, tbcs.RESOURCE_TYPE B
  3   WHERE A.ATTRID = 'res_type'
  4     AND A.REGION = 23
  5     AND B.ITEMID = A.ATTRVALUE
  6     AND ROWNUM = 1;

Elapsed: 00:00:00.06

Execution Plan
----------------------------------------------------------
Plan hash value: 458037665

-----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name                   | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |                        |     1 |    55 |    14   (0)| 00:00:01 |       |       |
|*  1 |  COUNT STOPKEY                     |                        |       |       |            |          |       |       |
|*  2 |   TABLE ACCESS BY LOCAL INDEX ROWID| GROUP_SUBS_MEMBER_ATTR |     1 |    19 |    12   (0)| 00:00:01 |     4 |     4 |
|   3 |    NESTED LOOPS                    |                        |    10 |    55 |    14   (0)| 00:00:01 |       |       |
|   4 |     TABLE ACCESS FULL              | IM_RES_TYPE            |     1 |    36 |     2   (0)| 00:00:01 |       |       |
|   5 |     PARTITION RANGE SINGLE         |                        |    10 |       |     3   (0)| 00:00:01 |     4 |     4 |
|*  6 |      INDEX RANGE SCAN              | IND_ATTRVALUE_ATTRID   |    10 |       |     3   (0)| 00:00:01 |     4 |     4 |
-----------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM=1)
   2 - filter("A"."REGION"=23)
   6 - access("RES_TYPE_ID"="A"."ATTRVALUE" AND "A"."ATTRID"='res_type')
       filter("A"."ATTRVALUE" IS NOT NULL)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        370  consistent gets
          0  physical reads
          0  redo size
        619  bytes sent via SQL*Net to client
        520  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

这里需要强制加上leading(B A)让优化器先以tbcs.RESOURCE_TYPE B去做驱动表,然后驱动GROUP_SUBS_MEMBER_ATTR A表。

我们这里还可以有个想法就是,既然原sql消耗的成本主要在全表扫描取(“A”.”ATTRID”=’res_type’ AND “A”.”ATTRVALUE” IS NOT NULL AND “A”.”REGION”=23)的数据,那么我们能否用另外的一个取数据的办法就是通过index range scan的方式了,那我们能否让cbo走(ATTRVALUE+ATTRID)的索引来取数据。

SQL> SELECT /*+index(A ind_attrvalue_attrid)*/A.ATTRVALUE, B.TYPENAME
  2    FROM tbcs.GROUP_SUBS_MEMBER_ATTR A, tbcs.RESOURCE_TYPE B
  3   WHERE A.ATTRID = 'res_type'
  4     AND A.REGION = 23
  5     AND B.ITEMID = A.ATTRVALUE
  6     AND ROWNUM = 1;

Elapsed: 00:02:21.41

Execution Plan
----------------------------------------------------------
Plan hash value: 485372855

--------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name                   | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |                        |     1 |    55 |    11   (0)| 00:00:01 |       |       |
|*  1 |  COUNT STOPKEY                        |                        |       |       |            |          |       |       |
|   2 |   TABLE ACCESS BY INDEX ROWID         | IM_RES_TYPE            |     1 |    36 |     1   (0)| 00:00:01 |       |       |
|   3 |    NESTED LOOPS                       |                        |     4 |   165 |    11   (0)| 00:00:01 |       |       |
|   4 |     PARTITION RANGE SINGLE            |                        |     4 |    76 |     7   (0)| 00:00:01 |     4 |     4 |
|*  5 |      TABLE ACCESS BY LOCAL INDEX ROWID| GROUP_SUBS_MEMBER_ATTR |     4 |    76 |     7   (0)| 00:00:01 |     4 |     4 |
|*  6 |       INDEX FULL SCAN                 | IND_ATTRVALUE_ATTRID   | 98189 |       |     6   (0)| 00:00:01 |     4 |     4 |
|*  7 |     INDEX UNIQUE SCAN                 | PK_IMRESTYPE           |     1 |       |     0   (0)| 00:00:01 |       |       |
--------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM=1)
   5 - filter("A"."REGION"=23)
   6 - access("A"."ATTRID"='res_type')
       filter("A"."ATTRID"='res_type' AND "A"."ATTRVALUE" IS NOT NULL)
   7 - access("RES_TYPE_ID"="A"."ATTRVALUE")


Statistics
----------------------------------------------------------
          8  recursive calls
          0  db block gets
     227934  consistent gets
     227981  physical reads
          0  redo size
        619  bytes sent via SQL*Net to client
        520  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

cob采取比较糟糕的index full scan然会回表的方式,并没有采取我们所认为的index range scan的方式来回表,那我们想想为什么优化器不能够采用index range scan的方式了,这里是因为我们创建索引是(ATTRVALUE+ATTRID)的组合索引,在这个索引取数据时由于”A”.”ATTRVALUE” IS NOT NULL并不是一个等值的条件,而这个attrvalue又是前导列,这个导致优化器如果想走index range scan需要走类似的index range scan然后 INLIST ITERATOR迭代的方式,由于attrvalue有很大一部分的null value,这将会导致这部分执行cbo估算时较高,而不选择这种执行计划,进而选择了更糟糕的index full scan回表的方式,由于attrid=’res_type’ 具有很大的倾斜性,刚好这部分数据又在索引的后面的几个leaf block中。

再来想想(ATTRID+ATTRVALUE)复合索引,通过attrid是索引前导列,(“A”.”ATTRID”=’res_type’ AND “A”.”ATTRVALUE” IS NOT NULL AND “A”.”REGION”=23)这个谓词条件利用attrid是索引前导列的复合索引可以很精准的从root到branch再到leaf block,从而在leaf block时index range scan。

SQL> SELECT A.ATTRVALUE, B.TYPENAME
  2    FROM tbcs.GROUP_SUBS_MEMBER_ATTR A, tbcs.RESOURCE_TYPE B
  3   WHERE A.ATTRID = 'res_type'
  4     AND A.REGION = 23
  5     AND B.ITEMID = A.ATTRVALUE
  6     AND ROWNUM = 1;

Elapsed: 00:00:00.12

Execution Plan
----------------------------------------------------------
Plan hash value: 2801988880

--------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name                   | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |                        |     1 |    55 |     9   (0)| 00:00:01 |       |       |
|*  1 |  COUNT STOPKEY                        |                        |       |       |            |          |       |       |
|   2 |   TABLE ACCESS BY INDEX ROWID         | IM_RES_TYPE            |     1 |    36 |     1   (0)| 00:00:01 |       |       |
|   3 |    NESTED LOOPS                       |                        |     4 |   165 |     9   (0)| 00:00:01 |       |       |
|   4 |     PARTITION RANGE SINGLE            |                        |     4 |    76 |     5   (0)| 00:00:01 |     4 |     4 |
|*  5 |      TABLE ACCESS BY LOCAL INDEX ROWID| GROUP_SUBS_MEMBER_ATTR |     4 |    76 |     5   (0)| 00:00:01 |     4 |     4 |
|*  6 |       INDEX RANGE SCAN                | IND_ATTRID_ATTRVALUE   | 98189 |       |     4   (0)| 00:00:01 |     4 |     4 |
|*  7 |     INDEX UNIQUE SCAN                 | PK_IMRESTYPE           |     1 |       |     0   (0)| 00:00:01 |       |       |
--------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM=1)
   5 - filter("A"."REGION"=23)
   6 - access("A"."ATTRID"='res_type')
       filter("A"."ATTRVALUE" IS NOT NULL)
   7 - access("RES_TYPE_ID"="A"."ATTRVALUE")


Statistics
----------------------------------------------------------
          8  recursive calls
          0  db block gets
         10  consistent gets
          5  physical reads
          0  redo size
        619  bytes sent via SQL*Net to client
        520  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

这里我们需要注意的是在表中添加了rownum的限制又添加了部分谓词的条件时,刚好满足这个谓词条件对应的数据块在表段较后面的数据块中,且对应的列没有直方图又存在数据倾斜,此时cbo估算这个全表扫描 count stopkey成本往往是不准确的,当然上面这个例子确实比较特殊,没有直方图且数据有倾斜性,这里借助这个sql case分析只是为了让我们更好的理解nested loop的原理、以及rownum对于表扫描索引扫描时执行计划和成本估算的影响。

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值