建立联合函数索引解决top sql性能问题

最新推荐文章于 2018-05-28 17:54:28 发布

congdiku5262

最新推荐文章于 2018-05-28 17:54:28 发布

阅读量117

点赞数

今天捕获到一条top sql,其大致内容如下
select count(*) from archive partition (1105) where done = 2 and name like '%@qq.com';
其中archive为分区表，按月分区，分区键为create_time列；其总数据为6千万，分区1105数据有600多万，且每日递增；分区1104数据有1千万，预计到月底1105数据也要有千万
再来看两个查询列，done的选择性极差，其distinct值只有3个；而name的distinct值有20多万，其实就是用户的邮箱地址；两个列上目前都没有索引；
目前该sql的执行计划如下，逻辑读有60多万
---------------------------------------------------------------------------------------------------------
| Id | Operation               | Name          | Rows | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |               |     1 |    23 |   164K (1)| 00:32:59 |       |       |
|   1 | SORT AGGREGATE         |               |     1 |    23 |            |          |       |       |
|   2 |   PARTITION RANGE SINGLE|               |   243K| 5460K|   164K (1)| 00:32:59 |     6 |     6 |
|* 3 |    TABLE ACCESS FULL    | archive       |   243K| 5460K|   164K (1)| 00:32:59 |     6 |     6 |
---------------------------------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
          8 recursive calls
          0 db block gets
     605310 consistent gets
     530101 physical reads
          0 redo size
        517 bytes sent via SQL*Net to client
        487 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)

sql结构极为简单，在无法修改应用逻辑的前提下，为了降低cost，只能考虑创建索引，而done列肯定是不合适的；
name列的选择性倒是还可以，可是使用的是模糊查询，而且%位于前面，直接创建索引显然是不合适的，前面提过name是用户的邮箱地址，而邮箱地址的格式是有规律的，
为*****@.**;我们可以通过instr获取@的位置，联合substr可以得到@.**部分的内容，即substr(name,instr(name,'@')，于是name like '%@qq.com'可以改写为
substr(name,instr(name,'@')='@qq.com',现在则可以基于substr(name,instr(name,'@')创建函数索引

SQL> Create index ind_func_name on archive(substr(name,instr(name,'@'))) online;

Index created

创建后的执行计划
--------------------------------------------------------------------------------------------------------------------------------
| Id | Operation                           | Name                     | Rows | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                          |     1 |    23 |   112K (1)| 00:22:34 |       |       |
|   1 | SORT AGGREGATE                     |                          |     1 |    23 |            |          |       |       |
|* 2 |   TABLE ACCESS BY GLOBAL INDEX ROWID| archive                  | 48626 | 1092K|   112K (1)| 00:22:34 |     6 |     6 |
|* 3 |    INDEX RANGE SCAN                 | ind_func_name            |   247K|       |     8 (25)| 00:00:01 |       |       |
--------------------------------------------------------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          8 recursive calls
          0 db block gets
     192078 consistent gets
          0 physical reads
          0 redo size
        517 bytes sent via SQL*Net to client
        487 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          1 rows processed

执行代价有所降低，但是此时索引不是local的，执行计划中也没有出现partition字样，重新建立一个local索引

SQL> Create index ind_func_name on archive(substr(name,instr(name,'@'))) online local;
再看执行计划
--------------------------------------------------------------------------------------------------------------------------------
| Id | Operation                           | Name                     | Rows | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                          |     1 |    23 | 12490   (1)| 00:02:30 |       |       |
|   1 | SORT AGGREGATE                     |                          |     1 |    23 |            |          |       |       |
|   2 |   PARTITION RANGE SINGLE            |                          | 48626 | 1092K| 12490   (1)| 00:02:30 |     6 |     6 |
|* 3 |    TABLE ACCESS BY LOCAL INDEX ROWID| archive                  | 48626 | 1092K| 12490   (1)| 00:02:30 |     6 |     6 |
|* 4 |     INDEX RANGE SCAN                | ind_func_name            | 25856 |       |     3   (0)| 00:00:01 |     6 |     6 |
--------------------------------------------------------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
        106 recursive calls
          0 db block gets
     171296 consistent gets
       1485 physical reads
          0 redo size
        517 bytes sent via SQL*Net to client
        487 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          1 rows processed

可以看到consistent gets创建索引前为605310，创建函数索引后下降为192078，函数索引修改成分区索引后变为171296；
虽然性能得到了提升，但是执行代价仍然有点大，此时可以考虑组合索引了
SQL> create index ind_func_name_done on archive(substr(name,instr(name,'@')),done) online local;

Index created.
再看一把执行计划
-------------------------------------------------------------------------------------------------------------------
| Id | Operation               | Name                    | Rows | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |                         |     1 |    23 |     3   (0)| 00:00:01 |       |       |
|   1 | SORT AGGREGATE         |                         |     1 |    23 |            |          |       |       |
|   2 |   PARTITION RANGE SINGLE|                         | 48626 | 1092K|     3   (0)| 00:00:01 |     6 |     6 |
|* 3 |    INDEX RANGE SCAN     | ind_func_name_done      | 48626 | 1092K|     3   (0)| 00:00:01 |     6 |     6 |
-------------------------------------------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
         15 recursive calls
          0 db block gets
       1186 consistent gets
       1181 physical reads
          0 redo size
        517 bytes sent via SQL*Net to client
        487 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          1 rows processed
这样一来，sql的consistent gets骤降到1186，而它也从top sql的队列中消失了