11g New Feature dbms_stats Pending and Published Statistics

 Prior to Oracle 11g, the default behaviour was to make statistics available for the optimizer to use as soon as they were gathered.

While statistics are required to enable the optimizer to generate optimal execution plans, sometimes just by gathering fresh statistics tried and trusted execution plans can abruptly change and thereby adversely affect application performance.

In Oracle 11g we can now ‘defer’ the publication of statistics until they have been tried and tested and once we have confirmed that the execution plans are correct and optimal. Statistics remain in the pending state until they are published and the parameter optimizer_use_pending_statistics (default value FALSE) which when set at the session level will enables us to test the pending statistics independently of other database sessions.

Let us look at a test case using the SALES table in the SH schema.

If we see the query below, it appears that the data for the column CHANNEL_ID is skewed where majority of the rows have the value 3 and a very small majority have the value 9.


SQL> conn sh/sh
Connected.
SQL> select channel_id,count(*) from sales group by channel_id order by 2;

CHANNEL_ID   COUNT(*)
---------- ----------
         9       2074
         4     118416
         2     258025
         3     540328

But both the queries below are performing a full table scan of the SALES table when ideally it should be performing an index scan when the value 9 is used in the WHERE clause considering it accounts for a very small proportion of the rows in the SALES table.

SQL> set autot trace exp stat
SQL> set linesize 130
SQL> SELECT
S.cust_id
,S.prod_id
,SUM(S.amount_sold)
FROM sh.sales S
WHERE
channel_id=9
GROUP BY S.cust_id, S.prod_id
ORDER BY S.cust_id, S.prod_id
/

  2    3    4    5    6    7    8    9   10 
2074 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 4109827725

----------------------------------------------------------------------------------------------
| Id  | Operation            | Name  | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |       |  1746 | 29682 |   490   (2)| 00:00:06 |       |       |
|   1 |  SORT GROUP BY       |       |  1746 | 29682 |   490   (2)| 00:00:06 |       |       |
|   2 |   PARTITION RANGE ALL|       |  1746 | 29682 |   489   (2)| 00:00:06 |     1 |    28 |
|*  3 |    TABLE ACCESS FULL | SALES |  1746 | 29682 |   489   (2)| 00:00:06 |     1 |    28 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("CHANNEL_ID"=9)

Note
-----
   - SQL plan baseline "SQL_PLAN_6k40mr9y0bzzyebe69af4" used for this statement


Statistics
----------------------------------------------------------
        559  recursive calls
         24  db block gets
       1827  consistent gets
       1716  physical reads
      11328  redo size
      54398  bytes sent via SQL*Net to client
       2041  bytes received via SQL*Net from client
        140  SQL*Net roundtrips to/from client
         18  sorts (memory)
          0  sorts (disk)
       2074  rows processed

SQL> SELECT
S.cust_id
,S.prod_id
,SUM(S.amount_sold)
FROM sh.sales S
WHERE
channel_id=3
GROUP BY S.cust_id, S.prod_id
ORDER BY S.cust_id, S.prod_id
/

  2    3    4    5    6    7    8    9   10 

258968 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 4109827725

------------------------------------------------------------------------------------------------------
| Id  | Operation            | Name  | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |       |   359K|  5966K|       |  2831   (1)| 00:00:34 |       |       |
|   1 |  SORT GROUP BY       |       |   359K|  5966K|    16M|  2831   (1)| 00:00:34 |       |       |
|   2 |   PARTITION RANGE ALL|       |   543K|  9029K|       |   490   (2)| 00:00:06 |     1 |    28 |
|*  3 |    TABLE ACCESS FULL | SALES |   543K|  9029K|       |   490   (2)| 00:00:06 |     1 |    28 |
------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("CHANNEL_ID"=3)

Note
-----
   - SQL plan baseline "SQL_PLAN_gyqybj20pgvzcebe69af4" used for this statement


Statistics
----------------------------------------------------------
         10  recursive calls
          0  db block gets
       1722  consistent gets
       1647  physical reads
          0  redo size
    6582470  bytes sent via SQL*Net to client
     190427  bytes received via SQL*Net from client
      17266  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
     258968  rows processed

As earlier mentioned, statistics for a table are published immediately by default, so we use the DBMS_STATS package to change this default behaviour.

SQL> select sysdate from dual;

SYSDATE
-------------------
2012-01-13 14:27:05

SQL> exec dbms_stats.set_table_prefs('SH','SALES','PUBLISH','FALSE');

PL/SQL procedure successfully completed.

SQL> select dbms_stats.get_prefs('PUBLISH', 'SH', 'SALES' ) FROM DUAL;

DBMS_STATS.GET_PREFS('PUBLISH','SH','SALES')
----------------------------------------------------------------------------------------------------------------------------------
FALSE

Since we observed that the data in the table is skewed, we will gather histograms which may help the optimizer make more informed decisions when generating an execution plan.

Note that since we have disabled the publishing of statistics for the table SALES, even though we have analyzed the table, the LAST_ANALYZED column shows that the table has not been recently analyzed.

SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS('SH','SALES',METHOD_OPT => 'FOR ALL COLUMNS SIZE AUTO');

PL/SQL procedure successfully completed.

SQL> SELECT LAST_ANALYZED FROM DBA_TABLES WHERE TABLE_NAME='SALES';

LAST_ANALYZED
-------------------
2012-01-13 13:59:32

However, we query the DBA_TAB_PENDING_STATS view, it shows that the SALES table which is partitioned has been analyzed now.

SQL> SELECT TABLE_NAME,PARTITION_NAME ,LAST_ANALYZED  FROM DBA_TAB_PENDING_STATS;

TABLE_NAME                     PARTITION_NAME                 LAST_ANALYZED
------------------------------ ------------------------------ -------------------
SALES                          SALES_Q4_2003                  2012-01-13 14:29:05
SALES                          SALES_Q4_2002                  2012-01-13 14:29:05
SALES                          SALES_Q4_2001                  2012-01-13 14:29:05
SALES                          SALES_Q4_2000                  2012-01-13 14:29:04
SALES                          SALES_Q4_1999                  2012-01-13 14:29:04
SALES                          SALES_Q4_1998                  2012-01-13 14:29:04
SALES                          SALES_Q3_2003                  2012-01-13 14:29:04
SALES                          SALES_Q3_2002                  2012-01-13 14:29:04
SALES                          SALES_Q3_2001                  2012-01-13 14:29:03
SALES                          SALES_Q3_2000                  2012-01-13 14:29:03
SALES                          SALES_Q3_1999                  2012-01-13 14:29:03

TABLE_NAME                     PARTITION_NAME                 LAST_ANALYZED
------------------------------ ------------------------------ -------------------
SALES                          SALES_Q3_1998                  2012-01-13 14:29:03
SALES                          SALES_Q2_2003                  2012-01-13 14:29:03
SALES                          SALES_Q2_2002                  2012-01-13 14:29:03
SALES                          SALES_Q2_2001                  2012-01-13 14:29:03
SALES                          SALES_Q2_2000                  2012-01-13 14:29:03
SALES                          SALES_Q2_1999                  2012-01-13 14:29:01
SALES                          SALES_Q2_1998                  2012-01-13 14:29:01
SALES                          SALES_Q1_2003                  2012-01-13 14:29:01
SALES                          SALES_Q1_2002                  2012-01-13 14:29:01
SALES                          SALES_Q1_2001                  2012-01-13 14:29:01
SALES                          SALES_Q1_2000                  2012-01-13 14:29:01

TABLE_NAME                     PARTITION_NAME                 LAST_ANALYZED
------------------------------ ------------------------------ -------------------
SALES                          SALES_Q1_1999                  2012-01-13 14:29:01
SALES                          SALES_Q1_1998                  2012-01-13 14:29:01
SALES                          SALES_H2_1997                  2012-01-13 14:29:00
SALES                          SALES_H1_1997                  2012-01-13 14:29:00
SALES                          SALES_1996                     2012-01-13 14:29:00
SALES                          SALES_1995                     2012-01-13 14:29:00
SALES                                                         2012-01-13 14:29:05

29 rows selected.

We would like to see if the query execution plan has changed or improved after the recent gathering of statistics. We can use the parameter optimizer_use_pending_statistics for this by setting the value to TRUE (default is FALSE)

SQL> alter session set optimizer_use_pending_statistics=TRUE;

Session altered.

We now find that the optimizer is performing an index scan when the value 9 is used as a predicate value for CHANNEL_ID while it performs a full table scan when the value 3 is used which is a much more optimal plan than the original one before gathering fresh statistics.

SQL> explain plan
for
SELECT
     S.cust_id
    ,S.prod_id
    ,SUM(S.amount_sold)
  FROM sh.sales S
 WHERE
channel_id=3
 GROUP BY S.cust_id, S.prod_id
 ORDER BY S.cust_id, S.prod_id
/  2    3    4    5    6    7    8    9   10   11   12 

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 4109827725

------------------------------------------------------------------------------------------------------
| Id  | Operation            | Name  | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |       |   359K|  5966K|       |  2823   (1)| 00:00:34 |       |       |
|   1 |  SORT GROUP BY       |       |   359K|  5966K|    16M|  2823   (1)| 00:00:34 |       |       |
|   2 |   PARTITION RANGE ALL|       |   540K|  8965K|       |   490   (2)| 00:00:06 |     1 |    28 |
|*  3 |    TABLE ACCESS FULL | SALES |   540K|  8965K|       |   490   (2)| 00:00:06 |     1 |    28 |
------------------------------------------------------------------------------------------------------


PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("CHANNEL_ID"=3)

Note
-----
   - SQL plan baseline "SQL_PLAN_gyqybj20pgvzcebe69af4" used for this statement

19 rows selected.

SQL> explain plan
for
SELECT
     S.cust_id
    ,S.prod_id
    ,SUM(S.amount_sold)
  FROM sh.sales S
 WHERE
channel_id=9
 GROUP BY S.cust_id, S.prod_id
 ORDER BY S.cust_id, S.prod_id
/
  2    3    4    5    6    7    8    9   10   11   12 
Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1760806683

-------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name              | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                   |   921 |  2420 |    17   (6)| 00:00:01 |       |       |
|   1 |  SORT GROUP BY                      |                   |   921 |  2420 |    17   (6)| 00:00:01 |       |       |
|   2 |   PARTITION RANGE SINGLE            |                   |   921 |  2420 |    16   (0)| 00:00:01 |     5 |     5 |
|   3 |    TABLE ACCESS BY LOCAL INDEX ROWID| SALES             |   921 |  2420 |    16   (0)| 00:00:01 |     5 |     5 |
|   4 |     BITMAP CONVERSION TO ROWIDS     |                   |       |       |            |          |       |       |
|*  5 |       BITMAP INDEX SINGLE VALUE     | SALES_CHANNEL_BIX |       |       |            |          |     5 |     5 |
-------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("CHANNEL_ID"=9)

16 rows selected.

Since we have now confirmed that the plan is acceptable and the new statistics can be used by the optimizer, we PUBLISH the statistics now for the SALES table. Note that the LAST_ANALYZED column is also updated and the DBA_TAB_PENDING_STATS shows that there are no more pending statistics which need to be published.

SQL> EXEC DBMS_STATS.PUBLISH_PENDING_STATS ('SH','SALES');

PL/SQL procedure successfully completed.

SQL> SELECT LAST_ANALYZED FROM DBA_TABLES WHERE TABLE_NAME='SALES';


LAST_ANALYZED
-------------------
2012-01-13 14:29:05

SQL> SELECT COUNT(*) FROM DBA_TAB_PENDING_STATS;

  COUNT(*)
----------
         0

SQL> alter session set optimizer_use_pending_statistics=FALSE;

Session altered.

 


 

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值