Extended Optimizer Statistics in Oracle 11g Improve Performance

oracle优化器会利用统计信息来选择最优或者次优的执行计划,在确定执行计划的过程中,oracle可以非常很好的分析每个表列的数据分布特征,但是对于列与列之间的某些关联关系,oracle是无能为力的,例如城市名称和邮编之间的对应关系。值得庆幸的是,在oracle11g中,引入了基于列组合和关系表达式的统计分析功能,这无疑为优化器做出更正确的抉择提供了有力的支持。我们称扩展统计信息功能。

对于扩展统计分析功能,我们大概可以分为两类:基于列组合基于表达式。而实现的方法也存在两种:通过DBMS_STATS.CREATE_EXTENDED_STATS通过method_opt参数

DBMS_STATS.CREATE_EXTENDED_STATS和METHOD_OPT

创建扩展的统计信息后,我们可以在DBA_STAT_EXTENSIONS和dba_tab_col_statistics视图中查询到相关信息。示例如下:

首先看一下METHOD_OPT方法

SQL> show user
USER 为 "SH"
SQL> exec dbms_stats.delete_table_stats('SH','CUSTOMERS');

PL/SQL 过程已成功完成。

SQL> exec  Dbms_Stats.drop_Extended_Stats('SH','CUSTOMERS','(cust_state_province,country_id)');

PL/SQL 过程已成功完成。

SQL> Select * From User_Stat_Extensions;

未选定行

SQL> select * from user_tab_col_statistics where table_name='CUSTOMERS';

未选定行

SQL> exec dbms_stats.gather_table_stats('SH','CUSTOMERS',METHOD_OPT=>'FOR ALL COLUMNS SIZE 1,for columns (cust_state_province,country_id) size auto ');

PL/SQL 过程已成功完成。

SQL> select table_name,column_name from user_tab_col_statistics where table_name='CUSTOMERS';

TABLE_NAME		       COLUMN_NAME
------------------------------ ------------------------------
CUSTOMERS		       SYS_STU#S#WF25Z#QAHIHE#MOFFMM_
CUSTOMERS		       CUST_ID
CUSTOMERS		       CUST_FIRST_NAME
CUSTOMERS		       CUST_LAST_NAME
CUSTOMERS		       CUST_GENDER
CUSTOMERS		       CUST_YEAR_OF_BIRTH
CUSTOMERS		       CUST_MARITAL_STATUS
CUSTOMERS		       CUST_STREET_ADDRESS
CUSTOMERS		       CUST_POSTAL_CODE
CUSTOMERS		       CUST_CITY
CUSTOMERS		       CUST_CITY_ID

TABLE_NAME		       COLUMN_NAME
------------------------------ ------------------------------
CUSTOMERS		       CUST_STATE_PROVINCE
CUSTOMERS		       CUST_STATE_PROVINCE_ID
CUSTOMERS		       COUNTRY_ID
CUSTOMERS		       CUST_MAIN_PHONE_NUMBER
CUSTOMERS		       CUST_INCOME_LEVEL
CUSTOMERS		       CUST_CREDIT_LIMIT
CUSTOMERS		       CUST_EMAIL
CUSTOMERS		       CUST_TOTAL
CUSTOMERS		       CUST_TOTAL_ID
CUSTOMERS		       CUST_SRC_ID
CUSTOMERS		       CUST_EFF_FROM

TABLE_NAME		       COLUMN_NAME
------------------------------ ------------------------------
CUSTOMERS		       CUST_EFF_TO
CUSTOMERS		       CUST_VALID

已选择24行。

SQL> Select * From User_Stat_Extensions;

TABLE_NAME		       EXTENSION_NAME
------------------------------ ------------------------------
EXTENSION									 CREATO DRO
-------------------------------------------------------------------------------- ------ ---
CUSTOMERS		       SYS_STU#S#WF25Z#QAHIHE#MOFFMM_
("CUST_STATE_PROVINCE","COUNTRY_ID")						 USER	YES

使用method_opt会生成扩展列并直接统计扩展信息

再来看看DBMS_STATS.CREATE_EXTENDED_sTATS

SQL> exec dbms_stats.delete_table_stats('SH','CUSTOMERS');

PL/SQL 过程已成功完成。

SQL> exec  Dbms_Stats.drop_Extended_Stats('SH','CUSTOMERS','(cust_state_province,country_id)');

PL/SQL 过程已成功完成。

SQL> Select * From User_Stat_Extensions;

未选定行

SQL> select table_name,column_name from user_tab_col_statistics where table_name='CUSTOMERS';

未选定行

SQL> SELECT DBMS_STATS.CREATE_EXTENDED_STATS('SH','CUSTOMERS','(cust_state_province,country_id)') FROM DUAL; 

DBMS_STATS.CREATE_EXTENDED_STATS('SH','CUSTOMERS','(CUST_STATE_PROVINCE,COUNTRY_ID)')
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SYS_STU#S#WF25Z#QAHIHE#MOFFMM_

SQL> select table_name,column_name from user_tab_col_statistics where table_name='CUSTOMERS';

未选定行

SQL> Select * From User_Stat_Extensions;

TABLE_NAME		       EXTENSION_NAME		      EXTENSION 							          CREATO DRO
------------------------------ ------------------------------ -------------------------------------------------------------------------------- ------ ---
CUSTOMERS		       SYS_STU#S#WF25Z#QAHIHE#MOFFMM_ ("CUST_STATE_PROVINCE","COUNTRY_ID")				          USER   YES

SQL> exec dbms_stats.gather_table_stats('SH','CUSTOMERS',METHOD_OPT=>'FOR ALL COLUMNS SIZE 1 ');

PL/SQL 过程已成功完成。

SQL> select table_name,column_name from user_tab_col_statistics where table_name='CUSTOMERS';

TABLE_NAME		       COLUMN_NAME
------------------------------ ------------------------------
CUSTOMERS		       SYS_STU#S#WF25Z#QAHIHE#MOFFMM_
CUSTOMERS		       CUST_ID
CUSTOMERS		       CUST_FIRST_NAME
CUSTOMERS		       CUST_LAST_NAME
CUSTOMERS		       CUST_GENDER
CUSTOMERS		       CUST_YEAR_OF_BIRTH
CUSTOMERS		       CUST_MARITAL_STATUS
CUSTOMERS		       CUST_STREET_ADDRESS
CUSTOMERS		       CUST_POSTAL_CODE
CUSTOMERS		       CUST_CITY
CUSTOMERS		       CUST_CITY_ID

TABLE_NAME		       COLUMN_NAME
------------------------------ ------------------------------
CUSTOMERS		       CUST_STATE_PROVINCE
CUSTOMERS		       CUST_STATE_PROVINCE_ID
CUSTOMERS		       COUNTRY_ID
CUSTOMERS		       CUST_MAIN_PHONE_NUMBER
CUSTOMERS		       CUST_INCOME_LEVEL
CUSTOMERS		       CUST_CREDIT_LIMIT
CUSTOMERS		       CUST_EMAIL
CUSTOMERS		       CUST_TOTAL
CUSTOMERS		       CUST_TOTAL_ID
CUSTOMERS		       CUST_SRC_ID
CUSTOMERS		       CUST_EFF_FROM

TABLE_NAME		       COLUMN_NAME
------------------------------ ------------------------------
CUSTOMERS		       CUST_EFF_TO
CUSTOMERS		       CUST_VALID

已选择24行。

可以看出,DBMS_STATS.CREATE_EXTEND_sTATS不会直接创建统计信息,需要我们手工调用DBMS_STATS.GATHER_TABLE_STATS过程。


EXPRESSION

除了可以按照列组合的方式创建扩展统计信息,还可以按照表达式的方式来创建扩展统计信息,这对于基于函数的索引的非常有用的。

SQL> select dbms_Stats.create_extended_Stats('SH','CUSTOMERS','(MOD(CUST_ID,10))') FROM DUAL;

DBMS_STATS.CREATE_EXTENDED_STATS('SH','CUSTOMERS','(MOD(CUST_ID,10))')
----------------------------------------------------------------------------------------------------
SYS_STU1D2S2K6$TFSJ$24PUR2SN1E

SQL> Select * From User_Stat_Extensions;

TABLE_NAME		       EXTENSION_NAME
------------------------------ ------------------------------
EXTENSION									 CREATO DRO
-------------------------------------------------------------------------------- ------ ---
CUSTOMERS		       SYS_STU1D2S2K6$TFSJ$24PUR2SN1E
(MOD("CUST_ID",10))								 USER	YES


扩展统计是如何帮助优化器的?

我们通过下面的例子来演示一下扩展统计信息是如何帮助优化器正确评估sql语句的选择性的,

SQL> Select * From User_Stat_Extensions;   --没有开启扩展统计信息

未选定行

SQL> exec dbms_stats.gather_table_stats('SH','CUSTOMERS',METHOD_OPT=>'FOR ALL COLUMNS SIZE 1 ,for columns cust_state_province size 250, for columns country_id size 250');


PL/SQL 过程已成功完成。

SQL> set autotrace on explain
SQL> Select Count(*) From Customers Where Cust_State_Province='CA';

  COUNT(*)
----------
      3341   --实际存在数量为3341


执行计划
----------------------------------------------------------
Plan hash value: 296924608

--------------------------------------------------------------------------------
| Id  | Operation	   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |	       |     1 |    11 |   413	 (1)| 00:00:05 |
|   1 |  SORT AGGREGATE    |	       |     1 |    11 |	    |	       |
|*  2 |   TABLE ACCESS FULL| CUSTOMERS |  3359 | 36949 |   413	 (1)| 00:00:05 |   --优化器估计的数量为3359,与3341基本接近
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("CUST_STATE_PROVINCE"='CA')

SQL> Select Count(*) From Customers Where Country_Id=52790;

  COUNT(*)
----------
     18520  --实际存在数量为 18520


执行计划
----------------------------------------------------------
Plan hash value: 296924608

--------------------------------------------------------------------------------
| Id  | Operation	   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |	       |     1 |     5 |   413	 (1)| 00:00:05 |
|   1 |  SORT AGGREGATE    |	       |     1 |     5 |	    |	       | 
|*  2 |   TABLE ACCESS FULL| CUSTOMERS | 18863 | 94315 |   413	 (1)| 00:00:05 |   --优化器估计数量为18863,与18520基本接近
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("COUNTRY_ID"=52790)

SQL> SELECT COUNT(*) FROM CUSTOMERS WHERE CUST_STATE_PROVINCE='CA' AND COUNTRY_ID=52790;

  COUNT(*)
----------
      3341  --实际值数量为3341,这是因为country_id和CUST_STAT_PROVINCE之间存在某种数值关系,而这种关系,oracle数据库是不知的


执行计划
----------------------------------------------------------
Plan hash value: 296924608

--------------------------------------------------------------------------------
| Id  | Operation	   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |	       |     1 |    16 |   413	 (1)| 00:00:05 |
|   1 |  SORT AGGREGATE    |	       |     1 |    16 |	    |	       |
|*  2 |   TABLE ACCESS FULL| CUSTOMERS |  1142 | 18272 |   413	 (1)| 00:00:05 | 
 --由于oracle不知道country_id和CUST_STAT_PROVINCE之间隐含关系,oracle估算的方式为3359*18863/55500(表customer的记录总数)=1141.63

--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("CUST_STATE_PROVINCE"='CA' AND "COUNTRY_ID"=52790)


SQL> Select Count(*) From Customers Where Country_Id=52770;

  COUNT(*)
----------
      7780


执行计划
----------------------------------------------------------
Plan hash value: 296924608

--------------------------------------------------------------------------------
| Id  | Operation	   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |	       |     1 |     5 |   413	 (1)| 00:00:05 |
|   1 |  SORT AGGREGATE    |	       |     1 |     5 |	    |	       |
|*  2 |   TABLE ACCESS FULL| CUSTOMERS |  7380 | 36900 |   413	 (1)| 00:00:05 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("COUNTRY_ID"=52770)

SQL> SELECT COUNT(*) FROM CUSTOMERS WHERE CUST_STATE_PROVINCE='CA' AND COUNTRY_ID=52770;

  COUNT(*)
----------
	 0  --实际值为0


执行计划
----------------------------------------------------------
Plan hash value: 296924608

--------------------------------------------------------------------------------
| Id  | Operation	   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |	       |     1 |    16 |   413	 (1)| 00:00:05 |
|   1 |  SORT AGGREGATE    |	       |     1 |    16 |	    |	       |
|*  2 |   TABLE ACCESS FULL| CUSTOMERS |   447 |  7152 |   413	 (1)| 00:00:05 |
--------------------------------------------------------------------------------
--由于oracle不知道country_id和CUST_STAT_PROVINCE之间隐含关系,oracle估算的方式为3359*7380/55500(表customer的记录总数)=446.65



Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("CUST_STATE_PROVINCE"='CA' AND "COUNTRY_ID"=52770)

可以看出,如果oracle不了解列country_id和CUST_STAT_PROVINCE之间的数量关系,优化器估计的结果和实际值之间是存在差距的,下面看一下添加扩展统计信息后的结果

SQL> SELECT DBMS_STATS.CREATE_EXTENDED_STATS('SH','CUSTOMERS','(cust_state_province,country_id)') FROM DUAL; 

DBMS_STATS.CREATE_EXTENDED_STATS('SH','CUSTOMERS','(CUST_STATE_PROVINCE,COUNTRY_ID)')
----------------------------------------------------------------------------------------------------
SYS_STU#S#WF25Z#QAHIHE#MOFFMM_

SQL> exec dbms_stats.gather_table_stats('SH','CUSTOMERS',METHOD_OPT=>'FOR ALL COLUMNS SIZE 1 ,for columns cust_state_province size 250, for columns country_id size 250');

PL/SQL 过程已成功完成。

SQL> select table_name,column_name,histogram from user_tab_col_statistics where table_name='CUSTOMERS' AND column_name like 'SYS%';

TABLE_NAME		       COLUMN_NAME		      HISTOGRAM
------------------------------ ------------------------------ ---------------
CUSTOMERS		       SYS_STU#S#WF25Z#QAHIHE#MOFFMM_ NONE

SQL> set autotrace on explain
SQL> SELECT COUNT(*) FROM CUSTOMERS WHERE CUST_STATE_PROVINCE='CA' AND COUNTRY_ID=52770;

  COUNT(*)
----------
	 0


执行计划
----------------------------------------------------------
Plan hash value: 296924608

--------------------------------------------------------------------------------
| Id  | Operation	   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |	       |     1 |    16 |   413	 (1)| 00:00:05 |
|   1 |  SORT AGGREGATE    |	       |     1 |    16 |	    |	       |
|*  2 |   TABLE ACCESS FULL| CUSTOMERS |   432 |  6912 |   413	 (1)| 00:00:05 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("CUST_STATE_PROVINCE"='CA' AND "COUNTRY_ID"=52770)

SQL> set autotrace off
SQL> exec dbms_stats.gather_table_stats('SH','CUSTOMERS',METHOD_OPT=>'FOR ALL COLUMNS SIZE auto ,for columns cust_state_province size 250, for columns country_id size 250');

PL/SQL 过程已成功完成。

SQL> select table_name,column_name,histogram from user_tab_col_statistics where table_name='CUSTOMERS' AND column_name like 'SYS%';

TABLE_NAME		       COLUMN_NAME		      HISTOGRAM
------------------------------ ------------------------------ ---------------
CUSTOMERS		       SYS_STU#S#WF25Z#QAHIHE#MOFFMM_ FREQUENCY

SQL> set autotrace on explain
SQL> SELECT COUNT(*) FROM CUSTOMERS WHERE CUST_STATE_PROVINCE='CA' AND COUNTRY_ID=52770;

  COUNT(*)
----------
	 0


执行计划
----------------------------------------------------------
Plan hash value: 296924608

--------------------------------------------------------------------------------
| Id  | Operation	   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |	       |     1 |    16 |   413	 (1)| 00:00:05 |
|   1 |  SORT AGGREGATE    |	       |     1 |    16 |	    |	       |
|*  2 |   TABLE ACCESS FULL| CUSTOMERS |     5 |    80 |   413	 (1)| 00:00:05 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("CUST_STATE_PROVINCE"='CA' AND "COUNTRY_ID"=52770)

扩展的统计信息发挥作用,帮助优化器做出了正确的评估。


Extended Statistics Usage Notes:  

The Oracle documentation notes these limitations on the dbms_stats.create_extended_stats extension argument:

- The extension cannot contain a virtual column.

- Extensions cannot be created on tables owned by SYS.

- Extensions cannot be created on cluster tables, index organized tables, temporary tables or external tables.

- The total number of extensions in a table cannot be greater than a maximum of (20, 10% of number of non-virtual columns in the table).

- The number of columns in a column group must be in the range [2, 32].

- A column cannot appear more than once in a column group.

- A column group can not contain expressions.

- An expression must contain at least one column.

- An expression cannot contain a subquery.

- The COMPATIBLE parameter needs to be 11.0.0.0.0 or greater


  • 1
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值