oracle 谓词表达式对基数的影响(及11G改进)


 

简单的选择率与函数问题,下面只用col>字面值的 选择率,计算 比较方便

SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Prod
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for Linux: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production


SQL> conn xh/a831115
Connected.
SQL> show user
USER is "XH"
SQL> create table t1 (a int);

Table created.

SQL> declare
  2  begin
  3  for i in 1..10000 loop
  4  insert into t1 values(i);
  5  end loop;
  6  commit;
  7  end;
  8  /

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats('XH','T1');

PL/SQL procedure successfully completed.

SQL> select owner,num_rows from dba_tables where table_name='T1'
  2  ;

OWNER                            NUM_ROWS
------------------------------ ----------
XH                                  10000


SQL> select table_name,density,num_nulls,num_distinct from user_tab_col_statistics where table_name='T1'
  2  ;

TABLE_NAME                        DENSITY  NUM_NULLS NUM_DISTINCT
------------------------------ ---------- ---------- ------------
T1                                  .0001          0        10000

 

SQL> SQL> select count(*) from t1 where a>2;

Execution Plan
----------------------------------------------------------

Plan hash value: 3724264953

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   |  9999 | 29997 |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A">2)
  
   这个选择率  计算公式为:
   分析a>2 是一个无边 开区间 (所谓无边界 就是一侧没有边界)
   a>2选择率=(high_value – limit) / (high_value – low_value )
  
   SQL> select (10000-2)/(10000-1) from dual;

(10000-2)/(10000-1)
-------------------
          .99989999
card=num_rows*选择率
SQL> select  10000*.99989999 from dual;

10000*.99989999
---------------
      9998.9999~~~~~~~~~~~~~~~可以看到10gr2是向上舍入(ceil)
     
      最后的card(rows)= num_rows*选择率=9999 rows
     
     
      现在使用函数
     
     
  
SQL> select count(*) from t1 where to_char(a,'xx')>'2';

Execution Plan
----------------------------------------------------------

Plan hash value: 3724264953

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     7  (15)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   |   500 |  1500 |     7  (15)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TO_CHAR("A",'xx')>'2')
  
选择率 是5%(跟使用bind(!=条件) or sql中用like 时选择率一样)   CARD=NUM_ROW*5%=500  

 


这样信息是不准确的  11g对此进行了改善

看下11g情况

11g 提供扩展列统计信息收集(基于表达式or组合列),原理就是为表达式子或一组列创建一个隐藏列,在这个隐藏列上收集信息.
这个虚拟列并不实际存储数据,只在运行时根据一个基于其他字段的表达式生成虚拟列的信息.(感觉有点象view的意思,无实际内容,节省空间)

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
PL/SQL Release 11.1.0.6.0 - Production
CORE    11.1.0.6.0      Production
TNS for Linux: Version 11.1.0.6.0 - Production
NLSRTL Version 11.1.0.6.0 - Production


SQL> conn xh/a831115
Connected.
SQL> create table t1 (a int);

Table created.

SQL> declare
  2  begin
  3  for i in 1..10000 loop
  4  insert into t1 values(i);
  5  end loop;
  6  commit;
  7  end;
  8  /

PL/SQL procedure successfully completed.


SQL> set autotrace trace exp
SQL> select count(*) from t1 where a>2;

Execution Plan
----------------------------------------------------------
Plan hash value: 3724264953

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     7   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   |  9999 | 29997 |     7   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A">2)


SQL> select count(*) from t1 where to_char(a)>'2';

Execution Plan
----------------------------------------------------------
Plan hash value: 3724264953

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     7   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   |   500 |  1500 |     7   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TO_CHAR("A")>'2')

 

SQL> select dbms_stats.create_extended_stats(ownname=>'XH',tabname=>'T1',extension=>'(to_char(a))') from dual;

DBMS_STATS.CREATE_EXTENDED_STATS(OWNNAME=>'XH',TABNAME=>'T1',EXTENSION=>'(TO_CHA
--------------------------------------------------------------------------------
SYS_STUQ8Q3O$I37YENJXKXE0CSTFF  ~~~~~~~~~~~~~~~~~~~~~~~~~~返回这个扩展列名(sys_stu开头)

另外drop_extended_stats可以删除扩展统计信息(一个procedure)

SQL> desc user_stat_extensions;
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 TABLE_NAME                                NOT NULL VARCHAR2(30)
 EXTENSION_NAME                            NOT NULL VARCHAR2(30)
 EXTENSION                                          CLOB
 CREATOR                                            VARCHAR2(6)
 DROPPABLE                                          VARCHAR2(3)


SQL> col extension for a40
SQL> select extension_name,extension from user_stat_extensions where table_name='T1';

EXTENSION_NAME                 EXTENSION
------------------------------ ----------------------------------------
SYS_STUQ8Q3O$I37YENJXKXE0CSTFF (TO_CHAR("A"))

 


SQL> select column_name,data_type,hidden_column,data_default,NUM_DISTINCT from user_tab_cols where table_name='T1';

COLUMN_NAME                    DATA_TYPE           HID DATA_DEFAU NUM_DISTINCT
------------------------------ ------------------- --- ---------- ------------
SYS_STUQ8Q3O$I37YENJXKXE0CSTFF VARCHAR2            YES TO_CHAR("A
                                                       ")

A                              NUMBER              NO                    10000

]]
SQL> set autotrace trace exp
SQL> select count(*) from t1 where to_char(a)>'2';

Execution Plan
----------------------------------------------------------
Plan hash value: 3724264953

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     7   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   |   500 |  1500 |     7   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("T1"."SYS_STUQ8Q3O$I37YENJXKXE0CSTFF">'2')


SQL> exec dbms_stats.gather_table_stats('XH','T1');

PL/SQL procedure successfully completed.

SQL> set autotrace off
SQL> select column_name,data_type,hidden_column,data_default,NUM_DISTINCT from user_tab_cols where table_name='T1';

COLUMN_NAME                    DATA_TYPE           HID DATA_DEFAU NUM_DISTINCT
------------------------------ ------------------- --- ---------- ------------
SYS_STUQ8Q3O$I37YENJXKXE0CSTFF VARCHAR2            YES TO_CHAR("A        10000 ~~~~~~~~~~~~~~~~~~~~~要重新收集下 统计信息
                                                       ")

A                              NUMBER              NO                    10000

 

SQL> set autotrace trace exp
SQL> select count(*) from t1 where to_char(a)>'2';

Execution Plan
----------------------------------------------------------
Plan hash value: 3724264953

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     8 |     7   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     8 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   |  8902 | 71216 |     7   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("T1"."SYS_STUQ8Q3O$I37YENJXKXE0CSTFF">'2')


有改进 但信息还是不正确,查了下扩展列统计信息基于hash函数(sys_op_combined_hash),所以仅适用与=,(!=)的都不适用无法利用扩展统计信息,不过测试看到还是影响了card且card接近实际值,(也许没有做足够做详细的测试)

 

看下 = 
SQL> show user
USER is "XH"
SQL> create table t2( a varchar2(10));

Table created.

SQL> declare
  2  begin
  3  for i in 1..1000 loop
  4  insert into t2 values('a');
  5  end loop;
  6  commit;
  7  end;
  8  /

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats('XH','T2');

PL/SQL procedure successfully completed.


SQL> set autotrace trace exp
SQL> select * from t2 where a='a';

Execution Plan
----------------------------------------------------------
Plan hash value: 1513984157

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |  1000 |  2000 |     3   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T2   |  1000 |  2000 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A"='a')

SQL> select * from t2 where upper(a)='A';

Execution Plan
----------------------------------------------------------
Plan hash value: 1513984157

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |    10 |    20 |     3   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T2   |    10 |    20 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(UPPER("A")='A')

 

SQL> select dbms_stats.create_extended_stats(ownname=>'XH',tabname=>'T2',extension=>'(upper(a))') from dual;

DBMS_STATS.CREATE_EXTENDED_STATS(OWNNAME=>'XH',TABNAME=>'T2',EXTENSION=>'(UPPER(
--------------------------------------------------------------------------------
SYS_STUWSDP8IKNLLR9QA4XVKCO0BY

SQL> select column_name,data_type,hidden_column,data_default,NUM_DISTINCT from user_tab_cols where table_name='T2';

COLUMN_NAME                    DATA_TYPE           HID DATA_DEFAU NUM_DISTINCT
------------------------------ ------------------- --- ---------- ------------
SYS_STUWSDP8IKNLLR9QA4XVKCO0BY VARCHAR2            YES UPPER("A")
A                              VARCHAR2            NO                        1

 

SQL> exec dbms_stats.gather_table_stats('XH','T2');

PL/SQL procedure successfully completed.

SQL> select column_name,data_type,hidden_column,data_default,NUM_DISTINCT from user_tab_cols where table_name='T2';

COLUMN_NAME                    DATA_TYPE           HID DATA_DEFAU NUM_DISTINCT
------------------------------ ------------------- --- ---------- ------------
SYS_STUWSDP8IKNLLR9QA4XVKCO0BY VARCHAR2            YES UPPER("A")            1
A                              VARCHAR2            NO                        1

SQL>

 

SQL> set autotrace trace exp
SQL> select * from t2 where upper(a)='A';

Execution Plan
----------------------------------------------------------
Plan hash value: 1513984157

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |  1000 |  4000 |     3   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T2   |  1000 |  4000 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("T2"."SYS_STUWSDP8IKNLLR9QA4XVKCO0BY"='A')


解决问题了
如果频繁使用表达式,也可以考虑在定义table时 定义虚拟列 (频繁使用的前提)

SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Prod
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for Linux: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production

SQL> show user
USER is "XH"
SQL> create table t2 (a int ,a_to_char as (to_char(a)));
create table t2 (a int ,a_to_char as (to_char(a)))
                                  *
ERROR at line 1:
ORA-00902: invalid datatype


11g才支持

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
PL/SQL Release 11.1.0.6.0 - Production
CORE    11.1.0.6.0      Production
TNS for Linux: Version 11.1.0.6.0 - Production
NLSRTL Version 11.1.0.6.0 - Production

SQL> show user
USER is "XH"


SQL> create table t3 (a int ,a_to_char as (to_char(a)));

Table created.

建立 表时候定义虚拟列

 

SQL> col data_type for a10
SQL> col data_default for a10
SQL> select column_name,data_type,hidden_column,data_default,NUM_DISTINCT from user_tab_cols where table_name='T3';

COLUMN_NAME                    DATA_TYPE  HID DATA_DEFAU NUM_DISTINCT
------------------------------ ---------- --- ---------- ------------
A_TO_CHAR                      VARCHAR2   NO  TO_CHAR("A
                                              ")

A                              NUMBER     NO

\
SQL> execute dbms_stats.gather_table_stats('XH','T3');

PL/SQL procedure successfully completed.

SQL> select column_name,data_type,hidden_column,data_default,NUM_DISTINCT from user_tab_cols where table_name='T3';

COLUMN_NAME                    DATA_TYPE  HID DATA_DEFAU NUM_DISTINCT
------------------------------ ---------- --- ---------- ------------
A_TO_CHAR                      VARCHAR2   NO  TO_CHAR("A            1
                                              ")

A                              NUMBER     NO                        1

 

SQL> set autotrace trace exp
SQL> select a from t3 where a_to_char='1';

Execution Plan
----------------------------------------------------------
Plan hash value: 4161002650

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |     5 |     3   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T3   |     1 |     5 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A_TO_CHAR"='1')

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/12020513/viewspace-624110/,如需转载,请注明出处,否则将追究法律责任。

转载于:http://blog.itpub.net/12020513/viewspace-624110/

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值