简单的选择率与函数问题,下面只用col>字面值的 选择率,计算 比较方便
SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Prod
PL/SQL Release 10.2.0.4.0 - Production
CORE 10.2.0.4.0 Production
TNS for Linux: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production
SQL> conn xh/a831115
Connected.
SQL> show user
USER is "XH"
SQL> create table t1 (a int);
Table created.
SQL> declare
2 begin
3 for i in 1..10000 loop
4 insert into t1 values(i);
5 end loop;
6 commit;
7 end;
8 /
PL/SQL procedure successfully completed.
SQL> exec dbms_stats.gather_table_stats('XH','T1');
PL/SQL procedure successfully completed.
SQL> select owner,num_rows from dba_tables where table_name='T1'
2 ;
OWNER NUM_ROWS
------------------------------ ----------
XH 10000
SQL> select table_name,density,num_nulls,num_distinct from user_tab_col_statistics where table_name='T1'
2 ;
TABLE_NAME DENSITY NUM_NULLS NUM_DISTINCT
------------------------------ ---------- ---------- ------------
T1 .0001 0 10000
SQL> SQL> select count(*) from t1 where a>2;
Execution Plan
----------------------------------------------------------
Plan hash value: 3724264953
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T1 | 9999 | 29997 | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A">2)
这个选择率 计算公式为:
分析a>2 是一个无边 开区间 (所谓无边界 就是一侧没有边界)
a>2选择率=(high_value – limit) / (high_value – low_value )
SQL> select (10000-2)/(10000-1) from dual;
(10000-2)/(10000-1)
-------------------
.99989999
card=num_rows*选择率
SQL> select 10000*.99989999 from dual;
10000*.99989999
---------------
9998.9999~~~~~~~~~~~~~~~可以看到10gr2是向上舍入(ceil)
最后的card(rows)= num_rows*选择率=9999 rows
现在使用函数
SQL> select count(*) from t1 where to_char(a,'xx')>'2';
Execution Plan
----------------------------------------------------------
Plan hash value: 3724264953
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 7 (15)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T1 | 500 | 1500 | 7 (15)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(TO_CHAR("A",'xx')>'2')
选择率 是5%(跟使用bind(!=条件) or sql中用like 时选择率一样) CARD=NUM_ROW*5%=500
这样信息是不准确的 11g对此进行了改善
看下11g情况
11g 提供扩展列统计信息收集(基于表达式or组合列),原理就是为表达式子或一组列创建一个隐藏列,在这个隐藏列上收集信息.
这个虚拟列并不实际存储数据,只在运行时根据一个基于其他字段的表达式生成虚拟列的信息.(感觉有点象view的意思,无实际内容,节省空间)
SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
PL/SQL Release 11.1.0.6.0 - Production
CORE 11.1.0.6.0 Production
TNS for Linux: Version 11.1.0.6.0 - Production
NLSRTL Version 11.1.0.6.0 - Production
SQL> conn xh/a831115
Connected.
SQL> create table t1 (a int);
Table created.
SQL> declare
2 begin
3 for i in 1..10000 loop
4 insert into t1 values(i);
5 end loop;
6 commit;
7 end;
8 /
PL/SQL procedure successfully completed.
SQL> set autotrace trace exp
SQL> select count(*) from t1 where a>2;
Execution Plan
----------------------------------------------------------
Plan hash value: 3724264953
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 7 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T1 | 9999 | 29997 | 7 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A">2)
SQL> select count(*) from t1 where to_char(a)>'2';
Execution Plan
----------------------------------------------------------
Plan hash value: 3724264953
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 7 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T1 | 500 | 1500 | 7 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(TO_CHAR("A")>'2')
SQL> select dbms_stats.create_extended_stats(ownname=>'XH',tabname=>'T1',extension=>'(to_char(a))') from dual;
DBMS_STATS.CREATE_EXTENDED_STATS(OWNNAME=>'XH',TABNAME=>'T1',EXTENSION=>'(TO_CHA
--------------------------------------------------------------------------------
SYS_STUQ8Q3O$I37YENJXKXE0CSTFF ~~~~~~~~~~~~~~~~~~~~~~~~~~返回这个扩展列名(sys_stu开头)
另外drop_extended_stats可以删除扩展统计信息(一个procedure)
SQL> desc user_stat_extensions;
Name Null? Type
----------------------------------------- -------- ----------------------------
TABLE_NAME NOT NULL VARCHAR2(30)
EXTENSION_NAME NOT NULL VARCHAR2(30)
EXTENSION CLOB
CREATOR VARCHAR2(6)
DROPPABLE VARCHAR2(3)
SQL> col extension for a40
SQL> select extension_name,extension from user_stat_extensions where table_name='T1';
EXTENSION_NAME EXTENSION
------------------------------ ----------------------------------------
SYS_STUQ8Q3O$I37YENJXKXE0CSTFF (TO_CHAR("A"))
SQL> select column_name,data_type,hidden_column,data_default,NUM_DISTINCT from user_tab_cols where table_name='T1';
COLUMN_NAME DATA_TYPE HID DATA_DEFAU NUM_DISTINCT
------------------------------ ------------------- --- ---------- ------------
SYS_STUQ8Q3O$I37YENJXKXE0CSTFF VARCHAR2 YES TO_CHAR("A
")
A NUMBER NO 10000
]]
SQL> set autotrace trace exp
SQL> select count(*) from t1 where to_char(a)>'2';
Execution Plan
----------------------------------------------------------
Plan hash value: 3724264953
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 7 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T1 | 500 | 1500 | 7 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("T1"."SYS_STUQ8Q3O$I37YENJXKXE0CSTFF">'2')
SQL> exec dbms_stats.gather_table_stats('XH','T1');
PL/SQL procedure successfully completed.
SQL> set autotrace off
SQL> select column_name,data_type,hidden_column,data_default,NUM_DISTINCT from user_tab_cols where table_name='T1';
COLUMN_NAME DATA_TYPE HID DATA_DEFAU NUM_DISTINCT
------------------------------ ------------------- --- ---------- ------------
SYS_STUQ8Q3O$I37YENJXKXE0CSTFF VARCHAR2 YES TO_CHAR("A 10000 ~~~~~~~~~~~~~~~~~~~~~要重新收集下 统计信息
")
A NUMBER NO 10000
SQL> set autotrace trace exp
SQL> select count(*) from t1 where to_char(a)>'2';
Execution Plan
----------------------------------------------------------
Plan hash value: 3724264953
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 8 | 7 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 8 | | |
|* 2 | TABLE ACCESS FULL| T1 | 8902 | 71216 | 7 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("T1"."SYS_STUQ8Q3O$I37YENJXKXE0CSTFF">'2')
有改进 但信息还是不正确,查了下扩展列统计信息基于hash函数(sys_op_combined_hash),所以仅适用与=,(!=)的都不适用无法利用扩展统计信息,不过测试看到还是影响了card且card接近实际值,(也许没有做足够做详细的测试)
看下 =
SQL> show user
USER is "XH"
SQL> create table t2( a varchar2(10));
Table created.
SQL> declare
2 begin
3 for i in 1..1000 loop
4 insert into t2 values('a');
5 end loop;
6 commit;
7 end;
8 /
PL/SQL procedure successfully completed.
SQL> exec dbms_stats.gather_table_stats('XH','T2');
PL/SQL procedure successfully completed.
SQL> set autotrace trace exp
SQL> select * from t2 where a='a';
Execution Plan
----------------------------------------------------------
Plan hash value: 1513984157
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000 | 2000 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T2 | 1000 | 2000 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("A"='a')
SQL> select * from t2 where upper(a)='A';
Execution Plan
----------------------------------------------------------
Plan hash value: 1513984157
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10 | 20 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T2 | 10 | 20 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(UPPER("A")='A')
SQL> select dbms_stats.create_extended_stats(ownname=>'XH',tabname=>'T2',extension=>'(upper(a))') from dual;
DBMS_STATS.CREATE_EXTENDED_STATS(OWNNAME=>'XH',TABNAME=>'T2',EXTENSION=>'(UPPER(
--------------------------------------------------------------------------------
SYS_STUWSDP8IKNLLR9QA4XVKCO0BY
SQL> select column_name,data_type,hidden_column,data_default,NUM_DISTINCT from user_tab_cols where table_name='T2';
COLUMN_NAME DATA_TYPE HID DATA_DEFAU NUM_DISTINCT
------------------------------ ------------------- --- ---------- ------------
SYS_STUWSDP8IKNLLR9QA4XVKCO0BY VARCHAR2 YES UPPER("A")
A VARCHAR2 NO 1
SQL> exec dbms_stats.gather_table_stats('XH','T2');
PL/SQL procedure successfully completed.
SQL> select column_name,data_type,hidden_column,data_default,NUM_DISTINCT from user_tab_cols where table_name='T2';
COLUMN_NAME DATA_TYPE HID DATA_DEFAU NUM_DISTINCT
------------------------------ ------------------- --- ---------- ------------
SYS_STUWSDP8IKNLLR9QA4XVKCO0BY VARCHAR2 YES UPPER("A") 1
A VARCHAR2 NO 1
SQL>
SQL> set autotrace trace exp
SQL> select * from t2 where upper(a)='A';
Execution Plan
----------------------------------------------------------
Plan hash value: 1513984157
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000 | 4000 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T2 | 1000 | 4000 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("T2"."SYS_STUWSDP8IKNLLR9QA4XVKCO0BY"='A')
解决问题了
如果频繁使用表达式,也可以考虑在定义table时 定义虚拟列 (频繁使用的前提)
SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Prod
PL/SQL Release 10.2.0.4.0 - Production
CORE 10.2.0.4.0 Production
TNS for Linux: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production
SQL> show user
USER is "XH"
SQL> create table t2 (a int ,a_to_char as (to_char(a)));
create table t2 (a int ,a_to_char as (to_char(a)))
*
ERROR at line 1:
ORA-00902: invalid datatype
11g才支持
SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
PL/SQL Release 11.1.0.6.0 - Production
CORE 11.1.0.6.0 Production
TNS for Linux: Version 11.1.0.6.0 - Production
NLSRTL Version 11.1.0.6.0 - Production
SQL> show user
USER is "XH"
SQL> create table t3 (a int ,a_to_char as (to_char(a)));
Table created.
建立 表时候定义虚拟列
SQL> col data_type for a10
SQL> col data_default for a10
SQL> select column_name,data_type,hidden_column,data_default,NUM_DISTINCT from user_tab_cols where table_name='T3';
COLUMN_NAME DATA_TYPE HID DATA_DEFAU NUM_DISTINCT
------------------------------ ---------- --- ---------- ------------
A_TO_CHAR VARCHAR2 NO TO_CHAR("A
")
A NUMBER NO
\
SQL> execute dbms_stats.gather_table_stats('XH','T3');
PL/SQL procedure successfully completed.
SQL> select column_name,data_type,hidden_column,data_default,NUM_DISTINCT from user_tab_cols where table_name='T3';
COLUMN_NAME DATA_TYPE HID DATA_DEFAU NUM_DISTINCT
------------------------------ ---------- --- ---------- ------------
A_TO_CHAR VARCHAR2 NO TO_CHAR("A 1
")
A NUMBER NO 1
SQL> set autotrace trace exp
SQL> select a from t3 where a_to_char='1';
Execution Plan
----------------------------------------------------------
Plan hash value: 4161002650
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 5 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T3 | 1 | 5 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("A_TO_CHAR"='1')
来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/12020513/viewspace-624110/,如需转载,请注明出处,否则将追究法律责任。
转载于:http://blog.itpub.net/12020513/viewspace-624110/