表分析的作用
简单的说,就是收集表和索引的信息,如果没有直方图信息,CBO根据这些信息决定SQL执行计划。
若都没有收集,CBO采取动态采样方式选择执行计划。
测试
--建立测试数据(基于11g)
CREATE TABLE test_fan AS SELECT * FROM dba_tables WHERE ROWNUM<=10000;
CREATE INDEX idx_owner ON test_fan(owner) NOLOGGING PARALLEL 4;
ALTER INDEX idx_owner NOPARALLEL;
--查看列owner分布情况
SELECT owner,COUNT(1) FROM test_fan GROUP BY owner;
OWNER COUNT(1)
PUBLIC 2851
OUTLN 10
SYSTEM 549
ORACLE_OCM 3
SYS 6587
查看是否分析表
SELECT d.LAST_ANALYZED,d.TABLE_NAME,d.NUM_ROWS,d.BLOCKS,d.* FROM dba_tables d WHERE d.TABLE_NAME='TEST_FAN';
查看是否有直方图信息:
SELECT * FROM Dba_Tab_Col_Statistics t WHERE t.TABLE_NAME='TEST_FAN';
都没有的情况下查看如下SQL执行计划
SELECT * FROM test_fan t WHERE t.owner='SYS';--sql1
执行计划:
Plan hash value: 3747397153
------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 6374 | 1288K| 40 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| TEST_FAN | 6374 | 1288K| 40 (0)| 00:00:01 |
------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("T"."OWNER"='SYS')
Note
-----
SELECT * FROM test_fan t WHERE t.owner='OUTLN';--SQL2
Plan hash value: 1852869688
-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10 | 2070 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| TEST_FAN | 10 | 2070 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IDX_OWNER | 10 | | 1 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("T"."OWNER"='OUTLN')
Note
-----
- dynamic sampling used for this statement (level=2)
从dynamic sampling used for this statement (level=2)可以看出,在没有任何信息情况下,采用动态采样方式获取执行计划而且非常准确。关于动态采样优缺点这里不讨论,有兴趣可以自己查资料学习。
接着分析表,对执行计划影响
对表进行分析:
方式1:
BEGIN
dbms_stats.gather_table_stats(‘PROD_DEV’, ‘TEST_FAN’);
END;
方式2:
ANALYZE TABLE TEST_FAN COMPUTE STATISTICS;
看看表分析对执行计划的影响:
--查看列owner分布情况
SELECT owner,COUNT(1) FROM test_fan GROUP BY owner;
OWNER COUNT(1)
PUBLIC 2851
OUTLN 10
SYSTEM 549
ORACLE_OCM 3
SYS 6587
Plan hash value: 3747397153
------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2000 | 177K| 40 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| TEST_FAN | 2000 | 177K| 40 (0)| 00:00:01 |
------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("T"."OWNER"='OUTLN')
CBO预估有2000条记录,20%的数据量走了全表扫描。看看表分析是如何计算的。
SELECT t.table_name
,t.column_name
,t.num_distinct
,t.density
,t.last_analyzed
,t.sample_size
,t.HISTOGRAM
FROM Dba_Tab_Col_Statistics t
,Dba_Tab_Columns c
WHERE t.table_name = 'TEST_FAN'
AND t.column_name = c.COLUMN_NAME
AND t.table_name = c.TABLE_NAME
AND t.column_name='OWNER';
TABLE_NAME COLUMN_NAME NUM_DISTINCT DENSITY LAST_ANALYZED SAMPLE_SIZE HISTOGRAM
TEST_FAN OWNER 5 0.2 2019/10/15 14:04 10000 NONE
CB0 rows计算公式:
rows=NUM_DISTINCT* SAMPLE_SIZE
例子中预估的rows=0.2*10000=2000
表分析实际统计了每个列唯一值的个数,这样它认为数据平均分配不存在倾斜,每个值得选择率NUM_DISTINCT即为1/COUNT(DISTINCT VALUE)
删除表分析
方式1:
BEGIN
dbms_stats.delete_table_stats(‘PROD_DEV’, ‘TEST_FAN’);
END;
方式2:
ANALYZE TABLE TEST_FAN DELETE STATISTICS;
附上没有任何统计信息切不采用动态采样的执行计划:
EXPLAIN PLAN FOR SELECT/*+dynamic_sampling(0)*/ * FROM test_fan t WHERE t.owner='SYS';
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
Plan hash value: 1852869688
-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 116 | 24012 | 5 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| TEST_FAN | 116 | 24012 | 5 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IDX_OWNER | 46 | | 1 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("T"."OWNER"='SYS')
有个疑问,CBO这种情况是如何工作的?