本文章为网络笔记,看了warehouse老师的视频受益匪浅,更是感觉自己技术太过初级,特写了本笔记,方便以后反复学习!
如有任何不妥,请发邮件至102448567@qq.com删除文章!
关于warehouse:
http://blog.itpub.net/19602/viewspace-1059211/
11gR2视频第四版 8_05_优化器统计信息_高水位_柱状图等
optimizer statistics
:
优化器作用:负责sql
执行的时候解析sql
并生成执行计划
sql
访问的对象如果没有优化器统计信息的话而且又把动态取样屏蔽了这个时候就很容易产生错误的执行计划,从而这个sql
的性能就很差
optimizer statistics分为三种
1.table
select * from dba_tabes where table_name = ‘TT2’;
其中的blocks:表示存放这个表里的数据用了多少个块
num_rows:表里一共有多少行数据
avg_row_len:平均每一行的长度是多少个字节
这三个信息能宏观的说明这个表大不大
2.index
select * from dba_indexes where table_name = ‘TT2’
blevel:树的深度=树的高度减1(高度就是只有branch和leaf就是2层高)
leaf_blocks:叶子块的数量
distinct_keys:这个index有多少个不同的值
SQL> select count(distinct object_id) from tt2;
COUNT(DISTINCTOBJECT_ID)
------------------------
14726
avg_leaf_blocks_per_key:平均每个介质在几个叶子上,意思就是所有的object_id=2在1个leaf上存放着,3在1个叶子上存放着…
下面可以看到在索引中所有相同的数据都在同一个块中
row#0[4600] flag: ------, lock: 0, len=12
col 0; len 2; (2): c1 03
col 1; len 6; (6): 01 00 00 8b 00 30
row#1[4612] flag: ------, lock: 0, len=12
col 0; len 2; (2): c1 03
col 1; len 6; (6): 01 00 01 e9 00 30
row#2[4624] flag: ------, lock: 0, len=12
col 0; len 2; (2): c1 03
col 1; len 6; (6): 01 00 02 ed 00 01
row#3[4636] flag: ------, lock: 0, len=12
col 0; len 2; (2): c1 03
col 1; len 6; (6): 01 00 03 d3 00 1c
row#4[4648] flag: ------, lock: 0, len=12
col 0; len 2; (2): c1 03
col 1; len 6; (6): 01 00 04 4b 00 30
row#5[4660] flag: ------, lock: 0, len=12
col 0; len 2; (2): c1 03
col 1; len 6; (6): 01 00 05 78 00 18
row#6[4672] flag: ------, lock: 0, len=12
col 0; len 2; (2): c1 03
col 1; len 6; (6): 01 00 06 45 00 3a
row#7[4684] flag: ------, lock: 0, len=12
col 0; len 2; (2): c1 03
col 1; len 6; (6): 01 00 07 18 00 30
row#8[4696] flag: ------, lock: 0, len=12
col 0; len 2; (2): c1 03
col 1; len 6; (6): 01 00 07 57 00 0b
而表中的相同的数据是分散在不同的块中的
SQL> select object_id,rowid,dbms_rowid.rowid_block_number(rowid) from tt2 where object_id = 2;
OBJECT_ID ROWID DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID)
---------- ------------------ ------------------------------------
2 AAAD9tAAEAAAACDAAw 131
2 AAAD9tAAEAAAAFwAAw 368
2 AAAD9tAAEAAAAJnAAX 615
2 AAAD9tAAEAAAAN9ABE 893
2 AAAD9tAAEAAAAPJAAs 969
2 AAAD9tAAEAAAATnAAK 1255
2 AAAD9tAAEAAAAX+AA+ 1534
2 AAAD9tAAEAAAAaeAAE 1694
2 AAAD9tAAEAAAAbYAAc 1752
2 AAAD9tAAEAAAAewAA2 1968
2 AAAD9tAAEAAAAiMAAb 2188
2 AAAD9tAAEAAAAmpABD 2473
2 AAAD9tAAEAAAApQAAr 2640
2 AAAD9tAAEAAAAsMAAA 2828
2 AAAD9tAAEAAAAwtAAv 3117
2 AAAD9tAAEAAAAxGABG 3142
2 AAAD9tAAEAAAA0wABD 3376
2 AAAD9tAAEAAAA1MAAz 3404
2 AAAD9tAAEAAAA5JAAm 3657
2 AAAD9tAAEAAAA9kAAJ 3940
2 AAAD9tAAEAAABA/AAv 4159
2 AAAD9tAAEAAABDuAAY 4334
2 AAAD9tAAEAAABHgABE 4576
2 AAAD9tAAEAAABKEAAf 4740
2 AAAD9tAAEAAABM9AAq 4925
2 AAAD9tAAEAAABPnABG 5095
2 AAAD9tAAEAAABQLAAV 5131
2 AAAD9tAAEAAABUwABJ 5424
2 AAAD9tAAEAAABYuAAc 5678
2 AAAD9tAAEAAABdLAAN 5963
2 AAAD9tAAEAAABewAAw 6064
2 AAAD9tAAEAAABi+AAO 6334
32 rows selected.
avg_data_blocks_per_key:平均每个介质在几个数据块上,意思是表中的数据是分散的,杂乱的不像是索引中的数据是排好序的,反应数据存放的密集程度
验证一下:
SQL> create table tt3 tablespace users as select * from tt2 order by object_id;
Table created.
SQL> create index idx_tt3 on tt3(object_id);
Index created.
SQL> exec dbms_stats.gather_table_stats('sys','tt3');
PL/SQL procedure successfully completed.
SQL> select object_id,rowid,dbms_rowid.rowid_block_number(rowid) from tt3 where object_id = 2;
OBJECT_ID ROWID DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID)
---------- ------------------ ------------------------------------
2 AAAD9vAAEAAABj7AAA 6395
2 AAAD9vAAEAAABj7AAB 6395
2 AAAD9vAAEAAABj7AAC 6395
2 AAAD9vAAEAAABj7AAD 6395
2 AAAD9vAAEAAABj7AAE 6395
2 AAAD9vAAEAAABj7AAF 6395
2 AAAD9vAAEAAABj7AAG 6395
2 AAAD9vAAEAAABj7AAH 6395
2 AAAD9vAAEAAABj7AAI 6395
2 AAAD9vAAEAAABj7AAJ 6395
2 AAAD9vAAEAAABj7AAK 6395
2 AAAD9vAAEAAABj7AAL 6395
2 AAAD9vAAEAAABj7AAM 6395
2 AAAD9vAAEAAABj7AAN 6395
2 AAAD9vAAEAAABj7AAO 6395
2 AAAD9vAAEAAABj7AAP 6395
2 AAAD9vAAEAAABj7AAQ 6395
2 AAAD9vAAEAAABj7AAR 6395
2 AAAD9vAAEAAABj7AAS 6395
2 AAAD9vAAEAAABj7AAT 6395
2 AAAD9vAAEAAABj7AAU 6395
2 AAAD9vAAEAAABj7AAV 6395
2 AAAD9vAAEAAABj7AAW 6395
2 AAAD9vAAEAAABj7AAX 6395
2 AAAD9vAAEAAABj7AAY 6395
2 AAAD9vAAEAAABj7AAZ 6395
2 AAAD9vAAEAAABj7AAa 6395
2 AAAD9vAAEAAABj7AAb 6395
2 AAAD9vAAEAAABj7AAc 6395
2 AAAD9vAAEAAABj7AAd 6395
2 AAAD9vAAEAAABj7AAe 6395
2 AAAD9vAAEAAABj7AAf 6395
32 rows selected.
因为tt3
是排好序的,所以object_id
相同的数据就放在了一起,也就是avg_data_blocks_per_key
会很小
对比tt2
和tt3
的逻辑读,因为tt3
数据存放的密集,所以他访问的块就少,逻辑读就少,效率就高
SQL> select count(object_name) from tt2 where object_id = 2;
Execution Plan
----------------------------------------------------------
Plan hash value: 1424956034
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 25 | 4 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 25 | | |
|* 2 | TABLE ACCESS FULL| TT2 | 1 | 25 | 4 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("OBJECT_ID"=2)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
39 consistent gets
0 physical reads
0 redo size
536 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL> select count(object_name) from tt3 where object_id = 2;
Execution Plan
----------------------------------------------------------
Plan hash value: 409595855
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 25 | 4 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 25 | | |
| 2 | TABLE ACCESS BY INDEX ROWID| TT3 | 32 | 800 | 4 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | IDX_TT3 | 32 | | 3 (0)| 00:00:01 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("OBJECT_ID"=2)
Statistics
----------------------------------------------------------
2 recursive calls
0 db block gets
7 consistent gets
0 physical reads
0 redo size
536 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
clustering_factor:聚集因子/集群因子(和avg_data_blocks_per_key类似反应的也是表中数据的密集程度)
聚集因子算法:上图中五个蓝色小方块相当于一个数据块,C、B、A等相当表中数据,索引中的rowid指向数据块中的数据,如果我们找C假设当前的聚集因子是5,当找到第一个C的时候,聚集因子加1,如果找到下一个C的时候也是在相同的块中那聚集因子不增加了,如果是在别的块中就再加1
降低聚集因子的方法就是重新建表,在建表的时候按照索引字段排好序
所以tt3的聚集因子比tt2低很多
最理想的状态是dba_indexes中的clustering_factor跟dba_tables中的blocks非常接近,也就是说所有相同的值都是放在了同一个块中,想要找哪个值直接在一个块中就找出来了。
从10g开始create index的时候自动收集了统计信息
手动收集dbms_stats.gather_index_stats('owner','index_name')
手动删除dbms_stats.delete_index_stats('owner','index_name')
3.column
在收集表的统计信息的时候自动收集列的统计信息,也可以排除列的统计信息但是这种操作没有意义
select * from dba_tab_col_statistics where table_name = ‘对应的表’
分为两种:
a.basic(基本的)
因为object_id上有索引,就拿这个字段为例
其中NUM_DISTINCT
:多少个不同值
LOW_VALUE
:最小值
HIGH_VALUE
:最大值
NUM_NULLS
:多少个空值
验证一下:
SQL> select min(object_id),max(object_id),count(distinct object_id) from tt2;
MIN(OBJECT_ID) MAX(OBJECT_ID) COUNT(DISTINCTOBJECT_ID)
-------------- -------------- ------------------------
2 15000 14726
SQL> select utl_raw.cast_to_number('c103') from dual;
UTL_RAW.CAST_TO_NUMBER('C103')
------------------------------
2
SQL> select utl_raw.cast_to_number('C30233') from dual;
UTL_RAW.CAST_TO_NUMBER('C30233')
--------------------------------
15000
DENSITY
(密度):不同值(num_distinct
)的倒数
dba_tab_col_statistics中的值是科学计数法,下面的结果是对的
SQL> select 1/14726 from dual;
1/14726
----------
.000067907
为什么要收集列的统计信息,是因为Oracle
要判断where
子句中的字段是走索引效率高还是不走效率高
b.histogram(柱状图/直方图):用来表述列的数据的分布程度
有没有直方图就是看histogram这个列,要是none就是没有直方图
先把表tt2备份一下
SQL> create table tt4 tablespace users as select * from tt2;
Table created.
SQL> create index idx_tt4 on tt4(object_id);
Index created.
修改一下表,为什么是大于等于255
是因为buket
的数量最大是254
SQL> update tt2 set object_id = 99999 where object_id >= 255;
463136 rows updated.
SQL> commit;
Commit complete.
重新收集统计信息
SQL> exec dbms_stats.gather_table_stats('sys','tt2');
PL/SQL procedure successfully completed.
最后这个字段刚刚是none
,现在是FREQUENCY
(频率)就是直方图的一种,每一个不同的值占用一个bucket
,现在是一共有254
个不同的值,生成频率直方图的要求就是不同值的数量要小于等于最大bucket
的数量也就是<=254
SQL> select min(object_id),max(object_id),count(distinct object_id) from tt2;
MIN(OBJECT_ID) MAX(OBJECT_ID) COUNT(DISTINCTOBJECT_ID)
-------------- -------------- ------------------------
2 99999 254
想要理解好直方图bucket
(桶)概念很重要
直方图的原理:
如果一个列上有直方图查这个视图dba_tab_histograms
分析Oracle
是怎么知道99999
比2
占的比例多的很多,所以他不走索引
注意上面的数据是取样的,我们有tt2
表中有254
个不同而objec_id
,这里只有74
个,想办法所有的都展示出来(estimate_percent=>null
就是不要取样收集,100%
的收集)
SQL> exec dbms_stats.gather_table_stats('sys','tt2',method_opt=>'for columns object_id size 254',estimate_percent=>null);
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.55
都显示出来了
把ENDPOINT_NUMBER
(用来计算值的个数)和ENDPOINT_VALUE
(表中不同的值)拿到excel
中,看下图也就能理解为什么叫直方图了,Oracle
就是根据只直方图来判断列上数据的均匀程度
下面的直方图中的一个柱子就理解成一个bucket
看一下列的分布程度
SQL> select object_id,count(1) from tt2 group by object_id order by object_id;
OBJECT_ID COUNT(1)
---------- ----------
2 32
3 32
4 32
5 32
--...中间略,中间的count(1)都是32
253 32
254 32
99999 463136
254 rows selected.
查询object_id=2
的数据是走索引的
SQL> set timing on --显示执行的时间
SQL> select * from tt2 where object_id = 2;
32 rows selected.
Elapsed: 00:00:00.01
Execution Plan
----------------------------------------------------------
Plan hash value: 3665371926
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 85 | 7650 | 6 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| TT2 | 85 | 7650 | 6 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IDX_TT2 | 85 | | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("OBJECT_ID"=2)
Statistics
----------------------------------------------------------
14 recursive calls
0 db block gets
77 consistent gets --逻辑读是77个块,效率很快
0 physical reads
0 redo size
4575 bytes sent via SQL*Net to client
546 bytes received via SQL*Net from client
4 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
32 rows processed
查询object_id=99999
,这次不走索引了,全表扫描了
SQL> select * from tt2 where object_id = 99999;
463136 rows selected.
Elapsed: 00:00:03.23
Execution Plan
----------------------------------------------------------
Plan hash value: 1248358058
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 463K| 39M| 1675 (1)| 00:00:21 |
|* 1 | TABLE ACCESS FULL| TT2 | 463K| 39M| 1675 (1)| 00:00:21 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("OBJECT_ID"=99999)
Statistics
----------------------------------------------------------
1 recursive calls --有递归调用
0 db block gets
36694 consistent gets --逻辑读很大,效率差
0 physical reads
0 redo size
21500501 bytes sent via SQL*Net to client
340149 bytes received via SQL*Net from client
30877 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
463136 rows processed
加上注释强制走索引效率反而更差了,这是为啥?
SQL> select /*+ index(tt2 idx_tt2) */ * from tt2 where object_id = 99999;
463136 rows selected.
Elapsed: 00:00:03.68
Execution Plan
----------------------------------------------------------
Plan hash value: 3665371926
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 463K| 39M| 14946 (1)| 00:03:00 |
| 1 | TABLE ACCESS BY INDEX ROWID| TT2 | 463K| 39M| 14946 (1)| 00:03:00 |
|* 2 | INDEX RANGE SCAN | IDX_TT2 | 463K| | 935 (1)| 00:00:12 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("OBJECT_ID"=99999)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
68408 consistent gets --逻辑读很大,跟不走索引差了将近一倍
0 physical reads
0 redo size
49712329 bytes sent via SQL*Net to client
340149 bytes received via SQL*Net from client
30877 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
463136 rows processed
说明索引效率不一定高,但是Oracle
凭什么知道object_id=2
的时候走索引效率高,object_id=99999
不走索引效率高靠的就是直方图
删除直方图11g
之前有一种办法就是在收集优化器统计信息的时候加上method_opt
选项
这个size
指定的就是bucket
的数量,11g
之前就是这种方法只收集列的basic
优化器统计信息而不收集直方图
SQL> exec dbms_stats.gather_table_stats('sys','tt2',method_opt=>'for all columns size 1');
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.31
没有直方图了
这时候object_id=2
或者=99999
的时候都走索引了,但是这不是我们希望看到的,出现这种错误就是因为没有了直方图
SQL> select * from tt2 where object_id = 2;
32 rows selected.
Elapsed: 00:00:00.00
Execution Plan
----------------------------------------------------------
Plan hash value: 3665371926
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1855 | 163K| 63 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| TT2 | 1855 | 163K| 63 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IDX_TT2 | 1855 | | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("OBJECT_ID"=2)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
42 consistent gets
0 physical reads
0 redo size
4575 bytes sent via SQL*Net to client
546 bytes received via SQL*Net from client
4 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
32 rows processed
SQL> select * from tt2 where object_id = 99999;
463136 rows selected.
Elapsed: 00:00:03.24
Execution Plan
----------------------------------------------------------
Plan hash value: 3665371926
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1855 | 163K| 63 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| TT2 | 1855 | 163K| 63 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IDX_TT2 | 1855 | | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("OBJECT_ID"=99999)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
36694 consistent gets
0 physical reads
0 redo size
21500501 bytes sent via SQL*Net to client
340149 bytes received via SQL*Net from client
30877 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
463136 rows processed
重新收集统计信息
SQL> exec dbms_stats.gather_table_stats('sys','tt2');
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.30
有直方图了
不走索引了
SQL> select * from tt2 where object_id = 99999;
463136 rows selected.
Elapsed: 00:00:03.19
Execution Plan
----------------------------------------------------------
Plan hash value: 1248358058
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 462K| 39M| 1675 (1)| 00:00:21 |
|* 1 | TABLE ACCESS FULL| TT2 | 462K| 39M| 1675 (1)| 00:00:21 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("OBJECT_ID"=99999)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
36694 consistent gets
0 physical reads
0 redo size
21500501 bytes sent via SQL*Net to client
340149 bytes received via SQL*Net from client
30877 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
463136 rows processed
还有一种直方图叫HEIGHT BALANCED
(高度均衡直方图)叫高度均衡的意思就是每一个bucket
上的值都是差不多但是因为bucket
数量可能不够用,那就有可能多个不同的值在一个bucket
上,这种直方图的条件就是不同值的数量要超过254
下面的sql
使得不同值的数量超过了bucket
的最大值254
SQL> update tt4 set object_id=99999 where object_id >400;
458496 rows updated.
Elapsed: 00:00:05.86
SQL> commit;
Commit complete.
Elapsed: 00:00:00.01
SQL> exec dbms_stats.gather_table_stats('sys','tt4');
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.35
有了高度直方图,如果没有高度直方图手机统计信息的时候就加上estimate_percent=>null
参数,不要取样收集,全部收集
有了直方图之后无论是object_id=2
还是=99999
都走了正确的执行计划
SQL> select * from tt4 where object_id = 2;
32 rows selected.
Elapsed: 00:00:00.01
Execution Plan
----------------------------------------------------------
Plan hash value: 2707091987
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 28 | 2520 | 5 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| TT4 | 28 | 2520 | 5 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IDX_TT4 | 28 | | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("OBJECT_ID"=2)
Statistics
----------------------------------------------------------
36 recursive calls
0 db block gets
96 consistent gets
0 physical reads
0 redo size
4575 bytes sent via SQL*Net to client
546 bytes received via SQL*Net from client
4 SQL*Net roundtrips to/from client
4 sorts (memory)
0 sorts (disk)
32 rows processed
SQL> select * from tt4 where object_id = 99999;
458496 rows selected.
Elapsed: 00:00:03.22
Execution Plan
----------------------------------------------------------
Plan hash value: 3980746492
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 459K| 39M| 1708 (1)| 00:00:21 |
|* 1 | TABLE ACCESS FULL| TT4 | 459K| 39M| 1708 (1)| 00:00:21 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("OBJECT_ID"=99999)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
36374 consistent gets
7 physical reads
0 redo size
21283052 bytes sent via SQL*Net to client
336750 bytes received via SQL*Net from client
30568 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
458496 rows processed
生成高度直方图的时候没有指定bucket
的数量,这里Oracle
一共用了8
个桶,其中0
号桶是一个特殊的桶
注意:频率直方图和高度直方图ENDPOINT_NUMBER
和ENDPOINT_VALUE
代表的意义不一样
ENDPOINT_NUMBER:
桶的标号,0
号桶上是这一列的最小值我们这里也就是2
ENDPOINT_VALUE
:1号桶上放的是2-77
之间的值(2,3,4
等等到77
),2
号桶上就是77-125
区间的值后面的以此类推1-6
号桶保存的区间的值大概是相同的,但是254
号桶放的值是不均衡的,频率直方图准确的反映出了百分比,而高度直方图比较模糊
指定bucket
的数量
SQL> exec dbms_stats.gather_table_stats('sys','tt4',method_opt=>'for columns object_id size 254',estimate_percent=>null);
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.54
指定了也没用
当for all columns size
在2-254
之间时Oracle
会自己判断哪些列上需要统计直方图,1
的时候就是不收集直方图
SQL> exec dbms_stats.gather_table_stats('sys','tt2',method_opt=>'for columns object_id size 1');
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.23
SQL> exec dbms_stats.gather_table_stats('sys','tt2',method_opt=>'for columns object_id size 2');
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.23
删除直方图:
11g
之前:
要先把列的统计信息删除
SQL> exec dbms_stats.delete_column_stats('sys','tt2','object_id');
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.02
然后再重新收集列的统计信息但是要加上for all columns size 1
参数,显然要是表很大的话这种方式不好
11g
开始:
收集一下直方图
SQL> exec dbms_stats.gather_table_stats('sys','tt2',method_opt=>'for columns object_id size 2');
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.24
有了高度均衡的直方图
删除直方图,HISTOGRAM
是只删除直方图,要是ALL
就是删除直方图和列的统计信息
SQL> exec dbms_stats.delete_column_stats('sys','tt2',colname=>'object_id',col_stat_type=>'HISTOGRAM');
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.00
已经没有了直方图
optimizer statistics
手动收集有两种办法
1.8i
之前:这个语句后面还可以加很多参数
SQL> select num_rows,blocks,empty_blocks,avg_row_len from dba_tables where table_name = 'TT2';
NUM_ROWS BLOCKS EMPTY_BLOCKS AVG_ROW_LEN
---------- ---------- ------------ -----------
SQL> analyze table tt2 compute statistics;
Table analyzed.
SQL> select num_rows,blocks,empty_blocks,avg_row_len from dba_tables where table_name = 'TT2';
NUM_ROWS BLOCKS EMPTY_BLOCKS AVG_ROW_LEN
---------- ---------- ------------ -----------
247104 3229 98 92
上面的blocks
中不一定有数据,但是曾经一定存放过数据,其中empty_blocks
是分配给这个对象但是从来没有用过的块,db_tables
中的avg_space
是平均每个块空闲了多少空间(单位:字节),一个8k的块不是都放满了默认预留了pct_free
的空间为了防止比如varchar2
这种长度可变的数据类型定义的varchar2(50)
刚开始存了一个a
,但是他后期可能update
成aaaaaaaaa
如果预留的空间也不够放update
的数据这个时候要再占用另一个数据块也就是chain_cnt
的个数,这种情况就是行链接或者行迁移,如果avg_row_len
很大那么可能意味着这个表中有很多的字段并且每个字段定义的很宽,last_analyxzed
最后一次分析的时间,sample_size
取样数据的大小,analyzed
的时候如果表很大就不会全部分析,要是表很小就会全部分析
从8i
开始就有一个包来分析统计信息(dbms_stats
功能非常强大)
先删除之前使用analyzed
收集的信息
SQL> exec dbms_stats.delete_table_stats('sys','tt2');
PL/SQL procedure successfully completed.
可以收集table
的统计信息
SQL> exec dbms_stats.gather_table_stats('sys','tt2');
PL/SQL procedure successfully completed.
上图中的blocks
叫做segment hwm(high water mark)
段的高水位表示数据段中存放数据使用块的历史最大值,这个值越大不能说明数据越多因为使用delete的时候blocks
不会变小但是数据行(num_rows
)会变小,但是当访问一个表的时候如果这个表上没有索引走的是TABLE ACCESS FULL
的时候,那就要把blocks
上所有的块都扫描一遍这样就会大大影响性能
例子:
注意做实验的tt2
表不要放在system
表空间
先看一下目前有多少个空块,要在段头块的dump
文件中看,在4
号数据文件的130
号块上是段头块
SQL> alter system dump datafile 4 block 130;
System altered.
SQL> select spid from v$process
2 where addr = (select paddr from v$session
3 where sid = (select distinct sid from v$mystat));
SPID
------------------------
2108
只粘贴一小部分
vim sales_ora_2108.trc
Extent Control Header
-----------------------------------------------------------------
Extent Header:: spare1: 0 spare2: 0 #extents: 41 #blocks: 3328
last map 0x00000000 #maps: 0 offset: 2716
Highwater:: 0x01000d62 ext#: 40 blk#: 98 ext size: 128
#blocks in seg. hdr's freelists: 0
#blocks below: 3298
mapblk 0x00000000 offset: 40
Unlocked
--------------------------------------------------------
Low HighWater Mark :
Highwater:: 0x01000d62 ext#: 40 blk#: 98 ext size: 128
#blocks in seg. hdr's freelists: 0
#blocks below: 3298
mapblk 0x00000000 offset: 40
Level 1 BMB for High HWM block: 0x01000d01
Level 1 BMB for Low HWM block: 0x01000d01
--------------------------------------------------------
Segment Type: 1 nl2: 1 blksz: 8192 fbsz: 0
L2 Array start offset: 0x00001434
First Level 3 BMB: 0x00000000
L2 Hint for inserts: 0x01000081
Last Level 1 BMB: 0x01000d01
Last Level II BMB: 0x01000081
Last Level III BMB: 0x00000000
Map Header:: next 0x00000000 #extents: 41 obj#: 16220 flag: 0x10000000
Inc # 0
Extent Map
在上面没有看到有多少个空块,那就是在Last Level 1
(第一个块)和Last Level II
(第二个块)中找(他们都是十六进制数)
转换成十进制数
SQL> select to_number('01000d01','xxxxxxxx') from dual;
TO_NUMBER('01000D01','XXXXXXXX')
--------------------------------
16780545
SQL> select to_number('01000081','xxxxxxxx') from dual;
TO_NUMBER('01000081','XXXXXXXX')
--------------------------------
16777345
通过这两个块的十进制地址找到这两个块的数据文件和具体位置
SQL> select dbms_utility.data_block_address_file(16780545) from dual;
DBMS_UTILITY.DATA_BLOCK_ADDRESS_FILE(16780545)
----------------------------------------------
4
SQL> select dbms_utility.data_block_address_block(16780545) from dual;
DBMS_UTILITY.DATA_BLOCK_ADDRESS_BLOCK(16780545)
-----------------------------------------------
3329
SQL> select dbms_utility.data_block_address_file(16777345) from dual;
DBMS_UTILITY.DATA_BLOCK_ADDRESS_FILE(16777345)
----------------------------------------------
4
SQL> select dbms_utility.data_block_address_block(16777345) from dual;
DBMS_UTILITY.DATA_BLOCK_ADDRESS_BLOCK(16777345)
-----------------------------------------------
129
SQL> alter system dump datafile 4 block 3329;
System altered.
vim sales_ora_2108.trc
,要的就是下面的这些信息
Start dump data blocks tsn: 4 file#:4 minblk 3329 maxblk 3329
中间省略。。。
Dump of First Level Bitmap Block
--------------------------------
nbits : 4 nranges: 1 parent dba: 0x01000081 poffset: 57
unformatted: 30 total: 64 first useful block: 0
owning instance : 1
instance ownership changed at
Last successful Search
Freeness Status: nf1 0 nf2 0 nf3 0 nf4 0
Extent Map Block Offset: 4294967295
First free datablock : 34
Bitmap block lock opcode 2
Locker xid: : 0x000a.01d.000001bc
Dealloc scn: 655408.0
Flag: 0x00000001 (-/-/-/-/-/HWM)
Inc #: 0 Objd: 16220
HWM Flag: HWM Set
Highwater:: 0x01000d62 ext#: 40 blk#: 98 ext size: 128
#blocks in seg. hdr's freelists: 0
#blocks below: 3298
mapblk 0x00000000 offset: 40
--------------------------------------------------------
DBA Ranges :
--------------------------------------------------------
0x01000d40 Length: 64 Offset: 0
--五种状态full(满的);25-50% free(百分只25到50是满的);50-75% free(百分之50到75是满的);75-100% free(百分之75到100是满的);unformatted(未定义)
0:FULL 1:FULL 2:FULL 3:FULL
4:FULL 5:FULL 6:FULL 7:FULL
8:FULL 9:FULL 10:FULL 11:FULL
12:FULL 13:FULL 14:FULL 15:FULL
16:FULL 17:FULL 18:FULL 19:FULL
20:FULL 21:FULL 22:FULL 23:FULL
24:FULL 25:FULL 26:FULL 27:FULL
28:FULL 29:FULL 30:FULL 31:FULL
32:FULL 33:FULL 34:unformatted 35:unformatted
36:unformatted 37:unformatted 38:unformatted 39:unformatted
40:unformatted 41:unformatted 42:unformatted 43:unformatted
44:unformatted 45:unformatted 46:unformatted 47:unformatted
48:unformatted 49:unformatted 50:unformatted 51:unformatted
52:unformatted 53:unformatted 54:unformatted 55:unformatted
56:unformatted 57:unformatted 58:unformatted 59:unformatted
60:unformatted 61:unformatted 62:unformatted 63:unformatted
--------------------------------------------------------
End dump data blocks tsn: 4 file#: 4 minblk 3329 maxblk 3329
再看第二个块
SQL> alter system dump datafile 4 block 129;
System altered.
vim sales_ora_2108.trc
没有看到块使用情况的信息也就是说第一个块中记录了块的使用信息
Start dump data blocks tsn: 4 file#:4 minblk 129 maxblk 129
中间省略。。。。
--------------------------------------------------------
End dump data blocks tsn: 4 file#: 4 minblk 129 maxblk 129
现在tt2
的优化器统计信息是
先删除一小部分数据,delete
不会降低高水位线(blocks
)
SQL> delete from tt2 where object_id > 15000;
11488 rows deleted.
SQL> commit;
Commit complete.
看一下统计信息
SQL> set autotrace traceonly exp stat
SQL> select count(1) from tt2; --不加where条件为的是把高水位下的块都找一遍
Execution Plan
----------------------------------------------------------
Plan hash value: 1424956034
-------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 896 (1)| 00:00:11 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | TABLE ACCESS FULL| TT2 | 247K| 896 (1)| 00:00:11 |
-------------------------------------------------------------------
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
3244 consistent gets --发生了3244次逻辑读
--逻辑读:从内存中访问的块的次数,一个块访问一千次逻辑读是1,每个块访问一次访问一千个块,逻辑读也是1
0 physical reads --没有物理读说明我们给的sga足够大,所以分给data buffer cache的空间也足够大
--物理读:把一个块从磁盘上读到内存中来,一次有可能读一个或多个块
0 redo size
528 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
重新收集一下统计信息
SQL> exec dbms_stats.gather_table_stats('sys','tt2');
PL/SQL procedure successfully completed.
num_rows
变了,blocks
没有变
统计信息变化的不明显是因为我们的机器性能还是算是ok
,数据量太小,没有变化
SQL> select count(1) from tt2;
Execution Plan
----------------------------------------------------------
Plan hash value: 1424956034
-------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 896 (1)| 00:00:11 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | TABLE ACCESS FULL| TT2 | 235K| 896 (1)| 00:00:11 |
-------------------------------------------------------------------
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
3244 consistent gets
0 physical reads
0 redo size
528 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
下面降低高水位也就是dba_tables
中的blocks
,其实就是整理这个表中的数据段,最古老的办法就是exp/imp
导入导出,这种方法太粗放
2.move
(表空间变化,数据文件变化,物理位置变化,rowid
变化)
SQL> alter table tt2 move tablespace xiha;
Table altered.
但是这种方法会使表上的index
失效,这是因为index
中有rowid
,现在rowid
变化了
重新收集统计信息(不收集index
上的统计信息)
SQL> exec dbms_stats.gather_table_stats('sys','tt2',cascade=>false);
PL/SQL procedure successfully completed.
现在blocks
就变小了,空块被释放了
看逻辑读也变小了
SQL> set autotrace traceonly exp stat
SQL> select count(1) from tt2;
Execution Plan
----------------------------------------------------------
Plan hash value: 1424956034
-------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 564 (1)| 00:00:07 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | TABLE ACCESS FULL| TT2 | 156K| 564 (1)| 00:00:07 |
-------------------------------------------------------------------
Statistics
----------------------------------------------------------
25 recursive calls --因为刚刚执行了DDL命令(move),所以优化器统计信息也老化了,Oracle在执行sql的时候进行了重新解析所以这里有递归调用,递归调用是解析sql的时候要进行系统资源的消耗,比如查看数据字典有没有这个对象,有没有权限等
0 db block gets
2054 consistent gets --在内存中访问了多少块
0 physical reads
116 redo size
528 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
4 sorts (memory)
0 sorts (disk)
1 rows processed
在执行一次就没有recursive calls
(递归调用)了
SQL> select count(1) from tt2;
Execution Plan
----------------------------------------------------------
Plan hash value: 1424956034
-------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 564 (1)| 00:00:07 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | TABLE ACCESS FULL| TT2 | 156K| 564 (1)| 00:00:07 |
-------------------------------------------------------------------
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
2037 consistent gets
0 physical reads
0 redo size
528 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
现在重建index
这种方式比drop
之后create
要快一点因为已经有了idx_tt2
这个段,建索引加锁会阻塞DML
操作,加上online
参数不会阻塞DML
操作,建议避免在业务高峰期加索引
SQL> alter index idx_tt2 rebuild online;
Index altered.
总结:所以要在频繁删除而又没有及时插入的表想办法降低他的高水位
段的管理方式
assm:auto segment space managment
mssm:manual segment space managment
区的管理方式
local management tablespace
dictionary management tablespace
http://www.itpub.net/thread-909312-1-1.html中的LMT就是local,DMT就是dictionary
从10G
开始:
3.shrink
(收缩),有条件:dba_segments
中的segment_subtype
必须是ASSM
,shrink
的原理就是delete
然后insert
,在大表中操作的时候要注意undo
表空间够不够大
实验:
删除一部分数据肯定又产生了空块
SQL> select count(1) from tt2;
COUNT(1)
----------
494272
SQL> delete from tt2 where object_id > 15000;
23040 rows deleted.
SQL> commit;
Commit complete.
现在高水位线(blocks
)是6556
下面报错是因为在创建表的时候在元数据中是有记录的
SQL> alter table tt2 shrink space;
alter table tt2 shrink space
*
ERROR at line 1:
ORA-10636: ROW MOVEMENT is not enabled
默认情况下行移动是disabled
的,disabled
的意思是数据段中的数据在磁盘上没有特殊情况不能移动除非是删除,也就是一旦放在哪个地方就不能动了,因为变化了rowid
就变化了为了防止index
失效所以在元数据中标识不能行移动
激活行移动
SQL> alter table tt2 enable row movement;
Table altered.
因为上面说的shrink
原理是DML
操作,DML
操作会维护index
SQL> alter table tt2 shrink space;
Table altered.
index
没有失效
重新收集一下优化器统计信息
SQL> exec dbms_stats.gather_table_stats('sys','tt2');
PL/SQL procedure successfully completed.
blocks
降下来了
之前的实验都是全表扫描为的是访问所有块验证高水位
要是走index
呢,下面的性能非常好,没有物理读,逻辑读是19
也就是在内存中访问了19
个块就结束了
SQL> set autotrace traceonly exp stat
SQL> select count(1) from tt2 where object_id = 2;
Execution Plan
----------------------------------------------------------
Plan hash value: 1151720287
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 5 | 3 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 5 | | |
|* 2 | INDEX RANGE SCAN| IDX_TT2 | 32 | 160 | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("OBJECT_ID"=2)
Statistics
----------------------------------------------------------
22 recursive calls
0 db block gets
19 consistent gets
0 physical reads
0 redo size
526 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
4 sorts (memory)
0 sorts (disk)
1 rows processed
Oracle
在生成执行计划的时候依赖统计信息比如dba_tables
中的num_rows、blocks、avg_row_len
下面通过实验验证手动修改了优化器统计信息之后对执行计划的影响
手动修改之前正确的执行计划是先走索引再走表
SQL> select count(object_name) from tt2 where object_id = 2;
Execution Plan
----------------------------------------------------------
Plan hash value: 3949934342
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 25 | 35 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 25 | | |
| 2 | TABLE ACCESS BY INDEX ROWID| TT2 | 32 | 800 | 35 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | IDX_TT2 | 32 | | 3 (0)| 00:00:01 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("OBJECT_ID"=2)
Statistics
----------------------------------------------------------
2 recursive calls
0 db block gets
38 consistent gets
0 physical reads
0 redo size
536 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
手动修改优化器统计信息
SQL> exec dbms_stats.set_table_stats('sys','tt2',numrows=>10);
PL/SQL procedure successfully completed.
SQL> exec dbms_stats.set_table_stats('sys','tt2',numblks=>10);
PL/SQL procedure successfully completed.
已经修改了
这个时候再看执行计划,因为他依赖统计信息,所以他以为这个表很小没有必要走index
,所有产生了错误的执行计划
SQL> select count(object_name) from tt2 where object_id = 2;
Execution Plan
----------------------------------------------------------
Plan hash value: 1424956034
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 25 | 4 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 25 | | |
|* 2 | TABLE ACCESS FULL| TT2 | 1 | 25 | 4 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("OBJECT_ID"=2)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
39 consistent gets
0 physical reads
0 redo size
536 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
注意:上面的执行计划都是估算的