Oracle当number类型超过一定长度直方图限制
背景
生产系统,监控巡检发现某个SQL逻辑读非常高,通过查看执行计划,存在三个执行计划,最高cost 9w多,较低的两个位100左右。
经过分析,把具体分析以及发现虽然查询where条件列拥有直方图,且存在数据倾斜,但是CBO依然无法判断数据分布情况通过
模拟测试如下。
1. 11.2.0.4构造测试环境
1.1 创建测试表,包含两列,均为number
create table test_hist (id1 number(30) primary key,id2 number(20) not null);
1.2 插入数据
10000条数据,均为15位加1~10000数字构成,也就是两列的distinct均与行数相同
begin for i in 1..10000 loop insert into test_hist values(111021111112345||i,111021111112345||i); end loop; commit; end; /
1.3 再次插入1w条数据
第一列在上面1w的基础上递增,第二列与上面中第一行数据相同,也就是第二列的分布已经倾斜,1110211111123451值在2w总数据中有10001条。
begin for i in 1..10000 loop insert into test_hist values(1110211111123451000||i,1110211111123451); end loop; commit; end; / SCOTT@honor1 > select count(*) from test_hist; COUNT(*) ---------------------------------------- 20000 SCOTT@honor1 > select count(*) from test_hist where id2=1110211111123451; COUNT(*) ---------------------------------------- 10001
1.4 在第二列创建普通索引
create index idx_test_hist_id2 on test_hist(id2);
1.5 收集统计信息,不收集直方图
exec dbms_stats.gather_table_stats(user,'TEST_HIST',method_opt=>'for all columns size 1',estimate_percent=>100,cascade=>true,no_invalidate=>false);
1.6 查看数据分布
set lines 200 pages 200 col table_name for a15 col num_distinct for 9999999 col density for 999.9999999 col num_nulls for 9999 col num_buckets for 9999 col low_value for a25 col high_value for a25 select table_name,column_name,NUM_DISTINCT,LOW_VALUE,HIGH_VALUE,DENSITY,NUM_NULLS,NUM_BUCKETS,HISTOGRAM from user_tab_col_statistics where table_name='TEST_HIST'; TABLE_NAME COLUMN_NAME NUM_DISTINCT LOW_VALUE HIGH_VALUE DENSITY NUM_NULLS NUM_BUCKETS HISTOGRAM ---------- -------------------- ------------ ------------------------------ ------------------------- ------------ --------- ----------- --------------- TEST_HIST ID1 20000 C80C0B160C0C0D2334 CC0C0B160C0C0D23340102 .0000500 0 1 NONE TEST_HIST ID2 10000 C80C0B160C0C0D2334 CA0C0B160C0C0D2334 .0001000 0 1 NONE col ENDPOINT_ACTUAL_VALUE for a25 col ENDPOINT_ACTUAL_VALUE_RAW for a25 select * from user_tab_histograms where table_name='TEST_HIST'; TABLE_NAME COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_ACTUAL_VALU ----------- ---------------------------- ------------------------- -------------------- TEST_HIST ID1 0 1110211111123450 TEST_HIST ID2 0 1110211111123450 TEST_HIST ID1 1 111021111112345000000000 TEST_HIST ID2 1 11102111111234500000
2. 测试SQL语句
2.1 查看执行计划
SCOTT@honor1 > set autotrace traceonly SCOTT@honor1 > select /*+ gather_plan_statistics */ id2 from test_hist where id2=1110211111123451; 10001 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 1411376830 -------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 2 | 22 | 1 (0)| 00:00:01 | |* 1 | INDEX RANGE SCAN| IDX_TEST_HIST_ID2 | 2 | 22 | 1 (0)| 00:00:01 | --------------------------------------------------------------------------------------
2.2 查看真实执行计划
SYS@honor1 > select sql_id,sql_text,child_number,plan_hash_value from v$sql where sql_text like 'select /*+ gather_plan_statistics */ id2 from test_hist%'; SYS@honor1 > select * from table(dbms_xplan.display_cursor('02v238rmgtfg8',0,'allstats last')); PLAN_TABLE_OUTPUT --------------------------------------------------------------------------------------------------------- SQL_ID 02v238rmgtfg8, child number 0 ------------------------------------- select /*+ gather_plan_statistics */ id2 from test_hist where id2=1110211111123451 Plan hash value: 1411376830 --------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | --------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 10001 |00:00:00.01 | 696 | 31 | |* 1 | INDEX RANGE SCAN| IDX_TEST_HIST_ID2 | 1 | 2 | 10001 |00:00:00.01 | 696 | 31 | --------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - access("ID2"=1110211111123451) 19 rows selected.
# 可以看到由于CBO优化器不知道ID2存在倾斜,导致estimate-rows与actual-rows存在非常大偏差,导致选择了错误执行计划,走了索引。
2.3 收集直方图
# 11g最大直方图buckets为254,所以按照最大收集直方图。
exec dbms_stats.gather_table_stats(user,'TEST_HIST',method_opt=>'for all columns size 254', estimate_percent=>100,cascade=>true,no_invalidate=>false); set lines 200 pages 200 col table_name for a15 col num_distinct for 9999999 col density for 999.9999999 col num_nulls for 9999 col num_buckets for 9999 col low_value for a25 col high_value for a25 select table_name,column_name,NUM_DISTINCT,LOW_VALUE,HIGH_VALUE,DENSITY,NUM_NULLS,NUM_BUCKETS,HISTOGRAM from user_tab_col_statistics where table_name='TEST_HIST' TABLE_NAME COLUMN_NAME NUM_DISTINCT LOW_VALUE HIGH_VALUE DENSITY NUM_NULLS NUM_BUCKETS HISTOGRAM --------------- -------------------- ------------ ------------------------- ------------------------- ------------ --------- ----------- --------------- TEST_HIST ID1 20000 C80C0B160C0C0D2334 CC0C0B160C0C0D23340102 .0000500 0 254 HEIGHT BALANCED TEST_HIST ID2