Oracle histograms -- interpreting frequency histogram

最新推荐文章于 2023-07-06 15:40:45 发布

chncaesar

最新推荐文章于 2023-07-06 15:40:45 发布

阅读量1.3k

点赞数

分类专栏： Oracle 文章标签： Frequency histogram

Oracle 专栏收录该内容

33 篇文章 0 订阅

订阅专栏

  http://jonathanlewis.wordpress.com/2010/09/20/frequency-histograms-2/ 

  原文请自备梯子。 
 
 
 

  Frequency histogram的概念请看官方文档： 

  http://docs.oracle.com/cd/E11882_01/server.112/e41573/stats.htm#i41591 
 
 
 
 
 

  Jonathan在这篇文章里介绍了怎么去解读数字型字段的Frequency Histogram。下面给出源代码： 
 
 create table t1(n1 number not null);
--insert 100 rows, 10 values (10,20...100) 
 insert into t1
select 10* round(dbms_random.value(1,10)) from dual connect by level <= 100;
commit; 
 select n1, count(*) from t1 group by n1 order by n1; 
         N1   COUNT(*)
---------- ----------
        10          9 
        20         12 
        30         16 
        40         11 
        50         11 
        60          8 
        70          7 
        80         12 
        90          8 
       100          6  
 --Gathering statistics, size 254 guarantees the creation of frequency histogram 
exec dbms_stats.gather_table_stats(user, 'T1', method_opt=>'FOR COLUMNS N1 SIZE 254',cascade=>true);  
--Viewing histogram info 
 select *
from user_tab_histograms
where table_name='T1' and column_name='N1'
order by endpoint_number; 
 --Getting the occurancy of each value.
select
    endpoint_value              column_value,
    endpoint_number - nvl(prev_endpoint,0)  frequency
from (
    select
        endpoint_number,
        lag(endpoint_number,1) over(order by endpoint_number)  prev_endpoint,
        endpoint_value
    from user_tab_histograms
    where table_name  = 'T1' and column_name = 'N1'
    )
order by endpoint_number; 
 COLUMN_VALUE  FREQUENCY
------------ ----------
          10          9 
          20         12 
          30         16 
          40         11 
          50         11 
          60          8 
          70          7 
          80         12 
          90          8 
         100          6 
 
 
 
 提示： 

  endpoint value - 字段的值 

  endpoint number - 累加型值重复次数。oracle保证顺序性。 
 
 
 
 
 为何Frequency Histogram的数据会失真？ 

  http://jonathanlewis.wordpress.com/2010/09/24/frequency-histogram-3/ 
 
 

  这篇文章中,Jonathan介绍了一个原因。收集统计数据时取样大小。 
 
 
 

  Date型字段的Frequency Histogram的分析代码： 
 
 create table t2(d1 date not null);
insert into t2
select sysdate - round(dbms_random.value(1,10)) from dual connect by level <= 10;
commit;
exec dbms_stats.gather_table_stats(user, 'T2', method_opt=>'FOR COLUMNS D1 SIZE 254',cascade=>true); 
 select d1, count(*) from t2
group by d1
order by d1; 
 D1          COUNT(*)
--------- ----------
17-DEC-13          2 
19-DEC-13          1 
20-DEC-13          2 
21-DEC-13          1 
22-DEC-13          3 
23-DEC-13          1 
 
 select ep_value column_value, endpoint_number - nvl(prev_endpoint,0) frequency
from (
    select endpoint_number,
        lag(endpoint_number,1) over(order by endpoint_number)  prev_endpoint,
        to_date(to_char(endpoint_value,'FM99999999') || '.' || to_char(86400 * mod(endpoint_value,1),'FM99999'),'J.sssss') ep_value
    from user_tab_histograms
    where table_name  = 'T2' and column_name = 'D1'
    )
order by endpoint_number; 
 COLUMN_VALUE  FREQUENCY
------------ ----------
17-DEC-13             2 
19-DEC-13             1 
20-DEC-13             2 
21-DEC-13             1 
22-DEC-13             3 
23-DEC-13             1 
DATE类型在内部存储为数字类型，小数点后是小时分秒部分。故而需要将其解析成可读的类型。 
 
 
 

chncaesar

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
Oracle histograms -- interpreting frequency histogram

http://jonathanlewis.wordpress.com/2010/09/20/frequency-histograms-2/原文请自备梯子。Frequency histogram的概念请看官方文档：http://docs.oracle.com/cd/E11882_01/server.112/e41573/stats.htm#i41591Jonath
复制链接

扫一扫