cume_dist()和percent_rank()这两个函数了,现在把他们的区别在这用个例子说明一下
SQL> create table cume (
2 id integer,
3 value number(8,2),
4 name varchar2(30));
Table created.
SQL>
SQL> select * from cume;
ID VALUE NAME
---------- ---------- ------------------------------------------------------------
1 123 t1
2 234 t2
2 234 t21
3 345 t3
4 456 t4
5 567 t5
6 567 t6
7 rows selected.
SQL>
SQL> select id, value, cume_dist() over (order by value desc) as cume_dist, percent_rank() over (order by value desc) as percent_rank from cume;
ID VALUE CUME_DIST PERCENT_RANK
---------- ---------- ---------- ------------
5 567 .285714286 0
6 567 .285714286 0
4 456 .428571429 .333333333
3 345 .571428571 .5
2 234 .857142857 .666666667
2 234 .857142857 .666666667
1 123 1 1
7 rows selected.
数字看着不是很整齐,我们处理下
SQL> delete from cume where id = 2;
2 rows deleted.
SQL> select id, value, cume_dist() over (order by value desc) as cume_dist, percent_rank() over (order by value desc) as percent_rank from cume;
ID VALUE CUME_DIST PERCENT_RANK
---------- ---------- ---------- ------------
5 567 .4 0
6 567 .4 0
4 456 .6 .5
3 345 .8 .75
1 123 1 1
SQL>
从上面的例子我们可以看出cume_dist()函数统计的范围是0< cume_dist() <=1,但是取不到0。percent_rank()也是0<= percent_rank() <=1,0和1都可以取道的。
cume_disk()的计算公式应该是 = ((统计的列的种类)/(统计的列的总数))*每个重复种类中的个数;如value值为567的两个值,公式就是(4/5)*2=0.4。而最大就是1了
percetn_rank()的计算公式应该是 = ((统计的列的种类)/(统计的列的总数))*100%,但是第一个总是从0开始的,不管是重复几次,要是567有4次,那就是有四个都是0,最后一个是1。