关键字:
count、distinct、人大金仓、KingbaseES
语法
COUNT函数返回查询结果集的行数,它即可以用作聚集函数,也可以用作窗口函数。
聚集函数:
aggregate_name (expression [ , ... ] [ order_by_clause ] ) [ FILTER ( WHERE filter_clause ) ]
aggregate_name (ALL expression [ , ... ] [ order_by_clause ] ) [ FILTER ( WHERE filter_clause ) ]
aggregate_name (DISTINCT expression [ , ... ] [ order_by_clause ] ) [ FILTER ( WHERE filter_clause ) ]
aggregate_name ( * ) [ FILTER ( WHERE filter_clause ) ]
窗口函数:
function_name ([[DISTINCT] expression [, expression ... ]]) [ FILTER ( WHERE filter_clause ) ] OVER window_name
function_name ([[DISTINCT] expression [, expression ... ]]) [ FILTER ( WHERE filter_clause ) ] OVER ( window_definition )
function_name ( * ) [ FILTER ( WHERE filter_clause ) ] OVER window_name
function_name ( * ) [ FILTER ( WHERE filter_clause ) ] OVER ( window_definition )
其中window_definition的语法是
[ existing_window_name ]
[ PARTITION BY expression [, ...] ]
[ ORDER BY expression [ ASC | DESC | USING operator ] [ NULLS { FIRST | LAST } ] [, ...] ]
[ frame_clause ]
规则
(1)COUNT(expression)分支返回表达式expression为非空的行数, DISTINCT指定返回expression非重复值行数,支持视图和dump/restore工具。
(2)当COUNT用作聚集函数时,如果在order_by_clause之外指定了DISTINCT,那么所有的ORDER BY表达式必须匹配聚集的常规参数。也就是说,不能在DISTINCT列表没有包括的表达式上排序。
(3)当COUNT用作窗口函数时,OVER子句中的PARTITION BY子句、ORDER BY子句和FRAME子句可以全部省略或任选。
(4)当指定DISTINCT时,窗口函数仅支持基本的OVER子句,即PARTITION BY子句是OVER子句的唯一有效部分。这也同样适用于使用WINDOW子句单独定义的窗口。
(5)当指定DISTINCT时,返回结果集按“PARTITION BY表达式+DISTINCT表达式”排序(不指定SELECT的ORDER BY子句),窗口函数的计算范围是分区中的所有行,聚集计算忽略分区中DISTINCT列的重复值。
数据准备
(1)创建数据表
test=# create table distinct_tab(a int, b int, c int, d int);
CREATE TABLE
(2)插入数据
test=# insert into distinct_tab values (1, 1, 1, 5);
INSERT 0 1
test=# insert into distinct_tab values (1, 1, 2, 2);
INSERT 0 1
test=# insert into distinct_tab values (1, 2, 2, 4);
INSERT 0 1
test=# insert into distinct_tab values (1, 2, 3, 1);
INSERT 0 1
test=# insert into distinct_tab values (1, 2, 3, 1);
INSERT 0 1
test=# insert into distinct_tab values (2, 3, 1, 3);
INSERT 0 1
test=# insert into distinct_tab values (2, 3, 1, 5);
INSERT 0 1
test=# insert into distinct_tab values (2, 3, 2, 2);
INSERT 0 1
test=# insert into distinct_tab values (2, 4, 2, 4);
INSERT 0 1
test=# insert into distinct_tab values (2, 4, 3, 1);
INSERT 0 1
test=# insert into distinct_tab values (2, 4, 3, 1);
INSERT 0 1
(3)查看数据表
test=# select * from distinct_tab;
a | b | c | d
---+---+---+---
1 | 1 | 1 | 3
1 | 1 | 1 | 5
1 | 1 | 2 | 2
1 | 2 | 2 | 4
1 | 2 | 3 | 1
1 | 2 | 3 | 1
2 | 3 | 1 | 3
2 | 3 | 1 | 5
2 | 3 | 2 | 2
2 | 4 | 2 | 4
2 | 4 | 3 | 1
2 | 4 | 3 | 1
(12 rows)
测试用例
(1)不带DISTINCT的COUNT函数。在不带DISTINCT时,返回结果集按PARTITION BY表达式排序,计算结果包含DISTINCT列重复值。
test=# select a, b, c, d, count(c) over (partition by a)
from distinct_tab order by a, b, c, d;
a | b | c | d | count
---+---+---+---+-------
1 | 1 | 1 | 3 | 6
1 | 1 | 1 | 5 | 6
1 | 1 | 2 | 2 | 6
1 | 2 | 2 | 4 | 6
1 | 2 | 3 | 1 | 6
1 | 2 | 3 | 1 | 6
2 | 3 | 1 | 3 | 6
2 | 3 | 1 | 5 | 6
2 | 3 | 2 | 2 | 6
2 | 4 | 2 | 4 | 6
2 | 4 | 3 | 1 | 6
2 | 4 | 3 | 1 | 6
(12 rows)
(2)带DISTINCT的COUNT函数。带DISTINCT时,返回结果集按“PARTITION BY表达式+DISTINCT表达式”排序的,COUNT函数的计算范围是分区中的所有行,计算结果不包含DISTINCT列重复值。
test=# select a, b, c, d, count(distinct c) over (partition by a)
test-# from distinct_tab order by a, b, c, d;
a | b | c | d | count
---+---+---+---+-------
1 | 1 | 1 | 3 | 3
1 | 1 | 1 | 5 | 3
1 | 1 | 2 | 2 | 3
1 | 2 | 2 | 4 | 3
1 | 2 | 3 | 1 | 3
1 | 2 | 3 | 1 | 3
2 | 3 | 1 | 3 | 3
2 | 3 | 1 | 5 | 3
2 | 3 | 2 | 2 | 3
2 | 4 | 2 | 4 | 3
2 | 4 | 3 | 1 | 3
2 | 4 | 3 | 1 | 3
(12 rows)
(3)不带DISTINCT,且OVER子句中带ORDER BY子句的COUNT函数。
当在OVER子句中省略ORDER BY子句时,无论是否带DISTINCT,COUNT函数返回结果集都是只按PARTITION BY表达式排序,COUNT函数的计算范围是分区中的所有行。当在OVER子句中指定ORDER BY子句时,无论是否带DISTINCT,COUNT函数返回结果集都是按“PARTITION BY表达式+ORDER BY表达式”排序,COUNT函数计算的范围是ORDER BY子句和WINDOW子句共同指定的窗口帧内的行。
test=# select a, b, c, d, count(c) over (partition by a order by b) as c1, count(d) over (partition by b order by a) as c2 from distinct_tab;
a | b | c | d | c1 | c2
---+---+---+---+----+----
1 | 1 | 1 | 3 | 3 | 3
1 | 1 | 1 | 5 | 3 | 3
1 | 1 | 2 | 2 | 3 | 3
1 | 2 | 2 | 4 | 6 | 3
1 | 2 | 3 | 1 | 6 | 3
1 | 2 | 3 | 1 | 6 | 3
2 | 3 | 1 | 3 | 3 | 3
2 | 3 | 1 | 5 | 3 | 3
2 | 3 | 2 | 2 | 3 | 3
2 | 4 | 2 | 4 | 6 | 3
2 | 4 | 3 | 1 | 6 | 3
2 | 4 | 3 | 1 | 6 | 3
(12 rows)