窗口函数的使用:(over就是使用窗口函数)
具体语法如下:over (partition by xxx order by xxx rows between xxx and xxx )
设置窗口大小:rows between xx and xx
preceding:往前 3 preceding
following:往后 3 following
current row:当前行
unbounded:起点
unbounded preceding:从前面的起点
unbouned following :表示到后面的终点
#先前3个窗口
over(partition by xx order by xxx rows between 3 preceding and current row)
#向后3个窗口
over(partition by xx order by xxx rows between 3 following and current row)
或者over(partition by xx order by xxx rows between current row and 3 following)
#开始到当前组
over(partition by xx order by xxx row between unbounded preceding and current row)
#当前到结束
over(partition by xx order by xxx row between current row and unbounded following)
设定好窗口之后,就是各种函数的使用
#sum,avg,min,max 函数
#row_number() 12345
#rank() 12234
#dense_rank() 122345
#ntile(x):x表示分组的组数
#
percent_rank() :row_number-1/total_number-1
cume_dist : row_number/total_number
# lead() leg() first_value() last_value()
参考教程#聚合 grouping sets with rollup with cube
#()表示所有
group by a,b,c grouping sets((a,b,c),(a,b),())
# with rollup 相当于((a,b,c),(a,b),(a),())
group by a,b,c with rollup
#with cube 相单与 ((a),(b),(c),(a,b),(a,c),(b,c),(a,b,c),())
# 并集
#并集并去重
union [distinct]
union all
#交集
intersect [distinct]
intersect all
#补集
except
except [all]
minus
minus [all]
语法:LATERAL VIEW udtf(expression) tableAlias AS columnAlias (',' columnAlias)