Functions used
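lpad(string str, int len, string pad):
Return type: string
Description: left-pads str with pad, repeating pad as needed, until the result is len characters long.
Example: hive> select lpad('abc',10,'td') from lxw1234;
tdtdtdtabc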
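A short sketch of why zero-padding is useful before sorting string columns, and of what lpad does when the input is already longer than len. The literal values are made up for illustration, and the statements assume a Hive version that accepts SELECT without a FROM clause.

```sql
-- As strings, '7' sorts after '12' because '7' > '1' character by character.
SELECT '7' < '12';                               -- false

-- After padding both to 5 characters, string order matches numeric order.
SELECT lpad('7', 5, '0'), lpad('12', 5, '0');    -- 00007   00012
SELECT lpad('7', 5, '0') < lpad('12', 5, '0');   -- true

-- When str is longer than len, the result is cut down to len characters.
SELECT lpad('abcdef', 3, 'x');                   -- abc
```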
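over(partition by class order by score):
Partitions the rows by class and orders them by score within each partition. When order by is given without an explicit window frame, the frame defaults to everything from the start of the partition up to the current row, so an aggregate such as sum accumulates in score order.
row_number is used together with over to number the rows within each partition.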
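A minimal sketch of the over clause with both row_number and a running sum. The scores table and its rows are hypothetical and built inline with UNION ALL, assuming a Hive version that supports CTEs and SELECT without FROM.

```sql
WITH scores AS (
  SELECT 'c1' AS class, 'amy' AS name, 90 AS score
  UNION ALL SELECT 'c1', 'bob', 85
  UNION ALL SELECT 'c2', 'cat', 70
)
SELECT class,
       name,
       score,
       -- position of the row inside its class, lowest score first
       row_number() OVER (PARTITION BY class ORDER BY score) AS rn,
       -- default frame: partition start up to the current row, i.e. a running total
       sum(score)   OVER (PARTITION BY class ORDER BY score) AS running_total
FROM scores;
-- c1  bob  85  rn=1  running_total=85
-- c1  amy  90  rn=2  running_total=175
-- c2  cat  70  rn=1  running_total=70
-- (the order in which these rows are printed is not guaranteed)
```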
cast:
Type conversion, for example cast(rn AS string).
concat_ws(SEP,b,c…):
SEP is the separator; the arguments after it are joined into a single string with SEP between them.
collect_list:
An aggregate function; used with group by, it returns the values of each group as an array (duplicates are kept).
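To illustrate concat_ws and collect_list together with size, here is a small sketch. The table t and its values are hypothetical inline data, and the statements assume a Hive version that supports CTEs and SELECT without FROM.

```sql
-- Join the arguments with ':' between them.
SELECT concat_ws(':', '00001', 'x');   -- 00001:x

WITH t AS (
  SELECT 'k1' AS a, 'x' AS b
  UNION ALL SELECT 'k1', 'y'
  UNION ALL SELECT 'k1', 'x'           -- duplicate value, kept by collect_list
  UNION ALL SELECT 'k2', 'z'
)
SELECT a,
       collect_list(b)       AS bs,    -- one array per value of a
       size(collect_list(b)) AS n      -- k1 -> 3, k2 -> 1
FROM t
GROUP BY a;
```

The order of the elements inside each array is not guaranteed by collect_list, which is one reason to carry an explicit sequence number inside each element.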
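-- Number the rows within each value of a, ordered by c and then by d left-padded with zeros to 5 characters
WITH tbl_hs AS
  (SELECT t1.a,
          t1.b,
          row_number() over(PARTITION BY t1.a
                            ORDER BY t1.c ASC, lpad(t1.d,5,'0') ASC) AS rn
   FROM tabel1 t1
   WHERE t1.day='20200607'
     AND info_type='10'
     AND t1.b NOT IN ('111',
                      '222',
                      '333') ),
-- Build one array of 'rn:b' strings per value of a
     TEMP AS
  (SELECT a,
          collect_list(concat_ws(':',lpad(cast(rn AS string),5,'0'),cast(b AS string))) AS all_list
   FROM tbl_hs
   GROUP BY a)
-- Keep only the groups whose array has more than two elements
SELECT * FROM TEMP WHERE size(all_list)>2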
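Because each element starts with a zero-padded rn, the array can be put into rn order afterwards. The following is only a sketch of that idea using sort_array, which is not part of the original query. The inline rows in tbl_hs are hypothetical stand-ins for the real CTE, and the statement assumes a Hive version that supports SELECT without FROM.

```sql
WITH tbl_hs AS (
  SELECT 'k1' AS a, 'x' AS b, 2 AS rn
  UNION ALL SELECT 'k1', 'y', 1
  UNION ALL SELECT 'k1', 'z', 3
)
SELECT a,
       -- sort_array sorts the strings ascending; the padded prefix makes that rn order
       sort_array(
         collect_list(concat_ws(':', lpad(cast(rn AS string), 5, '0'), b))
       ) AS all_list
FROM tbl_hs
GROUP BY a;
-- k1  ["00001:y","00002:x","00003:z"]
```

Without the padding, '10:...' would sort before '2:...', so the fixed-width prefix is what makes the string sort agree with the numeric order.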