Hive
Hive技巧
chenzhh25
Python、Hive、数据分析、爬虫
展开
-
Hive和Presto行转列、列转行
1、行转列数据源:要实现的效果:Hive:collect_set转为数组并去重,concat_ws将数组用逗号间隔连接成字符串select user_id , concat_ws(',', collect_set(order_id)) as order_idsfrom tmp.tmp_row_to_colwhere 1 = 1group by user_id...原创 2020-04-24 18:50:50 · 3763 阅读 · 0 评论 -
greatest和least函数,实现多列取最大、最小值
select greatest(1, 2, 3, 4) ; 结果:4select least(1, 2, 3, 4) ; 结果:1select greatest(1, 2, null, 3, 4) ; 结果:空select least(1, 2, null, 3, 4) ; 结果:空Hive和Presto通用,值得注意的是,用这两个函数时,各列不能存在null值,也最好都...原创 2019-12-27 14:18:14 · 4127 阅读 · 0 评论 -
row_number、rank、dense_rank窗口函数
row_numberrow_number() over(partititon by col1 order by col2) as rank_num1结果:1,2,3,4,5rankrank() over(partititon by col1 order by col2) as rank_num2结果:1,2,2,4,5dense_rankdense_rank() ove...原创 2019-12-27 14:01:05 · 142 阅读 · 0 评论 -
Presto、Hive、Mysql出现单引号
-- 1、方式一select country_nmfrom dw_pub_country_tdwhere country_nm = 'Lao People''s Democratic Republic'-- 2、方式二select country_nmfrom dw_pub_country_tdwhere country_nm = "Lao People's Democratic...原创 2019-12-27 11:29:09 · 2268 阅读 · 2 评论 -
Hive sql实现查询连续n天登录的用户
用个排序的窗口函数即可实现:select distinct user_idfrom ( select user_id , continue_date , count(login_date) as continue_day_cnt from ( select user_id , login_date ...原创 2019-12-26 18:03:46 · 3390 阅读 · 3 评论