- 博客(10)
- 资源 (4)
- 收藏
- 关注
原创 Hive进行数据抽样& 随机抽取
1. 直接抽样-- 原来的50%select * from liyang tablesample(50 percent)-- 30Mselect * from liyang tablesample(30M)-- 200行 每个map200行select * from liyang tablesample(200 rows)2. 分桶抽样Hive中的分桶表(Bucket Tab...
2019-12-18 16:28:52 2874
原创 Hive书写正则表达式
1. RLIKE先引入文档中的一段话NULL if A or B is NULL, TRUE if any (possibly empty) substring of A matches the Java regular expression B, otherwise FALSE. For example, 'foobar' RLIKE 'foo' evaluates to TRUE and ...
2019-12-18 15:32:27 624
原创 Hive计算日环比 周同比
-- (本周期 - 上周期) / 上周期select date_time,val,(val-val1)/val1 as `日环比`,(val-val7)/val7 as `周同比`from( select date_time ,val ,lag(val,1) over(order by date_time) as val1 ,lag(val,7) over(order b...
2019-12-17 13:58:24 5044
原创 Mysql实现开窗函数
原始oracle代码select row_number() over(partition by a.statis_day,a.app_version order by to_number(a.all_consume/a.cnt) desc) as rnfrom( select statis_day ,app_version ,cast(action_consume+gl_cons...
2019-12-17 12:58:16 624
原创 Hive&Mysql开窗函数
SUM AVG MIN MAXSELECT id,date_time,pv,-- 从第一行到当前行进行sumSUM(pv) OVER(PARTITION BY id ORDER BY date_time) AS pv1,-- 从第一行到当前行进行sumSUM(pv) OVER(PARTITION BY id ORDER BY date_time ROWS BETWEEN UNBOU...
2019-12-16 18:01:40 761
原创 Hive取整 或者说 四舍五入
大于等于SQL> select ceil(23.33) from dual;24小于等于SQL> select floor(23.33) from dual;23四舍五入SQL> select round(23.33) from dual; -- round(x,[y])23截断SQL> select trunc(23.33) from ...
2019-12-16 15:12:31 3185
原创 cube rollup 和 grouping sets合计
cubeselect if(grouping(a)=1, 'ALL', a) as a,if(grouping(b)=1, 'ALL', b) as b,count(1)from 表名group bycube(a, b)等价于select 'ALL' as a,'ALL' as b,count(1) from 表名 -- 整体进行聚合union allse...
2019-12-16 14:05:30 120
原创 Hive解析 字符串和json串
解析字符串-- A23=02001&A5=591&A33=CTRadstr_to_map(字段名,'&','=')['参数名'] as x
2019-12-16 13:46:56 456
空空如也
TA创建的收藏夹 TA关注的收藏夹
TA关注的人