注:技术交流可以加我VX:k-loop,昵称:默读者。
1,表tmp_ma中,存放了用户A,出现在(北京市,长春市,吉林市)3个城市。
uid | stage |
A | 北京市,长春市,吉林市 |
需求:查看每个城市出现的会员
分析:将(北京市,长春市,吉林市)按照(,)分隔,然后转换成3行数据展示
语句:
select uid, stage_someone as stage
from tmp_ma
lateral view explode(split(stage,',')) tmp_stage as stage_someone
where uid='A'
;
结果:
uid | stage |
A | 北京市 |
A | 长春市 |
A | 吉林市 |
需求:查看A在每个城市出现的次数
分析:将(北京市,长春市,吉林市)按照(,)分隔,然后统计每个城市A出现的次数
语句:
select count(uid) as num ,stage_someone as stage
from tmp_ma
lateral view explode(split(stage,',')) tmp_stage as stage_someone
where uid='A'
group by stage_someone
;
结果:
num | stage |
1 | 北京市 |
1 | 吉林市 |
1 | 长春市 |
2,表tmp_ma中,存放了用户B,出现的城市,一个城市一条记录。
uid | stage |
B | 北京市 |
B | 长春市 |
需求:将B出现城市合并到一条数据上,城市与城市之间用(,)分隔
分析:按照B分组统计
语句:
select uid,concat_ws(',',collect_set(stage) ) as stage
from tmp_ma
where uid='B'
group by uid
;
结果:
uid | stage |
B | 北京市,长春市 |
3,表tmp_ma中,存放了用户B,出现的城市,一个城市一条记录,用户A,出现的城市,一个用户一条记录。
uid | stage |
A | 北京市,长春市,吉林市 |
B | 北京市 |
B | 长春市 |
需求:每个城市出现的用户,一个城市一个用户一条记录
语句:
select uid, stage_someone as stage
from tmp_ma
lateral view explode(split(stage,',')) tmp_stage as stage_someone
;
结果:
uid | stage |
A | 北京市 |
A | 长春市 |
A | 吉林市 |
B | 北京市 |
B | 长春市 |
需求:每个城市出现的用户,一个城市一条记录(用户与用户之间用英文逗号分隔)
语句:
select concat_ws(',',collect_set(uid) ) uid, stage_someone as stage
from tmp_ma
lateral view explode(split(stage,',')) tmp_stage as stage_someone
group by stage_someone
;
结果:
uid | stage |
A,B | 北京市 |
A | 吉林市 |
A,B | 长春市 |
需求:每个城市出现的不同的用户数,一个城市一条记录。
语句:
select count(distinct uid) num, stage_someone as stage
from tmp_ma
lateral view explode(split(stage,',')) tmp_stage as stage_someone
group by stage_someone
;
结果:
num | stage |
2 | 北京市 |
1 | 吉林市 |
2 | 长春市 |