一、多行转单行
样例:表test存储了以下数据
hotel search id poi
90663029 5 15306667 巫山县教委关心下一代工作委员会
90663029 2 15381967 文体广场
90663029 2 15381967 文体广场
90663029 1 15433199 巫山翠屏汽车站
90808282 9 15451636 巫山县教育委员会
90808282 5 15306667 巫山县教委关心下一代工作委员会
90808282 1 15367082 南峰小学
1.collect_set(x)
返回值: 字段的罗列
说明:对聚合范围内某字段所有值的非重复罗列。
select mhotel,concat_ws(',',collect_set(id))
from ggg group by mhotel;
2.collect_list(x)
返回值: 字段的罗列
说明:对聚合范围内某字段所有值的罗列(可重复)。
select mhotel,concat_ws(',',collect_list(id))
from ggg group by mhotel;
二、单行转多行(Array或map)
explode(ARRAY):参数列为数组类型,将数组数据中的每个元素做为一行返回
explode(MAP):将输入map中的每个键值对转换为两列,然后返回新行
posexplode(ARRAY):与explode(ARRAY)相似,但包含元素在数组中的位置(pos,value)
样例:
id mvt
3 NULL
2 [{"val":"1","vid":"142","eid":"154"},{"val":"0","vid":"141","eid":"152"},{"val":"0","vid":"190","eid":"201"},{"val":"1","vid":"147","eid":"160"},{"val":"0","vid":"302","eid":"315"},{"val":"0","vid":"211","eid":"224"},{"val":"1","vid":"133","eid":"144"},{"val":"0","vid":"268","eid":"281"},{"val":"0","vid":"271","eid":"284"},{"val":"0","vid":"258","eid":"271"},{"val":"1","vid":"91","eid":"104"},{"val":"1","vid":"231","eid":"244"}]
1 [{"val":"1","vid":"142","eid":"154"},{"val":"0","vid":"141","eid":"152"},{"val":"0","vid":"190","eid":"201"},{"val":"1","vid":"147","eid":"160"},{"val":"0","vid":"302","eid":"315"},{"val":"0","vid":"211","eid":"224"},{"val":"1","vid":"133","eid":"144"},{"val":"0","vid":"268","eid":"281"},{"val":"0","vid":"271","eid":"284"},{"val":"0","vid":"258","eid":"271"},{"val":"1","vid":"91","eid":"104"},{"val":"1","vid":"231","eid":"244"}]
1.explode(x)
mvt:array<map<string,string>>
select explode(mvt) from hhh;
2.lateral VIEW explode
mvt:array<map<string,string>>
select id,mvt_info from hhh
lateral VIEW explode(mvt) mvts AS mvt_info;
解析ARRAY并获取map的value:
select id,mvt_info['eid'],mvt_info['val'],mvt_info['vid'] from hhh
lateral VIEW explode(mvt) mvts AS mvt_info
解析ARRAY并解析包含的map:
select id,explode(mvt_info)
from hhh lateral VIEW explode(mvt) mvts AS mvt_info
3.lateral VIEW posexplode
mvt:array<map<string,string>>
select id,i,mvt_info from hhh
lateral VIEW posexplode(mvt) mvts AS i,mvt_info;
4.lateral VIEW outer explode
mvt:array<map<string,string>>
explode的列为空时,lateral VIEW不会生成行。为避免该数据行不显示可使用outer关键字,它能够保留原数据行内容并将扩展列的列值置为空。
select id,mvt_info from hhh
lateral VIEW outer explode(mvt) mvts AS mvt_info;
三、单行转多行(string)
样例: pois:string
mhotel pois
90663029 [{"search": 5, "id": 15306667, "poi": "巫山县教委关心下一代工作委员会"}, {"search": 2, "id": 15381967, "poi": "文体广场"}, {"search": 1, "id": 15433199, "poi": "巫
山翠屏汽车站"}]
90808282 [{"search": 9, "id": 15451636, "poi": "巫山县教育委员会"}, {"search": 5, "id": 15306667, "poi": "巫山县教委关心下一代工作委员会"}, {"search": 1, "id": 15367082, "poi": "南峰小学"}]
1.hive String处理
SELECT
mhotel,
json_tuple(poi_json,'search','id','poi') AS (search,id,poi)
FROM
(SELECT
mhotel,
regexp_replace(regexp_replace(pois, '\\[|\\]', ''), '\\}, \\{', '\\}&\\{') pois
FROM iii) t
lateral VIEW explode(split(pois, '\\&')) pois AS poi_json
2.Shell String处理
#!/bin/bash
y='2018-09-15'
for i in $(seq 1 6)
do
x=`date -d "$i day $dt" "+%Y-%m-%d"`
y=$y,$x
done
echo $y
hive -e "select mhotel,dt
from iii lateral view explode(split('$y',',')) dts as dt"