简单用法
举个例子
select collect_list(name) from t_order;
结果
+------------------------------------------------------------------------------------------------------+
|c0 |
+------------------------------------------------------------------------------------------------------+
|["saml","saml","tony","saml","tony","tony","saml","mart","saml","mart","mart","mart","neil","neil",""]|
+------------------------------------------------------------------------------------------------------+
select explode(collect_list(name)) from t_order;
+----+
|col |
+----+
|saml|
|saml|
|tony|
|saml|
|tony|
|tony|
|saml|
|mart|
|saml|
|mart|
|mart|
|mart|
|neil|
|neil|
| |
+----+
select explode(collect_set(name)) from t_order;
+----+
|col |
+----+
|saml|
|tony|
|mart|
|neil|
| |
+----+
与lateral view结合等的复杂用法
设某表有一个字段extinfo
数据类似如下,类型为json
,需要提取其中target_ids并展开
{"target_type":"4","target_keys":"20402","target_order":"31","target_ids":"[\"4320402595801\",\"4320402133801\",\"4320402919201\",\"4320402238501\"]"}
分布
首先提取对应json字段的值
select device_num,get_json_object(extinfo,'$.target_ids')
+----------+-----------------------------------------------------------------+
|device_num|c1 |
+----------+-----------------------------------------------------------------+
|586344 |["4320402595801","4320402133801","4320402919201","4320402238501"]|
+----------+-----------------------------------------------------------------+
利用正则替换去掉多余的符号
select device_num,regexp_replace(get_json_object(extinfo, '$.target_ids'),
'[\\[\\"\\]]','')
结果
+----------+-------------------------------------------------------+
|device_num|c1 |
+----------+-------------------------------------------------------+
|586344 |4320402595801,4320402133801,4320402919201,4320402238501|
+----------+-------------------------------------------------------+
利用,进行拆分成数组
select device_num,split(regexp_replace(get_json_object(extinfo, '$.target_ids'),
'[\\[\\"\\]]',''),",")
结果
+----------+-----------------------------------------------------------------+
|device_num|c1 |
+----------+-----------------------------------------------------------------+
|586344 |["4320402595801","4320402133801","4320402919201","4320402238501"]|
+----------+-----------------------------------------------------------------+
数组就可以用explode
结合lateral view
炸开了.
注意explode的位置在from xx的后面,此外,表别名subview,炸开的列别名target_id
select device_num,target_id
from xx
lateral view explode(split(regexp_replace(get_json_object(extinfo, '$.target_ids'),
'[\\[\\"\\]]',''),",")) subview as target_id
结果如下
+----------+-------------+
|device_num|target_id |
+----------+-------------+
|586344 |4320402595801|
|586344 |4320402133801|
|586344 |4320402919201|
|586344 |4320402238501|
+----------+-------------+
总结
list
,set
可以使用explode
炸开- 如果炸开后还要和原来的表连接,需要搭配使用lateral view
explode
结合lateral view
需要给表和列起别名split
函数生成的是数组- 使用
explode
需要先构建list
或者set
等等