In Hive, the second argument of regexp_replace is a regular expression, while the third argument is an ordinary replacement string.
select split(regexp_replace(data, '\\},\\{', '}||{'), '\\|\\|')[0] as test
from
(select '[{"source":"7fresh","monthSales":4900,"userCount":1900,"score":"9.9"},{"source":"jd","monthSales":2090,"userCount":78981,"score":"9.8"},{"source":"jdmart","monthSales":6987,"userCount":1600,"score":"9.0"}]'
as data) a
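The split function also interprets its delimiter as a regular expression.
So if the query above is written like this instead: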
select split(regexp_replace(data, '\\},\\{', '}||{'), '||')[0] as test -- changed here: the pipes are no longer escaped
from
(select '[{"source":"7fresh","monthSales":4900,"userCount":1900,"score":"9.9"},{"source":"jd","monthSales":2090,"userCount":78981,"score":"9.8"},{"source":"jdmart","monthSales":6987,"userCount":1600,"score":"9.0"}]'
as data) a
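it no longer returns the expected result. As a regular expression, '||' is an alternation of empty patterns, so it matches the empty string rather than the literal "||", and the data is not split at the inserted delimiter.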
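The difference is easier to see on a much shorter input. The query below is a minimal sketch: the literal 'a||b' and the column aliases are made up for illustration, and it assumes a Hive version that allows select without a from clause. The character-class form '[|][|]' is an alternative way of writing the same delimiter without backslashes.

```sql
-- Hypothetical three-way comparison on a tiny input (not from the original query).
select
  split('a||b', '\\|\\|') as escaped,     -- pipes escaped: splits at the literal double pipe
  split('a||b', '[|][|]') as char_class,  -- each pipe in a character class: same delimiter
  split('a||b', '||')     as unescaped;   -- empty alternation: does not split at the double pipe
```

The first two columns should both contain the two elements a and b. The third column shows what the unescaped pattern does to the same string.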
Reference: https://blog.csdn.net/longshenlmj/article/details/49027145