一、 JSON对象,数组
JSON是在开发中经常会遇到的一种轻量级的数据交换格式比xml效率要高,XML需要很多的标签,这占据了网络流量。
JSON可以有两种格式:
一种是对象格式
{
"currentConfirmedCount": 249608,
"confirmedCount": 405399,
"suspectedCount": 16432,
"curedCount": 146023,
"deadCount": 9768
}
另一种是数组对象格式
{
"currentConfirmedCount": 249608,
"confirmedCount": 405399,
"suspectedCount": 16432,
"curedCount": 146023,
"deadCount": 9768
}
二、 JSON数据的处理方式
在数仓中我们可以通过sql来处理json数据,主要通过get_json_object和json_tuple来处理。下面就举下这两个函数的使用案例。
get_json_object()
SELECT get_json_object('{"currentConfirmedCount": 249608,"confirmedCount": 405399,"suspectedCount": 16432,"curedCount": 146023, "deadCount": 9768}'
,"$.currentConfirmedCount")
-- 结果为 249608
SELECT get_json_object('[
{"imgUrl": "https://img1.dxycdn.com/2020/0324/278/3403801351376518263-135.png","title": "新增疑似/新增确诊"}
,{"imgUrl": "https://img1.dxycdn.com/2020/0324/812/3403847865872006755-135.png","title": "境外输入"}
,{"imgUrl": "https://img1.dxycdn.com/2020/0324/547/3403847848692336947-135.png","title": "现存确诊"}
]'
,"$.[1,2].title")
-- 结果为 "境外输入","现存确诊"
json_tuple()
SELECT json_tuple('{"age":18, "name": "lili", "gender": "female"}', "age", "name", "gender");
-- 结果为 18, "lili", "female"
-- 也可以配合hive的侧视图使用
SELECT *
FROM table1
LATERAL VIEW json_tuple('{"age":18, "name": "lili", "gender": "female"}', "age", "name", "gender") state_json
AS age,name,gender