hive查勘表结构_如何使用Hive查询结构数组(get_json_object)?

I store the following JSON objects in a Hive table:

{

"main_id": "qwert",

"features": [

{

"scope": "scope1",

"name": "foo",

"value": "ab12345",

"age": 50,

"somelist": ["abcde","fghij"]

},

{

"scope": "scope2",

"name": "bar",

"value": "cd67890"

},

{

"scope": "scope3",

"name": "baz",

"value": [

"A",

"B",

"C"

]

}

]

}

"features" is an array of varying length, i.e. all objects are optional. The objects have arbitrary elements, but all of them contain "scope", "name" and "value".

This is the Hive table I created:

CREATE TABLE tbl(

main_id STRING,features array,age:INT,somelist:array>>

)

I need a Hive query that returns the main_id and the value of the struct with the name "baz", i.e.,

main_id baz_value

qwert ["A","B","C"]

My problem is that the Hive UDF "get_json_object" supports only a limited version of JSONPath. It does not support a path like get_json_object(features, '$.features[?(@.name='baz')]').

How can query the wanted result with Hive? Is it maybe easier with another Hive table structure?

解决方案

I found a solution for this:

Use the Hive explode UDTF to explode the struct array, i.e., create a second (temporary) table with one record for each struct in the array "features".

CREATE TABLE tbl_exploded as

select main_id,

f.name as f_name,

f.value as f_value

from tbl

LATERAL VIEW explode(features) exploded_table as f

-- optionally filter here instead of in 2nd query:

-- where f.name = 'baz';

The result of this is:

qwert, foo, ab12345

qwert, bar, cd67890

qwert, baz, ["A","B","C"]

Now you can select the main_id and value like this:

select main_id, f_value from tbl_exploded where f_name = 'baz';

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值