Python multi-level nested JSON: how to normalize complex nested JSON in Python?

I am trying to normalize complex nested JSON in Python, but I am unable to parse all the objects out.

sample_object = {'Name': 'John', 'Location': {'City': 'Los Angeles', 'State': 'CA'}, 'hobbies': ['Music', 'Running']}

def flatten_json(y):
    out = {}

    def flatten(x, name=''):
        if type(x) is dict:
            for a in x:
                flatten(x[a], name + a + '_')
        elif type(x) is list:
            for a in x:
                flatten(a, name)
        else:
            out[name[:-1]] = x

    flatten(y)
    return out

flat = flatten_json(sample_object)
print(json_normalize(flat))

Returned result:

Name | Location_City | Location_State | Hobbies
-----+---------------+----------------+--------
John | Los Angeles   | CA             | Running

Expected result:

Name | Location_City | Location_State | Hobbies
-----+---------------+----------------+--------
John | Los Angeles   | CA             | Running
John | Los Angeles   | CA             | Music

Solution

The problem you are having originates in the following section:

elif type(x) is list:
    for a in x:
        flatten(a, name)

Because you do not change the name for each element of the list, every subsequent element overwrites the value assigned for the previous one, so only the last element shows up in the output.

Applied to this example: when the flattening function reaches the list 'hobbies', it first assigns the name 'hobbies' to the element 'Music' and writes it to the output. The next element in the list is 'Running', which is also assigned the name 'hobbies'. When this element is written to the output, the key 'hobbies' already exists, so the value 'Music' is overwritten with the value 'Running'.
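The overwrite described above is ordinary dictionary behavior and can be reproduced in isolation. This is only a minimal illustration; the loop stands in for the list branch of the flatten function.

```python
out = {}

# Both list elements are written under the same key, as in the list branch
# of the original function, so the second write replaces the first.
for hobby in ['Music', 'Running']:
    out['hobbies'] = hobby

print(out)  # {'hobbies': 'Running'}
```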

To prevent this, the script from the link you referenced uses the following piece of code to append the array's index to the name, thus creating a unique name for every element of the array.

elif type(x) is list:
    i = 0
    for a in x:
        flatten(a, name + str(i) + '_')
        i += 1
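Put together with the rest of the function from the question, the index-suffixed variant might look like the sketch below. It uses enumerate and isinstance instead of a manual counter and type checks, which is a stylistic choice here and not part of the referenced script.

```python
def flatten_json(y):
    """Flatten nested dicts and lists into a single flat dict."""
    out = {}

    def flatten(x, name=''):
        if isinstance(x, dict):
            for key in x:
                flatten(x[key], name + key + '_')
        elif isinstance(x, list):
            # Append the index so every list element gets a unique name.
            for i, item in enumerate(x):
                flatten(item, name + str(i) + '_')
        else:
            # Drop the trailing '_' that was added for the last name part.
            out[name[:-1]] = x

    flatten(y)
    return out


sample_object = {'Name': 'John',
                 'Location': {'City': 'Los Angeles', 'State': 'CA'},
                 'hobbies': ['Music', 'Running']}

flat = flatten_json(sample_object)
print(flat)
# {'Name': 'John', 'Location_City': 'Los Angeles', 'Location_State': 'CA',
#  'hobbies_0': 'Music', 'hobbies_1': 'Running'}
```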

However, this adds extra columns to the data rather than new rows. If new rows are what you want, you would have to change the way the function is set up. One way could be to adapt the function to return a list of JSON objects (one for each list element in the original JSON).
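One possible way to get a row per list element is sketched below. The name flatten_to_rows is made up for this illustration, and the approach (combining the rows of each key with itertools.product) is just one option, not the method from the referenced script.

```python
from itertools import product


def flatten_to_rows(x, name=''):
    """Return a list of flat dicts; each list element yields its own row."""
    if isinstance(x, dict):
        # Flatten every value on its own, then build each output row by
        # taking one partial row from every key and merging them.
        per_key = [flatten_to_rows(v, name + k + '_') for k, v in x.items()]
        rows = []
        for combo in product(*per_key):
            merged = {}
            for part in combo:
                merged.update(part)
            rows.append(merged)
        return rows
    if isinstance(x, list):
        # Each element contributes its own rows under the same column name.
        return [row for item in x for row in flatten_to_rows(item, name)]
    # Scalar leaf: one row with one column (trailing '_' removed).
    return [{name[:-1]: x}]


sample_object = {'Name': 'John',
                 'Location': {'City': 'Los Angeles', 'State': 'CA'},
                 'hobbies': ['Music', 'Running']}

rows = flatten_to_rows(sample_object)
for row in rows:
    print(row)
# {'Name': 'John', 'Location_City': 'Los Angeles', 'Location_State': 'CA', 'hobbies': 'Music'}
# {'Name': 'John', 'Location_City': 'Los Angeles', 'Location_State': 'CA', 'hobbies': 'Running'}
```

The resulting list of dicts can be passed to pandas.DataFrame to get one row per hobby. Two caveats of this sketch: an object with several lists produces the Cartesian product of their elements, and an empty list drops the record entirely, so both cases would need handling if they occur in your data.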

An extra note: I would recommend being a bit more careful when copying code into a question. The indentation was a bit off in this case, and since you left out the line that imports json_normalize, it might not be clear to everyone that you are importing it from pandas:

from pandas.io.json import json_normalize

In pandas 1.0 and later the same function is also exposed as pandas.json_normalize, and the pandas.io.json import path is deprecated.
