Converting CSV to JSON with Python (in a specific format)

I would like to convert a CSV file into a JSON file using Python 2.7. Below is the Python code I tried, but it does not give me the expected result. I would also like to know whether there is a simpler way to do this than mine. Any help is appreciated.

Here is my csv file (SampleCsvFile.csv):

zipcode,date,state,val1,val2,val3,val4,val5
95110,2015-05-01,CA,50,30.00,5.00,3.00,3
95110,2015-06-01,CA,67,31.00,5.00,3.00,4
95110,2015-07-01,CA,97,32.00,5.00,3.00,6

Here is the expected json file (ExpectedJsonFile.json):

{
    "zipcode": "95110",
    "state": "CA",
    "subset": [
        {
            "date": "2015-05-01",
            "val1": "50",
            "val2": "30.00",
            "val3": "5.00",
            "val4": "3.00",
            "val5": "3"
        },
        {
            "date": "2015-06-01",
            "val1": "67",
            "val2": "31.00",
            "val3": "5.00",
            "val4": "3.00",
            "val5": "4"
        },
        {
            "date": "2015-07-01",
            "val1": "97",
            "val2": "32.00",
            "val3": "5.00",
            "val4": "3.00",
            "val5": "6"
        }
    ]
}

Here's the python code I tried:

import pandas as pd
from itertools import groupby
import json

df = pd.read_csv('SampleCsvFile.csv')
names = df.columns.values.tolist()
data = df.values

master_list2 = [ (d["zipcode"], d["state"], d) for d in [dict(zip(names, d)) for d in data] ]
intermediate2 = [(k, [x[2] for x in list(v)]) for k,v in groupby(master_list2, lambda t: (t[0],t[1]) )]
nested_json2 = [dict(zip(names,(k[0][0], k[0][1], k[1]))) for k in [(i[0], i[1]) for i in intermediate2]]

#print json.dumps(nested_json2, indent=4)
with open('ExpectedJsonFile.json', 'w') as outfile:
    outfile.write(json.dumps(nested_json2, indent=4))

Solution

Since you are using pandas already, I tried to get as much mileage as I could out of DataFrame methods, and I ended up wandering fairly far from your implementation. I think the key here, though, is not to get too clever with your list and dictionary comprehensions. You can very easily confuse yourself and everyone who reads your code.

import pandas as pd
from collections import OrderedDict
import json

# Read every column as a string so values keep their original
# formatting (e.g. "30.00" stays "30.00").
df = pd.read_csv('SampleCsvFile.csv', dtype={
    "zipcode": str,
    "date": str,
    "state": str,
    "val1": str,
    "val2": str,
    "val3": str,
    "val4": str,
    "val5": str
})

results = []

# Build one output record per (zipcode, state) pair.
for (zipcode, state), bag in df.groupby(["zipcode", "state"]):
    # The remaining columns become the entries of "subset".
    contents_df = bag.drop(["zipcode", "state"], axis=1)
    subset = [OrderedDict(row) for i, row in contents_df.iterrows()]
    results.append(OrderedDict([("zipcode", zipcode),
                                ("state", state),
                                ("subset", subset)]))

# The parenthesized form works on both Python 2.7 and Python 3.
print(json.dumps(results[0], indent=4))

#with open('ExpectedJsonFile.json', 'w') as outfile:
#    outfile.write(json.dumps(results[0], indent=4))

The simplest way to have all the JSON values written as strings, while retaining their original formatting, was to force read_csv to parse every column as a string. If, however, you need to do any numerical manipulation on the values before writing out the JSON, you will have to let read_csv parse them numerically and then coerce them into the proper string format before converting to JSON.
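
As a rough illustration of that second case, here is a minimal sketch of parsing the value columns numerically, changing one of them, and formatting everything back to strings before building the JSON. It rests on several assumptions that are not part of the original answer: it targets Python 3 with a reasonably recent pandas, it reads an inline copy of the sample data through io.StringIO instead of SampleCsvFile.csv, the doubling of val2 is a made-up manipulation, and the choice of two decimal places for val2 to val4 is inferred from the sample file.

```python
import io
import json

import pandas as pd

# Inline copy of the sample data so the sketch runs without SampleCsvFile.csv.
CSV_TEXT = """zipcode,date,state,val1,val2,val3,val4,val5
95110,2015-05-01,CA,50,30.00,5.00,3.00,3
95110,2015-06-01,CA,67,31.00,5.00,3.00,4
95110,2015-07-01,CA,97,32.00,5.00,3.00,6
"""

# Keep the identifying columns as strings; let pandas infer numeric
# types for val1..val5.
df = pd.read_csv(io.StringIO(CSV_TEXT),
                 dtype={"zipcode": str, "date": str, "state": str})

# Hypothetical manipulation, for illustration only: double val2.
df["val2"] = df["val2"] * 2

# Coerce the numbers back into the string formats used by the sample file
# (assumed: plain integers for val1/val5, two decimals for val2..val4).
for col in ["val1", "val5"]:
    df[col] = df[col].map(lambda v: "{:d}".format(int(v)))
for col in ["val2", "val3", "val4"]:
    df[col] = df[col].map(lambda v: "{:.2f}".format(float(v)))

# Same grouping idea as in the answer: one record per (zipcode, state).
results = []
for (zipcode, state), bag in df.groupby(["zipcode", "state"]):
    subset = bag.drop(["zipcode", "state"], axis=1).to_dict(orient="records")
    results.append({"zipcode": zipcode, "state": state, "subset": subset})

print(json.dumps(results[0], indent=4))
```

With the sample data, the first subset entry should carry "val2": "60.00" while the other values keep the formatting they had in the CSV. This sketch relies on plain dicts preserving insertion order (Python 3.7+); on Python 2.7 you would keep OrderedDict as in the answer's code.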
