python处理文件中有多个json对象,在Python中读取具有多个对象的JSON文件

I'm a bit idiot in programming and Python. I know that these are a lot of explanations in previous questions about this but I carefully read all of them and I didn't find the solution.

I'm trying to read a JSON file which contains about 1 billion of data like this:

334465|{"color":"33ef","age":"55","gender":"m"}

334477|{"color":"3444","age":"56","gender":"f"}

334477|{"color":"3999","age":"70","gender":"m"}

I was trying hard to overcome that 6 digit numbers at the beginning of each line, but I dont know how can I read multiple JSON objects?

Here is my code but I can't find why it is not working?

import json

T =[]

s = open('simple.json', 'r')

ss = s.read()

for line in ss:

line = ss[7:]

T.append(json.loads(line))

s.close()

And the here is the error that I got:

ValueError: Extra Data: line 3 column 1 - line 5 column 48 (char 42 - 138)

Any suggestion would be very helpful for me!

解决方案

There are several problems with the logic of your code.

ss = s.read()

reads the entire file s into a single string. The next line

for line in ss:

iterates over each character in that string, one by one. So on each loop line is a single character. In

line = ss[7:]

you are getting the entire file contents apart from the first 7 characters (in positions 0 through 6, inclusive) and replacing the previous content of line with that. And then

T.append(json.loads(line))

attempts to convert that to JSON and store the resulting object into the T list.

Here's some code that does what you want. We don't need to read the entire file into a string with .read, or into a list of lines with .readlines, we can simply put the file handle into a for loop and that will iterate over the file line by line.

We use a with statement to open the file, so that it will get closed automatically when we exit the with block, or if there's an IO error.

import json

table = []

with open('simple.json', 'r') as f:

for line in f:

table.append(json.loads(line[7:]))

for row in table:

print(row)

output

{'color': '33ef', 'age': '55', 'gender': 'm'}

{'color': '3444', 'age': '56', 'gender': 'f'}

{'color': '3999', 'age': '70', 'gender': 'm'}

We can make this more compact by building the table list in a list comprehension:

import json

with open('simple.json', 'r') as f:

table = [json.loads(line[7:]) for line in f]

for row in table:

print(row)

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值