Reading JSON line by line in Python: how to read line-delimited JSON from a large file


I'm trying to load a large file (2GB in size) filled with JSON strings, delimited by newlines. Ex:

{
    "key11": value11,
    "key12": value12,
}
{
    "key21": value21,
    "key22": value22,
}

The way I'm importing it now is:

content = open(file_path, "r").read()
j_content = json.loads("[" + content.replace("}\n{", "},\n{") + "]")

This seems like a hack (adding commas between the JSON strings, plus opening and closing square brackets, to turn the whole thing into a proper list).

Is there a better way to specify the JSON delimiter (newline \n instead of comma ,)?

Also, Python can't seem to properly allocate memory for an object built from 2GB of data, is there a way to construct each JSON object as I'm reading the file line by line? Thanks!

Solution

Just read each line and construct a JSON object as you go:

with open(file_path) as f:
    for line in f:
        j_content = json.loads(line)

This way, you load a proper, complete JSON object from each line (provided there is no \n inside a JSON value, or in the middle of an object) and you avoid the memory issue, since each object is created only when it is needed.
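A minimal, self-contained sketch of this line-by-line approach (the file name and keys here are illustrative, not from the question), written as a generator that also skips the blank lines that appear between the objects in the example above:

```python
import json

def read_ndjson(file_path):
    """Yield one parsed JSON object per non-blank line of the file."""
    with open(file_path) as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank separator lines between objects
                yield json.loads(line)

# Example usage: write two newline-delimited objects, then stream them back.
with open("data.ndjson", "w") as f:
    f.write('{"key11": 1, "key12": 2}\n')
    f.write('{"key21": 3, "key22": 4}\n')

objects = list(read_ndjson("data.ndjson"))
```

Because `read_ndjson` is a generator, only one line's worth of data is parsed and held at a time, which is what makes it workable for a 2GB file.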

There is also this answer.
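If objects may span multiple lines (the caveat noted above), a possible alternative is the standard library's `json.JSONDecoder.raw_decode`, which decodes one object at a time from a string and reports where it ended. The sketch below operates on an in-memory string for simplicity; for a genuinely huge file you would combine it with chunked reads rather than loading everything at once:

```python
import json

def iter_json_objects(text):
    """Yield successive JSON objects from a string, even if each object
    spans multiple lines."""
    decoder = json.JSONDecoder()
    idx = 0
    while idx < len(text):
        # Skip whitespace (including newlines) between objects.
        while idx < len(text) and text[idx].isspace():
            idx += 1
        if idx >= len(text):
            break
        # raw_decode returns the parsed object and the index just past it.
        obj, idx = decoder.raw_decode(text, idx)
        yield obj

sample = '{\n "key11": 1\n}\n{\n "key21": 2\n}\n'
objs = list(iter_json_objects(sample))
```

Unlike the line-based version, this handles pretty-printed multi-line objects, at the cost of needing the text (or a buffered window of it) in memory.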
