Reading JSON line by line in Python: how to read line-delimited JSON from a large file


I'm trying to load a large file (2GB in size) filled with JSON strings, delimited by newlines. Ex:

{
    "key11": value11,
    "key12": value12,
}
{
    "key21": value21,
    "key22": value22,
}

The way I'm importing it now is:

content = open(file_path, "r").read()
j_content = json.loads("[" + content.replace("}\n{", "},\n{") + "]")

This seems like a hack (adding commas between the JSON strings, plus opening and closing square brackets, to turn the whole thing into a proper list).

Is there a better way to specify the JSON delimiter (newline \n instead of comma ,)?

Also, Python can't seem to properly allocate memory for an object built from 2GB of data, is there a way to construct each JSON object as I'm reading the file line by line? Thanks!

Solution

Just read each line and construct a JSON object as you go:

with open(file_path) as f:
    for line in f:
        j_content = json.loads(line)

This way, you load a proper, complete JSON object from each line (provided there is no \n inside a JSON value, or in the middle of an object) and you avoid the memory issue, since each object is created only when it is needed.
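A minimal, self-contained sketch of this line-by-line approach (the file name and keys here are illustrative, not from the question), written as a generator that also skips the blank lines that appear between the objects in the example above:

```python
import json

def read_ndjson(file_path):
    """Yield one parsed JSON object per non-blank line of the file."""
    with open(file_path) as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank separator lines between objects
                yield json.loads(line)

# Example usage: write two newline-delimited objects, then stream them back.
with open("data.ndjson", "w") as f:
    f.write('{"key11": 1, "key12": 2}\n')
    f.write('{"key21": 3, "key22": 4}\n')

objects = list(read_ndjson("data.ndjson"))
```

Because `read_ndjson` is a generator, only one line's worth of data is parsed and held at a time, which is what makes it workable for a 2GB file.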

There is also this answer.
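If objects may span multiple lines (the caveat noted above), a possible alternative is the standard library's `json.JSONDecoder.raw_decode`, which decodes one object at a time from a string and reports where it ended. The sketch below operates on an in-memory string for simplicity; for a genuinely huge file you would combine it with chunked reads rather than loading everything at once:

```python
import json

def iter_json_objects(text):
    """Yield successive JSON objects from a string, even if each object
    spans multiple lines."""
    decoder = json.JSONDecoder()
    idx = 0
    while idx < len(text):
        # Skip whitespace (including newlines) between objects.
        while idx < len(text) and text[idx].isspace():
            idx += 1
        if idx >= len(text):
            break
        # raw_decode returns the parsed object and the index just past it.
        obj, idx = decoder.raw_decode(text, idx)
        yield obj

sample = '{\n "key11": 1\n}\n{\n "key21": 2\n}\n'
objs = list(iter_json_objects(sample))
```

Unlike the line-based version, this handles pretty-printed multi-line objects, at the cost of needing the text (or a buffered window of it) in memory.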
