I have a large json file that contains thousands of documents:
[
{
"_id": "document1",
"fields": [ ... ]
},
{
"_id": "document2",
"fields": [ ... ]
},
...
]
I'd like to split this json file so that each json file contains a single document, and name them accordingly:
document1.json, document2.json, ...
For example, document1.json will contain:
{
"_id": "document1",
"fields": [ ... ]
}
I have no knowledge of the jq API, and I'm struggling to find an answer (I've found a similar question, but it is slightly different :( )
Solution
Here is a Python solution to your problem.
Don't forget to change the in_file_path to the location of your big JSON file.
import json

in_file_path = 'path/to/file.json'  # Change me!

with open(in_file_path, 'r') as in_json_file:
    # Read the file and parse it into a list of dictionaries
    json_obj_list = json.load(in_json_file)

for json_obj in json_obj_list:
    # Name each output file after the document's "_id"
    filename = json_obj['_id'] + '.json'
    with open(filename, 'w') as out_json_file:
        # Save each object to its own file,
        # with pretty formatting thanks to `indent=4`
        json.dump(json_obj, out_json_file, indent=4)
Side note: I ran this in Python 3; it should work in Python 2 as well.
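If you want the split to be a bit more defensive, here is a minimal sketch of the same idea wrapped in a function. The function name `split_json_documents`, the `out_dir` parameter, and the handling of missing or awkward `_id` values are my own assumptions, not something from the question, so adjust them to your data. It uses f-strings, so this variant needs Python 3.6 or later.

```python
import json
from pathlib import Path


def split_json_documents(in_file_path, out_dir="."):
    """Write each document of a top-level JSON array to <out_dir>/<_id>.json.

    Hypothetical helper: returns the list of paths that were written.
    """
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)

    with open(in_file_path, "r", encoding="utf-8") as in_json_file:
        documents = json.load(in_json_file)  # a list of dicts

    written = []
    for index, document in enumerate(documents):
        doc_id = document.get("_id")
        if doc_id is None:
            # Assumption: skip documents without an "_id" instead of failing.
            print(f"Skipping document at index {index}: no '_id' field")
            continue
        # Assumption: replace path separators so an "_id" stays inside out_dir.
        safe_name = str(doc_id).replace("/", "_").replace("\\", "_")
        target = out_path / f"{safe_name}.json"
        with open(target, "w", encoding="utf-8") as out_json_file:
            json.dump(document, out_json_file, indent=4)
        written.append(target)
    return written
```

You would call it as `split_json_documents('path/to/file.json', 'out')`. Like the original script, it loads the whole file into memory with `json.load`, which is usually fine for thousands of documents but may not suit files that are many gigabytes in size.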