How can I speed up loading and reading JSON files in Python?

I am running a script (in multiprocessing mode) that extracts some parameters from a bunch of JSON files, but currently it is very slow. Here is the script:

from __future__ import print_function, division

import os
from glob import glob
from os import getpid
from time import time
from sys import stdout
import resource
from multiprocessing import Pool
import subprocess

try:
    import simplejson as json
except ImportError:
    import json

path = '/data/data//*.A.1'
print("Running with PID: %d" % getpid())

def process_file(file):
    start = time()
    filename = file.split('/')[-1]
    print(file)
    with open('/data/data/A.1/%s_DI' % filename, 'w') as w:
        with open(file, 'r') as f:
            for n, line in enumerate(f):
                d = json.loads(line)
                try:
                    domain = d['rrname']
                    ips = d['rdata']
                    for i in ips:
                        print("%s|%s" % (i, domain), file=w)
                except:
                    print(d)
                    pass

if __name__ == "__main__":
    files_list = glob(path)
    cores = 12
    print("Using %d cores" % cores)
    pp = Pool(processes=cores)
    pp.imap_unordered(process_file, files_list)
    pp.close()
    pp.join()

Does anybody know how to speed this up?

Solution

First, find out where your bottlenecks are.
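One way to do that is to profile the per-line decoding on a single input before touching the pool. This is a minimal, self-contained sketch using the standard-library `cProfile`/`pstats`; the in-memory sample mimics the `rrname`/`rdata` records above and is an assumption, not your real data — in practice you would point `decode_lines` at one of your actual files.

```python
import cProfile
import io
import json
import pstats

# Build a small in-memory sample so the sketch is self-contained;
# replace this with the contents of one of your real input files.
sample = "\n".join('{"rrname": "example.com", "rdata": ["1.2.3.4"]}'
                   for _ in range(10000))

def decode_lines(text):
    # Mirrors the hot loop of process_file: one json.loads per line.
    for line in text.splitlines():
        json.loads(line)

profiler = cProfile.Profile()
profiler.enable()
decode_lines(sample)
profiler.disable()

# Print the five most expensive calls by cumulative time.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
print(stream.getvalue())
```

If `json.loads` (or the decoder internals under it) dominates the listing, the decoding step is your bottleneck; if most of the time sits in file I/O or printing, switching JSON libraries will not help much.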

If it is on the JSON decoding/encoding step, try switching to ultrajson:

UltraJSON is an ultra fast JSON encoder and decoder written in pure C with bindings for Python 2.5+ and 3.

The changes would be as simple as changing the import part:

try:
    import ujson as json
except ImportError:
    try:
        import simplejson as json
    except ImportError:
        import json
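To see whether the swap actually pays off on your data, you can time the two decoders side by side. A minimal sketch, assuming `ujson` may not be installed (it falls back gracefully); the sample line imitates the records from the question and the iteration count is arbitrary:

```python
from time import perf_counter

import json

try:
    import ujson as fast_json  # pip install ujson
except ImportError:
    fast_json = None

# A line shaped like the records in the question (assumed, not real data).
line = '{"rrname": "example.com", "rdata": ["1.2.3.4", "5.6.7.8"]}'
n = 100_000

# Time the standard-library decoder.
start = perf_counter()
for _ in range(n):
    json.loads(line)
stdlib_time = perf_counter() - start
print("stdlib json: %.3fs" % stdlib_time)

# Time ujson, if available, on the identical workload.
if fast_json is not None:
    start = perf_counter()
    for _ in range(n):
        fast_json.loads(line)
    print("ujson: %.3fs" % (perf_counter() - start))
```

Because the import cascade above binds whichever library wins to the name `json`, the rest of the script needs no changes either way.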
