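My project needs to deserialize JSON strings into Python dict objects, but the built-in json library turned out to be very slow, so I looked for a third-party replacement. orjson and rapidjson are two of the more popular options, so I benchmarked them.
Environment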
- Python 3.8.6
- macOS 11.1
- orjson 3.4.6
- python-rapidjson 1.0
Introduction
orjson
Fast, correct Python JSON library supporting dataclasses, datetimes, and numpy
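It bills itself as the fastest JSON library for Python. Project page: https://github.com/ijl/orjson#install
The source shows that the core is written in Rust, which goes a long way toward explaining the speed.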
rapidjson
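rapidjson is an open-source JSON library from Tencent. Project page: https://github.com/Tencent/rapidjson
It is implemented in C++.
Using it from Python requires the wrapper library python-rapidjson. Project page: https://github.com/python-rapidjson/python-rapidjson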
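Since the goal is to swap out the built-in json module, one way to do it is an import fallback. The sketch below is only an illustration: the names `dumps`, `loads`, and `BACKEND` are my own, and it assumes callers want `str` output from `dumps`, the way `json.dumps` behaves.

```python
import json

# Prefer orjson, then python-rapidjson, then the standard library.
try:
    import orjson

    def dumps(obj) -> str:
        # orjson.dumps returns bytes, so decode to match json.dumps.
        return orjson.dumps(obj).decode("utf-8")

    loads = orjson.loads
    BACKEND = "orjson"
except ImportError:
    try:
        import rapidjson

        dumps = rapidjson.dumps
        loads = rapidjson.loads
        BACKEND = "rapidjson"
    except ImportError:
        dumps = json.dumps
        loads = json.loads
        BACKEND = "json"

payload = loads('{"key": "value", "another_key": 123}')
print(BACKEND, payload["another_key"], dumps(payload))
```

Note that the serialized text is not byte-for-byte identical across backends (orjson, for instance, emits no whitespace between tokens), so compare parsed objects rather than raw strings.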
Installation
orjson
$ pip3 install orjson
Defaulting to user installation because normal site-packages is not writeable
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting orjson
Downloading https://pypi.tuna.tsinghua.edu.cn/packages/dc/be/96126a886573ebbb979d67f5bc4000acb7907042b8aa3494f70d118f9395/orjson-3.4.6-cp38-cp38-macosx_10_7_x86_64.whl (231 kB)
|████████████████████████████████| 231 kB 1.5 MB/s
Installing collected packages: orjson
Successfully installed orjson-3.4.6
rapidjson
$ pip3 install python-rapidjson
Defaulting to user installation because normal site-packages is not writeable
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting python-rapidjson
Downloading https://pypi.tuna.tsinghua.edu.cn/packages/fb/57/503810343cea48df3978ac4b34eb132c10204f0efa2a695d74c49b183004/python_rapidjson-1.0-cp38-cp38-macosx_10_9_x86_64.whl (202 kB)
|████████████████████████████████| 202 kB 5.8 MB/s
Installing collected packages: python-rapidjson
Successfully installed python-rapidjson-1.0
Comparison
dumps:
import time
import json
import orjson
import rapidjson

# Sample payload that every library serializes.
m = {
    "timestamp": 1556283673.1523004,
    "task_uuid": "0ed1a1c3-050c-4fb9-9426-a7e72d0acfc7",
    "task_level": [1, 2, 1],
    "action_status": "started",
    "action_type": "main",
    "key": "value",
    "another_key": 123,
    "and_another": ["a", "b"],
}


def benchmark(name, dumps):
    # Time 1,000,000 calls and print the elapsed seconds.
    start = time.time()
    for _ in range(1000000):
        dumps(m)
    print(name, '{:.3f}'.format(time.time() - start))


benchmark("json:     ", json.dumps)
# orjson only outputs bytes, but often we need unicode:
benchmark("orjson:   ", lambda s: str(orjson.dumps(s), "utf-8"))
benchmark("rapidjson:", rapidjson.dumps)
loads:
import time
import json
import orjson
import rapidjson

# Sample JSON document that every library parses.
m = (
    '{"timestamp": 1556283673.1523004,'
    '"task_uuid": "0ed1a1c3-050c-4fb9-9426-a7e72d0acfc7",'
    '"task_level": [1, 2, 1],'
    '"action_status": "started",'
    '"action_type": "main",'
    '"key": "value",'
    '"another_key": 123,'
    '"and_another": ["a", "b"]}'
)


def benchmark(name, loads):
    # Time 1,000,000 calls and print the elapsed seconds.
    start = time.time()
    for _ in range(1000000):
        loads(m)
    print(name, '{:.3f}'.format(time.time() - start))


benchmark("json:     ", json.loads)
# orjson.loads accepts str as well as bytes, so no conversion is needed here:
benchmark("orjson:   ", orjson.loads)
benchmark("rapidjson:", rapidjson.loads)
output:
seconds per 1,000,000 calls | json | orjson | rapidjson |
---|---|---|---|
dumps | 5.303 | 1.209 | 2.751 |
loads | 4.096 | 1.600 | 2.075 |
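To make the table easier to read, the timings can be turned into speedup factors relative to the built-in json module. This is a small sketch that only does arithmetic on the numbers measured above; the variable names are my own.

```python
# Elapsed seconds for 1,000,000 calls, copied from the results table.
timings = {
    "dumps": {"json": 5.303, "orjson": 1.209, "rapidjson": 2.751},
    "loads": {"json": 4.096, "orjson": 1.600, "rapidjson": 2.075},
}

# Speedup = time taken by json / time taken by the other library.
speedups = {
    op: {lib: row["json"] / secs for lib, secs in row.items() if lib != "json"}
    for op, row in timings.items()
}

for op, row in speedups.items():
    for lib, factor in row.items():
        print(f"{op} {lib}: {factor:.2f}x faster than json")
```

On this machine and payload, orjson comes out roughly 4.4x faster than json for dumps and 2.6x for loads, while rapidjson is a little under 2x faster for both.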