A Detailed Look at the json and pickle Serialization Modules

奔跑的乌班

Article link: https://blog.csdn.net/u010199356/article/details/85320511

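Two modules used for serialization

json: converts between JSON-formatted strings and Python data types
pickle: converts between a Python-specific binary format (bytes) and Python objects

json provides four functions: dumps, dump, loads, load

pickle provides four functions: dumps, dump, loads, load

Both modules are quite practical, and there is much more to learn about them than is covered here. For details, read the official Python library documentation. Alternatively, import json and pickle in the Python interactive interpreter and call help(module[.function_name]) to view the built-in documentation.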
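
As a small sketch of looking up this documentation from a script rather than the interactive prompt, the standard-library inspect module can print the signature and docstring that help() displays. Using inspect here is only an illustration and is not part of the original walkthrough.

```python
import inspect
import json
import pickle

# Signature of json.dumps, as shown at the top of help(json.dumps)
signature = inspect.signature(json.dumps)
print(signature)

# First docstring line for each of the four json functions
for name in ("dumps", "dump", "loads", "load"):
    doc = inspect.getdoc(getattr(json, name))
    print(name, "->", doc.splitlines()[0])

# pickle exposes the same four function names
same_names = all(hasattr(pickle, name) for name in ("dumps", "dump", "loads", "load"))
print(same_names)
```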

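So why do we need serialization and deserialization at all?
They make data easy to store. Serialization (with pickle) converts in-memory Python objects into a binary byte stream, which can easily be saved to disk. When the data is needed again, it is read back from disk and deserialized to recover the original objects.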
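
To make the difference between the two formats concrete, here is a minimal sketch, using a made-up sample dict named record, that serializes the same object with both modules and restores it again.

```python
import json
import pickle

record = {"name": "demo", "scores": [90, 85]}  # hypothetical sample data

as_text = json.dumps(record)     # JSON text, a str
as_bytes = pickle.dumps(record)  # pickle byte stream, a bytes object

print(type(as_text).__name__, type(as_bytes).__name__)

# Deserializing either form gives back an object equal to the original
from_text = json.loads(as_text)
from_bytes = pickle.loads(as_bytes)
print(from_text == record, from_bytes == record)
```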

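json module

1. json.dumps()

json.dumps() serializes a Python object such as a dict into a JSON-formatted str. A dict cannot be written to a text file directly, because the file's write() method accepts only str and raises a TypeError otherwise, so the data has to be converted with this function first.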

    import json

    print("json.dumps()")
    d1 = {"a": 1, "b": 2, "c": 3, "d": 4}
    print(d1)
    print("Type of d1:", type(d1))
    # Serialize the dict to a JSON-formatted string
    dict_to_str1 = json.dumps(d1)
    print(dict_to_str1)
    print("Type of dict_to_str1:", type(dict_to_str1))
    print("*" * 50)

    # Output:
    # json.dumps()
    # {'a': 1, 'b': 2, 'c': 3, 'd': 4}
    # Type of d1: <class 'dict'>
    # {"a": 1, "b": 2, "c": 3, "d": 4}
    # Type of dict_to_str1: <class 'str'>
    # **************************************************
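
json.dumps() also accepts formatting options such as indent, sort_keys, separators, and ensure_ascii. The following sketch, with a made-up dict named data, shows how they change the resulting string.

```python
import json

data = {"b": 2, "a": "café"}  # hypothetical sample data with a non-ASCII character

# Most compact form: no whitespace after separators; non-ASCII is escaped by default
compact = json.dumps(data, separators=(",", ":"))

# Human-readable form: indented, keys sorted, non-ASCII characters kept as-is
readable = json.dumps(data, indent=2, sort_keys=True, ensure_ascii=False)

print(compact)
print(readable)
```

Both strings decode to the same dict with json.loads(); the options affect only the textual layout.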

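2. json.loads()

json.loads() parses a JSON-formatted str into the corresponding Python object; a JSON object becomes a dict.

Starting from the JSON string dict_to_str1 above, we can use loads() to convert it back into a dict: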

    print("json.loads()")
    # Parse the JSON string produced by json.dumps() back into a dict
    d2 = json.loads(dict_to_str1)
    print(d2)
    print("Type of d2:", type(d2))
    print("*" * 50)

    # Output:
    # json.loads()
    # {'a': 1, 'b': 2, 'c': 3, 'd': 4}
    # Type of d2: <class 'dict'>
    # **************************************************
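
A round trip through JSON does not always return exactly the object that went in, because JSON has fewer types than Python. The sketch below, using a made-up dict named original, shows two such conversions and what happens when the input is not valid JSON.

```python
import json

original = {1: ("x", "y"), "flag": True, "nothing": None}  # hypothetical sample data

text = json.dumps(original)
restored = json.loads(text)

print(text)      # int keys become strings, tuples become JSON arrays
print(restored)  # arrays come back as lists

# Text that is not valid JSON raises json.JSONDecodeError
decode_error = None
try:
    json.loads("{'a': 1}")  # single quotes are not allowed in JSON
except json.JSONDecodeError as exc:
    decode_error = exc
print("invalid JSON:", decode_error)
```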

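The dumps and loads functions are also very important when writing web crawlers: the responses we get back from requests are often in JSON format, and that is exactly where the json module comes in.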
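
As an illustration of the crawler use case, the sketch below parses a response body and extracts a field. The response_text string and its code/data/items layout are invented for this example; a real API will have its own structure.

```python
import json

# Hypothetical response body, as a crawler might receive it from an API
response_text = (
    '{"code": 0, "data": {"items": ['
    '{"id": 1, "title": "first"}, {"id": 2, "title": "second"}]}}'
)

payload = json.loads(response_text)

# Navigate the nested dicts and lists like any other Python data
titles = [item["title"] for item in payload["data"]["items"]]
print(titles)
```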

3. json.dump()

json.dump() serializes a Python object such as a dict to JSON and writes the result directly to an open file object.

    print("json.dump()")
    import os

    # Change the working directory to the desktop so the demo file is easy to delete afterwards
    # (adjust this path to a directory that exists on your machine)
    os.chdir(r"C:\Users\999\Desktop")
    print("Current working directory:", os.getcwd())
    temp_dict = {'a': '11', 'b': '22', 'c': '33', 'd': '44', "e":{"aa":'a1', "bb":"b1", "cc":{"c1":"ccc1", "c2":"ccc2", "c3":"ccc3"}}}
    temp_json_filename = 'temp_json.json'
    print(temp_dict)
    print("Type of temp_dict:", type(temp_dict))
    # json.dump() serializes temp_dict and writes it to the file in one step
    with open(os.path.join(os.getcwd(), temp_json_filename), "w", encoding="utf-8") as f:
        json.dump(temp_dict, f)
    print("Write succeeded")
    print("*" * 50)

    # Output:
    # json.dump()
    # Current working directory: C:\Users\999\Desktop
    # {'a': '11', 'b': '22', 'c': '33', 'd': '44', 'e': {'aa': 'a1', 'bb': 'b1', 'cc': {'c1': 'ccc1', 'c2': 'ccc2', 'c3': 'ccc3'}}}
    # Type of temp_dict: <class 'dict'>
    # Write succeeded
    # **************************************************

4. json.load()

json.load() reads JSON data from an open file object and returns the corresponding Python object.

    print("json.load()")
    # Open the file with the same encoding that was used to write it
    with open(os.path.join(os.getcwd(), temp_json_filename), "r", encoding="utf-8") as fr:
        temp_dict_ByLoad = json.load(fr)
    print("Read succeeded")
    print(temp_dict_ByLoad)
    print("Type of temp_dict_ByLoad:", type(temp_dict_ByLoad))
    print("*" * 50)

    # Output:
    # json.load()
    # Read succeeded
    # {'a': '11', 'b': '22', 'c': '33', 'd': '44', 'e': {'aa': 'a1', 'bb': 'b1', 'cc': {'c1': 'ccc1', 'c2': 'ccc2', 'c3': 'ccc3'}}}
    # Type of temp_dict_ByLoad: <class 'dict'>
    # **************************************************
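
The file examples above depend on a specific Windows directory. As a portable variation, the following sketch does the same dump/load round trip in a temporary directory, so it leaves nothing behind. The use of tempfile and pathlib, and the sample dict named config, are choices made for this illustration.

```python
import json
import tempfile
from pathlib import Path

config = {"a": "11", "e": {"aa": "a1", "cc": {"c1": "ccc1"}}}  # hypothetical sample data

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "temp_json.json"

    # Write: json.dump() serializes straight into the open text file
    with path.open("w", encoding="utf-8") as f:
        json.dump(config, f, ensure_ascii=False, indent=2)

    # Read: json.load() parses the file contents back into a dict
    with path.open("r", encoding="utf-8") as f:
        loaded = json.load(f)

print(loaded == config)
```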

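pickle module

What types of data can pickle store?

All native types supported by Python: booleans, integers, floats, complex numbers, strings, bytes, and None.
Lists, tuples, dictionaries, and sets made up of any of the native types.
Functions and classes (these are pickled by reference, so they must be defined at the top level of a module), and instances of classes.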
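
The sketch below pickles a dict containing several of these types, restores it, and then shows that json.dumps() rejects the same data. The sample dict is made up, and datetime.date stands in for "an instance of a class" because it is always importable.

```python
import datetime
import json
import pickle

data = {
    "tags": {"a", "b"},                 # set
    "z": 1 + 2j,                        # complex number
    "raw": b"\x00\x01",                 # bytes
    "pair": (1, 2),                     # tuple
    "day": datetime.date(2024, 1, 31),  # instance of a standard-library class
}

# pickle preserves the Python types exactly
restored = pickle.loads(pickle.dumps(data))
print(restored == data)
print(type(restored["pair"]).__name__)

# json has no representation for these types and raises TypeError
json_error = None
try:
    json.dumps(data)
except TypeError as exc:
    json_error = exc
print("json.dumps failed:", json_error)
```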

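Suppose a running Python program has produced some strings, lists, dictionaries, or other data that you want to keep for later use, rather than leaving it in memory where it is lost when the program exits or the machine powers off. This is where the pickle module from the standard library comes in: it converts objects into a format that can be transmitted or stored.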

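1. pickle.dump(obj, file, protocol=None)

The required parameter obj is the object to be pickled.
The required parameter file is the file object that obj is written to; it must be opened in binary write mode, i.e. "wb".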

    print("pickle.dump()")
    import pickle

    data1 = ["qq","www","e","rrr4"]
    temp_pickle_filename = "temp_pickle.pk"
    # The file must be opened in binary write mode
    with open(os.path.join(os.getcwd(), temp_pickle_filename), "wb") as fw:
        pickle.dump(data1,fw)
    print(data1)
    print("Type of data1:", type(data1))
    print("Write succeeded")
    print("*" * 50)

    # Output:
    # pickle.dump()
    # ['qq', 'www', 'e', 'rrr4']
    # Type of data1: <class 'list'>
    # Write succeeded
    # **************************************************

2. pickle.load(file, *, fix_imports=True, encoding="ASCII", errors="strict")

The required parameter file must be opened in binary read mode, i.e. "rb"; all other parameters are optional.

    print("pickle.load()")
    # The file must be opened in binary read mode
    with open(os.path.join(os.getcwd(), temp_pickle_filename), "rb") as fr:
        data2 = pickle.load(fr)
    print(data2)
    print("Type of data2:", type(data2))
    print("*" * 50)

    # Output:
    # pickle.load()
    # ['qq', 'www', 'e', 'rrr4']
    # Type of data2: <class 'list'>
    # **************************************************
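
A pickle file is not limited to a single object. As a sketch of one common pattern, which the original examples do not cover, several objects can be dumped one after another into the same file and read back with repeated pickle.load() calls until EOFError signals the end. The records list and the temporary directory are assumptions made for this illustration.

```python
import pickle
import tempfile
from pathlib import Path

records = [["qq", "www"], {"k": 1}, (1, 2, 3)]  # hypothetical sample data

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "temp_pickle.pk"

    # Each pickle.dump() call appends one complete pickle to the file
    with path.open("wb") as fw:
        for record in records:
            pickle.dump(record, fw)

    # Each pickle.load() call reads exactly one pickle; EOFError marks the end
    loaded = []
    with path.open("rb") as fr:
        while True:
            try:
                loaded.append(pickle.load(fr))
            except EOFError:
                break

print(loaded == records)
```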

3. pickle.dumps(obj): returns the pickled object as a bytes object instead of writing it to a file

    print('pickle.dumps(obj)')
    data3 = ['aa', 'bb', 'cc']
    print(data3)
    print("Type of data3:", type(data3))
    # dumps converts the data into a bytes object in pickle's Python-specific format
    change_str = pickle.dumps(data3)
    print(change_str)
    print("Type of change_str:", type(change_str))
    print("*" * 50)

    # Output (the bytes shown are from pickle protocol 3; Python 3.8 and later
    # default to protocol 4 and print a different byte string):
    # pickle.dumps(obj)
    # ['aa', 'bb', 'cc']
    # Type of data3: <class 'list'>
    # b'\x80\x03]q\x00(X\x02\x00\x00\x00aaq\x01X\x02\x00\x00\x00bbq\x02X\x02\x00\x00\x00ccq\x03e.'
    # Type of change_str: <class 'bytes'>
    # **************************************************

4. pickle.loads(bytes_object): reads a pickled object from a bytes object and returns it

    print('pickle.loads(bytes_object)')
    # Restore the list from the bytes object produced by pickle.dumps()
    data4 = pickle.loads(change_str)
    print(data4)
    print("Type of data4:", type(data4))
    print("*" * 50)

    # Output:
    # pickle.loads(bytes_object)
    # ['aa', 'bb', 'cc']
    # Type of data4: <class 'list'>
    # **************************************************
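
The exact bytes that pickle.dumps() returns depend on the protocol parameter. The sketch below serializes the same list with every protocol the running interpreter supports and prints the first two bytes of each result; from protocol 2 onward the stream starts with b'\x80' followed by the protocol number.

```python
import pickle

data = ["aa", "bb", "cc"]

payloads = {}
for protocol in range(pickle.HIGHEST_PROTOCOL + 1):
    payloads[protocol] = pickle.dumps(data, protocol=protocol)
    # Every protocol restores an equal list
    assert pickle.loads(payloads[protocol]) == data
    print(protocol, len(payloads[protocol]), payloads[protocol][:2])

print("default protocol:", pickle.DEFAULT_PROTOCOL)
```

pickle.loads() detects the protocol automatically, so no protocol argument is needed when reading.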

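The pickle module defines three exception classes:

1. PickleError: the common base class for pickling and unpickling errors; inherits from Exception
2. PicklingError: raised when an unpicklable object is encountered during pickling; inherits from PickleError
3. UnpicklingError: raised when a problem occurs while unpickling an object; inherits from PickleError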
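
The following sketch triggers both kinds of failure. The lambda and the invalid byte string are made-up inputs. In practice, pickling can also fail with other exception types such as AttributeError or TypeError depending on the object and the Python version, which is why the first handler catches more than PicklingError.

```python
import pickle

# Pickling: a lambda has no importable name, so it cannot be pickled.
# Depending on the Python version and where the lambda is defined, the
# error is pickle.PicklingError or AttributeError, so both are caught.
pickling_error = None
try:
    pickle.dumps(lambda x: x)
except (pickle.PicklingError, AttributeError) as exc:
    pickling_error = exc
print("pickling failed:", type(pickling_error).__name__)

# Unpickling: these bytes are not a valid pickle stream
unpickling_error = None
try:
    pickle.loads(b"\x00not a pickle")
except pickle.UnpicklingError as exc:
    unpickling_error = exc
print("unpickling failed:", type(unpickling_error).__name__)

# Both specific exceptions derive from PickleError
print(issubclass(pickle.PicklingError, pickle.PickleError))
print(issubclass(pickle.UnpicklingError, pickle.PickleError))
```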

This article draws on the content of the following links:

1.http://www.php.cn/python-tutorials-372984.html
2.https://blog.csdn.net/weixin_42329277/article/details/80495065
3.https://blog.csdn.net/brink_compiling/article/details/54932095

Summary

If you find any shortcomings in this article, please point them out so we can discuss them. Thank you.

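Appendix: documentation for the json functions, obtained by calling help(module.function_name) in the Python interactive interpreter after importing the json module

load, loads, dump, and dumps in the json module
1. json.dumps()
         json.dumps() serializes a Python object such as a dict into a JSON-formatted str. A dict cannot be written to a text file directly, so the data has to be converted with this function first.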
 dumps(obj, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, cls=None, indent=None, separators=None, default=None, sort_keys=False, **kw)
    Serialize ``obj`` to a JSON formatted ``str``.
    
    If ``skipkeys`` is true then ``dict`` keys that are not basic types
    (``str``, ``int``, ``float``, ``bool``, ``None``) will be skipped
    instead of raising a ``TypeError``.
    
    If ``ensure_ascii`` is false, then the return value can contain non-ASCII
    characters if they appear in strings contained in ``obj``. Otherwise, all
    such characters are escaped in JSON strings.
    
    If ``check_circular`` is false, then the circular reference check
    for container types will be skipped and a circular reference will
    result in an ``OverflowError`` (or worse).
    
    If ``allow_nan`` is false, then it will be a ``ValueError`` to
    serialize out of range ``float`` values (``nan``, ``inf``, ``-inf``) in
    strict compliance of the JSON specification, instead of using the
    JavaScript equivalents (``NaN``, ``Infinity``, ``-Infinity``).
    
    If ``indent`` is a non-negative integer, then JSON array elements and
    object members will be pretty-printed with that indent level. An indent
    level of 0 will only insert newlines. ``None`` is the most compact
    representation.
    
    If specified, ``separators`` should be an ``(item_separator, key_separator)``
    tuple.  The default is ``(', ', ': ')`` if *indent* is ``None`` and
    ``(',', ': ')`` otherwise.  To get the most compact JSON representation,
    you should specify ``(',', ':')`` to eliminate whitespace.
    
    ``default(obj)`` is a function that should return a serializable version
    of obj or raise TypeError. The default simply raises TypeError.
    
    If *sort_keys* is ``True`` (default: ``False``), then the output of
    dictionaries will be sorted by key.
    
    To use a custom ``JSONEncoder`` subclass (e.g. one that overrides the
    ``.default()`` method to serialize additional types), specify it with
    the ``cls`` kwarg; otherwise ``JSONEncoder`` is used.
2. json.loads()
          json.loads() parses a JSON-formatted str into the corresponding Python object, such as a dict.
 loads(s, encoding=None, cls=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, object_pairs_hook=None, **kw)
    Deserialize ``s`` (a ``str`` instance containing a JSON
    document) to a Python object.
    
    ``object_hook`` is an optional function that will be called with the
    result of any object literal decode (a ``dict``). The return value of
    ``object_hook`` will be used instead of the ``dict``. This feature
    can be used to implement custom decoders (e.g. JSON-RPC class hinting).
    
    ``object_pairs_hook`` is an optional function that will be called with the
    result of any object literal decoded with an ordered list of pairs.  The
    return value of ``object_pairs_hook`` will be used instead of the ``dict``.
    This feature can be used to implement custom decoders that rely on the
    order that the key and value pairs are decoded (for example,
    collections.OrderedDict will remember the order of insertion). If
    ``object_hook`` is also defined, the ``object_pairs_hook`` takes priority.
    
    ``parse_float``, if specified, will be called with the string
    of every JSON float to be decoded. By default this is equivalent to
    float(num_str). This can be used to use another datatype or parser
    for JSON floats (e.g. decimal.Decimal).
    
    ``parse_int``, if specified, will be called with the string
    of every JSON int to be decoded. By default this is equivalent to
    int(num_str). This can be used to use another datatype or parser
    for JSON integers (e.g. float).
    
    ``parse_constant``, if specified, will be called with one of the
    following strings: -Infinity, Infinity, NaN, null, true, false.
    This can be used to raise an exception if invalid JSON numbers
    are encountered.
    
    To use a custom ``JSONDecoder`` subclass, specify it with the ``cls``
    kwarg; otherwise ``JSONDecoder`` is used.
    
    The ``encoding`` argument is ignored and deprecated.

 
3. json.load()
      json.load() reads JSON data from an open file object and returns the corresponding Python object.

 load(fp, cls=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, object_pairs_hook=None, **kw)
    Deserialize ``fp`` (a ``.read()``-supporting file-like object containing
    a JSON document) to a Python object.
    
    ``object_hook`` is an optional function that will be called with the
    result of any object literal decode (a ``dict``). The return value of
    ``object_hook`` will be used instead of the ``dict``. This feature
    can be used to implement custom decoders (e.g. JSON-RPC class hinting).
    
    ``object_pairs_hook`` is an optional function that will be called with the
    result of any object literal decoded with an ordered list of pairs.  The
    return value of ``object_pairs_hook`` will be used instead of the ``dict``.
    This feature can be used to implement custom decoders that rely on the
    order that the key and value pairs are decoded (for example,
    collections.OrderedDict will remember the order of insertion). If
    ``object_hook`` is also defined, the ``object_pairs_hook`` takes priority.
    
    To use a custom ``JSONDecoder`` subclass, specify it with the ``cls``
    kwarg; otherwise ``JSONDecoder`` is used.
4. json.dump()
            json.dump() serializes a Python object such as a dict to JSON and writes the result directly to an open file object.
 
dump(obj, fp, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, cls=None, indent=None, separators=None, default=None, sort_keys=False, **kw)
    Serialize ``obj`` as a JSON formatted stream to ``fp`` (a
    ``.write()``-supporting file-like object).
    
    If ``skipkeys`` is true then ``dict`` keys that are not basic types
    (``str``, ``int``, ``float``, ``bool``, ``None``) will be skipped
    instead of raising a ``TypeError``.
    
    If ``ensure_ascii`` is false, then the strings written to ``fp`` can
    contain non-ASCII characters if they appear in strings contained in
    ``obj``. Otherwise, all such characters are escaped in JSON strings.
    
    If ``check_circular`` is false, then the circular reference check
    for container types will be skipped and a circular reference will
    result in an ``OverflowError`` (or worse).
    
    If ``allow_nan`` is false, then it will be a ``ValueError`` to
    serialize out of range ``float`` values (``nan``, ``inf``, ``-inf``)
    in strict compliance of the JSON specification, instead of using the
    JavaScript equivalents (``NaN``, ``Infinity``, ``-Infinity``).
    
    If ``indent`` is a non-negative integer, then JSON array elements and
    object members will be pretty-printed with that indent level. An indent
    level of 0 will only insert newlines. ``None`` is the most compact
    representation.
    
    If specified, ``separators`` should be an ``(item_separator, key_separator)``
    tuple.  The default is ``(', ', ': ')`` if *indent* is ``None`` and
    ``(',', ': ')`` otherwise.  To get the most compact JSON representation,
    you should specify ``(',', ':')`` to eliminate whitespace.
    
    ``default(obj)`` is a function that should return a serializable version
    of obj or raise TypeError. The default simply raises TypeError.
    
    If *sort_keys* is ``True`` (default: ``False``), then the output of
    dictionaries will be sorted by key.
    
    To use a custom ``JSONEncoder`` subclass (e.g. one that overrides the
    ``.default()`` method to serialize additional types), specify it with
    the ``cls`` kwarg; otherwise ``JSONEncoder`` is used.