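1. The JSON data format: definition
JSON (JavaScript Object Notation) is a lightweight data-interchange format that is easy for humans to read and write.
2. Encoding and decoding JSON (two approaches: 2.1 and 2.2)
2.1 Encoding and decoding with the json module: json.dumps and json.loads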
Function | Description |
---|---|
json.dumps | Encodes a Python object as a JSON string |
json.loads | Decodes a JSON string into a Python object |
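As a quick illustration of the two functions in the table, here is a minimal round-trip sketch. The variable names (record, encoded, decoded) and the sample values are made up for this example.

```python
import json

# A small Python structure built only from JSON-compatible types.
record = {"name": "Runoob", "tags": ["a", "b"], "count": 7}

# dumps: Python object -> JSON text (always a str).
encoded = json.dumps(record)

# loads: JSON text -> Python object.
decoded = json.loads(encoded)

print(type(encoded).__name__)  # str
print(decoded == record)       # True
```

The round trip only gives back an equal object when the data uses types that JSON can represent directly; a tuple, for instance, would come back as a list.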
2.1.1 Examples of using the json functions in Python
- (1) json.dumps: encode a Python object as a JSON string
Signature and docstring of json.dumps:
def dumps(obj, skipkeys=False, ensure_ascii=True, check_circular=True,
allow_nan=True, cls=None, indent=None, separators=None,
default=None, sort_keys=False, **kw):
"""Serialize ``obj`` to a JSON formatted ``str``.
If ``skipkeys`` is true then ``dict`` keys that are not basic types
(``str``, ``int``, ``float``, ``bool``, ``None``) will be skipped
instead of raising a ``TypeError``.
If ``ensure_ascii`` is false, then the return value can contain non-ASCII
characters if they appear in strings contained in ``obj``. Otherwise, all
such characters are escaped in JSON strings.
If ``check_circular`` is false, then the circular reference check
for container types will be skipped and a circular reference will
result in an ``OverflowError`` (or worse).
If ``allow_nan`` is false, then it will be a ``ValueError`` to
serialize out of range ``float`` values (``nan``, ``inf``, ``-inf``) in
strict compliance of the JSON specification, instead of using the
JavaScript equivalents (``NaN``, ``Infinity``, ``-Infinity``).
If ``indent`` is a non-negative integer, then JSON array elements and
object members will be pretty-printed with that indent level. An indent
level of 0 will only insert newlines. ``None`` is the most compact
representation.
If specified, ``separators`` should be an ``(item_separator, key_separator)``
tuple. The default is ``(', ', ': ')`` if *indent* is ``None`` and
``(',', ': ')`` otherwise. To get the most compact JSON representation,
you should specify ``(',', ':')`` to eliminate whitespace.
``default(obj)`` is a function that should return a serializable version
of obj or raise TypeError. The default simply raises TypeError.
If *sort_keys* is ``True`` (default: ``False``), then the output of
dictionaries will be sorted by key.
To use a custom ``JSONEncoder`` subclass (e.g. one that overrides the
``.default()`` method to serialize additional types), specify it with
the ``cls`` kwarg; otherwise ``JSONEncoder`` is used.
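A few of the parameters described in the docstring are easier to see in action. The sketch below tries ensure_ascii, default, and allow_nan; the sample data and the encode_extra helper are hypothetical, chosen only for illustration.

```python
import json
from datetime import date

# ensure_ascii: by default, non-ASCII characters are \u-escaped.
escaped = json.dumps({"city": "北京"})
readable = json.dumps({"city": "北京"}, ensure_ascii=False)
print(escaped)   # {"city": "\u5317\u4eac"}
print(readable)  # {"city": "北京"}


# default: called for objects that json cannot serialize on its own.
def encode_extra(obj):
    """Hypothetical helper: turn date objects into ISO 8601 strings."""
    if isinstance(obj, date):
        return obj.isoformat()
    raise TypeError(f"{type(obj).__name__} is not JSON serializable")


with_date = json.dumps({"day": date(2020, 1, 2)}, default=encode_extra)
print(with_date)  # {"day": "2020-01-02"}

# allow_nan=False: raise ValueError instead of emitting the non-standard NaN.
try:
    json.dumps(float("nan"), allow_nan=False)
    nan_rejected = False
except ValueError:
    nan_rejected = True
print(nan_rejected)  # True
```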
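Encoding a Python object as a JSON string, demo 1:
#!/usr/bin/env python3
import json
data = [ { 'a' : 1, 'b' : 2, 'c' : 3, 'd' : 4, 'e' : 5 } ]
# Store the result under a name other than json so the module is not shadowed.
json_str = json.dumps(data)
print(json_str)
After json_str = json.dumps(data), json_str is a string:
'[{"a": 1, "b": 2, "c": 3, "d": 4, "e": 5}]'
print(json_str) outputs:
[{"a": 1, "b": 2, "c": 3, "d": 4, "e": 5}]
On Python 3.7 and later the keys keep their insertion order, as shown here; older versions may print them in an arbitrary order.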
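Encoding a Python object as a JSON string with formatting options such as a 4-space indent, demo 2:
print(json.dumps({'a': 'Runoob', 'b': 7}, sort_keys=True, indent=4, separators=(',', ': ')))
The printed result is:
{
    "a": "Runoob",
    "b": 7
}
In this call, sort_keys controls key ordering: when True, dictionary keys are written in ascending order; when False, they are written in the dictionary's own order. indent=4 indents each nesting level by 4 spaces. separators=(',', ': ') puts no space after the item separator, which avoids trailing whitespace at line ends in indented output; for the most compact output, pass separators=(',', ':') and leave indent unset.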
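To compare the formatting options side by side, the following sketch encodes the same dictionary four ways. The variable names are made up, and the key order of the first result assumes Python 3.7 or later, where dictionaries preserve insertion order.

```python
import json

data = {"b": 7, "a": "Runoob"}

default_form = json.dumps(data)
sorted_form = json.dumps(data, sort_keys=True)
compact_form = json.dumps(data, sort_keys=True, separators=(",", ":"))
pretty_form = json.dumps(data, sort_keys=True, indent=4)

print(default_form)  # {"b": 7, "a": "Runoob"}
print(sorted_form)   # {"a": "Runoob", "b": 7}
print(compact_form)  # {"a":"Runoob","b":7}
print(pretty_form)   # multi-line, 4 spaces per nesting level
```

The compact form suits data sent over the network, while the indented form is mainly useful for logs and files that people read.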
- (2) json.loads: decode a JSON string into a Python object
JSON type | Python type |
---|---|
object | dict |
array | list |
string | str (unicode in Python 2) |
number (int) | int (int or long in Python 2) |
number (real) | float |
true | True |
false | False |
null | None |
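The mapping in the table can be checked directly. This sketch, which assumes Python 3, decodes one value of each JSON type and records the Python type it gets; the sample document and its keys are invented for the example.

```python
import json

doc = (
    '{"obj": {"k": "v"}, "arr": [1, 2], "s": "text", '
    '"i": 3, "r": 1.5, "t": true, "f": false, "n": null}'
)
parsed = json.loads(doc)

# Collect the name of the Python type chosen for each JSON value.
type_names = {key: type(value).__name__ for key, value in parsed.items()}

for key, name in type_names.items():
    print(key, name)
```

Note that true and false both map to Python's bool type, and null becomes None, whose type name is NoneType.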
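Decoding a JSON string into a Python object, demo:
import json
jsonData = '{"a":1,"b":2,"c":3,"d":4,"e":5}'
text = json.loads(jsonData)
print(text)
The value of text is the dict: {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}
print outputs: {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}
On Python 3.7 and later the keys appear in the same order as in the JSON text; older versions may show a different order.
Note that print displays a non-string Python value using its standard representation, whereas for a string it omits the outermost quotes:
print("{'d': 4, 'c': 3, 'a': 1, 'e': 5, 'b': 2}")
Result: {'d': 4, 'c': 3, 'a': 1, 'e': 5, 'b': 2}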
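Because print hides the quotes around a string, a decoded dict and a JSON string can look alike on screen. A small sketch, with made-up variable names, that uses repr and type to tell them apart:

```python
import json

text = json.loads('{"a": 1}')  # a dict
as_string = json.dumps(text)   # a str holding JSON text

print(text)                      # {'a': 1}
print(as_string)                 # {"a": 1}
print(repr(as_string))           # '{"a": 1}'
print(type(text).__name__)       # dict
print(type(as_string).__name__)  # str
```

A practical hint: Python's dict representation uses single quotes, while JSON text always uses double quotes around strings.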
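2.2 Encoding and decoding with a third-party library: Demjson
Demjson is a third-party Python module for encoding and decoding JSON data. It also includes the formatting and validation features of JSONLint.
GitHub: https://github.com/dmeranda/demjson
Official site: http://deron.meranda.us/python/demjson/
Installation: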
$ tar -xvzf demjson-2.2.3.tar.gz
$ cd demjson-2.2.3
$ python setup.py install
Function | Description |
---|---|
encode | Encodes a Python object as a JSON string |
decode | Decodes a JSON string into a Python object |
Docstring of the encode function:
r"""Encodes a Python object into a JSON-encoded string.
* 'strict' (Boolean, default False)
If 'strict' is set to True, then only strictly-conforming JSON
output will be produced. Note that this means that some types
of values may not be convertible and will result in a
JSONEncodeError exception.
* 'compactly' (Boolean, default True)
If 'compactly' is set to True, then the resulting string will
have all extraneous white space removed; if False then the
string will be "pretty printed" with whitespace and
indentation added to make it more readable.
* 'encode_namedtuple_as_object' (Boolean or callable, default True)
If True, then objects of type namedtuple, or subclasses of
'tuple' that have an _asdict() method, will be encoded as an
object rather than an array.
It can also be a predicate function that takes a namedtuple
object as an argument and returns True or False.
* 'indent_amount' (Integer, default 2)
The number of spaces to output for each indentation level.
If 'compactly' is True then indentation is ignored.
* 'indent_limit' (Integer or None, default None)
If not None, then this is the maximum limit of indentation
levels, after which further indentation spaces are not
inserted. If None, then there is no limit.
CONCERNING CHARACTER ENCODING:
The 'encoding' argument should be one of:
* None - The return will be a Unicode string.
* encoding_name - A string which is the name of a known
encoding, such as 'UTF-8' or 'ascii'.
* codec - A CodecInfo object, such as found by codecs.lookup().
This allows you to use a custom codec as well as those
built into Python.
If an encoding is given (either by name or by codec), then the
returned value will be a byte array (Python 3), or a 'str' string
(Python 2); which represents the raw set of bytes. Otherwise,
if encoding is None, then the returned value will be a Unicode
string.
The 'escape_unicode' argument is used to determine which characters
in string literals must be \u escaped. Should be one of:
* True -- All non-ASCII characters are always \u escaped.
* False -- Try to insert actual Unicode characters if possible.
* function -- A user-supplied function that accepts a single
unicode character and returns True or False; where True
means to \u escape that character.
Regardless of escape_unicode, certain characters will always be
\u escaped. Additionally any characters not in the output encoding
repertoire for the encoding codec will be \u escaped as well.
"""
Demo of the encode function:
import demjson
data = [ { 'a' : 1, 'b' : 2, 'c' : 3, 'd' : 4, 'e' : 5 } ]
json_str = demjson.encode(data)
print(json_str)
The value of json_str is the string: '[{"a":1,"b":2,"c":3,"d":4,"e":5}]'
print outputs: [{"a":1,"b":2,"c":3,"d":4,"e":5}]
Docstring of the decode function:
"""Decodes a JSON-encoded string into a Python object.
== Optional arguments ==
* 'encoding' (string, default None)
This argument provides a hint regarding the character encoding
that the input text is assumed to be in (if it is not already a
unicode string type).
If set to None then autodetection of the encoding is attempted
(see discussion above). Otherwise this argument should be the
name of a registered codec (see the standard 'codecs' module).
* 'strict' (Boolean, default False)
If 'strict' is set to True, then those strings that are not
entirely strictly conforming to JSON will result in a
JSONDecodeError exception.
* 'return_errors' (Boolean, default False)
Controls the return value from this function. If False, then
only the Python equivalent object is returned on success, or
an error will be raised as an exception.
If True then a 2-tuple is returned: (object, error_list). The
error_list will be an empty list [] if the decoding was
successful, otherwise it will be a list of all the errors
encountered. Note that it is possible for an object to be
returned even if errors were encountered.
* 'return_stats' (Boolean, default False)
Controls whether statistics about the decoded JSON document
are returned (an instance of decode_statistics).
If True, then the stats object will be added to the end of the
tuple returned. If return_errors is also set then a 3-tuple
is returned, otherwise a 2-tuple is returned.
* 'write_errors' (Boolean OR File-like object, default False)
Controls what to do with errors.
- If False, then the first decoding error is raised as an exception.
- If True, then errors will be printed out to sys.stderr.
- If a File-like object, then errors will be printed to that file.
The write_errors and return_errors arguments can be set
independently.
* 'filename_for_errors' (string or None)
Provides a filename to be used when writing error messages.
* 'allow_xxx', 'warn_xxx', and 'forbid_xxx' (Booleans)
These arguments allow for fine-adjustments to be made to the
'strict' argument, by allowing or forbidding specific
syntaxes.
There are many of these arguments, named by replacing the
"xxx" with any number of possible behavior names (See the JSON
class for more details).
Each of these will allow (or forbid) the specific behavior,
after the evaluation of the 'strict' argument. For example,
if strict=True then by also passing 'allow_comments=True' then
comments will be allowed. If strict=False then
forbid_comments=True will allow everything except comments.
Unicode decoding:
-----------------
The input string can be either a python string or a python unicode
string (or a byte array in Python 3). If it is already a unicode
string, then it is assumed that no character set decoding is
required.
However, if you pass in a non-Unicode text string (a Python 2
'str' type or a Python 3 'bytes' or 'bytearray') then an attempt
will be made to auto-detect and decode the character encoding.
This will be successful if the input was encoded in any of UTF-8,
UTF-16 (BE or LE), or UTF-32 (BE or LE), and of course plain ASCII
works too.
Note though that if you know the character encoding, then you
should convert to a unicode string yourself, or pass it the name
of the 'encoding' to avoid the guessing made by the auto
detection, as with
python_object = demjson.decode( input_bytes, encoding='utf8' )
Callback hooks:
---------------
You may supply callback hooks by using the hook name as the
named argument, such as:
decode_float=decimal.Decimal
See the hooks documentation on the JSON.set_hook() method.
"""
Demo of the decode function:
import demjson
json_str = '{"a":1,"b":2,"c":3,"d":4,"e":5}'
text = demjson.decode(json_str)
print(text)
The value of text is a dict. On Python 3 it is: {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}
print outputs: {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}
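For comparison with the decode options described above, the standard json module offers rough counterparts to two of them. This sketch uses only the stdlib, not demjson: json.loads raises json.JSONDecodeError (Python 3.5+) on input such as single-quoted strings, and its parse_float argument plays a role similar to the decode_float hook. The try_parse helper and the sample strings are hypothetical.

```python
import json
from decimal import Decimal


def try_parse(text):
    """Hypothetical helper: return (value, None) or (None, error message)."""
    try:
        return json.loads(text), None
    except json.JSONDecodeError as exc:
        return None, f"line {exc.lineno}, column {exc.colno}: {exc.msg}"


ok_value, ok_error = try_parse('{"a": 1}')
bad_value, bad_error = try_parse("{'a': 1}")  # single quotes are not valid JSON

print(ok_value, ok_error)                        # {'a': 1} None
print(bad_value is None, bad_error is not None)  # True True

# parse_float: decode JSON real numbers as Decimal instead of float.
prices = json.loads('{"price": 0.1, "qty": 3}', parse_float=Decimal)
print(prices["price"] * 3)           # 0.3
print(type(prices["qty"]).__name__)  # int
```

Decoding reals as Decimal avoids binary floating-point rounding, which matters for values such as prices; integers are unaffected by parse_float.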