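1. Introduction
JSON (JavaScript Object Notation) is a lightweight data-interchange format that is easy for people to read and write and easy for machines to parse and generate. It suits data-exchange scenarios such as communication between a website's front end and back end.
JSON and XML are broadly comparable in what they can express.
Python 2.7 ships with a json module; simply import json to use it.
Official documentation: http://docs.python.org/library/json.html
Online JSON parser: http://www.json.cn/#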
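2. JSON
Put simply, JSON is built from JavaScript's objects and arrays. Combining these two structures is enough to represent arbitrarily complex data.
Object: in JavaScript an object is written inside { } as key/value pairs, { key: value, key: value, ... }. In object-oriented terms, key is a property of the object and value is that property's value, so a value is read as object.key. A value can be a number, a string, an array, or an object (JSON also allows true, false, and null).
Array: in JavaScript an array is written inside [ ], for example ["Python", "javascript", "C++", ...]. Elements are read by index, as in most languages, and can be of the same types as object values.
The json module provides four functions, dumps, dump, loads, and load, for converting between JSON strings and Python data types.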
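To make the two structures concrete, the following minimal sketch parses a small JSON document that contains an object and an array, then reads values back out. The sample text and variable names are made up for illustration, and Python 3 is assumed.

```python
import json

# Hypothetical sample: an object whose "languages" value is an array.
text = '{"name": "demo", "languages": ["Python", "javascript", "C++"], "stars": 3}'

data = json.loads(text)        # JSON object -> Python dict
first = data["languages"][0]   # JSON array -> Python list, indexed from 0

# Encoding and decoding again should give back an equal value.
round_trip = json.loads(json.dumps(data))

print(type(data).__name__, first, round_trip == data)
```

Where JavaScript reads a property as object.key, the Python dict produced by json.loads is read with data["key"].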
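3. json.loads()
Decodes a JSON-formatted string into a Python object. The type conversions from JSON to Python are as follows:
JSON            Python
object          dict
array           list
string          unicode
number (int)    int, long
number (real)   float
true            True
false           False
null            None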
import json
strList = '[1, 2, 3, 4]'
strDict = '{"city": "北京", "name": "大猫"}'
json.loads(strList)
# [1, 2, 3, 4]
json.loads(strDict)  # JSON strings are decoded to unicode objects
# {u'city': u'\u5317\u4eac', u'name': u'\u5927\u732b'}
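The decoding rules can be checked directly. The sketch below, which assumes Python 3 (where JSON strings become str rather than the unicode objects shown above), parses a made-up document containing one value of each JSON type and records the Python type that json.loads produced.

```python
import json

# Hypothetical sample covering each JSON value type.
sample = (
    '{"obj": {"k": "v"}, "arr": [1, 2], "text": "北京", '
    '"int": 7, "real": 1.5, "yes": true, "no": false, "nothing": null}'
)

parsed = json.loads(sample)

# Map each key to the name of the Python type that json.loads returned.
type_names = {key: type(value).__name__ for key, value in parsed.items()}

for key, name in type_names.items():
    print(key, "->", name)
```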
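4. json.dumps()
Encodes a Python object as a JSON string and returns a str object.
The type conversions from Python to JSON are as follows:
Python              JSON
dict                object
list, tuple         array
str, unicode        string
int, long, float    number
True                true
False               false
None                null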
import json
import chardet
listStr = [1, 2, 3, 4]
tupleStr = (1, 2, 3, 4)
dictStr = {"city": "北京", "name": "大刘"}
json.dumps(listStr)
# '[1, 2, 3, 4]'
json.dumps(tupleStr)
# '[1, 2, 3, 4]'
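# Note: json.dumps() escapes non-ASCII characters by default, so the result is pure ASCII
# Pass ensure_ascii=False to keep the characters unescaped (UTF-8 bytes here)
# chardet.detect() returns a dict; its confidence value is the detection confidence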
json.dumps(dictStr)
# '{"city": "\\u5317\\u4eac", "name": "\\u5927\\u5218"}'
chardet.detect(json.dumps(dictStr))
# {'confidence': 1.0, 'encoding': 'ascii'}
print json.dumps(dictStr, ensure_ascii=False)
# {"city": "北京", "name": "大刘"}
chardet.detect(json.dumps(dictStr, ensure_ascii=False))
# {'confidence': 0.99, 'encoding': 'utf-8'}
chardet is a very good encoding-detection module and can be installed with pip.
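The effect of ensure_ascii can also be seen without chardet. This sketch assumes Python 3 and reuses the sample values from above; indent and sort_keys are standard json.dumps parameters added here only to show pretty-printing.

```python
import json

record = {"city": "北京", "name": "大刘"}

escaped = json.dumps(record)                        # non-ASCII written as \uXXXX escapes
readable = json.dumps(record, ensure_ascii=False)   # characters kept as-is
pretty = json.dumps(record, ensure_ascii=False, indent=2, sort_keys=True)

# Check whether every character of the default output is plain ASCII.
only_ascii = all(ord(ch) < 128 for ch in escaped)

print(escaped)
print(readable)
print(pretty)
print(only_ascii)
```

Both forms decode to the same Python value, so the choice only affects how readable the stored text is and which bytes end up in the file.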
5. json.dump()
Serializes a built-in Python type to JSON and writes it to a file.
import json
listStr = [{"city": "北京"}, {"name": "大刘"}]
# The with block closes the file once the data has been written
with open("listStr.json", "w") as f:
    json.dump(listStr, f, ensure_ascii=False)
dictStr = {"city": "北京", "name": "大刘"}
with open("dictStr.json", "w") as f:
    json.dump(dictStr, f, ensure_ascii=False)
6. json.load()
Reads JSON text from a file and converts it to Python types.
import json
with open("listStr.json") as f:
    strList = json.load(f)
print strList
# [{u'city': u'\u5317\u4eac'}, {u'name': u'\u5927\u5218'}]
with open("dictStr.json") as f:
    strDict = json.load(f)
print strDict
# {u'city': u'\u5317\u4eac', u'name': u'\u5927\u5218'}
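The two file-based functions can be combined into one round trip. The sketch below assumes Python 3, writes to a temporary directory so nothing is left on disk, and passes an explicit encoding so the file is UTF-8 regardless of the platform default. The file name matches the one used above, and the other names are illustrative.

```python
import json
import os
import tempfile

records = [{"city": "北京"}, {"name": "大刘"}]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "listStr.json")

    # Serialize to the file, keeping non-ASCII characters unescaped.
    with open(path, "w", encoding="utf-8") as fp:
        json.dump(records, fp, ensure_ascii=False)

    with open(path, encoding="utf-8") as fp:
        raw_text = fp.read()     # the text actually stored in the file
        fp.seek(0)               # rewind before parsing the same file
        loaded = json.load(fp)   # parsed back into Python objects

print(raw_text)
print(loaded == records)
```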
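7. JsonPath
JsonPath is a library for extracting specified information from JSON documents. Implementations exist for several languages, including Javascript, Python, PHP, and Java.
JsonPath is to JSON what XPath is to XML.
Download: https://pypi.python.org/pypi/jsonpath
Installation: click the Download URL link to download jsonpath, unpack it, and run python setup.py install
Official documentation: http://goessner.net/articles/JsonPath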
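8. JsonPath vs. XPath Syntax
JSON has a clear structure, is highly readable and low in complexity, and is very easy to match against. The table below lists JsonPath operators next to their XPath counterparts.
XPath   JSONPath            Description
/       $                   root node
.       @                   current node
/       . or []             child operator
..      n/a                 parent operator
//      ..                  recursive descent (match at any depth)
*       *                   wildcard, matches all elements
@       n/a                 attribute access (JSON has no attributes)
[]      []                  subscript operator
|       [,]                 union of several names or indices
n/a     [start:end:step]    array slice
[]      ?()                 filter expression
n/a     ()                  script expression
()      n/a                 grouping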
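As a rough illustration of what the recursive-descent operator .. does in an expression such as $..name, the pure-Python sketch below collects every value stored under a given key at any depth. It is not how the jsonpath library is implemented; find_all and the sample data are hypothetical names chosen for this example.

```python
def find_all(node, key):
    """Return every value stored under `key` anywhere inside `node`, in document order."""
    found = []
    if isinstance(node, dict):
        for k, value in node.items():
            if k == key:
                found.append(value)
            # Keep descending: matches may also sit deeper inside this value.
            found.extend(find_all(value, key))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_all(item, key))
    return found


# Hypothetical sample shaped like a city list grouped by initial letter.
sample = {
    "content": {
        "A": [{"id": 1, "name": "安阳"}, {"id": 2, "name": "鞍山"}],
        "B": [{"id": 3, "name": "北京"}],
    }
}

names = find_all(sample, "name")
print(names)
```

The caller does not need to know how deeply the name fields are nested, which is what makes recursive descent convenient for responses whose exact layout is unfamiliar.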
9. Case Study
Using Lagou's city JSON file at http://www.lagou.com/lbs/getAllCitySearchLabels.json as an example, extract the names of all cities.
# Fetch Lagou's city JSON file and extract the names of all cities.
import requests
import jsonpath
import json
import urllib3
# verify=False below would otherwise trigger an InsecureRequestWarning
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
url = 'http://www.lagou.com/lbs/getAllCitySearchLabels.json'
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36"}
response = requests.get(url, verify=False, headers=headers)
html = response.text
print('------------1. JSON text, type str-----------')
print(html)
print(type(html))
# Convert the JSON-formatted string to a Python object
jsonobj = json.loads(html)
print('------------2. Python object-----------')
print(jsonobj)
print(type(jsonobj))
# Starting from the root node, match every name node at any depth
citylist = jsonpath.jsonpath(jsonobj, '$..name')
print('------------3. Extracted names-----------')
print(citylist)
print(type(citylist))
# Serialize the list of names back to a JSON string
content = json.dumps(citylist, ensure_ascii=False)
print('------------4. JSON text, type str-----------')
print(content)
print(type(content))
# Save the result as UTF-8; the with block closes the file
with open('city.json', 'wb') as fp:
    fp.write(content.encode('utf-8'))
Output:
------------1. JSON text, type str-----------
{
"state":1,"message":"success","content":{
"data":{
"allCitySearchLabels":{
"A":[{
"id":723,"name":"安阳","parentId":545,"code":"171500000","isSelected":false},{
"id":601,"name":"鞍山","parentId":535,"code":"081600000","isSelected":false},{
"id":105795,"name":"澳门特别行政区","parentId":562,"code":"330100000","isSelected":false},{
"id":671,"name":"安庆","parentId":541,"code":"131800000","isSelected":false},{
"id":825,"name":"安顺","parentId":553,"code":"240400000","isSelected":false},{
"id":903,"name":"阿勒泰","parentId":560,"code":"310400000","isSelected":false},{
"id":897,"name":"阿克苏","parentId":560,"code":"311800000","isSelected":false},{
"id":862,"name":"安康","parentId":556,"code":"270400000","isSelected":false},{
"id":819,"name":"阿坝藏族羌族自治州","parentId":552,"code":"230700000","isSelected":false},{
"id":598,"name":"阿拉善盟","parentId":534,"code":"070300000","isSelected":false}],"B":[{
"id":5,"name":"北京","parentId":1,"code":"010100000","isSelected":false},{
"id":570,"name":"保定","parentId":532,"code":"051100000","isSelected":false},{
"id":666,"name":"蚌埠","parentId":541,"code":"131300000","isSelected":false},{
"id":588,"name":"包头","parentId":534,"code":"071300000","isSelected":false},{
"id":717,"name":"滨州","parentId":544,"code":"161400000","isSelected":false},{
"id":856,"name":"宝鸡","parentId":556,"code":"271000000","isSelected":false},{
"id":678,"name":"亳州","parentId":541,"code":"130500000","isSelected":false},{
"id":789,"name":"北海","parentId":549,"code":"211000000","isSelected":false},{
"id":794,"name":"百色","parentId":549,"code":"210500000","isSelected":false},{
"id":817,"name":"巴中","parentId":552,"code":"230500000","isSelected":false},{
"id":828,"name":"毕节","parentId":553,"code":"240700000","isSelected":false},{
"id":603,"name":"本溪","parentId":535,"code":"081400000","isSelected":false},{
"id":896,"name":"巴音郭楞","parentId":560,"code":"311700000","isSelected":false},{
"id":834,"name":"保山","parentId":554,"code":"251400000","isSelected":false},{
"id":597,"name":"巴彦淖尔","parentId":534,"code":"070400000","isSelected":false},{
"id":895,"name":"博尔塔拉","parentId":560,"code":"311600000","isSelected":false},{
"id":620,"name":"白城","parentId":536,"code":"090400000","isSelected":false},{
"id":618,"name":"白山","parentId":536,"code":"090600000","isSelected":false}],"C":[{
"id":801,"name":"成都","parentId":552,"code":"230100000","isSelected":false},{
"id":749,"name":"长沙","parentId":547,"code":"190100000","isSelected":false},{
"id":8,"name":"重庆","parentId":4,"code":"040100000","isSelected":false},{
"id":613,"name":"长春","parentId":536,"code":"090100000","isSelected":false},{
"id":638,"name":"常州","parentId":539,"code":"112000000","isSelected":false},{
"id":573,"name":"沧州","parentId":532,"code":"050800000","isSelected":false},{
"id":590,"name":"赤峰","parentId":534,"code":"071100000","isSelected":false},{
"id":758,"name":"郴州","parentId":547,"code":"190500000","isSelected":false},{
"id":781,"name":"潮州","parentId":548,"code":"200500000","isSelected":false},{
"id":755,"name":"常德","parentId":547,"code":"190800000","isSelected":false},{
"id":673,"name":"滁州","parentId":541,"code":"131100000","isSelected":false},{
"id":611,"name":"朝阳","parentId":535,"code":"080600000","isSelected":false},{
"id":572,"name":"承德","parentId":532,"code":"050900000","isSelected":false},{
"id":679,"name":"池州","parentId":541,"code":"130600000","isSelected":false},{
"id":836,"name":"楚雄","parentId":554,"code":"251200000","isSelected":false},{
"id":894,"name":"昌吉","parentId":560,"code":"311500000","isSelected":false},{
"id":905,"name":"崇左","parentId":549,"code":"211400000","isSelected":false}],"D":[{
"id":779,"name":"东莞","parentId":548,"code":"200300000","isSelected":false},{
"id":600,"name":"大连","parentId":535,"code":"081700000","isSelected":false},{
"id":715,"name":"德州","parentId":544,"code":"161600000","isSelected":false},{
"id":627,"name":"大庆","parentId":537,"code":"101300000","isSelected":false},{
"id":805,"name":"德阳","parentId":552,"code":"231700000","isSelected":false},{
"id":706,"name":"东营","parentId":544,"code":"162000000","isSelected":false},{
"id":577,"name":"大同","parentId":533,"code":"061200000","isSelected":false},{
"id":815,"name":"达州","parentId":552,"code":"230300000","isSelected":false},{
"id":841,"name":"大理","parentId":554,"code":"250700000","isSelected":false},{
"id":604,"name":"丹东","parentId":535,"code":"081300000","isSelected":false},{
"id":874,"name":"定西","parentId":557,"code":"280400000","isSelected":false},{
"id":842,"name":"德宏","parentId":554,"code":"250600000","isSelected":false},{
"id":107620,"name":"儋州","parentId":550,"code":"220201000","isSelected":false},{
"id":845,"name":"迪庆","parentId":554,"code":"250300000","isSelected":false}],"E":[{
"id":741,"name":"鄂州","parentId":546,"code":"181600000","isSelected":false},{
"id":748,"name":"恩施","parentId":546,"code":"180300000","isSelected":false},{
"id":592,"name":"鄂尔多斯","parentId":534,"code":"070900000","isSelected":false}],"F":[{
"id":768,"name":"佛山","parentId":548,"code":"202000000","isSelected":false},{
"id":681,"name":"福州","parentId":542,"code":"140100000","isSelected":false},{
"id":674,"name":"阜阳","parentId":541,"code":"131000000","isSelected":false},{
"id":700,"name":"抚州","parentId":543,"code":"150200000","isSelected":false},{
"id":602,"name":"抚顺","parentId":535,"code":"081500000","isSelected":false},{
"id":607,"name":"阜新","parentId":535,"code":"081000000","isSelected":false},{
"id":790,"name":"防城港","parentId":549,"code":"210900000","isSelected":false}],"G":[{
"id":763,"name":"广州","parentId":548,"code":"200100000","isSelected":false},{
"id":822,"name":"贵阳","parentId":553,"code":"240100000","isSelected":false},{
"id":787,"name":"桂林","parentId":549,"code":"211200000","isSelected":false},{
"id":697,"name":"赣州","parentId":543,"code":"150500000","isSelected":false},{
"id":807,"name":"广元","parentId":552,"code":"231900000","isSelected":false},{
"id":792,"name":"贵港","parentId":549,"code":"210700000","isSelected":false},{
"id":814,"name":"广安","parentId":552,"code":"230200000","isSelected":false},{
"id":889,"name":"固原","parentId":559,"code":"300400000","isSelected":false},{
"id":820,"name":"甘孜藏族自治州","parentId":552,"code":"230800000","isSelected":false}],"H":[{
"id":653,"name":"杭州","parentId":540,"code":"120100000","isSelected":false},{
"id":664,"name":"合肥","parentId":541,"code":"130100000","isSelected":false},{
"id":773,"name":"惠州","parentId":548,"code":"202500000","isSelected":false},{
"id":622,"name":"哈尔滨","parentId":537,"code":"100100000","isSelected":false},{
"id":799,"name":"海口","parentId":550,"code":"220100000","isSelected":false},{
"id":587,"name":"呼和浩特","parentId":534,"code":"070100000","isSelected":false},{
"id":568,"name":"邯郸","parentId":532,"code":"051300000","isSelected":false},{
"id":657,"name":"湖州","parentId":540,"code":"122200000","isSelected":false},{
"id":752,"name":"衡阳","parentId":547,"code":"191100000","isSelected":false},{
"id":108353,"name":"海外","parentId":108352,"code":"350100000","isSelected":false},{
"id":643,"name":"淮安","parentId":539,"code":"112500000","isSelected":false},{
"id":718,"name":"菏泽","parentId":544,"code":"160200000","isSelected":false},{
"id":575,"name":"衡水","parentId":532,"code":"050600000","isSelected":false},{
"id":776,"name":"河源","parentId":548,"code":"201400000","isSelected":false},{
"id":760,"name":"怀化","parentId":547,"code":"190300000","isSelected":false},{
"id":745,"name":"黄冈","parentId":546,"code":"181100000","isSelected":false},{
"id":737,"name":"黄石","parentId":546,"code":"181200000","isSelected":false},{
"id":672,"name":"黄山","parentId":541,"code":"131900000","isSelected":false},{
"id":612,"name":"葫芦岛","parentId":535,"code":"080500000","isSelected":false},{
"id":669,"name":"淮北","parentId":541,"code":"131600000","isSelected":false},{
"id":667,"name":"淮南","parentId":541,"code":"131400000","isSelected":false},{
"id":593,"name":"呼伦贝尔","parentId":534,"code":"070800000","isSelected":false},{
"id":860,"name":"汉中","parentId":556,"code":"270600000","isSelected":false},{
"id":795,"name":"贺州","parentId":549,"code":"210400000","isSelected":false},{
"id":837,"name":"红河","parentId":554,"code":"251100000","isSelected":false},{
"id":796,"name":"河池","parentId":549,"code":"210300000","isSelected":false},{
"id":724,"name":"鹤壁","parentId":545,"code":"171600000","isSelected":false},{
"id":879,"name":"海东","parentId":558,"code":"290200000","isSelected":false},{
"id":625,"name":"鹤岗","parentId":537,"code":"101500000","isSelected":false},{
"id":893,"name":"哈密","parentId":560,"code":"311400000","isSelected":false}],"J":[{
"id":702,"name":"济南","parentId":544,"code":"160100000","isSelected":false},{
"id":659,"name":"金华","parentId":540,"code":"122400000","isSelected":false},{
"id":656,"name":"嘉兴","parentId":540,"code":"122100000","isSelected":false},{
"id":769,"name":"江门","parentId":548,"code":"202100000","isSelected":false},{
"id":709,"name":"济宁","parentId":544,"code":"162300000","isSelected":false},{
"id":582,"name":"晋中","parentId":533,"code":"060700000","isSelected":false},{
"id":614,"name":"吉林","parentId":536,"code":"091000000","isSelected":false},{
"id":694,"name":"九江","parentId":543,"code":"150800000","isSelected":false},{
"id":782,"name":"揭阳","parentId":548,"code":"200600000","isSelected":false},{
"id":744,"name":"荆州","parentId":546,"code":"181900000","isSelected":false},{
"id":726,"name":"焦作","parentId":545,"code":"171800000","isSelected":false},{
"id":605,"name":"锦州","parentId":535,"code":"081200000","isSelected":false},{
"id":742,"name":"荆门","parentId":546,"code":"181700000","isSelected":false},{
"id":698,"name":"吉安","parentId":543,"code":"150400000","isSelected":false},{
"id":692,"name":"景德镇","parentId":543,"code":"151000000","isSelected":false},{
"id":580,"name":"晋城","parentId":533,"code":"060900000","isSelected":false},{
"id":629,"name":"佳木斯","parentId":537,"code":"101100000","isSelected":false},{
"id":872,"name":"酒泉","parentId":557,"code":"280600000","isSelected":false},{
"id":107292,"name":"济源","parentId":545,"code"