JSON file data:
{
"store": {
"book": [
{
"category": "修真",
"author": "六道",
"title": "坏蛋是怎样练成的",
"price": 8.95
},
{
"category": "玄幻",
"author": "天蚕土豆",
"title": "斗破苍穹",
"price": 12.99
},
{
"category": "科幻",
"author": "刘慈欣",
"title": "三体",
"price": 29.99
},
{
"category": "历史",
"author": "当年明月",
"title": "明朝那些事儿",
"price": 39.95
},
{
"category": "文学",
"author": "村上春树",
"title": "挪威的森林",
"price": 19.99
},
{
"category": "悬疑",
"author": "东野圭吾",
"title": "白夜行",
"price": 25.99,
"isbn": 11002
},
{
"category": "经济",
"author": "吴晓波",
"title": "大败局",
"price": 36.99,
"isbn": 11001
}
]
}
}
Code:
import json
import jsonpath
# Load the sample data; the context manager closes the file afterwards
with open('爬虫_018_jsonpath.json', 'r', encoding='utf-8') as f:
    obj = json.load(f)
# Authors of all books
# author_list = jsonpath.jsonpath(obj, '$.store.book[*].author')
# author_list = jsonpath.jsonpath(obj, '$..author')
# print(author_list)
# All elements under store
# tag_list = jsonpath.jsonpath(obj, '$..store.*')
# print(tag_list)
# Prices of all books under store
# price_list = jsonpath.jsonpath(obj, '$..book[*].price')
# print(price_list)
# The third book
# num_list = jsonpath.jsonpath(obj, '$..book[2]')
# print(num_list)
# The last book
# book = jsonpath.jsonpath(obj, '$..book[(@.length-1)]')
# print(book)
# The first two books
# book = jsonpath.jsonpath(obj, '$..book[:2]')
# book = jsonpath.jsonpath(obj, '$..book[0,1]')
# print(book)
# Books that have an isbn
# isbn_list = jsonpath.jsonpath(obj, '$..book[?(@.isbn)]')
# print(isbn_list)
# Books priced above 20
price_list = jsonpath.jsonpath(obj, '$..book[?(@.price>20)]')
print(price_list)
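As a cross-check on the active query, the sketch below reproduces what a "price above 20" filter selects from the sample data using only the standard library, so it runs without the third-party jsonpath package. The inline data dict is a trimmed copy of the sample file (title, price, and isbn only), and books_over is a hypothetical helper name, not part of any library. It handles only this exact store/book layout, whereas the `..` operator in the real query searches at any depth.

```python
# Trimmed copy of the sample data: only the fields needed for the price filter.
data = {
    "store": {
        "book": [
            {"title": "坏蛋是怎样练成的", "price": 8.95},
            {"title": "斗破苍穹", "price": 12.99},
            {"title": "三体", "price": 29.99},
            {"title": "明朝那些事儿", "price": 39.95},
            {"title": "挪威的森林", "price": 19.99},
            {"title": "白夜行", "price": 25.99, "isbn": 11002},
            {"title": "大败局", "price": 36.99, "isbn": 11001},
        ]
    }
}


def books_over(store_data, limit):
    """Hypothetical helper: books under store.book whose price exceeds limit."""
    return [b for b in store_data["store"]["book"] if b.get("price", 0) > limit]


expensive = books_over(data, 20)
titles = [b["title"] for b in expensive]
print(titles)
```

Four of the seven sample books pass the filter; the other three are priced at 8.95, 12.99, and 19.99.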
JSONPath syntax
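JSONPATH | Description |
$ | Root object, e.g. $.name |
. or [ ] | Child node, e.g. $.name |
.. | Descendant access, e.g. $..name |
@ | The current object itself |
* | Wildcard, matches everything, e.g. $.leader.* |
['key0','key1'] | Access to multiple nodes, e.g. $['id','name'] |
[num] | Array index access; the index may be negative. E.g. $[0].leader.departments[-1].name |
[num0,num1,num2...] | Access to several array elements; indices may be negative; returns multiple elements of the array. E.g. $[0,3,-2,5] |
[start:end:step] | Array range access; bounds may be negative; step is the stride; returns multiple elements of the array. E.g. $[0:5:2] |
[?(key)] | Non-null attribute filter, e.g. $.departs[?(name)] selects the departs entries that have the given attribute |
[key > 123] | Numeric attribute comparison filter, e.g. $[id >= 123]; supports =, !=, >, >=, <, <= |
[key = '123'] | String attribute comparison filter, e.g. $[name = '123']; supports =, !=, >, >=, <, <= |
[key like 'aa%'] | String LIKE filter, e.g. $[name like 'sz%']; % is the only supported wildcard; not like is also supported |
[key rlike 'regexpr'] | String regex-match filter with the given regular expression, e.g. departs[name rlike 'aa(.)*']; uses JDK regex syntax; not rlike is also supported |
[key in ('v0', 'v1')] | IN filter for string and numeric types. E.g. $.departs[name in ('wenshao','Yako')] $.departs[id not in (101,102)] |
[key between 234 and 456] | BETWEEN filter for numeric types; selects a numeric range; not between is also supported. E.g. $.departs[id between 101 and 201] $.departs[id not between 101 and 201] |
length() or size() | Array length, e.g. $.values.size(). Supported types: java.util.Map, java.util.Collection, and arrays |
keySet() | Returns the keySet of a Map or the names of an object's non-null attributes, e.g. $.val.keySet(). Supported: Map and plain objects. Not supported: Collection and arrays (returns null) |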
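The filter rows above mention JDK regex syntax and java.util types, which suggests they describe a Java JSONPath dialect; whether the Python jsonpath package accepts the same filter spelling is not confirmed here. To show what some of these filters mean, the sketch below writes plain-Python analogues of a few rows. The departs records are hypothetical, invented only to resemble the examples in the table, and the sketch makes no claim about how any library evaluates them.

```python
# Hypothetical records shaped like the `departs` examples in the table.
departs = [
    {"id": 101, "name": "wenshao"},
    {"id": 102, "name": "Yako"},
    {"id": 150, "name": "szwang"},
    {"id": 201},  # no "name" attribute
]

# [?(name)]: keep objects whose `name` attribute is present and not None.
has_name = [d["id"] for d in departs if d.get("name") is not None]

# [id >= 123]: numeric comparison on an attribute.
id_at_least_123 = [d["id"] for d in departs if d["id"] >= 123]

# [name in ('wenshao','Yako')]: membership test.
name_in = [d["id"] for d in departs if d.get("name") in ("wenshao", "Yako")]

# [id not in (101,102)]: negated membership test.
id_not_in = [d["id"] for d in departs if d["id"] not in (101, 102)]

# [name like 'sz%']: `%` stands for any trailing text, so this is a prefix test.
name_like_sz = [d["id"] for d in departs if d.get("name", "").startswith("sz")]

print(has_name, id_at_least_123, name_in, id_not_in, name_like_sz)
```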
Comparison of JSONPath and XPath
XPath | JSONPath | Result |
/store/book/author | $.store.book[*].author | the authors of all books in the store |
//author | $..author | all authors |
/store/* | $.store.* | all things in store (in the sample data above, just the book array) |
/store//price | $.store..price | the price of everything in the store. |
//book[3] | $..book[2] | the third book |
//book[last()] | $..book[(@.length-1)] $..book[-1:] | the last book in order. |
//book[position()<3] | $..book[0,1] $..book[:2] | the first two books |
//book[isbn] | $..book[?(@.isbn)] | filter all books with isbn number |
//book[price<10] | $..book[?(@.price<10)] | filter all books cheaper than 10 |
//* | $..* | all Elements in XML document. All members of JSON structure. |
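Several rows of the comparison have close plain-Python counterparts, which can help when reading the expressions. The sketch below is a minimal illustration, not how any JSONPath library is implemented: descend is a hypothetical helper that mimics the `..` operator for a single key, and the three-book data dict is a shortened stand-in for the sample file.

```python
def descend(node, key):
    """Collect every value stored under `key` at any depth, like $..key."""
    found = []
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                found.append(v)
            found.extend(descend(v, key))
    elif isinstance(node, list):
        for item in node:
            found.extend(descend(item, key))
    return found


# Shortened stand-in for the sample data (three books only).
data = {
    "store": {
        "book": [
            {"author": "六道", "price": 8.95},
            {"author": "天蚕土豆", "price": 12.99},
            {"author": "刘慈欣", "price": 29.99},
        ]
    }
}

authors = descend(data, "author")               # like $..author
books = data["store"]["book"]
third = books[2]                                # like $..book[2] (0-based index)
last = books[-1:]                               # like $..book[-1:]
first_two = books[:2]                           # like $..book[:2]
cheap = [b for b in books if b["price"] < 10]   # like $..book[?(@.price<10)]

print(authors)
```

Note the off-by-one between the two notations: XPath positions start at 1, so //book[3] corresponds to index 2 on the JSONPath side.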