使用shell脚本解析json数据不像java或者python一样轻便,较为麻烦,还好有一个jq.其使用起来比较像grep ,awk,sed等命令,也是管道命令.此外jq也没有乱七八槽的依赖,只需要一个binary文件即可.其官网介绍见这里{:target=”_blank”}
官方帮助文档见这里
安装jq
在Ubuntu系统下:
apt install jq
使用
有json数据文件example.json,其内容如下:
{ "took" : 39, "timed_out" : false, "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 }, "hits" : { "total" : 325305, "max_score" : 0.0, "hits" : [ ] }, "aggregations" : { "group" : { "doc_count_error_upper_bound" : -1, "sum_other_doc_count" : 319035, "buckets" : [ { "key" : "http://www.csdn.net/", "doc_count" : 1334, "importTotalSuccess" : { "value" : 9580402.0 }, "importTotalTime" : { "value" : 2.473873E7 }, "importTotalAll" : { "value" : 1.2589526E7 }, "importTotalRepeat" : { "value" : 3006465.0 }, "importTotalSize" : { "value" : 5.40141167E8 } }, { "key" : "https://www.baidu.com/", "doc_count" : 1346, "importTotalSuccess" : { "value" : 8749457.0 }, "importTotalTime" : { "value" : 1.712813E7 }, "importTotalAll" : { "value" : 1.0137599E7 }, "importTotalRepeat" : { "value" : 1385769.0 }, "importTotalSize" : { "value" : 4.62986378E8 } }, { "key" : "http://www.cnblogs.com/", "doc_count" : 3590, "importTotalSuccess" : { "value" : 7106773.0 }, "importTotalTime" : { "value" : 2.9117391E7 }, "importTotalAll" : { "value" : 9490425.0 }, "importTotalRepeat" : { "value" : 2380536.0 }, "importTotalSize" : { "value" : 5.16549393E8 } } ] } } }
- 使用jq格式化json数据(也可以检验json数据是否规范):
jq '.' ./example.json
输出结果:
{
"took": 39,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 325305,
"max_score": 0,
"hits": []
},
"aggregations": {
"group": {
"doc_count_error_upper_bound": -1,
"sum_other_doc_count": 319035,
"buckets": [
{
"key": "http://www.csdn.net/",
"doc_count": 1334,
"importTotalSuccess": {
"value": 9580402 },
"importTotalTime": {
"value": 24738730 },
"importTotalAll": {
"value": 12589526 },
"importTotalRepeat": {
"value": 3006465 },
"importTotalSize": {
"value": 540141167 }
},
{
"key": "https://www.baidu.com/",
"doc_count": 1346,
"importTotalSuccess": {
"value": 8749457 },
"importTotalTime": {
"value": 17128130 },
"importTotalAll": {
"value": 10137599 },
"importTotalRepeat": {
"value": 1385769 },
"importTotalSize": {
"value": 462986378 }
},
{
"key": "http://www.cnblogs.com/",
"doc_count": 3590,
"importTotalSuccess": {
"value": 7106773 },
"importTotalTime": {
"value": 29117391 },
"importTotalAll": {
"value": 9490425 },
"importTotalRepeat": {
"value": 2380536 },
"importTotalSize": {
"value": 516549393 }
}
]
}
}
}
2.获取某键的值,以hits.total值为例:
jq '.hits.total' ./example.json
输出:
325305
如果元素不存在,返回null
3.解析数组,以aggregations.group.buckets为例:
jq '.aggregations.group.buckets[0]' ./example.json
输出:
{
"key": "http://www.csdn.net/",
"doc_count": 1334,
"importTotalSuccess": {
"value": 9580402
},
"importTotalTime": {
"value": 24738730
},
"importTotalAll": {
"value": 12589526
},
"importTotalRepeat": {
"value": 3006465
},
"importTotalSize": {
"value": 540141167
}
}
4.自定义输出格式,获取json中某些key的值,并组合为新的json串
首先以获取aggregations.group.buckets数组中的第一个元素的键key和doc_count的值并更改键名为A和B为例:
jq '.aggregations.group.buckets[0] | {A:.key,B:.doc_count}
输出:
{
"A": "http://www.csdn.net/",
"B": 1334
}
获取aggregations.group.buckets数组中所有元素的键key和doc_count的值:
jq '.aggregations.group.buckets[] | {A:.key,B:.doc_count}' ./example.json
输出:
{
"A": "http://www.csdn.net/",
"B": 1334
}
{
"A": "https://www.baidu.com/",
"B": 1346
}
{
"A": "http://www.cnblogs.com/",
"B": 3590
}
5.自定义输出多项为数组的形式,只要在上一步的命令两端加个中括号即可:
jq '[.aggregations.group.buckets[] | {A:.key,B:.doc_count}]' ./example.json
输出:
[
{
"A": "http://www.csdn.net/",
"B": 1334
},
{
"A": "https://www.baidu.com/",
"B": 1346
},
{
"A": "http://www.cnblogs.com/",
"B": 3590
}
]
6.默认情况下,jq规范地打印输出json,通过查阅jq帮助文档可知使用参数-c能够简洁的将每个json对象打印一行:
jq -c '[.aggregations.group.buckets[] | {A:.key,B:.doc_count}]' ./example.json
输出
[{"A":"http://www.csdn.net/","B":1334},{"A":"https://www.baidu.com/","B":1346},{"A":"http://www.cnblogs.com/","B":3590}]
7.提取某些项以不含键的数组形式输出:
jq -c '[.aggregations.group.buckets[] | .key,.importTotalSuccess.value,.importTotalTime.value,.importTotalAll.value,.importTotalRepeat.value,.importTotalSize.value]' ./example.json
输出:
["http://www.csdn.net/",9580402,24738730,12589526,3006465,540141167,"https://www.baidu.com/",8749457,17128130,10137599,1385769,462986378,"http://www.cnblogs.com/",7106773,29117391,9490425,2380536,516549393]
以每个aggregations.group.buckets数组中的元素作为一行打印此数组中的值:
jq -c '.aggregations.group.buckets[] | [.key,.importTotalSuccess.value,.importTotalTime.value,.importTotalAll.value,.importTotalRepeat.value,.importTotalSize.value]' ./example.json
输出:
["http://www.csdn.net/",9580402,24738730,12589526,3006465,540141167]
["https://www.baidu.com/",8749457,17128130,10137599,1385769,462986378]
["http://www.cnblogs.com/",7106773,29117391,9490425,2380536,516549393]
8.内建函数
jq包含一些内建函数:keys,has,’+’,’-‘,’*’,’/’,’%’,length,in等等见官方帮助文档.其中keys获取json中的key元素.has判断是否存在某key:
jq 'keys' ./example.json
输出:
[
"_shards",
"aggregations",
"hits",
"timed_out",
"took"
]
jq 'has("hits")' ./example.json
输出:
true
其他
注意区分json对象,json字符串以及json格式的数据之间的区别:
json对象:
{
a: 1,
b: {
c: "abc"
}
}
json字符串
'{"a":1,"b":{"c":"abc"}}'
正确的json格式:
{
"a": 1,
"b": {
"c": "abc"
}
}