Parse JSON with Python

Original post, 2012-03-25 01:56:34


Parsing a log file that contains JSON
Python 2.6 and later ship with a built-in json module (derived from simplejson). On Python 2.5 you have to install simplejson manually.

First, of course,

import json

1. Read the file:

f = open('json.log', 'r')
data = f.read()
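The snippet above never closes the file. A minimal sketch of the same read with a with-block follows, written for Python 3. The temporary file and its contents are made up here only so that the example runs on its own.

```python
import os
import tempfile

# Create a small stand-in log file (hypothetical content) so the sketch is self-contained.
log_path = os.path.join(tempfile.mkdtemp(), "json.log")
with open(log_path, "w") as out:
    out.write('some header text {"Mapper": [], "Reducer": {}}')

# The with-block closes the file automatically, even if read() raises.
with open(log_path, "r") as f:
    data = f.read()

print(data)
```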

2. Load it as JSON

Trim any non-JSON text at the beginning of the file:

jdata = json.loads(data[data.find('{') : ])
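One thing to watch for: str.find returns -1 when there is no '{' at all, and data[-1:] is then just the last character of the file, so json.loads would be handed only that character. A small sketch with an explicit check, written for Python 3; the log line and its key names are invented for illustration.

```python
import json

# Hypothetical log content: a non-JSON prefix followed by one JSON object.
data = ('2012-03-25 01:56:34 INFO job finished '
        '{"Mapper": [{"records": 10}], "Reducer": {"records": 3}}')

start = data.find('{')
if start == -1:
    # Fail with a clear message instead of parsing data[-1:].
    raise ValueError("no JSON object found in the log")

jdata = json.loads(data[start:])
print(sorted(jdata))
```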

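If the whole file is a single JSON structure, you can also load it directly in either of the following ways:

jdata = json.load(f)
# or, as an alternative (use only one of the two, since each reads the file to its end):
jdata = json.JSONDecoder().decode(f.read())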
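The two calls are alternatives rather than consecutive steps, because each one reads the file to its end. The sketch below illustrates this in Python 3, using io.StringIO as a stand-in for the opened file; the JSON text is made up.

```python
import io
import json

text = '{"Mapper": [], "Reducer": {}}'  # made-up content

# Option 1: json.load reads and parses a file-like object in one call.
f = io.StringIO(text)
from_load = json.load(f)
leftover = f.read()  # the stream is already consumed, so nothing is left

# Option 2: read the text yourself and decode the resulting string.
f = io.StringIO(text)
from_decoder = json.JSONDecoder().decode(f.read())

print(from_load == from_decoder)
```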

3. Parse the JSON

It contains a nested JSON value named Mapper, so fetch it:

mapper = jdata["Mapper"]

It may also contain a nested JSON object named Reducer; fetch it if it is present:

reducer = {}
if "Reducer" in jdata :
    reducer = jdata["Reducer"]
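The "check, then index" pattern for an optional key can also be written with dict.get, which returns a default when the key is missing. A short sketch in Python 3; the sample dictionaries are invented.

```python
# Hypothetical parsed logs: one without a Reducer section, one with it.
without_reducer = {"Mapper": [{"records": 10}]}
with_reducer = {"Mapper": [{"records": 10}], "Reducer": {"records": 3}}

# dict.get returns the second argument when the key is absent.
reducer_a = without_reducer.get("Reducer", {})
reducer_b = with_reducer.get("Reducer", {})

print(reducer_a, reducer_b)
```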

There are other JSON objects at the top level as well; iterate over them:

for k, v in jdata.iteritems() :
    print k

The mapper, which is a list, has several elements that are themselves JSON objects:

for elem in mapper :
    for k, v in elem.iteritems() :
        print k, v
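The loops above use Python 2 syntax. On Python 3, dict.iteritems no longer exists and print is a function, so a rough equivalent would look like the sketch below. The data and its key names are made up for illustration.

```python
# Hypothetical parsed log with a list of mapper entries.
jdata = {
    "Mapper": [{"input_records": 10}, {"input_records": 7, "spilled": 0}],
    "Reducer": {"output_records": 3},
}

# Iterating a dict yields its keys.
top_level_keys = list(jdata)
for key in top_level_keys:
    print(key)

# Each element of the Mapper list is itself a dict; items() replaces iteritems().
pairs = []
for elem in jdata["Mapper"]:
    for k, v in elem.items():
        pairs.append((k, v))
        print(k, v)
```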

Practice 1

Cangjie is a structure type defined in our project. A Cangjie definition file contains many JSON structures.

The file looks like this:

{
 "namespace" : "project::component",
 "dependence" : ["../include/log_cj.cj", "base_cj.cj"]
}

[[ // enum 
"LogType",
["CREATE",     0],     
["DELETE",         1]           
]]

[[ // enum 
"DetectionType",
["32BIT_PER_CHUNK",        0],     
["32BIT_PER_BLOCK_CRC32C", 1],
["32BIT_PER_BLOCK_CRC32C_V2", 2]
]]

["LogCJ",
    {
      "base": "project::component::Message"
    },
    ["required", "uint32", "Type", 100],
    ["required", "IdTypeCJ", "Id", 101],
    ["required", "vuint64", "Version", 102],
    ["optional", "uint32", "Checksum", 103],
    ["optional", "uint64", "Length", 104, 0],
    ["optional", "uint32", "DiskId", 105, 0],
    ["optional", "uint32", "DetectionType", 106, 0],
    ["optional", "string", "Checksum", 107, ""],
    ["optional", "uint64", "ChecksumLength", 108, 0],
    ["optional", "uint64", "Timestamps", 109, 0],
    ["optional", "int32",  "Status", 110, 0],
    ["optional", "bool",  "IsDeleted", 111, false] // some comments
]


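I needed a tool to parse it. The key is how to split the individual JSON structures apart.

Once comments and spaces are stripped, a closing bracket immediately followed by an opening bracket ('}{', '}[', '][' or ']{') marks the boundary between two JSON structures, provided such a pair never appears inside a string value. So use a regular expression to find the boundaries, as follows:

    import json
    import re

    buf = ''
    fd = open(cangjie_file) # cangjie_file: path to the definition file
    for line in fd.readlines() :
        pos = line.find('//') # strip a trailing // comment (assumes '//' never appears inside a string)
        if pos != -1 : # find() returns -1 when there is no comment; slicing with -1 would cut the last character
            line = line[:pos]
        line = line.strip().replace(' ', '')
        if len(line) == 0 :
            continue
        buf += line
    fd.close()
    pattern = re.compile(r"\]\[|\}\[|\]\{|\}\{") # separators between adjacent JSON structures
    beg = 0
    for match in pattern.finditer(buf):
        #print match.group(), match.start(), match.end(), match.span()
        print json.loads(buf[beg : match.start() + 1]) # one complete JSON structure
        beg = match.end() - 1
    if beg < len(buf) :
        print json.loads(buf[beg:]) # the last structure has no separator after it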
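A different way to split the structures, not the one used above, is to let the json module report where each value ends: json.JSONDecoder.raw_decode parses one value starting at a given index and returns the value together with the index just past it. The sketch below is written for Python 3 under the same assumption that '//' never appears inside a string. The helper names strip_line_comments and split_json_documents are invented for this example, and the sample is a shortened copy of the file shown earlier.

```python
import json

def strip_line_comments(text):
    """Drop // comments line by line (assumes '//' never occurs inside a string)."""
    lines = []
    for line in text.splitlines():
        pos = line.find('//')
        if pos != -1:
            line = line[:pos]
        lines.append(line)
    return "\n".join(lines)

def split_json_documents(text):
    """Parse consecutive JSON values from text and return them as a list."""
    decoder = json.JSONDecoder()
    docs = []
    pos = 0
    while True:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        docs.append(obj)
    return docs

sample = '''
{
 "namespace" : "project::component",
 "dependence" : ["../include/log_cj.cj", "base_cj.cj"]
}

[[ // enum
"LogType",
["CREATE", 0],
["DELETE", 1]
]]

["LogCJ",
    {"base": "project::component::Message"},
    ["required", "uint32", "Type", 100],
    ["optional", "bool", "IsDeleted", 111, false] // some comments
]
'''

docs = split_json_documents(strip_line_comments(sample))
for doc in docs:
    print(doc)
```

Because the decoder itself finds the end of each value, this variant does not need to remove spaces or guess at separator strings, and spaces inside string values are preserved.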



TO BE CONTINUED




