Reading JSON data: ValueError: Extra data: line 77 column 2 - line 16485 column 1 (char 1159 - 227243)

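I spent a long time trying to parse a JSON file in Python before realizing that the problem was the line breaks. Another author's blog post helped me solve it.

Suppose you want to read a multi-line JSON file such as: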

{"pid": 150400, "id": 150402, "name": "电影票"}
{"pid": 150000, "id": 150500, "name": "票务"}
{"pid": 150500, "id": 150501, "name": "国内旅游"}
{"pid": 150500, "id": 150502, "name": "海外旅游"}

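If you read it directly like this:

import json

with open('test.json', 'r', encoding='utf-8') as file:
    res = file.read()
dic = json.loads(res)  # fails: the text holds more than one JSON document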

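an exception is raised: ValueError: Extra data: line 2 column 2 - line 4 column 2
(In Python 3 the exception is json.JSONDecodeError, which is a subclass of ValueError.)
It means the input contains extra data after the first document, here from line 2 through line 4.
json.loads can parse only one JSON document per call, so there are two ways to fix this:
1. Read the file one line at a time
2. When saving the data source, write it as a single object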
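To make the cause concrete, here is a minimal sketch that reproduces the error on an in-memory string and shows that each line parses on its own. The sample text is shortened from the file above, and the variable names (text, whole_ok, message, records) are only illustrative.

```python
import json

# Two JSON documents separated by a newline, like the sample file above.
text = '{"pid": 150400, "id": 150402}\n{"pid": 150000, "id": 150500}\n'

try:
    json.loads(text)           # the whole text is NOT one JSON document
    whole_ok = True
    message = ''
except ValueError as exc:      # json.JSONDecodeError subclasses ValueError in Python 3
    whole_ok = False
    message = str(exc)

# Each non-empty line is a complete JSON document and parses fine by itself.
records = [json.loads(line) for line in text.splitlines() if line.strip()]

print(whole_ok, message)
print(records)
```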

Code:

Method 1. Read the file one line at a time

import json

with open('test.json', 'r', encoding='utf-8') as file:
    for line in file:
        if line.strip():  # skip blank lines
            dic = json.loads(line)  # one dict per line
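The loop above keeps only the last parsed object in dic. If you need all of them, one possible approach is to collect the records in a list and report which line failed to parse. The helper name load_json_lines is hypothetical, and the temporary file exists only so that the sketch runs on its own.

```python
import json
import os
import tempfile


def load_json_lines(path):
    """Hypothetical helper: parse a file holding one JSON object per line."""
    records = []
    with open(path, 'r', encoding='utf-8') as f:
        for lineno, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            try:
                records.append(json.loads(line))
            except ValueError as exc:
                # Say which line is broken instead of a bare "Extra data".
                raise ValueError('line %d: %s' % (lineno, exc))
    return records


# Demo with a throwaway file standing in for test.json.
fd, demo_path = tempfile.mkstemp(suffix='.json')
with os.fdopen(fd, 'w', encoding='utf-8') as f:
    f.write('{"pid": 150400, "id": 150402}\n\n{"pid": 150000, "id": 150500}\n')

records = load_json_lines(demo_path)
os.remove(demo_path)
print(records)
```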

Method 2. When saving the data source, write it as a single object

{"cates":[
{"pid": 150400, "id": 150402, "name": "电影票"},
{"pid": 150000, "id": 150500, "name": "票务"},
{"pid": 150500, "id": 150501, "name": "国内旅游"},
{"pid": 150500, "id": 150502, "name": "海外旅游"}
]}

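The file can then be handled as a single document:

import json

with open('test.json', 'r', encoding='utf-8') as file:
    res = file.read()
dic = json.loads(res)  # dic['cates'] is the list of objects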
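If the data already exists in the one-object-per-line form, it can be converted to the single-object form before saving. The following is a sketch under that assumption; it reuses the "cates" key from the example above, works on an in-memory string rather than a real file, and its variable names are illustrative.

```python
import json

# Line-delimited input, shortened from the sample data.
lines_text = '{"pid": 150400, "id": 150402}\n{"pid": 150000, "id": 150500}\n'

# Wrap the individual objects in one top-level object under "cates".
wrapped = {"cates": [json.loads(l) for l in lines_text.splitlines() if l.strip()]}

# ensure_ascii=False keeps non-ASCII text (e.g. Chinese names) readable.
single_doc = json.dumps(wrapped, ensure_ascii=False)

# The result is one JSON document, so a single json.loads call accepts it.
dic = json.loads(single_doc)
ids = [item["id"] for item in dic["cates"]]
print(ids)
```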

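My JSON file is line-delimited, with one object per line, so I used the first method and read the data line by line. Also pay attention to the types of the values you read: an int has to be converted with str(int_data), and a list can be turned into a str with ' '.join(list_data) or ','.join(list_data) (the elements of the list must themselves be strings).

In other words, the elements of the list are joined into a single str, separated by spaces or commas.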
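A short sketch of these conversions, using the port and tags fields that appear in the sample records below. The record here is a hand-written fragment, and the mixed list is an assumption added only to show what to do when a list holds non-string elements.

```python
# Hand-written fragment with fields taken from the sample records.
record = {"port": 1604, "tags": ["malware"], "hostnames": []}

port_text = str(record["port"])            # int -> str
tags_text = ','.join(record["tags"])       # list of str -> comma-separated str
hosts_text = ' '.join(record["hostnames"]) # empty list -> empty string

# join() raises TypeError on non-string elements, so convert them first.
mixed_text = ' '.join(str(x) for x in [1604, "tcp", None])

print(port_text, tags_text, repr(hosts_text), mixed_text)
```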

Finally, here is the code.

The JSON file:

{"_shodan": {"options": {}, "id": null, "module": "nodata-tcp", "crawler": "70752434fdf0dcec35df6ae02b9703eaae035f7d"}, "product": "DarkComet trojan", "hash": 813106007, "os": null, "tags": ["malware"], "opts": {}, "ip": 782181893, "isp": "Rostelecom", "cpe": ["cpe:/o:microsoft:windows"], "port": 1604, "hostnames": [], "location": {"city": "Krasnodar", "region_code": "38", "area_code": null, "longitude": 38.976, "country_code3": "RUS", "country_name": "Russian Federation", "postal_code": "350000", "dma_code": null, "country_code": "RU", "latitude": 45.04480000000001}, "timestamp": "2018-07-07T11:09:36.775052", "domains": [], "org": "Rostelecom", "data": "BF7CAB464EFB", "asn": "AS12389", "transport": "tcp", "ip_str": "46.159.38.5"}
{"_shodan": {"options": {}, "id": null, "module": "nodata-tcp", "crawler": "63eb28fc97905ae9bb3bb2d833157792bedfef99"}, "product": "DarkComet trojan", "hash": 813106007, "os": null, "tags": ["malware"], "opts": {}, "ip": 2591821768, "isp": "SONATEL Societe Nationale Des Telecommunications D", "cpe": ["cpe:/o:microsoft:windows"], "port": 1604, "hostnames": [], "location": {"city": "Dakar", "region_code": "01", "area_code": null, "longitude": -17.43809999999999, "country_code3": "SEN", "country_name": "Senegal", "postal_code": null, "dma_code": null, "country_code": "SN", "latitude": 14.670800000000014}, "timestamp": "2018-06-09T16:25:56.327716", "domains": [], "org": "SONATEL Societe Nationale Des Telecommunications D", "data": "BF7CAB464EFB", "asn": "AS8346", "transport": "tcp", "ip_str": "154.124.15.200"}
{"_shodan": {"options": {}, "id": "29eeb4bf-aae6-4677-82c2-75f714b931e2", "module": "zeroaccess", "crawler": "c9b639b99e5410a46f656e1508a68f1e6e5d6f99"}, "product": "ZeroAccess trojan", "hash": -1656209872, "os": null, "tags": ["malware"], "opts": {"raw": "f52b12fd28948dbec9c0d1998381a33398c4973344068ecec63b0d7c2019383a5880055dec64e0e82c19e42a139381a3cf1985f4ea4c068e4941b3551d331938ce3cac5fc3cc64e0f5765bb78f3393810114bfd8bece4c066f8a95e8093a33193d32194bd2e8cc64573c9ecdb4a333934db4739c3f8ece4c590d399623383a3306d9c9485fe0e8ccb0406f7daf81a333200347674d068ece7e3acfa2331a383a1c31fb47651226519a48880e088caf2e407dc9c0088c7c449daaaf224e8003f1d94c60a9a8f0ef8cc1452638b17aa356ef207dd018260e8fa23eb00ae66b3a3040c9b7196705433d31d3c2520d9bbe0b2dece69884356c17f781fb7436301a2610283289c2ad5ae1562aa3dd6575ce9dc17ded6a0679419f481b3073fa0f1bb664327074ff5d33d799cac0d143a5ef842d44e7910d397aed2114c8be409d52bec6a367f71e3ce4a01785fb53632b5818698659d852082268c75c960571b60e470ebbe9ac7d3f4fbb6d44d937ed4ba9d2bfba485209cc96308393eca70ee933bfe12ff8295b691f2dbf994179ec15a400f9fd696c8d8264d6adfe4aff907e202b8d4e2a7bca6495a3e62fcb41339381236510d078ce42068ee9c27820e1cf85433d603005a7ba622a0693139806d389ecde4ef8681d467b93c9cec3a0e4ea43b121b6346c83b97063962d271d793f38fcc4d51f41b3e4de675745e25c8460e0d45c23d8a30e85ef501063312ae2645034714771479f4655df785dbf487fabc1ce6d039c6b477a6d382e3528d79f2f0ac5cecbf59b1c35928d9c990c9cf304ca0774683270755784dbf7cd392ae50112946d568b75f5c899a169fb0d191f7a3fa7eea8d39813b0a58bbd5a2fb3c166a70c0cf99c8b4b24fc0913451cd5f7aa7f7ed6e2d5c6bee02b496d09255c000e3d8e65be454d7eb80d6e55c7a3a9bf02ca446db3a6cea15d41a8dc502f3b25df8bfa2cb3ad585a4fb1e9744b17d02fedaadcc6aede35dacc646021f7fc83a3cf9381a400f1a28ca5876d941678804337f74b3fd7cee0b135dd8b6228fd946ac55d6b851793f3fd0721ac9a93289e5173a65ba2cb00a71472d0f9f3801f6dc36c687489071eeb9bcfa32def1d3f9f76c7dbd2225721b8f43c4c64806261bde766c6d3ddc6989551fed8981519f47e404170187b0c77d1cda2f5265e174df42efa2e4b8c672683b3f89f391c93990c1ac8e3a0718767a5d2f25e9
b493baf15bb252d095b30cf552d3dfd1af5ba555d5358130fd52e62655fda9687c1c2178aa98e4ca76fe32bfb4ee255e0977d765bf3f1a1b38a586bec6a961a12be23f4a70df00338c778dc0539df841d36cdeb04e48b0f75532a9f5750a329df86362bca95ac35c2ae0a19488d4cdaea185d4a1d"}, "ip": 3200770856, "isp": "Cantv", "port": 16464, "hostnames": ["190-199-227-40.dyn.dsl.cantv.net"], "location": {"city": "Aguirre", "region_code": "07", "area_code": null, "longitude": -68.2658, "country_code3": "VEN", "country_name": "Venezuela", "postal_code": null, "dma_code": null, "country_code": "VE", "latitude": 10.243599999999986}, "timestamp": "2018-06-09T15:10:54.764284", "domains": ["cantv.net"], "org": "Cantv", "data": "\\xf5+\\x12\\xfd(\\x94\\x8d\\xbe\\xc9\\xc0\\xd1\\x99\\x83\\x81\\xa33\\x98\\xc4\\x973D\\x06\\x8e\\xce\\xc6;\\r| \\x198:X\\x80\\x05]\\xecd\\xe0\\xe8,\\x19\\xe4*\\x13\\x93\\x81\\xa3\\xcf\\x19\\x85\\xf4\\xeaL\\x06\\x8eIA\\xb3U\\x1d3\\x198\\xce<\\xac_\\xc3\\xccd\\xe0\\xf5v[\\xb7\\x8f3\\x93\\x81\\x01\\x14\\xbf\\xd8\\xbe\\xceL\\x06o\\x8a\\x95\\xe8\\t:3\\x19=2\\x19K\\xd2\\xe8\\xccdW<\\x9e\\xcd\\xb4\\xa33\\x93M\\xb4s\\x9c?\\x8e\\xceLY\\r9\\x96#8:3\\x06\\xd9\\xc9H_\\xe0\\xe8\\xcc\\xb0@o}\\xaf\\x81\\xa33 \\x03GgM\\x06\\x8e\\xce~:\\xcf\\xa23\\x1a8:\\x1c1\\xfbGe\\x12&Q\\x9aH\\x88\\x0e\\x08\\x8c\\xaf.@}\\xc9\\xc0\\x08\\x8c|D\\x9d\\xaa\\xaf\"N\\x80\\x03\\xf1\\xd9L`\\xa9\\xa8\\xf0\\xef\\x8c\\xc1E&8\\xb1z\\xa3V\\xef 
}\\xd0\\x18&\\x0e\\x8f\\xa2>\\xb0\\n\\xe6k:0@\\xc9\\xb7\\x19g\\x05C=1\\xd3\\xc2R\\r\\x9b\\xbe\\x0b-\\xec\\xe6\\x98\\x845l\\x17\\xf7\\x81\\xfbt60\\x1a&\\x10(2\\x89\\xc2\\xadZ\\xe1V*\\xa3\\xddeu\\xce\\x9d\\xc1}\\xedj\\x06yA\\x9fH\\x1b0s\\xfa\\x0f\\x1b\\xb6d2pt\\xff]3\\xd7\\x99\\xca\\xc0\\xd1C\\xa5\\xef\\x84-D\\xe7\\x91\\r9z\\xed!\\x14\\xc8\\xbe@\\x9dR\\xbe\\xc6\\xa3g\\xf7\\x1e<\\xe4\\xa0\\x17\\x85\\xfbSc+X\\x18i\\x86Y\\xd8R\\x08\"h\\xc7\\\\\\x96\\x05q\\xb6\\x0eG\\x0e\\xbb\\xe9\\xac}?O\\xbbmD\\xd97\\xedK\\xa9\\xd2\\xbf\\xbaHR\\t\\xcc\\x960\\x83\\x93\\xec\\xa7\\x0e\\xe93\\xbf\\xe1/\\xf8)[i\\x1f-\\xbf\\x99Ay\\xec\\x15\\xa4\\x00\\xf9\\xfdil\\x8d\\x82d\\xd6\\xad\\xfeJ\\xff\\x90~ +\\x8dN*{\\xcad\\x95\\xa3\\xe6/\\xcbA3\\x93\\x81#e\\x10\\xd0x\\xceB\\x06\\x8e\\xe9\\xc2x \\xe1\\xcf\\x85C=`0\\x05\\xa7\\xbab*\\x06\\x93\\x13\\x98\\x06\\xd3\\x89\\xec\\xdeN\\xf8h\\x1dF{\\x93\\xc9\\xce\\xc3\\xa0\\xe4\\xeaC\\xb1!\\xb64l\\x83\\xb9pc\\x96-\\'\\x1dy?8\\xfc\\xc4\\xd5\\x1fA\\xb3\\xe4\\xdegWE\\xe2\\\\\\x84`\\xe0\\xd4\\\\#\\xd8\\xa3\\x0e\\x85\\xefP\\x10c1*\\xe2dP4qGqG\\x9fFU\\xdfx]\\xbfH\\x7f\\xab