Processing JSON Data with Python (Taxi Trajectory Data)

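This post describes how to process a text file that contains multiple JSON objects, each representing one taxi trip. First, inspect the JSON structure in Notepad++, then use Python to read the first characters of the file to understand the data format. The json and pandas libraries are then used to convert the data into DataFrames and write them to CSV files, one per line of input, for later geographic analysis and processing.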

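1. Start by observing the JSON data to find its pattern. A tool such as Notepad++ can be used to view the structure of the JSON data and understand how it is nested.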

[Figures: screenshots of the JSON data structure viewed in Notepad++]

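Note that the file may not be a single valid JSON document. For example, it may contain several JSON objects placed together, with each line holding one complete JSON object.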

For example:

{json1}

{json2}

……
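The one-object-per-line layout shown above is commonly called JSON Lines. The following is a minimal sketch of why such a file has to be parsed line by line: calling `json.loads` on the whole text fails, while parsing each line separately works. The sample text is made up for illustration, and `parse_json_lines` is a hypothetical helper name, not part of any library.

```python
import json

def parse_json_lines(text):
    """Parse text holding one JSON object per line; blank lines are skipped."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if line:
            records.append(json.loads(line))
    return records

# Made-up sample, not taken from the real file.
sample = '{"fid": "A", "pts": [{"1": 0}]}\n\n{"fid": "B", "pts": []}\n'

try:
    json.loads(sample)  # the whole text is not one valid JSON document
    whole_file_ok = True
except json.JSONDecodeError:
    whole_file_ok = False

records = parse_json_lines(sample)
print(whole_file_ok, len(records))
```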

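2. For a very large text file, we can first use Python to read the first 500 or 1,000 characters and see what the JSON contains.

with open("m.txt", "r") as fname:
    print(fname.read(1000))

Once we understand the structure, we can read the file line by line.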
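As a complement to reading raw characters, a short summary of the keys can be easier to scan. The sketch below parses only the first non-empty line and lists its top-level keys and the keys of its first point. `describe_first_record` is a hypothetical helper, and it assumes that the first line is a complete JSON object with a `pts` list. An in-memory string with a shortened, made-up record stands in for the real file.

```python
import io
import json

def describe_first_record(fileobj):
    """Return (top-level keys, keys of the first point) for the first non-empty line."""
    for line in fileobj:
        if line.strip():
            record = json.loads(line)
            points = record.get("pts", [])
            point_keys = list(points[0].keys()) if points else []
            return list(record.keys()), point_keys
    return [], []

# In-memory stand-in for open("m.txt"); the record is shortened and made up.
demo = io.StringIO('{"_id": "x", "fid": "y", "pts": [{"1": 0, "2": 39.9, "3": 116.6}]}\n')
top_keys, point_keys = describe_first_record(demo)
print(top_keys)
print(point_keys)
```

With a real file, the same function could be called on the handle returned by `open("m.txt")`. Only the first line is read, so the file size does not matter.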

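To process the JSON we need the json and pandas libraries.

Suppose our file is named m.txt.

Its content looks roughly like this (in the real file, each line is different):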

{ "_id" : "729c3318ad3b4d53b7a92a31010fd085" , "fid" : "BJJYJ_095994" , "y" : "2012" , "m" : "03" , "d" : "01" , "sp" : 728 , "sl" : 235650.0 , "st1" : 40534.0 , "st2" : 40534.0 , "bb" : "116.2164536,39.8669052,116.6208801,40.0528488" , "pts" : [ { "1" : 0 , "2" : 39.9216461 , "3" : 116.6059799 , "4" : 0.0 , "5" : 0.0 , "6" : "2012-03-01 06:28:28" , "7" : "2012-03-01 06:28:33" , "8" : "0" , "9" : 0.0 , "10" : 1} , { "1" : 1 , "2" : 39.9216461 , "3" : 116.6059799 , "4" : 0.06486486486486487 , "5" : 0.0 , "6" : "2012-03-01 06:29:36" , "7" : "2012-03-01 06:29:41" , "8" : "0" , "9" : 0.0 , "10" : 1} , { "1" : 2 , "2" : 39.9216499 , "3" : 116.6060028 , "4" : 1.894736842105263 , "5" : 0.0 , "6" : "2012-03-01 06:31:27" , "7" : "2012-03-01 06:31:34" , "8" : "0" , "9" : 0.0 , "10" : 1} , { "1" : 3 , "2" : 39.9216805 , "3" : 116.6056519 , "4" : 16.494545454545456 , "5" : 23.0 , "6" : "2012-03-01 06:32:24" , "7" : "2012-03-01 06:32:30" , "8" : "0" , "9" : 354.0 , "10" : 1} ]}

{ "_id" : "729c3318ad3b4d53b7a92a31010fd085" , "fid" : "BJJYJ_095994" , "y" : "2012" , "m" : "03" , "d" : "01" , "sp" : 728 , "sl" : 235650.0 , "st1" : 40534.0 , "st2" : 40534.0 , "bb" : "116.2164536,39.8669052,116.6208801,40.0528488" , "pts" : [ { "1" : 0 , "2" : 39.9216461 , "3" : 116.6059799 , "4" : 0.0 , "5" : 0.0 , "6" : "2012-03-01 06:28:28" , "7" : "2012-03-01 06:28:33" , "8" : "0" , "9" : 0.0 , "10" : 1} , { "1" : 1 , "2" : 39.9216461 , "3" : 116.6059799 , "4" : 0.06486486486486487 , "5" : 0.0 , "6" : "2012-03-01 06:29:36" , "7" : "2012-03-01 06:29:41" , "8" : "0" , "9" : 0.0 , "10" : 1} , { "1" : 2 , "2" : 39.9216499 , "3" : 116.6060028 , "4" : 1.894736842105263 , "5" : 0.0 , "6" : "2012-03-01 06:31:27" , "7" : "2012-03-01 06:31:34" , "8" : "0" , "9" : 0.0 , "10" : 1} , { "1" : 3 , "2" : 39.9216805 , "3" : 116.6056519 , "4" : 16.494545454545456 , "5" : 23.0 , "6" : "2012-03-01 06:32:24" , "7" : "2012-03-01 06:32:30" , "8" : "0" , "9" : 354.0 , "10" : 1} ]}

The processing code is then as follows:

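import json
import pandas as pd

j = 0  # index of the current trip, used as the CSV file name
with open(r'm.txt') as file:
    for line in file:
        if not line.strip():
            continue  # skip blank lines between records
        m = json.loads(line)  # one line = one JSON object (one trip)
        n = m["pts"]  # list of trajectory points for this trip
        data = pd.DataFrame(n)
        filename = "%d.csv" % j
        data.to_csv(filename)  # to_csv opens and closes the file itself
        j = j + 1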
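If a single combined file is more convenient than one CSV per trip, the same idea can be sketched with only the standard library. This is an illustrative variant, not the method used in this post: `write_points_csv` is a hypothetical helper, the `trip` column is an index added here, and the code assumes that every point key is an integer-like string such as "1" to "10", so that the columns can be sorted numerically.

```python
import csv
import io
import json

def write_points_csv(lines, out):
    """Write every point of every trip to one CSV; return the number of data rows."""
    writer = None
    trip_index = 0
    rows = 0
    for line in lines:
        if not line.strip():
            continue  # skip blank lines between records
        trip = json.loads(line)
        for point in trip.get("pts", []):
            if writer is None:
                # Assumption: point keys are integer-like strings ("1" ... "10").
                point_cols = sorted(point, key=int)
                writer = csv.DictWriter(out, fieldnames=["trip", "fid"] + point_cols,
                                        extrasaction="ignore")
                writer.writeheader()
            writer.writerow({"trip": trip_index, "fid": trip.get("fid"), **point})
            rows += 1
        trip_index += 1
    return rows

# Made-up, shortened records standing in for lines of m.txt.
demo_lines = [
    '{"fid": "T1", "pts": [{"1": 0, "2": 39.9, "10": 1}, {"1": 1, "2": 39.8, "10": 1}]}',
    '{"fid": "T2", "pts": [{"1": 0, "2": 40.0, "10": 1}]}',
]
buffer = io.StringIO()
rows_written = write_points_csv(demo_lines, buffer)
print(buffer.getvalue())
```

For a real run, `lines` could be the open file object and `out` a file opened with `newline=""`, as the csv module documentation recommends. The column set is taken from the first point, so keys that appear only in later points are ignored.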

The processing produces one CSV file per line of input, named 0.csv, 1.csv, and so on:

[Figure: screenshot of the generated CSV files]

The content of each file looks like this:

,1,10,2,3,4,5,6,7,8,9

0,0,1,39.9216461,116.6059799,0.0,0.0,2012-03-01 06:28:28,2012-03-01 06:28:33,0,0.0

1,1,1,39.9216461,116.6059799,0.06486486486486487,0.0,2012-03-01 06:29:36,2012-03-01 06:29:41,0,0.0

2,2,1,39.9216499,116.6060028,1.894736842105263,0.0,2012-03-01 06:31:27,2012-03-01 06:31:34,0,0.0

3,3,1,39.9216805,116.6056519,16.494545454545456,23.0,2012-03-01 06:32:24,2012-03-01 06:32:30,0,354.0

4,4,1,39.9234695,116.6038361,24.588679245283018,23.0,2012-03-01 06:33:19,2012-03-01 06:33:24,0,302.0

5,5,1,39.9262543,116.6060333,16.599999999999998,21.0,2012-03-01 06:34:12,2012-03-01 06:34:18,0,36.0

The JSON file has now been converted into usable CSV files, which can be used for later analysis and processing in ArcGIS.
