Extracting fields from a text file and converting them to JSON (notes)

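This is just a naive approach and may well take a long detour.
I am not yet very familiar with how JSON is used.
The contents of the text file look roughly like this: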

>>> f=open('222.txt','r')
>>> a=f.read()
>>> f.close()
>>> print a
appid:c50000100_h50001 flow:0.00243473 mo:CUCC ip:192.168.1.176
appid:c50000103_t50004 flow:207.359 mo:CUCC ip:192.168.1.119
appid:c50000100_t50011 flow:5.72205e-06 mo:CUCC ip:192.168.1.19
appid:c50000100_h50000 flow:0.104045 mo:CUCC ip:192.168.1.10

First regex attempt:

>>> import re
>>> b=re.compile(r"appid:(.*?)\s*flow:(.*?)\s*mo:(.*?)\s*ip:(.*?)\n").findall(a)
>>> print b
[('c50000100_h50001', '0.00243473', 'CUCC', '192.168.1.176'), ('c50000103_t50004', '207.359', 'CUCC', '192.168.1.119'), ('c50000100_t50011', '5.72205e-06', 'CUCC', '192.168.1.19')]
# The result is a list of tuples
>>> import json
>>> j=json.dumps(b)
>>> print j
[["c50000100_h50001", "0.00243473", "CUCC", "192.168.1.176"], ["c50000103_t50004", "207.359", "CUCC", "192.168.1.119"], ["c50000100_t50011", "5.72205e-06", "CUCC", "192.168.1.19"]]

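Problems:
1. The pattern ends with \n, so the last line, which apparently has no trailing newline, was not matched.
2. The JSON output is a list of lists (JSON calls them arrays), i.e. [[...]]; what I need is a list of dicts, i.e. [{...}]. A JSON object is the equivalent of a Python dict.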
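
The first problem can be reproduced without the file. The sketch below uses Python 3 syntax and a shortened two-line sample; the names sample, without_newline and with_newline are illustrative and not from the original session. It assumes, as the result above suggests, that the file's last line has no trailing newline.

```python
import re

# Same pattern as the first attempt: it requires a "\n" after the ip field.
pattern = re.compile(r"appid:(.*?)\s*flow:(.*?)\s*mo:(.*?)\s*ip:(.*?)\n")

# Illustrative two-line sample; the last line has no trailing newline.
sample = (
    "appid:c50000100_h50001 flow:0.00243473 mo:CUCC ip:192.168.1.176\n"
    "appid:c50000100_h50000 flow:0.104045 mo:CUCC ip:192.168.1.10"
)

without_newline = pattern.findall(sample)
with_newline = pattern.findall(sample + "\n")

print(len(without_newline))  # only the first line matches
print(len(with_newline))     # both lines match
```

Appending a newline to the text before matching would therefore also work, although anchoring on the shape of the last field is less fragile.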

Second regex attempt:

>>> b=re.compile(r"appid:(.*?)\s*flow:(.*?)\s*mo:(.*?)\s*ip:(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})").findall(a)
>>> print b
[('c50000100_h50001', '0.00243473', 'CUCC', '192.168.1.176'), ('c50000103_t50004', '207.359', 'CUCC', '192.168.1.119'), ('c50000100_t50011', '5.72205e-06', 'CUCC', '192.168.1.19'), ('c50000100_h50000', '0.104045', 'CUCC', '192.168.1.10')]
# Matching the trailing IP address directly captures every line

>>> d={}
>>> l=[]
>>> for i in range(len(b)):
    d['appid']=b[i][0]
    d['flow']=b[i][1]
    d['mo']=b[i][2]
    d['ip']=b[i][3]
    l.append(d)
    d={} # without this reset every entry ends up holding the last record; took me a long time to figure out, a bit embarrassing
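
The reset matters because l.append(d) stores a reference to the dict, not a copy. Without a fresh dict on each pass, every list entry is the same object and shows whatever was written last. A minimal Python 3 sketch with made-up values (rows, shared, aliased and separate are illustrative names):

```python
rows = [("a", "1"), ("b", "2")]  # made-up stand-ins for the regex tuples

# Without a fresh dict per iteration, every entry is the same object.
shared = {}
aliased = []
for appid, flow in rows:
    shared["appid"] = appid
    shared["flow"] = flow
    aliased.append(shared)

# Creating the dict inside the loop gives each row its own object.
separate = []
for appid, flow in rows:
    separate.append({"appid": appid, "flow": flow})

print(aliased)
print(separate)
```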

>>> print l
[{'ip': '192.168.1.176', 'mo': 'CUCC', 'flow': '0.00243473', 'appid': 'c50000100_h50001'}, {'ip': '192.168.1.119', 'mo': 'CUCC', 'flow': '207.359', 'appid': 'c50000103_t50004'}, {'ip': '192.168.1.19', 'mo': 'CUCC', 'flow': '5.72205e-06', 'appid': 'c50000100_t50011'}, {'ip': '192.168.1.10', 'mo': 'CUCC', 'flow': '0.104045', 'appid': 'c50000100_h50000'}]
# Now we have a list of dicts
>>> import json
>>> j=json.dumps(l)
>>> print j
[{"ip": "192.168.1.176", "mo": "CUCC", "flow": "0.00243473", "appid": "c50000100_h50001"}, {"ip": "192.168.1.119", "mo": "CUCC", "flow": "207.359", "appid": "c50000103_t50004"}, {"ip": "192.168.1.19", "mo": "CUCC", "flow": "5.72205e-06", "appid": "c50000100_t50011"}, {"ip": "192.168.1.10", "mo": "CUCC", "flow": "0.104045", "appid": "c50000100_h50000"}]
# Converting to JSON works fine too
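
As a possible shortcut, the regex and the dict-building loop can be merged by using named groups. This is a sketch in Python 3 syntax, not the method used above; the \S+ groups and the two-line text sample are assumptions chosen for illustration.

```python
import json
import re

# Named groups let each match report its fields by name.
pattern = re.compile(
    r"appid:(?P<appid>\S+)\s+flow:(?P<flow>\S+)\s+mo:(?P<mo>\S+)\s+ip:(?P<ip>\S+)"
)

# Illustrative sample in the same format as 222.txt.
text = (
    "appid:c50000100_h50001 flow:0.00243473 mo:CUCC ip:192.168.1.176\n"
    "appid:c50000100_h50000 flow:0.104045 mo:CUCC ip:192.168.1.10"
)

# groupdict() returns a new dict for every match, so no manual reset is needed.
records = [m.groupdict() for m in pattern.finditer(text)]
print(json.dumps(records))
```

Because each field is matched as a run of non-whitespace characters, the pattern does not depend on whether the last line ends with a newline.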

Appendix: a more off-the-wall way to write it, without regex:

>>> l=[]
>>> with open('222.txt') as f:   # open the file once; it is closed automatically
    for line in f:
        d={}                      # fresh dict for each line
        for pair in line.split(): # split on whitespace, which also drops the trailing newline
            key,value=pair.split(':')
            d[key]=value
        l.append(d)

>>> print l
[{'ip': '192.168.1.176', 'mo': 'CUCC', 'flow': '0.00243473', 'appid': 'c50000100_h50001'}, {'ip': '192.168.1.119', 'mo': 'CUCC', 'flow': '207.359', 'appid': 'c50000103_t50004'}, {'ip': '192.168.1.19', 'mo': 'CUCC', 'flow': '5.72205e-06', 'appid': 'c50000100_t50011'}, {'ip': '192.168.1.10', 'mo': 'CUCC', 'flow': '0.104045', 'appid': 'c50000100_h50000'}]
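
In all of the results above every value is a string, including flow. If whatever consumes the JSON expects a number there, one option is to convert while parsing. That requirement is an assumption, not something stated in these notes, and parse_line is a hypothetical helper written in Python 3 syntax.

```python
import json

def parse_line(line):
    """Split one 'key:value key:value ...' line into a dict."""
    record = dict(pair.split(":", 1) for pair in line.split())
    # Assumption: flow should be numeric in the JSON output.
    record["flow"] = float(record["flow"])
    return record

line = "appid:c50000100_t50011 flow:5.72205e-06 mo:CUCC ip:192.168.1.19"
record = parse_line(line)
print(json.dumps(record))
```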

Reposted from: https://www.cnblogs.com/guitar-tec/p/4762159.html
