Using Python to convert formatted text into a CSV file for data analysis in Excel

Suppose we have a text data file like this: build_log.txt

time:     20180417 05:15:55
version:     1.0.266.0
server:     Prd
preview:     PreviewLianXiang
platform:     android
channel:     lianxiang
======================================================
time:     20180417 04:57:15
version:     1.0.266.0
server:     Prd
preview:     PreviewCulture
platform:     android
channel:     wc
======================================================
time:     20180417 01:52:21
version:     1.0.265.0
server:     Prd
preview:     PreviewYouPinWei
platform:     android
channel:     youpinwei
======================================================
time:     20180417 02:00:47
version:     1.0.265.0
server:     Prd
preview:     PreviewZhaomi360
platform:     android
channel:     zhaomi360
======================================================
time:     20180417 01:56:12
version:     1.0.265.0
server:     Prd
preview:     PreviewCulture
platform:     android
channel:     wc
======================================================
time:     20180417 01:58:46
version:     1.0.265.0
server:     Prd
preview:     PreviewMaoZhua
platform:     android
channel:     maozhua
======================================================
time:     20180417 02:05:58
version:     1.0.265.0
server:     Prd
preview:     PreviewLianXiang
platform:     android
channel:     lianxiang
======================================================
time:     20180417 02:02:38
version:     1.0.265.0
server:     Prd
preview:     PreviewHaoYouKuaiBao
platform:     android
channel:     haoyoukuaibao
======================================================
time:     20180417 11:45:36
version:     1.0.264.0
server:     Dev
preview:     None
platform:     android
channel:     lianxiang
======================================================
time:     20180417 10:56:04
version:     1.0.263.0
server:     Dev
preview:     None
platform:     android
channel:     lianxiang
======================================================
time:     20180416 02:44:15
version:     1.0.262.0
server:     Prd
preview:     None
platform:     android
channel:     zhaomi360
======================================================
time:     20180416 02:42:40
version:     1.0.262.0
server:     Prd
preview:     None
platform:     android
channel:     youpinwei

We want to convert it into a file that Excel can open for data analysis, like this: build_log.csv
time,version,server,preview,platform,channel
20180417 05:15:55,1.0.266.0,Prd,PreviewLianXiang,android,lianxiang
20180417 04:57:15,1.0.266.0,Prd,PreviewCulture,android,wc
20180417 01:52:21,1.0.265.0,Prd,PreviewYouPinWei,android,youpinwei
20180417 02:00:47,1.0.265.0,Prd,PreviewZhaomi360,android,zhaomi360
20180417 01:56:12,1.0.265.0,Prd,PreviewCulture,android,wc
20180417 01:58:46,1.0.265.0,Prd,PreviewMaoZhua,android,maozhua
20180417 02:05:58,1.0.265.0,Prd,PreviewLianXiang,android,lianxiang
20180417 02:02:38,1.0.265.0,Prd,PreviewHaoYouKuaiBao,android,haoyoukuaibao
20180417 11:45:36,1.0.264.0,Dev,None,android,lianxiang
20180417 10:56:04,1.0.263.0,Dev,None,android,lianxiang
20180416 02:44:15,1.0.262.0,Prd,None,android,zhaomi360
20180416 02:42:40,1.0.262.0,Prd,None,android,youpinwei
Opened in Excel, each field of a record lands in its own column.



The Python conversion script is as follows:

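import re

def read_file(f_name):
    # Read the whole log as text; the with-block closes the file even on errors.
    # UTF-8 is assumed here (the sample log is plain ASCII).
    with open(f_name,'r',encoding='utf-8') as f:
        return f.read()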

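def convert_txt_to_dic(txt):
    # Records are separated by a line of '=' characters.
    ls=re.split(r'==+',txt)
    res=[]
    for item in ls:
        item=item.strip()
        if not item:
            # Skip empty blocks, e.g. after a trailing separator line.
            continue
        item_map={}
        for line in item.splitlines():
            # Split on the first ':' only, since the time value contains colons.
            index=line.find(":")
            if index < 0:
                # Ignore lines that are not "key: value" pairs.
                continue
            key=line[:index].strip()
            value=line[index+1:].strip()
            item_map[key]=value
        res.append(item_map)

    return res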

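def format_csv_item(txt):
    # Quote a field that contains a comma, double quote, or line break,
    # and double any embedded double quotes.
    if any(c in txt for c in ',"\r\n'):
        return '"%s"' % txt.replace('"','""')
    return txt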
        
        
def convert_dic_to_csv(res_ls,head):
    ls=[]
    head_keys=head.split(',')
    # The header line comes first.
    ls.append(head)
    for item in res_ls:
        # Using map: look up each header key; missing keys become empty fields.
        record = map(lambda k:format_csv_item(item.get(k) or ""),head_keys)
        # Equivalent list comprehension:
        #record=[format_csv_item(item.get(k) or "") for k in head_keys]
        ls.append(",".join(record))

    return "\n".join(ls)
        
def save_file(f_name,txt):
    # The with-block flushes and closes the file automatically.
    with open(f_name,'w',encoding='utf-8') as f:
        f.write(txt)

    

txt=read_file("./build_log.txt")
res=convert_txt_to_dic(txt)
csv=convert_dic_to_csv(res,"time,version,server,preview,platform,channel")
save_file('build_log.csv',csv)
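Python's standard library also ships a csv module that takes care of quoting and escaping. The following is a minimal sketch of that alternative, assuming records in the same list-of-dicts shape that convert_txt_to_dic returns. The name write_records and the second sample record (with a comma and quotes in its preview value, and no channel) are made up for illustration and are not part of the original script.

```python
import csv
import io

FIELDS = ["time", "version", "server", "preview", "platform", "channel"]

def write_records(records, out, fields=FIELDS):
    """Write a list of dicts to the file-like object `out` as CSV, header first."""
    # restval fills in missing keys; extrasaction="ignore" drops unknown keys.
    writer = csv.DictWriter(out, fieldnames=fields, restval="",
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)

sample = [
    {"time": "20180417 05:15:55", "version": "1.0.266.0", "server": "Prd",
     "preview": "PreviewLianXiang", "platform": "android", "channel": "lianxiang"},
    # Hypothetical record: a comma and quotes in one value, and no channel key.
    {"time": "20180417 04:57:15", "version": "1.0.266.0", "server": "Prd",
     "preview": 'Preview "A, B"', "platform": "android"},
]

buf = io.StringIO()
write_records(sample, buf)
csv_output = buf.getvalue()
print(csv_output)
```

To write to disk instead of an in-memory buffer, the csv documentation recommends opening the file with newline="", for example open("build_log.csv", "w", newline="", encoding="utf-8"), and passing that file object as `out`.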


    
    

Background:

CSV

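Comma-separated values (CSV, sometimes also called character-separated values, because the separator need not be a comma) is a file format that stores tabular data (numbers and text) as plain text. Plain text means the file is a sequence of characters and contains no data that has to be interpreted as binary numbers. A CSV file consists of any number of records separated by line breaks of some kind; each record consists of fields separated by some other character or string, most commonly a comma or a tab. Usually, all records have exactly the same sequence of fields.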
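To illustrate that the separator does not have to be a comma, here is a small sketch using the delimiter parameter of Python's standard csv.writer. The helper name dump and the shortened three-column rows are assumptions made for this example.

```python
import csv
import io

rows = [
    ["time", "version", "channel"],
    ["20180417 05:15:55", "1.0.266.0", "lianxiang"],
]

def dump(rows, delimiter):
    """Serialize rows to a string using the given one-character delimiter."""
    buf = io.StringIO()
    csv.writer(buf, delimiter=delimiter, lineterminator="\n").writerows(rows)
    return buf.getvalue()

comma_text = dump(rows, ",")   # comma-separated
tab_text = dump(rows, "\t")    # tab-separated
print(comma_text)
print(tab_text)
```

The same records come out in both cases; only the character between the fields changes.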

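Rules

1 No blank space at the start; the file is organized line by line.
2 Column names are optional; if present, they go on the first line of the file.
3 One record per line, with no blank lines (a quoted field may still contain a line break, as the example further down shows).
4 The separator is the half-width comma (,); an empty column must still be represented.
5 If a field contains a half-width double quote ("), escape it by doubling it (""), and enclose the whole field value in half-width double quotes.
6 The quote and comma handling when reading a file is the inverse of the handling when writing it.
7 The character encoding is not restricted; it can be ASCII, Unicode, or something else.
8 Special characters are not supported.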
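Rules 4 to 6 can be sketched in a few lines. The function quote_field below is a hypothetical helper written for this illustration: it quotes a field only when needed and doubles embedded quotes. The standard csv.reader is then used to read the line back and confirm that reading undoes what writing did.

```python
import csv
import io

def quote_field(value):
    """Quote a field only if it contains a comma, double quote, or line break."""
    if any(ch in value for ch in ',"\r\n'):
        # Double every embedded quote, then wrap the field in quotes.
        return '"' + value.replace('"', '""') + '"'
    return value

def read_line(line):
    """Parse one CSV line back into a list of fields with the csv module."""
    return next(csv.reader(io.StringIO(line)))

# The last field is empty on purpose: an empty column still gets its comma.
fields = ["E350", "ac, abs, moon", 'Venture "Extended Edition"', ""]
line = ",".join(quote_field(f) for f in fields)
print(line)

round_trip = read_line(line)
print(round_trip)
```

Reading the written line returns the original list of fields, which is what rule 6 means by the two operations being inverses.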

Example

Year  Make   Model                                   Description             Price
1997  Ford   E350                                    ac, abs, moon           3000.00
1999  Chevy  Venture "Extended Edition"              (empty)                 4900.00
1999  Chevy  Venture "Extended Edition, Very Large"  (empty)                 5000.00
1996  Jeep   Grand Cherokee                          MUST SELL!              4799.00
                                                     air, moon roof, loaded

(The description in the last row spans two lines.)

In CSV format, the table above looks like this:
Year,Make,Model,Description,Price
1997,Ford,E350,"ac, abs, moon",3000.00
1999,Chevy,"Venture ""Extended Edition""","",4900.00
1999,Chevy,"Venture ""Extended Edition, Very Large""","",5000.00
1996,Jeep,Grand Cherokee,"MUST SELL!
air, moon roof, loaded",4799.00
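The CSV example above shows that:
  • Fields containing a comma, a double quote, or a line break must be enclosed in double quotes.
  • A double quote inside a field must be escaped by putting another double quote in front of it.
  • Spaces before or after a delimiter comma may not be trimmed: RFC 4180 considers spaces part of the field.
  • Line breaks inside a field are preserved.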
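These points can be checked by parsing the example with Python's standard csv.reader. This is a sketch: the sample text mirrors the CSV example above, with English column names, and the variable names are chosen for illustration.

```python
import csv
import io

csv_text = (
    "Year,Make,Model,Description,Price\n"
    '1997,Ford,E350,"ac, abs, moon",3000.00\n'
    '1999,Chevy,"Venture ""Extended Edition""","",4900.00\n'
    '1999,Chevy,"Venture ""Extended Edition, Very Large""","",5000.00\n'
    '1996,Jeep,Grand Cherokee,"MUST SELL!\nair, moon roof, loaded",4799.00\n'
)

# newline="" leaves line breaks inside quoted fields untouched.
rows = list(csv.reader(io.StringIO(csv_text, newline="")))
for row in rows:
    print(row)

# By default the reader does not trim the space after a delimiter.
spaced = next(csv.reader(["1997, Ford"]))
print(spaced)
```

The reader returns five records even though the text has six physical lines, because the quoted description in the last record contains a line break.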


Copyright notice: This is an original article by the blog author and may not be reproduced without the author's permission. https://blog.csdn.net/linxinfa/article/details/79979487