Python Set Optimization in Practice

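A while ago, while collecting values in a dictionary, I found that the "in" test becomes very slow once the collections stored in it grow large: processing an input of 300-400 MB took one hour.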

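After replacing the list with a set and dropping the "in" check, the run turned out to be about 60 times faster, finishing in roughly one minute. The code follows: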

 

Unoptimized code:

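import sys

pvdic={}   # bucket label -> number of lines seen (PV)
uvdic={}   # bucket label -> list of distinct uids (UV)
day=sys.argv[1]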

for line in sys.stdin:
        frags = line.strip().split("\x01")
        if (len(frags) == 2 and frags[1].isdigit() ):
                uid = frags[0]
                dstr = int(frags[1])
                if(dstr <= 30 ):
                        diff = "0-30"
                ......
                else:
                        diff = "180+"

                if diff in pvdic:
                        pvdic[diff] += 1
                else:
                        pvdic[diff] = 1
                if diff in uvdic:
                        # Bottleneck: "not in" on a list scans it element by element
                        if uid not in uvdic[diff]:
                                uvdic[diff].append(uid)
                else:
                        uvdic[diff] = [uid]

difflist=["0-30","31-60","61-90","91-120","121-150","151-180","180+"]
pvlist=[]
uvlist=[]
for d in difflist:
        if d in pvdic:
                pvlist.append(str(pvdic[d]))
        else:
                pvlist.append("0")
        if d in uvdic:
                uvlist.append(str(len(uvdic[d])))
        else:
                uvlist.append("0")
print("%s\tpv\t%s" % (day, "\t".join(pvlist)))   ...
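
The slow step in the version above is the "uid not in uvdic[diff]" test. Membership on a list compares against the elements one by one, so deduplicating n uids costs on the order of n squared comparisons in the worst case, while a set uses hashing and averages constant time per lookup. Below is a minimal sketch for observing the difference. The helper names and the sample size are made up for illustration, and the timings will vary by machine.

```python
import timeit


def dedup_with_list(uids):
    """Deduplicate by scanning a list before every append (quadratic in the worst case)."""
    seen = []
    for uid in uids:
        if uid not in seen:  # linear scan of `seen`
            seen.append(uid)
    return seen


def dedup_with_set(uids):
    """Deduplicate with a set; add() silently ignores values already present."""
    seen = set()
    for uid in uids:
        seen.add(uid)  # hash lookup, constant time on average
    return seen


# Hypothetical sample: 5000 entries containing 2500 distinct uids.
sample = ["user%d" % (i % 2500) for i in range(5000)]

if __name__ == "__main__":
    for func in (dedup_with_list, dedup_with_set):
        seconds = timeit.timeit(lambda: func(sample), number=3)
        print("%s: %.4f s" % (func.__name__, seconds))
```

Both helpers find the same distinct uids; only the cost of each membership check differs, and the gap widens as the number of distinct uids grows.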


Optimized code:
import sys

pvdic={}   # bucket label -> number of lines seen (PV)
uvdic={}   # bucket label -> set of distinct uids (UV)
day=sys.argv[1]
difflist=["0-30","31-60","61-90","91-120","121-150","151-180","180+"]

for d in difflist:
        pvdic[d]=0
        uvdic[d]=set()

for line in sys.stdin:
        frags = line.strip().split("\x01")
        if (len(frags) == 2 and frags[1].isdigit() ):
                uid = frags[0]
                dstr = int(frags[1])
                if(dstr <= 30 ):
                        diff = "0-30"
                ......
                else:
                        diff = "180+"

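                # Every bucket was initialized above, so no "in" checks are needed
                pvdic[diff] += 1
                # set.add() ignores a uid that is already present
                uvdic[diff].add(uid)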

pvlist=[]
uvlist=[]
for d in difflist:
        pvlist.append(str(pvdic[d]))
        uvlist.append(str(len(uvdic[d])))
print "%s\tpv\t%s" %(day,"\t".join(pvlist))    ...
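
The optimized loop can also be tried on a few in-memory lines, as in the sketch below. Note that the original elif chain is elided, so the bucket boundaries here are an assumption inferred from the labels in difflist. The names bucket_of, count_pv_uv and UPPER_BOUNDS, and the sample lines, are hypothetical.

```python
import bisect

DIFFLIST = ["0-30", "31-60", "61-90", "91-120", "121-150", "151-180", "180+"]
# Assumed upper bounds, inferred from the labels above (the original chain is elided).
UPPER_BOUNDS = [30, 60, 90, 120, 150, 180]


def bucket_of(days):
    """Map a non-negative day count to its bucket label (hypothetical helper)."""
    return DIFFLIST[bisect.bisect_left(UPPER_BOUNDS, days)]


def count_pv_uv(lines):
    """Return (pv, uv) dicts keyed by bucket label, mirroring the optimized loop."""
    pv = dict((d, 0) for d in DIFFLIST)
    uv = dict((d, set()) for d in DIFFLIST)
    for line in lines:
        frags = line.strip().split("\x01")
        if len(frags) == 2 and frags[1].isdigit():
            diff = bucket_of(int(frags[1]))
            pv[diff] += 1
            uv[diff].add(frags[0])  # duplicates are ignored by the set
    return pv, uv


# Made-up input lines in the same "uid\x01days" layout as the script's stdin.
sample = ["u1\x0110\n", "u1\x0120\n", "u2\x0130\n", "u3\x01200\n", "bad line\n"]
pv, uv = count_pv_uv(sample)
print("\t".join(str(pv[d]) for d in DIFFLIST))
print("\t".join(str(len(uv[d])) for d in DIFFLIST))
```

In this sample u1 appears twice in the "0-30" bucket, so that bucket gets a PV of 3 but a UV of 2, and the malformed last line is skipped.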
