smallgfw: a DFA-based module for detecting and replacing sensitive words.

Announcing a small project:

smallgfw: a DFA-based module for detecting and replacing sensitive words.

http://code.google.com/p/smallgfw/

>>> gfw = GFW()
>>> gfw.set(["sexy","girl","love","shit"])
>>> s = gfw.replace("Shit!,Cherry is a sexy girl. She loves python.","*")
>>> print s
*!,Cherry is a * *. She *s python.
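
To illustrate the idea behind the session above, here is a minimal sketch of a trie walked as a state machine. It is not smallgfw's actual source: the class name `TrieFilter`, the `_match_length` helper, and the choice of case-insensitive, longest-match behaviour are assumptions made only so that the sketch reproduces the output shown above. The real module may build its DFA differently.

```python
class TrieFilter(object):
    """Hypothetical stand-in for a DFA-style word filter (not smallgfw's code)."""

    END = object()  # marker key: a word ends at this node

    def __init__(self, words=()):
        self.root = {}
        for word in words:
            self.add(word)

    def add(self, word):
        # Each character is a transition; nested dicts are the states.
        node = self.root
        for ch in word.lower():
            node = node.setdefault(ch, {})
        node[self.END] = True

    def _match_length(self, text, start):
        """Return the length of the longest word starting at `start`, or 0."""
        node = self.root
        longest = 0
        for i in range(start, len(text)):
            node = node.get(text[i].lower())
            if node is None:
                break
            if self.END in node:
                longest = i - start + 1
        return longest

    def check(self, text):
        """True if any listed word occurs anywhere in `text`."""
        return any(self._match_length(text, i) for i in range(len(text)))

    def replace(self, text, mask):
        """Replace every matched word with `mask` (one mask per word)."""
        out = []
        i = 0
        while i < len(text):
            n = self._match_length(text, i)
            if n:
                out.append(mask)
                i += n
            else:
                out.append(text[i])
                i += 1
        return "".join(out)


f = TrieFilter(["sexy", "girl", "love", "shit"])
print(f.replace("Shit!,Cherry is a sexy girl. She loves python.", "*"))
# prints: *!,Cherry is a * *. She *s python.
```

Each lookup costs one dictionary access per character, so the work per text position depends on the length of the longest listed word rather than on how many words are in the list.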

Comparison with a version implemented using re regular expressions:

check 1 times
re cost: 0.0149998664856
smallgfw cost: 0.0
===================================
check 2 times
re cost: 0.0320000648499
smallgfw cost: 0.0
===================================
check 3 times
re cost: 0.0460000038147
smallgfw cost: 0.0
===================================
check 4 times
re cost: 0.0629999637604
smallgfw cost: 0.0
===================================
check 5 times
re cost: 0.0780000686646
smallgfw cost: 0.0160000324249
===================================
check 6 times
re cost: 0.077999830246
smallgfw cost: 0.0150001049042
===================================
check 7 times
re cost: 0.0940001010895
smallgfw cost: 0.0159997940063
===================================
check 8 times
re cost: 0.109000205994
smallgfw cost: 0.0159997940063
===================================
check 9 times
re cost: 0.125
smallgfw cost: 0.0150001049042
===================================
check 10 times
re cost: 0.125
smallgfw cost: 0.0320000648499
===================================
check 11 times
re cost: 0.139999866486
smallgfw cost: 0.0320000648499
===================================
check 12 times
re cost: 0.155999898911
smallgfw cost: 0.0310001373291
===================================
check 13 times
re cost: 0.171999931335
smallgfw cost: 0.0160000324249
===================================
check 14 times
re cost: 0.203000068665
smallgfw cost: 0.0149998664856
===================================
check 15 times
re cost: 0.219000101089
smallgfw cost: 0.0160000324249
===================================
check 16 times
re cost: 0.233999967575
smallgfw cost: 0.0160000324249
===================================
check 17 times
re cost: 0.233999967575
smallgfw cost: 0.0309998989105
===================================
check 18 times
re cost: 0.25
smallgfw cost: 0.0320000648499
===================================
check 19 times
re cost: 0.265000104904
smallgfw cost: 0.0309998989105
===================================
check 20 times
re cost: 0.28200006485
smallgfw cost: 0.0309998989105
===================================
replace 1 times
re cost: 0.0160000324249
smallgfw cost: 0.0150001049042
===================================
replace 2 times
re cost: 0.0159997940063
smallgfw cost: 0.0150001049042
===================================
replace 3 times
re cost: 0.0320000648499
smallgfw cost: 0.0149998664856
===================================
replace 4 times
re cost: 0.047000169754
smallgfw cost: 0.0
===================================
replace 5 times
re cost: 0.077999830246
smallgfw cost: 0.0
===================================
replace 6 times
re cost: 0.0940001010895
smallgfw cost: 0.0160000324249
===================================
replace 7 times
re cost: 0.0929999351501
smallgfw cost: 0.0160000324249
===================================
replace 8 times
re cost: 0.108999967575
smallgfw cost: 0.0
===================================
replace 9 times
re cost: 0.125
smallgfw cost: 0.0160000324249
===================================
replace 10 times
re cost: 0.141000032425
smallgfw cost: 0.0149998664856
===================================
replace 11 times
re cost: 0.15700006485
smallgfw cost: 0.0150001049042
===================================
replace 12 times
re cost: 0.171999931335
smallgfw cost: 0.0160000324249
===================================
replace 13 times
re cost: 0.18700003624
smallgfw cost: 0.0309998989105
===================================
replace 14 times
re cost: 0.18799996376
smallgfw cost: 0.0310001373291
===================================
replace 15 times
re cost: 0.218999862671
smallgfw cost: 0.0160000324249
===================================
replace 16 times
re cost: 0.21799993515
smallgfw cost: 0.0320000648499
===================================
replace 17 times
re cost: 0.233999967575
smallgfw cost: 0.0310001373291
===================================
replace 18 times
re cost: 0.25
smallgfw cost: 0.0309998989105
===================================
replace 19 times
re cost: 0.296999931335
smallgfw cost: 0.0320000648499
===================================
replace 20 times
re cost: 0.280999898911
smallgfw cost: 0.0310001373291
===================================
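
The post does not show the re-based version or the timing harness, so the following is only a guess at what such a baseline could look like. The word list, the sample text, the helper names `re_check`, `re_replace`, and `cost`, and the repeat count are all assumptions. The repeated 0.0 and roughly 0.016 readings above suggest a coarse timer, so this sketch uses `timeit.default_timer` and loops many times per measurement.

```python
import re
import timeit

# Assumed inputs; the original benchmark's word list and text are not given.
words = ["sexy", "girl", "love", "shit"]
text = "Shit!,Cherry is a sexy girl. She loves python."

# Assumed baseline: one case-insensitive alternation over all escaped words.
pattern = re.compile("|".join(re.escape(w) for w in words), re.IGNORECASE)


def re_check(s):
    """True if any listed word occurs in s."""
    return pattern.search(s) is not None


def re_replace(s, mask):
    """Replace every matched word in s with mask."""
    return pattern.sub(mask, s)


def cost(func, repeat=1000):
    """Total wall-clock seconds for calling func() `repeat` times."""
    start = timeit.default_timer()
    for _ in range(repeat):
        func()
    return timeit.default_timer() - start


print(re_replace(text, "*"))
print("re check cost: %f" % cost(lambda: re_check(text)))
print("re replace cost: %f" % cost(lambda: re_replace(text, "*")))
```

Timing the filter under test with the same `cost` helper and the same inputs keeps the two figures comparable.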
