Notes on the Python re Module

A quick summary of the most important parts of Python's re module.

How I usually use the re module:

1. Compile the pattern

pattern=re.compile(r'hello')

2. Call one of re's search/match functions on the compiled pattern

match=pattern.search("hello world")

3. Retrieve the match result

if match:
    print match.group()
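The three steps above can be sketched as one small script. This is only an illustration written for Python 3 (so `print` is a function here); the pattern with two capturing groups and the sample string are made up for the example and are not from the original notes.

```python
import re

# Step 1: compile the pattern once so it can be reused.
# (Hypothetical pattern: the word "hello" followed by one more word.)
pattern = re.compile(r"(hello) (\w+)")

# Step 2: run a search; the result is a match object, or None if nothing matched.
match = pattern.search("say hello world")

# Step 3: only read the result when something actually matched.
if match:
    whole = match.group()     # the entire matched text
    groups = match.groups()   # tuple with one entry per capturing group
    print(whole)              # hello world
    print(groups)             # ('hello', 'world')
```

Checking `if match:` first matters because calling `group()` on `None` raises an AttributeError.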

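The re matching functions are match, search, findall, finditer, and split; these five are the ones I use most.

match returns a match object (or None if there is no match); calling groups() on it returns a tuple

search also returns a match object (or None); calling groups() on it returns a tuple

findall returns a list

finditer returns an iterator of match objects

split returns a list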
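The return types listed above can be checked with a tiny example. This is a minimal sketch for Python 3; the `\d+` pattern and the sample text are assumptions chosen only to keep the results easy to read.

```python
import re

pattern = re.compile(r"\d+")   # hypothetical pattern: one or more digits
text = "a1b22c333"

m = pattern.match(text)        # None: match() only tries at the start of the string
s = pattern.search(text)       # match object for the first hit anywhere in the string
all_hits = pattern.findall(text)                         # list of matched strings
iter_hits = [x.group() for x in pattern.finditer(text)]  # finditer yields match objects
parts = pattern.split(text)    # list of the pieces between matches

print(m)            # None
print(s.group())    # 1
print(all_hits)     # ['1', '22', '333']
print(iter_hits)    # ['1', '22', '333']
print(parts)        # ['a', 'b', 'c', '']
```

Note that match and search differ only in where they look: match is anchored at position 0, while search scans the whole string.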

A concrete test example:

[xluren@test re_compile]$ cat demo.py
import re

# str1 has no leading host field, so the pattern below does not match it
str1='218.205.750.157 46 TCP_MISS [16/Oct/2014:19:29:38 +0800] "GET /i.jpg HTTP/1.1" 200 4576 "-" "-" "GT-droid" "2297768042"'

str2='www.baidu.cn 220.162.917.199 9 TCP_HIT [16/Oct/2014:21:01:39 +0800] "GET /r.gif HTTP/0.0" 200 13815 "-" "-" "vroid" ""'

pattern=re.compile(r'([\w\d.]{0,})\s([0-9.]+)\s(\d+|-)\s(\w+)\s\[([^\[\]]+)\s\+\d+\]\s"((?:[^"]|\")+)"\s(\d{3})\s(\d+|-)\s"((?:[^"]|\")+|-)"\s"(.+|-)"\s"((?:[^"]|\")+)"\s"(.{0,}|-)"$')

print "="*10
print "match test"
match=pattern.match(str1)
if match:
    print match.groups()

match=pattern.match(str2)
if match:
    print match.groups()
print "return type is :",type(match.groups()).__name__

print "="*10
print "search"
search=pattern.search(str1)
if search:
    print search.groups()

search=pattern.search(str2)
if search:
    print search.groups()
print "return type is ",type(search.groups()).__name__


print "="*10
print "split"
split=pattern.split(str1)
if split:
    print split
print 'return type is ',type(split).__name__ 

print "="*10
print "finditer"
finditer=pattern.finditer(str1)
if finditer:
    for i in finditer:
        print i
finditer=pattern.finditer(str2)
if finditer:
    for i in finditer:
        print i.group()
print "return type is ",type(finditer).__name__

print "="*10
print "findall"
findall=pattern.findall(str1)
if findall:
    print findall
findall=pattern.findall(str2)
if findall:
    print findall
print "return type is ",type(findall).__name__

print "="*10
p = re.compile(r'(\w+) (\w+)')
s = 'i say, hello world!'
print p.sub(r'\2 \1', s)
[xluren@test re_compile]$ 
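The split and sub calls in demo.py can also be tried in isolation. The sketch below is written for Python 3 and uses made-up comma-separated input (not part of demo.py) to show how split behaves when nothing matches and when the pattern contains a capturing group; the sub call reuses the pattern and string from the listing.

```python
import re

# No match anywhere: split() returns the input unchanged in a one-item list.
no_match = re.compile(r";").split("a,b,c")

# Without capturing groups, the separators are dropped.
plain = re.compile(r",").split("a,b,c")

# With a capturing group, the captured separators are kept in the result.
with_groups = re.compile(r"(,)").split("a,b,c")

# sub() with backreferences: \2 \1 swaps each pair of adjacent words.
swapped = re.compile(r"(\w+) (\w+)").sub(r"\2 \1", "i say, hello world!")

print(no_match)     # ['a,b,c']
print(plain)        # ['a', 'b', 'c']
print(with_groups)  # ['a', ',', 'b', ',', 'c']
print(swapped)      # say i, world hello!
```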

Output of running demo.py:

[xluren@test re_compile]$ python demo.py 
==========
match test
('www.baidu.cn', '220.162.917.199', '9', 'TCP_HIT', '16/Oct/2014:21:01:39', 'GET /r.gif HTTP/0.0', '200', '13815', '-', '-', 'vroid', '')
return type is : tuple
==========
search
('www.baidu.cn', '220.162.917.199', '9', 'TCP_HIT', '16/Oct/2014:21:01:39', 'GET /r.gif HTTP/0.0', '200', '13815', '-', '-', 'vroid', '')
return type is  tuple
==========
split
['218.205.750.157 46 TCP_MISS [16/Oct/2014:19:29:38 +0800] "GET /i.jpg HTTP/1.1" 200 4576 "-" "-" "GT-droid" "2297768042"']
return type is  list
==========
finditer
www.baidu.cn 220.162.917.199 9 TCP_HIT [16/Oct/2014:21:01:39 +0800] "GET /r.gif HTTP/0.0" 200 13815 "-" "-" "vroid" ""
return type is  callable-iterator
==========
findall
[('www.baidu.cn', '220.162.917.199', '9', 'TCP_HIT', '16/Oct/2014:21:01:39', 'GET /r.gif HTTP/0.0', '200', '13815', '-', '-', 'vroid', '')]
return type is  list
==========
say i, world hello!
[xluren@test re_compile]$ 
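Only the second log line shows up in the match, search, finditer, and findall output because the first line does not start with a host name, so the pattern never matches it; that is also why split returns the first line unchanged as a one-item list. The sketch below illustrates this with a deliberately shortened pattern that covers only the first four fields. The shortened pattern and the truncated sample lines are assumptions for the example (Python 3), not the full pattern from demo.py.

```python
import re

# Hypothetical shortened pattern: host, client IP, a number (or "-"), cache status.
pattern = re.compile(r"([\w.]+)\s([0-9.]+)\s(\d+|-)\s(\w+)\s")

with_host = "www.baidu.cn 220.162.917.199 9 TCP_HIT [16/Oct/2014:21:01:39 +0800]"
without_host = "218.205.750.157 46 TCP_MISS [16/Oct/2014:19:29:38 +0800]"

hit = pattern.match(with_host)
miss = pattern.match(without_host)

print(hit.groups())  # ('www.baidu.cn', '220.162.917.199', '9', 'TCP_HIT')
print(miss)          # None: the third field would have to be digits, but it is TCP_MISS
```

If both line formats need to be parsed, one option is to make the host field optional in the pattern, but that changes the pattern from the one tested above.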

