Introduction to regular expressions:
http://www.cnblogs.com/deerchao/archive/2006/08/24/zhengzhe30fengzhongjiaocheng.html
re.match
match tries to match the pattern at the beginning of the string (note: only at the beginning!)
import re
text = "wensishuai is a handsome boy, he is cool, clever, and so on..."
m = re.match(r'(w\w+)\s', text, re.I)
if m:
    # groups() returns a tuple with the text captured by each group
    print(m.groups())
else:
    print('not match')
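On success match returns a match object, otherwise None. The match object's methods, such as group() and groups(), give access to the matched text.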
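The most commonly used match object methods can be sketched as follows. This is a minimal illustration that assumes Python 3 and reuses the text and pattern from the example above; the variable names are only illustrative.

```python
import re

text = "wensishuai is a handsome boy, he is cool, clever, and so on..."
m = re.match(r'(w\w+)\s', text, re.I)

whole = m.group(0)    # the entire match, including the trailing whitespace
first = m.group(1)    # the text captured by the first group
groups = m.groups()   # tuple of all captured groups
span = m.span()       # (start, end) indices of the entire match

print(repr(whole), first, groups, span)
```

Note that group(0) includes the whitespace matched by \s, while group(1) contains only the word inside the parentheses.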
re.search
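search does not have to match at the beginning; it scans the string and stops at the first position where the pattern matches.
The return value is the same as for match: a match object, or None if nothing matches.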
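The difference between match and search is easiest to see when the same pattern is run through both. The sketch below assumes Python 3 and reuses the sample text from the match example; the pattern c\w+ is chosen only for illustration.

```python
import re

text = "wensishuai is a handsome boy, he is cool, clever, and so on..."

# match only tries position 0, so a word in the middle of the string is not found
at_start = re.match(r'c\w+', text)

# search scans forward and returns the first match anywhere in the string
anywhere = re.search(r'c\w+', text)
found = anywhere.group(0) if anywhere else None

print(at_start, found)
```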
re.findall
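re.findall returns all non-overlapping matches in the string as a list. If the pattern contains a single group, the list holds the text captured by that group.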
text = "wensishuai is a handsome boy, he is cool, clever, and so on..."
strlist = re.findall(r'\b(c\w+)\b', text, re.I)
# strlist == ['cool', 'clever']
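What findall returns depends on how many groups the pattern has. The following sketch, assuming Python 3 and the same sample text, compares a pattern with no group, one group, and two groups; the patterns are variations made up for this comparison.

```python
import re

text = "wensishuai is a handsome boy, he is cool, clever, and so on..."

no_group = re.findall(r'\bc\w+\b', text)        # list of whole matches
one_group = re.findall(r'\bc(\w+)\b', text)     # list of the group's contents only
two_groups = re.findall(r'\b(c)(\w+)\b', text)  # list of tuples, one item per group

print(no_group, one_group, two_groups)
```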
re.sub
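Replaces every match in a string. The replacement can be a string, or a function that receives the match object and returns the replacement text.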
text = "Wensishuai is a handsome boy, he is cool, clever, and so on..."
newtext = re.sub(r'\b(W\w*)\b', lambda m: '<' + m.group(0) + '>', text)
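Besides a function, the replacement can be a plain string with backreferences. The sketch below assumes Python 3 and also shows the count argument and re.subn; the pattern is a variation chosen for illustration.

```python
import re

text = "Wensishuai is a handsome boy, he is cool, clever, and so on..."

# \1 in the replacement string refers to the first captured group
wrapped = re.sub(r'\b(c\w+)\b', r'<\1>', text)

# count limits how many matches are replaced (here only the first one)
first_only = re.sub(r'\b(c\w+)\b', r'<\1>', text, count=1)

# subn returns the new string together with the number of replacements made
_, n = re.subn(r'\b(c\w+)\b', r'<\1>', text)

print(wrapped)
print(first_only)
print(n)
```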
re.split
Splits the string at each match of the pattern and returns a list.
text = "Wensishuai is a handsome boy, he is cool, clever, and so on..."
strlst = re.split(r'\s+', text)
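Splitting on whitespace leaves punctuation attached to the words, and the separator pattern and maxsplit argument change the result considerably. A small sketch, assuming Python 3; the comma pattern is an illustrative choice.

```python
import re

text = "Wensishuai is a handsome boy, he is cool, clever, and so on..."

words = re.split(r'\s+', text)             # punctuation stays attached, e.g. 'boy,'
head = re.split(r'\s+', text, maxsplit=2)  # at most two splits, the rest stays whole
clauses = re.split(r',\s*', text)          # split on commas and any following spaces

print(words)
print(head)
print(clauses)
```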
re.compile
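Compiles a regular expression into a regular expression object. Compiling expressions that are used often can improve efficiency somewhat, because the pattern is parsed only once. Here is an example of a regular expression object: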
text = "JGood is a handsome boy, he is cool, clever, and so on..."
regex = re.compile(r'\w*oo\w*')
# the compiled object offers the same operations as the module-level functions
print(regex.findall(text))
print(regex.sub(lambda m: '[' + m.group(0) + ']', text))
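A compiled object is most useful when the same pattern is applied to many strings. The sketch below assumes Python 3; the name word_with_oo and the list of lines are made up for illustration, and flags such as re.I are passed once at compile time.

```python
import re

# flags are given once, when the pattern is compiled
word_with_oo = re.compile(r'\b\w*oo\w*\b', re.I)

lines = [
    "JGood is a handsome boy",
    "he is cool, clever, and so on...",
    "nothing to see here",
]

# the same compiled object is reused for every line
hits = [word_with_oo.findall(line) for line in lines]
marked = word_with_oo.sub(lambda m: '[' + m.group(0) + ']', lines[0])

print(hits)
print(marked)
```

The re module also keeps an internal cache of recently used patterns, so for a handful of patterns the speed gain from compiling is likely to be small; the main benefit is often readability.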