[Python] Regular Expressions

1. regular expression

A regular expression is a special sequence of characters that helps you match or find other strings or sets of strings, using a specialized syntax held in a pattern. Regular expressions are widely used in the UNIX world.

 

2. re module

The re module supports Perl-like regular expressions.

The re module raises the exception re.error if an error occurs while compiling or using a regular expression.
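As a minimal sketch of the point above (written for Python 3), the snippet below catches re.error when a pattern fails to compile. The helper name try_compile is hypothetical and used only for illustration.

```python
import re

def try_compile(pattern):
    """Return a compiled pattern, or None if the pattern is invalid.

    try_compile is a hypothetical helper, not part of the re module.
    """
    try:
        return re.compile(pattern)
    except re.error as exc:
        # exc carries a message describing what is wrong with the pattern
        print("invalid pattern:", exc)
        return None

good = try_compile(r'(\d+)-(\d+)')  # valid: two groups of digits
bad = try_compile(r'(\d+')          # invalid: the parenthesis is never closed
```

Catching re.error is mostly useful when the pattern comes from user input or a configuration file rather than from the source code itself.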

 

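To avoid confusion when writing regular expressions, use raw strings, written as r'expression'. In a raw string Python does not process backslash escapes, so sequences such as \d or \b reach the re module unchanged.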
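The difference is easiest to see with \b, which is a backspace character in a normal string literal but a word boundary in a regular expression. The short Python 3 sketch below uses re.findall, a standard re function that this article does not otherwise cover, to compare the two spellings.

```python
import re

# In a normal string literal, "\b" is the backspace character (one character).
# In a raw string, r"\b" is a backslash followed by "b" (two characters),
# which the re module interprets as a word boundary.
assert len("\b") == 1
assert len(r"\b") == 2

text = "cat concatenate cat"

raw_hits = re.findall(r"\bcat\b", text)   # matches "cat" as a whole word
plain_hits = re.findall("\bcat\b", text)  # looks for literal backspace characters

print(raw_hits)    # ['cat', 'cat']
print(plain_hits)  # []
```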

 

3. match function

Syntax:
re.match(pattern, string, flags=0)
pattern #the regular expression to be matched
string #the string to search; the pattern must match at the beginning of this string
flags #optional modifiers; combine several flags with bitwise OR (|)

  

Returns a match object on success, None on failure.

 

Example:

import re

line = "Cats are smarter than dogs"

# (.*) captures everything before " are "; (.*?) lazily captures the next word
matchObj = re.match(r'(.*) are (.*?) .*', line, re.M|re.I)

if matchObj:
    print("matchObj.group() : ", matchObj.group())
    print("matchObj.group(1) : ", matchObj.group(1))
    print("matchObj.group(2) : ", matchObj.group(2))
else:
    print("No match!!")

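#group() is a method of the match object
#group() or group(0) returns the entire matched text
#group(1) returns the text captured by the first parenthesized group, here the text before " are "
#group(2) returns the text captured by the second parenthesized group, here the first word after " are "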
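Why group(2) holds a single word comes down to the lazy quantifier .*? in the second group. The sketch below (Python 3) runs the same pattern next to a greedy variant; the variable names lazy and greedy are chosen only for this illustration.

```python
import re

line = "Cats are smarter than dogs"

# Lazy second group: (.*?) stops at the first space that lets the rest match
lazy = re.match(r'(.*) are (.*?) .*', line)

# Greedy second group: (.*) takes as much as it can and backtracks
# only far enough to leave one space and the trailing .* to match
greedy = re.match(r'(.*) are (.*) .*', line)

print(lazy.groups())    # ('Cats', 'smarter')
print(greedy.groups())  # ('Cats', 'smarter than')
```

The groups() method returns all captured groups as a tuple, which is convenient when a pattern has several groups.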

  

4. search function

#Syntax:
re.search(pattern, string, flags=0)
#pattern: the regular expression to be matched.
#string: the string to search; the pattern may match anywhere in the string.
#flags: the same as for match().

  

Returns a match object on success, None on failure.

 

The match object it returns has the same group() method as the one returned by match().

 

import re

line = "Cats are smarter than dogs."

# Same pattern as the match() example; search() may match anywhere in the string
searchObj = re.search(r'(.*) are (.*?) .*', line, re.M|re.I)

if searchObj:
    print("searchObj.group(): ", searchObj.group())
    print("searchObj.group(1): ", searchObj.group(1))
    print("searchObj.group(2): ", searchObj.group(2))
else:
    print("no match")

  

5. Match VS Search

match checks for a match only at the beginning of the string, while search checks for a match anywhere in the string.

import re

line = "Cats are smarter than dogs."

# "dogs" is near the end of the line, so only search() can find it
searchObj = re.search(r'dogs', line, re.M|re.I)
matchObj = re.match(r'dogs', line, re.M|re.I)

if searchObj:
    print("searchObj.group(): ", searchObj.group())
else:
    print("no match\n")

if matchObj:
    print("matchObj.group(): ", matchObj.group())
else:
    print("no match\n")

  

When the code is executed, it produces the following result:

searchObj.group():  dogs
no match

  

6. sub

#Syntax:
re.sub(pattern, repl, string, count=0, flags=0)
#This method replaces occurrences of the RE pattern in string with repl.
#All occurrences are replaced unless count is given; count is the maximum number of replacements.
#This method returns the modified string.

  

Example:

import re

phone = "32580-110-517 #nhmhhh"

# Delete a Python-style comment (everything from # to the end of the string)
num = re.sub(r'#.*$', "", phone)
print("phone num:", num)

# Delete all non-digit characters
num = re.sub(r'\D', "", phone)
print("phone num:", num)

  

When the above code is executed, it produces the following result:

 

phone num: 32580-110-517
phone num: 32580110517
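In Python's re module the parameter that limits the number of replacements is named count. The sketch below (Python 3) shows its effect on a shortened version of the phone number above; the variable names are illustrative only.

```python
import re

phone = "32580-110-517"

# count=0 (the default) replaces every match
all_removed = re.sub(r'-', "", phone)

# count=1 replaces only the first match and leaves the rest untouched
first_removed = re.sub(r'-', "", phone, count=1)

print(all_removed)    # 32580110517
print(first_removed)  # 32580110-517
```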

  

7. Regular Expression Modifiers: Option flags

You can combine multiple modifiers using bitwise OR (|).

re.I #Performs case-insensitive matching.
re.L #Interprets words according to the current locale. In Python 3 it can be used only with bytes patterns.
re.M #Makes $ match the end of each line
#(not just the end of the string)
#and makes ^ match the start of each line
#(not just the start of the string)
re.S #Makes a period (dot) match any character, including a newline.
re.U #Interprets letters according to the Unicode character set. In Python 3 this is already the default for str patterns.
re.X #Permits more readable (verbose) regular expression syntax. It ignores whitespace (except inside a set [] or when escaped by a backslash) and treats unescaped # as a comment marker.
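The following Python 3 sketch illustrates re.I, re.M, re.S and re.X on a small two-line string. The sample text, the variable names, and the phone-number layout are assumptions made for the example; re.findall and re.compile are standard re functions not otherwise covered in this article.

```python
import re

text = "first line\nSecond line"

# re.M: ^ matches at the start of every line, not only the start of the string
with_m = re.findall(r'^\w+', text, re.M)
without_m = re.findall(r'^\w+', text)

# re.S: the dot also matches the newline between the two lines
dot_default = re.search(r'first.*Second', text)
dot_all = re.search(r'first.*Second', text, re.S)

# Flags are combined with bitwise OR: ignore case and treat each line separately
combined = re.findall(r'^s\w+', text, re.I | re.M)

# re.X: whitespace and # comments inside the pattern are ignored
phone_re = re.compile(r'''
    (\d{5})   # first block of digits
    -
    (\d{3})   # second block
    -
    (\d{3})   # third block
''', re.X)

print(with_m)                                   # ['first', 'Second']
print(without_m)                                # ['first']
print(dot_default)                              # None
print(dot_all is not None)                      # True
print(combined)                                 # ['Second']
print(phone_re.match("32580-110-517").groups()) # ('32580', '110', '517')
```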

  

8. Regular Expression Patterns

https://www.tutorialspoint.com/python/python_reg_expressions.htm

  

 

Reposted from: https://www.cnblogs.com/KennyRom/p/6368991.html
