Regular Expressions: Matching Phone Numbers in Multiple Formats

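\w - matches a letter, digit, underscore, or Chinese character (in my tests, Python 3.x
matches Chinese characters but 2.x does not; in 3.x, str patterns are Unicode-aware by
default, while 2.x needs the re.UNICODE flag and unicode strings)
\s - matches any whitespace character
^ - matches the start of the string
$ - matches the end of the string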
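As a minimal sketch of the four metacharacters above, assuming Python 3 (where `str` patterns are Unicode-aware by default); the sample strings and variable names are made up for illustration:

```python
import re

# \w matches letters, digits, and underscore; in Python 3 it also matches
# Chinese characters because str patterns are Unicode-aware by default.
words = re.findall(r'\w+', 'user_01 你好!')

# \s matches any whitespace character (space, tab, newline, ...).
fields = re.split(r'\s+', 'a b\tc\nd')

# ^ and $ anchor the pattern to the start and end of the string.
whole = bool(re.search(r'^abc$', 'abc'))
partial = bool(re.search(r'^abc$', 'xabc'))

print(words, fields, whole, partial)
```

Here `words` holds the two runs of word characters, `fields` the four whitespace-separated pieces, and only the first anchored search succeeds.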
2.
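\S is simply the negation of \s: any character that is not whitespace. Likewise:
\W - matches any character that is not a letter, digit, underscore, or Chinese character
\D - matches any character that is not a digit
\B - matches a position that is not at the start or end of a word
The negation of [a] is [^a], which matches any character except a. [^abcd] matches any
character except a, b, c, and d.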
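The negated classes can be tried out with a short sketch like the following; the input strings are illustrative assumptions, not taken from the original post:

```python
import re

text = 'tel: 010-5566'

non_space = re.findall(r'\S+', text)    # runs of non-whitespace characters
non_word = re.findall(r'\W', text)      # characters that are not \w
non_digit = re.findall(r'\D+', text)    # runs of non-digit characters
not_a = re.findall(r'[^a]', 'banana')   # every character except 'a'

# \B matches only where there is NO word boundary, so 'an' is found
# inside 'banana' but not as the standalone word 'an'.
inner = re.findall(r'\Ban\B', 'an banana')

print(non_space, non_word, non_digit, not_a, inner)
```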
3.
Earlier we used *, +, and {} to express repetition. The other quantifiers are:
? - zero or one repetition
{n,} - n or more repetitions
{n,m} - between n and m repetitions
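A small sketch of these three quantifiers, using made-up input strings:

```python
import re

# ? : zero or one occurrence, so both spellings match.
optional_u = re.findall(r'colou?r', 'color colour')

# {n,} : n or more repetitions.
three_plus = re.findall(r'\d{3,}', '12 123 12345')

# {n,m} : between n and m repetitions; the quantifier is greedy,
# so it takes as many as it can, up to m.
two_to_three = re.findall(r'\d{2,3}', '1 12 123 12345')

print(optional_u, three_plus, two_to_three)
```

Note how `\d{2,3}` splits the five-digit run `12345` into `123` and `45`: once three digits are consumed, matching resumes on what is left.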
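Regular expressions are not only for extracting information from a large body of text;
they are often used to check whether input follows a required format, or to classify it.
Some examples:
^\w{4,12}$
A string of 4 to 12 characters, each a letter, digit, underscore, or Chinese character.
It can serve as a username rule at registration. (Chinese characters may cause problems
in Python 2.x.)
\d{15,18}
A run of 15 to 18 digits, which can serve as a rough check for ID card numbers. Because
it has no ^ and $, it also matches inside a longer run of digits.
^1\d*[x]?
A string of digits starting with 1, optionally followed by the letter x. When the x is
present, it is included in the match.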
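For validation, one possible sketch is to wrap such patterns in small helper functions and use `re.fullmatch`, which requires the whole string to match. The helper names below are hypothetical, and the ID pattern is a stricter variant than `\d{15,18}`: it assumes the usual shapes of a Chinese resident ID number (15 digits in the old format, or 17 digits plus a check character that is a digit or X) and checks the shape only.

```python
import re

def is_valid_username(name):
    """Hypothetical helper: 4 to 12 word characters and nothing else."""
    return re.fullmatch(r'\w{4,12}', name) is not None

def looks_like_id_number(s):
    """Hypothetical helper: 15 digits, or 17 digits plus a digit or X.

    Only the shape is checked, not the checksum or the date fields.
    """
    return re.fullmatch(r'\d{15}|\d{17}[\dXx]', s) is not None

print(is_valid_username('user_01'), is_valid_username('abc'))
print(looks_like_id_number('1' * 17 + 'X'), looks_like_id_number('1' * 16))
```

One reason to prefer `re.fullmatch` here: `$` also matches just before a trailing newline, so `re.search(r'^\w{4,12}$', 'user\n')` succeeds, while `re.fullmatch(r'\w{4,12}', 'user\n')` does not.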

The escape character \. To match a literal . or * rather than the metacharacter it
stands for, write \. or \*. A literal \ is itself written \\.
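The effect of escaping can be seen in a short sketch; the sample strings are assumptions chosen for illustration, and `re.escape` is the standard-library helper that escapes a literal string for use inside a pattern:

```python
import re

# Unescaped, '.' matches any character, so 'a.c' also matches 'abc'.
loose = re.findall(r'a.c', 'a.c abc')

# Escaped, '\.' matches only a literal dot.
strict = re.findall(r'a\.c', 'a.c abc')

# A literal backslash is written '\\' in the pattern.
backslashes = re.findall(r'\\', r'C:\temp\file')

# re.escape() escapes the metacharacters of a literal string for you.
escaped = re.escape('1*2.5')
literal_hits = re.findall(escaped, '1*2.5 1x2y5')

print(loose, strict, len(backslashes), literal_hits)
```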

import re

tel = '(021)88776543 010-55667890  02584453362  0571 66345673 '


data1 = re.findall(r'\(?0\d{2,3}[) -]?\d{7,8}', tel)
print(data1)  # ['(021)88776543', '010-55667890', '02584453362', '0571 66345673']

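tel2 = '(021 88776543 010-55667890  02584453362  0571 66345673 '

# \(0\d{2,3}\)\d{7,8} matches only a fully parenthesized area code, e.g. (021)88776543.
# 0\d{2,3}[ -]?\d{7,8} matches the other forms. In tel2 the "(" is never closed, so the
# second alternative matches "021 88776543" and the stray "(" is left out of the result.
data2 = re.findall(r'\(0\d{2,3}\)\d{7,8}|0\d{2,3}[ -]?\d{7,8}', tel2)
print(data2)  # ['021 88776543', '010-55667890', '02584453362', '0571 66345673']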
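As a possible extension of the first pattern, capture groups can separate the area code from the subscriber number so the matches can be normalized to one format. This is only a sketch: the variable names and the `area-number` output format are assumptions, not part of the original code.

```python
import re

tel = '(021)88776543 010-55667890  02584453362  0571 66345673 '

# Same pattern as for data1, with two capturing groups added:
# group 1 is the area code, group 2 is the subscriber number.
pattern = re.compile(r'\(?(0\d{2,3})[) -]?(\d{7,8})')

pairs = [(m.group(1), m.group(2)) for m in pattern.finditer(tel)]
normalized = [f'{area}-{number}' for area, number in pairs]
print(normalized)
# ['021-88776543', '010-55667890', '0258-4453362', '0571-66345673']
```

For the undelimited `02584453362`, the greedy `\d{2,3}` takes a four-digit area code, giving `0258-4453362`. The pattern alone cannot tell whether `025-84453362` was intended, so a number written without a separator is inherently ambiguous.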

"""
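(1) \(?  The characters ( and ) have a special meaning in regular expressions, so a literal "(" must be written "\(". The ? makes the parenthesis optional.
(2) 0\d{2,3}  The area code: 0 followed by two or three digits, i.e. 0xx (three digits in total) or 0xxx (four digits in total).
(3) [) -]?  After the area code there may be a ")", a space, or a "-", or nothing at all.
(4) \d{7,8}  The 7- or 8-digit phone number, e.g. 55667890, 66345673, 88776543.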

"""

 

 

 

This is a problem from the Peking University online judge (POJ); my submission was accepted (AC). The original problem statement follows.

Description

Businesses like to have memorable telephone numbers. One way to make a telephone number memorable is to have it spell a memorable word or phrase. For example, you can call the University of Waterloo by dialing the memorable TUT-GLOP. Sometimes only part of the number is used to spell a word. When you get back to your hotel tonight you can order a pizza from Gino's by dialing 310-GINO. Another way to make a telephone number memorable is to group the digits in a memorable way. You could order your pizza from Pizza Hut by calling their "three tens" number 3-10-10-10.

The standard form of a telephone number is seven decimal digits with a hyphen between the third and fourth digits (e.g. 888-1200). The keypad of a phone supplies the mapping of letters to numbers, as follows:

A, B, and C map to 2
D, E, and F map to 3
G, H, and I map to 4
J, K, and L map to 5
M, N, and O map to 6
P, R, and S map to 7
T, U, and V map to 8
W, X, and Y map to 9

There is no mapping for Q or Z. Hyphens are not dialed, and can be added and removed as necessary. The standard form of TUT-GLOP is 888-4567, the standard form of 310-GINO is 310-4466, and the standard form of 3-10-10-10 is 310-1010.

Two telephone numbers are equivalent if they have the same standard form. (They dial the same number.)

Your company is compiling a directory of telephone numbers from local businesses. As part of the quality control process you want to check that no two (or more) businesses in the directory have the same telephone number.

Input

The input will consist of one case. The first line of the input specifies the number of telephone numbers in the directory (up to 100,000) as a positive integer alone on the line. The remaining lines list the telephone numbers in the directory, with each number alone on a line. Each telephone number consists of a string composed of decimal digits, uppercase letters (excluding Q and Z) and hyphens. Exactly seven of the characters in the string will be digits or letters.

Output

Generate a line of output for each telephone number that appears more than once in any form. The line should give the telephone number in standard form, followed by a space, followed by the number of times the telephone number appears in the directory. Arrange the output lines by telephone number in ascending lexicographical order. If there are no duplicates in the input print the line:

No duplicates.

Sample Input

12
4873279
ITS-EASY
888-4567
3-10-10-10
888-GLOP
TUT-GLOP
967-11-11
310-GINO
F101010
888-1200
-4-8-7-3-2-7-9-
487-3279

Sample Output

310-1010 2
487-3279 4
888-4567 3
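One way to approach this problem is sketched below. It is not the accepted submission mentioned above, which the post does not show; the function names are hypothetical, and reading from standard input is left out so that the sketch can be run directly on the sample data.

```python
from collections import Counter

# Keypad mapping from the problem statement (Q and Z have no mapping).
KEYPAD = {
    'ABC': '2', 'DEF': '3', 'GHI': '4', 'JKL': '5',
    'MNO': '6', 'PRS': '7', 'TUV': '8', 'WXY': '9',
}
TABLE = str.maketrans(
    {ch: digit for letters, digit in KEYPAD.items() for ch in letters}
)

def standard_form(raw):
    """Drop hyphens, map letters to digits, insert a hyphen after digit 3."""
    digits = raw.replace('-', '').translate(TABLE)
    return digits[:3] + '-' + digits[3:]

def find_duplicates(numbers):
    """Return the sorted output lines, or ['No duplicates.'] if none repeat."""
    counts = Counter(standard_form(n) for n in numbers)
    lines = [f'{num} {c}' for num, c in sorted(counts.items()) if c > 1]
    return lines or ['No duplicates.']

sample = [
    '4873279', 'ITS-EASY', '888-4567', '3-10-10-10', '888-GLOP', 'TUT-GLOP',
    '967-11-11', '310-GINO', 'F101010', '888-1200', '-4-8-7-3-2-7-9-',
    '487-3279',
]
print('\n'.join(find_duplicates(sample)))
```

Because every standard form has the same length and the hyphen sits in the same position, sorting the strings directly gives the ascending lexicographical order the problem asks for.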