正则表达式-使用说明Regular Expression How To (Perl, Python, etc)

notepad++ wiki about regular expression

正则表达式-使用说明Regular Expression How To (Perl, Python, etc)

https://docs.python.org/2/howto/regex.html#regex-howto

For more: 

https://docs.python.org/2/library/re.html


 

Quick Reference:

The first metacharacters we’ll look at are [ and ]. They’re used for specifying a character class, which is a set of characters that you wish to match. Characters can be listed individually, or a range of characters can be indicated by giving two characters and separating them by a '-'. For example, [abc] will match any of the characters ab, or c; this is the same as [a-c], which uses a range to express the same set of characters. If you wanted to match only lowercase letters, your RE would be [a-z].

Metacharacters are not active inside classes. For example, [akm$] will match any of the characters 'a''k''m', or '$''$' is usually a metacharacter, but inside a character class it’s stripped of its special nature.

You can match the characters not listed within the class by complementing the set. This is indicated by including a '^' as the first character of the class; '^' outside a character class will simply match the '^' character. For example, [^5] will match any character except '5'.

Perhaps the most important metacharacter is the backslash, \. As in Python string literals, the backslash can be followed by various characters to signal various special sequences. It’s also used to escape all the metacharacters so you can still match them in patterns; for example, if you need to match a [ or \, you can precede them with a backslash to remove their special meaning: \[ or \\.

Some of the special sequences beginning with '\' represent predefined sets of characters that are often useful, such as the set of digits, the set of letters, or the set of anything that isn’t whitespace. The following predefined special sequences are a subset of those available. The equivalent classes are for byte string patterns. For a complete list of sequences and expanded class definitions for Unicode string patterns, see the last part of Regular Expression Syntax.

\d
Matches any decimal digit; this is equivalent to the class  [0-9].
\D
Matches any non-digit character; this is equivalent to the class  [^0-9].
\s
Matches any whitespace character; this is equivalent to the class  \t\n\r\f\v].
\S
Matches any non-whitespace character; this is equivalent to the class  [^ \t\n\r\f\v].
\w
Matches any alphanumeric character; this is equivalent to the class  [a-zA-Z0-9_].
\W
Matches any non-alphanumeric character; this is equivalent to the class  [^a-zA-Z0-9_].

These sequences can be included inside a character class. For example, [\s,.] is a character class that will match any whitespace character, or ',' or '.'.

The final metacharacter in this section is .. It matches anything except a newline character, and there’s an alternate mode (re.DOTALL) where it will match even a newline. '.' is often used where you want to match “any character”.


 

posted on 2017-07-02 16:01 sonictl 阅读( ...) 评论( ...) 编辑 收藏

转载于:https://www.cnblogs.com/sonictl/p/7106235.html

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值