Python Regular Expressions Cheat Sheet

The tough thing about learning data science is remembering all the syntax. While at Dataquest we advocate getting used to consulting the Python documentation, sometimes it's nice to have a handy reference, so we've put together this cheat sheet to help you out!

This cheat sheet is based on Python 3’s documentation on regular expressions. If you’re interested in learning Python, we have a free Python Programming: Beginner course for you to try out.

Special Characters

^ | Matches the expression to its right at the start of a string. In multi-line mode, it also matches at the start of each line, that is, immediately after each \n in the string.

$ | Matches the expression to its left at the end of a string. In multi-line mode, it also matches at the end of each line, that is, just before each \n in the string.
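
As a small illustration of the two anchors, the sketch below runs the same patterns with and without re.MULTILINE. The sample text and variable names are made up for this example.

```python
import re

text = "first line\nsecond line\nthird line"

# Without re.MULTILINE, ^ and $ anchor only at the ends of the whole string.
start_default = re.findall(r"^\w+", text)
end_default = re.findall(r"\w+$", text)

# With re.MULTILINE, they also anchor at the start and end of every line.
start_multi = re.findall(r"^\w+", text, flags=re.MULTILINE)
end_multi = re.findall(r"\w+$", text, flags=re.MULTILINE)

print(start_default)  # ['first']
print(end_default)    # ['line']
print(start_multi)    # ['first', 'second', 'third']
print(end_multi)      # ['line', 'line', 'line']
```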

. | Matches any character except the newline character \n.

\ | Escapes special characters or denotes character classes.

A|B | Matches expression A or B. If A is matched first, B is left untried.
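
The following sketch shows the dot, the escape character, and alternation side by side. The input strings are invented for the example.

```python
import re

# An unescaped dot matches any character except a newline,
# so "a\nc" is skipped here.
dot_any = re.findall(r"a.c", "abc a-c a\nc")

# An escaped dot matches only a literal period.
dot_literal = re.findall(r"a\.c", "abc a.c")

# Alternation tries the left branch first; once "cat" matches,
# the longer "category" branch is never tried.
first_branch = re.search(r"cat|category", "category").group()

print(dot_any)       # ['abc', 'a-c']
print(dot_literal)   # ['a.c']
print(first_branch)  # cat
```

If the longer alternative should win, one option is to list it first, as in category|cat.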

+ | Greedily matches the expression to its left 1 or more times.

* | Greedily matches the expression to its left 0 or more times.

? | Greedily matches the expression to its left 0 or 1 times. But if ? is added to a quantifier (+, *, or ? itself), it performs the match in a non-greedy manner.

{m} | Matches the expression to its left exactly m times.

{m,n} | Matches the expression to its left from m to n times, as many as possible.

{m,n}? | Non-greedy version of {m,n}: matches the expression to its left from m to n times, as few as possible. See ? above.
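
To make the greedy versus non-greedy distinction concrete, here is a minimal sketch using made-up sample strings. It also shows how the counted quantifiers split a run of digits.

```python
import re

html = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ grabs as much as possible, so one match spans the whole string.
greedy = re.findall(r"<.+>", html)
# Non-greedy: .+? stops at the first closing >, giving one match per tag.
lazy = re.findall(r"<.+?>", html)

digits = "1234567"
exact = re.findall(r"\d{3}", digits)            # exactly three digits per match
ranged = re.findall(r"\d{2,4}", digits)         # two to four, as many as possible
ranged_lazy = re.findall(r"\d{2,4}?", digits)   # two to four, as few as possible

# ? makes the preceding expression optional (0 or 1 times).
optional = re.findall(r"colou?r", "color colour")

print(greedy)       # ['<b>bold</b> and <i>italic</i>']
print(lazy)         # ['<b>', '</b>', '<i>', '</i>']
print(exact)        # ['123', '456']
print(ranged)       # ['1234', '567']
print(ranged_lazy)  # ['12', '34', '56']
print(optional)     # ['color', 'colour']
```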

Character Classes (a.k.a. Special Sequences)

\w | Matches alphanumeric characters, which in ASCII mode means a-z, A-Z, and 0-9. It also matches the underscore, _. For str patterns, Python 3 also matches Unicode word characters unless the a flag is set.

\d | Matches digits, which in ASCII mode means 0-9.

\D | Matches any non-digits.

\s | Matches whitespace characters, which include the \t, \n, \r, and space characters.

\S | Matches non-whitespace characters.
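
A short sketch of these classes on an invented sample string. The last two lines compare the default Unicode behaviour of \w with the re.ASCII flag.

```python
import re

sample = "User_42 paid\t$3.50\n2018-04-01"

words = re.findall(r"\w+", sample)        # letters, digits, and underscore
digits = re.findall(r"\d+", sample)       # runs of digits
non_digits = re.findall(r"\D+", "a1b22c") # everything that is not a digit
pieces = re.findall(r"\S+", sample)       # runs of non-whitespace

# For str patterns, \w is Unicode-aware by default; re.ASCII restricts it.
unicode_word = re.findall(r"\w+", "caf\u00e9")
ascii_word = re.findall(r"\w+", "caf\u00e9", flags=re.ASCII)

print(words)       # ['User_42', 'paid', '3', '50', '2018', '04', '01']
print(digits)      # ['42', '3', '50', '2018', '04', '01']
print(non_digits)  # ['a', 'b', 'c']
print(pieces)      # ['User_42', 'paid', '$3.50', '2018-04-01']
print(ascii_word)  # ['caf']
```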

\b | Matches the boundary (or empty string) at the start and end of a word, that is, between \w and \W.

\B | Matches where \b does not, that is, at positions that are not word boundaries (for example, between two \w characters).

\A | Matches the expression to its right at the absolute start of a string whether in single or multi-line mode.

\Z | Matches the expression to its left at the absolute end of a string whether in single or multi-line mode.
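
The sketch below, with made-up inputs, contrasts \b with \B, and \A and \Z with ^ and $ under re.MULTILINE.

```python
import re

text = "cat concat cats\ncat"

# \b requires a word boundary on both sides, so only standalone "cat" matches.
whole_word = re.findall(r"\bcat\b", text)
# \B requires that there is no boundary before "cat" (only true inside "concat").
inside_word = re.findall(r"\Bcat", text)

lines = "one\ntwo"

# In multi-line mode ^ and $ match at every line, but \A and \Z do not.
line_starts = re.findall(r"^\w+", lines, flags=re.MULTILINE)
string_start = re.findall(r"\A\w+", lines, flags=re.MULTILINE)
line_ends = re.findall(r"\w+$", lines, flags=re.MULTILINE)
string_end = re.findall(r"\w+\Z", lines, flags=re.MULTILINE)

print(whole_word)    # ['cat', 'cat']
print(inside_word)   # ['cat']
print(line_starts)   # ['one', 'two']
print(string_start)  # ['one']
print(line_ends)     # ['one', 'two']
print(string_end)    # ['two']
```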

Sets

[ ] | Contains a set of characters to match.

[amk] | Matches either a, m, or k. It does not match amk.

[a-z] | Matches any lowercase letter from a to z.

[a\-z] | Matches a, -, or z. It matches - because \ escapes it.

[a-] | Matches a or -, because - is not being used to indicate a series of characters.

[-a] | As above, matches a or -.

[a-z0-9] | Matches characters from a to z and also from 0 to 9.

[(+*)] | Special characters become literal inside a set, so this matches (, +, *, and ).

[^ab5] | Adding ^ at the start of a set negates it. Here, it matches characters that are not a, b, or 5.
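
Each set form above can be checked quickly with re.findall. The inputs in this sketch are invented for illustration.

```python
import re

single_chars = re.findall(r"[amk]", "make")           # one character per match
lower_runs = re.findall(r"[a-z]+", "Hello World")     # a range of characters
escaped_dash = re.findall(r"[a\-z]", "a-z b")         # literal a, -, or z
alnum_runs = re.findall(r"[a-z0-9]+", "abc-123_XYZ")  # two ranges in one set
literals = re.findall(r"[(+*)]", "(1+2)*3")           # specials are literal in a set
negated = re.findall(r"[^ab5]", "ab5c6")              # anything except a, b, or 5

print(single_chars)  # ['m', 'a', 'k']
print(lower_runs)    # ['ello', 'orld']
print(escaped_dash)  # ['a', '-', 'z']
print(alnum_runs)    # ['abc', '123']
print(literals)      # ['(', '+', ')', '*']
print(negated)       # ['c', '6']
```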

Groups

( ) | Matches the expression inside the parentheses and groups it.

(? ) | Inside parentheses like this, ? acts as an extension notation. Its meaning depends on the character immediately to its right.

(?P<name>AB) | Matches the expression AB, and it can be accessed with the group name.
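
As a sketch of how named groups are read back, the example below uses two hypothetical group names, year and month, on a made-up string.

```python
import re

match = re.search(r"(?P<year>\d{4})-(?P<month>\d{2})", "Released 2018-04")

# A named group can be read by name or by its position.
year = match.group("year")
month = match.group(2)
# groupdict() returns every named group as a dictionary.
parts = match.groupdict()

print(year)   # 2018
print(month)  # 04
print(parts)  # {'year': '2018', 'month': '04'}
```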

(?aiLmsux) | Here, a, i, L, m, s, u, and x are flags:

  • a: Matches ASCII only
  • i: Ignore case
  • L: Locale dependent
  • m: Multi-line
  • s: Dot matches all characters, including \n
  • u: Matches Unicode
  • x: Verbose
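
The inline flags can be tried out as in the sketch below, which places each flag at the very start of its pattern. The sample strings are invented. The same effect is available through the flags argument, for example re.IGNORECASE or re.DOTALL.

```python
import re

# (?i) ignores case.
any_case = re.findall(r"(?i)python", "Python PYTHON python")

# (?s) lets the dot match a newline as well.
dot_all = re.findall(r"(?s)a.b", "a\nb")

# (?m) makes ^ match at the start of every line.
line_numbers = re.findall(r"(?m)^\d+", "1\n22\nx3")

# (?x) allows whitespace and comments inside the pattern.
phone = re.search(
    r"""(?x)
    \d{3}   # first three digits
    -       # literal hyphen
    \d{4}   # last four digits
    """,
    "call 555-1234",
).group()

print(any_case)      # ['Python', 'PYTHON', 'python']
print(dot_all)       # ['a\nb']
print(line_numbers)  # ['1', '22']
print(phone)         # 555-1234
```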

(?:A) | Matches the expression as represented by A, but unlike (?P<name>AB), it cannot be retrieved afterwards.

(?#...) | A comment. Contents are for us to read, not for matching.
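
One place where the difference between capturing and non-capturing groups shows up is re.findall, which returns the captured group when the pattern has one. A minimal sketch with made-up input:

```python
import re

# With a capturing group, findall returns the group's (last) capture.
captured = re.findall(r"(ab)+c", "ababc")
# With a non-capturing group, findall returns the whole match.
whole = re.findall(r"(?:ab)+c", "ababc")
# A non-capturing group leaves nothing to retrieve afterwards.
groups = re.search(r"(?:ab)+c", "ababc").groups()

# An inline comment is ignored when matching.
commented = re.findall(r"\d+(?#one or more digits)", "a1b22")

print(captured)   # ['ab']
print(whole)      # ['ababc']
print(groups)     # ()
print(commented)  # ['1', '22']
```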

A(?=B) | Lookahead assertion. This matches the expression A only if it is followed by B.

A(?!B) | Negative lookahead assertion. This matches the expression A only if it is not followed by B.
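
Here is a small sketch of both lookaheads on an invented price list. The \b in the negative case is an addition for this example: without it, the engine can backtrack to a shorter run of digits (such as 1 from 10) that is technically not followed by " USD".

```python
import re

prices = "10 USD, 20 EUR, 30 USD"

# Numbers that are followed by " USD"; the lookahead text is not consumed.
usd_amounts = re.findall(r"\d+(?= USD)", prices)

# Whole numbers that are not followed by " USD".
# \b forces the match to cover the full run of digits.
other_amounts = re.findall(r"\d+\b(?! USD)", prices)

print(usd_amounts)    # ['10', '30']
print(other_amounts)  # ['20']
```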

(?<=B)A | Positive lookbehind assertion. This matches the expression A only if B is immediately to its left. B must be a fixed-length expression.

(?<!B)A | Negative lookbehind assertion. This matches the expression A only if B is not immediately to its left. B must be a fixed-length expression.
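
The lookbehind forms can be sketched the same way. The sample text is made up, and the last part shows that Python's re module rejects a lookbehind whose length is not fixed.

```python
import re

text = "$15 and #20 and $7"

# Numbers with a dollar sign immediately to their left.
dollar_amounts = re.findall(r"(?<=\$)\d+", text)

# Whole numbers without a dollar sign immediately to their left.
# \b keeps the match from starting in the middle of a number.
other_numbers = re.findall(r"(?<!\$)\b\d+", text)

# A variable-length lookbehind such as a+ fails to compile.
try:
    re.compile(r"(?<=a+)b")
    variable_width_ok = True
except re.error:
    variable_width_ok = False

print(dollar_amounts)     # ['15', '7']
print(other_numbers)      # ['20']
print(variable_width_ok)  # False
```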

(?P=name) | Matches the expression matched by an earlier group named "name".

(...)\1 | The number 1 corresponds to the first group to be matched. If we want to match more instances of the same expression, simply use its number instead of writing out the whole expression again. We can use from 1 up to 99 such groups and their corresponding numbers.
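
Backreferences are handy for spotting repetition. The sketch below uses a numbered backreference to find doubled words and a named one (with a hypothetical group name, key) to find pairs whose two sides are identical.

```python
import re

# \1 must repeat exactly what group 1 matched, so this finds doubled words.
doubled = re.findall(r"\b(\w+) \1\b", "this is is a test test ok")

# (?P=key) repeats whatever the named group "key" matched.
same_sides = re.search(r"(?P<key>\w+)=(?P=key)", "x=x y=z").group()

print(doubled)     # ['is', 'test']
print(same_sides)  # x=x
```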

Popular Python re Module Functions

re.findall(A, B) | Matches all non-overlapping instances of an expression A in a string B and returns them in a list.

re.search(A, B) | Matches the first instance of an expression A in a string B, and returns it as a re match object. Returns None if there is no match.

re.split(A, B) | Split a string B into a list using the delimiter A.

re.sub(A, B, C) | Replace A with B in the string C.
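
The four functions can be compared on a single made-up string, as in this sketch:

```python
import re

text = "apples: 3, pears: 12, plums: 7"

# findall returns every match as a list of strings.
all_numbers = re.findall(r"\d+", text)

# search returns a match object for the first match, or None if there is none.
first = re.search(r"\d+", text)
first_number = first.group()
first_span = first.span()
missing = re.search(r"kiwi", text)

# split cuts the string wherever the delimiter pattern matches.
items = re.split(r",\s*", text)

# sub replaces every match of the pattern with the replacement.
masked = re.sub(r"\d+", "N", text)

print(all_numbers)   # ['3', '12', '7']
print(first_number)  # 3
print(first_span)    # (8, 9)
print(missing)       # None
print(items)         # ['apples: 3', 'pears: 12', 'plums: 7']
print(masked)        # apples: N, pears: N, plums: N
```

Because re.search can return None, it is usually worth checking the result before calling .group() on it.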

Useful Regular Expressions Sites for Python Users

Python 3 re module documentation

Online regex tester and debugger

Original article: https://www.pybloggers.com/2018/04/python-regular-expressions-cheat-sheet/
