[Python Study Notes] Metacharacters in Regex

What are metacharacters in Python regex?

In regular expressions (regex), metacharacters are characters that carry a special meaning inside a pattern instead of matching themselves. They let a pattern describe more than literal text: wildcards, repetition, alternatives, grouping, and positions within the string. To match a metacharacter literally, escape it with a backslash (\); for example, \. matches an actual dot. In Python, write patterns as raw strings (r"...") so that the backslash reaches the re module unchanged.
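A minimal sketch of the difference between an unescaped and an escaped metacharacter, using only the standard re module. The sample strings are made up for illustration; re.escape is the standard helper for escaping a whole string that should be matched literally.

```python
import re

sample = "3.5 3x5 345"

# Unescaped dot is a metacharacter: it matches any single character except a newline.
unescaped = re.findall(r"3.5", sample)

# Escaped dot matches only a literal ".".
escaped = re.findall(r"3\.5", sample)

# re.escape() backslash-escapes the metacharacters in a string,
# so the whole string can be searched for literally.
literal = "(approx.)"
found = re.search(re.escape(literal), "Price: 3.50 (approx.)")

print(unescaped)      # ['3.5', '3x5', '345']
print(escaped)        # ['3.5']
print(found.group())  # (approx.)
```

Without re.escape, the parentheses in "(approx.)" would be read as a group and the dot as a wildcard, so the pattern would match "approx" followed by any character rather than the literal text.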

Common Regex Metacharacters:

  1. . (Dot)
    • Matches any single character except a newline (\n); with the re.DOTALL flag it matches newlines too.
    • Example: a.c matches "abc", "axc", but not "abxc".
  2. ^ (Caret)
    • Anchors the pattern to the start of the string (with re.MULTILINE, also to the start of each line).
    • Example: ^abc matches "abc" at the beginning of the string.
  3. $ (Dollar Sign)
    • Anchors the pattern to the end of the string, or just before a trailing newline (with re.MULTILINE, also to the end of each line).
    • Example: abc$ matches "abc" at the end of the string.
  4. * (Asterisk)
    • Matches 0 or more occurrences of the preceding element.
    • Example: ab*c matches "ac", "abc", "abbc", "abbbc", etc.
  5. + (Plus)
    • Matches 1 or more occurrences of the preceding element.
    • Example: ab+c matches "abc", "abbc", "abbbc", but not "ac".
  6. ? (Question Mark)
    • Matches 0 or 1 occurrence of the preceding element (makes it optional).
    • Example: colou?r matches both "color" and "colour".
  7. [] (Square Brackets)
    • Used for defining character classes (sets of characters).
    • Example: [abc] matches "a", "b", or "c".
    • Example: [a-z] matches any lowercase letter.
  8. | (Pipe)
    • Acts as an OR operator.
    • Example: a|b matches "a" or "b".
  9. () (Parentheses)
    • Used for grouping expressions and capturing substrings.
    • Example: (abc)+ matches "abc", "abcabc", etc.
  10. \ (Backslash)
    • Used to escape special characters (so you can match them literally).
    • Example: \. matches a literal dot, not any character.
  11. {} (Curly Braces)
    • Defines quantifiers for the preceding element.
    • Example: a{2,3} matches "aa" or "aaa" (2 to 3 occurrences of "a").
  12. - (Hyphen inside brackets)
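    • Used to define ranges in a character class; outside brackets it is an ordinary character.
    • Example: [0-9] matches any ASCII digit. To match a literal hyphen inside brackets, put it first or last, as in [a-z-].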
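The examples in the list above can be checked directly in Python. The following is a small sketch, not an exhaustive test: the helper name full_matches and the candidate strings are chosen here for illustration. It uses re.fullmatch so that a candidate counts only when the pattern matches the entire string.

```python
import re

# Each entry pairs a pattern with candidate strings to test against it.
cases = [
    (r"a.c", ["abc", "axc", "abxc"]),
    (r"ab*c", ["ac", "abbc"]),
    (r"ab+c", ["ac", "abc"]),
    (r"colou?r", ["color", "colour"]),
    (r"[a-z]", ["q", "Q"]),
    (r"a|b", ["a", "b", "c"]),
    (r"(abc)+", ["abcabc", "ab"]),
    (r"a{2,3}", ["a", "aa", "aaa"]),
]


def full_matches(pattern, candidates):
    """Return the candidates that the pattern matches in their entirety."""
    return [s for s in candidates if re.fullmatch(pattern, s)]


results = {pattern: full_matches(pattern, candidates) for pattern, candidates in cases}

for pattern, matched in results.items():
    print(f"{pattern!r:12} matches {matched}")

# ^ and $ are positions, so they are shown with re.search instead.
starts_with_abc = bool(re.search(r"^abc", "abcdef"))  # True
ends_with_abc = bool(re.search(r"abc$", "xyzabc"))    # True
abc_not_at_start = bool(re.search(r"^abc", "xabc"))   # False
```

re.search would report a match for a.c in "abxc" only if some three-character substring fit the pattern; none does here, but in general re.search finds matches anywhere in the string, which is why re.fullmatch is the stricter check for these one-word examples.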

Escape Sequences in Regular Expressions:

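A backslash followed by certain letters forms an escape sequence (Python's re documentation calls these special sequences). Each one stands for a predefined character class or a position in the string, which keeps patterns short.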

  1. \d
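    • Matches any decimal digit. For str patterns in Python 3 this includes Unicode digits; it is equivalent to [0-9] only with the re.ASCII flag or in bytes patterns.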
    • Example: \d{3} matches any three digits.
  2. \D
    • Matches any non-digit character.
    • Example: \D matches any character that is not a digit.
  3. \w
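    • Matches any word character (letters, digits, and underscores). For str patterns this includes Unicode letters and digits; it is equivalent to [a-zA-Z0-9_] only with the re.ASCII flag or in bytes patterns.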
    • Example: \w+ matches one or more word characters.
  4. \W
    • Matches any non-word character.
    • Example: \W matches spaces, punctuation, and any other character that is not a letter, digit, or underscore.
  5. \s
    • Matches any whitespace character (spaces, tabs, line breaks).
    • Example: \s+ matches one or more consecutive whitespace characters.
  6. \S
    • Matches any non-whitespace character.
    • Example: \S+ matches one or more consecutive non-whitespace characters.
  7. \b
    • Matches a word boundary: the zero-width position between a word character and a non-word character, or between a word character and the start or end of the string.
    • Example: \bword\b matches "word" as a whole word, but not "swordfish".
  8. \B
    • Matches any position that is not a word boundary.
    • Example: \Bword\B matches the "word" inside "swordfish", but not "word" standing alone as a separate word.
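The sequences above can be tried out with a few short calls. This is a sketch with made-up sample strings; the Arabic-Indic digit (written as the escape "\u0663") is included only to show how \d behaves on a non-ASCII digit with and without re.ASCII.

```python
import re

# \d and \D: digits and non-digits.
digits = re.findall(r"\d+", "Order 42 shipped on 2024-01-15")
only_digits = re.sub(r"\D", "", "Tel: 555-1234")

# For str patterns, \d also matches non-ASCII decimal digits unless re.ASCII is set.
unicode_digit = re.fullmatch(r"\d", "\u0663") is not None
ascii_only_digit = re.fullmatch(r"\d", "\u0663", flags=re.ASCII) is not None

# \w and \W: word characters (including "_") and everything else.
words = re.findall(r"\w+", "snake_case, CamelCase!")
non_words = re.findall(r"\W", "a_b c!")

# \s and \S: whitespace and non-whitespace.
tokens = re.split(r"\s+", "a  b\tc\nd")
chunks = re.findall(r"\S+", " hi  there ")

# \b and \B: word boundary and non-boundary positions.
whole_in_sentence = re.search(r"\bword\b", "one word left") is not None
whole_in_swordfish = re.search(r"\bword\b", "swordfish") is not None
inner_in_swordfish = re.search(r"\Bword\B", "swordfish") is not None
inner_in_sentence = re.search(r"\Bword\B", "one word left") is not None

print(digits)       # ['42', '2024', '01', '15']
print(only_digits)  # 5551234
print(words)        # ['snake_case', 'CamelCase']
print(non_words)    # [' ', '!']
print(tokens)       # ['a', 'b', 'c', 'd']
print(chunks)       # ['hi', 'there']
```

Raw strings matter most for \b: in an ordinary Python string literal, "\b" is the backspace character, so the pattern would never reach the re module as a word-boundary assertion.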

"Answer Generated by OpenAI's ChatGPT"
