Character Classes

Standard POSIX Character Classes

    [:alnum:]     Alphanumeric characters.
    [:alpha:]     Alphabetic characters.
    [:blank:]     Space and tab characters.
    [:cntrl:]     Control characters.
    [:digit:]     Numeric characters.
    [:graph:]     Characters that are printable and also visible. (A space is printable but not visible, while an 'a' is both.)
    [:lower:]     Lower-case alphabetic characters.
    [:print:]     Printable characters (characters that are not control characters).
    [:punct:]     Punctuation characters (characters that are not letters, digits, control characters, or space characters).
    [:space:]     Space characters (such as space, tab, and formfeed, to name a few).
    [:upper:]     Upper-case alphabetic characters.
    [:xdigit:]    Characters that are hexadecimal digits.

Non-standard POSIX-style Character Classes

    [:javastart:] Start of a Java identifier
    [:javapart:]  Part of a Java identifier

Predefined Classes

    .             Matches any character other than newline
    \w            Matches a "word" character (alphanumeric plus "_")
    \W            Matches a non-word character
    \s            Matches a whitespace character
    \S            Matches a non-whitespace character
    \d            Matches a digit character
    \D            Matches a non-digit character

Boundary Matchers

    ^             Matches only at the beginning of a line
    $             Matches only at the end of a line
    \b            Matches only at a word boundary
    \B            Matches only at a non-word boundary

Greedy Closures

    A*            Matches A 0 or more times (greedy)
    A+            Matches A 1 or more times (greedy)
    A?            Matches A 1 or 0 times (greedy)
    A{n}          Matches A exactly n times (greedy)
    A{n,}         Matches A at least n times (greedy)
    A{n,m}        Matches A at least n but not more than m times (greedy)

Reluctant Closures

    A*?           Matches A 0 or more times (reluctant)
    A+?           Matches A 1 or more times (reluctant)
    A??           Matches A 0 or 1 times (reluctant)

Logical Operators

    AB            Matches A followed by B
    A|B           Matches either A or B
    (A)           Used for subexpression grouping
    (?:A)         Used for subexpression clustering (just like grouping, but without backreferences)

Backreferences

    \1            Backreference to 1st parenthesized subexpression
    \2            Backreference to 2nd parenthesized subexpression
    \3            Backreference to 3rd parenthesized subexpression
    \4            Backreference to 4th parenthesized subexpression
    \5            Backreference to 5th parenthesized subexpression
    \6            Backreference to 6th parenthesized subexpression
    \7            Backreference to 7th parenthesized subexpression
    \8            Backreference to 8th parenthesized subexpression
    \9            Backreference to 9th parenthesized subexpression

All closure operators (+, *, ?, {m,n}) are greedy by default, meaning that they match as many elements of the string as possible without causing the overall match to fail. To make a closure reluctant (non-greedy), follow it with a '?'. A reluctant closure matches as few elements of the string as possible when finding matches. {m,n} closures don't currently support reluctancy.

Line terminators
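
To make the greedy and reluctant closures and the backreferences described earlier more concrete, here is a minimal sketch. It uses java.util.regex from the Java standard library as a stand-in, on the assumption that its syntax for these three features agrees with this reference; the engine this reference describes may differ in other details (for example, in how POSIX classes are written). The class name ClosureDemo and the helper firstMatch are hypothetical names chosen for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustration only: java.util.regex is assumed to behave like the
// engine described above for closures and backreferences.
class ClosureDemo {
    // Returns the first substring of input matched by pattern, or null if none.
    static String firstMatch(String pattern, String input) {
        Matcher m = Pattern.compile(pattern).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "<b>bold</b>";

        // Greedy: .+ consumes as much as it can while still letting '>' match.
        System.out.println(firstMatch("<.+>", input));   // <b>bold</b>

        // Reluctant: .+? consumes as little as it can.
        System.out.println(firstMatch("<.+?>", input));  // <b>

        // Backreference: \1 must repeat exactly what group 1 matched.
        System.out.println(firstMatch("(\\w+) \\1", "it was was fine")); // was was
    }
}
```

The same input gives two different matches depending only on the trailing '?', which is the whole difference between a greedy and a reluctant closure.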