Java正则表达式,主要java.util.regex.Pattern 和 java.util.regex.Matcher 这两个类提供实现:
先上例程:如下
import java.util.regex.Pattern ;
import java.util.regex.Matcher ;
public class RegexDemo03{
public static void main(String args[]){
String str = "1983-07-27" ; // 指定好一个日期格式的字符串
String pat = "\\d{4}-\\d{2}-\\d{2}" ; // 指定好正则表达式
Pattern p = Pattern.compile(pat) ; // 实例化Pattern类
Matcher m = p.matcher(str) ; // 实例化Matcher类
if(m.matches()){ // 进行验证的匹配,使用正则
System.out.println("日期格式合法!") ;
}else{
System.out.println("日期格式不合法!") ;
}
}
};
字符拆分
import java.util.regex.Pattern;
public class DemoRegex001 {
public static void main(String args[]){
// 要求将里面的字符取出,也就是说按照数字拆分
String str = "A1B22C333D4444E55555F" ; // 指定好一个字符串
String pat = "\\d+" ; // 指定好正则表达式
Pattern p = Pattern.compile(pat) ; // 实例化Pattern类
String s[] = p.split(str) ; // 执行拆分操作
for(int x=0;x<s.length;x++){
System.out.print(s[x] + "\t") ;
}
}
}
Construct | Matches |
---|---|
Characters | |
x | The character x |
\\ | The backslash character |
\0n | The character with octal value 0n (0 <= n <= 7) |
\0nn | The character with octal value 0nn (0 <= n <= 7) |
\0mnn | The character with octal value 0mnn (0 <= m <= 3, 0 <= n <= 7) |
\xhh | The character with hexadecimal value 0xhh |
\uhhhh | The character with hexadecimal value 0xhhhh |
\x{h...h} | The character with hexadecimal value 0xh...h (Character.MIN_CODE_POINT <= 0xh...h <= Character.MAX_CODE_POINT ) |
\t | The tab character ('\u0009') |
\n | The newline (line feed) character ('\u000A') |
\r | The carriage-return character ('\u000D') |
\f | The form-feed character ('\u000C') |
\a | The alert (bell) character ('\u0007') |
\e | The escape character ('\u001B') |
\cx | The control character corresponding to x |
Character classes | |
[abc] | a, b, or c (simple class) |
[^abc] | Any character except a, b, or c (negation) |
[a-zA-Z] | a through z or A through Z, inclusive (range) |
[a-d[m-p]] | a through d, or m through p: [a-dm-p] (union) |
[a-z&&[def]] | d, e, or f (intersection) |
[a-z&&[^bc]] | a through z, except for b and c: [ad-z] (subtraction) |
[a-z&&[^m-p]] | a through z, and not m through p: [a-lq-z](subtraction) |
Predefined character classes | |
. | Any character (may or may not match line terminators) |
\d | A digit: [0-9] |
\D | A non-digit: [^0-9] |
\s | A whitespace character: [ \t\n\x0B\f\r] |
\S | A non-whitespace character: [^\s] |
\w | A word character: [a-zA-Z_0-9] |
\W | A non-word character: [^\w] |
POSIX character classes (US-ASCII only) | |
\p{Lower} | A lower-case alphabetic character: [a-z] |
\p{Upper} | An upper-case alphabetic character:[A-Z] |
\p{ASCII} | All ASCII:[\x00-\x7F] |
\p{Alpha} | An alphabetic character:[\p{Lower}\p{Upper}] |
\p{Digit} | A decimal digit: [0-9] |
\p{Alnum} | An alphanumeric character:[\p{Alpha}\p{Digit}] |
\p{Punct} | Punctuation: One of !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~ |
\p{Graph} | A visible character: [\p{Alnum}\p{Punct}] |
\p{Print} | A printable character: [\p{Graph}\x20] |
\p{Blank} | A space or a tab: [ \t] |
\p{Cntrl} | A control character: [\x00-\x1F\x7F] |
\p{XDigit} | A hexadecimal digit: [0-9a-fA-F] |
\p{Space} | A whitespace character: [ \t\n\x0B\f\r] |
java.lang.Character classes (simple java character type) | |
\p{javaLowerCase} | Equivalent to java.lang.Character.isLowerCase() |
\p{javaUpperCase} | Equivalent to java.lang.Character.isUpperCase() |
\p{javaWhitespace} | Equivalent to java.lang.Character.isWhitespace() |
\p{javaMirrored} | Equivalent to java.lang.Character.isMirrored() |
Classes for Unicode scripts, blocks, categories and binary properties | |
\p{IsLatin} | A Latin script character (script) |
\p{InGreek} | A character in the Greek block (block) |
\p{Lu} | An uppercase letter (category) |
\p{IsAlphabetic} | An alphabetic character (binary property) |
\p{Sc} | A currency symbol |
\P{InGreek} | Any character except one in the Greek block (negation) |
[\p{L}&&[^\p{Lu}]] | Any letter except an uppercase letter (subtraction) |
Boundary matchers | |
^ | The beginning of a line |
$ | The end of a line |
\b | A word boundary |
\B | A non-word boundary |
\A | The beginning of the input |
\G | The end of the previous match |
\Z | The end of the input but for the final terminator, if any |
\z | The end of the input |
Greedy quantifiers | |
X? | X, once or not at all |
X* | X, zero or more times |
X+ | X, one or more times |
X{n} | X, exactly n times |
X{n,} | X, at least n times |
X{n,m} | X, at least n but not more than m times |
Reluctant quantifiers | |
X?? | X, once or not at all |
X*? | X, zero or more times |
X+? | X, one or more times |
X{n}? | X, exactly n times |
X{n,}? | X, at least n times |
X{n,m}? | X, at least n but not more than m times |
Possessive quantifiers | |
X?+ | X, once or not at all |
X*+ | X, zero or more times |
X++ | X, one or more times |
X{n}+ | X, exactly n times |
X{n,}+ | X, at least n times |
X{n,m}+ | X, at least n but not more than m times |
Logical operators | |
XY | X followed by Y |
X|Y | Either X or Y |
(X) | X, as a capturing group |
Back references | |
\n | Whatever the nth capturing group matched |
\k<name> | Whatever the named-capturing group "name" matched |
Quotation | |
\ | Nothing, but quotes the following character |
\Q | Nothing, but quotes all characters until \E |
\E | Nothing, but ends quoting started by \Q |
Special constructs (named-capturing and non-capturing) | |
(?<name>X) | X, as a named-capturing group |
(?:X) | X, as a non-capturing group |
(?idmsuxU-idmsuxU) | Nothing, but turns match flags i d m s u x U on - off |
(?idmsux-idmsux:X) | X, as a non-capturing group with the given flags i d m s u x on - off |
(?=X) | X, via zero-width positive lookahead |
(?!X) | X, via zero-width negative lookahead |
(?<=X) | X, via zero-width positive lookbehind |
(?<!X) | X, via zero-width negative lookbehind |
(?>X) | X, as an independent, non-capturing group |