Modifier and Type | Constant Field | Value |
---|---|---|
public static final int | CANON_EQ | 128 |
public static final int | CASE_INSENSITIVE | 2 |
public static final int | COMMENTS | 4 |
public static final int | DOTALL | 32 |
public static final int | LITERAL | 16 |
public static final int | MULTILINE | 8 |
public static final int | UNICODE_CASE | 64 |
public static final int | UNICODE_CHARACTER_CLASS | 256 |
public static final int | UNIX_LINES | 1 |
CASE_INSENSITIVE
public static final int CASE_INSENSITIVE
Enables case-insensitive matching.
By default, case-insensitive matching assumes that only characters in the US-ASCII charset are being matched. Unicode-aware case-insensitive matching can be enabled by specifying the UNICODE_CASE flag in conjunction with this flag.
Case-insensitive matching can also be enabled via the embedded flag expression (?i).
Specifying this flag may impose a slight performance penalty.
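To make the ASCII-only default concrete, here is a minimal sketch. The class name `CaseInsensitiveDemo` and the helper `matches` are made up for illustration; only `Pattern.compile`, `Pattern.matcher`, and `Matcher.matches` come from the JDK.

```java
import java.util.regex.Pattern;

class CaseInsensitiveDemo {
    // Hypothetical helper: true if the entire input matches the regex under the given flags.
    static boolean matches(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        // ASCII letters: CASE_INSENSITIVE alone is enough.
        System.out.println(matches("java", Pattern.CASE_INSENSITIVE, "JAVA"));
        // Non-ASCII letters: e-acute (U+00E9) and E-acute (U+00C9) differ only by case,
        // but the ASCII-only default does not treat them as equal.
        System.out.println(matches("\u00E9", Pattern.CASE_INSENSITIVE, "\u00C9"));
        // Adding UNICODE_CASE makes the comparison Unicode-aware.
        System.out.println(matches("\u00E9",
            Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE, "\u00C9"));
        // The embedded flag (?i) has the same effect as CASE_INSENSITIVE.
        System.out.println(matches("(?i)java", 0, "JaVa"));
    }
}
```

Run as written, this should print true, false, true, true.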
COMMENTS
public static final int COMMENTS
Permits whitespace and comments in pattern.
In this mode, whitespace is ignored, and embedded comments starting with # are ignored until the end of a line.
Comments mode can also be enabled via the embedded flag expression (?x).
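As an illustration of comments mode, the sketch below annotates a date pattern line by line. The pattern, the class name `CommentsDemo`, and the constant `DATE` are assumptions chosen for the example, not part of the Javadoc.

```java
import java.util.regex.Pattern;

class CommentsDemo {
    // Hypothetical pattern for dates such as 2017-03-15.
    // With COMMENTS, the spaces, the newlines, and everything after # are ignored.
    static final Pattern DATE = Pattern.compile(
        "(\\d{4})  # year\n" +
        "-         # separator\n" +
        "(\\d{2})  # month\n" +
        "-         # separator\n" +
        "(\\d{2})  # day",
        Pattern.COMMENTS);

    public static void main(String[] args) {
        System.out.println(DATE.matcher("2017-03-15").matches());     // true
        // Whitespace in the pattern is ignored, not matched, so spaced input fails.
        System.out.println(DATE.matcher("2017 - 03 - 15").matches()); // false
    }
}
```

To match a literal space or # in this mode, escape it or place it in a character class.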
MULTILINE
public static final int MULTILINE
Enables multiline mode.
In multiline mode the expressions ^ and $ match just after or just before, respectively, a line terminator or the end of the input sequence. By default these expressions only match at the beginning and the end of the entire input sequence.
Multiline mode can also be enabled via the embedded flag expression (?m).
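The difference between the default and multiline behavior of ^ and $ can be sketched as follows. `MultilineDemo` and `findAll` are hypothetical names introduced for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MultilineDemo {
    // Hypothetical helper: collects every match of the regex in the input.
    static List<String> findAll(String regex, int flags, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String input = "one\ntwo\nthree";
        // Default: ^ and $ anchor to the whole input, so no single word qualifies.
        System.out.println(findAll("^\\w+$", 0, input));                 // []
        // MULTILINE: ^ and $ also match around each line terminator.
        System.out.println(findAll("^\\w+$", Pattern.MULTILINE, input)); // [one, two, three]
        // The embedded flag (?m) behaves the same way.
        System.out.println(findAll("(?m)^\\w+$", 0, input));             // [one, two, three]
    }
}
```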
DOTALL
public static final int DOTALL
Enables dotall mode.
In dotall mode, the expression . matches any character, including a line terminator. By default this expression does not match line terminators.
Dotall mode can also be enabled via the embedded flag expression (?s). (The s is a mnemonic for "single-line" mode, which is what this is called in Perl.)
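A short sketch of the effect on the . expression, using an input that contains a newline. The class name `DotallDemo` and the helper `matches` are illustrative only.

```java
import java.util.regex.Pattern;

class DotallDemo {
    // Hypothetical helper: true if the entire input matches the regex under the given flags.
    static boolean matches(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        // Default: . does not match the line terminator between a and b.
        System.out.println(matches("a.b", 0, "a\nb"));              // false
        // DOTALL: . matches any character, including \n.
        System.out.println(matches("a.b", Pattern.DOTALL, "a\nb")); // true
        // The embedded flag (?s) is equivalent.
        System.out.println(matches("(?s)a.b", 0, "a\nb"));          // true
    }
}
```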
Particularly useful among these flags are Pattern.CASE_INSENSITIVE, Pattern.MULTILINE, and Pattern.COMMENTS (helpful for clarity and/or documentation).
CANON_EQ
public static final int CANON_EQ
Enables canonical equivalence.
When this flag is specified then two characters will be considered to match if, and only if, their full canonical decompositions match. The expression "a\u030A", for example, will match the string "\u00E5" when this flag is specified. By default, matching does not take canonical equivalence into account.
The regex escape \u003F (written "\\u003F" in Java source) matches the string "?".
There is no embedded flag character for enabling canonical equivalence.
Specifying this flag may impose a performance penalty.
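The documented example can be tried directly. This is a minimal sketch that only reproduces the case named above (the pattern a followed by a combining ring, against the precomposed character). `CanonEqDemo` and `matches` are hypothetical names.

```java
import java.util.regex.Pattern;

class CanonEqDemo {
    // Hypothetical helper: true if the entire input matches the regex under the given flags.
    static boolean matches(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        String decomposed = "a\u030A"; // 'a' followed by COMBINING RING ABOVE
        String composed = "\u00E5";    // precomposed LATIN SMALL LETTER A WITH RING ABOVE
        // Without CANON_EQ the two code point sequences are simply different.
        System.out.println(matches(decomposed, 0, composed));
        // With CANON_EQ their canonical decompositions are compared instead.
        System.out.println(matches(decomposed, Pattern.CANON_EQ, composed));
    }
}
```

Per the description above, the first call is expected to report no match and the second a match.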
public static Pattern compile(String regex, int flags)
Compiles the given regular expression into a pattern with the given flags.
Parameters:
regex - The expression to be compiled
flags - Match flags, a bit mask that may include CASE_INSENSITIVE, MULTILINE, DOTALL, UNICODE_CASE, CANON_EQ, UNIX_LINES, LITERAL, UNICODE_CHARACTER_CLASS and COMMENTS
Returns:
the given regular expression compiled into a pattern with the given flags
Throws:
IllegalArgumentException - If bit values other than those corresponding to the defined match flags are set in flags
PatternSyntaxException - If the expression's syntax is invalid
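A small sketch of calling compile with a flag mask and handling a syntax error. `CompileDemo` and `isValidRegex` are hypothetical names. The sketch deliberately sticks to defined flags and does not exercise the IllegalArgumentException case.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class CompileDemo {
    // Hypothetical helper: true if the regex compiles without a syntax error.
    static boolean isValidRegex(String regex) {
        try {
            Pattern.compile(regex, 0);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Flags are combined with bitwise OR: 2 | 8 = 10.
        Pattern p = Pattern.compile("^java", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);
        System.out.println(p.flags());            // 10
        System.out.println(isValidRegex("java")); // true
        System.out.println(isValidRegex("(java")); // false: unclosed group
    }
}
```

PatternSyntaxException is unchecked, so catching it is optional; it is worth doing when the regex comes from user input.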
find
public boolean find(int start)
Resets this matcher and then attempts to find the next subsequence of the input sequence that matches the pattern, starting at the specified index.
If the match succeeds then more information can be obtained via the start, end, and group methods, and subsequent invocations of the find() method will start at the first character not matched by this match.
Parameters:
start - the index to start searching for a match
Returns:
true if, and only if, a subsequence of the input sequence starting at the given index matches this matcher's pattern
Throws:
IndexOutOfBoundsException - If start is less than zero or if start is greater than the length of the input sequence.
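The interplay between find(int start) and the no-argument find() might look like this in practice. `FindFromDemo` and `indexOfMatch` are names invented for the sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindFromDemo {
    // Hypothetical helper: start index of the first match at or after 'from', or -1 if none.
    static int indexOfMatch(Pattern p, String input, int from) {
        Matcher m = p.matcher(input);
        return m.find(from) ? m.start() : -1;
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("java", Pattern.CASE_INSENSITIVE);
        String input = "java Java JAVA";
        Matcher m = p.matcher(input);

        // find(5) resets the matcher and searches from index 5.
        System.out.println(m.find(5) + " " + m.group() + " at " + m.start()); // true Java at 5
        // A following find() continues after the previous match.
        System.out.println(m.find() + " " + m.group() + " at " + m.start());  // true JAVA at 10
        // find(0) resets again, so the first occurrence is found once more.
        System.out.println(m.find(0) + " " + m.group() + " at " + m.start()); // true java at 0

        try {
            m.find(input.length() + 1); // start beyond the input length
        } catch (IndexOutOfBoundsException e) {
            System.out.println("start index out of range");
        }
    }
}
```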
group
public String group()
Returns the input subsequence matched by the previous match.
For a matcher m with input sequence s, the expressions m.group() and s.substring(m.start(), m.end()) are equivalent.
Note that some patterns, for example a*, match the empty string. This method will return the empty string when the pattern successfully matches the empty string in the input.
Specified by: group in interface MatchResult
Returns:
The (possibly empty) subsequence matched by the previous match, in string form
Throws:
IllegalStateException - If no match has yet been attempted, or if the previous match operation failed
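The three behaviors described for group() (equivalence with substring, empty matches, and the exception before any match) can be sketched as below. `GroupDemo` and `firstMatch` are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    // Hypothetical helper: text of the first match, or null if the regex never matches.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String s = "Regular expressions are in Java";
        Matcher m = Pattern.compile("J\\w+").matcher(s);

        try {
            m.group(); // no match has been attempted yet
        } catch (IllegalStateException e) {
            System.out.println("group() before a match throws IllegalStateException");
        }

        if (m.find()) {
            // group() and substring(start, end) describe the same text.
            System.out.println(m.group().equals(s.substring(m.start(), m.end()))); // true
        }

        // a* matches the empty string at index 0 of "bbb", so group() returns "".
        System.out.println("[" + firstMatch("a*", "bbb") + "]"); // []
    }
}
```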
// strings/ReFlags.java
// (c)2017 MindView LLC: see Copyright.txt
// We make no guarantees that this code is fit for any purpose.
// Visit http://OnJava8.com for more book information.
import java.util.regex.*;
public class ReFlags {
  public static void main(String[] args) {
    Pattern p = Pattern.compile("^java", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);
    System.out.println("p.flags():" + p.flags());
    Matcher m =
        p.matcher(
            "java has regex\nJava has regex\n"
                + "JAVA has pretty good regular expressions\n"
                + "Regular expressions are in Java");
    while (m.find()) {
      System.out.println(m.group());
    }
  }
}
/* My Output:
p.flags():10
java
Java
JAVA
*/
When Pattern.CASE_INSENSITIVE is replaced with Pattern.UNIX_LINES, matching becomes case-sensitive again (and p.flags() becomes 9), so the only match printed is:
java
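UNIX_LINES itself changes which characters count as line terminators: with it set, only \n is recognized for the behavior of ., ^, and $. The sketch below shows this with a carriage return; `UnixLinesDemo` and `startsSomeLine` are hypothetical names, and the input strings are made up for the example.

```java
import java.util.regex.Pattern;

class UnixLinesDemo {
    // Hypothetical helper: true if 'word' appears at the start of some line.
    // MULTILINE is always added so that ^ can match after a line terminator.
    static boolean startsSomeLine(String word, int extraFlags, String input) {
        Pattern p = Pattern.compile("^" + word, extraFlags | Pattern.MULTILINE);
        return p.matcher(input).find();
    }

    public static void main(String[] args) {
        // Default: a lone \r counts as a line terminator, so "java" starts a line.
        System.out.println(startsSomeLine("java", 0, "x\rjava"));                  // true
        // UNIX_LINES: only \n terminates a line, so \r no longer does.
        System.out.println(startsSomeLine("java", Pattern.UNIX_LINES, "x\rjava")); // false
        // \n is a line terminator in both modes.
        System.out.println(startsSomeLine("java", Pattern.UNIX_LINES, "x\njava")); // true
    }
}
```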
/**
* Regular expression modifier values. Instead of being passed as
* arguments, they can also be passed as inline modifiers.
* For example, the following statements have the same effect.
* <pre>
* RegExp r1 = RegExp.compile("abc", Pattern.I|Pattern.M);
* RegExp r2 = RegExp.compile("(?im)abc", 0);
* </pre>
*
* The flags are duplicated so that the familiar Perl match flag
* names are available.
*/
Operator | Description |
---|---|
\| | Bitwise OR |
& | Bitwise AND |
~ | Bitwise Complement |
^ | Bitwise XOR |
<< | Left Shift |
>> | Right Shift |
>>> | Unsigned Right Shift |
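Because each Pattern flag occupies a single bit, the operators in the table are enough to build, test, and edit a flag mask. The helpers below (`hasFlag`, `clearFlag`, `toggleFlag`) and the class `FlagBitsDemo` are hypothetical conveniences, not part of java.util.regex.

```java
import java.util.regex.Pattern;

class FlagBitsDemo {
    // AND with the flag isolates its bit; a non-zero result means the flag is set.
    static boolean hasFlag(int flags, int flag) {
        return (flags & flag) != 0;
    }

    // AND with the complement switches the flag's bit off and leaves the rest untouched.
    static int clearFlag(int flags, int flag) {
        return flags & ~flag;
    }

    // XOR flips the flag's bit.
    static int toggleFlag(int flags, int flag) {
        return flags ^ flag;
    }

    public static void main(String[] args) {
        int flags = Pattern.CASE_INSENSITIVE | Pattern.MULTILINE;          // 2 | 8 = 10
        System.out.println(flags);                                         // 10
        System.out.println(hasFlag(flags, Pattern.MULTILINE));             // true
        System.out.println(hasFlag(flags, Pattern.DOTALL));                // false
        System.out.println(clearFlag(flags, Pattern.CASE_INSENSITIVE));    // 8
        System.out.println(toggleFlag(flags, Pattern.DOTALL));             // 42
        // Each constant is a power of two, e.g. MULTILINE is 1 shifted left by 3.
        System.out.println((1 << 3) == Pattern.MULTILINE);                 // true
    }
}
```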
references:
1. On Java 8 - Bruce Eckel
2. https://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html#CANON_EQ
3. https://docs.oracle.com/javase/8/docs/api/constant-values.html#java.util.regex.Pattern.CANON_EQ
4. https://docs.oracle.com/javase/8/docs/api/java/util/regex/Matcher.html#find--
5. https://docs.oracle.com/javase/8/docs/api/java/util/regex/Matcher.html#group--
6. https://github.com/wangbingfeng/OnJava8-Examples/blob/master/strings/ReFlags.java
7. https://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html#compile-java.lang.String-int-
9. https://www.programiz.com/java-programming/bitwise-operators