strings - Pattern 9 Flags

Modifier and Type          Constant Field             Value
public static final int    CANON_EQ                   128
public static final int    CASE_INSENSITIVE           2
public static final int    COMMENTS                   4
public static final int    DOTALL                     32
public static final int    LITERAL                    16
public static final int    MULTILINE                  8
public static final int    UNICODE_CASE               64
public static final int    UNICODE_CHARACTER_CLASS    256
public static final int    UNIX_LINES                 1
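
Every value in the table is a distinct power of two, which is why several flags can be packed into one int. The sketch below shows how such a bit mask might be built and inspected; the class FlagBitsDemo and the helper has are hypothetical names used only for illustration.

```java
import java.util.regex.Pattern;

class FlagBitsDemo {
  // True if every bit of `flag` is set in `flags`.
  static boolean has(int flags, int flag) {
    return (flags & flag) == flag;
  }

  public static void main(String[] args) {
    int flags = Pattern.CASE_INSENSITIVE | Pattern.MULTILINE; // 2 | 8
    System.out.println(flags);                                // 10
    System.out.println(has(flags, Pattern.MULTILINE));        // true
    System.out.println(has(flags, Pattern.DOTALL));           // false
    // Clear a single flag with AND plus complement.
    System.out.println(flags & ~Pattern.CASE_INSENSITIVE);    // 8
    // Each constant is a single bit, e.g. MULTILINE is 1 << 3.
    System.out.println(Pattern.MULTILINE == 1 << 3);          // true
  }
}
```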

CASE_INSENSITIVE

public static final int CASE_INSENSITIVE

Enables case-insensitive matching.

By default, case-insensitive matching assumes that only characters in the US-ASCII charset are being matched. Unicode-aware case-insensitive matching can be enabled by specifying the UNICODE_CASE flag in conjunction with this flag.

Case-insensitive matching can also be enabled via the embedded flag expression (?i).

Specifying this flag may impose a slight performance penalty.

See Also:

Constant Field Values
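
As a rough illustration of the description above, the following sketch compares ASCII-only and Unicode-aware case-insensitive matching, plus the embedded (?i) form. CaseInsensitiveDemo and its matches helper are made-up names, and the accented letter is just one convenient non-ASCII example.

```java
import java.util.regex.Pattern;

class CaseInsensitiveDemo {
  // True if the whole input matches the regex under the given flags.
  static boolean matches(String regex, int flags, String input) {
    return Pattern.compile(regex, flags).matcher(input).matches();
  }

  public static void main(String[] args) {
    String lower = "\u00E9"; // lowercase e with acute accent
    String upper = "\u00C9"; // uppercase E with acute accent

    // ASCII letters: CASE_INSENSITIVE alone is enough.
    System.out.println(matches("java", Pattern.CASE_INSENSITIVE, "JAVA")); // true
    // Non-ASCII letters are not folded unless UNICODE_CASE is added.
    System.out.println(matches(lower, Pattern.CASE_INSENSITIVE, upper));   // false
    System.out.println(matches(lower,
        Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE, upper));          // true
    // The embedded flag expression gives the same result as the constant.
    System.out.println(matches("(?i)java", 0, "JAVA"));                    // true
  }
}
```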

 

COMMENTS

public static final int COMMENTS

Permits whitespace and comments in pattern.

In this mode, whitespace is ignored, and embedded comments starting with # are ignored until the end of a line.

Comments mode can also be enabled via the embedded flag expression (?x).

See Also:

Constant Field Values
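
Comments mode is easiest to appreciate on a pattern that spans several lines. Here is a minimal sketch, assuming a simple yyyy-mm-dd date format; CommentsDemo, DATE and month are illustrative names, not part of any library.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CommentsDemo {
  // COMMENTS mode ignores the whitespace in the pattern and everything
  // from # to the end of each line.
  static final Pattern DATE = Pattern.compile(
      "(\\d{4})  # year\n"
    + "-         # separator\n"
    + "(\\d{2})  # month\n"
    + "-         # separator\n"
    + "(\\d{2})  # day\n",
      Pattern.COMMENTS);

  // Returns the month part of the date, or null if the input does not match.
  static String month(String input) {
    Matcher m = DATE.matcher(input);
    return m.matches() ? m.group(2) : null;
  }

  public static void main(String[] args) {
    System.out.println(month("2017-06-30")); // 06
    System.out.println(month("not a date")); // null
  }
}
```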

 

MULTILINE

public static final int MULTILINE

Enables multiline mode.

In multiline mode the expressions ^ and $ match just after or just before, respectively, a line terminator or the end of the input sequence. By default these expressions only match at the beginning and the end of the entire input sequence.

Multiline mode can also be enabled via the embedded flag expression (?m).

See Also:

Constant Field Values
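
To make the effect on ^ and $ concrete, this sketch counts whole-line matches in a three-line string with and without the flag. MultilineDemo and count are hypothetical names chosen for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MultilineDemo {
  // Counts how many times the regex is found in the input.
  static int count(String regex, int flags, String input) {
    Matcher m = Pattern.compile(regex, flags).matcher(input);
    int n = 0;
    while (m.find()) {
      n++;
    }
    return n;
  }

  public static void main(String[] args) {
    String text = "one\ntwo\nthree";
    // Default: ^ and $ anchor to the whole input, and \w+ cannot cross "\n".
    System.out.println(count("^\\w+$", 0, text));                 // 0
    // MULTILINE: ^ and $ also match around each line terminator.
    System.out.println(count("^\\w+$", Pattern.MULTILINE, text)); // 3
    // Same behavior via the embedded flag expression.
    System.out.println(count("(?m)^\\w+$", 0, text));             // 3
  }
}
```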

 

DOTALL

public static final int DOTALL

Enables dotall mode.

In dotall mode, the expression . matches any character, including a line terminator. By default this expression does not match line terminators.

Dotall mode can also be enabled via the embedded flag expression (?s). (The s is a mnemonic for "single-line" mode, which is what this is called in Perl.)

See Also:

Constant Field Values
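
A short sketch of the difference, using a newline between two letters as the input; DotallDemo and matches are names invented for this example.

```java
import java.util.regex.Pattern;

class DotallDemo {
  // True if the whole input matches the regex under the given flags.
  static boolean matches(String regex, int flags, String input) {
    return Pattern.compile(regex, flags).matcher(input).matches();
  }

  public static void main(String[] args) {
    String input = "a\nb";
    // Default: the dot refuses to match the line terminator.
    System.out.println(matches("a.b", 0, input));              // false
    // DOTALL: the dot matches any character, including "\n".
    System.out.println(matches("a.b", Pattern.DOTALL, input)); // true
    // Same behavior via the embedded flag expression.
    System.out.println(matches("(?s)a.b", 0, input));          // true
  }
}
```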

 

Particularly useful among these flags are Pattern.CASE_INSENSITIVE, Pattern.MULTILINE, and Pattern.COMMENTS (helpful for clarity and/or documentation).

CANON_EQ

public static final int CANON_EQ

Enables canonical equivalence.

When this flag is specified then two characters will be considered to match if, and only if, their full canonical decompositions match. The expression "a\u030A", for example, will match the string "\u00E5" when this flag is specified. By default, matching does not take canonical equivalence into account.

The regular expression \u003F (written "\\u003F" in Java source) matches the string "?".

There is no embedded flag character for enabling canonical equivalence.

Specifying this flag may impose a performance penalty.

See Also:

Constant Field Values
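
The documented example can be tried directly. The sketch below reuses the same two strings from the description above; CanonEqDemo and matches are illustrative names only.

```java
import java.util.regex.Pattern;

class CanonEqDemo {
  // True if the whole input matches the regex under the given flags.
  static boolean matches(String regex, int flags, String input) {
    return Pattern.compile(regex, flags).matcher(input).matches();
  }

  public static void main(String[] args) {
    String decomposed = "a\u030A"; // letter a followed by a combining ring above
    String precomposed = "\u00E5"; // single code point: a with ring above

    // Default: the two forms are different character sequences.
    System.out.println(matches(decomposed, 0, precomposed));
    // CANON_EQ: canonically equivalent forms are treated as matching.
    System.out.println(matches(decomposed, Pattern.CANON_EQ, precomposed));
  }
}
```

Going by the description above, the first call should print false and the second true.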

 

public static Pattern compile(String regex,
                              int flags)

Compiles the given regular expression into a pattern with the given flags.

Parameters:

regex - The expression to be compiled

flags - Match flags, a bit mask that may include CASE_INSENSITIVE, MULTILINE, DOTALL, UNICODE_CASE, CANON_EQ, UNIX_LINES, LITERAL, UNICODE_CHARACTER_CLASS and COMMENTS

Returns:

the given regular expression compiled into a pattern with the given flags

Throws:

IllegalArgumentException - If bit values other than those corresponding to the defined match flags are set in flags

PatternSyntaxException - If the expression's syntax is invalid
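
One way to handle the syntax exception listed above is a small wrapper like the following sketch. CompileDemo and tryCompile are hypothetical names, and returning null on failure is just one possible design, chosen here to keep the example short.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class CompileDemo {
  // Returns the compiled pattern, or null if the regex syntax is invalid.
  static Pattern tryCompile(String regex, int flags) {
    try {
      return Pattern.compile(regex, flags);
    } catch (PatternSyntaxException e) {
      System.out.println("Bad regex: " + e.getDescription());
      return null;
    }
  }

  public static void main(String[] args) {
    // Flags are combined into one bit mask with bitwise OR.
    Pattern ok = tryCompile("a+", Pattern.MULTILINE | Pattern.DOTALL);
    System.out.println(ok.flags());          // 40, i.e. 8 | 32
    // An unclosed group is a syntax error, so the helper returns null.
    System.out.println(tryCompile("(", 0));
  }
}
```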

 

find

public boolean find(int start)

Resets this matcher and then attempts to find the next subsequence of the input sequence that matches the pattern, starting at the specified index.

If the match succeeds then more information can be obtained via the start, end, and group methods, and subsequent invocations of the find() method will start at the first character not matched by this match.

Parameters:

start - the index to start searching for a match

Returns:

true if, and only if, a subsequence of the input sequence starting at the given index matches this matcher's pattern

Throws:

IndexOutOfBoundsException - If start is less than zero or if start is greater than the length of the input sequence.
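
The following sketch shows how find(int start) might be used to search from a chosen index, and that it resets the matcher first. FindFromDemo and indexOfMatch are names made up for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindFromDemo {
  // Index of the first match at or after `from`, or -1 if there is none.
  static int indexOfMatch(Pattern p, String input, int from) {
    Matcher m = p.matcher(input);
    return m.find(from) ? m.start() : -1;
  }

  public static void main(String[] args) {
    Pattern p = Pattern.compile("java", Pattern.CASE_INSENSITIVE);
    String s = "java Java JAVA";
    System.out.println(indexOfMatch(p, s, 0));  // 0
    System.out.println(indexOfMatch(p, s, 1));  // 5
    System.out.println(indexOfMatch(p, s, 11)); // -1

    Matcher m = p.matcher(s);
    while (m.find()) {
      // consume every match so that a plain find() would now fail
    }
    // find(int) resets the matcher, so searching from 0 succeeds again.
    System.out.println(m.find(0));              // true

    try {
      m.find(s.length() + 1);                   // start beyond the input
    } catch (IndexOutOfBoundsException e) {
      System.out.println("start out of range");
    }
  }
}
```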

 

group

public String group()

Returns the input subsequence matched by the previous match.

For a matcher m with input sequence s, the expressions m.group() and s.substring(m.start(), m.end()) are equivalent.

Note that some patterns, for example a*, match the empty string. This method will return the empty string when the pattern successfully matches the empty string in the input.

Specified by:

group in interface MatchResult

Returns:

The (possibly empty) subsequence matched by the previous match, in string form

Throws:

IllegalStateException - If no match has yet been attempted, or if the previous match operation failed
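
A small sketch of the points above: group() agrees with substring(start, end), a pattern such as a* can yield empty matches, and calling group() before any match attempt fails. GroupDemo and groups are illustrative names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
  // Collects the text of every match, checking the substring equivalence.
  static List<String> groups(String regex, String input) {
    List<String> out = new ArrayList<>();
    Matcher m = Pattern.compile(regex).matcher(input);
    while (m.find()) {
      // group() is the same text as substring(start, end).
      if (!m.group().equals(input.substring(m.start(), m.end()))) {
        throw new AssertionError("group() and substring() disagree");
      }
      out.add(m.group());
    }
    return out;
  }

  public static void main(String[] args) {
    // a* matches the empty string before "b" and again at the end.
    System.out.println(groups("a*", "baaa")); // [, aaa, ]

    try {
      Pattern.compile("a").matcher("a").group(); // no match attempted yet
    } catch (IllegalStateException e) {
      System.out.println("group() called before a match");
    }
  }
}
```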

// strings/ReFlags.java
// (c)2017 MindView LLC: see Copyright.txt
// We make no guarantees that this code is fit for any purpose.
// Visit http://OnJava8.com for more book information.

import java.util.regex.*;

public class ReFlags {
  public static void main(String[] args) {
    Pattern p = Pattern.compile("^java", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);
    System.out.println("p.flags():" + p.flags());
    Matcher m =
        p.matcher(
            "java has regex\nJava has regex\n"
                + "JAVA has pretty good regular expressions\n"
                + "Regular expressions are in Java");
    while (m.find()) {
      System.out.println(m.group());
    }
  }
}
/* My Output:
p.flags():10
java
Java
JAVA
*/

When Pattern.CASE_INSENSITIVE is replaced with Pattern.UNIX_LINES, matching becomes case-sensitive again, so p.flags() is 9 and the only match is:

java
    /**
     * Regular expression modifier values.  Instead of being passed as
     * arguments, they can also be passed as inline modifiers.
     * For example, the following statements have the same effect.
     * <pre>
     * RegExp r1 = RegExp.compile("abc", Pattern.I|Pattern.M);
     * RegExp r2 = RegExp.compile("(?im)abc", 0);
     * </pre>
     *
     * The flags are duplicated so that the familiar Perl match flag
     * names are available.
     */
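
The equivalence described in this comment can be checked with the public java.util.regex API. The sketch below uses Pattern.compile and the full constant names rather than the RegExp and Pattern.I / Pattern.M names that appear in the comment; InlineFlagsDemo and findAll are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class InlineFlagsDemo {
  // Collects the text of every match of p in the input.
  static List<String> findAll(Pattern p, String input) {
    List<String> out = new ArrayList<>();
    Matcher m = p.matcher(input);
    while (m.find()) {
      out.add(m.group());
    }
    return out;
  }

  public static void main(String[] args) {
    String text = "java has regex\nJava has regex\n"
        + "JAVA has pretty good regular expressions\n"
        + "Regular expressions are in Java";
    // Flags passed as an argument.
    Pattern byArgument =
        Pattern.compile("^java", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);
    // The same flags passed as inline modifiers.
    Pattern byInline = Pattern.compile("(?im)^java");
    System.out.println(findAll(byArgument, text)); // [java, Java, JAVA]
    System.out.println(findAll(byInline, text));   // [java, Java, JAVA]
  }
}
```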

Java Bitwise and Bit Shift Operators

Operator    Description
|           Bitwise OR
&           Bitwise AND
~           Bitwise Complement
^           Bitwise XOR
<<          Left Shift
>>          Right Shift
>>>         Unsigned Right Shift

references:

1. On Java 8 - Bruce Eckel

2. https://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html#CANON_EQ

3. https://docs.oracle.com/javase/8/docs/api/constant-values.html#java.util.regex.Pattern.CANON_EQ

4. https://docs.oracle.com/javase/8/docs/api/java/util/regex/Matcher.html#find--

5. https://docs.oracle.com/javase/8/docs/api/java/util/regex/Matcher.html#group--

6. https://github.com/wangbingfeng/OnJava8-Examples/blob/master/strings/ReFlags.java

7. https://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html#compile-java.lang.String-int-

8. http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/687fd7c7986d/src/share/classes/java/util/regex/Pattern.java

9. https://www.programiz.com/java-programming/bitwise-operators
