Regular Expression in Java – Java Regex Example

Welcome to Regular Expression in Java, also called Java Regex. When I started programming, Java regular expressions were a nightmare for me. This tutorial aims to help you master regular expressions in Java, and I will also come back here to refresh my own Java Regex knowledge.

Regular Expression in Java

A regular expression in Java defines a pattern for a String. Regular expressions can be used to search, edit or manipulate text. Regular expressions are not language specific, but they differ slightly from language to language. Java’s regular expression syntax is most similar to Perl’s.

The Java Regex classes live in the java.util.regex package, which contains three classes:

  1. Pattern: A Pattern object is the compiled version of the regular expression. The Pattern class doesn’t have any public constructor; we create a Pattern object by passing the regular expression to its public static compile method.
  2. Matcher: Matcher is the Java regex engine object that matches the input String against the Pattern object. The Matcher class doesn’t have any public constructor either; we get a Matcher object from the Pattern object’s matcher method, which takes the input String as an argument. We then call the matches method, which returns a boolean indicating whether the whole input String matches the regex pattern.
  3. PatternSyntaxException: A PatternSyntaxException is thrown if the regular expression syntax is not correct.

Let’s look at a Java Regex example program.

package com.journaldev.util;

import java.util.regex.*;

public class PatternExample {

	public static void main(String[] args) {
		Pattern pattern = Pattern.compile(".xx.");
		Matcher matcher = pattern.matcher("MxxY");
		System.out.println("Input String matches regex - "+matcher.matches());
		// bad regular expression
		pattern = Pattern.compile("*xx*");

	}

}

When we run this Java regex example program, we get the output below.

Input String matches regex - true
Exception in thread "main" java.util.regex.PatternSyntaxException: Dangling meta character '*' near index 0
*xx*
^
	at java.util.regex.Pattern.error(Pattern.java:1924)
	at java.util.regex.Pattern.sequence(Pattern.java:2090)
	at java.util.regex.Pattern.expr(Pattern.java:1964)
	at java.util.regex.Pattern.compile(Pattern.java:1665)
	at java.util.regex.Pattern.<init>(Pattern.java:1337)
	at java.util.regex.Pattern.compile(Pattern.java:1022)
	at com.journaldev.util.PatternExample.main(PatternExample.java:13)
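
If the regular expression comes from user input or configuration, you may prefer to catch the exception rather than let the program crash. The following is a minimal sketch, not part of the original program: the class name SafeCompileExample and the helper describeError are hypothetical, while getDescription(), getIndex() and getPattern() are accessor methods of PatternSyntaxException.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class SafeCompileExample {

	// Hypothetical helper: returns null if the regex compiles,
	// otherwise a short description of the syntax error.
	static String describeError(String regex) {
		try {
			Pattern.compile(regex);
			return null;
		} catch (PatternSyntaxException e) {
			// The exception exposes the error text, the error index and the offending pattern
			return e.getDescription() + " at index " + e.getIndex() + " in " + e.getPattern();
		}
	}

	public static void main(String[] args) {
		System.out.println(describeError(".xx."));
		System.out.println(describeError("*xx*"));
	}
}
```

For the valid pattern the helper returns null; for "*xx*" it should report the same dangling '*' at index 0 that appears in the stack trace above.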

Since Java regular expressions revolve around String, the String class was extended in Java 1.4 with a matches method that does regex pattern matching. Internally it uses the Pattern and Matcher classes to do the processing, but it reduces the number of lines of code.

The Pattern class also has a static matches method that takes the regex and the input String as arguments and returns a boolean result after matching them.

So the code below works fine for matching an input String against a regular expression in Java.

String str = "bbb";
System.out.println("Using String matches method: "+str.matches(".bb"));
System.out.println("Using Pattern matches method: "+Pattern.matches(".bb", str));

So if your requirement is just to check whether the input String matches the pattern, you can save time and lines of code by using the simple String matches method.

You should use the Pattern and Matcher classes only when you need to manipulate the input String or reuse the pattern. Both String.matches and Pattern.matches compile the regex on every call, so a pattern that is used repeatedly is better compiled once.
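
As a rough illustration of reusing a pattern, the sketch below compiles the regex ".bb" from the snippet above once and applies it to several inputs. The class name ReusePatternExample, the method filterMatching and the sample inputs are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class ReusePatternExample {

	// Compiled once and shared; a Pattern object can safely be reused
	private static final Pattern DOT_BB = Pattern.compile(".bb");

	// Keeps only the inputs that match the whole pattern
	static List<String> filterMatching(List<String> inputs) {
		List<String> result = new ArrayList<>();
		for (String input : inputs) {
			// A new Matcher is created per input, but the regex is not recompiled
			if (DOT_BB.matcher(input).matches()) {
				result.add(input);
			}
		}
		return result;
	}

	public static void main(String[] args) {
		// prints [bbb, abb]
		System.out.println(filterMatching(Arrays.asList("bbb", "abb", "bb", "abbb")));
	}
}
```

Only "bbb" and "abb" are kept, because ".bb" requires exactly three characters ending in "bb".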

Note that the pattern defined by the regex is applied to the String from left to right, and once a source character has been used in a match, it can’t be reused.

For example, the regex “121” matches “31212142121” only twice, as “_121____121”.
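
One way to check this claim is to count how often Matcher.find() succeeds. This is a small sketch; the class name NonOverlappingMatches and the method countMatches are hypothetical helpers, not part of the java.util.regex API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NonOverlappingMatches {

	// Counts successful find() calls; each search resumes where the previous match ended
	static int countMatches(String regex, String input) {
		Matcher matcher = Pattern.compile(regex).matcher(input);
		int count = 0;
		while (matcher.find()) {
			count++;
		}
		return count;
	}

	public static void main(String[] args) {
		// prints 2: the matches start at index 1 and index 8
		System.out.println(countMatches("121", "31212142121"));
	}
}
```

The "121" that starts at index 3 is not counted, because its first character was already consumed by the match that starts at index 1.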

Regular Expression in Java – Common Matching Symbols

Regular Expression  Description                                                     Examples: (regex, input) – Pattern.matches result
.                   Matches any single character                                    ("..", "a%") – true; ("..", ".a") – true; ("..", "a") – false
^aaa                Matches regex aaa at the beginning of the line                  ("^a.c.", "abcd") – true; ("^a", "ac") – false
aaa$                Matches regex aaa at the end of the line                        ("..cd$", "abcd") – true; ("a$", "a") – true; ("a$", "aca") – false
[abc]               Matches any one of a, b or c; [] is known as a character class  ("^[abc]d.", "ad9") – true; ("[ab].d$", "bad") – true; ("[ab]x", "cx") – false
[abc][12]           Matches a, b or c followed by 1 or 2                            ("[ab][12].", "a2#") – true; ("[ab]..[12]", "acd2") – true; ("[ab][12]", "c2") – false
[^abc]              Negation (^ first in []): matches anything except a, b or c     ("[^ab][^12].", "c3#") – true; ("[^ab]..[^12]", "xcd3") – true; ("[^ab][^12]", "c2") – false
[a-e1-8]            Matches one character in the range a to e or 1 to 8             ("[a-e1-3].", "d#") – true; ("[a-e1-3]", "2") – true; ("[a-e1-3]", "f2") – false
xx|yy               Matches regex xx or yy                                          ("x.|y", "xa") – true; ("x.|y", "y") – true; ("x.|y", "yz") – false
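
The table entries can be tried out directly with Pattern.matches, which requires the whole input to match. The sketch below replays a few rows; the class name MatchingSymbolsExample and the wrapper method check are hypothetical names used only for this illustration.

```java
import java.util.regex.Pattern;

class MatchingSymbolsExample {

	// Thin wrapper so each table row reads as check(regex, input)
	static boolean check(String regex, String input) {
		return Pattern.matches(regex, input);
	}

	public static void main(String[] args) {
		System.out.println(check("..", "a%"));            // true: any two characters
		System.out.println(check("..", "a"));             // false: only one character
		System.out.println(check("^a.c.", "abcd"));       // true: starts with a, then any, c, any
		System.out.println(check("..cd$", "abcd"));       // true: ends with cd
		System.out.println(check("[ab].d$", "bad"));      // true: b, any character, d
		System.out.println(check("[^ab][^12].", "c3#"));  // true: not a/b, not 1/2, any
		System.out.println(check("[a-e1-3]", "f2"));      // false: two characters, and f is out of range
		System.out.println(check("x.|y", "yz"));          // false: neither alternative covers "yz"
	}
}
```

Note that rows such as ("^a", "ac") are false only because the entire input must match; a search with Matcher.find() would find the leading "a".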

Java Regex Metacharacters

Java regex has some metacharacters that act as shortcodes for common matching patterns.

Regular Expression    Description
\d                    Any digit, short for [0-9]
\D                    Any non-digit, short for [^0-9]
\s                    Any whitespace character, short for [ \t\n\x0B\f\r]
\S                    Any non-whitespace character, short for [^\s]
\w                    Any word character, short for [a-zA-Z_0-9]
\W                    Any non-word character, short for [^\w]
\b                    A word boundary
\B                    A non-word boundary
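
Here is a short sketch of these metacharacters in use. Remember that inside a Java string literal each backslash has to be doubled. The class name MetacharacterExample, the method countMatches and the sample inputs are assumptions made for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MetacharacterExample {

	// Hypothetical helper: counts non-overlapping occurrences of the regex in the input
	static int countMatches(String regex, String input) {
		Matcher matcher = Pattern.compile(regex).matcher(input);
		int count = 0;
		while (matcher.find()) {
			count++;
		}
		return count;
	}

	public static void main(String[] args) {
		System.out.println(Pattern.matches("\\d\\d\\d", "123"));            // true: three digits
		System.out.println(Pattern.matches("\\w+", "user_01"));             // true: letters, underscore, digits
		System.out.println(Pattern.matches("\\w+", "user-01"));             // false: '-' is not a word character
		System.out.println(Pattern.matches("\\S+\\s\\S+", "hello world"));  // true: two words split by one space
		System.out.println(countMatches("cat", "cat concat cat"));          // 3: also counts the "cat" inside "concat"
		System.out.println(countMatches("\\bcat\\b", "cat concat cat"));    // 2: whole words only
	}
}
```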

There are two ways to treat a metacharacter as an ordinary character in a regular expression:

  1. Precede the metacharacter with a backslash (\).
  2. Keep the metacharacter within \Q (which starts the quote) and \E (which ends it).
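
Both approaches can be sketched as follows. The class name EscapingExample and the sample strings are assumptions made for this example; Pattern.quote is a standard static method, not mentioned above, that returns a literal pattern String for its argument.

```java
import java.util.regex.Pattern;

class EscapingExample {

	public static void main(String[] args) {
		// Unescaped, the dot matches any character
		System.out.println(Pattern.matches("a.b", "axb"));               // true

		// 1. A backslash makes the dot literal ("\\." in Java source is the regex \.)
		System.out.println(Pattern.matches("a\\.b", "axb"));             // false
		System.out.println(Pattern.matches("a\\.b", "a.b"));             // true

		// Unquoted, "1+" means one or more 1s, so the literal text does not match
		System.out.println(Pattern.matches("1+1=2", "1+1=2"));           // false

		// 2. Everything between \Q and \E is taken literally
		System.out.println(Pattern.matches("\\Q1+1=2\\E", "1+1=2"));     // true

		// Pattern.quote builds a quoted literal pattern for us
		System.out.println(Pattern.matches(Pattern.quote("1+1=2"), "1+1=2")); // true
	}
}
```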

Regular Expression in Java – Quantifiers

Java Regex quantifiers specify the number of occurrences of a character to match against.

Regular Expression    Description
X?                    X occurs once or not at all
X*                    X occurs zero or more times
X+                    X occurs one or more times
X{n}                  X occurs exactly n times
X{n,}                 X occurs n or more times
X{n,m}                X occurs at least n times but not more than m times

Java Regex quantifiers can also be used with character classes and capturing groups.

For example, [abc]+ means a, b, or c, one or more times.

(abc)+ means the group “abc” one or more times. We will discuss capturing groups now.
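
The quantifiers above can be tried with a few quick checks. This is only a sketch; the class name QuantifierExample, the wrapper method check and the sample inputs are chosen for illustration.

```java
import java.util.regex.Pattern;

class QuantifierExample {

	// Wrapper around Pattern.matches, which requires the whole input to match
	static boolean check(String regex, String input) {
		return Pattern.matches(regex, input);
	}

	public static void main(String[] args) {
		System.out.println(check("a?b", "b"));            // true: a is optional
		System.out.println(check("a?b", "aab"));          // false: at most one a
		System.out.println(check("a*", ""));              // true: zero occurrences are allowed
		System.out.println(check("a+", ""));              // false: at least one a is required
		System.out.println(check("\\d{3}", "123"));       // true: exactly three digits
		System.out.println(check("\\d{2,}", "12345"));    // true: two or more digits
		System.out.println(check("\\d{2,4}", "12345"));   // false: five digits exceed the maximum of four
		System.out.println(check("[abc]+", "cabbac"));    // true: quantifier applied to a character class
		System.out.println(check("(abc)+", "abcabc"));    // true: quantifier applied to a group
		System.out.println(check("(abc)+", "abcab"));     // false: the last repetition is incomplete
	}
}
```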

Regular Expression in Java – Capturing Groups

Capturing groups in Java regular expressions are used to treat multiple characters as a single unit. You can create a group using (). The portion of the input String that matches the capturing group is saved in memory and can be recalled using a backreference.

You can use the matcher.groupCount method to find out the number of capturing groups in a Java regex pattern. For example, ((a)(bc)) contains 3 capturing groups: ((a)(bc)), (a) and (bc). Groups are numbered by the position of their opening parenthesis, from left to right.
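
To see the groups of ((a)(bc)) at runtime, the sketch below prints the group count and the text captured by each group. The class name GroupCountExample, the helper capturedGroups and the input "abc" are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupCountExample {

	// Returns the text captured by groups 1..groupCount, or an empty list if the input does not match
	static List<String> capturedGroups(String regex, String input) {
		Matcher matcher = Pattern.compile(regex).matcher(input);
		List<String> groups = new ArrayList<>();
		if (matcher.matches()) {
			// Group 0 is the whole match and is not included in groupCount()
			for (int i = 1; i <= matcher.groupCount(); i++) {
				groups.add(matcher.group(i));
			}
		}
		return groups;
	}

	public static void main(String[] args) {
		Matcher matcher = Pattern.compile("((a)(bc))").matcher("abc");
		System.out.println(matcher.groupCount());               // 3
		System.out.println(capturedGroups("((a)(bc))", "abc")); // [abc, a, bc]
	}
}
```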

You can use a backreference in the regular expression by writing a backslash (\) followed by the number of the group to be recalled.

Capturing groups and backreferences can be confusing, so let’s understand them with an example.

System.out.println(Pattern.matches("(\\w\\d)\\1", "a2a2")); //true
System.out.println(Pattern.matches("(\\w\\d)\\1", "a2b2")); //false
System.out.println(Pattern.matches("(AB)(B\\d)\\2\\1", "ABB2B2AB")); //true
System.out.println(Pattern.matches("(AB)(B\\d)\\2\\1", "ABB2B3AB")); //false

In the first example, the first capturing group at runtime is (\w\d), which evaluates to “a2” when matched against the input String “a2a2” and is saved in memory. So \1 refers to “a2”, and the statement prints true. For the same reason, the second statement prints false: \1 still refers to “a2”, but the input continues with “b2”.

Try to work out statements 3 and 4 yourself. 🙂

Now we will look at some important methods of the Pattern and Matcher classes.

  1. We can create a Pattern object with flags. For example, Pattern.CASE_INSENSITIVE enables case-insensitive matching.
  2. The Pattern class also provides a split(CharSequence) method that is similar to the String class split() method.
  3. The Pattern class toString() method returns the regular expression String from which the pattern was compiled.
  4. The Matcher class has start() and end() index methods that show precisely where the match was found in the input String.
  5. The Matcher class also provides the String manipulation methods replaceAll(String replacement) and replaceFirst(String replacement).

Let’s look at these Java regex methods in a simple example program.

package com.journaldev.util;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexExamples {

	public static void main(String[] args) {
		// using pattern with flags
		Pattern pattern = Pattern.compile("ab", Pattern.CASE_INSENSITIVE);
		Matcher matcher = pattern.matcher("ABcabdAb");
		// using Matcher find(), group(), start() and end() methods
		while (matcher.find()) {
			System.out.println("Found the text \"" + matcher.group()
					+ "\" starting at " + matcher.start()
					+ " index and ending at index " + matcher.end());
		}

		// using Pattern split() method
		pattern = Pattern.compile("\\W");
		String[] words = pattern.split("one@two#three:four$five");
		for (String s : words) {
			System.out.println("Split using Pattern.split(): " + s);
		}

		// using Matcher.replaceFirst() and replaceAll() methods
		pattern = Pattern.compile("1*2");
		matcher = pattern.matcher("11234512678");
		System.out.println("Using replaceAll: " + matcher.replaceAll("_"));
		System.out.println("Using replaceFirst: " + matcher.replaceFirst("_"));
	}

}

The output of the above Java regex example program is:

Found the text "AB" starting at 0 index and ending at index 2
Found the text "ab" starting at 3 index and ending at index 5
Found the text "Ab" starting at 6 index and ending at index 8
Split using Pattern.split(): one
Split using Pattern.split(): two
Split using Pattern.split(): three
Split using Pattern.split(): four
Split using Pattern.split(): five
Using replaceAll: _345_678
Using replaceFirst: _34512678
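
The program above does not exercise toString() or compare Pattern.split() with String.split(). The following sketch covers those two points; the class name PatternUtilityMethods and the helper sameSplit are hypothetical, and the input is the one used in the split example above.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class PatternUtilityMethods {

	// Compares Pattern.split() with String.split() for the same regex and input
	static boolean sameSplit(String regex, String input) {
		String[] viaPattern = Pattern.compile(regex).split(input);
		String[] viaString = input.split(regex);
		return Arrays.equals(viaPattern, viaString);
	}

	public static void main(String[] args) {
		Pattern pattern = Pattern.compile("\\W");
		// toString() returns the regex the pattern was compiled from; prints \W
		System.out.println(pattern.toString());
		// prints true: both calls produce one, two, three, four, five
		System.out.println(sameSplit("\\W", "one@two#three:four$five"));
	}
}
```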

That’s all for regular expressions in Java. Java Regex seems hard at first, but if you work with it for some time, it’s easy to learn and use.

You can check out the complete code and more regular expression examples from the GitHub Repository.

Translated from: https://www.journaldev.com/634/regular-expression-in-java-regex-example
