Java Regular Expressions

In my current project, we need to parse log files (written by log4j) and extract specific information for later actions:

  1. Display the information in a GUI
  2. Analyze errors and send mail to the person responsible
  3. Back up key logs into a DB that can be shared with external systems
  4. ...
Oh, what is the topic of this blog post? ERP, CRM, or requirements for a DMS? OK, it is regular expressions in Java. Forgive me: sometimes engineers start talking about topic A and end up making a deal on topic B.
So regular expressions are one way to parse these log files.
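As a rough sketch of how a regular expression can pull fields out of one log line, the example below assumes a log4j layout of the form "timestamp LEVEL [thread] logger - message". The layout, the class name LogLineParser, and the sample line are illustrative assumptions, not the project's actual configuration; the pattern would need to be adapted to the real layout in use.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LogLineParser {
    // Assumed layout: "yyyy-MM-dd HH:mm:ss,SSS LEVEL [thread] logger - message"
    private static final Pattern LINE = Pattern.compile(
        "^(?<time>\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2},\\d{3})\\s+"
        + "(?<level>\\w+)\\s+"
        + "\\[(?<thread>[^\\]]+)\\]\\s+"
        + "(?<logger>\\S+)\\s+-\\s+"
        + "(?<message>.*)$");

    /** Returns {time, level, thread, logger, message}, or empty if the line does not fit the assumed layout. */
    static Optional<String[]> parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(new String[] {
            m.group("time"), m.group("level"), m.group("thread"),
            m.group("logger"), m.group("message")
        });
    }

    public static void main(String[] args) {
        // Hypothetical sample line, made up for this sketch.
        String sample = "2024-01-15 10:23:45,123 ERROR [main] com.example.OrderService - Payment failed";
        parse(sample).ifPresent(f -> System.out.println(f[1] + " from " + f[3] + ": " + f[4]));
        // prints: ERROR from com.example.OrderService: Payment failed
    }
}
```

Lines that do not fit the layout (for example the continuation lines of a stack trace) come back as empty, so the caller can decide whether to skip them or attach them to the previous entry.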

I will cover general regular expression semantics first, and then regular expression usage in Java.

Regular Expression Semantics

Common Match Symbols

RE          Description
.           Matches any single character
^regex      regex must match at the beginning of the line
regex$      regex must match at the end of the line
[abc]       Set definition: matches the letter a, b, or c
[abc][vz]   Two sets in sequence: matches a, b, or c followed by either v or z
[^abc]      A "^" as the first character inside [] negates the set: matches any character except a, b, or c
[a-d1-7]    Ranges: matches one letter between a and d or one digit from 1 to 7; does not match the two-character string d1
X|Z         Finds X or Z
XZ          Finds X directly followed by Z
$           Checks whether a line end follows

Note: in Java, the nested form [abc[vz]] is a union of the two sets and matches a single a, b, c, v, or z.
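The symbols above can be tried out with a small sketch like the following. The class and helper names (MatchSymbolsDemo, whole, anywhere) are made up for illustration. Note that in Java a nested set such as [abc[vz]] is a union of both sets, whereas [abc][vz] is two sets in sequence, and that a single set like [a-d1-7] always consumes exactly one character.

```java
import java.util.regex.Pattern;

final class MatchSymbolsDemo {
    /** True if the regex matches the entire input. */
    static boolean whole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    /** True if the regex matches somewhere inside the input. */
    static boolean anywhere(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(whole(".", "x"));              // true
        System.out.println(whole("[abc]", "b"));          // true
        System.out.println(whole("[^abc]", "b"));         // false
        System.out.println(whole("[a-d1-7]", "d1"));      // false: one set matches one character
        System.out.println(whole("[a-d1-7]{2}", "d1"));   // true
        System.out.println(whole("[abc][vz]", "av"));     // true: two sets in sequence
        System.out.println(whole("[abc[vz]]", "av"));     // false: nested set is a union
        System.out.println(whole("[abc[vz]]", "v"));      // true
        System.out.println(whole("X|Z", "Z"));            // true
        System.out.println(anywhere("^ERROR", "ERROR disk full")); // true
        System.out.println(anywhere("^ERROR", "an ERROR"));        // false
        System.out.println(anywhere("full$", "ERROR disk full"));  // true
    }
}
```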

 

Special Metacharacters

RE     Description
\d     Any digit, short for [0-9]
\D     A non-digit, short for [^0-9]
\s     A whitespace character, short for [ \t\n\x0b\r\f]
\S     A non-whitespace character, short for [^\s]
\w     A word character, short for [a-zA-Z_0-9]
\W     A non-word character, short for [^\w]
\S+    One or more non-whitespace characters
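One practical point when using these metacharacters from Java: inside a Java string literal the backslash itself must be escaped, so \d is written "\\d". The sketch below illustrates this with a few of the classes listed above; the class and method names (MetacharDemo, tokens, digitsOnly) and the sample strings are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MetacharDemo {
    /** Collects every maximal run of non-whitespace characters (\S+). */
    static List<String> tokens(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile("\\S+").matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    /** Removes every non-digit character (\D). */
    static String digitsOnly(String input) {
        return input.replaceAll("\\D", "");
    }

    public static void main(String[] args) {
        System.out.println(tokens("ERROR  disk\tfull"));  // [ERROR, disk, full]
        System.out.println(digitsOnly("tel: 555-0100"));  // 5550100
        System.out.println("user_1".matches("\\w+"));     // true
        System.out.println("user-1".matches("\\w+"));     // false: '-' is not a word character
    }
}
```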

 

Quantifiers

RE      Description                                                                Examples
*       Occurs zero or more times; short for {0,}                                  X* finds zero or more letters X; .* matches any character sequence
+       Occurs one or more times; short for {1,}                                   X+ finds one or more letters X
?       Occurs zero or one time; short for {0,1}                                   X? finds zero or exactly one letter X
{X}     Occurs exactly X times; {} sets the repeat count of the preceding literal  \d{3} matches three digits; .{10} matches any character sequence of length 10
{X,Y}   Occurs between X and Y times                                               \d{1,4} means \d must occur at least once and at most four times
*?      A ? after a quantifier makes it a "reluctant quantifier": it tries to find the smallest match
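The difference between a greedy quantifier and its reluctant form is easiest to see on a concrete input. The following is a minimal sketch; the class name QuantifierDemo, the helper first, and the sample strings are assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class QuantifierDemo {
    /** Returns the first match of regex inside input, or null if there is none. */
    static String first(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String line = "[main] [worker-1] started";
        System.out.println(first("\\[.*\\]", line));        // [main] [worker-1]  (greedy)
        System.out.println(first("\\[.*?\\]", line));       // [main]             (reluctant)
        System.out.println(first("\\d{3}", "id=12345"));    // 123
        System.out.println(first("\\d{1,4}", "id=12345"));  // 1234
        System.out.println("".matches("X*"));               // true: zero occurrences allowed
        System.out.println("".matches("X+"));               // false: at least one required
        System.out.println("XX".matches("X?"));             // false: at most one allowed
    }
}
```

For log parsing, the reluctant form is usually what is wanted when a line contains several bracketed fields and only the first one should be captured.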

 

 

Java Regular Expression Usage

To be updated.
