Regular expressions

Find Chinese characters: [\u4e00-\u9fa5]
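
To try this character range outside an editor, here is a minimal Java sketch. The class and method names are made up for illustration, and it assumes the java.util.regex engine, where the backslash has to be doubled inside a string literal so that the regex engine receives the escape itself.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ChineseFinder {
    // The range from the text. The doubled backslash keeps the escape intact
    // in the string literal, so the regex engine interprets it.
    private static final Pattern CHINESE = Pattern.compile("[\\u4e00-\\u9fa5]");

    // Counts how many characters of the input fall inside the range.
    static int countChinese(String text) {
        Matcher m = CHINESE.matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    // Returns true if at least one character falls inside the range.
    static boolean containsChinese(String text) {
        return CHINESE.matcher(text).find();
    }

    public static void main(String[] args) {
        // The sample ends with two Chinese characters (U+6B63, U+5219).
        System.out.println(countChinese("regex \u6b63\u5219")); // prints 2
        System.out.println(containsChinese("plain ASCII"));     // prints false
    }
}
```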



I recently needed regular-expression search and replace while developing in Eclipse, so here is a short summary.


Search: ^(.*)<h:outputText.*value=.*"#\{(.*)\}"(.*)$
Replace: $1<h:outputText value= "#{strings.trim($2,30)}"$3


This finds lines in the code such as:

<h:outputText value="#{appl.departmentName}" />

and replaces them with:

<h:outputText value= "#{strings.trim(appl.departmentName,30)}" />
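
The same search and replace pair can be tried outside Eclipse. The following is a minimal Java sketch, not the Eclipse implementation: it assumes that ^ and $ should match at line boundaries (Pattern.MULTILINE in java.util.regex), and the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class OutputTextTrim {
    // Search expression from the text, with backslashes and quotes
    // escaped for a Java string literal.
    private static final Pattern SEARCH = Pattern.compile(
            "^(.*)<h:outputText.*value=.*\"#\\{(.*)\\}\"(.*)$",
            Pattern.MULTILINE);

    // Replacement from the text: $1, $2 and $3 reuse the captured groups.
    private static final String REPLACE =
            "$1<h:outputText value= \"#{strings.trim($2,30)}\"$3";

    // Rewrites every matching line; other lines are left untouched.
    static String rewrite(String source) {
        return SEARCH.matcher(source).replaceAll(REPLACE);
    }

    public static void main(String[] args) {
        System.out.println(rewrite("<h:outputText value=\"#{appl.departmentName}\" />"));
        // prints <h:outputText value= "#{strings.trim(appl.departmentName,30)}" />
    }
}
```

Whatever was present between the tag name and value= is dropped by this replacement, because that part of the search expression is not inside a capturing group.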

An improved version of the search expression above:

Search: ^(.*)<h:outputText(.*\R??.*)value=\p{Space}*"#\{(.*)\}"(.*)$
Replace: $1<h:outputText$2value="#{strings.trim($3,30)}"$4

This version also handles a line break in the middle of the tag.
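
A minimal Java sketch of the improved pair, under the same assumptions as before: java.util.regex with Pattern.MULTILINE stands in for the Eclipse dialog, \R requires Java 8 or later, and the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class MultiLineOutputTextTrim {
    // Improved search expression: group 2 keeps whatever sits between the tag
    // name and value=, including at most one line break (\R??).
    private static final Pattern SEARCH = Pattern.compile(
            "^(.*)<h:outputText(.*\\R??.*)value=\\p{Space}*\"#\\{(.*)\\}\"(.*)$",
            Pattern.MULTILINE);

    // Replacement from the text: $2 puts the captured middle part back.
    private static final String REPLACE =
            "$1<h:outputText$2value=\"#{strings.trim($3,30)}\"$4";

    static String rewrite(String source) {
        return SEARCH.matcher(source).replaceAll(REPLACE);
    }

    public static void main(String[] args) {
        // A tag whose value attribute sits on the second line.
        String tag = "<h:outputText id=\"name\"\n    value=\"#{appl.departmentName}\" />";
        System.out.println(rewrite(tag));
    }
}
```

Because the text between the tag name and value= is now captured as $2, other attributes and the line break survive the replacement.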


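There are three things to watch out for when writing this kind of regular expression:

1. Some special characters must be escaped. For example, { and ( have a special meaning in regular-expression syntax, so write \{ and \( to match the literal characters.

2. Use ( ) pairs to capture the variable parts of the match that you want to keep, then reuse them in the replacement field as $1, $2, $3 ... $n.


3. Note that a line break in Eclipse is \R, not the \r listed in the reference below.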
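
The three notes above can be sketched in Java as follows. This is only an illustration with java.util.regex, not Eclipse itself: the class and method names are hypothetical, and \R needs Java 8 or later.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EscapeAndCapture {
    // Note 1: escaped braces match the literal characters.
    static boolean matchesEscaped(String s) {
        return Pattern.compile("#\\{name\\}").matcher(s).find();
    }

    // Note 1, alternative: Pattern.quote escapes a whole literal at once.
    static boolean matchesQuoted(String s) {
        return Pattern.compile(Pattern.quote("#{name}")).matcher(s).find();
    }

    // Note 2: capture two parts and reuse them, swapped, as $2 and $1.
    static String swap(String s) {
        return s.replaceAll("(\\w+)=(\\w+)", "$2=$1");
    }

    // Note 3: count matches of a line-break expression in the input.
    static int count(String regex, String s) {
        Matcher m = Pattern.compile(regex).matcher(s);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(matchesEscaped("value=\"#{name}\"")); // prints true
        System.out.println(swap("key=value"));                   // prints value=key
        String text = "a\nb\r\nc";
        System.out.println(count("\\R", text)); // prints 2: LF and CRLF each count once
        System.out.println(count("\\r", text)); // prints 1: only the carriage return
    }
}
```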



The regular-expression syntax in detail:


Construct             Matches

Characters
x                     The character x
\\                    The backslash character
\0n                   The character with octal value 0n (0 <= n <= 7)
\0nn                  The character with octal value 0nn (0 <= n <= 7)
\0mnn                 The character with octal value 0mnn (0 <= m <= 3, 0 <= n <= 7)
\xhh                  The character with hexadecimal value 0xhh
\uhhhh                The character with hexadecimal value 0xhhhh
\t                    The tab character ('\u0009')
\n                    The newline (line feed) character ('\u000A')
\r                    The carriage-return character ('\u000D')
\f                    The form-feed character ('\u000C')
\a                    The alert (bell) character ('\u0007')
\e                    The escape character ('\u001B')
\cx                   The control character corresponding to x

Character classes
[abc]                 a, b, or c (simple class)
[^abc]                Any character except a, b, or c (negation)
[a-zA-Z]              a through z or A through Z, inclusive (range)
[a-d[m-p]]            a through d, or m through p: [a-dm-p] (union)
[a-z&&[def]]          d, e, or f (intersection)
[a-z&&[^bc]]          a through z, except for b and c: [ad-z] (subtraction)
[a-z&&[^m-p]]         a through z, and not m through p: [a-lq-z] (subtraction)

Predefined character classes
.                     Any character (may or may not match line terminators)
\d                    A digit: [0-9]
\D                    A non-digit: [^0-9]
\s                    A whitespace character: [ \t\n\x0B\f\r]
\S                    A non-whitespace character: [^\s]
\w                    A word character: [a-zA-Z_0-9]
\W                    A non-word character: [^\w]

POSIX character classes (US-ASCII only)
\p{Lower}             A lower-case alphabetic character: [a-z]
\p{Upper}             An upper-case alphabetic character: [A-Z]
\p{ASCII}             All ASCII: [\x00-\x7F]
\p{Alpha}             An alphabetic character: [\p{Lower}\p{Upper}]
\p{Digit}             A decimal digit: [0-9]
\p{Alnum}             An alphanumeric character: [\p{Alpha}\p{Digit}]
\p{Punct}             Punctuation: One of !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
\p{Graph}             A visible character: [\p{Alnum}\p{Punct}]
\p{Print}             A printable character: [\p{Graph}]
\p{Blank}             A space or a tab: [ \t]
\p{Cntrl}             A control character: [\x00-\x1F\x7F]
\p{XDigit}            A hexadecimal digit: [0-9a-fA-F]
\p{Space}             A whitespace character: [ \t\n\x0B\f\r]

Classes for Unicode blocks and categories
\p{InGreek}           A character in the Greek block (simple block)
\p{Lu}                An uppercase letter (simple category)
\p{Sc}                A currency symbol
\P{InGreek}           Any character except one in the Greek block (negation)
[\p{L}&&[^\p{Lu}]]    Any letter except an uppercase letter (subtraction)

Boundary matchers
^                     The beginning of a line
$                     The end of a line
\b                    A word boundary
\B                    A non-word boundary
\A                    The beginning of the input
\G                    The end of the previous match
\Z                    The end of the input but for the final terminator, if any
\z                    The end of the input

Greedy quantifiers
X?                    X, once or not at all
X*                    X, zero or more times
X+                    X, one or more times
X{n}                  X, exactly n times
X{n,}                 X, at least n times
X{n,m}                X, at least n but not more than m times

Reluctant quantifiers
X??                   X, once or not at all
X*?                   X, zero or more times
X+?                   X, one or more times
X{n}?                 X, exactly n times
X{n,}?                X, at least n times
X{n,m}?               X, at least n but not more than m times

Possessive quantifiers
X?+                   X, once or not at all
X*+                   X, zero or more times
X++                   X, one or more times
X{n}+                 X, exactly n times
X{n,}+                X, at least n times
X{n,m}+               X, at least n but not more than m times

Logical operators
XY                    X followed by Y
X|Y                   Either X or Y
(X)                   X, as a capturing group

Back references
\n                    Whatever the nth capturing group matched

Quotation
\                     Nothing, but quotes the following character
\Q                    Nothing, but quotes all characters until \E
\E                    Nothing, but ends quoting started by \Q

Special constructs (non-capturing)
(?:X)                 X, as a non-capturing group
(?idmsux-idmsux)      Nothing, but turns match flags on - off
(?idmsux-idmsux:X)    X, as a non-capturing group with the given flags on - off
(?=X)                 X, via zero-width positive lookahead
(?!X)                 X, via zero-width negative lookahead
(?<=X)                X, via zero-width positive lookbehind
(?<!X)                X, via zero-width negative lookbehind
(?>X)                 X, as an independent, non-capturing group
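
The greedy, reluctant and possessive quantifiers share the same descriptions in the table, so the difference is easier to see by running them. The following is a minimal Java sketch with hypothetical class and method names. It applies the three variants of .* to the same input and adds one lookahead and one lookbehind from the special constructs.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {
    // Returns the first match of regex in input, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "\"a\" and \"b\"";

        // Greedy: takes as much as possible, then backs off to the last quote.
        System.out.println(firstMatch("\".*\"", input));  // prints "a" and "b"

        // Reluctant: takes as little as possible, stops at the next quote.
        System.out.println(firstMatch("\".*?\"", input)); // prints "a"

        // Possessive: takes everything and never gives it back, so the
        // closing quote can no longer match.
        System.out.println(firstMatch("\".*+\"", input)); // prints null

        // Lookahead: a word that is followed by a comma; the comma is not consumed.
        System.out.println(firstMatch("\\w+(?=,)", "trim(name,30)")); // prints name

        // Lookbehind: the text after "#{" up to the closing brace.
        System.out.println(firstMatch("(?<=#\\{)[^}]+", "value=\"#{appl.departmentName}\""));
        // prints appl.departmentName
    }
}
```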
