EmEditor Regular Expressions

Note: to match everything, write "." followed by "*" (that is, ".*").
EmEditor FAQ: What are examples of regular expressions?

Q. What are examples of regular expressions?

  • strings surrounded by double-quotation marks
    ".*?"
  • strings surrounded by [ ]
    \[[^\[]*?\]
  • variable names
    [a-zA-Z_][a-zA-Z_0-9]*
  • IP addresses
    ([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})
  • URL
    (\S+)://([^:/]+)(:(\d+))?(/[^#\s]*)(#(\S+))?
  • text from a tab to the end of the line
    \t.*$
  • Hiragana
    [\x{3041}-\x{309e}]
  • Full-width Katakana
    [\x{309b}-\x{309c}\x{30a1}-\x{30fe}]
  • Half-width Kana
    [\x{ff61}-\x{ff9f}]
  • CJK ideographs
    [\x{3400}-\x{9fff}\x{f900}-\x{fa2d}]
  • CJK ideograph marks
    [\x{3000}-\x{3037}]
  • Hangul
    [\x{1100}-\x{11f9}\x{3131}-\x{318e}\x{ac00}-\x{d7a3}]
  • Insert // at start of lines
    Find: ^
    Replace with: //
  • Remove // at start of lines
    Find: ^//
    Replace:
  • Remove trailing whitespace
    Find: \s+?$
    Replace with:
  • Replace (abc) with [abc]
    Find: \((.*?)\)
    Replace: \[\1\]
  • Replace <H3 ...> with <H4 ...>
    Find: <H3(.*?)>
    Replace: <H4\1>
  • Replace 9/13/2003 with 2003.9.13
    Find: ([0-9]{1,2})/([0-9]{1,2})/([0-9]{2,4})
    Replace: \3\.\1\.\2
  • Uppercase characters from a to z (EmEditor Professional only)
    Find: [a-z]
    Replace: \U\0
  • Capitalize all words (EmEditor Professional only)
    Find: ([a-zA-Z])([a-zA-Z]*)
    Replace: \U\1\L\2
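
As a rough cross-check, the sketch below runs a few of these find and replace examples through Python's `re` module instead of EmEditor. That substitution is an assumption made purely for illustration: Python's syntax is close to, but not identical to, the Boost-based engine EmEditor uses, and the sample strings are invented.

```python
import re

# Strings surrounded by double-quotation marks (non-greedy, so each pair is separate).
quoted = re.findall(r'".*?"', 'say "hello" and "bye"')

# Strings surrounded by [ ].
bracketed = re.findall(r'\[[^\[]*?\]', 'a[1] b[22]')

# Remove trailing whitespace; MULTILINE makes $ match at the end of every line.
trimmed = re.sub(r'\s+?$', '', 'x = 1  \ny = 2\t', flags=re.MULTILINE)

# Replace (abc) with [abc].
squared = re.sub(r'\((.*?)\)', r'[\1]', 'f(abc)')

# Replace <H3 ...> with <H4 ...>, keeping the attributes captured in group 1.
heading = re.sub(r'<H3(.*?)>', r'<H4\1>', '<H3 class="x">')

# Replace 9/13/2003 with 2003.9.13 by reordering the three captured groups.
date = re.sub(r'([0-9]{1,2})/([0-9]{1,2})/([0-9]{2,4})', r'\3.\1.\2', '9/13/2003')

print(quoted, bracketed, repr(trimmed), squared, heading, date)
```

In Python the replacement string does not need the dot or the brackets escaped, which is why `\3.\1.\2` and `[\1]` appear without extra backslashes here.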
EmEditor How to: Regular Expression Syntax

Regular Expression Syntax

EmEditor regular expression syntax is based on the Perl regular expression syntax.

Literals

All characters are literals except: ".", "|", "*", "?", "+", "(", ")", "{", "}", "[", "]", "^", "$" and "\". These characters are literals when preceded by a "\". A literal is a character that matches itself. For example, searching for "\?" will match every "?" in the document, and searching for "Hello" will match every "Hello" in the document.
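
A minimal sketch of literal matching, written with Python's `re` module as a stand-in for EmEditor (an assumption for illustration; the sample text is made up). `re.escape` is Python's own helper and is not part of EmEditor.

```python
import re

text = "Why? Because? Hello, Hello."

# An escaped "?" is a literal question mark rather than a quantifier.
question_marks = re.findall(r"\?", text)

# Ordinary characters simply match themselves.
hellos = re.findall(r"Hello", text)

# re.escape() adds the backslashes when the search text is not meant as a pattern.
escaped = re.escape("a.b*c")

print(len(question_marks), len(hellos), escaped)
```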

Metacharacters

The following tables contain the complete list of metacharacters (non-literals) and their behavior in the context of regular expressions.

  • \
    Marks the next character as a special character, a literal, or a back reference. For example, 'n' matches the character "n", while '\n' matches a newline character. The sequence '\\' matches "\" and '\(' matches "(".
  • ^
    Matches the position at the beginning of the input string. For example, "^e" matches any "e" that begins a string.
  • $
    Matches the position at the end of the input string. For example, "e$" matches any "e" that ends a string.
  • *
    Matches the preceding character or sub-expression zero or more times. For example, 'zo*' matches "z" and "zoo". * is equivalent to {0,}.
  • +
    Matches the preceding character or sub-expression one or more times. For example, 'zo+' matches "zo" and "zoo", but not "z". + is equivalent to {1,}.
  • ?
    Matches the preceding character or sub-expression zero or one time. For example, "do(es)?" matches the "do" in "do" or "does". ? is equivalent to {0,1}.
  • {n}
    n is a nonnegative integer. Matches exactly n times. For example, 'o{2}' does not match the "o" in "Bob" but matches the two o's in "food".
  • {n,}
    n is a nonnegative integer. Matches at least n times. For example, 'o{2,}' does not match the "o" in "Bob" but matches all the o's in "foooood". 'o{1,}' is equivalent to 'o+'. 'o{0,}' is equivalent to 'o*'.
  • {n,m}
    m and n are nonnegative integers, where n <= m. Matches at least n and at most m times. For example, 'o{1,3}' matches the first three o's in "fooooood". 'o{0,1}' is equivalent to 'o?'. Note that you cannot put a space between the comma and the numbers.
  • ? (after a quantifier)
    When this character immediately follows any of the other quantifiers (*, +, ?, {n}, {n,}, {n,m}), the matching pattern is non-greedy. A non-greedy pattern matches as little of the searched string as possible, whereas the default greedy pattern matches as much of the searched string as possible. For example, in the string "oooo", 'o+?' matches a single "o", while 'o+' matches all the o's.
  • .
    Matches any single character. For example, ".e" will match text where any character precedes an "e", like "he", "we", or "me". In EmEditor Professional, it matches a new line within the range specified in the Additional Lines to Search for Regular Expressions text box if the A Regular Expression "." Can Match the New Line Character check box is checked.
  • (pattern)
    Parentheses serve two purposes: to group a pattern into a sub-expression, and to capture what generated the match. For example, the expression "(ab)*" would match all of the string "ababab". Each sub-expression match is captured as a back reference (see below), numbered from left to right. To match the parenthesis characters ( ), use '\(' or '\)'.
  • \1 - \9
    Indicates a back reference, that is, a reference to a previous sub-expression that has already been matched. The reference is to what the sub-expression matched, not to the expression itself. A back reference consists of the escape character "\" followed by a digit "1" to "9"; "\1" refers to the first sub-expression, "\2" to the second, and so on. For example, "(a)\1" would capture "a" as the first back reference and match any text "aa". Back references can also be used with the Replace feature under the Search menu: use a regular expression to locate a text pattern, and the matching text can be replaced by a specified back reference. For example, "(h)(e)" will find "he"; putting "\1" in the Replace With box will replace "he" with "h", whereas "\2\1" will replace "he" with "eh".
  • (?:pattern)
    A subexpression that matches pattern but does not capture the match; that is, it is a non-capturing match that is not stored for possible later use with back references. This is useful for combining parts of a pattern with the "or" character (|). For example, 'industr(?:y|ies)' is a more economical expression than 'industry|industries'.
  • (?=pattern)
    A subexpression that performs a positive lookahead search, which matches the search string at any point where a string matching pattern begins. For example, "x(?=abc)" matches an "x" only if it is followed by the expression "abc". This is a non-capturing match; that is, the match is not captured for possible later use with back references. pattern cannot contain a new line.
  • (?!pattern)
    A subexpression that performs a negative lookahead search, which matches the search string at any point where a string not matching pattern begins. For example, "x(?!abc)" matches an "x" only if it is not followed by the expression "abc". This is a non-capturing match; that is, the match is not captured for possible later use with back references. pattern cannot contain a new line.
  • (?<=pattern)
    A subexpression that performs a positive lookbehind search, which matches the search string at any point where a string matching pattern ends. For example, "(?<=abc)x" matches an "x" only if it is preceded by the expression "abc". This is a non-capturing match; that is, the match is not captured for possible later use with back references. pattern cannot contain a new line.
  • (?<!pattern)
    A subexpression that performs a negative lookbehind search, which matches the search string at any point where a string not matching pattern ends. For example, "(?<!abc)x" matches an "x" only if it is not preceded by the expression "abc". This is a non-capturing match; that is, the match is not captured for possible later use with back references. pattern cannot contain a new line.
  • x|y
    Matches either x or y. For example, 'z|food' matches "z" or "food". '(z|f)ood' matches "zood" or "food".
  • [xyz]
    A character set. Matches any one of the enclosed characters. For example, '[abc]' matches the 'a' in "plain".
  • [^xyz]
    A negative character set. Matches any character not enclosed. For example, '[^abc]' matches the 'p' in "plain".
  • [a-z]
    A range of characters. Matches any character in the specified range. For example, '[a-z]' matches any lowercase alphabetic character in the range 'a' through 'z'.
  • [^a-z]
    A negative range of characters. Matches any character not in the specified range. For example, '[^a-z]' matches any character not in the range 'a' through 'z'.
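
Several of the metacharacters above behave the same way in Python's `re` module, so the following sketch uses it as a stand-in for EmEditor to try out the examples from the list. Treat this as an illustration under that assumption rather than a statement about EmEditor itself; the longer sample strings are invented.

```python
import re

# Greedy versus non-greedy quantifiers on the string "oooo".
greedy = re.match(r"o+", "oooo").group()
lazy = re.match(r"o+?", "oooo").group()

# Back reference in the pattern: (a)\1 matches "aa".
doubled = re.search(r"(a)\1", "baab").group()

# Back references in the replacement: swap the two captured groups.
swapped = re.sub(r"(h)(e)", r"\2\1", "he")

# Lookahead and lookbehind check the context without including it in the match.
ahead = [m.start() for m in re.finditer(r"x(?=abc)", "xabc xabd")]
behind = [m.start() for m in re.finditer(r"(?<!abc)x", "abcx dx")]

# Non-capturing group combined with alternation.
words = re.findall(r"industr(?:y|ies)", "industry and industries")

print(greedy, lazy, doubled, swapped, ahead, behind, words)
```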

Character Classes

The following character classes are used within a character set such as "[:classname:]". For instance, "[[:space:]]" is the set of all whitespace characters.

  • alnum
    Any alphanumeric character.
  • alpha
    Any alphabetical character: a-z, A-Z, and other alphabetical characters.
  • blank
    Any blank character, either a space or a tab.
  • cntrl
    Any control character.
  • digit
    Any digit 0-9.
  • graph
    Any graphical character.
  • lower
    Any lowercase character: a-z and other lowercase characters.
  • print
    Any printable character.
  • punct
    Any punctuation character.
  • space
    Any whitespace character.
  • upper
    Any uppercase character: A-Z and other uppercase characters.
  • xdigit
    Any hexadecimal digit character: 0-9, a-f and A-F.
  • word
    Any word character: all alphanumeric characters plus the underscore.
  • unicode
    Any character whose code is greater than 255.

Single character escape sequences

The following escape sequences are aliases for single characters:

  • \a (character code 0x07)
    Bell character.
  • \f (character code 0x0C)
    Form feed.
  • \n (character code 0x0A)
    Newline character.
  • \r (character code 0x0D)
    Carriage return.
  • \t (character code 0x09)
    Tab character.
  • \v (character code 0x0B)
    Vertical tab.
  • \e (character code 0x1B)
    ASCII Escape character.
  • \0dd (character code 0dd)
    An octal character code, where dd is one or more octal digits.
  • \xXX (character code 0xXX)
    A hexadecimal character code, where XX is one or more hexadecimal digits (a Unicode character).
  • \x{XXXX} (character code 0xXXXX)
    A hexadecimal character code, where XXXX is one or more hexadecimal digits (a Unicode character).
  • \cZ (character code Z-'@')
    An ASCII escape sequence control-Z, where Z is any ASCII character greater than or equal to the character code for '@'.

Character class escape sequences

The following escape sequences can be used to represent entire character classes:

  • \w
    Any word character: all alphanumeric characters plus the underscore.
  • \W
    Complement of \w: any non-word character.
  • \s
    Any whitespace character.
  • \S
    Complement of \s.
  • \d
    Any digit 0-9.
  • \D
    Complement of \d.
  • \l
    Any lowercase character a-z.
  • \L
    Complement of \l.
  • \u
    Any uppercase character A-Z.
  • \U
    Complement of \u.
  • \C
    Any single character, equivalent to '.'.
  • \Q
    The begin quote operator. Everything that follows is treated as a literal character until a \E end quote operator is found.
  • \E
    The end quote operator. Terminates a sequence begun with \Q.
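
The sketch below tries the class escapes that Python's `re` module shares with the list above (\w, \d, \S). Python is used only as an illustrative stand-in: it does not support \l, \u, \C, or \Q...\E, so the code substitutes explicit ranges and `re.escape`, which are assumptions of this example rather than EmEditor features.

```python
import re

sample = "id_42 = 7;"

words = re.findall(r"\w+", sample)       # runs of word characters
digits = re.findall(r"\d+", sample)      # runs of digits
non_space = re.findall(r"\S+", sample)   # runs of non-whitespace characters

# Python has no \Q...\E; re.escape() plays a similar role by quoting a whole literal.
literal = re.escape("(a+b)*")
found = re.search(literal, "x = (a+b)*2") is not None

# Python also lacks \l and \u; explicit ranges stand in for them here.
lower = re.findall(r"[a-z]", "aBc")
upper = re.findall(r"[A-Z]", "aBc")

print(words, digits, non_space, found, lower, upper)
```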

Replacement Expressions

The following expressions are available for the Replace With box in the Replace dialog box and in the Replace in Files dialog box.

  • \0
    Indicates a back reference to the entire regular expression.
  • \1 - \9
    Indicates a back reference, that is, a reference to a previous sub-expression that has already been matched. The reference is to what the sub-expression matched, not to the expression itself. A back reference consists of the escape character "\" followed by a digit "1" to "9"; "\1" refers to the first sub-expression, "\2" to the second, and so on.
  • \n
    A new line.
  • \r
    A carriage return, in the case of Replace in Files. See also To Specify New Lines.
  • \t
    A tab.
  • \L
    Forces all subsequent substituted characters to be in lowercase. (EmEditor Professional only)
  • \U
    Forces all subsequent substituted characters to be in uppercase. (EmEditor Professional only)
  • \H
    Forces all subsequent substituted characters to be in half-width characters. (EmEditor Professional only)
  • \F
    Forces all subsequent substituted characters to be in full-width characters. (EmEditor Professional only)
  • \E
    Turns off previous \L, \U, \F, or \H. (EmEditor Professional only)
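
To make the effect of \0, \U, and \L concrete, here is a hedged approximation in Python's `re` module. Python has no case-conversion escapes in replacement strings, so a replacement function stands in for them; the helper name `capitalize` and the sample strings are hypothetical, and this is not how EmEditor implements the feature.

```python
import re

# The whole match (\0 in EmEditor) is written \g<0> in a Python replacement string.
wrapped = re.sub(r"[0-9]+", r"<\g<0>>", "a1 b22")

# Stand-in for Find: [a-z]  Replace: \U\0
upper = re.sub(r"[a-z]", lambda m: m.group(0).upper(), "abc-123")

# Stand-in for Find: ([a-zA-Z])([a-zA-Z]*)  Replace: \U\1\L\2
def capitalize(match):
    """Uppercase the first captured letter and lowercase the rest of the word."""
    return match.group(1).upper() + match.group(2).lower()

titled = re.sub(r"([a-zA-Z])([a-zA-Z]*)", capitalize, "hELLO wORLD")

print(wrapped, upper, titled)
```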

Notes

  • In Find in Files and in Replace in Files, the carriage return (\r) and the line feed (\n) must be specified carefully. See To Specify New Lines for details.
  • For some escape sequences to work in EmEditor, such as "\l", "\u" and their complements, the Match Case option has to be selected.

Copyright Notice

The regular expression routines used in EmEditor use the Boost library Regex++.

Copyright (c) 1998-2001 Dr John Maddock


