在Visual Basic中为应用程序和Visual Basic 6使用正则表达式

Using Regular Expressions in VBA and VB6 Projects

By Patrick Matthews (matthewspatrick, https://www.experts-exchange.com/M_2428436.html)

由Patrick Matthews(matthewspatrick, https ://www.experts-exchange.com/M_2428436.html)

Introduction

Regular Expressions (RegExp) is a class that offers incredibly powerful text parsing capabilities, in particular allowing users to find (and even replace) substrings within larger text entries that fit a particular pattern.  This article provides basic information on the VBScript implementation of RegExp, and how you can use it to gain robust text parsing capabilities in Visual Basic 6 (VB6) and Visual Basic for Applications (VBA) projects.  With a little knowledge of the rules and some imagination, you can build incredibly flexible and/or complex patterns.

正则表达式(RegExp)是一个类,它提供了难以置信的强大的文本解析功能,尤其是允许用户在适合特定模式的较大文本条目中查找(甚至替换)子字符串。 本文提供有关RegExp的VBScript实现的基本信息,以及如何在Visual Basic 6(VB6)和Visual Basic for Applications(VBA)项目中使用它来获得可靠的文本解析功能。 只需了解一些规则并有一些想象力,您就可以构建出令人难以置信的灵活和/或复杂的模式。

While this article may be most useful to users with intermediate or advanced programming skills, users of all skill levels (beginners included) can benefit from using RegExp by implementing the "wrapper" functions included in this article.  (This article includes a sample Excel workbook that provides examples for each function.)  By adding these functions to your VB6/VBA projects, you can quickly and easily incorporate RegExp functionality into your applications.  If you are a Microsoft Office user, you can even use these functions in worksheet formulas for Microsoft Excel or in queries in Microsoft Access.  

虽然本文可能对具有中级或高级编程技能的用户最有用,但是,通过实施本文中包括的“包装器”功能,所有技能水平(包括初学者)的用户都可以从使用RegExp中受益。 (本文包括一个示例Excel工作簿,其中提供了每个功能的示例。)通过将这些功能添加到VB6 / VBA项目中,可以快速轻松地将RegExp功能集成到应用程序中。 如果您是Microsoft Office用户,则甚至可以在Microsoft Excel的工作表公式中或Microsoft Access的查询中使用这些功能。

These new functions are:

这些新功能是:

RegExpFind (Returns information for a single match from the input text, or an array of all the matches.  Info returned can be the match Value, starting character position, or Length for a single match, or for all matches)

RegExpFind(从输入文本或所有匹配项的数组中返回单个匹配项的信息。返回的信息可以是匹配值,单个字符或所有匹配项的起始字符位置或长度)

RegExpFindExtended (Returns a 2-D array of the Value, starting character position, or Length)

RegExpFindExtended(返回值,起始字符位置或长度的二维数组)

RegExpFindSubmatch (Returns a 2-D array of submatches for the various matches in the input text)

RegExpFindSubmatch(为输入文本中的各种匹配返回子匹配的二维数组)

RegExpReplace (Returns a string in which either all matches are replaced with the replacement text, or just the first match)

RegExpReplace(返回一个字符串,其中所有匹配项都替换为替换文本,或者仅是第一个匹配项)

RegExpReplaceExpression (Excel only, allows you to build an expression using arithmetic operators and/or Excel functions to manipulate the replacement text)

RegExpReplaceExpression(仅Excel,允许您使用算术运算符和/或Excel函数构建表达式来操纵替换文本)

RegExpReplaceRange (Returns a string in which the Xth to Yth matches are replaced)

RegExpReplaceRange(返回一个字符串,其中第X个匹配到第Y个匹配项)

Quick Examples Using RegExp使用RegExp的快速示例

The following two examples briefly illustrate the usefulness of RegExp, and also you might use some of the included wrapper functions to address each problem.

下面的两个示例简要说明了RegExp的有用性,并且您还可以使用其中包含的一些包装函数来解决每个问题。

Consider the following text:

考虑以下文本:

"He hit the ball.  Then he ran.  The crowd was cheering!  How did he feel?  I felt so energized!"

“他打了球。然后他跑了。人群在欢呼!他感觉如何?我感到如此充满活力!”

Suppose you needed to extract the first word from each sentence.  You could use a pattern such as:

假设您需要从每个句子中提取第一个单词。 您可以使用以下模式:

"(^|([\.!?] +))([a-z]+)"

That pattern matches the following:

该模式符合以下条件:

The beginning of the string OR (period, exclamation point, or question mark followed by 1+ spaces) {This whole item taken together is the first submatch; the period, exclamation point, or question mark followed by 1+ spaces is the 2nd submatch}

字符串OR的开头(句号,感叹号或问号,后跟1+空格){这整个项目在一起是第一个子匹配项; 句号,感叹号或问号后跟1+空格是第二个子匹配}

Then, 1+ characters from a-z {3rd submatch}

然后,来自az {3rd submatch}的1个以上字符

Using one of the wrapper functions, this would return an array of the first words in each sentence:

使用包装函数之一,这将返回每个句子中第一个单词的数组:

RegExpFindSubmatch(MyText, "(^|([\.!?] +))([a-z]+)", , 3, False)
Returns "He", "Then", "The", "How", "I"

If you wanted to extract the last word in each sentence, you could use this pattern instead:

如果要提取每个句子中的最后一个单词,可以改用以下模式:

"[a-z]+(?=[\.!\?])"

That pattern matches the following:

该模式符合以下条件:

1+ characters from a-z

来自AZ的1个以上字符

Then,  a "look ahead" to see if it is followed by a period, exclamation point, or question mark.  In a look ahead, RegExp checks to see whether the look ahead condition is met, but does not include the look ahead in the match that gets returned

然后,“向前看”以查看是否跟有句点,感叹号或问号。 在进行前瞻时,RegExp会检查是否满足前瞻条件,但不会在返回的比赛中包含前瞻

Using a wrapper function, this returns an array of the last word in each sentence:

使用包装函数,这将返回每个句子中最后一个单词的数组:

RegExpFind(MyText, "[a-z]+(?=[\.!\?])", , False)
Returns "ball", "ran", "cheering", "feel", "energized"

For another example (based on EE question https://www.experts-exchange.com/Q_24367818.html), suppose you had the following log entries:

对于另一个示例(基于EE问题https://www.experts-exchange.com/Q_24367818.html ),假设您具有以下日志条目:

http://www.abc.com

http://www.abc.com

www.xyz.com

www.xyz.com

123.mno.net

123.mno.net

http://www.abc.com/category

http://www.abc.com/category

www.abc.net/category/sub-category/index.html

www.abc.net/category/sub-category/index.html

xyz.info/index.php

xyz.info/index.php

www.abc.com:8082/

www.abc.com:8082/

Suppose that, for each entry, you had to extract the site address (e.g., www.abc.com).  You could use the following pattern, replacing all text matching that pattern with a zero-length string:

假设对于每个条目,您都必须提取站点地址(例如,www.abc.com)。 您可以使用以下模式,将所有与该模式匹配的文本替换为零长度的字符串:

"http://|[:/].*"

That pattern matches the following:

该模式符合以下条件:

(colon or forward slash) followed by zero or more characters of any type

(冒号或正斜杠)后跟零个或多个任何类型的字符

If you were looping through an array of log entries, you could use this to extract the address:

如果要遍历一系列日志条目,则可以使用它来提取地址:

RegExpReplace(LogArr(i), "http://|[:/].*", "", True, False)

If instead you needed to extract just the top-level domain (e.g., com, net, info), you could use this pattern instead:

相反,如果您只需要提取顶级域(例如com,net,info),则可以使用以下模式:

"(http://)?([\w-]+\.)([\w-]+\.)?([\w-]+)([:/].*)?"

That pattern matches the following:

该模式符合以下条件:

http:// appearing zero or one time {1st submatch} http://出现零次或一次{{第一子匹配}

Then, a word character or dash appearing 1+ times followed by a period {2nd submatch}

然后,一个单词字符或破折号出现1次以上,后跟一个句号{2nd submatch}

Then, a word character or dash appearing 1+ times followed by a period, the whole block appearing zero or one time {3rd submatch}

然后,一个单词字符或破折号出现1次以上,后跟一个句点,整个块出现0次或一次{3rd submatch}

Then, a word character or dash appearing 1+ times {4th submatch}

然后,单词字符或破折号出现1次以上{第4个子匹配项}

Then, a colon or forward slash followed by zero or more characters of any type {5th submatch}

然后,冒号或正斜杠后跟零个或多个任何类型的字符{5th submatch}

In this case, the 4th submatch will be the top-level domain.  To get it, you could use one of two expressions:

在这种情况下,第4个子匹配将是顶级域名。 要获得它,可以使用以下两个表达式之一:

RegExpReplace(LogArr(i), "(http://)?([\w-]+\.)([\w-]+\.)?([\w-]+)([:/].*)?", "$4", True, False)

or

=RegExpFindSubmatch(LogArr(i), "(http://)?([\w-]+\.)([\w-]+\.)?([\w-]+)([:/].*)?", 1, 4, False, False)

If instead this log were in an Access table, you could use these wrapper functions in a query such as this:

相反,如果此日志位于Access表中,则可以在以下查询中使用这些包装器函数:

SELECT LogEntry, 
    RegExpReplace(LogArr(i), "http://|[:/].*", "", True, False) AS Address,
    RegExpReplace(LogArr(i), "(http://)?([\w-]+\.)([\w-]+\.)?([\w-]+)([:/].*)?", 
        "$4", True, False) AS TopLevel
FROM LogTable
The VBScript RegExp ClassVBScript RegExp类

Regular Expressions may be most familiar to users of the Perl programming language, but many other languages incorporate similar functionality.  This article concerns how to use the RegExp class implemented in VBScript 5.5 to enhance your text parsing capabilities in your Visual Basic for Applications (and thus Microsoft Excel, Access, or any other application that supports VBA) and Visual Basic 6 projects.  Because this article focuses on the specific VBScript implementation, there may be some differences in syntax and/or features when compared to other Regular Expressions implementations.  (For example, the VBScript implementation does not support the "look behind" capabilities in Perl and the .Net framework.)

Perl编程语言的用户可能对正则表达式最为熟悉,但是许多其他语言都包含类似的功能。 本文涉及如何使用VBScript 5.5中实现的RegExp类来增强Visual Basic for Applications(以及Microsoft Excel,Access或支持VBA的任何其他应用程序)和Visual Basic 6项目中的文本解析功能。 因为本文着重于特定的VBScript实现,所以与其他正则表达式实现相比,语法和/或功能可能有所不同。 (例如,VBScript实现不支持Perl和.Net框架中的“向后看”功能。)

RegExp Object Model

To use RegExp in your VB6 or VBA projects, you must instantiate a RegExp object.  If you are using early binding, then you would add a reference to the "Microsoft VBScript Regular Expressions 5.5" library.  (Do not use the version 1.0 library, as it lacks many of the best features of RegExp and is provided solely for backward compatibility.)  You would then declare the variable in your code this way:

要在VB6或VBA项目中使用RegExp,必须实例化RegExp对象。 如果使用的是早期绑定,则应添加对“ Microsoft VBScript正则表达式5.5”库的引用。 (不要使用1.0版库,因为它缺少RegExp的许多最佳功能,并且仅为了向后兼容而提供。)然后,您可以通过以下方式在代码中声明变量:

Dim RegX As VBScript_RegExp_55.RegExp
Set RegX = New VBScript_RegExp_55.RegExp

Using late binding, you would do it this way:

使用后期绑定,您可以这样做:

Dim RegX As Object
Set RegX = CreateObject("VBScript.RegExp")

Once you have the object instantiated, you will have to set certain properties of the RegExp class:

实例化对象后,必须设置RegExp类的某些属性:

Global: This determines whether a RegExp method will try to find only the first match (False), or all matches (True).  The default value is False.

全局:确定RegExp方法将尝试仅查找第一个匹配项(False)还是查找所有匹配项(True)。 默认值为False。

IgnoreCase: This determines whether the pattern match is case-sensitive; the default is False (thus making pattern matching case-sensitive by default).  If IgnoreCase is True, then a pattern of [a-z] or [A-Z] is equivalent to a pattern of [a-zA-Z].

IgnoreCase:确定模式匹配是否区分大小写; 默认值为False(默认情况下使模式匹配区分大小写)。 如果IgnoreCase为True,则[az]或[AZ]的模式等同于[a-zA-Z]的模式。

Multiline: Indicates how RegExp should handle input text with line breaks.  In particular, it determines how the ^ (start of input) and $ (end of input) behave: if False, then these match only the very beginning and very end of the entire input string.  If True, then ^ and $ match the beginning and end of each line in the input string.  The default value is False.

多行:指示RegExp应如何处理带有换行符的输入文本。 特别是,它确定^(输入的开始)和$(输入的结束)的行为:如果为False,则它们仅匹配整个输入字符串的开头和结尾。 如果为True,则^和$匹配输入字符串中每行的开头和结尾。 默认值为False。

Pattern: This is the pattern that RegExp tries to match.  Patterns can be relatively simple, or highly complex.  For example, the pattern

模式:这是RegExp尝试匹配的模式。 模式可以相对简单,也可以高度复杂。 例如图案



circus

马戏团

will look only for the six consecutive characters "circus" (and, if IgnoreCase = False, then those characters would have to be lower case to make a match).  On the other hand:

将仅查找六个连续的字符“马戏团”(并且,如果IgnoreCase = False,则这些字符必须小写才能进行匹配)。 另一方面:



circus(es)?

马戏团?

will match "circus" or "circuses".  Further, the following pattern matches circus, circuses, pony, or ponies:

将匹配“马戏团”或“马戏团”。 此外,以下模式匹配马戏团,马戏团,小马或小马:



circus(es)?|pon(y|ies)

马戏团?| pon(y | ies)

Please see the Quick Primer on Pattern Syntax for a few simple tips on Pattern syntax to get you started.  The functionality is incredibly rich, however, and for fuller treatments of the subject you should look in the Further Reading section of this article for web sites and books that provide excellent information on the RegExp Pattern syntax.

请参阅“ 以获取有关模式语法的一些简单提示,以帮助您入门。 但是,该功能非常丰富,要对主题进行更充分的处理,您应该在本文的“ 部分中查找有关提供有关RegExp Pattern语法的出色信息的网站和书籍。

After you have set the RegExp properties to appropriate values, you then have to decide which method to use.  The RegExp class exposes three methods:

将RegExp属性设置为适当的值后,然后必须决定使用哪种方法。 RegExp类公开了三种方法:

Execute: This method returns a Matches collection (of Match objects), containing the various matches to the specified pattern found in the input string (which is passed to the Execute method as an argument).  The Global property determines whether the collection contains all matches (True) or just the first match (False).

Execute:此方法返回Match对象的Matches集合,其中包含与在输入字符串中找到的指定模式的各种匹配(作为参数传递给Execute方法)。 Global属性确定集合是包含所有匹配项(True)还是仅包含第一个匹配项(False)。

Replace: This method takes two arguments.  The first argument is the string to be searched for pattern matches, and the second is the replacement text to be used if a match is found.  The Global property determines whether RegExp replaces all matches (True) or just the first match (False).  The Replace method returns a string that includes all applicable replacements.  (Note that if there are no matches found, then the output and the original input are identical.)  To incorporate the entire match into the replacement text, use $& to stand in for the match, and to use a submatch in the replacement text, use $1, $2, $3, etc. to use the nth submatch (or $+ for the last submatch).  You can also use $` for the text prior to the match, $' for the text after the match, and $_ for the entire input string

替换:此方法有两个参数。 第一个参数是要搜索模式匹配的字符串,第二个参数是找到匹配项时要使用的替换文本。 Global属性确定RegExp是替换所有匹配项(True)还是仅替换第一个匹配项(False)。 Replace方法返回一个字符串,其中包括所有适用的替换。 (请注意,如果没有找到匹配项,则输出和原始输入是相同的。)要将整个匹配项合并到替换文本中,请使用$&代表匹配项,并在替换文本中使用子匹配项,请使用$ 1,$ 2,$ 3等来使用第n个子匹配(或最后一个子匹配使用$ +)。 您还可以将$`用于匹配之前的文本,将$'用于匹配之后的文本,并将$ _用作整个输入字符串

Test: This method takes a single argument, the input string, and returns a Boolean True/False to indicate whether there is at least one match in the input string.

测试:此方法采用一个参数,即输入字符串,并返回布尔值True / False,以指示输入字符串中是否至少存在一个匹配项。

If you need to work with the various matches, then you work through the Matches collection and the various Match objects it contains.  The Matches collection has the following properties:

如果需要使用各种匹配项,则可以遍历Matches集合及其包含的各种Match对象。 Matches集合具有以下属性:

Count: This is the number of Match objects contained in the collection

计数:这是集合中包含的Match对象的数量

Item: This returns the indicated Match object based on an index (the argument for this property).  Please note that the first item takes the index number zero, and thus the index for the last item is one less than the Count.  The Item property is the default property for the Matches collection, so MyMatches(0) is equivalent to MyMatches.Item(0).

项目:这将基于索引(该属性的参数)返回指示的Match对象。 请注意,第一项的索引号为零,因此最后一项的索引比Count小一。 Item属性是Matches集合的默认属性,因此MyMatches(0)等同于MyMatches.Item(0)。

For example, this code enumerates the Matches:

例如,此代码枚举Matches:

Dim RegX As Object, Mats As Object, Counter As Long

Set RegX = CreateObject("VBScript.RegExp")
With RegX
    .Global = True
    .Pattern = "\d+"
    Set Mats = .Execute("1 turtle, 15 rabbits, and 1078 swans")
End With

For Counter = 0 To Mats.Count - 1
    MsgBox Mats(Counter)
Next

Set Mats = Nothing
Set RegX = Nothing

That code will present three MsgBoxes, showing 1, 15, and 1078 respectively.

该代码将显示三个MsgBoxes,分别显示1、15和1078。

To work further with the individual Match objects, you can use the following properties:

要进一步使用单个Match对象,可以使用以下属性:

FirstIndex: The starting position of the Match value within the original input string.  Please be aware that this property treats the first character in the input string as position zero, whereas VB6 and VBA (as well as Access and Excel) typically treat the first character as position 1.  The sample functions I provide here correct for this.

FirstIndex:匹配值在原始输入字符串中的起始位置。 请注意,此属性将输入字符串中的第一个字符视为位置零,而VB6和VBA(以及Access和Excel)通常将第一个字符视为位置1。我在此处提供的示例函数对此是正确的。

Length: The number of characters in the Match value

长度:匹配值中的字符数

Submatches: This returns a Submatches collection composed of the various Submatch objects found within the Match.

子匹配项:返回由匹配项中找到的各种子匹配项对象组成的子匹配项集合。

Value: The string of the Match itself.  Value is the Match object's default property, so in the code example above Mats(Counter) is equivalent to Mats(Counter).Value.

值:Match本身的字符串。 Value是Match对象的默认属性,因此在上面的代码示例中,Mats(Counter)等效于Mats(Counter).Value。

Drilling down even further, a Match may have Submatches.  A Submatch is a portion of a Match that corresponds to part of the Pattern bracketed by parentheses.  For example, you could use a pattern such as this to match IP addresses in an input string:

进一步深入研究,“匹配”可能包含“子匹配”。 子匹配项是匹配项的一部分,与括号括起来的部分模式相对应。 例如,您可以使用这种模式来匹配输入字符串中的IP地址:

(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})

Suppose that the input string were:

假设输入字符串为:

I thought her IP address was 100.200.300.400, but it was really 99.99.99.99
我以为她的IP地址是100.200.300.400,但实际上是99.99.99.99

There would be two matches, 100.200.300.400 and 99.99.99.99.  In addition, each match would have four submatches; in the first case these submatches are 100, 200, 300, and 400.  These submatches correspond to the portions of the pattern enclosed in parentheses.

将会有两场比赛,分别为100.200.300.400和99.99.99.99。 此外,每个比赛将有四个子比赛; 在第一种情况下,这些子匹配为100、200、300和400。这些子匹配对应于括号中的图案部分。

The Submatches collection has the following properties:

Submatches集合具有以下属性:

Count: This is the number of Submatch objects contained in the collection.  There will always be as many Submatch objects as there are groups in the pattern enclosed in parentheses.

计数:这是集合中包含的Submatch对象的数量。 子匹配对象将始终与括号中包含的模式中的组一样多。

Item: This returns the indicated Submatch object based on an index (the argument for this property).  Please note that the first item takes the index number zero, and thus the index for the last item is one less than the Count.  The Item property is the default property for the Submatches collection, so MyMatch.Submatches(0) is equivalent to MyMatch.Submatches.Item(0).

项目:这将基于索引(该属性的参数)返回指示的Submatch对象。 请注意,第一项的索引号为零,因此最后一项的索引比Count小一。 Item属性是Submatches集合的默认属性,因此MyMatch.Submatches(0)等同于MyMatch.Submatches.Item(0)。

The Submatch object has a single property, Value.  This Value is simply the text comprising the Submatch.  If there is no match found for the indicated group in the pattern, then the Submatch value is a zero-length string.

Submatch对象具有单个属性Value。 该值只是构成子匹配项的文本。 如果在模式中找不到与指定组的匹配项,则Submatch值为零长度的字符串。

Quick Primer on Pattern Syntax

This section includes some basic pointers on Pattern syntax to help you get started.  It is, however, by no means complete, and for more detailed treatments of the subject you should review the Web sites and books in the Further Reading section of this article.

本节包含有关Pattern语法的一些基本指针,以帮助您入门。 但是,它绝不是完整的,并且对于该主题的更详细的处理,您应该在本文的“ 部分中查看网站和书籍。

Keep in mind that in all cases, whether a match is case sensitive or not depends on the setting for RegExp.IgnoreCase.

请记住,在所有情况下,匹配项是否区分大小写都取决于RegExp.IgnoreCase的设置。

Symbol		Notes
--------------------------------------------------------------------------------------------
abc, ABC	Matches "abc" or "ABC"
[abc]		Matches any character within the brackets
[^abc]		Matches any character not within the brackets
[a-n], [0-5]	Matches any character within the range of letters or digits in brackets
[^a-n], [^0-5]	Matches any character not within the range of letters or digits in brackets
.		Matches any single character except a new line character
A|B		Matches "A" or "B"
()		Creates a subgroup within the pattern
{x}		Matches the preceding element exactly x times
{x,y}		Matches the preceding element between x and y times (greedy)
{x,y}?		Matches the preceding element between x and y times (non-greedy)
{x,}		Matches the preceding element at least x times (greedy)
{x,}?		Matches the preceding element at least x times (non-greedy)
{,y}		Matches the preceding element between zero and y times (greedy)
{,y}?		Matches the preceding element between zero and y times (non-greedy)
*		Matches preceding element zero or more times (greedy)
*?		Matches preceding element zero or more times (non-greedy)
?		Matches preceding element zero or one times
+		Matches preceding element one or more times (greedy)
+?		Matches preceding element one or more times (non-greedy)
^		Matches start of input (or start of line if RegExp.Multiline = True)
$		Matches end of input (or end of line if RegExp.Multiline = True)
\d		Matches a digit (same as [0-9])
\D		Matches a non-digit (same as [^0-9])
\b		Matches a word boundary
\B		Matches a non-word boundary
\f		Form feed
\n		New line
\r		Carriage return
\t		Tab
\v		Vertical tab
\s		Matches a "white space" character, same as [ \f\n\r\t\v]
\S		Matches a non-"white space" character, same as [^ \f\n\r\t\v]
\w		Matches a "word" character plus underscore, same as [A-Za-z0-9_]
\W		Matches a non-"word" character plus underscore, same as [^A-Za-z0-9_]
\		"Escapes" a special character.  
		For example:
		   To match the backslash, use \\
		   To match the $, use \$
		   To match the period, use \.

x(?=y)		Positive lookahead
		   X is a pattern you want to match
		   Y is a pattern you want to see immediately follow the pattern you are 
			looking for, but is not included in the match
		Example: Suppose you have a part number that is always four digits, 
		   with an optional suffix a, b, c, etc.  You want to grab all matches 
		   that look like part numbers with suffixes, but you want to exclude 
		   the suffix from the match.  
		   You could use a pattern \d{4}(?=[a-z]).  
		   If the input string is "1234, 2345x, 3456", it matches "2345".

x(?!y)		Negative lookahead
		   X is a pattern you want to match
		   Y is a pattern you do not want to see immediately follow the pattern 
		   	you are looking for, but is not included in the match
		Example: Suppose you have a part number that is always four digits, 
		   with an optional suffix a, b, c, etc.  You want to grab all matches that 
		   look like part numbers, but you want to exclude part numbers with suffixes.  
		   You could use a pattern \d{4}(?![a-z])
		   If the input string is "1234, 2345x, 3456", it matches "1234" and "3456".

Further Reading

For further reading, I recommend the following web sites and books:

为了进一步阅读,我建议以下网站和书籍:

For an excellent tutorial on RegExp, including information on the various language-specific implementations, go to: "Using Regular Expressions with Client Side Scripting", http://www.regular-expressions.info/vbscript.html

有关RegExp的出色教程,包括各种特定于语言的实现的信息,请访问:“在客户端脚本中使用正则表达式”, http://www.regular-expressions.info/vbscript.html

This site provides a very complete listing of the syntax for RegExp as implemented in VBScript and Jscript: "Regular Expression Syntax (Scripting)", http://msdn.microsoft.com/en-us/library/1400241x(VS.85).aspx

该站点提供了VBScript和Jscript中实现的RegExp语法的非常完整的列表:“正则表达式语法(脚本)”, http://msdn.microsoft.com/zh-cn/library/1400241x (VS.85 ) .aspx

For general information on VBScript RegExp: "Microsoft Beefs Up VBScript with Regular Expressions", http://msdn.microsoft.com/en-us/library/ms974570.aspx

有关VBScript RegExp的一般信息:“ Microsoft用正则表达式增强VBScript”, http://msdn.microsoft.com/zh-cn/library/ms974570.aspx

Another site for general information: "An Introduction to Regular Expression with VBScript", http://www.4guysfromrolla.com/webtech/090199-1.shtml

另一个提供一般信息的站点:“使用VBScript进行正则表达式简介”, http://www.4guysfromrolla.com/webtech/090199-1.shtml

I keep the following book at my desk for quick reference on RegExp and other VBScript matters: VBScript in a Nutshell, written by Lomax, Childs, and Petrusha.  Its descriptions of the various RegExp-related objects, properties, and methods are useful, but are a bit incomplete.  (For example, the book makes no mention of submatches or "look ahead".)

我将下面的书留在办公桌上,以快速参考RegExp和其他VBScript问题:Lomax,Childs和Petrusha编写

If you really want to master RegExp, then I suggest Mastering Regular Expressions, by Friedl.  That book is not for the faint-hearted, but if you really want to know about how RegExp works its magic, and how you can get the most out of RegExp, it is worth the effort.

如果您真的想精通RegExp,那么我建议您使用Friedl的

For additional examples and practical applications, I also recommend the Regular Expressions Cookbook, by Goyvaerts and Levithan.

对于其他示例和实际应用,我还推荐Goyvaerts和Levithan撰写的《

Acknowledgements

I would like to thank EE Member brettdj (https://www.experts-exchange.com/M_770818.html) for introducing me to the wonderful world of Regular Expressions, and for his generous assistance in reviewing this article before submission.  His very helpful comments helped improve the example workbook markedly.  webtubbs (https://www.experts-exchange.com/M_3470508.html) also reviewed this article before submission.  Any mistakes in this article are most certainly mine, and not theirs.

我要感谢EE成员brettdj( https://www.experts-exchange.com/M_770818.html ),向我介绍了奇妙的正则表达式世界,并感谢他在提交前对本文进行了慷慨的帮助。 他非常有帮助的评论极大地改善了示例工作簿。 webtubbs( https://www.experts-exchange.com/M_3470508.html )在提交之前还查看了本文。 本文中的任何错误肯定是我的,而不是他们的。

Code Samples

Below is source code for six "wrapper" user defined functions you may use and freely distribute, as long as you properly attribute authorship and where you found the code.  You will also find these functions in the included sample file "Regular Expressions Examples.xls".

下面是六个“包装器”用户定义函数的源代码,您可以使用它们并自由分发它们,只要您正确地分配了作者身份并在哪里找到了代码。 您还可以在随附的示例文件“ Regular Expressions Examples.xls”中找到这些功能。

Function RegExpFind(LookIn As String, PatternStr As String, Optional Pos, _
    Optional MatchCase As Boolean = True, Optional ReturnType As Long = 0, _
    Optional MultiLine As Boolean = False)
    
    ' Function written by Patrick G. Matthews.  You may use and distribute this code freely,
    ' as long as you properly credit and attribute authorship and the URL of where you
    ' found the code
    
    ' This function relies on the VBScript version of Regular Expressions, and thus some of
    ' the functionality available in Perl and/or .Net may not be available.  The full extent
    ' of what functionality will be available on any given computer is based on which version
    ' of the VBScript runtime is installed on that computer
    
    ' This function uses Regular Expressions to parse a string (LookIn), and return matches to a
    ' pattern (PatternStr).  Use Pos to indicate which match you want:
    ' Pos omitted               : function returns a zero-based array of all matches
    ' Pos = 1                   : the first match
    ' Pos = 2                   : the second match
    ' Pos = <positive integer>  : the Nth match
    ' Pos = 0                   : the last match
    ' Pos = -1                  : the last match
    ' Pos = -2                  : the 2nd to last match
    ' Pos = <negative integer>  : the Nth to last match
    ' If Pos is non-numeric, or if the absolute value of Pos is greater than the number of
    ' matches, the function returns an empty string.  If no match is found, the function returns
    ' an empty string.  (Earlier versions of this code used zero for the last match; this is
    ' retained for backward compatibility)
    
    ' If MatchCase is omitted or True (default for RegExp) then the Pattern must match case (and
    ' thus you may have to use [a-zA-Z] instead of just [a-z] or [A-Z]).
    
    ' ReturnType indicates what information you want to return:
    ' ReturnType = 0            : the matched values
    ' ReturnType = 1            : the starting character positions for the matched values
    ' ReturnType = 2            : the lengths of the matched values
    
    ' If you use this function in Excel, you can use range references for any of the arguments.
    ' If you use this in Excel and return the full array, make sure to set up the formula as an
    ' array formula.  If you need the array formula to go down a column, use TRANSPOSE()
    
    ' Note: RegExp counts the character positions for the Match.FirstIndex property as starting
    ' at zero.  Since VB6 and VBA has strings starting at position 1, I have added one to make
    ' the character positions conform to VBA/VB6 expectations
    
    ' Normally as an object variable I would set the RegX variable to Nothing; however, in cases
    ' where a large number of calls to this function are made, making RegX a static variable that
    ' preserves its state in between calls significantly improves performance
    
    Static RegX As Object
    Dim TheMatches As Object
    Dim Answer()
    Dim Counter As Long
    
    ' Evaluate Pos.  If it is there, it must be numeric and converted to Long
    
    If Not IsMissing(Pos) Then
        If Not IsNumeric(Pos) Then
            RegExpFind = ""
            Exit Function
        Else
            Pos = CLng(Pos)
        End If
    End If
    
    ' Evaluate ReturnType
    
    If ReturnType < 0 Or ReturnType > 2 Then
        RegExpFind = ""
        Exit Function
    End If
    
    ' Create instance of RegExp object if needed, and set properties
    
    If RegX Is Nothing Then Set RegX = CreateObject("VBScript.RegExp")
    With RegX
        .Pattern = PatternStr
        .Global = True
        .IgnoreCase = Not MatchCase
        .MultiLine = MultiLine
    End With
        
    ' Test to see if there are any matches
    
    If RegX.Test(LookIn) Then
        
        ' Run RegExp to get the matches, which are returned as a zero-based collection
        
        Set TheMatches = RegX.Execute(LookIn)
        
        ' Test to see if Pos is negative, which indicates the user wants the Nth to last
        ' match.  If it is, then based on the number of matches convert Pos to a positive
        ' number, or zero for the last match
        
        If Not IsMissing(Pos) Then
            If Pos < 0 Then
                If Pos = -1 Then
                    Pos = 0
                Else
                    
                    ' If Abs(Pos) > number of matches, then the Nth to last match does not
                    ' exist.  Return a zero-length string
                    
                    If Abs(Pos) <= TheMatches.Count Then
                        Pos = TheMatches.Count + Pos + 1
                    Else
                        RegExpFind = ""
                        GoTo Cleanup
                    End If
                End If
            End If
        End If
        
        ' If Pos is missing, user wants array of all matches.  Build it and assign it as the
        ' function's return value
        
        If IsMissing(Pos) Then
            ReDim Answer(0 To TheMatches.Count - 1)
            For Counter = 0 To UBound(Answer)
                Select Case ReturnType
                    Case 0: Answer(Counter) = TheMatches(Counter)
                    Case 1: Answer(Counter) = TheMatches(Counter).FirstIndex + 1
                    Case 2: Answer(Counter) = TheMatches(Counter).Length
                End Select
            Next
            RegExpFind = Answer
        
        ' User wanted the Nth match (or last match, if Pos = 0).  Get the Nth value, if possible
        
        Else
            Select Case Pos
                Case 0                          ' Last match
                    Select Case ReturnType
                        Case 0: RegExpFind = TheMatches(TheMatches.Count - 1)
                        Case 1: RegExpFind = TheMatches(TheMatches.Count - 1).FirstIndex + 1
                        Case 2: RegExpFind = TheMatches(TheMatches.Count - 1).Length
                    End Select
                Case 1 To TheMatches.Count      ' Nth match
                    Select Case ReturnType
                        Case 0: RegExpFind = TheMatches(Pos - 1)
                        Case 1: RegExpFind = TheMatches(Pos - 1).FirstIndex + 1
                        Case 2: RegExpFind = TheMatches(Pos - 1).Length
                    End Select
                Case Else                       ' Invalid item number
                    RegExpFind = ""
            End Select
        End If
    
    ' If there are no matches, return empty string
    
    Else
        RegExpFind = ""
    End If
    
Cleanup:
    ' Release object variables
    
    Set TheMatches = Nothing
    
End Function
Function RegExpFindExtended(LookIn As String, PatternStr As String, Optional Pos, _
    Optional MatchCase As Boolean = True, Optional MultiLine As Boolean = False)
    
    ' Function written by Patrick G. Matthews.  You may use and distribute this code freely,
    ' as long as you properly credit and attribute authorship and the URL of where you
    ' found the code
    
    ' This function relies on the VBScript version of Regular Expressions, and thus some of
    ' the functionality available in Perl and/or .Net may not be available.  The full extent
    ' of what functionality will be available on any given computer is based on which version
    ' of the VBScript runtime is installed on that computer
    
    ' This function uses Regular Expressions to parse a string (LookIn), and returns a 0-(N-1), 0-2
    ' array of the matched values (position 0 for the 2nd dimension), the starting character
    ' positions (position 1 for the 2nd dimension), and the length of the matched values (position 2
    ' for the 2nd dimension)
    
    ' Use Pos to indicate which match you want:
    ' Pos omitted               : function returns a zero-based array of all matches
    ' Pos = 1                   : the first match
    ' Pos = 2                   : the second match
    ' Pos = <positive integer>  : the Nth match
    ' Pos = 0                   : the last match
    ' Pos = -1                  : the last match
    ' Pos = -2                  : the 2nd to last match
    ' Pos = <negative integer>  : the Nth to last match
    ' If Pos is non-numeric, or if the absolute value of Pos is greater than the number of
    ' matches, the function returns an empty string.  If no match is found, the function returns
    ' an empty string.
    
    ' If MatchCase is omitted or True (default for RegExp) then the Pattern must match case (and
    ' thus you may have to use [a-zA-Z] instead of just [a-z] or [A-Z]).
    
    ' If you use this function in Excel, you can use range references for any of the arguments.
    ' If you use this in Excel and return the full array, make sure to set up the formula as an
    ' array formula.  If you need the array formula to go down a column, use TRANSPOSE()
    
    ' Note: RegExp counts the character positions for the Match.FirstIndex property as starting
    ' at zero.  Since VB6 and VBA has strings starting at position 1, I have added one to make
    ' the character positions conform to VBA/VB6 expectations
    
    ' Normally as an object variable I would set the RegX variable to Nothing; however, in cases
    ' where a large number of calls to this function are made, making RegX a static variable that
    ' preserves its state in between calls significantly improves performance
    
    Static RegX As Object
    Dim TheMatches As Object
    Dim Answer()
    Dim Counter As Long
    
    ' Evaluate Pos.  If it is there, it must be numeric and converted to Long
    
    If Not IsMissing(Pos) Then
        If Not IsNumeric(Pos) Then
            RegExpFindExtended = ""
            Exit Function
        Else
            Pos = CLng(Pos)
        End If
    End If
    
    ' Create instance of RegExp object
    
    If RegX Is Nothing Then Set RegX = CreateObject("VBScript.RegExp")
    With RegX
        .Pattern = PatternStr
        .Global = True
        .IgnoreCase = Not MatchCase
        .MultiLine = MultiLine
    End With
        
    ' Test to see if there are any matches
    
    If RegX.Test(LookIn) Then
        
        ' Run RegExp to get the matches, which are returned as a zero-based collection
        
        Set TheMatches = RegX.Execute(LookIn)
        
        If Not IsMissing(Pos) Then
            If Pos < 0 Then
                If Pos = -1 Then
                    Pos = 0
                Else
                    
                    ' If Abs(Pos) > number of matches, then the Nth to last match does not
                    ' exist.  Return a zero-length string
                    
                    If Abs(Pos) <= TheMatches.Count Then
                        Pos = TheMatches.Count + Pos + 1
                    Else
                        RegExpFindExtended = ""
                        GoTo Cleanup
                    End If
                End If
            End If
        End If
        
        ' If Pos is missing, user wants array of all matches.  Build it and assign it as the
        ' function's return value
        
        If IsMissing(Pos) Then
            ReDim Answer(0 To TheMatches.Count - 1, 0 To 2)
            For Counter = 0 To UBound(Answer)
                Answer(Counter, 0) = TheMatches(Counter)
                Answer(Counter, 1) = TheMatches(Counter).FirstIndex + 1
                Answer(Counter, 2) = TheMatches(Counter).Length
            Next
            RegExpFindExtended = Answer
        
        ' User wanted the Nth match (or last match, if Pos = 0).  Get the Nth value, if possible
        
        Else
            Select Case Pos
                Case 0                          ' Last match
                    ReDim Answer(0 To 0, 0 To 2)
                    Answer(0, 0) = TheMatches(TheMatches.Count - 1)
                    Answer(0, 1) = TheMatches(TheMatches.Count - 1).FirstIndex + 1
                    Answer(0, 2) = TheMatches(TheMatches.Count - 1).Length
                    RegExpFindExtended = Answer
                Case 1 To TheMatches.Count      ' Nth match
                    ReDim Answer(0 To 0, 0 To 2)
                    Answer(0, 0) = TheMatches(Pos - 1)
                    Answer(0, 1) = TheMatches(Pos - 1).FirstIndex + 1
                    Answer(0, 2) = TheMatches(Pos - 1).Length
                    RegExpFindExtended = Answer
                Case Else                       ' Invalid item number
                    RegExpFindExtended = ""
            End Select
        End If
    
    ' If there are no matches, return empty string
    
    Else
        RegExpFindExtended = ""
    End If
    
Cleanup:

    ' Release object variables
    
    Set TheMatches = Nothing
    
End Function
Function RegExpFindSubmatch(LookIn As String, PatternStr As String, Optional MatchPos, _
    Optional SubmatchPos, Optional MatchCase As Boolean = True, _
    Optional MultiLine As Boolean = False)
    
    ' Function written by Patrick G. Matthews.  You may use and distribute this code freely,
    ' as long as you properly credit and attribute authorship and the URL of where you
    ' found the code
    
    ' This function relies on the VBScript version of Regular Expressions, and thus some of
    ' the functionality available in Perl and/or .Net may not be available.  The full extent
    ' of what functionality will be available on any given computer is based on which version
    ' of the VBScript runtime is installed on that computer
    
    ' This function uses Regular Expressions to parse a string (LookIn), and return "submatches"
    ' from the various matches to a pattern (PatternStr).  In RegExp, submatches within a pattern
    ' are defined by grouping portions of the pattern within parentheses.
    
    ' Use MatchPos to indicate which match you want:
    ' MatchPos omitted               : function returns results for all matches
    ' MatchPos = 1                   : the first match
    ' MatchPos = 2                   : the second match
    ' MatchPos = <positive integer>  : the Nth match
    ' MatchPos = 0                   : the last match
    ' MatchPos = -1                  : the last match
    ' MatchPos = -2                  : the 2nd to last match
    ' MatchPos = <negative integer>  : the Nth to last match
    
    ' Use SubmatchPos to indicate which match you want:
    ' SubmatchPos omitted               : function returns results for all submatches
    ' SubmatchPos = 1                   : the first submatch
    ' SubmatchPos = 2                   : the second submatch
    ' SubmatchPos = <positive integer>  : the Nth submatch
    ' SubmatchPos = 0                   : the last submatch
    ' SubmatchPos = -1                  : the last submatch
    ' SubmatchPos = -2                  : the 2nd to last submatch
    ' SubmatchPos = <negative integer>  : the Nth to last submatch
    
    ' The return type for this function depends on whether your choice for MatchPos is looking for
    ' a single value or for potentially many.  All arrays returned by this function are zero-based.
    ' When the function returns a 2-D array, the first dimension is for the matches and the second
    ' dimension is for the submatches
    ' MatchPos omitted, SubmatchPos omitted: 2-D array of submatches for each match.  First dimension
    '                                        based on number of matches (0 to N-1), second dimension
    '                                        based on number of submatches (0 to N-1)
    ' MatchPos omitted, SubmatchPos used   : 2-D array (0 to N-1, 0 to 0) of the specified submatch
    '                                        from each match
    ' MatchPos used, SubmatchPos omitted   : 2-D array (0 to 0, 0 to N-1) of the submatches from the
    '                                        specified match
    ' MatchPos used, SubmatchPos used      : String with specified submatch from specified match
    
    ' For any submatch that is not found, the function treats the result as a zero-length string
    
    ' If MatchCase is omitted or True (default for RegExp) then the Pattern must match case (and
    ' thus you may have to use [a-zA-Z] instead of just [a-z] or [A-Z]).
    
    ' If you use this function in Excel, you can use range references for any of the arguments.
    ' If you use this in Excel and return the full array, make sure to set up the formula as an
    ' array formula.  If you need the array formula to go down a column, use TRANSPOSE()
    
    ' Normally as an object variable I would set the RegX variable to Nothing; however, in cases
    ' where a large number of calls to this function are made, making RegX a static variable that
    ' preserves its state in between calls significantly improves performance
    
    Static RegX As Object
    Dim TheMatches As Object
    Dim Mat As Object
    Dim Answer() As String
    Dim Counter As Long
    Dim SubCounter As Long
    
    ' Evaluate MatchPos.  If it is there, it must be numeric and converted to Long
    
    If Not IsMissing(MatchPos) Then
        If Not IsNumeric(MatchPos) Then
            RegExpFindSubmatch = ""
            Exit Function
        Else
            MatchPos = CLng(MatchPos)
        End If
    End If
    
    ' Evaluate SubmatchPos.  If it is there, it must be numeric and converted to Long
    
    If Not IsMissing(SubmatchPos) Then
        If Not IsNumeric(SubmatchPos) Then
            RegExpFindSubmatch = ""
            Exit Function
        Else
            SubmatchPos = CLng(SubmatchPos)
        End If
    End If
    
    ' Create instance of RegExp object
    
    If RegX Is Nothing Then Set RegX = CreateObject("VBScript.RegExp")
    With RegX
        .Pattern = PatternStr
        .Global = True
        .IgnoreCase = Not MatchCase
        .MultiLine = MultiLine
    End With
        
    ' Test to see if there are any matches
    
    If RegX.Test(LookIn) Then
        
        ' Run RegExp to get the matches, which are returned as a zero-based collection
        
        Set TheMatches = RegX.Execute(LookIn)
        
        ' If MatchPos is missing, user either wants array of all the submatches for each match, or an
        ' array of all the specified submatches for each match.  Build it and assign it as the
        ' function's return value
        
        If IsMissing(MatchPos) Then
            
            ' Return value is a 2-D array of all the submatches for each match
            
            If IsMissing(SubmatchPos) Then
                For Counter = 0 To TheMatches.Count - 1
                    Set Mat = TheMatches(Counter)
                    
                    ' To determine how many submatches there are we need to first evaluate a match.  That
                    ' is why we redim the array inside the for/next loop
                    
                    If Counter = 0 Then
                        ReDim Answer(0 To TheMatches.Count - 1, 0 To Mat.Submatches.Count - 1) As String
                    End If
                    
                    ' Loop through the submatches and populate the array.  If the Nth submatch is not
                    ' found, RegExp returns a zero-length string
                    
                    For SubCounter = 0 To UBound(Answer, 2)
                        Answer(Counter, SubCounter) = Mat.Submatches(SubCounter)
                    Next
                Next
            
            ' Return value is a 2-D array of the specified submatch for each match.
            
            Else
                For Counter = 0 To TheMatches.Count - 1
                    Set Mat = TheMatches(Counter)
                    
                    ' To determine how many submatches there are we need to first evaluate a match.  That
                    ' is why we redim the array inside the for/next loop.  If SubmatchPos = 0, then we want
                    ' the last submatch.  In that case reset SubmatchPos so it equals the submatch count.
                    ' Negative number indicates Nth to last; convert that to applicable "positive" position
                    
                    If Counter = 0 Then
                        ReDim Answer(0 To TheMatches.Count - 1, 0 To 0) As String
                        Select Case SubmatchPos
                            Case Is > 0: 'no adjustment needed
                            Case 0, -1: SubmatchPos = Mat.Submatches.Count
                            Case Is < -Mat.Submatches.Count: SubmatchPos = -SubmatchPos
                            Case Else: SubmatchPos = Mat.Submatches.Count + SubmatchPos + 1
                        End Select
                    End If
                    
                    ' Populate array with the submatch value.  If the submatch value is not found, or if
                    ' SubmatchPos > the count of submatches, populate with a zero-length string
                    
                    If SubmatchPos <= Mat.Submatches.Count Then
                        Answer(Counter, 0) = Mat.Submatches(SubmatchPos - 1)
                    Else
                        Answer(Counter, 0) = ""
                    End If
                Next
            End If
            RegExpFindSubmatch = Answer
            
        ' User wanted the info associated with the Nth match (or last match, if MatchPos = 0)
        
        Else
            
            ' If MatchPos = 0 then make MatchPos equal the match count.  If negative (indicates Nth
            ' to last), convert to equivalent position.
            
            Select Case MatchPos
                Case Is > 0: 'no adjustment needed
                Case 0, -1: MatchPos = TheMatches.Count
                Case Is < -TheMatches.Count: MatchPos = -MatchPos
                Case Else: MatchPos = TheMatches.Count + MatchPos + 1
            End Select
            
            ' As long as MatchPos does not exceed the match count, process the Nth match.  If the
            ' match count is exceeded, return a zero-length string
            
            If MatchPos <= TheMatches.Count Then
                Set Mat = TheMatches(MatchPos - 1)
                
                ' User wants a 2-D array of all submatches for the specified match; populate array.  If
                ' a particular submatch is not found, RegExp treats it as a zero-length string
                
                If IsMissing(SubmatchPos) Then
                    ReDim Answer(0 To 0, 0 To Mat.Submatches.Count - 1)
                    For SubCounter = 0 To UBound(Answer, 2)
                        Answer(0, SubCounter) = Mat.Submatches(SubCounter)
                    Next
                    RegExpFindSubmatch = Answer
                
                ' User wants a single value
                
                Else
                    
                    ' If SubmatchPos = 0 then make it equal count of submatches.  If negative, this
                    ' indicates Nth to last; convert to equivalent positive position
                    
                    Select Case SubmatchPos
                        Case Is > 0: 'no adjustment needed
                        Case 0, -1: SubmatchPos = Mat.Submatches.Count
                        Case Is < -Mat.Submatches.Count: SubmatchPos = -SubmatchPos
                        Case Else: SubmatchPos = Mat.Submatches.Count + SubmatchPos + 1
                    End Select
                    
                    ' If SubmatchPos <= count of submatches, then get that submatch for the specified
                    ' match.  If the submatch value is not found, or if SubmathPos exceeds count of
                    ' submatches, return a zero-length string.  In testing, it appeared necessary to
                    ' use CStr to coerce the return to be a zero-length string instead of zero
                    
                    If SubmatchPos <= Mat.Submatches.Count Then
                        RegExpFindSubmatch = CStr(Mat.Submatches(SubmatchPos - 1))
                    Else
                        RegExpFindSubmatch = ""
                    End If
                End If
            Else
                RegExpFindSubmatch = ""
            End If
        End If
    
    ' If there are no matches, return empty string
    
    Else
        RegExpFindSubmatch = ""
    End If
    
Cleanup:
    ' Release object variables
    Set Mat = Nothing
    Set TheMatches = Nothing
    
End Function
Function RegExpReplace(LookIn As String, PatternStr As String, Optional ReplaceWith As String = "", _
    Optional ReplaceAll As Boolean = True, Optional MatchCase As Boolean = True, _
    Optional MultiLine As Boolean = False)
    
    ' Function written by Patrick G. Matthews.  You may use and distribute this code freely,
    ' as long as you properly credit and attribute authorship and the URL of where you
    ' found the code
    
    ' This function relies on the VBScript version of Regular Expressions, and thus some of
    ' the functionality available in Perl and/or .Net may not be available.  The full extent
    ' of what functionality will be available on any given computer is based on which version
    ' of the VBScript runtime is installed on that computer
    
    ' This function uses Regular Expressions to parse a string, and replace parts of the string
    ' matching the specified pattern with another string.  The optional argument ReplaceAll
    ' controls whether all instances of the matched string are replaced (True) or just the first
    ' instance (False)
    
    ' If you need to replace the Nth match, or a range of matches, then use RegExpReplaceRange
    ' instead
    
    ' By default, RegExp is case-sensitive in pattern-matching.  To keep this, omit MatchCase or
    ' set it to True
    
    ' If you use this function from Excel, you may substitute range references for all the arguments
    
    ' Normally as an object variable I would set the RegX variable to Nothing; however, in cases
    ' where a large number of calls to this function are made, making RegX a static variable that
    ' preserves its state in between calls significantly improves performance
    
    Static RegX As Object
    
    If RegX Is Nothing Then Set RegX = CreateObject("VBScript.RegExp")
    With RegX
        .Pattern = PatternStr
        .Global = ReplaceAll
        .IgnoreCase = Not MatchCase
        .MultiLine = MultiLine
    End With
    
    RegExpReplace = RegX.Replace(LookIn, ReplaceWith)
    
End Function
Function RegExpReplaceRange(LookIn As String, PatternStr As String, Optional ReplaceWith As String = "", _
    Optional StartAt As Long = 1, Optional EndAt As Long = 0, Optional MatchCase As Boolean = True, _
    Optional MultiLine As Boolean = False)
    
    ' Function written by Patrick G. Matthews.  You may use and distribute this code freely,
    ' as long as you properly credit and attribute authorship and the URL of where you
    ' found the code
    
    ' This function relies on the VBScript version of Regular Expressions, and thus some of
    ' the functionality available in Perl and/or .Net may not be available.  The full extent
    ' of what functionality will be available on any given computer is based on which version
    ' of the VBScript runtime is installed on that computer
    
    ' This function uses Regular Expressions to parse a string, and replace parts of the string
    ' matching the specified pattern with another string.  In particular, this function replaces
    ' the specified range of matched values with the designated replacement string.
    
    ' StartAt indicates the start of the range of matches to be replaced.  Thus, 2 indicates
    ' that the second match gets replaced starts the range of matches to be replaced.  Use zero
    ' to specify the last match.
    
    ' EndAt indicates the end of the range of matches to be replaced.  Thus, a 5 would indicate
    ' that the 5th match is the last one to be replaced. Use zero to specify the last match.
    
    ' Thus, if you use StartAt = 2 and EndAt = 5, then the 2nd through 5th matches will be
    ' replaced.
    
    ' By default, RegExp is case-sensitive in pattern-matching.  To keep this, omit MatchCase
    ' or set it to True
    
    ' If you use this function from Excel, you may substitute range references for all the
    ' arguments
    
    ' Note: Match.FirstIndex assumes that the first character position in a string is zero.
    ' This differs from VBA and VB6, which has the first character at position 1
    
    ' Normally as an object variable I would set the RegX variable to Nothing; however, in
    ' cases where a large number of calls to this function are made, making RegX a static
    ' variable that preserves its state in between calls significantly improves performance
    
    Static RegX As Object
    Dim TheMatches As Object
    Dim StartStr As String
    Dim WorkingStr As String
    Dim Counter As Long
    Dim arr() As String
    Dim StrStart As Long
    Dim StrEnd As Long
    
    ' Instantiate RegExp object
    
    If RegX Is Nothing Then Set RegX = CreateObject("VBScript.RegExp")
    With RegX
        .Pattern = PatternStr
        
        ' First search needs to find all matches
        
        .Global = True
        .IgnoreCase = Not MatchCase
        .MultiLine = MultiLine
        
        ' Run RegExp to find the matches
        
        Set TheMatches = .Execute(LookIn)
        
        ' If there are no matches, no replacement need to happen
    
        If TheMatches.Count = 0 Then
            RegExpReplaceRange = LookIn
            GoTo Cleanup
        End If
        
        ' Reset StartAt and EndAt if necessary based on matches actually found.  Escape if StartAt > EndAt
        ' 0 or -1 indicates last match.  Negative number indicates Nth to last
        
        Select Case StartAt
            Case Is > 0: 'no adjustment needed
            Case 0, -1: StartAt = TheMatches.Count
            Case Is < -TheMatches.Count: StartAt = -StartAt
            Case Else: StartAt = TheMatches.Count + StartAt + 1
        End Select
        
        Select Case EndAt
            Case Is > 0: 'no adjustment needed
            Case 0, -1: EndAt = TheMatches.Count
            Case Is < -TheMatches.Count: EndAt = -EndAt
            Case Else: EndAt = TheMatches.Count + EndAt + 1
        End Select
        
        If StartAt > EndAt Then
            RegExpReplaceRange = LookIn
            GoTo Cleanup
        End If
        
        ' Now create an array for the partial strings.  The elements of the array correspond to...
        ' 0         : text before the 1st match
        ' 1         : the first match
        ' 2 * N - 2 : text between the (N - 1)th and the Nth match (repeat as needed)
        ' 2 * N - 1 : the Nth match (repeat as needed)
        ' X         : text after the last match (X = 2 * number of matches)
        
        ReDim arr(0 To 2 * TheMatches.Count) As String
        
        ' Loop through the matches to populate the array
        
        For Counter = 1 To TheMatches.Count
            
            ' If Counter = 1 then it's the first match, and we need the text before the first match.
            ' If not, then we need the text between the (N - 1)th and the Nth match
            
            If Counter = 1 Then
                arr(0) = Left(LookIn, TheMatches(0).FirstIndex)
            Else
                
                ' Starting character position for text between the (N - 1)th and the Nth match
                
                StrStart = TheMatches(Counter - 2).FirstIndex + TheMatches(Counter - 2).Length + 1
                
                ' Length of text between the (N - 1)th and the Nth match
                
                StrEnd = TheMatches(Counter - 1).FirstIndex - StrStart + 1
                arr(2 * Counter - 2) = Mid(LookIn, StrStart, StrEnd)
            End If
            
            ' Now we process the match.  If the match number is within the replacement range,
            ' then put the replacement value into the array.  If not, put in the match value
            
            If Counter >= StartAt And Counter <= EndAt Then
'                arr(2 * Counter - 1) = ReplaceWith
                arr(2 * Counter - 1) = .Replace(TheMatches(Counter - 1), ReplaceWith)
            Else
                arr(2 * Counter - 1) = TheMatches(Counter - 1)
            End If
            
            ' If Counter = TheMatches.Count then we need to get the text after the last match
            
            If Counter = TheMatches.Count Then
                StrStart = TheMatches(Counter - 1).FirstIndex + TheMatches(Counter - 1).Length + 1
                arr(UBound(arr)) = Mid(LookIn, StrStart)
            End If
        Next
    End With
    
    ' Use Join to concatenate the elements of the array for our answer
    
    RegExpReplaceRange = Join(arr, "")
    
Cleanup:
    
    ' Clear object variables
    
    Set TheMatches = Nothing
    
End Function
Function RegExpReplaceExpression(LookIn As String, PatternStr As String, Expression As String, _
    Optional StartAt As Long = 1, Optional EndAt As Long = 0, _
    Optional MatchCase As Boolean = True, Optional MultiLine As Boolean = False)
    
    ' Function written by Patrick G. Matthews.  You may use and distribute this code freely,
    ' as long as you properly credit and attribute authorship and the URL of where you
    ' found the code
    
    ' This function relies on the VBScript version of Regular Expressions, and thus some of
    ' the functionality available in Perl and/or .Net may not be available.  The full extent
    ' of what functionality will be available on any given computer is based on which version
    ' of the VBScript runtime is installed on that computer
    
    ' This function is intended for use only in Excel-based VBA projects, since it relies on the
    ' Excel Application.Evaluate method to process the expression.  The expression must use only
    ' normal arithmetic operators and/or native Excel functions.  Use $& to indicate where the
    ' entire match value should go, or $1 through $9 to use submatches 1 through 9
    
    ' This function uses Regular Expressions to parse a string, and replace parts of the string
    ' matching the specified pattern with another string.  In particular, this function replaces
    ' the specified range of matched values with the designated replacement string.  In a twist,
    ' though
    
    ' StartAt indicates the start of the range of matches to be replaced.  Thus, 2 indicates that
    ' the second match gets replaced.  Use zero to specify the last match.  Negative numbers
    ' indicate the Nth to last: -1 is the last, -2 the 2nd to last, etc
    
    ' EndAt indicates the end of the range of matches to be replaced.  Thus, a 5 would indicate
    ' that the 5th match is the last one to be replaced. Use zero to specify the last match.
    ' Negative numbers indicate the Nth to last: -1 is the last, -2 the 2nd to last, etc
    
    ' Thus, if you use StartAt = 2 and EndAt = 5, then the 2nd through 5th matches will be replaced.
    
    ' By default, RegExp is case-sensitive in pattern-matching.  To keep this, omit MatchCase or
    ' set it to True
    
    ' Note: Match.FirstIndex assumes that the first character position in a string is zero.  This
    ' differs from VBA and VB6, which has the first character at position 1
    
    ' Normally as an object variable I would set the RegX variable to Nothing; however, in cases
    ' where a large number of calls to this function are made, making RegX a static variable that
    ' preserves its state in between calls significantly improves performance
    
    Static RegX As Object
    Dim TheMatches As Object
    Dim StartStr As String
    Dim WorkingStr As String
    Dim Counter As Long
    Dim arr() As String
    Dim StrStart As Long
    Dim StrEnd As Long
    Dim Counter2 As Long
    
    ' Instantiate RegExp object
    
    If RegX Is Nothing Then Set RegX = CreateObject("VBScript.RegExp")
    With RegX
        .Pattern = PatternStr
        
        ' First search needs to find all matches
        
        .Global = True
        .IgnoreCase = Not MatchCase
        .MultiLine = MultiLine
        
        ' Run RegExp to find the matches
        
        Set TheMatches = .Execute(LookIn)
        
        ' If there are no matches, no replacement need to happen
    
        If TheMatches.Count = 0 Then
            RegExpReplaceExpression = LookIn
            GoTo Cleanup
        End If
        
        ' Reset StartAt and EndAt if necessary based on matches actually found.  Escape if StartAt > EndAt
        
        Select Case StartAt
            Case Is > 0: 'no adjustment needed
            Case 0, -1: StartAt = TheMatches.Count
            Case Is < -TheMatches.Count: StartAt = -StartAt
            Case Else: StartAt = TheMatches.Count + StartAt + 1
        End Select
            
        Select Case EndAt
            Case Is > 0: 'no adjustment needed
            Case 0, -1: EndAt = TheMatches.Count
            Case Is < -TheMatches.Count: EndAt = -EndAt
            Case Else: EndAt = TheMatches.Count + EndAt + 1
        End Select
        
        If StartAt > EndAt Then
            RegExpReplaceExpression = LookIn
            GoTo Cleanup
        End If
        
        ' Now create an array for the partial strings.  The elements of the array correspond to...
        ' 0         : text before the 1st match
        ' 1         : the first match
        ' 2 * N - 2 : text between the (N - 1)th and the Nth match (repeat as needed)
        ' 2 * N - 1 : the Nth match (repeat as needed)
        ' X         : text after the last match (X = 2 * number of matches)
        
        ReDim arr(0 To 2 * TheMatches.Count) As String
        
        ' Loop through the matches to populate the array
        
        For Counter = 1 To TheMatches.Count
            
            ' If Counter = 1 then it's the first match, and we need the text before the first match.
            ' If not, then we need the text between the (N - 1)th and the Nth match
            
            If Counter = 1 Then
                arr(0) = Left(LookIn, TheMatches(0).FirstIndex)
            Else
                
                ' Starting character position for text between the (N - 1)th and the Nth match
                
                StrStart = TheMatches(Counter - 2).FirstIndex + TheMatches(Counter - 2).Length + 1
                
                ' Length of text between the (N - 1)th and the Nth match
                
                StrEnd = TheMatches(Counter - 1).FirstIndex - StrStart + 1
                arr(2 * Counter - 2) = Mid(LookIn, StrStart, StrEnd)
            End If
            
            ' Now we process the match.  If the match number is within the replacement range,
            ' then put the replacement value into an Evaluate expression, and place the result
            ' into the array.  If not, put in the match value
            
            If Counter >= StartAt And Counter <= EndAt Then
            
                ' $& stands in for the entire match
                
                Expression = Replace(Expression, "$&", TheMatches(Counter - 1))
                
                ' Now loop through the Submatches, if applicable, and make replacements
                
                For Counter2 = 1 To TheMatches(Counter - 1).Submatches.Count
                    Expression = Replace(Expression, "$" & Counter2, TheMatches(Counter - 1).Submatches(Counter2 - 1))
                Next
                
                ' Evaluate the expression
                
                arr(2 * Counter - 1) = Evaluate(Expression)
            Else
                arr(2 * Counter - 1) = TheMatches(Counter - 1)
            End If
            
            ' If Counter = TheMatches.Count then we need to get the text after the last match
            
            If Counter = TheMatches.Count Then
                StrStart = TheMatches(Counter - 1).FirstIndex + TheMatches(Counter - 1).Length + 1
                arr(UBound(arr)) = Mid(LookIn, StrStart)
            End If
        Next
    End With
    
    ' Use Join to concatenate the elements of the array for our answer
    
    RegExpReplaceExpression = Join(arr, "")
    
Cleanup:
    
    ' Clear object variables
    
    Set TheMatches = Nothing
    Set RegX = Nothing
    
End Function

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

=-=-=-=-=-=-=-=-=-=-=-=-=- =-=-=-=-=- =-=-=-=-=- =-=-=-=-=- =-=-=-=-=- =-=-=-=-=- =-=-=

If you liked this article and want to see more from this author,  please click here.

如果您喜欢本文,并希望从该作者那里获得更多信息, 请单击此处

If you found this article helpful, please click the Yes button near the:

如果您认为这篇文章对您有所帮助 ,请单击以下位置旁边的“

      Was this article helpful?

本文是否有帮助?

label that is just below and to the right of this text.   Thanks!

此文字下方和右侧的标签。

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

=-=-=-=-=-=-=-=-=-=-=-=-=- =-=-=-=-=- =-=-=-=-=- =-=-=-=-=- =-=-=-=-=- =-=-=-=-=- =-=-=

Regular-Expressions-Examples.xls 正则表达式-Examples.xls

翻译自: https://www.experts-exchange.com/articles/1336/Using-Regular-Expressions-in-Visual-Basic-for-Applications-and-Visual-Basic-6.html

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值