How to Use Pattern Matching in Lua

Latest recommended article published on 2024-05-31 16:53:55

blueboy2000

Article link: https://blog.csdn.net/blueboy2000/article/details/5441745

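What is a regular expression?

Simply put, a regular expression (usually called a regex) is a string, written according to certain rules, that describes a set of strings to match.

Here is a very good introduction to regular expressions. Lua says that it implements "patterns" rather than "regexes". The two serve the same purpose, but Lua patterns are a simpler and less powerful notation: for example, they have no alternation operator (|), and repetition operators apply only to a single character class, not to a group. This can be confusing when reading the official documentation, because it never mentions "regular expressions".

You can find more information at http://www.lua.org/manual/5.0/manual.html#5.3.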

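Regular expressions in Lua

Lua provides several string functions that use patterns, including:

string.find(string, pattern) -- finds the first match of pattern in string and returns the indices where the match starts and ends (or nil if there is no match).

string.gfind(string, pattern) -- returns an iterator function that, each time it is called, returns the next match of pattern in string. This is the Lua 5.0 name; from Lua 5.1 on the function is called string.gmatch.

string.gsub(string, pattern, repl) -- returns a copy of string in which every match of pattern has been replaced by repl, followed by the number of substitutions made.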
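The three functions above can be tried with a short script. This is only a minimal sketch: the sample string is made up for illustration, and the first line picks string.gmatch when it exists and falls back to string.gfind otherwise.

```lua
-- string.gfind (Lua 5.0) was renamed string.gmatch in Lua 5.1; use whichever exists.
local gmatch = string.gmatch or string.gfind

local s = "hello world from Lua"   -- illustrative sample text

-- string.find returns the start and end indices of the first match.
local first, last = string.find(s, "wor")
print(first, last)                 -- 7 and 9

-- The iterator yields each successive match of the pattern.
local words = {}
for w in gmatch(s, "%a+") do
  table.insert(words, w)
end
print(table.concat(words, ","))    -- hello,world,from,Lua

-- string.gsub returns the new string and the number of substitutions.
local replaced, count = string.gsub(s, "o", "0")
print(replaced, count)             -- hell0 w0rld fr0m Lua and 3
```

Note that string.find reports positions, while the iterator returns the matched text itself.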

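How do you write a pattern?

So how do we construct a pattern?

In essence, a pattern is a string in which certain characters have a special meaning. A Lua pattern is built from the following items:

x (where x is not one of the magic characters ^$()%.[]*+-?) --- represents the character x itself.

[set] --- represents a set of characters. To express a range, put a minus sign - between the first and last characters of the range. For example, the set 3, 4, 5, 6 can be written as [3456] or as [3-6]. The % classes listed below can also be used inside a set. For example, [%w_] matches any letter, any digit, or an underscore.

. --- represents any character.

%a --- represents any letter; equivalent to [a-zA-Z] in the default C locale.

%c --- represents any control character.

%d --- represents any digit; equivalent to [0-9].

%l --- represents any lowercase letter; equivalent to [a-z].

%p --- represents any punctuation character.

%s --- represents any whitespace character (space, tab, newline, and so on).

%u --- represents any uppercase letter; equivalent to [A-Z].

%w --- represents any alphanumeric character; equivalent to [a-zA-Z0-9].

%x --- represents any hexadecimal digit.

%z --- represents the character with value 0. Note: a pattern cannot contain an embedded zero character, so use %z when you need to match one.

%x (where x is any non-alphanumeric character) --- represents the character x itself. This is the standard way to escape the magic characters. Any punctuation character (even a non-magic one) can be preceded by % to represent itself; for example, %% matches the character % and %$ matches the character $.

[^set] --- represents the complement of set, that is, any character not in the set. For example, [^0-9] or [^%d] matches any non-digit.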
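A few of the items in the list above can be checked with string.find. The sample strings here are made up for illustration; each call returns the start and end index of the first match.

```lua
-- Each call returns the start and end index of the first match.
local d1, d2 = string.find("abc123", "%d")        -- first digit
local n1, n2 = string.find("123abc", "[^%d]")     -- first non-digit
local r1, r2 = string.find("1234", "[3-6]")       -- first character in the range 3..6
local w1, w2 = string.find("  my_var", "[%w_]")   -- first letter, digit or underscore
local p1, p2 = string.find("50% off", "%%")       -- a literal percent sign
local a1, a2 = string.find("a.b", ".")            -- unescaped dot: any character
local e1, e2 = string.find("a.b", "%.")           -- escaped dot: a literal "."

-- Prints the start indices 4 4 3 3 3 1 2 (separated by tabs).
print(d1, n1, r1, w1, p1, a1, e1)
```

The last two lines show why escaping matters: an unescaped dot matches the very first character, while %. only matches the dot itself.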

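Every pattern is built from these basic elements. For example, to match a string of the form "###Abc" (where # stands for a digit), you can use the pattern "%d%d%dAbc". A single character class x can be followed by one of these repetition operators:

x* -- matches zero or more repetitions of x, always the longest possible sequence.

x+ -- matches one or more repetitions of x, always the longest possible sequence.

x- -- matches zero or more repetitions of x, but always the shortest possible sequence.

x? -- matches zero or one occurrence of x.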
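The difference between the longest and the shortest match is easiest to see side by side. The following sketch uses made-up sample strings and only string.find and string.sub.

```lua
local s = "<b>bold</b> and <i>italic</i>"   -- illustrative sample text

-- Greedy: .* runs on to the last ">" in the string.
local gi, gj = string.find(s, "<.*>")
print(string.sub(s, gi, gj))    -- <b>bold</b> and <i>italic</i>

-- Lazy: .- stops at the first ">" that lets the pattern match.
local li, lj = string.find(s, "<.->")
print(string.sub(s, li, lj))    -- <b>

-- Three digits followed by "Abc", as in the "###Abc" example.
local i, j = string.find("id 123Abc", "%d%d%dAbc")    -- 4 and 9
local none = string.find("id 12Abc", "%d%d%dAbc")     -- nil: only two digits

-- ? makes the preceding character optional.
local _, c1 = string.find("color", "colou?r")         -- match ends at 5
local _, c2 = string.find("colour", "colou?r")        -- match ends at 6

-- * may match nothing at all, + needs at least one character.
local z1, z2 = string.find("abc", "%d*")              -- 1 and 0: an empty match
local o1 = string.find("abc", "%d+")                  -- nil
```

An end index smaller than the start index, as in the %d* case, is how string.find reports an empty match.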

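In addition, the following items can appear in a pattern:

%n -- where n is a digit from 1 to 9; matches a substring equal to the n-th captured string.

%bxy -- where x and y are two distinct characters; matches a substring that starts with x, ends with y, and in which x and y are balanced. For example, %b() matches an expression with balanced parentheses.

^ -- when it appears at the beginning of a pattern, it anchors the match at the beginning of the string.

$ -- when it appears at the end of a pattern, it anchors the match at the end of the string.

Anywhere else, ^ and $ have no special meaning and represent themselves (apart from ^ at the start of a [set], which negates the set).

A pattern can contain sub-patterns enclosed in parentheses (). These are captures: when the pattern matches, the substring matched by each capture is stored, numbered by the order of its opening parenthesis, and can be returned by the matching function or referenced with %n.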
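Captures, %n, %bxy and the two anchors can be sketched as follows. The input strings are invented for illustration; string.find returns any captures after the two indices.

```lua
-- Captures: string.find returns the two indices, then one value per capture.
local _, _, key, value = string.find("name = Lua", "(%w+)%s*=%s*(%w+)")
print(key, value)               -- name and Lua

-- %1 must equal the first capture, so the closing quote matches the opening one.
local _, _, quote, body = string.find([[say "it's fine" now]], "([\"'])(.-)%1")
print(body)                     -- it's fine

-- %b() matches a substring with balanced parentheses.
local b1, b2 = string.find("f(a(b)c) + g(x)", "%b()")
print(b1, b2)                   -- 2 and 8

-- ^ and $ anchor the match to the start and end of the string.
local h1, h2 = string.find("hello world", "^hello")    -- 1 and 5
local m = string.find("hello world", "^world")         -- nil: not at the start
local t1, t2 = string.find("hello world", "world$")    -- 7 and 11
```

In the quoted-string case the apostrophe inside the text does not end the match, because %1 only accepts the same character that opened the quote.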

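Example

Suppose we are writing a function that counts the murlocs that die nearby. How would we do it?

For simplicity, assume that every murloc's name contains the string "鱼人" ("murloc"). The pattern "鱼人.*死亡了" is then enough to match messages such as "鱼人死亡了" ("Murloc died"), "鱼人冒险者死亡了" ("Murloc Adventurer died"), and "胖子鱼人死亡了" ("Fat Murloc died").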
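One way to turn this into a counting function is sketched below. The function name countMurlocDeaths, the idea that the messages arrive as a table of strings, and the extra non-murloc message are all assumptions made for illustration; the source file must be saved as UTF-8 so that the pattern and the messages use the same bytes.

```lua
-- Hypothetical helper: counts the messages that report a murloc death.
local function countMurlocDeaths(messages)
  local count = 0
  for _, msg in ipairs(messages) do
    -- string.find returns nil when the pattern does not match.
    if string.find(msg, "鱼人.*死亡了") then
      count = count + 1
    end
  end
  return count
end

-- Illustrative log; the last entry ("Boar died") is made up as a non-match.
local log = {
  "鱼人死亡了",
  "鱼人冒险者死亡了",
  "胖子鱼人死亡了",
  "野猪死亡了",
}

local deaths = countMurlocDeaths(log)
print(deaths)   -- 3
```

Because the pattern is not anchored with ^, it also matches names in which "鱼人" is not the first word, such as the third message.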

For a bigger challenge, let's go one step further. Instead of just counting the deaths, let's replace the name of whatever died with "An unholy demon spawn". Here the messages are in English and the murlocs' names begin with "Greymist". Since the rest of the name varies, we need a pattern that matches all of them. To do this, we can use "^Greymist .* dies". This pattern matches strings that start with the word "Greymist", followed by a space, any number of characters, and then " dies". string.gsub does not modify its argument; it returns the new string (followed by the number of substitutions), so we assign the result back:

chatstring = string.gsub(chatstring, "^Greymist .* dies", "An unholy demon spawn dies");
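The replacement can be wrapped in a small function and tried on a few messages. The function name renameDeadMob and the sample messages are hypothetical; only the pattern and the replacement text come from the line above.

```lua
-- Hypothetical helper: rewrites the death message of a "Greymist ..." mob.
local function renameDeadMob(chatstring)
  -- gsub returns two values; assigning to one variable keeps only the new string.
  local result = string.gsub(chatstring, "^Greymist .* dies", "An unholy demon spawn dies")
  return result
end

-- Illustrative messages, not taken from a real combat log.
local changed = renameDeadMob("Greymist Coastrunner dies.")
local untouched = renameDeadMob("Wolf dies.")
local tooShort = renameDeadMob("Greymist dies.")

print(changed)     -- An unholy demon spawn dies.
print(untouched)   -- Wolf dies.
print(tooShort)    -- Greymist dies.
```

The last case is left unchanged because the pattern requires a space both after "Greymist" and before "dies", so it does not match a mob whose whole name is "Greymist".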

　　Afterthoughts

Hopefully this has given you a good start on how to use regular expressions. There are many resources online about them, and if there's anything not covered here, you should be able to find it somewhere out there.