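Today I looked into greedy matching in regular expressions. Of the three quantifier modes (greedy, lazy, and possessive), greedy is the most widely used and also the hardest to understand, so the examples below try to make it clear.
Definition: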
A greedy quantifier starts by looking at the entire string for a match. If no match is found, it eliminates the last character in the string and tries again. If a match is still not found, the last character is again discarded and the process repeats until a match is found or the string is left with no characters. All the quantifiers discussed to this point have been greedy.
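The procedure in this definition can be sketched in a few lines. This is only an illustration of the simplified description, not how a real engine is implemented (engines backtrack inside the pattern rather than re-testing whole substrings). The helper name `greedyPrefixMatch` is hypothetical, and it only handles patterns of the form `/.*<literal>/` that start matching at the beginning of a string without line breaks.

```javascript
// Hypothetical helper: imitates the "try the whole string, then drop the
// last character and retry" description for a pattern like /.*bbb/.
function greedyPrefixMatch(str, literal) {
  // Start with the whole string and shorten it by one character per attempt.
  for (let end = str.length; end >= literal.length; end--) {
    const candidate = str.slice(0, end);
    if (candidate.endsWith(literal)) {
      return candidate; // longest prefix that ends with the literal
    }
  }
  return null; // ran out of characters without finding a match
}

const simulated = greedyPrefixMatch("abbbaabbbaaabbb1234", "bbb");
const builtIn = "abbbaabbbaaabbb1234".match(/.*bbb/)[0];

console.log(simulated);             // abbbaabbbaaabbb
console.log(simulated === builtIn); // true
```

The sketch uses `console.log` instead of `document.write` so it also runs outside a browser.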
Examples:
-
var str1 = "abbbaabbbaaabbb1234";
var reg1 = /.*bbb/g; // .* takes as much as it can, then backs off until "bbb" fits
document.write(str1.match(reg1));
document.write("<br/>");
Result: abbbaabbbaaabbb
-
var str2 = "zewsdf skyyy flyok156sky skyy skyyyy";
var reg2 = /sky*/g; // "sk" followed by as many "y" characters as possible
document.write(str2.match(reg2));
document.write("<br/>");
Result: skyyy,sky,skyy,skyyyy
-
var str3 = "yyyyyyyyyyyyy";
var reg3 = /yy*/g; // one "y" followed by as many more as possible
document.write(str3.match(reg3));
Result: yyyyyyyyyyyyy
-
var str4 = "yyyyyykyyyyy";
var reg4 = /yy*/g; // the "k" ends the first run, so a second match follows it
document.write(str4.match(reg4));
Result: yyyyyy,yyyyy
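For contrast, here is a small sketch that runs the string from the first example through both the greedy `.*` and its lazy counterpart `.*?`. The variable names are chosen only for illustration, and `console.log` is used instead of `document.write` so the snippet also runs outside a browser.

```javascript
const sample = "abbbaabbbaaabbb1234";

// Greedy: .* takes as much as possible, so a single long match comes back.
const greedyMatches = sample.match(/.*bbb/g);

// Lazy: .*? takes as little as possible, so each "bbb" closes its own match.
const lazyMatches = sample.match(/.*?bbb/g);

console.log(greedyMatches.join(",")); // abbbaabbbaaabbb
console.log(lazyMatches.join(","));   // abbb,aabbb,aaabbb
```

Seeing the two side by side makes it easier to tell that the single long result in the first example comes from the quantifier being greedy, not from the `g` flag.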
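Summary:
Looking closely at the examples and their output, greedy matching works like this: the engine matches the pattern from left to right until it reaches a quantifier. The quantified item then takes as many characters as it can; if the rest of the pattern fails to match, it gives characters back one at a time until the whole expression matches or there is nothing left to try. With the g flag, once a match is found the search resumes right after the end of that match, in the remaining part of the string, and this repeats until every match has been found or no further match exists.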
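The "resume after the previous match" behavior can be observed directly with `RegExp.prototype.exec`, which updates the `lastIndex` property of a global regex after each match. The following is a minimal sketch using the string from the fourth example; the names `text`, `pattern`, and `spans` are chosen only for illustration.

```javascript
const text = "yyyyyykyyyyy";
const pattern = /yy*/g;
const spans = [];

let m;
while ((m = pattern.exec(text)) !== null) {
  // m.index is where this match starts;
  // pattern.lastIndex is where the next search will resume.
  spans.push({ match: m[0], start: m.index, next: pattern.lastIndex });
}

console.log(spans);
// First match:  "yyyyyy" starting at 0, next search resumes at 6.
// Second match: "yyyyy"  starting at 7, next search resumes at 12.
```

The first match stops at the `k` (index 6), the search resumes from there, skips the `k`, and the second greedy run takes every remaining `y`.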