LeetCode Regular Expression Matching 极品代码赏析

靖心

于 2013-11-23 20:51:26 发布

阅读量2.5k

点赞数 1

分类专栏： Algorithm算法算法和数据结构C++实现文章标签： LeetCode Regular Expression M 极品代码赏析

本文链接：https://blog.csdn.net/kenden23/article/details/16902083

版权

Algorithm算法同时被 2 个专栏收录

568 篇文章 2 订阅

订阅专栏

算法和数据结构C++实现

194 篇文章 212 订阅

订阅专栏

Regular Expression Matching

Implement regular expression matching with support for '.' and '*'.

'.' Matches any single character.
'*' Matches zero or more of the preceding element.

The matching should cover the entire input string (not partial).

The function prototype should be:
bool isMatch(const char *s, const char *p)

Some examples:
isMatch("aa","a") → false
isMatch("aa","aa") → true
isMatch("aaa","aa") → false
isMatch("aa", "a*") → true
isMatch("aa", ".*") → true
isMatch("ab", ".*") → true
isMatch("aab", "c*a*b") → true

为什么最后一个会匹配呢？原因是：'*' Matches zero or more of the preceding element.这个条件；

这个条件很容易忽略，也是最重要的条件了，因为这个相当于*可以把它的前缀字母”吃掉“；我就栽在这里了。

所以也是为什么我们需要“往前看”，看到pattern字符串后一位，假设这个字符串为const char *p; 那么我们就要先考虑到前一个指针，然后再比较当前指针。

下面代码来自：

http://leetcode.com/2011/09/regular-expression-matching.html

class Solution {
public:
    bool isMatch(const char *s, const char *p) {
		if (*p == '\0') return *s == '\0';

		// next char is not '*': must match current character
		if (*(p+1) != '*') {
			return ((*p == *s) || (*p == '.' && *s != '\0')) && isMatch(s+1, p+1);
		}
		// next char is '*'
		while ((*p == *s) || (*p == '.' && *s != '\0')) {
			if (isMatch(s, p+2)) return true;
			s++;
		}
		return isMatch(s, p+2);
	}
};

但是文中只是分析了原理，并没有分析各个代码的作用和如何实现的，所以，我这里就分析一下他的代码。

因为程序非常短，所以每一句都有非常重要的意义，看仔细了，漏掉任何一个细节都理解不了这个程序。

if (*p == '\0') return *s == '\0';
//if (*s == '\0') return *p == '\0';

不能增加后面一句，因为*(p+1) = '*'的时候，是可以吃掉*p的。

// next char is not '*': must match current character
		if (*(p+1) != '*') {
			return ((*p == *s) || (*p == '.' && *s != '\0')) && isMatch(s+1, p+1);
		}

这里是向前判断，不能判断*p，同样道理因为*可以吃点前一个字母。满足条件就递归。

// next char is '*'
		while ((*p == *s) || (*p == '.' && *s != '\0')) {
			if (isMatch(s, p+2)) return true;
			s++;
		}

*(p+1)=='*'的时候；判断当前*p的值如果等于*s或是*p=='.'的情况，就可以跃过当前p和p+1的值，直接在p+2的位置开始重新进行递归比较。

p+2位置之后的字符串可以和s串的s, s+1,s+2...等位置开始的子串匹配，如果匹配成功都可以。

return isMatch(s, p+2);

最后一句也非常重要，就是如果p+2开始的子串和s,s+1,s+2等所有子串都不匹配的时候，就吃掉这个*，就是这个时候把*当做是0个*p使用。

这也是为什么无需要判断*p==*s这个条件的原因。

然后就回溯，原文中提到的所谓回溯法就在这句体现出来了！够精巧吧！

2014-2-22 update

好困难的题目，主要是要考虑s为空的情况，这样就比较难处理了。

不过终于被我征服了（彻底理解了）。

我上面的分析有一定道理，不过感觉当时还是理解的不是那么透切。留着参考吧。

下面是我自己写的代码：和上面的程序，效率是一样的，不过递归处理有点不一样。

//2014-2-22 update
	bool isMatch_3(const char *s, const char *p) 
	{
		if (!*p) return !*s;
		for ( ; (*s && *p) && *(p+1) != '*' && (*s == *p || *p == '.'); s++, p++);

		if (!*s && !*p) return true;
		if (!*p || *(p+1) != '*') return false;
		//if (!*s) return isMatch(s, p+2);有了下面的return isMatch(s, p+2)，那么就不需要这句了。主要是处理特例：s为空，p == "c*d*g*"等字符串
		while (*s)
		{
			if (isMatch(s, p+2)) return true;//这就相当于先序遍历了
			if (*p != '.' && *p != *s) return false;
			s++;
		}
		return isMatch(s, p+2);//为什么是isMatch(s, p+2)而不是!*(p+2)?因为当s进入循环，然后s == ''空的时候，那么就需要继续递归判断
	}