Regular Expression Matching
Problem:
Given an input string (s) and a pattern (p), implement regular expression matching with support for '.' and '*'.
'.' Matches any single character.
'*' Matches zero or more of the preceding element.
The matching should cover the entire input string (not partial).
Note:
s could be empty and contains only lowercase letters a-z.
p could be empty and contains only lowercase letters a-z, and characters like . or *.
Example 1:
Input:
s = "aa"
p = "a"
Output: false
Explanation: "a" does not match the entire string "aa".
Example 2:
Input:
s = "aa"
p = "a*"
Output: true
Explanation: '*' means zero or more of the preceding element, 'a'. Therefore, by repeating 'a' once, it becomes "aa".
Example 3:
Input:
s = "ab"
p = ".*"
Output: true
Explanation: ".*" means "zero or more (*) of any character (.)".
Example 4:
Input:
s = "aab"
p = "c*a*b"
Output: true
Explanation: c can be repeated 0 times, a can be repeated 1 time. Therefore, it matches "aab".
Example 5:
Input:
s = "mississippi"
p = "mis*is*p*."
Output: false
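For checking the expected outputs above, one possible reference oracle is the standard library's own regex engine. This is only a sketch for testing, not a solution to the problem: the name referenceMatch is hypothetical, and it assumes the pattern is valid (a pattern such as "*a" would make std::regex throw std::regex_error).

```cpp
#include <regex>
#include <string>

// Reference oracle for testing only (an assumption, not the intended solution).
// std::regex_match succeeds only when the WHOLE string matches the pattern,
// which is the same "entire input" requirement as in the problem statement.
bool referenceMatch(const std::string& s, const std::string& p) {
    const std::regex re(p);  // default ECMAScript grammar; throws on an invalid pattern
    return std::regex_match(s, re);
}
```

Calling referenceMatch("aab", "c*a*b") corresponds to Example 4, and referenceMatch("mississippi", "mis*is*p*.") to Example 5.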
Approach 1: Recursion
Intuition
If there were no Kleene stars (the * wildcard character for regular expressions), the problem would be easier: we simply check from left to right whether each character of the text matches the pattern.
When a star is present, we may need to check many different suffixes of the text and see if they match the rest of the pattern. A recursive solution is a straightforward way to represent this relationship.
Algorithm
Without a Kleene star, our solution would look like this:
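#include <string>
using namespace std;

class Solution {
public:
    bool isMatch(string s, string p) {
        // An empty pattern matches only an empty string.
        if (p.empty()) return s.empty();
        // The first characters match if s is non-empty and p[0] is the same letter or '.'.
        bool first_match = (!s.empty() && (p[0] == s[0] || p[0] == '.'));
        // Consume one character from both strings and recurse on the rest.
        return first_match && isMatch(s.substr(1), p.substr(1));
    }
};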
If a star is present in the pattern, it will be in the second position, p[1]. Then we may either ignore this part of the pattern (the star matches zero characters) or, if the first characters match, delete that character from the text and keep the pattern unchanged. If the remaining strings match after either of these operations, then the initial inputs matched.
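#include <string>
using namespace std;

class Solution {
public:
    bool isMatch(string s, string p) {
        // An empty pattern matches only an empty string.
        if (p.empty()) return s.empty();
        bool first_match = (!s.empty() && (p[0] == s[0] || p[0] == '.'));
        if (p.length() >= 2 && p[1] == '*')
            // Either skip "x*" entirely (zero repetitions), or consume one
            // matching character of s and keep the same pattern.
            return (isMatch(s, p.substr(2)) || (first_match && isMatch(s.substr(1), p)));
        else
        {
            // No star: consume one character from both strings.
            return first_match && isMatch(s.substr(1), p.substr(1));
        }
    }
};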
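The version above builds new strings with substr on every call. As a minimal sketch of the same recursion without the copies, the two strings can be passed by reference together with the current positions. The names matchFrom and isMatchByIndex are hypothetical and introduced only for this illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: does the suffix s[i:] match the suffix p[j:]?
bool matchFrom(const std::string& s, std::size_t i,
               const std::string& p, std::size_t j) {
    // An exhausted pattern matches only an exhausted text.
    if (j == p.size()) return i == s.size();
    const bool firstMatch = i < s.size() && (p[j] == s[i] || p[j] == '.');
    if (j + 1 < p.size() && p[j + 1] == '*') {
        // Skip "x*" (zero repetitions), or consume one matching character.
        return matchFrom(s, i, p, j + 2) ||
               (firstMatch && matchFrom(s, i + 1, p, j));
    }
    // No star: advance both positions by one.
    return firstMatch && matchFrom(s, i + 1, p, j + 1);
}

// Hypothetical wrapper with the same shape as isMatch.
bool isMatchByIndex(const std::string& s, const std::string& p) {
    return matchFrom(s, 0, p, 0);
}
```

The recursion is unchanged, so the worst-case running time is still exponential; only the string copying is avoided.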
Approach 2: Dynamic Programming
Intuition
As the problem has an optimal substructure, it is natural to cache intermediate results. We ask the question dp(i, j): do text[i:] and pattern[j:] match? We can describe our answer in terms of answers to questions involving smaller strings.
Algorithm
We proceed with the same recursion as in Approach 1, except that, because calls will only ever be made to match(text[i:], pattern[j:]), we use dp(i, j) to handle those calls instead. This saves us expensive string-building operations and allows us to cache the intermediate results.
Bottom-up, filling the table from the end of both strings:
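#include <string>
#include <vector>
using namespace std;

class Solution {
public:
    bool isMatch(string s, string p) {
        int slen = s.length();
        int plen = p.length();
        // dp[i][j] is true when s[i:] matches p[j:]. The table is sized from
        // the inputs so that long strings cannot overflow a fixed-size array.
        vector<vector<bool>> dp(slen + 1, vector<bool>(plen + 1, false));
        dp[slen][plen] = true;  // empty text matches empty pattern
        for (int i = slen; i >= 0; i--)
            for (int j = plen - 1; j >= 0; j--)
            {
                bool fmatch = (i < slen && (p[j] == s[i] || p[j] == '.'));
                if (j + 1 < plen && p[j + 1] == '*')
                {
                    // Skip "x*", or consume one matching character of s.
                    dp[i][j] = dp[i][j + 2] || (fmatch && dp[i + 1][j]);
                }
                else
                {
                    dp[i][j] = (fmatch && dp[i + 1][j + 1]);
                }
            }
        return dp[0][0];
    }
};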
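The loops above fill every cell of the table. The same dp(i, j) question can also be answered lazily, by recursion with memoization, so that only the cells actually reached are computed. The following is a sketch under that assumption; the names dpMemo and isMatchMemo and the -1/0/1 encoding of the cache are choices made for this illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: memoized answer to "does s[i:] match p[j:]?".
// memo[i][j] is -1 while unknown, otherwise 0 (false) or 1 (true).
bool dpMemo(const std::string& s, const std::string& p,
            std::size_t i, std::size_t j,
            std::vector<std::vector<int>>& memo) {
    if (memo[i][j] != -1) return memo[i][j] == 1;
    bool ans;
    if (j == p.size()) {
        ans = (i == s.size());
    } else {
        const bool firstMatch = i < s.size() && (p[j] == s[i] || p[j] == '.');
        if (j + 1 < p.size() && p[j + 1] == '*') {
            // Skip "x*", or consume one matching character of s.
            ans = dpMemo(s, p, i, j + 2, memo) ||
                  (firstMatch && dpMemo(s, p, i + 1, j, memo));
        } else {
            ans = firstMatch && dpMemo(s, p, i + 1, j + 1, memo);
        }
    }
    memo[i][j] = ans ? 1 : 0;
    return ans;
}

// Hypothetical wrapper with the same shape as isMatch.
bool isMatchMemo(const std::string& s, const std::string& p) {
    std::vector<std::vector<int>> memo(s.size() + 1,
                                       std::vector<int>(p.size() + 1, -1));
    return dpMemo(s, p, 0, 0, memo);
}
```

Each (i, j) pair is computed at most once, so the work is bounded by the size of the table, as in the iterative version.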
Approach 3: Dynamic Programming
Bottom-up
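f[i][j] denotes whether the first i characters of s match the first j characters of p (both strings are 1-indexed here).
1) p[j] != '*': f[i][j] = f[i-1][j-1] && (s[i] matches p[j])
2) p[j] == '*': we need to enumerate how many characters the star stands for
f[i][j] = f[i][j-2]                                           x* matches zero characters
       || (f[i-1][j-2] && s[i] matches p[j-1])                x* matches one character
       || (f[i-2][j-2] && s[i-1] and s[i] both match p[j-1])  x* matches two characters
       ......
Writing out the same expansion for f[i-1][j]:
f[i-1][j] = f[i-1][j-2]                                       x* matches zero characters
         || (f[i-2][j-2] && s[i-1] matches p[j-1])            x* matches one character
         ......
Every term of the first expansion after f[i][j-2] is the corresponding term of the second expansion with one extra condition: s[i] matches p[j-1].
So the first expression simplifies to:
f[i][j] = f[i][j-2] || (f[i-1][j] && s[i] matches p[j-1])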
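#include <string>
#include <vector>
using namespace std;

class Solution {
public:
    bool isMatch(string s, string p) {
        int n = s.size(), m = p.size();
        // Prepend a space so that both strings are 1-indexed, as in the recurrence.
        s = ' ' + s;
        p = ' ' + p;
        vector<vector<bool>> f(n + 1, vector<bool>(m + 1));
        f[0][0] = true;
        for (int i = 0; i <= n; i++)
        {
            for (int j = 0; j <= m; j++)
            {
                if (!i && !j) f[i][j] = true;
                else
                {
                    // A character followed by '*' is never matched on its own;
                    // it is handled together with the star at column j + 1.
                    if (j + 1 <= m && p[j + 1] == '*') continue;
                    if (p[j] != '*')
                    {
                        // Case 1: plain character or '.'.
                        if (p[j] == '.' || s[i] == p[j])
                            if (i > 0 && j > 0) f[i][j] = f[i - 1][j - 1];
                    }
                    else
                    {
                        // Case 2: "x*" matches zero characters ...
                        if (j >= 2) f[i][j] = f[i][j - 2];
                        // ... or absorbs s[i] on top of a match for the first i - 1 characters.
                        if (i > 0 && j > 0)
                        {
                            if (p[j - 1] == '.' || s[i] == p[j - 1])
                                if (f[i - 1][j])
                                    f[i][j] = true;
                        }
                    }
                }
            }
        }
        return f[n][m];
    }
};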
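To make the simplification step more concrete, here is a sketch of the unsimplified recurrence, in which case 2 explicitly enumerates how many characters k the star absorbs. It is written only for comparison and is an assumption of this illustration, not part of the solution above: the name isMatchEnumerate is hypothetical, and the strings are kept 0-indexed, so the i-th character of s is s[i - 1].

```cpp
#include <string>
#include <vector>

// Hypothetical comparison version: same table f[i][j] over prefixes,
// but the star case tries k = 0, 1, 2, ... absorbed characters explicitly.
bool isMatchEnumerate(const std::string& s, const std::string& p) {
    const int n = static_cast<int>(s.size());
    const int m = static_cast<int>(p.size());
    std::vector<std::vector<bool>> f(n + 1, std::vector<bool>(m + 1, false));
    f[0][0] = true;
    for (int i = 0; i <= n; ++i) {
        for (int j = 1; j <= m; ++j) {
            if (p[j - 1] != '*') {
                // Case 1: the j-th pattern character must match the i-th text character.
                f[i][j] = i > 0 && f[i - 1][j - 1] &&
                          (p[j - 1] == '.' || p[j - 1] == s[i - 1]);
            } else if (j >= 2) {
                const char x = p[j - 2];  // the element repeated by '*'
                for (int k = 0; k <= i; ++k) {
                    // The k-th absorbed character (counting from the end) must match x.
                    if (k > 0 && !(x == '.' || x == s[i - k])) break;
                    if (f[i - k][j - 2]) { f[i][j] = true; break; }
                }
            }
        }
    }
    return f[n][m];
}
```

The inner loop over k adds a factor proportional to the length of s per cell; the simplified recurrence replaces that loop with a single lookup of f[i-1][j].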