Implement regular expression matching with support for '.' and '*'.
'.' matches any single character.
'*' matches zero or more of the preceding element.
The matching should cover the entire input string (not partial).
The function prototype should be:
bool isMatch(const char *s, const char *p)
Some examples:
isMatch("aa","a") → false
isMatch("aa","aa") → true
isMatch("aaa","aa") → false
isMatch("aa", "a*") → true
isMatch("aa", ".*") → true
isMatch("ab", ".*") → true
isMatch("aab", "c*a*b") → true
Idea:
This problem deserves its "hard" tag; it is very tricky. It can be solved in two ways: backtracking or dynamic programming.
Solution One, backtracking.
We use two pointers, one into the string and one into the pattern: a string pointer "ps" and a pattern pointer "pp". Write a backtracking function bool match(string s, int ps, string p, int pp) that determines whether the remaining string, starting at ps, can be matched by the remaining pattern, starting at pp, given that everything before ps has already been matched.
First check the positions ps and pp. If both have reached the end of their strings, that is ps==len(s) and pp==len(p), we can only have arrived here because s[0, len(s)-1] matches p[0, len(p)-1], so return true.
If ps!=len(s) and pp==len(p), the pattern is exhausted while characters remain in s, so s and p cannot match; return false.
If ps==len(s) and pp!=len(p), we need to pay a bit more attention. For example, s = "ab", p = "aba*b*c*" should return true, while s = "ab", p = "abab*b*c*" should return false. In general, we traverse the remaining characters of p and check whether they form a pattern like "a*b*c*.*...", that is, every remaining element is followed by a '*', so the whole remainder can match the empty string. This is easy to implement.
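The end-of-string check described above can be sketched on its own. This is a minimal illustration, not the post's final code: the class name PatternTail and the helper canMatchEmpty are hypothetical, and the sketch assumes a well-formed pattern in which every '*' follows an element.

```java
class PatternTail {
    // Hypothetical helper: returns true if p[pp..] can match the empty string,
    // i.e. the remainder is a sequence of <element>'*' pairs such as "a*b*.*".
    // Assumes a well-formed pattern (no leading or doubled '*').
    static boolean canMatchEmpty(String p, int pp) {
        // An odd number of remaining characters cannot be split into pairs.
        if ((p.length() - pp) % 2 != 0) return false;
        for (int i = pp; i < p.length(); i += 2) {
            // The second character of every pair must be '*'.
            if (p.charAt(i + 1) != '*') return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // s = "ab" is consumed by the first two pattern characters, so pp = 2.
        System.out.println(canMatchEmpty("aba*b*c*", 2));  // true
        System.out.println(canMatchEmpty("abab*b*c*", 2)); // false
    }
}
```

The two calls in main correspond to the two examples in the text: after "ab" is consumed, the tail "a*b*c*" can vanish, while the tail "ab*b*c*" cannot.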
Now we can check the pattern character at position pp: pat = p[pp]. There are three cases: pat=='*', pat=='.', and pat being an ordinary character such as [a-z].
In the case pat=='*', first find the preceding element pre = p[pp-1]. Then, for each i with ps<=i<=len(s), check whether the substring s[ps, i) (end exclusive) matches pre*, that is, whether it is one of "", pre, prepre, ... If it does, set ps = i and pp++ and check the rest of the pattern. If that recursive call returns true, we return true; if it returns false, move on to the next i. If s[ps, i) does not match pre*, we return false at once, since no larger i can make s[ps, i) match pre* either. Note that if pre=='.', there is no need to check the substring against pre*, because '.' matches any character.
In the other two cases, we look at p[pp+1]. If p[pp+1]=='*', just advance pp by one and recurse; the next call lands in the '*' case, does all the work, and returns whether the rest matches. If p[pp+1]!='*', compare the current p[pp] with s[ps] and recurse with ps++ and pp++. Return true when p[pp] matches s[ps] and the recursive call returns true; otherwise return false.
Code:
public class Solution {
    public boolean isMatch(String s, String p) {
        return match(s, 0, p, 0);
    }

    // Does the rest of s (from ps) match the rest of p (from pp)?
    private boolean match(String s, int ps, String p, int pp){
        // Both exhausted: everything matched.
        if( ps==s.length() && pp==p.length() ) return true;
        // String exhausted: the remaining pattern must look like x*y*z*...
        if( ps==s.length() && pp!=p.length() ){
            if((p.length()-pp)%2==0){
                for(int i=pp;i<p.length();i=i+2){
                    if(p.charAt(i+1)!='*') return false;
                }
                return true;
            }
            return false;
        }
        // Pattern exhausted but characters remain in s.
        if(ps!=s.length() && pp==p.length()) return false;
        if(p.charAt(pp)=='*'){
            // Case 1: let pre* consume s[ps, i) for every feasible i.
            char pre = p.charAt(pp-1);
            String m = "";  // "", pre, prepre, ...
            for(int i=ps;i<=s.length();i++){
                if(pre=='.'){
                    // ".*" matches any substring, no comparison needed.
                    if(match(s, i, p, pp+1)) return true;
                }
                else{
                    if(s.substring(ps, i).equals(m) ){
                        if (match(s, i, p, pp+1)) return true;
                    }else{
                        // A longer substring cannot match pre* either.
                        break;
                    }
                    m = m+pre;
                }
            }
            return false;
        }else if(p.charAt(pp)=='.'){
            // Case 2: '.'; if followed by '*', let the '*' case do the work.
            if(pp<p.length()-1 && p.charAt(pp+1)=='*'){
                return match(s, ps, p, pp+1);
            }else{
                return match(s, ps+1, p, pp+1);
            }
        }else{
            // Case 3: ordinary character.
            if(pp<p.length()-1 && p.charAt(pp+1)=='*'){
                return match(s, ps, p, pp+1);
            }else{
                return s.charAt(ps)==p.charAt(pp) && match(s, ps+1, p, pp+1);
            }
        }
    }
}
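The worst-case time complexity can be exponential in the input length, because every '*' may try each possible split point and recurse on the rest. The space complexity is not O(1): the recursion is up to len(p) levels deep, and the temporary strings built while matching pre* can be as long as s, so the extra space is O(len(s)+len(p)).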
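To get a feel for how quickly backtracking cost can grow, the sketch below counts recursive calls on an adversarial input: s is n copies of 'a' and p is n copies of "a*" followed by 'b', which can never match. It uses a compact recursive matcher written for this illustration, not the code above; the class name BacktrackCost and the method countCalls are hypothetical.

```java
class BacktrackCost {
    static long calls = 0;

    // Compact backtracking matcher (a variant for illustration only):
    // does s[i..] match p[j..]? Assumes a well-formed pattern.
    static boolean match(String s, int i, String p, int j) {
        calls++;
        if (j == p.length()) return i == s.length();
        boolean first = i < s.length()
                && (p.charAt(j) == '.' || p.charAt(j) == s.charAt(i));
        if (j + 1 < p.length() && p.charAt(j + 1) == '*') {
            // Either skip "x*" entirely, or consume one character and stay on "x*".
            return match(s, i, p, j + 2) || (first && match(s, i + 1, p, j));
        }
        return first && match(s, i + 1, p, j + 1);
    }

    // Builds s = "a" * n and p = "a*" * n + "b", then counts the calls made.
    static long countCalls(int n) {
        StringBuilder s = new StringBuilder();
        StringBuilder p = new StringBuilder();
        for (int k = 0; k < n; k++) {
            s.append('a');
            p.append("a*");
        }
        p.append('b');
        calls = 0;
        match(s.toString(), 0, p.toString(), 0);
        return calls;
    }

    public static void main(String[] args) {
        for (int n = 2; n <= 12; n += 2) {
            System.out.println("n=" + n + " calls=" + countCalls(n));
        }
    }
}
```

Each time n doubles, the call count grows by far more than a constant factor, which is the behaviour that dynamic programming avoids by evaluating each (i, j) pair only once.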
Solution Two, dynamic programming.
Construct a matrix bool dp[][] of size n*m, where n=len(s)+1 and m=len(p)+1. dp[i][j] means whether s[0, i-1] matches p[0, j-1], for 0<=i<n, 0<=j<m. s[0,-1] and p[0,-1] represent the empty string.
Obviously "" matches "", so dp[0][0] = true. If p=="" while s!="", they cannot match, so dp[i][0] = false (i!=0). If p!="" while s=="", only a p of the form x*x*x*... makes them match. This can be expressed as dp[0][j] = p[j-1]=='*' && dp[0][j-2] (j!=0). For j==1 a well-formed pattern has no '*' at p[0], so dp[0][1] is false and dp[0][-1] is never read.
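The first-row rule can be checked in isolation. The following is a small sketch under the assumption of a well-formed pattern; the class name FirstRow and the method firstRow are hypothetical and only compute dp[0][j], the answer to "does the empty string match p[0, j-1]?".

```java
import java.util.Arrays;

class FirstRow {
    // Computes dp[0][0..len(p)]: row[j] is true iff "" matches p[0, j-1].
    // Assumes a well-formed pattern (a '*' never appears at index 0).
    static boolean[] firstRow(String p) {
        boolean[] row = new boolean[p.length() + 1];
        row[0] = true;                       // "" matches ""
        // row[1] stays false: a single element cannot match "".
        for (int j = 2; j <= p.length(); j++) {
            row[j] = p.charAt(j - 1) == '*' && row[j - 2];
        }
        return row;
    }

    public static void main(String[] args) {
        // Prefixes "", "c*", "c*a*" can match the empty string; the others cannot.
        System.out.println(Arrays.toString(firstRow("c*a*b")));
        // [true, false, true, false, true, false]
    }
}
```

Starting the loop at j = 2 is a design choice of this sketch: it keeps the j-2 index in range without relying on short-circuit evaluation.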
For i>0 and j>0, we need to consider two cases:
Case 1, p[j-1]!='*': s[0, i-1] matches p[0, j-1] iff s[0, i-2] matches p[0, j-2] AND s[i-1] matches p[j-1]. It can be expressed as dp[i][j] = match(s[i-1], p[j-1]) && dp[i-1][j-1].
Case 2, p[j-1]=='*': find pre = p[j-2]. If "pre*" appears 0 times, s[0, i-1] matches p[0, j-1] iff s[0, i-1] matches p[0, j-3], which can be expressed as dp[i][j] = dp[i][j-2]. If "pre*" appears one or more times, dp[i][j] = match(pre, s[i-1]) && dp[i-1][j], that is, s[0, i-1] matches p[0, j-1] iff s[0, i-2] matches p[0, j-1] AND s[i-1] matches pre. dp[i][j] is the OR of these two options. Very tricky; you'd better draw the matrix on paper and think it through carefully.
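public class Solution {
    public boolean isMatch(String s, String p) {
        int n = s.length()+1;
        int m = p.length()+1;
        // dp[i][j]: does s[0, i-1] match p[0, j-1]?
        boolean[][] dp = new boolean[n][m];
        dp[0][0] = true;
        // A non-empty string never matches the empty pattern.
        for(int i=1;i<n;i++){
            dp[i][0] = false;
        }
        // The empty string matches only patterns of the form x*y*z*...
        // (assumes a well-formed pattern, so '*' never sits at index 0).
        for(int i=1;i<m;i++){
            dp[0][i] = p.charAt(i-1)=='*' && dp[0][i-2];
        }
        for(int i=1;i<n;i++){
            for(int j=1;j<m;j++){
                if(p.charAt(j-1)!='*'){
                    // Case 1: the last characters must match, and so must the prefixes.
                    dp[i][j] = compare(s.charAt(i-1), p.charAt(j-1)) && dp[i-1][j-1];
                }else{
                    // Case 2: "pre*" is used zero times, or consumes s[i-1].
                    dp[i][j] = dp[i][j-2] || (compare(s.charAt(i-1), p.charAt(j-2)) && dp[i-1][j]);
                    /*
                    Equivalent form:
                    if(compare(s.charAt(i-1), p.charAt(j-2))){
                        dp[i][j] = dp[i-1][j] || dp[i][j-2];
                    }else{
                        dp[i][j] = dp[i][j-2];
                    }
                    */
                }
            }
        }
        return dp[n-1][m-1];
    }

    // True if pattern character p matches string character s ('.' matches anything).
    private boolean compare(char s, char p){
        if(p=='.') return true;
        return s==p;
    }
}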
Time and space complexity are both O(n*m), that is, O(len(s)*len(p)).
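Following the advice to draw the matrix, here is a small sketch that fills the same table and renders it as rows of T and F so the recurrence can be traced by eye. The class name DpTable and the methods table, same and render are hypothetical names for this illustration, and a well-formed pattern is assumed.

```java
class DpTable {
    // Fills dp[i][j] = "s[0, i-1] matches p[0, j-1]" using the recurrence above.
    static boolean[][] table(String s, String p) {
        boolean[][] dp = new boolean[s.length() + 1][p.length() + 1];
        dp[0][0] = true;
        for (int j = 2; j <= p.length(); j++) {
            dp[0][j] = p.charAt(j - 1) == '*' && dp[0][j - 2];
        }
        for (int i = 1; i <= s.length(); i++) {
            for (int j = 1; j <= p.length(); j++) {
                char pc = p.charAt(j - 1);
                if (pc != '*') {
                    dp[i][j] = same(s.charAt(i - 1), pc) && dp[i - 1][j - 1];
                } else if (j >= 2) {
                    // Zero uses of "pre*", or "pre*" consumes s[i-1].
                    dp[i][j] = dp[i][j - 2]
                            || (same(s.charAt(i - 1), p.charAt(j - 2)) && dp[i - 1][j]);
                }
            }
        }
        return dp;
    }

    // '.' in the pattern matches any character.
    static boolean same(char c, char pat) {
        return pat == '.' || c == pat;
    }

    // One line per row of the table, 'T' for true and 'F' for false.
    static String render(boolean[][] dp) {
        StringBuilder sb = new StringBuilder();
        for (boolean[] row : dp) {
            for (boolean cell : row) sb.append(cell ? 'T' : 'F');
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Rows are prefixes of "aab", columns are prefixes of "c*a*b".
        System.out.print(render(table("aab", "c*a*b")));
        // TFTFTF
        // FFFTTF
        // FFFFTF
        // FFFFFT
    }
}
```

The bottom-right cell is the answer for the full string and full pattern; here it is T, in line with isMatch("aab", "c*a*b") → true from the problem statement.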