Regular Expression Matching (Leetcode hard algorithm problem)

最新推荐文章于 2024-07-14 19:56:17 发布

YouyouPan

最新推荐文章于 2024-07-14 19:56:17 发布

阅读量163

点赞数 1

分类专栏： leetcode 文章标签：算法 leetcode 正则表达式

本文链接：https://blog.csdn.net/YouyouPan/article/details/60479693

版权

leetcode 专栏收录该内容

12 篇文章 0 订阅

订阅专栏

经过上周的medium热身之后，这周决定选个hard的来训练一下。
这次题目为正则表达式，描述如下。原题链接

‘.’ Matches any single character.
‘*’ Matches zero or more of the preceding element.

The matching should cover the entire input string (not partial).

The function prototype should be:
bool isMatch(const char *s, const char *p)

Some examples:
isMatch(“aa”,”a”) → false
isMatch(“aa”,”aa”) → true
isMatch(“aaa”,”aa”) → false
isMatch(“aa”, “a*”) → true
isMatch(“aa”, “.*”) → true
isMatch(“ab”, “.*”) → true
isMatch(“aab”, “c*a*b”) → true

在以前学习CS的时候，老师最爱说的一句话是，当一个问题来临的时候，先去尝试最粗暴直接去解决它，如果这个直接方法效率太差，再去想办法去优化，解决问题先放在第一位。
于是，我使用了穷尽的方法……

不难看出来，正则表达式每一次匹配最多会有三种情况：
Type 1.正则表达式向后偏移（例子：a* 比配 b，此时之后正则表达式向后一位）
Type 2.字符串向后偏移（例子：a*匹配 a，此时字符串匹配成功，向后一位，正则表达式可不便宜）
Type 3.正则表达式和字符串都向后偏移（a匹配a，成功匹配，二者都向后偏移）

那么每一次匹配都是一颗三叉树，只要把这三叉树遍历完全，就会有答案。

结点的操作为：
1.当正则表达式和字符串都匹配到最尽头，返回成功。
2.如果能够按照上述三个条件生成出子节点的，加入子节点，然后遍历子节点。
3.如果既不成功，也不能生成子节点的，就返回失败。

既然是一棵树，那么就很明了了，直接上递归。
递归伪代码：

match(s, p, current_index_s, current_index_p):
    return1=return2=return3=false
    if s and p both end:
        return true;
    else:
        if Type1:
            return1 = match(s,p,cur_index_s, current_index_p+1)
        if Type2:
            return2 = match(s,p,cur_index_s+1, current_index_p)
        if Type3:
            return3 = match(s,p,cur_index_s +１, current_index_p＋１)
    return return1 | return2 | return3

跑出来结果，是对的，只是额……超时了。
上诉递归是有问题的，看起来像是深度优先，但是跑起来是完全遍历。
应该是这样才对。

match(s, p, current_index_s, current_index_p):
    return1=return2=return3=false
    if s and p both end:
        return true;
    else:
        if Type1:
            return1 = match(s,p,cur_index_s, current_index_p+1)
            if return1：
                return true
        if Type2:
            return2 = match(s,p,cur_index_s+1, current_index_p)
            if return2：
                return true
        if Type3:
            return3 = match(s,p,cur_index_s +１, current_index_p＋１)
            if return3：
                return true
    return false

再次提交，还是超时，超时的例子在于
a*a*a*a*a*a*c 和 ab的匹配，这个是匹配不成功的的样例，因此，无论是深度优先还是广度优先，都需要把树完全遍历一遍，这样显然是很慢的。

不过我选择的并不是再在递归上去做优化，而是对正则表达式做优化。
因为
a*a*a*a*a*a*c = a*c
同类的，还有：
.a*b*c*d=.*
a*a*aaaaaa*= a*aaaa

好了，做完优化后，提交成功了，中规中矩。

提交结果

最后附上C++代码

# include <iostream>
# include <string>

using namespace std;

// 化简正则表达式
string str_init(string p) {
    if (p.length() < 3) {
        return p;
    }
    string ret_str = string();
    ret_str.push_back(p[0]);
    ret_str.push_back(p[1]);
    int index = 2;
    bool cfind = false;
    while (p[index] != '\0') {
        if (index + 1 < p.length() && p[index + 1] == '*') {
            int counter = index - 1;
            cfind = false;
            while (counter >= 0) {
                if (p[counter] != '*' && p[counter] != p[index]) {
                    break;
                }
                if ((p[counter] == '.' || p[counter] == p[index]) && p[counter + 1] == '*') {
                    cfind = true;
                    break;
                }
                counter -= 1;
            }
            if (!cfind) {
                ret_str.push_back(p[index]);
                ret_str.push_back(p[index + 1]);
                index += 2;
            }
            else {
                index+=2;
            }
        }
        else {
            ret_str.push_back(p[index]);
            index++;
        }
    }
    return ret_str;
}

// 递归函数
bool matchOne(string& s, string& p, int s_i, int p_i) {
    if (s[s_i] == '\0' && p[p_i] == '\0') {
        return true;
    }
    // Judgement
    bool type1 = false;
    bool type2 = false;
    bool type3 = false;
    if (s[s_i] == '\0') {
        if (p[p_i] == '*') {
            type1 = true;
        }
    }
    else {
        if (p[p_i] == '*') {
            type1 = true;
            if (s[s_i] == p[p_i - 1] || p[p_i - 1] == '.') {
                type2 = true;
                type3 = true;
            }
        }
        else {
            if (s[s_i] == p[p_i] || p[p_i] == '.') {
                type3 = true;
            }
        }
    }
    // Alignment
    bool ret1 = false;
    bool ret2 = false;
    bool ret3 = false;
    int temp_p_i = 0;
    int temp_s_i = 0;
    if (type3) {
        if (p_i + 2 < p.length() && p[p_i + 2] == '*') {
            temp_p_i = p_i + 2;
        }
        else {
            temp_p_i = p_i + 1;
        }
        temp_s_i = s_i + 1;
        ret3 = matchOne(s, p, temp_s_i, temp_p_i);
        if (ret3) {
            return true;
        }
    }
    if (type1) {
        if (p_i + 2 < p.length() && p[p_i + 2] == '*') {
            temp_p_i = p_i + 2;
        }
        else {
            temp_p_i = p_i + 1;
        }
        temp_s_i = s_i;
        ret1 = matchOne(s, p, temp_s_i, temp_p_i);
        if (ret1) {
            return true;
        }
    }
    if (type2) {
        temp_s_i = s_i + 1;
        temp_p_i = p_i;
        ret2 = matchOne(s, p, temp_s_i, temp_p_i);
        if (ret2) {
            return true;
        }
    }

    return false;
}

class Solution {
public:
    bool isMatch(string s, string p) {
        int s_i = 0;
        int p_i = 0;
        string modified_p = str_init(p);
        if (p_i + 1 < p.length() && p[p_i + 1] == '*') {
            p_i++;
        }
        return matchOne(s, modified_p, s_i, p_i);
    }
};

YouyouPan

关注

1
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
Regular Expression Matching (Leetcode hard algorithm problem)

经过上周的medium热身之后，这周决定选个hard的来训练一下。这次题目为正则表达式，描述如下。原题链接‘.’ Matches any single character. ‘*’ Matches zero or more of the preceding element.The matching should cover the entire input string (not partia
复制链接

扫一扫