Leetcode_10_RegularExpressionMatching (Regular Expression Matching) and Leetcode_44_WildcardMatching (Wildcard Matching)


Rain_Bow_2021


Category: LeetCodeAlgorithms

Article link: https://blog.csdn.net/qq_29545781/article/details/102363393


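This article looks at two classic LeetCode problems: Leetcode_10_RegularExpressionMatching (regular expression matching) and Leetcode_44_WildcardMatching (wildcard matching). Regular expression matching is solved with a two-dimensional dynamic-programming table; wildcard matching is solved in four ways, using two-dimensional and one-dimensional dynamic-programming arrays, with a focus on reducing time and space cost.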


I. Leetcode_10_RegularExpressionMatching: Regular Expression Matching

1. Problem statement

* Leetcode_10_RegularExpressionMatching_Hard
* https://leetcode.com/problems/regular-expression-matching/
* Given an input string (s) and a pattern (p), implement regular expression
* matching with support for '.' and '*'.
* (1) '.' Matches any single character.
* (2) '*' Matches zero or more of the preceding element.
* The matching should cover the entire input string (not partial).
* Note:
* s could be empty and contains only lowercase letters a-z.
* p could be empty and contains only lowercase letters a-z, and characters like . or *.
* Example 1:
* Input:
* s = "aa"
* p = "a"
* Output: false
* Explanation: "a" does not match the entire string "aa".
* Example 2:
* Input:
* s = "aa"
* p = "a*"
* Output: true
* Explanation: '*' means zero or more of the preceding element, 'a'.
* Therefore, by repeating 'a' once, it becomes "aa".
* Example 3:
* Input:
* s = "ab"
* p = ".*"
* Output: true
* Explanation: ".*" means "zero or more (*) of any character (.)".
* Example 4:
* Input:
* s = "aab"
* p = "c*a*b"
* Output: true
* Explanation: c can be repeated 0 times, a can be repeated 1 time. Therefore, it matches "aab".
* Example 5:
* Input:
* s = "mississippi"
* p = "mis*is*p*."
* Output: false

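2. Analysis

* How regular expression matching differs from wildcard matching:
* In wildcard matching, '?' stands for any single character, the same as '.' in a regular expression.
* In wildcard matching, however, '*' stands for any sequence of characters (including the empty one),
* whereas in a regular expression '*' means the preceding element may occur zero or more times.
* So a wildcard '*' can match part of the string on its own,
* while a regular-expression '*' only works together with the element before it, as in "a*" or ".*".
* Regular expression matching:
* dp[i][j] says whether the first i characters of p match the first j characters of s. i runs from 1 to lp, so the i-th pattern character is ps[i - 1]. There are again two cases:
* ① ps[i - 1] != '*': as in wildcard matching, only the upper-left value matters:
* dp[i][j] = dp[i - 1][j - 1] && (ps[i - 1] == '.' || ps[i - 1] == ss[j - 1])
* ② ps[i - 1] == '*': two sub-cases, combined with ||:
* A. '*' stands for zero occurrences: the preceding element ps[i - 2] matches nothing, so dp[i][j] = i > 1 && dp[i - 2][j];
* B. '*' stands for one or more occurrences: the first j - 1 characters of s already match (dp[i][j - 1] == true),
* and the preceding element ps[i - 2] is '.' or equals ss[j - 1].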

3. Java code

public boolean isMatch(String s, String p) {
    // Shortcut: identical strings match (s contains only letters, so p is then a plain literal)
    if (s != null && s.equals(p)) return true;
    char[] ss = s.toCharArray(), ps = p.toCharArray();
    int ls = ss.length, lp = ps.length;
    // dp[i][j]: do the first i characters of p match the first j characters of s?
    boolean[][] dp = new boolean[lp + 1][ls + 1];
    dp[0][0] = true;
    for (int i = 1; i <= lp; i++)// first column: an empty s is matched only by patterns such as "a*b*"
        dp[i][0] = i > 1 && ps[i - 1] == '*' && dp[i - 2][0];
    for (int i = 1; i <= lp; i++) {
        if (ps[i - 1] == '*') {
            for (int j = 1; j <= ls; j++) {// key step
                // '*' lets the preceding element match zero or more times:
                // zero times: dp[i][j] = (i > 1 && dp[i - 2][j])
                // one or more times (".*" and "a*" differ), on top of a match for the first j - 1 characters:
                // dp[i][j] = (dp[i][j - 1] && (ps[i - 2] == '.' || ps[i - 2] == ss[j - 1]))
                // combined:
                dp[i][j] = (i > 1 && dp[i - 2][j]) ||
                        (dp[i][j - 1] && (ps[i - 2] == '.' || ps[i - 2] == ss[j - 1]));
            }
        } else {
            for (int j = 1; j <= ls; j++) // simple case: depends only on the upper-left value dp[i - 1][j - 1]
                dp[i][j] = dp[i - 1][j - 1] && (ps[i - 1] == '.' || ps[i - 1] == ss[j - 1]);
        }
    }
    return dp[lp][ls];
}
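
As a quick sanity check, the recurrence can be run against the five examples from the problem statement. The sketch below is a compact restatement of the same table, not the author's exact code. The class name RegexDpCheck is made up for illustration, and java.util.regex.Pattern is used only as an independent reference, which works here because these five patterns happen to be valid Java regular expressions with the same meaning.

```java
import java.util.regex.Pattern;

final class RegexDpCheck {
    // dp[i][j] is true when the first i pattern characters match the first j characters of s.
    static boolean isMatch(String s, String p) {
        int ls = s.length(), lp = p.length();
        boolean[][] dp = new boolean[lp + 1][ls + 1];
        dp[0][0] = true;
        for (int i = 1; i <= lp; i++) {
            char c = p.charAt(i - 1);
            boolean star = c == '*' && i > 1; // a leading '*' has nothing to repeat
            if (star) dp[i][0] = dp[i - 2][0]; // "x*" may match the empty string
            for (int j = 1; j <= ls; j++) {
                if (star) {
                    char prev = p.charAt(i - 2);
                    dp[i][j] = dp[i - 2][j] // zero occurrences of prev
                            || (dp[i][j - 1] && (prev == '.' || prev == s.charAt(j - 1))); // one more occurrence
                } else if (c != '*') {
                    dp[i][j] = dp[i - 1][j - 1] && (c == '.' || c == s.charAt(j - 1));
                }
            }
        }
        return dp[lp][ls];
    }

    public static void main(String[] args) {
        // {s, p} pairs taken from the problem statement
        String[][] cases = {
                {"aa", "a"}, {"aa", "a*"}, {"ab", ".*"},
                {"aab", "c*a*b"}, {"mississippi", "mis*is*p*."}
        };
        for (String[] c : cases) {
            boolean reference = Pattern.matches(c[1], c[0]); // full-string match, like the problem requires
            System.out.println(c[0] + " / " + c[1] + " -> dp: " + isMatch(c[0], c[1])
                    + ", java.util.regex: " + reference);
        }
    }
}
```

The comparison with java.util.regex is only meaningful for patterns that are well formed in both dialects; a pattern starting with '*' is treated as a non-match by this sketch.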

II. Leetcode_44_WildcardMatching: Wildcard Matching

1. Problem statement

* Leetcode_44_WildcardMatching_Hard
* Wildcard matching
* https://leetcode.com/problems/wildcard-matching/
* Problem description:
* Given an input string (s) and a pattern (p), implement wildcard pattern
* matching with support for '?' and '*'.
* 1. '?' Matches any single character.
* 2. '*' Matches any sequence of characters (including the empty sequence).
* The matching should cover the entire input string (not partial).
* Note:
* s could be empty and contains only lowercase letters a-z.
* p could be empty and contains only lowercase letters a-z, and characters like ? or *.
* Example 1:
* Input:
* s = "aa"
* p = "a"
* Output: false
* Explanation: "a" does not match the entire string "aa".
* Example 2:
* Input:
* s = "aa"
* p = "*"
* Output: true
* Explanation: '*' matches any sequence.
* Example 3:
* Input:
* s = "cb"
* p = "?a"
* Output: false
* Explanation: '?' matches 'c', but the second letter is 'a',
* which does not match 'b'.
* Example 4:
* Input:
* s = "adceb"
* p = "*a*b"
* Output: true
* Explanation: The first '*' matches the empty sequence,
* while the second '*' matches the substring "dce".
* Example 5:
* Input:
* s = "acdcb"
* p = "a*c?b"
* Output: false

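2. Analysis

* Approach: dynamic programming
* Method 1: a plain dynamic-programming table dp[lp+1][ls+1]; the answer is dp[lp][ls].
* dp[i][j] says whether the first i characters of p match the first j characters of s
* (the formulas below count characters from 1; in the code the i-th pattern character is ps[i-1] and the j-th string character is ss[j-1]).
* '*' can match zero, one, or any number of characters:
* ① zero characters: dp[i][j]=dp[i-1][j];
* ② one character (the same as '?'): dp[i][j]=dp[i-1][j-1];
* ③ several characters: dp[i][j]=dp[i-1][j-2]||dp[i-1][j-3]||...||dp[i-1][0];
* Together: dp[i][j]=dp[i-1][j]||dp[i-1][j-1]||...||dp[i-1][0];
* With ps[i]=='*' and j-1 in place of j: dp[i][j-1]=dp[i-1][j-1]||dp[i-1][j-2]||...||dp[i-1][0];
* Hence dp[i][j] = dp[i-1][j]||dp[i][j-1] whenever ps[i]=='*'.
* State transition equations (the core):
* dp[i][j] = dp[i-1][j]||dp[i][j-1]; when ps[i]=='*';
* dp[i][j] = dp[i-1][j-1]&&(ps[i] == '?' || ss[j] == ps[i]); when ps[i]!='*';
* Clearly dp[i][j] depends only on the values to its upper left, to its left, and above it, so two one-dimensional arrays are enough.
* Method 2: dynamic programming with two one-dimensional arrays dp[];
*  space drops from O(M*N) to O(N); time stays O(M*N), with a small constant overhead for copying one row into the other.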
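
The step from the long OR chain to dp[i-1][j]||dp[i][j-1] may be easier to trust with the un-collapsed form written out. The sketch below evaluates the OR chain literally for '*', which costs O(M*N^2) and is meant only as a reference to compare against. It is not one of the article's four methods, and the class name WildcardUncollapsed is made up for illustration.

```java
final class WildcardUncollapsed {
    // dp[i][j] is true when the first i pattern characters match the first j characters of s.
    static boolean isMatch(String s, String p) {
        int ls = s.length(), lp = p.length();
        boolean[][] dp = new boolean[lp + 1][ls + 1];
        dp[0][0] = true;
        for (int i = 1; i <= lp; i++) {
            char c = p.charAt(i - 1);
            for (int j = 0; j <= ls; j++) {
                if (c == '*') {
                    // '*' absorbs the last j - k characters of s for some k in 0..j:
                    // dp[i][j] = dp[i-1][j] || dp[i-1][j-1] || ... || dp[i-1][0]
                    for (int k = 0; k <= j && !dp[i][j]; k++) dp[i][j] = dp[i - 1][k];
                } else if (j > 0) {
                    // '?' or a letter consumes exactly one character of s
                    dp[i][j] = dp[i - 1][j - 1] && (c == '?' || c == s.charAt(j - 1));
                }
            }
        }
        return dp[lp][ls];
    }

    public static void main(String[] args) {
        System.out.println(isMatch("adceb", "*a*b"));
        System.out.println(isMatch("acdcb", "a*c?b"));
    }
}
```

Since dp[i][j-1] already holds the OR of dp[i-1][0..j-1], the inner loop over k can be replaced by a single lookup, which is exactly the collapsed transition.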

3. Java code

(1). Method 1

/**
* Method 1: dynamic programming with dp[][] (the easiest to follow).
* Time and space are both O(M*N), with M = p.length() and N = s.length().
*/
public boolean isMatch1(String s, String p) {
    char[] ss = s.toCharArray(), ps = p.toCharArray();
    int ls = ss.length, lp = ps.length;
    boolean[][] dp = new boolean[lp + 1][ls + 1];
    dp[0][0] = true;// the empty pattern matches the empty string
    for (int i = 1; i <= lp; i++) {
        if (ps[i - 1] == '*') {
            dp[i][0] = dp[i - 1][0];// boundary: column 0 has a cell above but none to the left
            for (int j = 1; j <= ls; j++)
                dp[i][j] = dp[i - 1][j] || dp[i][j - 1];// above and left
        } else {
            for (int j = 1; j <= ls; j++)
                dp[i][j] = (ps[i - 1] == '?' || ss[j - 1] == ps[i - 1]) && dp[i - 1][j - 1];// upper left
        }
    }
    return dp[lp][ls];
}

(2). Method 2

/**
* Method 2: dynamic programming with dp[2][].
* Two one-dimensional arrays reduce the space from O(M*N) to O(2*N).
* Time goes up slightly: copying one row into the other and resetting it has a cost.
*/
public boolean isMatch2(String s, String p) {
    if (s != null && s.equals(p)) return true;// identical strings match
    char[] ss = s.toCharArray(), ps = p.toCharArray();
    int ls = ss.length, lp = ps.length;
    boolean[][] dp = new boolean[2][ls + 1];// dp[0] is the previous row, dp[1] the current row
    dp[0][0] = true;// the empty pattern matches the empty string
    for (int i = 1; i <= lp; i++) {
        if (ps[i - 1] == '*') {
            dp[1][0] = dp[0][0];// boundary: column 0 has a cell above but none to the left
            for (int j = 1; j <= ls; j++)
                dp[1][j] = dp[0][j] || dp[1][j - 1];// above and left
            for (int k = 0; k <= ls; k++) dp[0][k] = dp[1][k];
            dp[1][0] = false;// reset: the non-'*' branch never overwrites column 0
        } else {
            for (int j = 1; j <= ls; j++)
                dp[1][j] = (ps[i - 1] == '?' || ss[j - 1] == ps[i - 1]) && dp[0][j - 1];// upper left
            for (int k = 0; k <= ls; k++) dp[0][k] = dp[1][k];
            dp[1][0] = false;// reset: the non-'*' branch never overwrites column 0
        }
    }
    return dp[0][ls];// always return row [0]; this also covers cases where the for loop never runs
}
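
The row copy is not strictly required. One possible variant, which is not one of the article's four methods, keeps a single array and updates it in place: a '*' row is filled left to right (it needs the new left neighbour and the old value above), while any other row is filled right to left (it needs the old upper-left value before it is overwritten). The class name WildcardSingleRow is made up for illustration.

```java
final class WildcardSingleRow {
    static boolean isMatch(String s, String p) {
        int ls = s.length();
        // dp[j] holds the previous row on entry to each iteration and the current row on exit.
        boolean[] dp = new boolean[ls + 1];
        dp[0] = true; // the empty pattern matches the empty string
        for (int i = 0; i < p.length(); i++) {
            char c = p.charAt(i);
            if (c == '*') {
                // dp[j] (old, above) || dp[j - 1] (new, left); dp[0] keeps its value
                for (int j = 1; j <= ls; j++) dp[j] = dp[j] || dp[j - 1];
            } else {
                // right to left, so dp[j - 1] is still the previous row's value (upper left)
                for (int j = ls; j >= 1; j--)
                    dp[j] = dp[j - 1] && (c == '?' || c == s.charAt(j - 1));
                dp[0] = false; // a non-'*' character cannot match the empty prefix of s
            }
        }
        return dp[ls];
    }

    public static void main(String[] args) {
        System.out.println(isMatch("adceb", "*a*b"));
        System.out.println(isMatch("acdcb", "a*c?b"));
    }
}
```

The trade-off is readability: the direction of each inner loop now carries meaning, so reversing either one silently breaks the recurrence.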

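(3). Method 3

/**
* Method 3: a time optimisation on top of the two-dimensional dp[][].
* Optimisations:
* (1) Introduce pos and pos_b:
* A. int pos marks the first match in the current row, i.e. the smallest j with dp[i][j] == true;
* B. boolean pos_b records whether the current row contains any match; if a row has no match,
* no later row can have one either, so the answer is false.
* (2) What pos is for: if pos is the first match of the current row, the first match of the next row is at pos or later.
* Reason: dp[i][j] depends only on its upper-left, left, and upper neighbours, so the first match of a row
* cannot lie before the first match of the previous row; pos records where the next row starts.
* If the next pattern character is '*', the next row starts at pos; everything before pos is false.
* If it is not '*', the next row starts at pos+1; pos and everything before it is false. In detail:
* ① If p[i]=='*': dp[i][j]=dp[i-1][j] || dp[i][j - 1], so dp[i][j] is true as soon as the cell above or to the left is true.
* Therefore dp[i][pos]=dp[i-1][pos]=true,
* and in turn dp[i][pos+1]=true, dp[i][pos+2]=true, ..., dp[i][ls]=true: every cell from pos to the end is true.
* ② If p[i]!='*': pos must advance by one first, because dp[i][j] depends only on the upper-left value dp[i-1][j-1], which sits one column to the left of j.
*/
public boolean isMatch(String s, String p) {
    char[] ss = s.toCharArray();
    char[] ps = p.toCharArray();
    int ls = ss.length, lp = ps.length;
    boolean[][] dp = new boolean[lp + 1][ls + 1];
    dp[0][0] = true;// the empty pattern matches the empty string
    boolean pos_b = true;
    for (int i = 1, pos = 0; i <= lp && pos_b; i++) {
        if (ps[i - 1] == '*') {
            for (int j = pos; j <= ls; j++) dp[i][j] = true;// every cell from pos to ls is true
        } else {
            pos_b = false;
            for (int j = ++pos; j <= ls; j++) {// not '*': start one column right of the previous first match
                dp[i][j] = dp[i - 1][j - 1] && (ps[i - 1] == '?' || ss[j - 1] == ps[i - 1]);
                if (pos_b) continue;// a match was already found in this row
                if (dp[i][j]) {// remember only the first true cell; it is the starting pos for the next row
                    pos_b = true;// this row contains a match
                    pos = j;// position of the first match in this row
                } else pos++;// optional (the code also passes without it): keeps pos moving while no match is found
            }
        }
    }
    print(dp);// debug output only; print is the helper that prints a two-dimensional array
    return dp[lp][ls];
}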
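
A pruning rule like this is easy to get subtly wrong, so it can be worth comparing it with the plain table on every small input. The sketch below does that exhaustively for short strings over {a, b} and short patterns over {a, b, ?, *}. Both functions are compact restatements of methods 1 and 3 rather than the author's exact code (the pruned one tracks the first match in a local variable and drops the debug print), and all names here are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class WildcardPruningCheck {
    // Plain two-dimensional table, as in method 1.
    static boolean plain(String s, String p) {
        int ls = s.length(), lp = p.length();
        boolean[][] dp = new boolean[lp + 1][ls + 1];
        dp[0][0] = true;
        for (int i = 1; i <= lp; i++) {
            char c = p.charAt(i - 1);
            if (c == '*') dp[i][0] = dp[i - 1][0];
            for (int j = 1; j <= ls; j++)
                dp[i][j] = c == '*' ? dp[i - 1][j] || dp[i][j - 1]
                        : dp[i - 1][j - 1] && (c == '?' || c == s.charAt(j - 1));
        }
        return dp[lp][ls];
    }

    // Table with the pos / pos_b pruning, as in method 3.
    static boolean pruned(String s, String p) {
        int ls = s.length(), lp = p.length();
        boolean[][] dp = new boolean[lp + 1][ls + 1];
        dp[0][0] = true;
        boolean found = true; // does the previous row contain a true cell?
        int pos = 0;          // column of the first true cell in the previous row
        for (int i = 1; i <= lp && found; i++) {
            char c = p.charAt(i - 1);
            if (c == '*') {
                for (int j = pos; j <= ls; j++) dp[i][j] = true;
            } else {
                int first = -1;
                for (int j = pos + 1; j <= ls; j++) {
                    dp[i][j] = dp[i - 1][j - 1] && (c == '?' || c == s.charAt(j - 1));
                    if (dp[i][j] && first < 0) first = j;
                }
                found = first >= 0;
                if (found) pos = first;
            }
        }
        return dp[lp][ls];
    }

    // All strings over the alphabet with length 0..maxLen.
    static List<String> words(String alphabet, int maxLen) {
        List<String> out = new ArrayList<>();
        out.add("");
        for (int start = 0, len = 0; len < maxLen; len++) {
            int end = out.size();
            for (int k = start; k < end; k++)
                for (char c : alphabet.toCharArray()) out.add(out.get(k) + c);
            start = end;
        }
        return out;
    }

    // Number of (s, p) pairs on which the two implementations disagree.
    static int countMismatches(int maxLen) {
        int bad = 0;
        for (String s : words("ab", maxLen))
            for (String p : words("ab?*", maxLen))
                if (plain(s, p) != pruned(s, p)) bad++;
        return bad;
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + countMismatches(4));
    }
}
```

An exhaustive check over short inputs is not a proof, but a non-zero count would immediately point to a concrete counterexample.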

(4). Method 4

/**
* Method 4: a further optimisation that applies the pos / pos_b pruning to the two one-dimensional arrays.
* Probably the best of the four methods.
*/
public boolean isMatch1_P(String s, String p) {
    if (s != null && s.equals(p)) return true;// identical strings match
    char[] ss = s.toCharArray(), ps = p.toCharArray();
    int ls = ss.length, lp = ps.length;
    boolean[][] dp = new boolean[2][ls + 1];// dp[0] is the previous row, dp[1] the current row
    dp[0][0] = true;// the empty pattern matches the empty string
    boolean pos_b = true;
    for (int i = 1, pos = 0; i <= lp && pos_b; i++) {
        if (ps[i - 1] == '*') {
            for (int j = pos; j <= ls; j++) dp[1][j] = true;
            for (int k = 0; k <= ls; k++) {
                dp[0][k] = dp[1][k];
                dp[1][k] = false;// reset the whole row: the next pass does not start at column 1, so stale values would survive
            }
        } else {
            pos_b = false;
            for (int j = ++pos; j <= ls; j++) {
                dp[1][j] = (ps[i - 1] == '?' || ss[j - 1] == ps[i - 1]) && dp[0][j - 1];// upper left
                if (pos_b) continue;
                if (dp[1][j]) {
                    pos_b = true;
                    pos = j;
                } else pos++;
            }
            for (int k = 0; k <= ls; k++) {
                dp[0][k] = dp[1][k];
                dp[1][k] = false;// reset the whole row: the next pass does not start at column 1, so stale values would survive
            }
        }

    }
    return dp[0][ls];// always return row [0]; this also covers cases where the for loop never runs
}

//Helper: print a two-dimensional array (requires import java.util.Arrays;)
public void print(boolean[][] arr) {
    Arrays.stream(arr).forEach(a -> System.out.println(Arrays.toString(a)));
    System.out.println("==============================");
}
