LeetCode 0010 Regular Expression Matching

Jiachen Yu

Categories: dp, strings. Tags: leetcode

Article link: https://blog.csdn.net/dpppBR/article/details/106065242

The General Solution

It helps to first understand the general solution to this kind of string matching problem.
The general solution is a Nondeterministic Finite Automaton (NFA) or a Deterministic Finite Automaton (DFA).
If you have never seen these two concepts or have forgotten them, do not panic: spend a little time on Compiler Construction and you will learn a lot.

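A DFA is a restricted form of NFA: for each input character it has exactly one next state, so matching a string against a DFA is faster than matching it against an NFA.
But notice that converting an NFA into a DFA takes extra time up front, and in the worst case the DFA can have exponentially many states.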

Since we match only one string per pattern, building an NFA and querying it directly is faster overall than building the NFA, converting it to a DFA, and then querying the DFA.

Check the NFA solution for more implementation details.

Time & Space Complexity

Say the length of string s is $n$ and the length of string p is $m$.
Building an NFA needs $O(m)$ time and $O(m)$ space.
Answering a query needs $O(n \cdot m)$ time in the worst case, with $O(n)$ space if we deal with the GC perfectly.

Simpler DP Solution

With the idea of the NFA in mind, we can derive a simpler solution, since building complex data structures during an interview is rarely worthwhile.
(They are still very important in daily development!)

Imagine how the NFA processes its input:

1. Feed the empty input $\varepsilon$ to the initial state of the NFA.
2. Generate the new set of states from the previous states and the input.
3. Feed the next character of string s to the current states of the NFA.
4. Repeat steps 2 and 3 until all characters of string s have been processed.
5. Check whether the current states include the final state of the NFA.
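
The steps above can be sketched in Java without building an explicit graph. This is only a minimal sketch under my own assumptions: the names `NfaSketch`, `closure` and `matches` are hypothetical, state `j` is taken to mean "the first `j` characters of p have been consumed", state `p.length()` is the final state, and the pattern is assumed to be well formed (every `*` follows a character).

```java
import java.util.BitSet;

class NfaSketch {
    // Epsilon moves: from state j, an "x*" token may be skipped, reaching state j + 2.
    // Bits are visited in ascending order, so chains of skipped tokens are followed too.
    static void closure(BitSet states, String p) {
        for (int j = states.nextSetBit(0); j >= 0; j = states.nextSetBit(j + 1)) {
            if (j + 1 < p.length() && p.charAt(j + 1) == '*') {
                states.set(j + 2);
            }
        }
    }

    static boolean matches(String s, String p) {
        int m = p.length();
        BitSet current = new BitSet(m + 1);
        current.set(0);          // step 1: start in the initial state
        closure(current, p);     // step 2: expand with epsilon moves
        for (int i = 0; i < s.length(); i++) {   // steps 3 and 4
            char c = s.charAt(i);
            BitSet next = new BitSet(m + 1);
            for (int j = current.nextSetBit(0); j >= 0 && j < m; j = current.nextSetBit(j + 1)) {
                char pc = p.charAt(j);
                if (pc == '.' || pc == c) {
                    boolean starred = j + 1 < m && p.charAt(j + 1) == '*';
                    // A starred token loops on itself; a plain character advances by one.
                    next.set(starred ? j : j + 1);
                }
            }
            closure(next, p);
            current = next;
        }
        return current.get(m);   // step 5: is the final state active?
    }
}
```

Each input character touches at most `m + 1` states, which is where the $O(n \cdot m)$ worst-case query time comes from.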

Since each state of the NFA corresponds to a character of the pattern p, we do not need to construct the NFA explicitly.
Every character of p can be used as an NFA state directly.

So we can write the following DP equations:

if $P[j]$ is a repeatable character (it ends with $*$), then $f_{i,j} = f_{i,j-1} \,||\, \big(isCharacterMatch(S_i, P_j) \,\&\&\, (f_{i-1,j-1} \,||\, f_{i-1,j})\big)$
if $P[j]$ is not a repeatable character, then $f_{i,j} = isCharacterMatch(S_i, P_j) \,\&\&\, f_{i-1,j-1}$

$f_{i,j}$ is the result of querying the string $S[0 \dots i]$ against the NFA built from $P[0 \dots j]$.
Here $j$ counts pattern tokens: a repeatable character together with its $*$ is a single $P[j]$.
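
As a quick check of the recurrence, take $S = aa$ and $P = a*$, treating $a*$ as the single repeatable token $P_0$ and assuming that index $-1$ stands for the empty prefix, so that $f_{-1,-1}$ is true while $f_{0,-1}$ and $f_{1,-1}$ are false:

$f_{0,0} = f_{0,-1} \,||\, \big(isCharacterMatch(a, P_0) \,\&\&\, (f_{-1,-1} \,||\, f_{-1,0})\big) = false \,||\, \big(true \,\&\&\, (true \,||\, f_{-1,0})\big) = true$
$f_{1,0} = f_{1,-1} \,||\, \big(isCharacterMatch(a, P_0) \,\&\&\, (f_{0,-1} \,||\, f_{0,0})\big) = false \,||\, \big(true \,\&\&\, (false \,||\, true)\big) = true$

So $f_{1,0}$ is true and the whole string matches, as expected.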

Then the simpler DP solution comes out:

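class Solution {
    private char[] s;
    private char[] p;
    // memo[i + 1][j + 1] caches isMatch(i, j), so each pair is evaluated only once
    private Boolean[][] memo;

    public boolean isMatch(String text, String pattern) {
        s = text.toCharArray();
        p = pattern.toCharArray();
        memo = new Boolean[s.length + 1][p.length + 1];
        return isMatch(s.length - 1, p.length - 1);
    }

    // Does s[0..i] match p[0..j]? An index of -1 means the empty prefix.
    private boolean isMatch(int i, int j) {
        if (j < 0) {
            return i < 0; // an empty pattern matches only an empty string
        }
        if (memo[i + 1][j + 1] != null) {
            return memo[i + 1][j + 1];
        }
        boolean result;
        if (i < 0) {
            // empty string: every remaining token must be a skippable "x*"
            result = '*' == p[j] && isMatch(i, j - 2);
        } else if ('*' == p[j]) {
            int realJ = j - 1; // the character that '*' repeats
            result = isMatch(i, realJ - 1) // zero occurrences
                || (isCharacterMatch(s[i], p[realJ])
                    && (isMatch(i - 1, realJ - 1) || isMatch(i - 1, j)));
        } else {
            result = isCharacterMatch(s[i], p[j]) && isMatch(i - 1, j - 1);
        }
        memo[i + 1][j + 1] = result;
        return result;
    }

    private boolean isCharacterMatch(char chS, char chP) {
        return '.' == chP || chS == chP;
    }
}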
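
The recursion above can also be evaluated bottom-up by filling a table. The following is a minimal sketch of that variant, not the only way to write it. The class name `RegexDp` is hypothetical, and I assume that `f[i][j]` refers to prefix lengths (the first `i` characters of s against the first `j` characters of p) rather than to the last index, which removes the need for negative indices.

```java
class RegexDp {
    // f[i][j] is true when the first i characters of s match the first j characters of p.
    static boolean isMatch(String s, String p) {
        int n = s.length();
        int m = p.length();
        boolean[][] f = new boolean[n + 1][m + 1];
        f[0][0] = true; // empty string against empty pattern
        for (int i = 0; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                char pc = p.charAt(j - 1);
                if (pc == '*') {
                    if (j < 2) {
                        continue; // malformed pattern: '*' with nothing to repeat
                    }
                    char repeated = p.charAt(j - 2);
                    // Zero occurrences: drop the whole "x*" token.
                    boolean zero = f[i][j - 2];
                    // One or more occurrences: consume s[i-1] and stay on the token.
                    // The "exactly once" case is covered because f[i-1][j] already
                    // includes f[i-1][j-2].
                    boolean more = i > 0
                        && isCharacterMatch(s.charAt(i - 1), repeated)
                        && f[i - 1][j];
                    f[i][j] = zero || more;
                } else {
                    f[i][j] = i > 0
                        && isCharacterMatch(s.charAt(i - 1), pc)
                        && f[i - 1][j - 1];
                }
            }
        }
        return f[n][m];
    }

    static boolean isCharacterMatch(char chS, char chP) {
        return '.' == chP || chS == chP;
    }
}
```

For example, `RegexDp.isMatch("aab", "c*a*b")` evaluates to true, while `RegexDp.isMatch("aa", "a")` evaluates to false. The table costs $O(n \cdot m)$ time and space.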