LeetCode 0010 Regular Expression Matching

The General Solution

It helps to first understand the general solution to the string matching problem.
The general solution is a Nondeterministic Finite Automaton (NFA) or a Deterministic Finite Automaton (DFA).
If you don't know these two concepts or have forgotten them, do not panic: spend a little time on Compiler Construction and you will learn a lot.

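The difference between an NFA and a DFA is that a DFA is a restricted form of an NFA: it is in exactly one state at a time, so matching with a DFA runs faster than matching with an NFA.
But please note that building a DFA from an NFA takes extra time, and the DFA can have many more states than the NFA.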

Since we perform the string match only once for each query, building an NFA and querying it directly is faster than building a DFA from the NFA and then querying the DFA.

Check the NFA solution for more implementation details.

Time & Space Complexity

Say the length of string s is $n$ and the length of string p is $m$.
Building an NFA needs $O(m)$ time and $O(m)$ space.
Querying an answer needs $O(n \cdot m)$ time in the worst case with $O(n)$ space if we deal with the GC perfectly.

Simpler DP Solution

With the idea of the NFA in mind, we can derive a simpler solution, since complex data structures add little value in interview coding.
(But they are still very important in daily development!)

Imagine the process of NFA:

  1. Input an empty character $\varepsilon$ to the initial states of the NFA.
  2. Generate new states from the previous states and the input.
  3. Input the next character of string s to the current states of the NFA.
  4. Repeat step 2 and step 3 until all characters of string s have been processed.
  5. Check whether the current states contain the final state of the NFA.
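
The five steps above can be sketched as a direct simulation over a set of active states. This is only an illustration under stated assumptions, not the NFA solution referenced earlier: the class name `NfaSketch`, the `closure` helper, and the choice to encode state `k` as "the first `k` pattern characters have been consumed" are all hypothetical, and the pattern is assumed to be well formed (every `*` follows a character).

```java
import java.util.BitSet;

// Hypothetical sketch: simulate the pattern's NFA with a set of active states.
final class NfaSketch {
    // State k means "p[0..k) has been consumed"; state p.length() is the final state.
    static boolean isMatch(String s, String p) {
        int m = p.length();
        BitSet current = new BitSet(m + 1);
        current.set(0);
        closure(current, p);                        // steps 1-2: feed epsilon to the initial state
        for (int i = 0; i < s.length(); i++) {      // steps 3-4: feed each character of s
            char c = s.charAt(i);
            BitSet next = new BitSet(m + 1);
            for (int k = current.nextSetBit(0); k >= 0 && k < m; k = current.nextSetBit(k + 1)) {
                if (!isCharacterMatch(c, p.charAt(k))) {
                    continue;
                }
                boolean starred = k + 1 < m && p.charAt(k + 1) == '*';
                next.set(starred ? k : k + 1);      // "x*" loops on itself, plain "x" advances
            }
            closure(next, p);
            current = next;
        }
        return current.get(m);                      // step 5: is the final state active?
    }

    // Epsilon moves: an "x*" element may be skipped without consuming input.
    private static void closure(BitSet states, String p) {
        int m = p.length();
        for (int k = states.nextSetBit(0); k >= 0 && k < m; k = states.nextSetBit(k + 1)) {
            if (k + 1 < m && p.charAt(k + 1) == '*') {
                states.set(k + 2);
            }
        }
    }

    private static boolean isCharacterMatch(char chS, char chP) {
        return '.' == chP || chS == chP;
    }
}
```

For example, `NfaSketch.isMatch("aab", "c*a*b")` starts with states {0, 2, 4} active, because both `c*` and `a*` can be skipped before any input is read.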

Since each state of the NFA corresponds to a position in pattern p, we do not need to construct the NFA explicitly.
The positions of pattern p can be used as NFA states directly.

So we can write the following DP equation:

  • if $P[j]$ is a repeatable character (ends with $*$), then $f_{i,j} = f_{i,j-1} \;||\; \mathrm{isCharacterMatch}(S_i, P_j) \;\&\&\; (f_{i-1,j-1} \;||\; f_{i-1,j})$

  • if $P[j]$ is not a repeatable character, then $f_{i,j} = \mathrm{isCharacterMatch}(S_i, P_j) \;\&\&\; f_{i-1,j-1}$

$f_{i,j}$ means the query result of string $S[0 \ldots i]$ from the $P[0 \ldots j]$ NFA.
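
As a rough illustration, the recurrence can also be filled in bottom-up as a table. The sketch below makes a few assumptions that are not in the text above: the class name `DpTableSketch` is hypothetical, indices are shifted by one so that row 0 and column 0 stand for the empty prefix, and the table is indexed by raw pattern characters rather than pattern elements, so an `x*` pair occupies two columns and "skip the element" becomes `f[i][j - 2]`. The pattern is assumed to be well formed.

```java
// Hypothetical sketch: bottom-up table for the recurrence, with indices shifted by one.
final class DpTableSketch {
    // f[i][j] is true when the first i characters of s match the first j characters of p.
    static boolean isMatch(String s, String p) {
        int n = s.length();
        int m = p.length();
        boolean[][] f = new boolean[n + 1][m + 1];
        f[0][0] = true;                              // empty string vs empty pattern
        for (int j = 2; j <= m; j++) {               // empty string vs a run of "x*" pairs
            f[0][j] = p.charAt(j - 1) == '*' && f[0][j - 2];
        }
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                char pc = p.charAt(j - 1);
                if (pc == '*') {
                    if (j < 2) {
                        continue;                    // malformed pattern: leave the cell false
                    }
                    boolean zeroTimes = f[i][j - 2];
                    boolean oneOrMore = isCharacterMatch(s.charAt(i - 1), p.charAt(j - 2))
                            && (f[i - 1][j - 2] || f[i - 1][j]);
                    f[i][j] = zeroTimes || oneOrMore;
                } else {
                    f[i][j] = isCharacterMatch(s.charAt(i - 1), pc) && f[i - 1][j - 1];
                }
            }
        }
        return f[n][m];
    }

    private static boolean isCharacterMatch(char chS, char chP) {
        return '.' == chP || chS == chP;
    }
}
```

Each cell is computed once from cells in the current and previous row, which makes the table size the dominant cost.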

Then the simpler DP solution follows:

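// s and p are assumed to be char[] fields holding the input string and the pattern.
// Returns whether s[0..i] matches p[0..j]; an index of -1 denotes the empty prefix.
boolean isMatch(int i, int j) {
    if (j < 0) {
        // an empty pattern matches only the empty string
        return i < 0;
    }
    if (i < 0) {
        // empty string: the remaining pattern must consist of skippable "x*" pairs
        return '*' == p[j] && isMatch(i, j - 2);
    }
    if ('*' == p[j]) {
        // repeatable: p[realJ] may occur zero or more times
        int realJ = j - 1;
        return isMatch(i, realJ - 1) ||
            isCharacterMatch(s[i], p[realJ]) && (isMatch(i - 1, realJ - 1) || isMatch(i - 1, j));
    } else {
        return isCharacterMatch(s[i], p[j]) && isMatch(i - 1, j - 1);
    }
}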

boolean isCharacterMatch(char chS, char chP) {
    return '.' == chP || chS == chP;
}

Time & Space Complexity

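With the result of each isMatch(i, j) call cached, this needs $O(N \cdot M)$ time and space in the worst case, where $N$ and $M$ are the lengths of s and p. Without caching, the plain recursion recomputes overlapping subproblems and can take much longer.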
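
One way to reach that bound is to cache each `(i, j)` result so it is computed only once. The following is a minimal sketch of the same recursion with a memo table, not a definitive implementation: the class name `MemoSketch`, the `memo` field, and the static `isMatch(String, String)` entry point are hypothetical additions, and the pattern is assumed to be well formed.

```java
// Hypothetical sketch: the recursive solution with a memo table added.
final class MemoSketch {
    private final char[] s;
    private final char[] p;
    // memo[i + 1][j + 1] caches isMatch(i, j); null means "not computed yet".
    private final Boolean[][] memo;

    private MemoSketch(String s, String p) {
        this.s = s.toCharArray();
        this.p = p.toCharArray();
        this.memo = new Boolean[s.length() + 1][p.length() + 1];
    }

    static boolean isMatch(String s, String p) {
        return new MemoSketch(s, p).isMatch(s.length() - 1, p.length() - 1);
    }

    private boolean isMatch(int i, int j) {
        if (j < 0) {
            return i < 0;
        }
        Boolean cached = memo[i + 1][j + 1];
        if (cached != null) {
            return cached;
        }
        boolean result;
        if (i < 0) {
            result = '*' == p[j] && isMatch(i, j - 2);
        } else if ('*' == p[j]) {
            int realJ = j - 1;
            result = isMatch(i, realJ - 1)
                    || isCharacterMatch(s[i], p[realJ])
                            && (isMatch(i - 1, realJ - 1) || isMatch(i - 1, j));
        } else {
            result = isCharacterMatch(s[i], p[j]) && isMatch(i - 1, j - 1);
        }
        memo[i + 1][j + 1] = result;
        return result;
    }

    private static boolean isCharacterMatch(char chS, char chP) {
        return '.' == chP || chS == chP;
    }
}
```

A call such as `MemoSketch.isMatch("aab", "c*a*b")` fills at most `(N + 1) * (M + 1)` memo cells, and each cell does a constant amount of work beyond its recursive calls.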
