Problem:
Given a string S and a dictionary of words words, find the number of words[i] that are a subsequence of S.
Example:
Input: S = "abcde", words = ["a", "bb", "acd", "ace"]
Output: 3
Explanation: There are three words in words that are a subsequence of S: "a", "acd", "ace".
Note:
- All words in words and S will only consist of lowercase letters.
- The length of S will be in the range of [1, 50000].
- The length of words will be in the range of [1, 5000].
- The length of words[i] will be in the range of [1, 50].
Approach:
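1. Brute force:
For each words[i], check directly whether it is a subsequence of S by scanning both strings with two pointers. A single check may have to walk all of S, so it costs O(|S| + m), where m is the length of the word. With n words in the dictionary the total is O(n(|S| + m)). At the upper limits (|S| = 50000, n = 5000) that is on the order of 2.5 × 10^8 character comparisons, and this solution does not pass the large test cases.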
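2. Binary search:
For each character, record the indices at which it occurs in S, in increasing order. Then, for each word, take its characters one at a time and binary search that character's sorted index list for the first index greater than the index matched by the previous character. If such an index exists, continue with the next character from there; if not, the word is not a subsequence. When processing each words[i] we therefore jump through S instead of scanning it sequentially, which reduces the cost of checking one word from O(|S|) to O(m log |S|), where m is the length of the word.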
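To make the jumping concrete, here is a minimal sketch of a single lookup pass. The helper names buildPositions and matchTrace are hypothetical and exist only for this illustration; they are not part of the solutions in the Code section. matchTrace returns the index in S matched by each character of the word, or an empty vector when the word is not a subsequence (unambiguous here because every word has length at least 1).

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: positions[c - 'a'] lists, in increasing order,
// every index at which the lowercase character c occurs in s.
std::vector<std::vector<int>> buildPositions(const std::string& s) {
    std::vector<std::vector<int>> positions(26);
    for (int i = 0; i < static_cast<int>(s.size()); ++i) {
        positions[s[i] - 'a'].push_back(i);
    }
    return positions;
}

// Hypothetical helper: returns the indices in S matched by each character
// of word, or an empty vector if word is not a subsequence of S.
std::vector<int> matchTrace(const std::vector<std::vector<int>>& positions,
                            const std::string& word) {
    std::vector<int> trace;
    int last = -1;  // index in S of the previously matched character
    for (char c : word) {
        const std::vector<int>& list = positions[c - 'a'];
        // First occurrence of c strictly after `last`.
        auto it = std::upper_bound(list.begin(), list.end(), last);
        if (it == list.end()) {
            return {};  // no usable occurrence of c remains
        }
        last = *it;
        trace.push_back(last);
    }
    return trace;
}
```

For S = "abcde", the word "ace" matches at indices 0, 2, 4. The word "bb" matches its first b at index 1, finds no b after index 1, and is rejected.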
Code:
1. Brute force:
#include <string>
#include <vector>
using namespace std;

class Solution {
public:
    int numMatchingSubseq(string S, vector<string>& words) {
        int ret = 0;
        for (size_t i = 0; i < words.size(); ++i) {
            if (isSubSeq(S, words[i])) {
                ++ret;
            }
        }
        return ret;
    }
private:
    // Two-pointer scan: i walks sub, j walks s.
    bool isSubSeq(const string &s, const string &sub) {
        size_t i = 0, j = 0;
        while (i < sub.length() && j < s.length()) {
            if (sub[i] == s[j]) {
                ++i;    // matched one character of sub
            }
            ++j;        // always advance in s
        }
        return i == sub.length();
    }
};
2. Binary search:
#include <algorithm>
#include <string>
#include <vector>
using namespace std;

class Solution {
public:
    int numMatchingSubseq(string S, vector<string>& words) {
        vector<vector<int>> alpha(26);  // map from chars to their sorted indices in S
        for (int i = 0; i < static_cast<int>(S.size()); ++i) {
            alpha[S[i] - 'a'].push_back(i);
        }
        int res = 0;
        for (const auto& word : words) {
            int x = -1;                 // index in S of the last matched character
            bool found = true;
            for (char c : word) {
                const vector<int>& idx = alpha[c - 'a'];
                // first occurrence of c strictly after x
                auto it = upper_bound(idx.begin(), idx.end(), x);
                if (it == idx.end()) {
                    found = false;
                    break;              // no need to check the remaining characters
                }
                x = *it;
            }
            if (found) {
                ++res;
            }
        }
        return res;
    }
};