Problem:
You are given a string, s, and a list of words, words, that are all of the same length. Find all starting indices of substring(s) in s that is a concatenation of each word in words exactly once and without any intervening characters.
For example, given:
s: "barfoothefoobarman"
words: ["foo", "bar"]
You should return the indices: [0,9] (order does not matter).
Approach:
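1. Conventional approach: Compare this problem with LeetCode 28, Implement strStr(). If each string in words is treated like a single character of the pattern that strStr() searches for, the only difference between the two is that in findSubstring() the order in which the strings appear does not matter, whereas in strStr() the characters of the pattern must appear in exactly the given order. strStr() can therefore use an advanced algorithm such as KMP, while findSubstring() can build on a strStr()-style brute-force enumeration and use a hash table to speed up the lookups. Let m be the length of s, k the number of strings in words, and n the length of each string. The time complexity of this algorithm is then O(m*n*k), and the space complexity is O(n*k).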
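2. Sliding window method: I only recently came across this more elegant algorithm. Start with a window of length 0 whose head is begin and whose tail is end, and look at the word immediately after end. If that word is not in words, move begin to the position just past it, reset the window, and continue. If the word is in words, advance end so that the window grows by one word. If the word after end has already appeared in the window and words has no remaining quota for it, move begin to just past the first occurrence of that word in the window, and continue. The scan is repeated for each of the n possible start offsets (0 to n-1) so that every word alignment is covered. The time complexity of this method is O(m*n), and the space complexity remains O(n*k).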
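To see why the window has to be run once per start offset, it may help to look at how s breaks into word-sized pieces for each offset. The sketch below is only an illustration: tokensAtOffset is a hypothetical helper name, not part of the solutions in this post.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: cuts s into consecutive pieces of length len,
// starting at index offset. A trailing fragment shorter than len is dropped.
std::vector<std::string> tokensAtOffset(const std::string& s,
                                        std::size_t offset, std::size_t len)
{
    assert(len > 0);                       // word length must be positive
    std::vector<std::string> tokens;
    for (std::size_t j = offset; j + len <= s.size(); j += len)
        tokens.push_back(s.substr(j, len));
    return tokens;
}
```

For s = "barfoothefoobarman" and a word length of 3, offset 0 gives bar, foo, the, foo, bar, man; offset 1 gives arf, oot, hef, oob, arm; offset 2 gives rfo, oth, efo, oba, rma. Each sequence is scanned once by the window, and every possible match lies in exactly one of them.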
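The sliding window is one of the most common approaches to string-processing problems, and it can usually be implemented with Two Pointers. Several similar problems appear later in LeetCode.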
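As a rough illustration of the general Two Pointers pattern, here is a minimal sketch for a different, well-known task: finding the length of the longest substring without repeated characters. The choice of task and the name longestUniqueRun are my own assumptions for illustration. The code treats the string as raw bytes.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <string>

// Length of the longest substring of s in which no byte occurs twice.
std::size_t longestUniqueRun(const std::string& s)
{
    std::array<int, 256> seen{};           // occurrences of each byte inside the window
    std::size_t best = 0;
    std::size_t begin = 0;                 // left edge of the window (inclusive)
    for (std::size_t end = 0; end < s.size(); ++end)   // right edge (inclusive)
    {
        const unsigned char c = static_cast<unsigned char>(s[end]);
        ++seen[c];
        while (seen[c] > 1)                // shrink from the left until c is unique again
        {
            --seen[static_cast<unsigned char>(s[begin])];
            ++begin;
        }
        assert(begin <= end);              // the window is never empty after shrinking
        best = std::max(best, end - begin + 1);
    }
    return best;
}
```

The shape is the same as in the word-based window: the right pointer only moves forward, and the left pointer catches up just far enough to restore the invariant.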
Code:
1. Conventional approach:
#include <string>
#include <unordered_map>
#include <vector>
using namespace std;

class Solution {
public:
    vector<int> findSubstring(string s, vector<string>& words)
    {
        vector<int> result;
        if(s.size() == 0 || words.size() == 0)
            return result;
        unordered_map<string, int> hash;       // required count of each word
        for(const auto& val : words)
            hash[val]++;
        int size = words.size();               // number of words, k
        int length = words[0].size();          // length of each word, n
        int s_length = s.size();               // length of s, m
        for(int i = 0; i <= s_length - size * length; ++i)
        {
            unordered_map<string, int> find;   // counts seen in the window starting at i
            int j = i;
            for(; j < i + size * length; j += length)
            {
                string tem = s.substr(j, length);
                find[tem]++;
                if(!hash.count(tem) || find[tem] > hash[tem])
                    break;                     // unknown word, or used too many times
            }
            if(j >= i + size * length)         // all k words matched
                result.push_back(i);
        }
        return result;
    }
};
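The inner loop above is the core of the conventional approach: it decides whether one fixed window is a valid concatenation. Pulled out on its own, that check might look like the sketch below. The name isConcatenationAt is hypothetical, and this variant counts a quota down instead of counting occurrences up, so it is a variation rather than the exact code above.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: returns true if s, starting at index pos, is a
// concatenation of every word in words exactly once, in any order.
// Assumes words is non-empty and all words have the same length.
bool isConcatenationAt(const std::string& s, std::size_t pos,
                       const std::vector<std::string>& words)
{
    assert(!words.empty());
    const std::size_t len = words[0].size();
    const std::size_t total = len * words.size();
    if (pos > s.size() || s.size() - pos < total)
        return false;                          // window would run past the end of s

    std::unordered_map<std::string, int> need; // remaining quota of each word
    for (const auto& w : words)
        ++need[w];

    for (std::size_t j = pos; j < pos + total; j += len)
    {
        auto it = need.find(s.substr(j, len));
        if (it == need.end() || it->second == 0)
            return false;                      // unknown word, or used too many times
        --it->second;
    }
    return true;                               // every quota was used up exactly
}
```

With s = "barfoothefoobarman" and words = ["foo", "bar"], the check succeeds at positions 0 and 9 and fails at position 3, where the second piece is "the".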
2. Sliding window method:
#include <string>
#include <unordered_map>
#include <vector>
using namespace std;

class Solution {
public:
    vector<int> findSubstring(string s, vector<string>& words)
    {
        vector<int> result;
        if(s.empty() || words.empty())             // guard: words[0] is read below
            return result;
        unordered_map<string, int> hash;           // the count of each word in words
        for(const auto& w : words)
            hash[w]++;
        int word_length = words[0].size();
        int word_count = words.size();
        int size = s.size();
        for(int i = 0; i < word_length; i++)       // the start index of the first word
        {
            unordered_map<string, int> hash2;      // the count of each word in the window
            int start = i, count = 0;              // window start; number of words in the window
            for(int end = i; end <= size - word_length; end += word_length)
            {
                string word = s.substr(end, word_length);
                if(hash.find(word) != hash.end())  // the word is found
                {
                    hash2[word]++;
                    if(hash2[word] <= hash[word])
                    {
                        count++;
                    }
                    else                           // too many copies: shrink from the left
                    {
                        for(int k = start; ; k += word_length)
                        {
                            string tmpstr = s.substr(k, word_length);
                            hash2[tmpstr]--;
                            if(tmpstr == word)     // dropped the earliest copy of word
                            {
                                start = k + word_length;
                                break;
                            }
                            count--;
                        }
                    }
                    if(count == word_count)
                        result.push_back(start);
                }
                else                               // the word is not found, so reset the window
                {
                    start = end + word_length;
                    hash2.clear();
                    count = 0;
                }
            }
        }
        return result;
    }
};