You are given a string, S, and a list of words, L, that are all of the same length. Find all starting indices of substrings in S that are a concatenation of each word in L exactly once, without any intervening characters.
For example, given:
S: "barfoothefoobarman"
L: ["foo", "bar"]
You should return the indices: [0,9] (order does not matter).
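As a baseline for checking the example above, here is a minimal brute-force sketch: it tries every start index and consumes one word at a time. The names bruteForceFindSubstring and checkExample are hypothetical and are not part of the original solution; the sketch assumes all words have the same non-zero length.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Brute-force baseline (hypothetical helper, not the author's solution):
// try every start index and read words.size() words of length wlen from it.
std::vector<int> bruteForceFindSubstring(const std::string& s,
                                         const std::vector<std::string>& words) {
    std::vector<int> result;
    if (s.empty() || words.empty() || words[0].empty()) return result;

    const std::size_t wlen = words[0].size();
    const std::size_t total = wlen * words.size();
    if (s.size() < total) return result;

    // need[w] = how many times w occurs in the word list.
    std::unordered_map<std::string, int> need;
    for (const std::string& w : words) ++need[w];

    for (std::size_t start = 0; start + total <= s.size(); ++start) {
        std::unordered_map<std::string, int> seen;
        bool ok = true;
        for (std::size_t k = 0; k < words.size() && ok; ++k) {
            const std::string w = s.substr(start + k * wlen, wlen);
            auto it = need.find(w);
            // Fail if w is not a listed word or is used more often than allowed.
            if (it == need.end() || ++seen[w] > it->second) ok = false;
        }
        if (ok) result.push_back(static_cast<int>(start));
    }
    return result;
}

// Usage sketch on the example from the statement.
void checkExample() {
    const std::vector<std::string> words = {"foo", "bar"};
    const std::vector<int> expected = {0, 9};
    assert(bruteForceFindSubstring("barfoothefoobarman", words) == expected);
}
```

This costs roughly slen * wcnt * wlen character operations, so it is only useful as a reference to compare a faster solution against.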
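Problem: given a string s and an array of words t (S and L in the statement above), where all words in t have the same length, find every substring of s that contains all the words of t exactly once each, with no overlap between the words inside one substring (different substrings may overlap each other). Return the starting index of every such substring. The word list t may contain duplicates, in which case a word that occurs k times in t must occur k times in the substring.
Analysis: let wlen be the word length, slen the string length, and wcnt the number of words. The approach below runs in O(wlen*slen): there are wlen passes, and in each pass every word-sized slice of s enters and leaves the window at most once, so a pass costs O(slen) character operations.
First build a hash table word that counts how often each word occurs in t. Then keep a second hash table mp that records how many times each word occurs in the current window, a counter cnt for the total number of words in the window, and the window's starting position st.
Outer loop: iterate over the wlen possible starting offsets 0, …, wlen-1.
Inner loop: scan s from the starting offset, taking one word now of length wlen at each step, and check whether it is in t. If it is not, set cnt = 0, clear mp, and set st = current position + wlen.
If it is in t, there are two cases: (1) if mp[now] < word[now], do mp[now]++ and cnt++; if cnt reaches the length of t, record the start index st.
(2) Otherwise, walk forward from st one word at a time, removing each word from mp and decrementing cnt, until one copy of now has been removed; then add now to the window as in case (1).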
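Case (2) is the least obvious step, so here is a small sketch that isolates it. The helper name dropUntilRemoved and the driver checkDropExample are hypothetical; the helper assumes that now already occurs in the window starting at st, which is guaranteed when mp[now] >= word[now].

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical helper isolating case (2): the window starting at st already
// holds as many copies of `now` as allowed. Drop words from the left until
// one copy of `now` has been removed, updating mp, cnt and st in place.
void dropUntilRemoved(const std::string& s, int wlen, const std::string& now,
                      std::unordered_map<std::string, int>& mp,
                      int& cnt, int& st) {
    std::string tmp;
    do {
        tmp = s.substr(st, wlen);  // leftmost word of the window
        --mp[tmp];
        --cnt;
        st += wlen;
    } while (tmp != now);
}

// Walk-through with t = ["foo", "bar"] on s = "barfoofoo":
// the window "barfoo" is full when the second "foo" arrives at index 6.
void checkDropExample() {
    const std::string s = "barfoofoo";
    std::unordered_map<std::string, int> mp = {{"bar", 1}, {"foo", 1}};
    int cnt = 2, st = 0;
    dropUntilRemoved(s, 3, "foo", mp, cnt, st);
    // "bar" and the first "foo" were both dropped, so the window is empty
    // and now starts at index 6, where the incoming "foo" will be added.
    assert(cnt == 0 && st == 6);
    assert(mp["bar"] == 0 && mp["foo"] == 0);
}
```

Because st only ever moves forward, the total work done by this step across one pass is bounded by the number of words in that pass.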
Code:
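#include <string>
#include <unordered_map>
#include <vector>
using namespace std;

class Solution {
public:
    vector<int> findSubstring(string S, vector<string> &L) {
        vector<int> ans;
        if (L.empty() || S.empty())
            return ans;
        // word[w] = number of times w must appear in a valid window.
        unordered_map<string, int> word;
        int wcnt = L.size();
        for (int i = 0; i < wcnt; i++)
            word[L[i]]++;  // operator[] starts a missing entry at 0
        int wlen = L[0].length();
        int slen = S.length();
        // Try every alignment of word boundaries: offsets 0 .. wlen-1.
        for (int i = 0; i < wlen; i++) {
            unordered_map<string, int> mp;  // word counts inside the current window
            int cnt = 0, st = i;            // the window holds cnt words and starts at st
            for (int j = i; j + wlen <= slen; j += wlen) {
                string now = S.substr(j, wlen);
                if (word.find(now) != word.end()) {
                    if (mp[now] >= word[now]) {
                        // Too many copies of now: drop words from the left
                        // until one copy of now has been removed.
                        string tmp;
                        do {
                            tmp = S.substr(st, wlen);
                            mp[tmp]--;
                            cnt--;
                            st += wlen;
                        } while (tmp != now);
                    }
                    mp[now]++;
                    cnt++;
                    if (cnt == wcnt) {
                        // The window is full; its start equals st here.
                        ans.push_back(j - (wcnt - 1) * wlen);
                    }
                } else {
                    // now is not in L, so no valid window can span it.
                    st = j + wlen;
                    cnt = 0;
                    mp.clear();
                }
            }
        }
        return ans;
    }
};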