LeetCode 30. Substring with Concatenation of All Words
考点 | 难度 |
---|---|
Hash Table | Hard |
题目
You are given a string s
and an array of strings words
. All the strings of words are of the same length.
A concatenated substring in s is a substring that contains all the strings of any permutation of words
concatenated.
For example, if words = ["ab","cd","ef"]
, then “abcdef”, “abefcd”, “cdabef”, “cdefab”, “efabcd”, and “efcdab” are all concatenated strings. “acdbef” is not a concatenated substring because it is not the concatenation of any permutation of words
.
Return the starting indices of all the concatenated substrings in s. You can return the answer in any order.
思路
sliding window, 用一个hash table wordsfound来存储某个词在window里出现的次数,用wordsused记录window里有多少个词,用excessword判断window里有没有出现重复的词。right从window的左边界开始,到s
的结尾结束,距离是wordlength。
每一个iteration检查一个right,如果substring不在words里面需要reset window,left移到right+wordlength。如果substring在words里面,如果windows超长或者有excess,移动left一个wordlength,wordsfound里对应的word -1。另外检查最左边(被移动掉的)substring是不是excess。之后把新的substring在wordsfound里对应位置+1,如果wordsfound里这个词出现次数小于等于words里这个词出现次数,说明还没有excess,wordsused可以+1.否则excess改为true。如果wordsused = k且没有excess,把left存到ans里面。
把range(wordlength)里的每个数当作初始left进行一次sliding window来排除掉offset。
答案
class Solution:
def findSubstring(self, s: str, words: List[str]) -> List[int]:
n = len(s)
k = len(words)
word_length = len(words[0])
substring_size = word_length * k
word_count = collections.Counter(words)
def sliding_window(left):
words_found = collections.defaultdict(int)
words_used = 0
excess_word = False
# Do the same iteration pattern as the previous approach - iterate
# word_length at a time, and at each iteration we focus on one word
for right in range(left, n, word_length):
if right + word_length > n:
break
sub = s[right : right + word_length]
if sub not in word_count:
# Mismatched word - reset the window
words_found = collections.defaultdict(int)
words_used = 0
excess_word = False
left = right + word_length # Retry at the next index
else:
# If we reached max window size or have an excess word
while right - left == substring_size or excess_word:
# Move the left bound over continously
leftmost_word = s[left : left + word_length]
left += word_length
words_found[leftmost_word] -= 1
if words_found[leftmost_word] == word_count[leftmost_word]:
# This word was the excess word
excess_word = False
else:
# Otherwise we actually needed it
words_used -= 1
# Keep track of how many times this word occurs in the window
words_found[sub] += 1
if words_found[sub] <= word_count[sub]:
words_used += 1
else:
# Found too many instances already
excess_word = True
if words_used == k and not excess_word:
# Found a valid substring
answer.append(left)
answer = []
for i in range(word_length):
sliding_window(i)
return answer