LeetCode知识点总结 - 30_leetcode 30-CSDN博客

本文链接：https://blog.csdn.net/m0_59773145/article/details/127140900

该博客详细介绍了LeetCode第30题的解决方案，通过滑动窗口策略寻找给定字符串中包含所有单词任意排列组合的子串，并解释了如何使用hash table记录单词出现次数，以及如何判断是否找到符合条件的子串。

摘要生成于 C知道，由 DeepSeek-R1 满血版支持，前往体验 >

LeetCode 30. Substring with Concatenation of All Words

考点	难度
Hash Table	Hard

题目

You are given a string s and an array of strings words. All the strings of words are of the same length.

A concatenated substring in s is a substring that contains all the strings of any permutation of words concatenated.

For example, if words = ["ab","cd","ef"], then “abcdef”, “abefcd”, “cdabef”, “cdefab”, “efabcd”, and “efcdab” are all concatenated strings. “acdbef” is not a concatenated substring because it is not the concatenation of any permutation of words.

Return the starting indices of all the concatenated substrings in s. You can return the answer in any order.

思路

sliding window, 用一个hash table wordsfound来存储某个词在window里出现的次数，用wordsused记录window里有多少个词，用excessword判断window里有没有出现重复的词。right从window的左边界开始，到s的结尾结束，距离是wordlength。
每一个iteration检查一个right，如果substring不在words里面需要reset window，left移到right+wordlength。如果substring在words里面，如果windows超长或者有excess，移动left一个wordlength，wordsfound里对应的word -1。另外检查最左边（被移动掉的）substring是不是excess。之后把新的substring在wordsfound里对应位置+1，如果wordsfound里这个词出现次数小于等于words里这个词出现次数，说明还没有excess，wordsused可以+1.否则excess改为true。如果wordsused = k且没有excess，把left存到ans里面。
把range(wordlength)里的每个数当作初始left进行一次sliding window来排除掉offset。

答案

class Solution:
    def findSubstring(self, s: str, words: List[str]) -> List[int]:
        n = len(s)
        k = len(words)
        word_length = len(words[0])
        substring_size = word_length * k
        word_count = collections.Counter(words)
        
        def sliding_window(left):
            words_found = collections.defaultdict(int)
            words_used = 0
            excess_word = False
            
            # Do the same iteration pattern as the previous approach - iterate
            # word_length at a time, and at each iteration we focus on one word
            for right in range(left, n, word_length):
                if right + word_length > n:
                    break

                sub = s[right : right + word_length]
                if sub not in word_count:
                    # Mismatched word - reset the window
                    words_found = collections.defaultdict(int)
                    words_used = 0
                    excess_word = False
                    left = right + word_length # Retry at the next index
                else:
                    # If we reached max window size or have an excess word
                    while right - left == substring_size or excess_word:
                        # Move the left bound over continously
                        leftmost_word = s[left : left + word_length]
                        left += word_length
                        words_found[leftmost_word] -= 1

                        if words_found[leftmost_word] == word_count[leftmost_word]:
                            # This word was the excess word
                            excess_word = False
                        else:
                            # Otherwise we actually needed it
                            words_used -= 1
                    
                    # Keep track of how many times this word occurs in the window
                    words_found[sub] += 1
                    if words_found[sub] <= word_count[sub]:
                        words_used += 1
                    else:
                        # Found too many instances already
                        excess_word = True
                    
                    if words_used == k and not excess_word:
                        # Found a valid substring
                        answer.append(left)
        
        answer = []
        for i in range(word_length):
            sliding_window(i)

        return answer