leetcode 792. Number of Matching Subsequences

大师所言极是

Article link: https://blog.csdn.net/u010829672/article/details/79833283

Problem Description

Given a string S and a dictionary of words words, find the number of words[i] that are subsequences of S.

Note:

All words in words and S consist only of lowercase letters.
The length of S will be in the range of [1, 50000].
The length of words will be in the range of [1, 5000].
The length of words[i] will be in the range of [1, 50].

Difficulty: Medium

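Problem restatement
Given a string S and a list of words words, count the words in words that are subsequences of S, that is, words that can be formed from characters of S taken in order but not necessarily contiguously.

Input format
A string S and a list of words words.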

Examples:

Input: S = "abcde", words = ["a", "bb", "acd", "ace"]
Output: 3
Explanation:
a, acd and ace are subsequences of S = "abcde" (the matched characters need not be contiguous), so the answer is 3.

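Approach

Solution 1: Compare one by one

1. Compare each word $word$ in $words$ against $S$ one by one. Use two pointers to track the current positions in $word$ and in $S$: if the characters match, advance both pointers by 1; otherwise advance only the pointer into $S$. If the pointer into $S$ reaches the end before $word$ is fully matched, $word$ is not a subsequence of $S$; otherwise it is.

2. Complexity: $O\left(mn\right)$, where $m$ is the number of words in $words$ and $n$ is the length of $S$. This approach exceeds the time limit.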
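
The two-pointer scan described above can also be written compactly with an iterator over $S$. The following is only a minimal sketch, not the submitted solution; the helper names `is_subsequence` and `count_matching_naive` are made up for illustration.

```python
def is_subsequence(word, s):
    """Return True if `word` is a subsequence of `s` (two-pointer scan)."""
    it = iter(s)  # the iterator plays the role of the pointer into s
    # `ch in it` advances the iterator until it finds ch (or is exhausted),
    # so each character of s is visited at most once per word.
    return all(ch in it for ch in word)


def count_matching_naive(s, words):
    """Count the words that are subsequences of s, checking each one in turn."""
    return sum(1 for word in words if is_subsequence(word, s))


print(count_matching_naive("abcde", ["a", "bb", "acd", "ace"]))  # 3
```

Every word still triggers its own scan of $S$, so the cost stays proportional to $mn$ in the worst case.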
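
Solution 2: Binary search

1. Solution 1 is slow mainly because the character comparison is inefficient: $S$ has to be scanned one character at a time for every word. So we look for a more efficient way to do the matching.

2. First record the positions at which each character occurs in $S$, in increasing order; call this $dicts$.

3. Initialize the pointer to the current position in $S$ to -1. For each character $char$ of a $word$, look up $char$ in $dicts$. If it is absent, $word$ cannot be a subsequence and we can move on to the next $word$. Otherwise, binary-search the positions of $char$ in $S$ for the first one greater than the current pointer (this is a greedy choice: taking the earliest valid position leaves the most room for the remaining characters). If such a position exists, move the pointer there; otherwise $word$ is not a subsequence.

4. Complexity: $O\left(n + mk\log n\right)$, where $m$ is the number of words in $words$, $n$ is the length of $S$, and $k$ is the length of the longest word. Since $k \le 50$ here, the search phase is effectively $O\left(m\log n\right)$.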
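
To make the lookup in step 3 concrete, here is a small sketch of the position index and of a single greedy step. The function names `build_positions` and `next_position` are illustrative assumptions, and the string "abcabc" is chosen only because it has repeated characters.

```python
import bisect
from collections import defaultdict


def build_positions(s):
    """Map each character of s to the sorted list of indices where it occurs."""
    positions = defaultdict(list)
    for i, ch in enumerate(s):
        positions[ch].append(i)  # indices are appended in increasing order
    return positions


def next_position(positions, ch, after):
    """Return the smallest index of ch strictly greater than `after`, or None."""
    indices = positions.get(ch, [])
    k = bisect.bisect_right(indices, after)
    return indices[k] if k < len(indices) else None


positions = build_positions("abcabc")
print(next_position(positions, "a", -1))  # 0
print(next_position(positions, "a", 0))   # 3
print(next_position(positions, "a", 3))   # None
```

Starting the pointer at -1 lets the same "strictly greater than" search also accept index 0 for the first character of a word.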
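
Solution 3: Trie

1. Merge the words in $words$ that share the same first character under that character, forming a trie keyed on the next character each word needs.

2. Iterate over each character $char$ of $S$. If $char$ is a key in the current trie, remove that node and redistribute the words it held according to the first character of their remaining suffix, forming a new trie. If the remaining suffix is empty, the word is a subsequence of $S$ and the counter is incremented by 1.

3. After all of $S$ has been traversed, the counter holds the answer.

4. Complexity: $O\left(nk\right)$, where $n$ is the length of $S$ and $k$ is the average number of strings that must be redistributed when a node is removed. In the worst case $k = m$, where $m$ is the number of words in $words$, but typically $k \ll m$. Because each word is moved at most once per character it contains, the total work is also bounded by $O\left(n + \sum_i |words[i]|\right)$.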
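
The redistribution in step 2 may be easier to follow with explicit indices. The sketch below is a variant written for illustration only: it stores hypothetical `(word, i)` pairs instead of iterators, and the name `count_matching_buckets` is an assumption rather than part of the original solution.

```python
from collections import defaultdict


def count_matching_buckets(s, words):
    """Count subsequences of s by grouping words on the next character they need."""
    # waiting[ch] holds (word, i) pairs whose next unmatched character word[i] is ch.
    # Assumes every word is non-empty, as the problem statement guarantees.
    waiting = defaultdict(list)
    for word in words:
        waiting[word[0]].append((word, 0))
    matched = 0
    for ch in s:
        # Detach the whole bucket first, so a word that needs ch twice in a row
        # waits for a later occurrence instead of being advanced twice by this one.
        for word, i in waiting.pop(ch, ()):
            i += 1
            if i == len(word):
                matched += 1  # every character of word has been matched
            else:
                waiting[word[i]].append((word, i))
    return matched


print(count_matching_buckets("abcde", ["a", "bb", "acd", "ace"]))  # 3
```

In the example, "bb" is moved back into the "b" bucket after the single "b" in $S$ and never leaves it, which is why it is not counted.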

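Code

Solution 1 (time limit exceeded):

class Solution(object):
    def numMatchingSubseq(self, S, words):
        """
        :type S: str
        :type words: List[str]
        :rtype: int
        Time Limit Exceeded
        """
        ans = 0
        for word in words:
            start = 0
            i = 0
            # Look for each character of word in turn
            while i < len(word):
                # Compare against the characters of S one at a time
                while start < len(S):
                    # On a match, advance both pointers by one
                    if word[i] == S[start]:
                        start += 1
                        i += 1
                        break
                    # Otherwise advance only the pointer into S
                    else:
                        start += 1
                # S is exhausted, so no further character of word can match
                if start >= len(S):
                    break
            # Count the word if all of its characters were found in S
            if i == len(word):
                ans += 1
        return ans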

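Solution 2, binary search:

import bisect
import collections


class Solution(object):
    def numMatchingSubseq(self, S, words):
        """
        :type S: str
        :type words: List[str]
        :rtype: int
        938 ms
        """
        ans = 0
        S_dicts = collections.defaultdict(list)
        # Record the positions at which each letter occurs in S
        for i, char in enumerate(S):
            S_dicts[char].append(i)
        for word in words:
            # Matching may start at index 0 of S; the binary search looks for
            # a position strictly greater than start, so start begins at -1
            start = -1
            flag = True
            # Look up the position in S of each letter of word
            for char in word:
                # If the letter does not occur in S, word cannot be a subsequence
                if not S_dicts[char]:
                    flag = False
                    break
                else:
                    # Find the nearest position in S after the one chosen last time
                    pos = bisect.bisect(S_dicts[char], start)
                    # If there is one, move the current position to it
                    if pos < len(S_dicts[char]):
                        start = S_dicts[char][pos]
                    # No position satisfies the condition, so stop early
                    else:
                        flag = False
                        break
            # Count the words that satisfy the condition
            if flag:
                ans += 1
        return ans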

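Solution 3, trie, adapted from someone else's code:

import collections


class Solution(object):
    def numMatchingSubseq(self, S, words):
        """
        :type S: str
        :type words: List[str]
        :rtype: int
        527 ms
        """
        # waiting[c] holds iterators over the remaining suffixes of the words
        # whose next unmatched character is c
        waiting = collections.defaultdict(list)
        for w in words:
            waiting[w[0]].append(iter(w[1:]))
        for c in S:
            # dict.pop(key, default) removes key and returns its value,
            # or returns default when key is absent.
            # Remove every word that is currently waiting for c
            for it in waiting.pop(c, ()):
                # If the word has more letters, file it under its next letter;
                # otherwise file it under None, meaning the word is fully matched
                waiting[next(it, None)].append(it)
        return len(waiting[None])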
