LeetCode 28. Implement strStr()

von Libniz

Last modified: 2022-04-15 20:18:27

First published: 2022-04-13 19:46:36

Article link: https://blog.csdn.net/demon_lmman/article/details/124155829

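Given two strings haystack and needle, find the first position (0-indexed) at which needle occurs in haystack. If needle does not occur, return -1.
Note: what should be returned when needle is an empty string? This is a good question to ask in an interview.

For this problem, return 0 when needle is an empty string. This matches the definition of strstr() in C and indexOf() in Java.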

Example 1:

Input: haystack = "hello", needle = "ll"
Output: 2

Example 2:

Input: haystack = "aaaaa", needle = "bba"
Output: -1

Example 3:

Input: haystack = "", needle = ""
Output: 0
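
Python's built-in str.find follows the same convention for an empty needle, so it can serve as a reference when checking the three examples above. This is only a sanity check against the standard library, not a solution to the exercise:

```python
# Reference behaviour from the standard library: str.find returns the first
# index of the substring, -1 if it is absent, and 0 for an empty needle.
examples = [("hello", "ll"), ("aaaaa", "bba"), ("", "")]
results = [haystack.find(needle) for haystack, needle in examples]
print(results)  # [2, -1, 0]
```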

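Although this is rated an easy problem, it admits some non-trivial solutions. This post gives a brute-force solution and a Rabin-Karp solution. KMP is the classic approach, but its code is hard to remember, so Rabin-Karp is more convenient in an actual interview. Its expected running time, O(n+m), matches KMP's, although unlike KMP it can degrade to O(m*n) in the worst case when hash collisions are frequent.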
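
For comparison only, here is a minimal KMP sketch. It is not part of the original post: the function name kmp_search and the failure-table layout are choices made for this illustration. It shows the extra bookkeeping (the failure table) that makes KMP harder to reproduce from memory.

```python
def kmp_search(haystack: str, needle: str) -> int:
    """Return the first index of needle in haystack, or -1 (KMP, O(n+m))."""
    if not needle:
        return 0

    # fail[i] = length of the longest proper prefix of needle[:i + 1]
    # that is also a suffix of it
    fail = [0] * len(needle)
    k = 0
    for i in range(1, len(needle)):
        while k > 0 and needle[i] != needle[k]:
            k = fail[k - 1]
        if needle[i] == needle[k]:
            k += 1
        fail[i] = k

    # scan haystack once; k is the number of needle characters matched so far
    k = 0
    for i, ch in enumerate(haystack):
        while k > 0 and ch != needle[k]:
            k = fail[k - 1]
        if ch == needle[k]:
            k += 1
        if k == len(needle):
            return i - k + 1
    return -1


print(kmp_search("hello", "ll"))  # 2
```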

1 Brute-force matching

Let n be the length of haystack and m the length of needle. The time complexity is O(m*n).

class Solution:
    def strStr(self, haystack: str, needle: str) -> int:
        if haystack is None or needle is None:
            return -1
        if len(needle) == 0:
            return 0

        tar_len = len(needle)
        for i in range(len(haystack) - tar_len + 1):
            for j in range(tar_len):  # compare in place, character by character, instead of slicing out a substring copy
                if haystack[i + j] != needle[j]:
                    break
            else:
                return i
        return -1
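
To see where the O(m*n) bound comes from, the following sketch counts character comparisons on an adversarial input. The helper name count_comparisons is hypothetical and exists only for this illustration; the matching logic mirrors the brute-force loop above.

```python
def count_comparisons(haystack: str, needle: str):
    """Brute-force search that also reports how many character comparisons ran."""
    comparisons = 0
    m = len(needle)
    for i in range(len(haystack) - m + 1):
        for j in range(m):
            comparisons += 1
            if haystack[i + j] != needle[j]:
                break
        else:
            return i, comparisons
    return -1, comparisons


# n = 20, m = 5: there are 16 windows, and each one matches four 'a'
# characters before failing on the final 'b', so 16 * 5 comparisons run.
index, comparisons = count_comparisons("a" * 20, "a" * 4 + "b")
print(index, comparisons)  # -1 80
```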

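2 Rabin-Karp

Time complexity: O(n+m) expected
The idea behind Rabin-Karp is easy to remember: replace the substring comparison in the brute-force solution with a comparison of hash values. The hash has to be designed carefully so that the hash of the next substring can be derived from the hash of the previous one in O(1) time; comparing hashes across all windows then takes O(n). When two hashes are equal, the characters are compared one by one to rule out a hash collision, which costs O(m). With few collisions the total is therefore O(n+m); if collisions are frequent, the verification step can push the worst case back to O(m*n).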
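
The O(1) update can be checked in isolation. The sketch below assumes a polynomial hash with weight 31 and modulus 10**6; the names direct_hash and roll are hypothetical and used only here. It verifies that rolling the hash one step gives the same value as hashing each window from scratch.

```python
WEIGHT = 31
MOD = 10 ** 6  # illustrative integer modulus


def direct_hash(s: str) -> int:
    """Polynomial hash of s computed from scratch in O(len(s))."""
    h = 0
    for ch in s:
        h = (h * WEIGHT + ord(ch)) % MOD
    return h


def roll(prev_hash: int, outgoing: str, incoming: str, power: int) -> int:
    """Slide the window one character to the right in O(1)."""
    # remove the contribution of the character leaving the window;
    # Python's % already returns a non-negative result
    h = (prev_hash - ord(outgoing) * power) % MOD
    # shift by one position and append the character entering the window
    return (h * WEIGHT + ord(incoming)) % MOD


text, m = "hello world", 3
power = pow(WEIGHT, m - 1, MOD)  # WEIGHT^(m-1) mod MOD
h = direct_hash(text[:m])
for i in range(1, len(text) - m + 1):
    h = roll(h, text[i - 1], text[i + m - 1], power)
    assert h == direct_hash(text[i:i + m])
```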

class Solution_Rabin_Karp:
    """
    Rabin-Karp string matching, expected time O(n+m)
    """
    def strStr(self, haystack: str, needle: str) -> int:
        if haystack is None or needle is None:
            return -1
        if len(needle) == 0:
            return 0

        tar_len = len(needle)
        tar_hash = self.str2hash(needle, begin=0, end=tar_len, order=0)
        src_hash = 0
        for i in range(len(haystack) - tar_len + 1):
            src_hash = self.str2hash(haystack, begin=i, end=i + tar_len, pre_hash=src_hash, order=i)
            if src_hash == tar_hash:
                for j in range(tar_len):
                    if haystack[i + j] != needle[j]:  # compare again to rule out a hash collision
                        break
                else:
                    return i
        return -1

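    def str2hash(self, input_str, begin, end, pre_hash=0, order=0):
        """Return the hash of input_str[begin:end].

        When order == 0 the hash is computed from scratch; otherwise it is
        rolled forward in O(1) from pre_hash, the hash of the previous window.
        """
        weight = 31
        mod = 10 ** 6  # integer modulus; the float 1e6 loses precision (or overflows) for long needles

        if order == 0:
            hash_value = 0
            for s in input_str[begin:end]:
                hash_value = (hash_value * weight + ord(s)) % mod
            return hash_value

        # weight^(m-1) mod `mod`; three-argument pow keeps the intermediate values small
        power = pow(weight, end - begin - 1, mod)
        # drop the character leaving the window; Python's % keeps the result non-negative
        hash_value = (pre_hash - ord(input_str[begin - 1]) * power) % mod
        # append the character entering the window
        return (hash_value * weight + ord(input_str[end - 1])) % mod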

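3 In-place matching

Python passes strings to functions by reference, so passing haystack as an argument does not copy it. The extra cost in the version above comes from the function call made for every window (and from any slice, which does create a new string). Computing the hash and comparing characters inline avoids that overhead. Below is the Rabin-Karp code with the hash computed in place. In the author's repeated timing runs it was somewhat faster than the helper-function version, at some cost in readability.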
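
A comparison of this kind can be reproduced roughly with the standard timeit module. The sketch below is not the author's benchmark: the function names, the test string, and the repetition count are all assumptions made for illustration. It computes the rolling hash of every window once through a helper function and once inline, checks that both agree, and prints the two timings. Timings vary by machine and Python version, so no particular result is claimed.

```python
import timeit

WEIGHT, MOD = 31, 10 ** 6


def first_hash(text, m):
    """Hash of the first window, computed from scratch."""
    h = 0
    for ch in text[:m]:
        h = (h * WEIGHT + ord(ch)) % MOD
    return h


def step(prev_hash, outgoing, incoming, power):
    """One rolling-hash update, isolated in a helper function."""
    h = (prev_hash - ord(outgoing) * power) % MOD
    return (h * WEIGHT + ord(incoming)) % MOD


def hashes_with_helper(text, m):
    power = pow(WEIGHT, m - 1, MOD)
    h = first_hash(text, m)
    out = [h]
    for i in range(1, len(text) - m + 1):
        h = step(h, text[i - 1], text[i + m - 1], power)
        out.append(h)
    return out


def hashes_inline(text, m):
    power = pow(WEIGHT, m - 1, MOD)
    h = first_hash(text, m)
    out = [h]
    for i in range(1, len(text) - m + 1):
        # same update as step(), written inline to avoid a call per window
        h = (h - ord(text[i - 1]) * power) % MOD
        h = (h * WEIGHT + ord(text[i + m - 1])) % MOD
        out.append(h)
    return out


text = "abcde" * 2000
assert hashes_with_helper(text, 8) == hashes_inline(text, 8)
t_helper = timeit.timeit(lambda: hashes_with_helper(text, 8), number=20)
t_inline = timeit.timeit(lambda: hashes_inline(text, 8), number=20)
print(f"helper: {t_helper:.4f}s  inline: {t_inline:.4f}s")
```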

class Solution_Rabin_Karp_inplace:
    """
    Rabin-Karp string matching with the hash computed inline, expected time O(n+m)
    """
    weight = 31
    Base = 10 ** 6  # integer modulus; a float such as 1e6 loses precision for long needles

    def strStr(self, haystack: str, needle: str) -> int:
        if haystack is None or needle is None:
            return -1
        if len(needle) == 0:
            return 0

        tar_len = len(needle)

        # generate target hash code
        tar_hash = 0
        for s in needle:
            tar_hash = (tar_hash * self.weight + ord(s)) % self.Base

        # weight^(tar_len - 1) mod Base, computed once with three-argument pow
        power = pow(self.weight, tar_len - 1, self.Base)

        src_hash = 0
        for i in range(len(haystack) - tar_len + 1):
            if i == 0:  # first substring hash code
                for s in haystack[i: i + tar_len]:
                    src_hash = (src_hash * self.weight + ord(s)) % self.Base
            else:
                # drop haystack[i - 1]; Python's % keeps the result non-negative
                src_hash = (src_hash - ord(haystack[i - 1]) * power) % self.Base
                # append haystack[i + tar_len - 1]
                src_hash = (src_hash * self.weight + ord(haystack[i + tar_len - 1])) % self.Base

            if src_hash == tar_hash:
                for j in range(tar_len):
                    if haystack[i + j] != needle[j]:  # compare again to rule out a hash collision
                        break
                else:
                    return i
        return -1
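
One way to gain confidence in any of the implementations above is to compare them with str.find on random inputs. The harness below is a sketch under stated assumptions: the name cross_check, its parameters, and the two-letter alphabet are hypothetical choices for illustration. It accepts any callable with the signature (haystack, needle) -> int.

```python
import random


def cross_check(str_str, trials=2000, max_haystack=12, max_needle=4, seed=0):
    """Compare str_str(haystack, needle) with str.find on random inputs.

    Returns the first disagreeing (haystack, needle) pair, or None if all agree.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        # a tiny alphabet makes matches and near-matches common
        haystack = "".join(rng.choice("ab") for _ in range(rng.randint(0, max_haystack)))
        needle = "".join(rng.choice("ab") for _ in range(rng.randint(0, max_needle)))
        if str_str(haystack, needle) != haystack.find(needle):
            return haystack, needle
    return None


# Stand-in solver so the snippet runs on its own; pass for example
# Solution_Rabin_Karp_inplace().strStr to test one of the classes above.
print(cross_check(lambda h, n: h.find(n)))  # None
```

Raising max_haystack and max_needle exercises longer needles, which is where precision problems in a hash computation are most likely to surface.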
