【LEETCODE】28-Implement strStr()

Latest recommended article published on 2022-04-22 16:58:43

Alice熹爱学习

Category: LEETCODE  Tags: LEETCODE PYTHON

Article link: https://blog.csdn.net/aliceyangxi1987/article/details/50373163

Implement strStr().

Returns the index of the first occurrence of needle in haystack, or -1 if needle is not part of haystack.

References:

http://blog.csdn.net/hcbbt/article/details/44099749

http://www.cnblogs.com/zuoyuan/p/3698900.html

A detailed explanation of the KMP algorithm:

http://blog.csdn.net/joylnwang/article/details/6778316

Problem:

Find needle in the string haystack.

If it exists, return the starting index of its first occurrence.

If it does not exist, return -1.
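
To make the expected behavior concrete, here is a minimal brute-force sketch. The function name str_str_naive is hypothetical and not part of the original post; it simply tries every start position, and it follows the same return convention as Python's built-in str.find.

```python
def str_str_naive(haystack, needle):
    """Return the first index where needle occurs in haystack, or -1."""
    n, m = len(haystack), len(needle)
    # Try every start position that leaves room for the whole needle.
    for start in range(n - m + 1):
        if haystack[start:start + m] == needle:
            return start
    return -1


print(str_str_naive("aabbaa", "bb"))  # 2
print(str_str_naive("aabbaa", "c"))   # -1
```

This baseline costs O(len(haystack) * len(needle)) character comparisons in the worst case, which is the cost the rolling hash approach tries to avoid.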

Approach:

References:

The rolling hash algorithm: "Rolling Hash (Rabin-Karp algorithm): matching strings and anagrams"

http://blog.csdn.net/yanghua_kobe/article/details/8914970

http://courses.csail.mit.edu/6.006/spring11/rec/rec06.pdf

For example, to find P in S:

S=[4,8,9,0,2,1,0,7]

P=[9,0,2,1,0]

Slide a window across S and start matching:

S0=[4,8,9,0,2]

S1=[8,9,0,2,1]

S2=[9,0,2,1,0]

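The hash of each window is compared with the hash of needle:

h(P) = 90210 mod m

A simple way to slide the window: drop the highest digit, then append the next digit at the end.

h(S0) = 48902 mod m

h(S1) = 89021 mod m

Starting from 48902, drop the leading digit to get 8902, multiply by 10 to get 89020, then add the next digit to get 89021.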

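That is:

89021 = (48902 - 4 * 10^4) * 10 + 1

More generally, writing s_i for the number formed by the |P| digits of S starting at position i:

s_(i+1) = (s_i - S[i] * 10^(|P|-1)) * 10 + S[i+|P|]

The hash function is:

h(s_i) = s_i mod m

So:

h(s_(i+1)) = ((h(s_i) - S[i] * (10^(|P|-1) mod m)) * 10 + S[i+|P|]) mod m

In the example, h(s_i) is the hash of the 5-digit window starting at 4,

and h(s_(i+1)) is the hash of the 5-digit window starting at 8.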
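
The digit example can be traced in a few lines of Python. This is only a sketch under stated assumptions: the function name find_digits and the modulus value 1000003 are arbitrary choices for illustration, not taken from the original post. Because two different windows can share a hash once a modulus is applied, the sketch confirms every hash match by comparing the digits directly.

```python
def find_digits(S, P, m=1000003):
    """Find P in S (lists of decimal digits) with a base-10 rolling hash."""
    n, k = len(S), len(P)
    if k == 0:
        return 0
    if k > n:
        return -1
    high = pow(10, k - 1, m)          # 10^(k-1) mod m, the weight of the leading digit
    hp = hs = 0
    for j in range(k):
        hp = (hp * 10 + P[j]) % m     # h(P)
        hs = (hs * 10 + S[j]) % m     # hash of the first window
    for i in range(n - k + 1):
        # Equal hashes only suggest a match; compare the digits to confirm.
        if hs == hp and S[i:i + k] == P:
            return i
        if i + k < n:
            # Drop the leading digit, shift by one place, append the next digit.
            hs = ((hs - S[i] * high) * 10 + S[i + k]) % m
    return -1


S = [4, 8, 9, 0, 2, 1, 0, 7]
P = [9, 0, 2, 1, 0]
print(find_digits(S, P))  # 2
```

With the default modulus the window hashes are 48902, 89021 and 90210, so the match is found at the third window, index 2.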

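from functools import reduce  # reduce is no longer a builtin in Python 3


class Solution:
    # @param haystack, a string
    # @param needle, a string
    # @return an integer
    def strStr(self, haystack, needle):
        # Example: haystack = 'aabbaa', needle = 'bb'
        hlen, nlen = len(haystack), len(needle)
        if nlen == 0:
            return 0
        if nlen > hlen:
            return -1

        # One rolling step in base 26: shift the hash by one digit and add the new digit.
        rolling = lambda x, y: x * 26 + y
        # Map a letter to a number: its distance from 'a' (e.g. 'b' -> 1).
        get_hash = lambda ch: ord(ch) - ord('a')

        # map converts needle into a sequence of numbers; reduce folds that sequence
        # pairwise with rolling, giving k1*26^(n-1) + k2*26^(n-2) + ... + kn*1.
        nhash = reduce(rolling, map(get_hash, needle))
        hhash = reduce(rolling, map(get_hash, haystack[:nlen]))
        # First check whether the first nlen characters of haystack are needle.
        # Equal-length strings over 'a'..'z' never collide here; the string
        # comparison guards against inputs that contain other characters.
        if nhash == hhash and haystack[:nlen] == needle:
            return 0

        # Weight of the highest digit in the window.
        high_base = 26 ** (nlen - 1)

        # The first window is already checked, so continue from index nlen.
        for i in range(nlen, hlen):
            # Remove the highest digit.
            hhash -= get_hash(haystack[i - nlen]) * high_base
            # Append the next digit.
            hhash = rolling(hhash, get_hash(haystack[i]))
            # i is now the last index of the window, so the window starts at i - nlen + 1.
            start = i - nlen + 1
            if nhash == hhash and haystack[start:i + 1] == needle:
                return start

        return -1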
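
The solution above keeps the full integer hash and never applies the "mod m" step from the worked example, so the hash grows with the length of needle. A variant that keeps the hash bounded might look like the sketch below. The function name str_str_rk and the default base and modulus are assumptions chosen for illustration, not part of the original post. Since a modulus allows collisions, each hash match is confirmed with a direct string comparison.

```python
def str_str_rk(haystack, needle, base=256, mod=1000000007):
    """Rabin-Karp search with a bounded rolling hash; returns first index or -1."""
    n, k = len(haystack), len(needle)
    if k == 0:
        return 0
    if k > n:
        return -1
    high = pow(base, k - 1, mod)      # base^(k-1) mod mod, weight of the leading character
    nhash = hhash = 0
    for j in range(k):
        nhash = (nhash * base + ord(needle[j])) % mod
        hhash = (hhash * base + ord(haystack[j])) % mod
    for start in range(n - k + 1):
        # Confirm a hash match with a real comparison to rule out collisions.
        if nhash == hhash and haystack[start:start + k] == needle:
            return start
        if start + k < n:
            # Drop the leading character, shift, and append the next character.
            hhash = ((hhash - ord(haystack[start]) * high) * base
                     + ord(haystack[start + k])) % mod
    return -1


print(str_str_rk("aabbaa", "bb"))  # 2
```

Using ord directly instead of the distance from 'a' means the sketch also accepts input that is not lowercase letters, at the price of relying on the final comparison for correctness.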