LeetCode 28. Implement strStr(): The Rabin-Karp Algorithm

Description:
Implement strStr().

Return the index of the first occurrence of needle in haystack, or -1 if needle is not part of haystack.

Example 1:

Input: haystack = “hello”, needle = “ll”
Output: 2

Clarification:

What should we return when needle is an empty string? This is a great question to ask during an interview.

For the purpose of this problem, we will return 0 when needle is an empty string. This is consistent with C’s strstr() and Java’s indexOf().

Solution:

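Approach 1:
This problem boils down to comparing the strings character by character. Using String.startsWith() would probably make it a little more convenient.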

public int strStr(String haystack, String needle) {
    int len1 = haystack.length();
    int len2 = needle.length();
    if (len2 == 0) return 0;      // an empty needle matches at index 0
    if (len2 > len1) return -1;   // the needle cannot fit
    // Try every start position at which the needle still fits.
    for (int i = 0; i + len2 <= len1; ++i) {
        int j = 0;
        // Advance while the characters agree.
        while (j < len2 && haystack.charAt(i + j) == needle.charAt(j)) {
            ++j;
        }
        if (j == len2) return i;  // every character of the needle matched
    }
    return -1;
}
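
The startsWith idea mentioned above can be sketched as follows. This is a minimal illustration, not the code used in this post; the class name StartsWithSearch is made up for the example. It relies on String.startsWith(String prefix, int toffset), which tests whether the needle occurs at a given offset.

```java
final class StartsWithSearch {
    // Returns the index of the first occurrence of needle in haystack, or -1.
    static int strStr(String haystack, String needle) {
        int n = haystack.length();
        int m = needle.length();
        // Only offsets where the needle still fits need to be tried.
        // An empty needle matches at offset 0, as the problem requires.
        for (int i = 0; i + m <= n; ++i) {
            if (haystack.startsWith(needle, i)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(strStr("hello", "ll")); // prints 2
    }
}
```

The worst case is still O(mn), since startsWith compares up to m characters at each offset.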

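Approach 2:
This problem can be solved with the Rabin-Karp algorithm (see the Wikipedia article "Rabin-Karp algorithm").
The pseudocode of the algorithm is as follows: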

function RabinKarp(string s[1..n], string pattern[1..m])
    hpattern := hash(pattern[1..m]);
    for i from 1 to n-m+1
        hs := hash(s[i..i+m-1])
        if hs = hpattern
            if s[i..i+m-1] = pattern[1..m]
                return i
    return not found

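Note in particular: the two strings are compared character by character only when their hash values are equal. (In Java development, comparing two elements directly with equals can be slow, because many classes, String among them, override equals to compare the full contents. Comparing hash codes only means comparing two integers, which is very fast. If the hash codes differ, the contents cannot be equal and there is no need to call equals. If they are the same, call equals afterwards to confirm that the contents really are equal, because different strings can share a hash code.)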

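On time complexity, Wikipedia says the following (the line numbers refer to the pseudocode above, counting the function header as line 1):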

Lines 2, 4, and 6 each require O(m) time. However, line 2 is only executed once, and line 6 is only executed if the hash values match, which is unlikely to happen more than a few times. Line 5 is executed O(n) times, but each comparison only requires constant time, so its impact is O(n). The issue is line 4.
So in general the cost is dominated by line 4 of the pseudocode (hs := hash(s[i..i+m-1])), which is executed O(n) times.
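
To make the cost of line 4 concrete, here is a rough Java transliteration of the pseudocode in which every window is hashed from scratch. The class name NaiveRabinKarp and the base-256, modulo-101 hash are assumptions chosen for illustration; the pseudocode itself leaves hash unspecified. Indices are 0-based here, unlike the 1-based pseudocode.

```java
final class NaiveRabinKarp {
    private static final int BASE = 256; // assumed radix
    private static final int MOD = 101;  // assumed small prime modulus

    // Hashes s[from .. from+len-1] from scratch: O(len) work per call.
    static int hash(String s, int from, int len) {
        int h = 0;
        for (int k = 0; k < len; ++k) {
            h = (h * BASE + s.charAt(from + k)) % MOD;
        }
        return h;
    }

    static int search(String s, String pattern) {
        int n = s.length();
        int m = pattern.length();
        int hpattern = hash(pattern, 0, m);                 // pseudocode line 2
        for (int i = 0; i + m <= n; ++i) {                  // line 3
            int hs = hash(s, i, m);                         // line 4: O(m) on every iteration
            if (hs == hpattern                              // line 5
                    && s.regionMatches(i, pattern, 0, m)) { // line 6
                return i;                                   // line 7
            }
        }
        return -1;                                          // line 8
    }
}
```

Because hash is called once per window, the loop does O(m) work for each of the n - m + 1 windows, which is where an O(mn) total comes from.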

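Computed naively, the hashing costs O(mn) in total, but with a rolling hash the cost drops to O(n). The explanation is as follows (if it is hard to follow, it is fine to remember just the conclusion):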

Naively computing the hash value for the substring s[i+1…i+m] requires O(m) time because each character is examined. Since the hash computation is done on each loop, the algorithm with a naïve hash computation requires O(mn) time, the same complexity as straightforward string matching algorithms. For speed, the hash must be computed in constant time. The trick is the variable hs already contains the previous hash value of s[i…i+m-1]. If that value can be used to compute the next hash value in constant time, then computing successive hash values will be fast.
The trick can be exploited using a rolling hash. A rolling hash is a hash function specially designed to enable this operation. A trivial (but not very good) rolling hash function just adds the values of each character in the substring. This rolling hash formula can compute the next hash value from the previous value in constant time:

The idea behind the rolling hash computation is as follows:

s[i+1..i+m] = s[i..i+m-1] - s[i] + s[i+m]

For example, with a sliding window of size 2 over the string "abcde": hash("bc") = hash("ab") - 'a' + 'c'.
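
A minimal sketch of this trivial additive rolling hash, assuming the hash of a window is simply the sum of its character codes; the class and method names are invented for the example.

```java
final class AdditiveRollingHash {
    // Sum of the character codes in s[from .. from+len-1].
    static int sum(String s, int from, int len) {
        int h = 0;
        for (int k = 0; k < len; ++k) {
            h += s.charAt(from + k);
        }
        return h;
    }

    // Slides the window one step right: subtract the character that leaves,
    // add the character that enters. Constant time regardless of window size.
    static int roll(int h, char leaving, char entering) {
        return h - leaving + entering;
    }

    public static void main(String[] args) {
        String s = "abcde";
        int ab = sum(s, 0, 2);                       // 'a' + 'b' = 97 + 98 = 195
        int bc = roll(ab, s.charAt(0), s.charAt(2)); // 195 - 97 + 99 = 197
        System.out.println(bc == sum(s, 1, 2));      // prints true
    }
}
```

The weakness of this hash is that any rearrangement of the same characters collides: "ab" and "ba" have the same sum, so the full comparison runs more often than it would with a stronger hash.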
For example, treating each window as a base-256 number reduced modulo the prime 101:

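hash("hi") = [(104 × 256) % 101 + 105] % 101 = 65
(the ASCII code of 'h' is 104 and of 'i' is 105)

// ASCII: a = 97, b = 98, r = 114
hash("abr") = [ ( [ ( [ (97 × 256) % 101 + 98 ] % 101 ) × 256 ] % 101 ) + 114 ] % 101 = 4

// Rolling from "abr" to "bra": drop the leading 'a', append a new 'a'
hash("bra") = [ ( 4 + 101 - 97 × [(256 % 101) × 256] % 101 ) × 256 + 97 ] % 101 = 30
//   4                            old hash, hash("abr")
//   + 101                        keeps the intermediate value from going negative
//   97                           the old 'a' being removed
//   [(256 % 101) × 256] % 101    offset of the leftmost character (256^2 mod 101)
//   × 256                        base shift
//   + 97                         the new 'a' being appended
//   % 101                        prime modulus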
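
The arithmetic above can be checked with a short program. This is a sketch under the same assumptions as the worked example (base 256, modulus 101); the class and method names are hypothetical.

```java
final class RollingHashCheck {
    static final int BASE = 256;
    static final int MOD = 101;

    // Direct hash of the whole string.
    static int hash(String s) {
        int h = 0;
        for (int k = 0; k < s.length(); ++k) {
            h = (h * BASE + s.charAt(k)) % MOD;
        }
        return h;
    }

    // Removes `leaving` from the front of a window of length len and
    // appends `entering` at the back.
    static int roll(int h, char leaving, char entering, int len) {
        // BASE^(len-1) % MOD. Recomputed here for clarity only; a real
        // search would compute it once, outside the sliding loop.
        int highWeight = 1;
        for (int k = 0; k < len - 1; ++k) {
            highWeight = highWeight * BASE % MOD;
        }
        int removed = leaving % MOD * highWeight % MOD;
        // Adding MOD before subtracting keeps the value non-negative.
        return ((h + MOD - removed) * BASE + entering) % MOD;
    }

    public static void main(String[] args) {
        System.out.println(hash("hi"));           // prints 65
        System.out.println(hash("abr"));          // prints 4
        System.out.println(roll(4, 'a', 'a', 3)); // prints 30
        System.out.println(hash("bra"));          // prints 30
    }
}
```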

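If this is hard to follow, think about how we would compute the value of the number 101 when reading it digit by digit.
First take 1; then take 0, giving 1 × 10 + 0 = 10; then take 1, giving 10 × 10 + 1 = 101.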
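
The same digit-by-digit accumulation can be written as a loop. This is only an illustration; the class name HornerDemo and its helper are made up. With base 10 it rebuilds the number 101, and with base 256 followed by a modulus it has the same shape as hashing a string directly.

```java
final class HornerDemo {
    // Accumulates digits left to right: value = value * base + digit.
    static int fromDigits(int[] digits, int base) {
        int value = 0;
        for (int d : digits) {
            value = value * base + d;
        }
        return value;
    }

    public static void main(String[] args) {
        // 1 -> 1 * 10 + 0 = 10 -> 10 * 10 + 1 = 101
        System.out.println(fromDigits(new int[] {1, 0, 1}, 10)); // prints 101
        // Characters play the role of digits in base 256: 'h' = 104, 'i' = 105.
        System.out.println(fromDigits(new int[] {104, 105}, 256) % 101); // prints 65
    }
}
```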

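The final solution code for this problem is therefore as follows:

The function hashString is overloaded here: the version with fewer parameters computes the hash of a string directly, while the version with more parameters computes the rolling hash.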

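class Solution {
    private static final int BASE = 256; // radix of the polynomial hash
    private static final int MOD = 101;  // small prime modulus

    public int strStr(String haystack, String needle) {
        int length = needle.length();
        if (length > haystack.length()) return -1;
        // Also covers the empty needle, which matches at index 0.
        if (haystack.startsWith(needle)) return 0;

        // BASE^(length-1) % MOD, the weight of the window's leftmost character.
        // Computed once so that each slide below costs O(1).
        int highWeight = 1;
        for (int i = 0; i < length - 1; ++i) {
            highWeight = highWeight * BASE % MOD;
        }

        int res = hashString(haystack, length);
        int ansHash = hashString(needle, length);
        for (int i = 0; i < haystack.length() - length; ++i) {
            // Slide the window from [i, i+length) to [i+1, i+1+length).
            res = hashString(haystack, i + length, length, highWeight, res);
            // Compare the characters only when the hashes agree.
            if (res == ansHash && haystack.startsWith(needle, i + 1)) {
                return i + 1;
            }
        }
        return -1;
    }

    // Rolling hash: drop s[index - length] from the front, append s[index].
    public int hashString(String s, int index, int length, int highWeight, int res) {
        // Reduce the character first so the subtraction below cannot go negative.
        int removed = s.charAt(index - length) % MOD * highWeight % MOD;
        res = (res + MOD - removed) * BASE + s.charAt(index);
        return res % MOD;
    }

    // Direct hash of the first `length` characters of s.
    public int hashString(String s, int length) {
        int result = 0;
        for (int i = 0; i < length; ++i) {
            result = ((result * BASE) % MOD + s.charAt(i)) % MOD;
        }
        return result;
    }
}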