LeetCode OJ - Implement strStr()

Latest recommended article published on 2016-09-28 17:25:33

姥姥教我学编程

Category: Basics

Article link: https://blog.csdn.net/afxstar/article/details/38173379

Implement strStr().

Returns a pointer to the first occurrence of needle in haystack, or null if needle is not part of haystack.

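Analysis: this is string pattern matching, so KMP comes to mind. Still, let's implement the naive algorithm first.

Approach 1: try every starting position in haystack in turn. This ignores the relationship between the lengths of haystack and needle, so it can do a lot of wasted work.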

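#include <cassert>
#include <cstddef>

char *strStr(char *haystack, char *needle) {
    assert(haystack && needle);

    // An empty needle matches at the start, even when haystack is empty too.
    if(*needle == '\0') return haystack;

    char *s, *d, *p;
    for(p = haystack; *p != '\0'; p++) {
        s = p;       // cursor in haystack
        d = needle;  // cursor in needle
        // *s != *d also stops the scan when haystack ends first.
        while(*d != '\0') {
            if(*s != *d) break;
            s++;
            d++;
        }

        // Reached the end of needle: every character matched.
        if(*d == '\0') return p;
    }

    return NULL;
}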

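Approach 2: take the two lengths into account to reduce the number of comparisons. This code is accepted, but it is still inefficient: the worst case remains proportional to the product of the two lengths.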

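#include <cassert>
#include <cstring>

class Solution {
public:
    char *strStr(char *haystack, char *needle) {
        assert(haystack && needle);

        int len1 = (int)strlen(haystack);
        int len2 = (int)strlen(needle);

        // Only start positions that leave room for the whole needle are tried.
        // If needle is longer than haystack, the loop body never runs.
        for(int i = 0; i <= len1 - len2; i++) {
            char *p = haystack + i;
            char *q = needle;

            while(*q != '\0') {
                if(*p != *q) break;

                p++;
                q++;
            }

            // Reached the end of needle: match starts at index i.
            if(*q == '\0') return haystack + i;
        }

        return NULL;
    }
};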
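
To see why Approach 2 is still slow, here is a minimal sketch that counts character comparisons made by the same double loop. The function name countNaiveComparisons is hypothetical (it is not part of the original solution), and it only mirrors the loop structure above under that assumption.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: counts the character comparisons the naive search
// performs until it finds the first match or runs out of start positions.
long countNaiveComparisons(const char *haystack, const char *needle) {
    assert(haystack && needle);
    const int len1 = static_cast<int>(std::strlen(haystack));
    const int len2 = static_cast<int>(std::strlen(needle));

    long count = 0;
    for (int i = 0; i <= len1 - len2; i++) {
        int j = 0;
        while (j < len2) {
            count++;  // one character comparison
            if (haystack[i + j] != needle[j]) break;
            j++;
        }
        if (j == len2) return count;  // full match found at index i
    }
    return count;
}
```

With haystack "aaaaaaaa" and needle "aaab", each of the 5 start positions costs 4 comparisons (3 matches and 1 mismatch), 20 in total. KMP avoids exactly this repeated re-reading of the haystack.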

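Approach 3: use the KMP algorithm. In essence it improves on the naive matching algorithm by skipping the alignments that cannot possibly match.

References: http://billhoo.blog.51cto.com/2337751/411486 and http://www.cnblogs.com/goagent/archive/2013/05/16/3068442.html

Most discussions of KMP are too laborious and not intuitive enough. Let the target string be T and the pattern be P, with i indexing into T and j indexing into P.

1. When a mismatch occurs at position j of P, we can pick some position before j and continue comparing against position i (only j falls back; i never does).

2. Let X be the part of P already matched before j. If a proper prefix of X equals a proper suffix of X, then pos = strlen(that proper prefix) is a candidate for the next j (the fallback position).

3. If a longer proper prefix equal to a proper suffix can be found, it gives a larger pos, and that is the next j we should actually use.

With this intuition in place, the theory becomes much easier to analyse.

NextJ is simply the position j should fall back to when a mismatch occurs at j (after falling back, comparison with i continues). With that said, we can go straight to the code.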
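
To make points 2 and 3 concrete, here is a minimal brute-force sketch that finds the length of the longest proper prefix of X that is also a proper suffix of X. The name longestBorder is hypothetical, and this quadratic version is for illustration only; it is not how KMP computes the value.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: length of the longest proper prefix of x[0..len-1]
// that is also a proper suffix. Brute force, for illustration only.
int longestBorder(const char *x, int len) {
    assert(x && len >= 0);
    // Try the longest candidate first; "proper" means shorter than len.
    for (int pos = len - 1; pos > 0; pos--) {
        // Compare prefix x[0..pos-1] with suffix x[len-pos..len-1].
        if (std::memcmp(x, x + (len - pos), pos) == 0) return pos;
    }
    return 0;  // no non-empty proper prefix equals a proper suffix
}
```

For X = "abab" this returns 2, so after a mismatch that follows a matched "abab", the next j to try is 2.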

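    // next[0] = -1; for j >= 1, next[j] is the length of the longest proper
    // prefix of needle[0..j-1] that is also its proper suffix.
    void getNext(const char *needle, int n, int *next) {
        next[0] = -1;
        for(int j = 1; j < n; j++) {
            int k = next[j - 1];
            // Stop at k == -1 so that needle[-1] is never read.
            while(k >= 0 && needle[k] != needle[j - 1]) {
                k = next[k];
            }
            // k == -1 means no prefix can be extended, giving next[j] = 0.
            next[j] = k + 1;
        }
    }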

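The code above computes the NextJ table iteratively, reasoning from an assumption. Suppose next[j] = k; then P can be written as P[0..k-1] P[k] ... P[j-k..j-1] P[j], where the indices show that the proper prefix P[0..k-1] equals the proper suffix P[j-k..j-1]. If P[k] = P[j], then next[j+1] = k + 1. If P[k] != P[j], keep falling back with k = next[k] until the two characters are equal; if k reaches -1 first, next[j+1] = 0.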
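
As a worked check of this recurrence, the sketch below builds the table into a std::vector so it can be compared against hand-computed values. The name buildNext and the std::string interface are my own assumptions, not part of the original code; the convention is next[0] = -1.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): returns the table where
// next[0] = -1 and next[j] is the length of the longest proper prefix of
// p[0..j-1] that is also its proper suffix.
std::vector<int> buildNext(const std::string &p) {
    assert(!p.empty());
    std::vector<int> next(p.size());
    next[0] = -1;
    for (std::size_t j = 1; j < p.size(); j++) {
        int k = next[j - 1];
        // Fall back through shorter candidates until p[k] extends one,
        // or none is left (k == -1).
        while (k >= 0 && p[k] != p[j - 1]) k = next[k];
        next[j] = k + 1;
    }
    return next;
}
```

For P = "ababc" the table is -1 0 0 1 2: a mismatch at j = 4 (after matching "abab") falls back to j = 2, because "ab" is both a proper prefix and a proper suffix of "abab".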

#include <cassert>
#include <cstring>
#include <vector>

class Solution {
public:
    // next[0] = -1; for j >= 1, next[j] is the length of the longest proper
    // prefix of needle[0..j-1] that is also its proper suffix.
    void getNext(const char *needle, int n, int *next) {
        next[0] = -1;
        for(int j = 1; j < n; j++) {
            int k = next[j - 1];
            // Stop at k == -1 so that needle[-1] is never read.
            while(k >= 0 && needle[k] != needle[j - 1]) {
                k = next[k];
            }
            next[j] = k + 1;
        }
    }

    char *strStr(char *haystack, char *needle) {
        assert(haystack && needle);
        int len1 = (int)strlen(haystack);
        int len2 = (int)strlen(needle);

        // An empty needle matches at the start, even in an empty haystack.
        if(len2 == 0) return haystack;
        if(len1 < len2) return NULL;

        // Compute nextJ (the vector frees itself, unlike a bare new[])
        std::vector<int> next(len2);
        getNext(needle, len2, next.data());

        // Pattern matching: only j falls back, i never moves backwards
        int i = 0, j = 0;
        while(i < len1 && j < len2) {
            if(j == -1 || haystack[i] == needle[j]) {
                i++;
                j++;
            } else {
                j = next[j];
            }
        }

        if(j == len2) {
            return haystack + (i - j);
        } else {
            return NULL;
        }
    }
};
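
For quick experiments outside the LeetCode harness, the same matching loop can be sketched as a free function that returns an index instead of a pointer. The name kmpFind and the index-returning interface are assumptions of mine, not part of the original solution; the table uses the next[0] = -1 convention.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Hypothetical index-based variant: returns the first index of needle in
// haystack, or -1 if absent. An empty needle matches at index 0.
int kmpFind(const char *haystack, const char *needle) {
    assert(haystack && needle);
    const int n = static_cast<int>(std::strlen(haystack));
    const int m = static_cast<int>(std::strlen(needle));
    if (m == 0) return 0;
    if (n < m) return -1;

    // next[0] = -1; next[j] = length of the longest proper prefix of
    // needle[0..j-1] that is also its proper suffix.
    std::vector<int> next(m);
    next[0] = -1;
    for (int j = 1; j < m; j++) {
        int k = next[j - 1];
        while (k >= 0 && needle[k] != needle[j - 1]) k = next[k];
        next[j] = k + 1;
    }

    int i = 0, j = 0;
    while (i < n && j < m) {
        if (j == -1 || haystack[i] == needle[j]) {
            i++;
            j++;
        } else {
            j = next[j];  // only j falls back; i never moves backwards
        }
    }
    return j == m ? i - j : -1;
}
```

For example, kmpFind("abababc", "ababc") returns 2: the mismatch at j = 4 falls back to j = 2 and matching resumes without rewinding i.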

The KMP algorithm can be improved further: