Problem: given String haystack = "abcabaabaabcacb" and String needle = "abaabcac", find the substring of haystack that matches needle and return the index in haystack where the match starts.
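The KMP algorithm: the core of KMP is the next[] array. For each position j >= 1, next[j] is the length of the longest proper prefix of needle[0..j-1] that is also a suffix of it; next[0] is set to -1 by convention. For example, for needle = "abaabcac" the next array is: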
needle | a | b | a | a | b | c | a | c |
j | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
next[j] | -1 | 0 | 0 | 1 | 1 | 2 | 0 | 1 |
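The table above can be checked directly against the definition. The following is a minimal brute-force sketch, not the efficient construction: the class name NextByDefinition and the method nextTable are hypothetical names introduced here for illustration. For each j it tries candidate border lengths from longest to shortest, so it runs in roughly cubic time and is only meant for verifying small examples.

```java
import java.util.Arrays;

// Hypothetical helper class for illustration; not part of the original post.
class NextByDefinition {

    // next[0] = -1 by convention; for j >= 1, next[j] is the length of the
    // longest proper prefix of needle[0..j-1] that is also its suffix.
    static int[] nextTable(String needle) {
        int[] next = new int[needle.length()];
        for (int j = 0; j < needle.length(); j++) {
            if (j == 0) {
                next[0] = -1;
                continue;
            }
            String head = needle.substring(0, j);
            // Try candidate lengths from longest (j - 1) down to 0.
            for (int len = j - 1; len >= 0; len--) {
                // head.substring(j - len) is the suffix of head with length len.
                if (head.startsWith(head.substring(j - len))) {
                    next[j] = len;
                    break;
                }
            }
        }
        return next;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(nextTable("abaabcac")));
        // prints [-1, 0, 0, 1, 1, 2, 0, 1]
    }
}
```

The output agrees with the next[j] row of the table, which suggests the table follows the "longest proper prefix that is also a suffix" reading of the definition.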
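With the next array computed, matching begins. Let i be the current index into haystack and j the current index into needle. If haystack.charAt(i) equals needle.charAt(j), advance both (i++, j++). When a mismatch occurs at position j, set j = next[j] and leave i unchanged, then compare again. If j becomes -1 (that is, even needle.charAt(0) did not match), advance both i and j so that j returns to 0 and i moves to the next character of haystack, then continue comparing character by character. A match is found once j reaches needle.length(), and it starts at index i - j.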
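To make the fallback rule concrete, here is a small sketch that runs the matching loop on the example strings with the next table hardcoded from the table above. The class KmpTrace, the method match, and the "i:j->next[j]" log format are assumptions introduced here for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class for illustration; not part of the original post.
class KmpTrace {

    // Runs the matching loop and records every fallback as "i:j->next[j]".
    // Returns the start index of the first match, or -1 if there is none.
    static int match(String haystack, String needle, int[] next, List<String> fallbacks) {
        int i = 0;
        int j = 0;
        while (i < haystack.length() && j < needle.length()) {
            if (j == -1 || haystack.charAt(i) == needle.charAt(j)) {
                // Either the characters match, or j fell off the front of needle.
                i++;
                j++;
            } else {
                // Mismatch: i stays put, j falls back to next[j].
                fallbacks.add(i + ":" + j + "->" + next[j]);
                j = next[j];
            }
        }
        return j == needle.length() ? i - j : -1;
    }

    public static void main(String[] args) {
        int[] next = {-1, 0, 0, 1, 1, 2, 0, 1}; // table for "abaabcac"
        List<String> fallbacks = new ArrayList<>();
        int pos = match("abcabaabaabcacb", "abaabcac", next, fallbacks);
        System.out.println(pos);       // prints 6
        System.out.println(fallbacks); // prints [2:2->0, 2:0->-1, 8:5->2]
    }
}
```

Only three fallbacks occur on this input. The last one, at i = 8, reuses the already matched "ab" by jumping j from 5 to 2, so i never moves backwards.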
Java implementation:
    package cn.wanggeng;

    public class KmpSolution {

        // Build the next array: next[0] = -1; for j >= 1, next[j] is the length of
        // the longest proper prefix of needle[0..j-1] that is also its suffix.
        // Assumes needle is non-empty.
        public int[] getNext(String needle) {
            int len = needle.length();
            int[] next = new int[len];
            next[0] = -1;
            // next[1] is 0 by default, so filling starts at j = 2.
            for (int j = 2; j < len; j++) {
                int k = next[j - 1];
                // Follow the fallback chain until needle[k] extends a border
                // or the chain is exhausted (k == -1).
                while (k != -1 && needle.charAt(k) != needle.charAt(j - 1)) {
                    k = next[k];
                }
                next[j] = k + 1; // k == -1 gives 0
            }
            return next;
        }

        // Return the index of the first occurrence of needle in haystack, or -1.
        public int strStr(String haystack, String needle) {
            if (needle.length() == 0) {
                return 0;
            }
            if (haystack.length() == 0) {
                return -1;
            }
            char[] hayArr = haystack.toCharArray();
            char[] needArr = needle.toCharArray();
            int[] next = getNext(needle);
            int i = 0;
            int j = 0;
            while (i < hayArr.length && j < needArr.length) {
                if (j == -1 || hayArr[i] == needArr[j]) {
                    i++;
                    j++;
                } else {
                    j = next[j]; // mismatch: i stays, j falls back
                }
            }
            return j == needArr.length ? i - j : -1;
        }

        // Entry point
        public static void main(String[] args) {
            KmpSolution s = new KmpSolution();
            int m = s.strStr("abcabaabaabcacb", "abaabcac");
            System.out.println(m); // prints 6
        }
    }