BF and KMP String Matching

不蛋定

Category: Algorithms. Tags: BF, KMP, Java, string matching

Article link: https://blog.csdn.net/yc______/article/details/97393203

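The brute-force (BF) algorithm is the plain pattern-matching algorithm. It starts by comparing the first character of the target string S with the first character of the pattern T. If they are equal, it goes on to compare the second character of S with the second character of T; if they differ, it compares the second character of S with the first character of T. In other words, after every mismatch the pattern shifts one position to the right and comparison restarts from the beginning of T. This continues until T has been matched in full or S is exhausted.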

package com.yc.algorithm.string;

/**
 * BF (brute-force) pattern matching
 * @author yc
 */
public class BruteForce {
    public static void main(String[] args) {
        int res = bf("qwer", "er");
        if (-1 == res) {
            System.out.println("No match found!");
        } else {
            System.out.println("Match starts at index [" + res + "]");
        }
    }

    private static int bf(String str, String target) {
        char[] chars = str.toCharArray();
        char[] tars = target.toCharArray();
        int i = 0; // index of the current element in chars
        int j = 0; // index of the current element in tars

        while (i < chars.length && j < tars.length) {
            if (chars[i] == tars[j]) { // chars[i] equals tars[j]: compare the next pair
                i++;
                j++;
            } else { // mismatch: move i to one past where this attempt started, reset j to 0
                i = i - j + 1;
                j = 0;
            }
        }
        if (j >= tars.length) { // the whole pattern was matched: return its start index
            return i - j;
        } else { // no match
            return -1;
        }
    }
}

Running it reports that the match starts at index 2.

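Drawbacks:

1. BF learns nothing from a mismatch. When searching "abcdefg" for "def", for example, the attempt at 'a' fails on its first character, and all the algorithm can do is shift the pattern by one position and start over at 'b', then again at 'c'.

2. After every failed attempt j goes back to index 0 and the main string backtracks as well (i = i - j + 1). This backtracking hurts efficiency, because some of the comparisons made afterwards repeat characters that were already examined and are unnecessary.

In short, simply discarding the information gained from earlier comparisons wastes a great deal of work and makes matching inefficient: for a main string of length n and a pattern of length m, the worst case is O(n·m) comparisons.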
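To make the cost concrete, here is a minimal sketch that counts the character comparisons made by the BF loop described above. The class name `BfCost`, the helpers `countComparisons` and `repeat`, and the test input (a long run of 'a' searched for a pattern that fails only on its last character) are assumptions chosen for illustration; they are not part of the original code.

```java
import java.util.Arrays;

// Illustrative sketch: names and inputs are hypothetical, not from the original post.
class BfCost {
    /** Runs the BF matching loop and returns how many character comparisons it made. */
    static long countComparisons(String text, String pattern) {
        int i = 0;
        int j = 0;
        long comparisons = 0;
        while (i < text.length() && j < pattern.length()) {
            comparisons++;
            if (text.charAt(i) == pattern.charAt(j)) {
                i++;
                j++;
            } else {
                i = i - j + 1; // the text index backs up
                j = 0;         // the pattern index restarts from 0
            }
        }
        return comparisons;
    }

    /** Builds a string of n copies of c (avoids String.repeat, which needs Java 11). */
    static String repeat(char c, int n) {
        char[] buf = new char[n];
        Arrays.fill(buf, c);
        return new String(buf);
    }

    public static void main(String[] args) {
        String text = repeat('a', 1000);        // assumed input: 1000 times 'a'
        String pattern = repeat('a', 9) + "b";  // fails only on the last character
        System.out.println("comparisons: " + countComparisons(text, pattern));
    }
}
```

On inputs like this one the count grows roughly as n × m, because almost every alignment re-reads nearly the whole pattern before failing.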

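This is what motivates the KMP algorithm. It builds on the approach above by adding a next array, which stores the position j should fall back to after a mismatch, so that j no longer returns to index 0 every time and i never moves backward.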

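Computing the next array is the tricky part. In essence, for each position j it finds the longest proper prefix of the pattern that is also a suffix of the part matched so far (the characters before position j). Be sure to trace this step by hand at least once; otherwise it is very hard to understand!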
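As an aid for tracing by hand, the following sketch computes the next table directly from its definition, checking every candidate prefix length. It is deliberately slow and is only meant as a reference to compare against; the class name `NextByDefinition` and the method `plainNext` are hypothetical. It produces the plain (unoptimized) table, with next[0] set to -1 as a sentinel.

```java
import java.util.Arrays;

// Illustrative sketch: a brute-force reference, not an efficient construction.
class NextByDefinition {
    /**
     * next[0] = -1; for i > 0, next[i] is the length of the longest proper
     * prefix of p[0..i) that is also a suffix of p[0..i).
     */
    static int[] plainNext(String p) {
        int[] next = new int[p.length()];
        for (int i = 0; i < p.length(); i++) {
            if (i == 0) {
                next[i] = -1; // sentinel: nothing has been matched yet
                continue;
            }
            next[i] = 0;
            // Try the longest candidate first; "proper" means len < i.
            for (int len = i - 1; len > 0; len--) {
                String prefix = p.substring(0, len);
                String suffix = p.substring(i - len, i);
                if (prefix.equals(suffix)) {
                    next[i] = len;
                    break;
                }
            }
        }
        return next;
    }

    public static void main(String[] args) {
        // prints [-1, 0, 0, 0, 1, 2, 3, 4, 5, 0, 0, 1]
        System.out.println(Arrays.toString(plainNext("abcabcabdcab")));
    }
}
```

For example, next[8] is 5 because the first eight characters "abcabcab" start and end with "abcab".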

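package com.yc.algorithm.string;

/**
 * KMP
 * @author yc
 */
public class KMP {
    public static void main(String[] args) {
        int res = kmp("qwer", "er");
        if (-1 == res) {
            System.out.println("No match found!");
        } else {
            System.out.println("Match starts at index [" + res + "]");
        }
    }

    private static int kmp(String str, String tars) {
        char[] cstr = str.toCharArray();
        char[] ctars = tars.toCharArray();
        if (ctars.length == 0) { // an empty pattern matches at index 0, as in the BF version
            return 0;
        }
        int[] next = getNext(ctars);
        int i = 0; // index into the main string; never moves backward
        int j = 0; // index into the pattern

        while (i < cstr.length && j < ctars.length) {
            if (j == -1 || cstr[i] == ctars[j]) { // j == -1: restart the pattern at the next i
                ++i;
                ++j;
            } else { // mismatch: only j falls back, to the position stored in next
                j = next[j];
            }
        }

        if (j >= ctars.length) { // the whole pattern was matched: return its start index
            return i - j;
        } else {
            return -1;
        }
    }
    /**
     * Builds the optimized next table (often called nextval).
     *
     *  chars       a   b   c   a   b   c   a   b   d   c   a   b
     *  i           0   1   2   3   4   5   6   7   8   9   10  11
     *  plain next  -1  0   0   0   1   2   3   4   5   0   0   1
     *  returned    -1  0   0   -1  0   0   -1  0   5   0   -1  0
     *
     * "plain next" is the longest prefix/suffix length. "returned" is what this
     * method computes: when chars[i] == chars[j], falling back to j would fail
     * on the same character again, so next[j] is stored instead.
     * @param chars the pattern; must not be empty
     * @return the next table; -1 means advance i and restart the pattern at 0
     */
    private static int[] getNext(char[] chars) {
        int[] next = new int[chars.length];
        next[0] = -1;
        int i = 0;
        int j = -1;

        while (i < chars.length - 1) {
            if (j == -1 || chars[i] == chars[j]) {
                i++;
                j++;
                if (chars[i] != chars[j]) {
                    next[i] = j;
                } else {
                    next[i] = next[j];
                }
            } else {
                j = next[j];
            }
        }
        return next;
    }
}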

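The output is the same as with the BF algorithm, but matching is much more efficient: because i never moves backward, the worst case drops from O(n·m) to O(n + m) for a main string of length n and a pattern of length m.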
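One way to check the efficiency claim is to count character comparisons during the search. The sketch below does this for KMP on an input that is bad for BF (1000 times 'a', searched for nine 'a' followed by 'b'). The names `KmpCost`, `buildNext`, `countComparisons` and `repeat` are hypothetical, and for simplicity it uses the plain next table rather than the optimized one from the listing above, so treat it as an illustration under those assumptions.

```java
import java.util.Arrays;

// Illustrative sketch: counts comparisons in a KMP search with the plain next table.
class KmpCost {
    /** Plain next table with next[0] = -1 (no nextval optimization). */
    static int[] buildNext(String p) {
        int[] next = new int[p.length()];
        if (p.isEmpty()) {
            return next;
        }
        next[0] = -1;
        int i = 0;
        int j = -1;
        while (i < p.length() - 1) {
            if (j == -1 || p.charAt(i) == p.charAt(j)) {
                i++;
                j++;
                next[i] = j;
            } else {
                j = next[j];
            }
        }
        return next;
    }

    /** Returns the number of text/pattern character comparisons made while searching. */
    static long countComparisons(String text, String pattern) {
        int[] next = buildNext(pattern);
        int i = 0;
        int j = 0;
        long comparisons = 0;
        while (i < text.length() && j < pattern.length()) {
            if (j == -1) { // sentinel: move on in the text, restart the pattern
                i++;
                j++;
                continue;
            }
            comparisons++;
            if (text.charAt(i) == pattern.charAt(j)) {
                i++;
                j++;
            } else {
                j = next[j]; // only the pattern index falls back
            }
        }
        return comparisons;
    }

    /** Builds a string of n copies of c (avoids String.repeat, which needs Java 11). */
    static String repeat(char c, int n) {
        char[] buf = new char[n];
        Arrays.fill(buf, c);
        return new String(buf);
    }

    public static void main(String[] args) {
        String text = repeat('a', 1000);
        String pattern = repeat('a', 9) + "b";
        System.out.println("KMP comparisons: " + countComparisons(text, pattern));
    }
}
```

The count stays below 2n for a text of length n, whereas a BF scan of the same input performs close to n × m comparisons.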