Data Structures Review, Part 4

林距离的孤独

Last modified: 2024-05-01 12:53:54

Tags: data structures, java, algorithms

First published: 2024-05-01 12:47:08

Article link: https://blog.csdn.net/todoLove/article/details/138367865

String Matching Algorithms

Brute-Force Matching

Two nested for loops, no thinking required.

    // Returns the index where son first occurs in father, or -1 if son is not a substring
    public static int getIndexOf1(String father, String son) {
        if (father == null || son == null || father.length() < son.length())
            return -1;
        int len1 = father.length();
        int len2 = son.length();
        char[] fa = father.toCharArray();
        char[] so = son.toCharArray();
        for (int i = 0; i <= len1 - len2; i++) {// <= so the last possible start position is also tried
            boolean isSame = true;
            for (int j = 0; j < len2; j++) {
                if (fa[i + j] != so[j]) {
                    isSame = false;
                    break;
                }
            }
            if (isSame)
                return i;
        }
        return -1;
    }
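A quick sanity check on the loop bound, as a minimal sketch: the last start index worth trying is father.length() - son.length(), so the outer loop has to include it, otherwise a match sitting at the very end of the parent string is missed. The class name BruteForceDemo and the method indexOfNaive are made up for this illustration.

```java
class BruteForceDemo {
    // Hypothetical helper: naive search with an inclusive upper bound on the start index.
    static int indexOfNaive(String text, String pattern) {
        if (text == null || pattern == null || text.length() < pattern.length()) {
            return -1;
        }
        int n = text.length();
        int m = pattern.length();
        for (int i = 0; i <= n - m; i++) { // "<=" so the final window is tried too
            int j = 0;
            while (j < m && text.charAt(i + j) == pattern.charAt(j)) {
                j++;
            }
            if (j == m) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(indexOfNaive("abcde", "de"));    // 3: match in the final window
        System.out.println(indexOfNaive("abcde", "abcde")); // 0: pattern equals the whole text
        System.out.println(indexOfNaive("abcde", "xy"));    // -1: no match
    }
}
```

With a strict "<" in the outer loop, the first two calls would return -1 instead.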

KMP Matching

A few small questions:

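Suppose the current attempt started matching at position i of the parent string, it failed at parent position x, and j is where the longest common prefix (the longest prefix of the child string that is also a suffix of the part matched so far) begins in the parent string. Why does KMP dare to restart directly from j? Isn't it worried about skipping a valid start?

Reason: no. Suppose some position k between i and j could match all the way up to x. Then the stretch from k to x-1 would itself be a prefix of the child string and a suffix of the matched part, and it would be longer than the one starting at j, so the longest prefix would have been computed wrong. As long as the longest prefix is computed correctly every time, there is no reason not to start from j.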

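Suppose the child string fails at position y. Why can we jump straight to the position stored in the next array and continue matching from there? Why can the characters before it be skipped?

Reason: the longest common prefix already guarantees it. The first next[y] characters of the child string are equal to the last next[y] characters that were just matched, so there is nothing to re-check.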

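When computing the next array, if the character just before the current position (str[i - 1]) equals the character right after the previous longest prefix (str[cn]), the length is simply incremented by one. Couldn't that come up short? And if they differ, we go looking inside that longest prefix.

Reason: the plus one is easy to see. Why can't it be too small? Same reason as in the first question: if a longer one existed here, dropping its last character would give a longer prefix for the previous position, which means the earlier value was computed wrong.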

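Why is it fine, on a mismatch, to jump straight into the previous longest prefix?

Reason: longest prefix, again! If there were a longer candidate, the previous longest prefix would have been computed wrong. Think of it as nesting dolls, much like dp: the earlier problem is a subproblem of the current one.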
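To make the "longest prefix" in the answers above concrete, here is a small sketch that computes the table straight from its definition: next[i] is the length of the longest proper prefix of str[0..i-1] that is also a suffix of it, with next[0] = -1 by convention. It is deliberately slow and only meant as a reference to compare a faster version against; the names NextByDefinition and naiveNext are my own, not part of the original code.

```java
import java.util.Arrays;

class NextByDefinition {
    // Hypothetical reference implementation, cubic time in the worst case.
    static int[] naiveNext(String p) {
        int[] next = new int[p.length()];
        for (int i = 0; i < p.length(); i++) {
            if (i == 0) {
                next[i] = -1; // convention: nothing to fall back to at position 0
                continue;
            }
            int best = 0;
            // Try the longest candidate first; a proper prefix has at most i - 1 characters.
            for (int len = i - 1; len >= 1; len--) {
                if (p.substring(0, len).equals(p.substring(i - len, i))) {
                    best = len;
                    break;
                }
            }
            next[i] = best;
        }
        return next;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(naiveNext("abcabd"))); // [-1, 0, 0, 0, 1, 2]
        System.out.println(Arrays.toString(naiveNext("aaaa")));   // [-1, 0, 1, 2]
    }
}
```

For "abcabd", the entry 2 at index 5 says that "ab" is both a prefix and a suffix of "abcab", which is exactly what lets the matcher skip re-checking those two characters.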

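    // KMP algorithm
    // 1. Build the next array (the core step). Empty strings are excluded by default; pass one in and it will throw.
    public static int[] getNextArray(char[] str) {
        // Only one element, nothing to compute
        if (str.length == 1)
            return new int[]{-1};
        // The next three lines just fill in the fixed entries
        int[] next = new int[str.length];
        next[0] = -1;
        next[1] = 0;
        // i: the current position
        int i = 2;
        // cn: the index right after the previous longest prefix (equal to that prefix's length)
        int cn = 0;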
        while (i < next.length) {
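            if (str[i - 1] == str[cn])// the character before position i equals the one right after the previous longest prefix: extend it by one
                next[i++] = ++cn;
            else if (cn > 0)// not equal: retry inside the longest prefix of that prefix
                cn = next[cn];
            else// cn == 0: even the shortest candidate failed, nothing matches, so position i falls back to 0
                next[i++] = 0;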
        }
        return next;
    }

Optimized KMP

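If the character at the failure position is the same as the character at its next position, jumping there just fails again on the same parent character. That case can be optimized further.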

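    // nextval
    public static int[] getNextValArray(char[] str) {
        // Only one element, nothing to compute
        if (str.length == 1)
            return new int[]{-1};
        // The next three lines just fill in the fixed entries
        int[] nextval = new int[str.length];
        nextval[0] = -1;
        nextval[1] = 0;
        // i: the current position
        int i = 2;
        // cn: the index right after the previous longest prefix (equal to that prefix's length)
        int cn = 0;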
        while (i < nextval.length) {
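            if (str[i - 1] == str[cn])// the character before position i equals the one right after the previous longest prefix: extend it by one
                nextval[i++] = ++cn;
            else if (cn > 0)// not equal: retry inside the longest prefix of that prefix
                cn = nextval[cn];
            else// cn == 0: even the shortest candidate failed, nothing matches, so position i falls back to 0
                nextval[i++] = 0;
        }
        // Finally turn next into nextval. I keep this out of the loop above so it stays readable; two O(n) passes are still O(n), and constant factors don't matter here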
        i = 2;
        while (i < nextval.length) {
            while (nextval[i] != -1 && str[i] == str[nextval[i]])
                nextval[i] = nextval[nextval[i]];
            ++i;
        }
        return nextval;
    }
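A small worked example may help here. The sketch below builds the plain next table and then a compressed nextval table for the same pattern, so the two can be compared side by side. The names NextValDemo, buildNext and compress are made up for this illustration, and unlike the two-pass code above, the compression here also covers position 1. It assumes a non-empty pattern.

```java
import java.util.Arrays;

class NextValDemo {
    // Plain next table, same scheme as getNextArray (assumes p is non-empty).
    static int[] buildNext(char[] p) {
        int[] next = new int[p.length];
        next[0] = -1;
        if (p.length == 1) {
            return next;
        }
        next[1] = 0;
        int i = 2;
        int cn = 0;
        while (i < p.length) {
            if (p[i - 1] == p[cn]) {
                next[i++] = ++cn;
            } else if (cn > 0) {
                cn = next[cn];
            } else {
                next[i++] = 0;
            }
        }
        return next;
    }

    // Hypothetical helper: compress next into nextval, left to right.
    static int[] compress(char[] p, int[] next) {
        int[] nextval = next.clone();
        for (int i = 1; i < p.length; i++) {
            int k = next[i];
            // If p[i] == p[k], falling back to k would fail on the same text character,
            // so reuse the already-compressed fallback of k instead.
            if (p[i] == p[k]) {
                nextval[i] = nextval[k];
            }
        }
        return nextval;
    }

    public static void main(String[] args) {
        char[] p = "aaaab".toCharArray();
        int[] next = buildNext(p);
        System.out.println(Arrays.toString(next));              // [-1, 0, 1, 2, 3]
        System.out.println(Arrays.toString(compress(p, next))); // [-1, -1, -1, -1, 3]
    }
}
```

For "aaaab", a mismatch on any of the first four characters would walk 3, 2, 1, 0 with the plain table and fail on 'a' each time; with nextval it gives up in one step. Note that -1 now shows up at positions other than 0.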

The main matching routine:

    // 2. Main matching routine
    public static int getIndexOf2(String s1, String s2) {
        if (s1 == null || s2 == null || s1.length() < s2.length() || s2.length() < 1)
            return -1;
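        // Converting to char arrays just makes this easier to follow for C++ readers
        char[] str1 = s1.toCharArray();
        char[] str2 = s2.toCharArray();
        // Build the next array (the optimized nextval version)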
        int[] next = getNextValArray(str2);
        int x = 0, y = 0;
        while (x < str1.length && y < str2.length) {
            // The two characters match: advance both
            if (str1[x] == str2[y]) {
                ++x;
                ++y;
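            } else if (next[y] == -1) {// no usable fallback: only the parent string advances
                ++x;
                y = 0;// with nextval, -1 can also appear at y > 0, so the child string must restart from 0
            } else// there is a matched prefix to reuse: look up next to see where the child string resumes
                y = next[y];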
        }
        return y == str2.length ? x - y : -1;// y reached the end of the child string: match found; otherwise return -1
    }
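One detail worth spelling out: with the compressed table, -1 is no longer limited to position 0, so when the lookup returns -1 the child index has to go back to 0 while the parent index moves on. Below is a minimal self-contained sketch of that; KmpSearchDemo, buildNextVal and indexOf are illustrative names and this is one possible arrangement, not the only one.

```java
class KmpSearchDemo {
    // Compressed failure table (assumes p is non-empty).
    static int[] buildNextVal(char[] p) {
        int[] t = new int[p.length];
        t[0] = -1;
        if (p.length == 1) {
            return t;
        }
        t[1] = 0;
        int i = 2;
        int cn = 0;
        while (i < p.length) { // first pass: plain next values
            if (p[i - 1] == p[cn]) {
                t[i++] = ++cn;
            } else if (cn > 0) {
                cn = t[cn];
            } else {
                t[i++] = 0;
            }
        }
        for (i = 1; i < p.length; i++) { // second pass: skip fallbacks that repeat the same character
            if (p[i] == p[t[i]]) {
                t[i] = t[t[i]];
            }
        }
        return t;
    }

    static int indexOf(String text, String pattern) {
        if (text == null || pattern == null || pattern.isEmpty() || text.length() < pattern.length()) {
            return -1;
        }
        char[] s = text.toCharArray();
        char[] p = pattern.toCharArray();
        int[] nextval = buildNextVal(p);
        int x = 0;
        int y = 0;
        while (x < s.length && y < p.length) {
            if (s[x] == p[y]) {
                x++;
                y++;
            } else if (nextval[y] == -1) {
                x++;   // nothing in the pattern can line up with s[x]
                y = 0; // so restart the pattern right after it
            } else {
                y = nextval[y];
            }
        }
        return y == p.length ? x - y : -1;
    }

    public static void main(String[] args) {
        System.out.println(indexOf("aabaaa", "aaa"));       // 3
        System.out.println(indexOf("abcabcabd", "abcabd")); // 3
    }
}
```

In the first call the table for "aaa" is [-1, -1, -1]. The mismatch against 'b' happens at y = 2; without resetting y, the search would carry the stale y forward and report index 1.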

A randomized checker that compares the two implementations:

    // Generate a random string of the given length (needs import java.util.Random; at the top of the file)
    public static String generateRandomString(int length) {
        String characters = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
        Random random = new Random();
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            int randomIndex = random.nextInt(characters.length());
            char randomChar = characters.charAt(randomIndex);
            sb.append(randomChar);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int times = 1000000;
        // Run times rounds; stop at the first mismatch and print the offending strings
        System.out.println("Start");
        for (int i = 0; i < times; i++) {
            Random random = new Random();
            int n = random.nextInt(20) + 10;
            int m = random.nextInt(10) + 10;
            // Make sure n is never smaller than m (XOR swap)
            if (n < m) {
                n ^= m;
                m ^= n;
                n ^= m;
            }
            String str = generateRandomString(n);
            String str2 = generateRandomString(m);
            if (getIndexOf1(str, str2) != getIndexOf2(str, str2)) {
                System.out.println("Mismatch, check the algorithm");
                System.out.println("First string: " + str);
                System.out.println("Second string: " + str2);
                return;
            }
            System.out.println("Round " + i + " passed!");
        }
        System.out.println("Done");
    }
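One caveat about random checks like this: with a 62-character alphabet and child strings of at least 10 characters, a random child string almost never occurs in the parent and almost never has a repeated prefix, so both functions mostly just return -1 and the fallback branches are barely exercised. A possible complement, sketched below, is to draw short strings from a tiny alphabet and compare against the JDK's String.indexOf. SmallAlphabetChecker, randomString and firstMismatch are hypothetical names, and the lengths and alphabet size are arbitrary choices.

```java
import java.util.Random;
import java.util.function.BiFunction;

class SmallAlphabetChecker {
    // Random string over the first alphabetSize lowercase letters.
    static String randomString(Random rnd, int length, int alphabetSize) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append((char) ('a' + rnd.nextInt(alphabetSize)));
        }
        return sb.toString();
    }

    // Returns the first {text, pattern} pair on which the two searches disagree, or null if none is found.
    static String[] firstMismatch(BiFunction<String, String, Integer> candidate,
                                  BiFunction<String, String, Integer> reference,
                                  int rounds, long seed) {
        Random rnd = new Random(seed);
        for (int r = 0; r < rounds; r++) {
            String text = randomString(rnd, rnd.nextInt(12) + 1, 2);   // 1..12 characters over {a, b}
            String pattern = randomString(rnd, rnd.nextInt(4) + 1, 2); // 1..4 characters over {a, b}
            if (!candidate.apply(text, pattern).equals(reference.apply(text, pattern))) {
                return new String[]{text, pattern};
            }
        }
        return null;
    }

    public static void main(String[] args) {
        BiFunction<String, String, Integer> reference = (t, p) -> t.indexOf(p);
        // A deliberately broken search that never tries the final window, to show the checker at work.
        BiFunction<String, String, Integer> skipsLastWindow = (t, p) -> {
            for (int i = 0; i < t.length() - p.length(); i++) {
                if (t.startsWith(p, i)) {
                    return i;
                }
            }
            return -1;
        };
        String[] bad = firstMismatch(skipsLastWindow, reference, 10000, 42L);
        System.out.println(bad == null ? "no mismatch found" : "mismatch: " + bad[0] + " / " + bad[1]);
    }
}
```

To check a KMP implementation, pass it as the candidate in place of skipsLastWindow, for example as a method reference to a static method with the signature (String, String) -> int.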

(If I got something wrong, just send me a private message. Please don't flame 孤独.)