KMP

_Shmily

Published 2020-08-24 18:24:22

Category: Basic String Algorithms

Article link: https://blog.csdn.net/zhaoxinxin1234/article/details/89949584

1. KMP:
Time complexity: O(N + M), where N and M are the lengths of the two strings.

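Computing the next array:
(1) Initialize next(1) = j = 0. Assume next(1..i-1) has already been computed; we now compute next(i).
(2) Keep trying to extend the match length j. If the extension fails (the next character differs), set j to next(j), until j becomes 0 (at which point matching restarts from the beginning).
(3) If the extension succeeds, increase the match length j by 1. The value of next(i) is j.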
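
Steps (1) to (3) can be sketched as a small helper. This is only an illustration: the name build_next and the use of std::string and std::vector are assumptions made here, not part of the original template, and a space is prepended so that positions stay 1-based as in the text.

```cpp
#include <string>
#include <vector>

// Returns the next array of s with 1-based positions: nt[i] is the length of
// the longest proper prefix of s[1..i] that is also a suffix of s[1..i].
// nt[0] is unused and left at 0.
std::vector<int> build_next(const std::string& s) {
    int n = static_cast<int>(s.size());
    std::string a = " " + s;            // shift so that a[1] is the first character
    std::vector<int> nt(n + 1, 0);      // step (1): nt[1] = 0
    for (int i = 2, j = 0; i <= n; ++i) {
        while (j > 0 && a[i] != a[j + 1]) j = nt[j];  // step (2): fall back on failure
        if (a[i] == a[j + 1]) ++j;                    // step (3): extend on success
        nt[i] = j;
    }
    return nt;
}
```

For the string "aabaab" this gives next(1..6) = 0, 1, 0, 1, 2, 3.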

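The f array is computed in essentially the same way as the next array: f(i) is the length of the longest prefix of A that matches a suffix of B(1..i).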

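// Is A a substring of B? (a[1..n] is the pattern A, b[1..m] is the text B)
// Step 1: build the next array nt[] of A.
nt[1]=0;
for(int i=2,j=0;i<=n;i++)
{
    while(j>0&&a[i]!=a[j+1]) j=nt[j];   // extension failed: fall back
    if(a[i]==a[j+1]) j++;               // extension succeeded
    nt[i]=j;
}

// Step 2: match A against B and fill f[]; sum counts the occurrences of A in B.
for(int i=1,j=0;i<=m;i++)
{
    while(j>0&&(j==n||b[i]!=a[j+1])) j=nt[j];
    if(b[i]==a[j+1]) j++;
    f[i]=j;
    if(f[i]==n) sum++;                  // an occurrence of A ends at position i of B
}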
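
The two loops above can be wrapped into one self-contained function for experimenting. This is a sketch under assumptions: the name count_occurrences and the std::string interface are introduced here for illustration, and overlapping occurrences are counted, as in the loop above.

```cpp
#include <string>
#include <vector>

// Counts the (possibly overlapping) occurrences of pattern in text using KMP.
// Positions are 1-based internally, matching the template above.
int count_occurrences(const std::string& pattern, const std::string& text) {
    int n = static_cast<int>(pattern.size());
    int m = static_cast<int>(text.size());
    if (n == 0) return 0;               // assumption: an empty pattern counts as no match
    std::string a = " " + pattern, b = " " + text;

    // Build the next array of the pattern.
    std::vector<int> nt(n + 1, 0);
    for (int i = 2, j = 0; i <= n; ++i) {
        while (j > 0 && a[i] != a[j + 1]) j = nt[j];
        if (a[i] == a[j + 1]) ++j;
        nt[i] = j;
    }

    // Scan the text; j is the value f[i] of the template.
    int sum = 0;
    for (int i = 1, j = 0; i <= m; ++i) {
        while (j > 0 && (j == n || b[i] != a[j + 1])) j = nt[j];
        if (b[i] == a[j + 1]) ++j;
        if (j == n) ++sum;              // a full match ends at position i
    }
    return sum;
}
```

For example, count_occurrences("aba", "ababa") is 2, because the two matches overlap at the middle "a".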

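The code above is usually sufficient, but a small optimization is possible. Note that after this optimization nt[i] is no longer always the longest matching length, so the table is only suitable for matching, not for the period computation in the example problem of section 2.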

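// Optimized fallback on mismatch.
// This reads a[i+1] even when i == n, so a[n+1] must be a valid element that
// differs from every pattern character (for example the '\0' of a C string).
nt[1]=0;
for(int i=2,j=0;i<=n;i++)
{
    while(j>0&&a[i]!=a[j+1]) j=nt[j];
    if(a[i]==a[j+1]) j++;

    if(j==0||a[i+1]!=a[j+1])
        nt[i]=j;
    else nt[i]=nt[j];   // falling back to j would fail on the same character, so skip it
}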
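
To see what the optimization changes, the following sketch builds the optimized table for a std::string. The function name build_next_optimized is an assumption made for illustration. It relies on the pattern containing no '\0' character, because reading a[n + 1] on a std::string yields '\0'.

```cpp
#include <string>
#include <vector>

// Builds the optimized table (1-based, nt[0] unused). When the character after
// the border equals the character after position i, a fallback to that border
// would fail on the same character, so the border's own entry is stored instead.
std::vector<int> build_next_optimized(const std::string& s) {
    int n = static_cast<int>(s.size());
    std::string a = " " + s;            // a[1..n] is the pattern, a[n + 1] reads as '\0'
    std::vector<int> nt(n + 1, 0);
    for (int i = 2, j = 0; i <= n; ++i) {
        while (j > 0 && a[i] != a[j + 1]) j = nt[j];
        if (a[i] == a[j + 1]) ++j;      // j is still the true longest match length
        if (j == 0 || a[i + 1] != a[j + 1]) nt[i] = j;
        else nt[i] = nt[j];
    }
    return nt;
}
```

For "aaaa" the plain table is 0, 1, 2, 3 while the optimized table is 0, 0, 0, 3: a mismatch after "aa" or "aaa" would fail again on every shorter run of "a", so the fallback jumps straight to 0.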

2. Example problem: POJ - 1961 Period

For each prefix of a given string S with N characters (each character has an ASCII code between 97 and 126, inclusive), we want to know whether the prefix is a periodic string. That is, for each i (2 <= i <= N) we want to know the largest K > 1 (if there is one) such that the prefix of S with length i can be written as A^K, that is, A concatenated K times, for some string A. Of course, we also want to know the period K.
Input
The input consists of several test cases. Each test case consists of two lines. The first one contains N (2 <= N <= 1 000 000), the size of the string S. The second line contains the string S. The input file ends with a line having the number zero on it.
Output
For each test case, output “Test case #” and the consecutive test case number on a single line; then, for each prefix with length i that has a period K > 1, output the prefix size i and the period K separated by a single space; the prefix sizes must be in increasing order. Print a blank line after each test case.
Sample Input
3
aaa
12
aabaabaabaab
0
Sample Output
Test case #1
2 2
3 3

Test case #2
2 2
6 2
9 3
12 4

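If a string S is formed by repeating a string T exactly K times, T is called a repeating unit of S. The string T that maximizes K is called the smallest repeating unit of S, and that K is called the maximum repetition count.
Given a string S of length N, for every prefix S(1..i) whose maximum repetition count is greater than 1, output the length i of the prefix and its maximum repetition count.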

Lemma: S(1..i) has a repeating unit of length len < i if and only if len divides i and S(len+1..i) = S(1..i-len).
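
The lemma can be checked directly, without the next array, by testing its two conditions. The sketch below is only an illustration; the function name has_repeating_unit is an assumption, and the comparison uses 0-based offsets into std::string.

```cpp
#include <string>

// True if s consists of repeated copies of its first len characters,
// decided by the lemma: len divides |s| and S(len+1..i) == S(1..i-len).
bool has_repeating_unit(const std::string& s, int len) {
    int i = static_cast<int>(s.size());
    if (len <= 0 || len >= i || i % len != 0) return false;
    // Compare the suffix starting at offset len with the prefix of length i - len.
    return s.compare(len, i - len, s, 0, i - len) == 0;
}
```

For "ababa" the shifted comparison with len = 2 succeeds ("aba" equals "aba"), but 2 does not divide 5, so the function returns false. This is why the divisibility condition is needed.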

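By the lemma: when i - next(i) divides i, S(1..i-next(i)) is the smallest repeating unit of S(1..i), and its maximum repetition count is i / (i - next(i)). The condition that i - next(i) divides i guarantees that every repetition of the unit is complete.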

Furthermore, if i - next(next(i)) divides i, then S(1..i-next(next(i))) is the second smallest repeating unit of S(1..i). Continuing in the same way yields all repeating units of S(1..i).
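
Walking the chain next(i), next(next(i)), ... can be sketched as follows. The function name repeating_unit_lengths is an assumption made for illustration; it returns the lengths of all repeating units shorter than the whole string, in increasing order.

```cpp
#include <string>
#include <vector>

// Lists the lengths (< |s|) of all repeating units of s by following the
// next chain from next(n) down to 0, using the unoptimized next array.
std::vector<int> repeating_unit_lengths(const std::string& s) {
    int n = static_cast<int>(s.size());
    std::string a = " " + s;            // 1-based positions, as in the text
    std::vector<int> nt(n + 1, 0);
    for (int i = 2, j = 0; i <= n; ++i) {
        while (j > 0 && a[i] != a[j + 1]) j = nt[j];
        if (a[i] == a[j + 1]) ++j;
        nt[i] = j;
    }
    std::vector<int> lengths;
    // j decreases along the chain, so the candidate length n - j increases.
    for (int j = nt[n]; j > 0; j = nt[j]) {
        int len = n - j;
        if (n % len == 0) lengths.push_back(len);   // keep only complete repetitions
    }
    return lengths;
}
```

For "aabaabaabaab" the chain is 9, 6, 3, giving candidate lengths 3, 6 and 9; only 3 and 6 divide 12, so the repeating units are "aab" and "aabaab".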

The length of any repeating unit of a string is always a multiple of the length of its smallest repeating unit.

#include<cstdio>
using namespace std;

const int maxn=1000010;
char a[maxn];   // a[1..n] holds the string S
int nt[maxn];   // nt[i]: next value of the prefix of length i
int n,t;        // n: length of S, t: test case counter
int main(void)
{
    // stop on the terminating 0 or on end of input
    while(scanf("%d",&n)==1&&n)
    {
        scanf("%s",a+1);
        // build the next array of S
        nt[1]=0;
        for(int i=2,j=0;i<=n;i++)
        {
            while(j>0&&a[i]!=a[j+1]) j=nt[j];
            if(a[i]==a[j+1]) j++;
            nt[i]=j;
        }

        printf("Test case #%d\n",++t);
        for(int i=2;i<=n;i++)
        {
            // i-nt[i] is the smallest repeating unit length when it divides i
            if(i%(i-nt[i])==0&&i/(i-nt[i])>1)
                printf("%d %d\n",i,i/(i-nt[i]));
        }
        putchar('\n');   // blank line after each test case
    }
    return 0;
}
