Power Strings (KMP)

--Simone-

Article link: https://blog.csdn.net/u014135990/article/details/88340016

Power Strings

Time Limit: 3000MS		Memory Limit: 65536K
Total Submissions: 29064		Accepted: 12140

Description

Given two strings a and b we define a*b to be their concatenation. For example, if a = "abc" and b = "def" then a*b = "abcdef". If we think of concatenation as multiplication, exponentiation by a non-negative integer is defined in the normal way: a^0 = "" (the empty string) and a^(n+1) = a*(a^n).

Input

Each test case is a line of input representing s, a string of printable characters. The length of s will be at least 1 and will not exceed 1 million characters. A line containing a period follows the last test case.

Output

For each s you should print the largest n such that s = a^n for some string a.

Sample Input

abcd
aaaa
ababab
.

Sample Output

1
4
3

Hint

This problem has huge input, use scanf instead of cin to avoid time limit exceed.

Source

Waterloo local 2002.07.01

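Problem:

The input contains multiple test cases. Each test case is a string whose length is at least 1 and does not exceed 1 million. For each string, output the largest number of times some substring is repeated to form the whole string.

Approach:

Use the next array from KMP.

next[i] is the index in the pattern that we fall back to when the character at index i fails to match. Equivalently, for i >= 1 it is the length of the longest proper prefix of str[0..i-1] that is also a suffix of str[0..i-1]. Falling back does not guarantee that the character at the new position will match; it only guarantees that all pattern characters before that position already match.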
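To make the definition concrete, here is a minimal sketch that fills the next table for a short string. The names build_next, nxt and MAXN are hypothetical and chosen only for this illustration (the small MAXN bound is an assumption, far below the problem's limit); the loop itself follows the same convention as the get_next function in this post.

```c
#include <assert.h>
#include <string.h>

#define MAXN 64  /* illustrative bound only (assumption), not the problem's limit */

/* Fills nxt[0..len] for s with the convention used in this post:
   nxt[0] = -1, and for i >= 1, nxt[i] is the length of the longest
   proper prefix of s[0..i-1] that is also a suffix of it.
   The caller must provide room for at least MAXN ints. Returns len. */
int build_next(const char *s, int *nxt)
{
    int len = (int)strlen(s);
    int i = 0, j = -1;
    assert(len < MAXN);            /* nxt needs len + 1 entries */
    nxt[0] = -1;
    while (i < len) {
        if (j == -1 || s[i] == s[j]) {
            i++;
            j++;
            nxt[i] = j;            /* prefix of length j matches a suffix of s[0..i-1] */
        } else {
            j = nxt[j];            /* fall back to the next shorter candidate */
        }
    }
    return len;
}
```

For "ababab" this yields next = -1 0 0 1 2 3 4, so next[6] = 4: the prefix "abab" is also a suffix of the whole string.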

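For this problem it is enough to check whether len - next[len] divides len. If it does, the largest repetition count is len / (len - next[len]); otherwise the answer is 1.

Suppose the sequence P0P1P2P3P4P5P6P7 (length 8, indices starting at 0) is made of a repeated substring. If next[8] = 6, then len - next[8] = 8 - 6 = 2. next[8] is the position we would fall back to if a hypothetical ninth element at index 8 failed to match, so len - next[8] is also the distance the pattern shifts forward. This means the last six characters equal the first six: P2P3P4P5P6P7 = P0P1P2P3P4P5. Comparing the two sides two characters at a time gives P0P1 = P2P3, P2P3 = P4P5 and P4P5 = P6P7, so P0P1 = P2P3 = P4P5 = P6P7. Therefore we only need to check divisibility, and when len - next[8] divides len the answer is len / (len - next[8]).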
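The argument above can be checked directly with a brute-force sketch. The helpers has_period and is_power_of_block are hypothetical names introduced only for this illustration; they do not use the next array at all, they just test the property that the shift p = len - next[len] is supposed to guarantee.

```c
#include <string.h>

/* Returns 1 if s[i] == s[i + p] for every valid i, i.e. shifting s by p
   positions lines it up with itself; returns 0 otherwise. */
int has_period(const char *s, int p)
{
    int len = (int)strlen(s);
    int i;
    if (p <= 0) return 0;
    for (i = 0; i + p < len; i++)
        if (s[i] != s[i + p]) return 0;
    return 1;
}

/* Returns 1 if s is exactly len / p copies of its first p characters.
   The shift property alone is not enough: p must also divide len. */
int is_power_of_block(const char *s, int p)
{
    int len = (int)strlen(s);
    return p > 0 && len % p == 0 && has_period(s, p);
}
```

For "abababab" with p = 2 both checks succeed. For "ababa" (where next[5] = 3 and p = 2) the shift property holds but 2 does not divide 5, which is why the divisibility test is needed and the answer there is 1.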

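Note: stop reading when the input is a single period.

Notes:

Why do we check next[len] rather than next[len - 1]?

next[len - 1] only describes the characters before index len - 1: it guarantees that next[len - 1] of them match a prefix, but says nothing about whether the character at index len - 1 itself matches. next[len] covers everything before index len, including the character at index len - 1.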
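A small experiment shows the difference. The function answer_from below is a hypothetical helper written only for this comparison (it is not part of the accepted solution, and the MAXN bound is an assumption for illustration): it applies the divisibility test using next[k] for a caller-chosen k, so that k = len and k = len - 1 can be compared on the same string.

```c
#include <assert.h>
#include <string.h>

#define MAXN 64  /* illustrative bound only (assumption) */

/* Applies the divisibility test with nxt[k] in place of nxt[len]:
   returns len / (len - nxt[k]) if that divides evenly, otherwise 1.
   Only k == len gives the correct answer in general. */
int answer_from(const char *s, int k)
{
    int nxt[MAXN + 1];
    int len = (int)strlen(s);
    int i = 0, j = -1, shift;
    assert(len >= 1 && len < MAXN && k >= 1 && k <= len);
    nxt[0] = -1;
    while (i < len) {
        if (j == -1 || s[i] == s[j]) nxt[++i] = ++j;
        else j = nxt[j];
    }
    shift = len - nxt[k];
    return (len % shift == 0) ? len / shift : 1;
}
```

For "aaab" the table is next = -1 0 1 2 0. Using next[len] = next[4] = 0 gives the correct answer 1, while using next[len - 1] = next[3] = 2 would give 4 / (4 - 2) = 2, which is wrong because the final 'b' was never taken into account.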

Why does the loop condition in get_next not need to be changed to i <= len?

get_next is shown below:

void get_next()
{
    int i = 0,j = -1;
    next[0] = -1;
    while(i < len)
    {
        if(j == -1 || str[i] == str[j])
        {
            i++;           //1
            j++;           //2
            next[i] = j;   //3
        }
        else j = next[j];
    }
}

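next[i] is assigned only once the character at index i - 1 has matched (or j has fallen back to -1); at that point both counters are incremented and the value is stored. In other words, each next value is determined from the largest number of preceding characters that are known to match, and it is assigned only after that match has been found. When i == len - 1 the loop condition still holds, so the body runs again; i is incremented to len, next[len] is assigned, and only then does the loop exit. For the same reason, the three statements marked 1, 2 and 3 can be shortened to next[++i] = ++j.

Accepted code: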

#include<stdio.h>
#include<string.h>
char str[1000005];
int next[1000005],len;

void get_next()
{
    int i = 0,j = -1;
    next[0] = -1;
    while(i < len)
    {
        if(j == -1 || str[i] == str[j])  next[++i] = ++j;
        else j = next[j];
    }
}

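int main()
{
    // scanf is used instead of cin because the input is huge (see Hint).
    while(scanf("%s",str) == 1)
    {
        // A line consisting of a single period ends the input.
        if(strcmp(str,".") == 0) break;
        len = strlen(str);
        get_next();
        // len - next[len] is the smallest shift that maps the string onto itself;
        // it gives a repetition only if it divides len.
        if(!(len % (len - next[len]))) printf("%d\n",len / (len - next[len]));
        else                           printf("1\n");
    }
    return 0;
}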