KMP Algorithm Explained

小悟空

Article link: https://blog.csdn.net/wukonwukon/article/details/6725320

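People say this algorithm is simple, yet over the past two days I attempted two KMP problems and solved neither. Thinking it over, KMP has long been a sore spot for me: I have never fully understood it, and for me it is genuinely hard. You might say the algorithm is simple: the next function is a template, and once you plug it in you just run the match. I agree that using it is simple. But how is the next function actually computed, and what important information do its values carry?

I am saving two links here to read carefully later: http://www.ics.uci.edu/~eppstein/161/960227.html http://www.ics.uci.edu/~eppstein/161/960222.html

Both pages are in English, and I have heard they are very good.

Now for KMP itself. When I first studied it I was completely in a fog and could not make sense of it.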

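Computing next(j) is the crux of KMP; only once you understand it have you understood KMP.

Let us explore how next(j) works internally.

The value next(j) can be viewed as a function of two arguments: the pattern and the mismatch position j.

Suppose that for some pattern next(6) = 3. What does that mean?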

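It means that if a mismatch occurs at position 6 of the pattern, I next compare position 3 of the pattern against the same main-string character (call it main-string position 6, taking the pattern to be aligned with the start of the main string so that the indices coincide). This is only valid if pattern positions 1 and 2 match main-string positions 4 and 5. But main-string positions 4 and 5 also matched pattern positions 4 and 5 before the mismatch, so by transitivity of equality, pattern positions 1 and 2 are equal to pattern positions 4 and 5.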
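To make the next(6) = 3 case concrete, here is a minimal brute-force sketch that computes next(j) straight from the reasoning above, using the 1-based convention of this post. The function name `naive_next` and the sample pattern "abcabd" are illustrative assumptions, not something taken from the original sources.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: brute-force next(j) under the 1-based convention,
// where next(1) = 0 and, for j >= 2, next(j) = k + 1 with k the length of the
// longest proper prefix of P[1..j-1] that is also a suffix of P[1..j-1].
std::vector<int> naive_next(const std::string& p) {
    int m = static_cast<int>(p.size());
    std::vector<int> next(m + 1, 0);  // next[0] is unused, next[1] stays 0
    for (int j = 2; j <= m; ++j) {
        int best = 0;
        for (int k = 1; k < j - 1; ++k) {
            // Compare P[1..k] with P[j-k..j-1]
            // (0-based: p[0..k-1] against p[j-1-k..j-2]).
            if (p.compare(0, k, p, j - 1 - k, k) == 0) best = k;
        }
        next[j] = best + 1;
    }
    return next;
}
```

For the pattern "abcabd", positions 1 and 2 ("ab") equal positions 4 and 5, so `naive_next("abcabd")[6]` is 3, which is exactly the situation described above.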

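If the pattern contains two identical segments (they may overlap, for example positions 1 to 3 being equal to positions 3 to 5), then when a mismatch occurs right after the end of the later segment, we can continue comparing from right after the end of the earlier segment.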

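If no such identical segments exist, then no match can begin anywhere in the part of the main string before the mismatch position (otherwise we would have a contradiction). We therefore restart from the first character of the pattern at the mismatch position, and if even that first character does not match, we move on to the next position of the main string.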
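The fallback rule described above can be sketched as a matching loop. This is only a sketch under stated assumptions: the function name `kmp_find` is hypothetical, the next table is assumed to be already computed in the 1-based convention (next[1] = 0, next[0] unused), and a return value of 0 is my own choice for "not found".

```cpp
#include <string>
#include <vector>

// Hypothetical helper: returns the 1-based position of the first occurrence
// of pattern p in main string s, or 0 if there is none.
// next must be the 1-based table for p, with next[1] == 0.
int kmp_find(const std::string& s, const std::string& p,
             const std::vector<int>& next) {
    int n = static_cast<int>(s.size());
    int m = static_cast<int>(p.size());
    int i = 1, j = 1;  // 1-based positions in s and p
    while (i <= n && j <= m) {
        if (j == 0 || s[i - 1] == p[j - 1]) {
            ++i;  // j == 0 means even p[1] failed: move on in the main string
            ++j;
        } else {
            j = next[j];  // the main string position i never moves backwards
        }
    }
    return j > m ? i - m : 0;
}
```

With the pattern "abcabd" and its table {0, 0, 1, 1, 1, 2, 3} (index 0 unused), searching "abcabcabd" fails at pattern position 6, falls back to position 3 without moving back in the main string, and finds the match at position 4.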

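In such a pair of identical segments, the earlier one must start at the first character of the pattern, which makes things convenient. We only need to:

compare the pattern starting at position 1 against the pattern starting at position 2, character by character,

then position 1 against position 3, then position 1 against position 4, and so on, which finds every possible pair of identical segments.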

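Conveniently, this is itself a pattern-matching task, in which the same string serves as both the main string and the pattern.

During this self-comparison, every time the characters match we can record a next value.

When a mismatch occurs, do we start over from the next position of the main string? Wait, remember: the main string never needs to backtrack. We have already recorded the earlier next values, so we fall back through them instead.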

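What if the very first comparison already differs? For that we fix a convention: next(1) = 0 and next(2) = 1. Then, when position 1 is compared with position 2, next(2) is already known, and every later comparison likewise already has the next value for the current position. A next value of 0 means that no match can begin before the mismatch position, so the main string moves on to the character after the mismatch position and the pattern restarts from position 1. Under this 1-based convention the value recorded for that following position is 1; in the border-length convention used by the code later in this post, the corresponding value is 0.

Got it?

All right, let us look at a problem: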
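The self-matching procedure described above can be sketched as follows. This is the usual textbook-style construction for the 1-based convention next(1) = 0, next(2) = 1; the function name `build_next` is an assumption of mine, not a name from the sources linked in this post.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: builds the 1-based next table by matching the pattern
// against itself. next[0] is unused; next[1] = 0 by convention.
std::vector<int> build_next(const std::string& p) {
    int m = static_cast<int>(p.size());
    std::vector<int> next(m + 1, 0);
    int i = 1;  // position in the copy playing the role of the main string
    int j = 0;  // position in the copy playing the role of the pattern
    while (i < m) {
        if (j == 0 || p[i - 1] == p[j - 1]) {
            ++i;
            ++j;
            next[i] = j;  // record a next value after each successful step
        } else {
            j = next[j];  // reuse an earlier next value; i never backtracks
        }
    }
    return next;
}
```

For "abcabd" this yields {0, 0, 1, 1, 1, 2, 3} (index 0 unused), so next(6) = 3 as in the earlier discussion. Because i only increases and every fallback strictly decreases j, the loop does a number of steps linear in the pattern length.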

Detection of Extraterrestrial

Time Limit: 1000MS

Memory Limit: Unknown

Description

E.T. Inc. employs Maryanna as alien signal researcher. To identify possible alien signals and background noise, she develops a method to evaluate the signals she has already received. The signal sent by E.T is more likely regularly alternative.

Received signals can be presented by a string of small latin letters 'a' to 'z' whose length is N. For each X between 1 and N inclusive, she wants you to find out the maximum length of the substring which can be written as a concatenation of X same strings. For clarification, a substring is a consecutive part of the original string.

Input

The first line contains T, the number of test cases (T ≤ 200). Most of the test cases are relatively small. T lines follow, each contains a string of only small latin letters 'a' - 'z', whose length N is less than 1000, without any leading or trailing whitespaces.

Output

For each test case, output a single line, which should begin with the case number counting from 1, followed by N integers. The X-th (1-based) of them should be the maximum length of the substring which can be written as a concatenation of X same strings. If that substring doesn't exist, output 0 instead. See the sample for more format details.

Hint: For the second sample, the longest substring which can be written as a concatenation of 2 same strings is "noonnoon", "oonnoonn", "onnoonno", "nnoonnoo", any of those has length 8; the longest substring which can be written as a concatenation of 3 same strings is the string itself. As a result, the second integer in the answer is 8 and the third integer in the answer is 12.

Sample Input

2
arisetocrat
noonnoonnoon

Sample Output

Case #1: 11 0 0 0 0 0 0 0 0 0 0
Case #2: 12 8 12 0 0 0 0 0 0 0 0 0
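Before turning to KMP, it can help to pin down what the problem asks with a direct brute force. The sketch below is my own illustration, not part of the problem statement or of the referenced solution, and the function name `brute_force` is hypothetical. It is roughly cubic in the string length, so it is meant only for checking small cases such as the samples.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical reference checker: ans[X] is the maximum length of a substring
// that is a concatenation of X identical strings (0 if none). ans[0] is unused.
std::vector<int> brute_force(const std::string& s) {
    int n = static_cast<int>(s.size());
    std::vector<int> ans(n + 1, 0);
    for (int i = 0; i < n; ++i) {              // start of the substring
        for (int u = 1; i + u <= n; ++u) {     // length of the repeated unit
            int r = 1;                         // consecutive copies found so far
            while (i + (r + 1) * u <= n &&
                   s.compare(i + r * u, u, s, i, u) == 0) {
                ++r;
            }
            // Any k <= r copies of the unit also form a valid substring.
            for (int k = 1; k <= r; ++k) ans[k] = std::max(ans[k], k * u);
        }
    }
    return ans;
}
```

For "noonnoonnoon" this gives 12, 8 and 12 for X = 1, 2, 3 and 0 for every larger X, matching the second sample.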

The following code is from http://blog.csdn.net/allenjy123/article/details/6629885

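/* KMP */
/* Note: ans[1] needs special handling */
/* Accepted: 288 ms */
#include <algorithm>
#include <cstdio>
#include <cstring>

// No "using namespace std;" here: the global array next would otherwise be
// ambiguous with std::next on C++11 and later compilers.
const int MAXN = 1005;
int cas, len;
int ans[MAXN], next[MAXN];
char s[MAXN];

// s is read from index 1. next[i] is the length of the longest proper prefix
// of s[1..i] that is also a suffix of it; next[0] = -1 is a sentinel.
void get_next(char s[])
{
	int i = 1, t, lens = std::strlen(s + 1);
	next[0] = -1;
	while (i <= lens)
	{
		t = next[i - 1];
		while ((t + 1) && s[t + 1] != s[i]) // fall back until the border can be extended
			t = next[t];
		next[i] = t + 1;
		i++;
	}
}

void Solve()
{
	int i, j, k;
	std::memset(ans, 0, sizeof(ans));
	ans[1] = len; // the whole string is one copy of itself
	char temp[MAXN];
	for (i = 0; i < len; i++) // every suffix s[i..] is a candidate starting point
	{
		std::strcpy(temp + 1, s + i);
		get_next(temp);
		int n = std::strlen(temp + 1); // length of the current suffix
		for (j = n; j >= 1; j--)
		{
			int x = j - next[j]; // smallest period of the prefix of length j
			if (j % x == 0)
			{
				int w = j / x; // the prefix is w copies of a block of length x
				for (k = w; k >= 1; k--) // note: several answers are updated here
					ans[k] = std::max(ans[k], j - (w % k) * x); // keep w - w % k blocks
			}
		}
	}
}

int main()
{
	int i, T;
	cas = 1;
	std::scanf("%d", &T);
	while (T--)
	{
		std::scanf("%s", s);
		len = std::strlen(s);
		Solve();
		std::printf("Case #%d:", cas++);
		for (i = 1; i <= len; i++)
			std::printf(" %d", ans[i]);
		std::printf("\n");
	}
	return 0;
}
/*
Extra test case:
asasasa
Case #27: 7 4 6 0 0 0 0
*/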
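The solution above relies on a fact that the post does not spell out: if b is the length of the longest proper border of a prefix of length j, then x = j - b is that prefix's smallest period, and when x divides j the prefix consists of j / x copies of its first x characters. The sketch below isolates this step. The names `border_lengths` and `repetitions` are hypothetical, and the table follows the code's convention (border length, with -1 at index 0) rather than the 1-based next(1) = 0 convention of the earlier discussion.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: border[i] is the length of the longest proper prefix of
// the first i characters of s that is also a suffix of them; border[0] = -1.
std::vector<int> border_lengths(const std::string& s) {
    int n = static_cast<int>(s.size());
    std::vector<int> border(n + 1, -1);
    for (int i = 1; i <= n; ++i) {
        int t = border[i - 1];
        while (t >= 0 && s[t] != s[i - 1]) t = border[t];  // fall back
        border[i] = t + 1;
    }
    return border;
}

// Hypothetical helper: number of copies of the shortest repeating block that
// make up the first j characters (j >= 1), or 1 if the block does not tile it.
int repetitions(const std::vector<int>& border, int j) {
    int x = j - border[j];  // smallest period of the prefix of length j
    return j % x == 0 ? j / x : 1;
}
```

For "noonnoonnoon" the longest border of the whole string has length 8, so the period is 4 and the string is 3 copies of "noon". For "asasasa" the prefix of length 6 is 3 copies of "as", while the full length 7 is not a multiple of the period 2, which is why the extra test case reports 6 for X = 3.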
