kuangbin带你飞: KMP Problem Set

默_silence

Category: 0x10 Basic Data Structures

Article link: https://blog.csdn.net/weixin_43772166/article/details/89790339

Contents

A - Number Sequence
B - Oulipo
C - 剪花布条 (Cutting Cloth Strips)
D - Cyclic Nacklace
E - Period
F - Power Strings
G - Seek the Name, Seek the Fame
I - Simpsons’ Hidden Talents
J - Count the string

A - Number Sequence

Given two sequences of numbers : a[1], a[2], … , a[N], and b[1], b[2], … , b[M] (1 <= M <= 10000, 1 <= N <= 1000000). Your task is to find a number K which make a[K] = b[1], a[K + 1] = b[2], … , a[K + M - 1] = b[M]. If there are more than one K exist, output the smallest one.

Input

The first line of input is a number T which indicate the number of cases. Each case contains three lines. The first line is two numbers N and M (1 <= M <= 10000, 1 <= N <= 1000000). The second line contains N integers which indicate a[1], a[2], … , a[N]. The third line contains M integers which indicate b[1], b[2], … , b[M]. All integers are in the range of [-1000000, 1000000].

Output

For each test case, you should output one line which only contain K described above. If no such K exists, output -1 instead.

Sample Input

2
13 5
1 2 1 2 3 1 2 3 1 3 2 1 2
1 2 3 1 3
13 5
1 2 1 2 3 1 2 3 1 3 2 1 2
1 2 3 2 1

Sample Output

6
-1

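This problem replaces the two strings in the KMP template with two integer arrays; the principle is unchanged.
The problem asks for the smallest index at which the pattern matches. The KMP loop can end in two ways:
1. A match is found: j equals len2 and the loop breaks.
2. The end of s1 is reached without a match: j is then guaranteed not to equal len2.
So checking whether j equals len2 after the loop is enough to tell whether the match succeeded.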

#include <stdio.h>
using namespace std;
const int N = 1e6;
int Next[N + 5];
int s1[N + 5], s2[N + 5];
int len1, len2;

void get_next()
{
    int i, j;
    i = 0;
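    Next[0] = j = -1; // the first Next value is -1
    while(i < len2) {
        if(j == -1 || s2[i] == s2[j]) Next[++i] = ++j; // characters match (or j fell back to -1): extend the border by one
        else j = Next[j]; // mismatch: fall back to the next shorter border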
    }
}


void kmp()
{
    int i, j;
    i = j = 0;
    while(i < len1) {
        if(j == -1 || s1[i] == s2[j]) ++i, ++j;
        else j = Next[j];
        if(j == len2) {
            printf("%d\n", i - len2 + 1);
            break;
        }
    }
    if(j != len2)
        printf("-1\n");
}


int main()
{
	int t;
	scanf("%d",&t);
	while(t--)
	{
		scanf("%d%d",&len1,&len2);
	    for(int i=0;i<len1;i++)
	    	scanf("%d",&s1[i]);
	    for(int i=0;i<len2;i++)
	    	scanf("%d",&s2[i]);
	    get_next();
	    kmp(); 
	}
    return 0;
}

Code from my second attempt

#include <bits/stdc++.h>
using namespace std;
const int N = 1e6 + 5; 

int a[N], b[N];
int Next[N], f[N]; // Next: a matched against itself; f: a matched against b
int n, m;
// Compute the Next array
void get_next()
{
	Next[1] = 0;
	for (int i = 2, j = 0; i <= n; i++) {
		while (j > 0 && a[i] != a[j + 1]) j = Next[j];
		if (a[i] == a[j + 1]) j++;
		Next[i] = j;
	}
}

void kmp()
{
	for (int i = 1, j = 0; i <= m; i++) {
		while (j > 0 && (j == n || b[i] != a[j + 1])) j = Next[j];
		if (b[i] == a[j + 1]) j++;
		f[i] = j;
		// if (f[i] == n), an occurrence of a in b ends at position i
	}
}

int main(void)
{
	int t, flag;
	scanf("%d", &t);
	while (t--) {
		scanf("%d%d", &m, &n);
		for (int i = 1; i <= m; i++)
			scanf("%d", &b[i]);
		for (int i = 1; i <= n; i++)
			scanf("%d", &a[i]);
		get_next();
		kmp();
		flag = 0;
		for (int i = 1; i <= m; i++) {
			if (f[i] == n) {
				flag = 1;
				cout << i - n + 1 << endl;
				break;
			}
		}
		if (flag == 0)
			cout << "-1" << endl;
	}
	
	return 0;
}
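
The same idea can be written in a more compact, 0-indexed form. The block below is only a sketch, not the code submitted above: it assumes `std::vector<int>` inputs, and the names `prefix_function` and `first_match` are hypothetical helpers introduced for illustration. It returns the smallest 1-based K, or -1 when the pattern does not occur.

```cpp
#include <cassert>
#include <vector>

// Prefix function: pi[i] is the length of the longest proper prefix of
// p[0..i] that is also a suffix of p[0..i].
std::vector<int> prefix_function(const std::vector<int>& p) {
    std::vector<int> pi(p.size(), 0);
    for (std::size_t i = 1; i < p.size(); ++i) {
        int j = pi[i - 1];
        while (j > 0 && p[i] != p[j]) j = pi[j - 1]; // fall back on mismatch
        if (p[i] == p[j]) ++j;
        pi[i] = j;
    }
    return pi;
}

// Returns the smallest 1-based K such that pattern occurs in text starting
// at position K, or -1 if there is no occurrence.
int first_match(const std::vector<int>& text, const std::vector<int>& pattern) {
    assert(!pattern.empty()); // the problem guarantees M >= 1
    const std::vector<int> pi = prefix_function(pattern);
    const int m = static_cast<int>(pattern.size());
    int j = 0; // number of pattern elements currently matched
    for (int i = 0; i < static_cast<int>(text.size()); ++i) {
        while (j > 0 && text[i] != pattern[j]) j = pi[j - 1];
        if (text[i] == pattern[j]) ++j;
        if (j == m) return i - m + 2; // 0-based start is i - m + 1; add 1 for 1-based K
    }
    return -1;
}
```

Returning as soon as j reaches the pattern length plays the role of the `break` in the first version, so no separate check is needed after the loop.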

B - Oulipo

The French author Georges Perec (1936–1982) once wrote a book, La disparition, without the letter ‘e’. He was a member of the Oulipo group. A quote from the book:

Tout avait l’air normal, mais tout s’affirmait faux. Tout avait l’air normal, d’abord, puis surgissait l’inhumain, l’affolant. Il aurait voulu savoir où s’articulait l’association qui l’unissait au roman : sur son tapis, assaillant à tout instant son imagination, l’intuition d’un tabou, la vision d’un mal obscur, d’un quoi vacant, d’un non-dit : la vision, l’avision d’un oubli commandant tout, où s’abolissait la raison : tout avait l’air normal mais…

Perec would probably have scored high (or rather, low) in the following contest. People are asked to write a perhaps even meaningful text on some subject with as few occurrences of a given “word” as possible. Our task is to provide the jury with a program that counts these occurrences, in order to obtain a ranking of the competitors. These competitors often write very long texts with nonsense meaning; a sequence of 500,000 consecutive 'T’s is not unusual. And they never use spaces.

So we want to quickly find out how often a word, i.e., a given string, occurs in a text. More formally: given the alphabet {‘A’, ‘B’, ‘C’, …, ‘Z’} and two finite strings over that alphabet, a word W and a text T, count the number of occurrences of W in T. All the consecutive characters of W must exactly match consecutive characters of T. Occurrences may overlap.

Input

The first line of the input file contains a single number: the number of test cases to follow. Each test case has the following format:

One line with the word W, a string over {‘A’, ‘B’, ‘C’, …, ‘Z’}, with 1 ≤ |W| ≤ 10,000 (here |W| denotes the length of the string W).
One line with the text T, a string over {‘A’, ‘B’, ‘C’, …, ‘Z’}, with |W| ≤ |T| ≤ 1,000,000.

Output

For every test case in the input file, the output should contain a single number, on a single line: the number of occurrences of the word W in the text T.

Sample Input

3
BAPC
BAPC
AZA
AZAZAZA
VERDI
AVERDXIVYERDIAN

Sample Output

1
3
0
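
The input gives the word W (the shorter string) first and the text T (the longer one) second, which is the reverse of the order the template expects, so the two strings and their lengths are swapped before matching.

The task is to count how many times the shorter string occurs. After each successful match, matching continues with j = Next[j] as in the original template, so overlapping occurrences are counted, and a variable ans records the number of matches.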

#include <stdio.h>
#include<string.h>
using namespace std;
const int N = 1e6;
int Next[N + 5];
char s1[N + 5], s2[N + 5];
char temp[N + 5];
int len1, len2;
int ans;

void get_next()
{
    int i, j;
    i = 0;
    Next[0] = j = -1;
    while(i < len2) {
        if(j == -1 || s2[i] == s2[j]) Next[++i] = ++j;
        else j = Next[j];
    }
}


void kmp()
{
	ans=0;
    int i, j;
    i = j = 0;
    while(i < len1) {
        if(j == -1 || s1[i] == s2[j]) ++i, ++j;
        else j = Next[j];
        if(j == len2) {
            ans++;
            j = Next[j]; // continue from the longest border so overlapping matches are counted
        }
    }
   	printf("%d\n",ans);
}


int main()
{
	int t,n;
	scanf("%d",&t);
	while(t--) 
    {
	    scanf("%s", s1);
	    scanf("%s", s2);
	    len1 = strlen(s1), len2 = strlen(s2);
	    if(len1<len2)
	    {
	    	strcpy(temp,s2);
			strcpy(s2,s1);
			strcpy(s1,temp);
			n=len2;
			len2=len1;
			len1=n;
	    }
	    get_next();
	    kmp();
	}
    return 0;
}

Code from my second attempt

#include <bits/stdc++.h>
using namespace std;
const int N = 1e6 + 5; 

char a[N], b[N], c[N];
int Next[N], f[N]; // Next: a matched against itself; f: a matched against b
int n, m;
// Compute the Next array
void get_next()
{
	Next[1] = 0;
	for (int i = 2, j = 0; i <= n; i++) {
		while (j > 0 && a[i] != a[j + 1]) j = Next[j];
		if (a[i] == a[j + 1]) j++;
		Next[i] = j;
	}
}

void kmp()
{
	for (int i = 1, j = 0; i <= m; i++) {
		while (j > 0 && (j == n || b[i] != a[j + 1])) j = Next[j];
		if (b[i] == a[j + 1]) j++;
		f[i] = j;
		// if (f[i] == n), an occurrence of a in b ends at position i
	}
}

int main(void)
{
	int t;
	cin >> t;
	while (t--) {
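		scanf("%s", a + 1); // a is the shorter string (the word W)
		scanf("%s", b + 1);
		n = strlen(a + 1);
		m = strlen(b + 1);
		if (n > m) {
			// The strings start at index 1, so copy from a + 1 and b + 1;
			// a[0] is '\0', so strcpy(c, a) would copy an empty string.
			strcpy(c + 1, a + 1);
			strcpy(a + 1, b + 1);
			strcpy(b + 1, c + 1);
			swap(n, m);
		}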
		get_next();
		kmp();
		int res = 0;
		for (int i = 1; i <= m; i++) {
			if (f[i] == n) {
				res++;
			}
		} 
		cout << res << endl;
	}
	
	return 0;
}
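
For comparison, the counting step can be sketched without the global arrays. This is an illustration under assumptions rather than the submitted solution: it uses `std::string`, 0-based indices, and a hypothetical function name `count_overlapping`. After every full match it falls back to the longest border of the word, which is what lets overlapping occurrences be counted.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Counts occurrences of word in text; occurrences may overlap.
int count_overlapping(const std::string& text, const std::string& word) {
    assert(!word.empty()); // the problem guarantees |W| >= 1
    const int m = static_cast<int>(word.size());

    // pi[i]: length of the longest proper prefix of word[0..i]
    // that is also a suffix of word[0..i].
    std::vector<int> pi(m, 0);
    for (int i = 1, j = 0; i < m; ++i) {
        while (j > 0 && word[i] != word[j]) j = pi[j - 1];
        if (word[i] == word[j]) ++j;
        pi[i] = j;
    }

    int count = 0;
    for (int i = 0, j = 0; i < static_cast<int>(text.size()); ++i) {
        while (j > 0 && text[i] != word[j]) j = pi[j - 1];
        if (text[i] == word[j]) ++j;
        if (j == m) {
            ++count;
            j = pi[m - 1]; // keep the longest border so the next match may overlap
        }
    }
    return count;
}
```

With the sample data, `count_overlapping("AZAZAZA", "AZA")` counts the matches starting at positions 1, 3 and 5.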

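C - 剪花布条 (Cutting Cloth Strips)

There is a strip of patterned cloth and a small ready-to-use decorative strip, which also carries a pattern. Given the cloth strip and the decorative strip, compute the maximum number of decorative strips that can be cut from the cloth strip.

Input

The input contains several test cases, each a pair consisting of a cloth strip and a decorative strip. Both are written in visible ASCII characters; there are as many possible patterns as there are visible ASCII characters. Neither strip is longer than 1000 characters. Processing stops when a # character is encountered.

Output

Output the maximum number of decorative strips that can be cut from the cloth strip, or 0 if none can be cut. Print each result on its own line.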

Sample Input
abcde a3
aaaaaa aa

Sample Output
0
3

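This problem asks for the maximum number of copies of s2 that can be cut out, so the occurrences must not overlap. After each successful match, j is reset to 0 so that s2 is matched from its beginning again, instead of continuing from the corresponding Next[j].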

#include <stdio.h>
#include<string.h>
using namespace std;
const int N = 1e6;
int Next[N + 5];
char s1[N + 5], s2[N + 5];
char temp[N + 5];
int len1, len2;
int ans;

void get_next()
{
    int i, j;
    i = 0;
    Next[0] = j = -1;
    while(i < len2) {
        if(j == -1 || s2[i] == s2[j]) Next[++i] = ++j;
        else j = Next[j];
    }
}


void kmp()
{
	ans=0;
    int i, j;
    i = j = 0;
    while(i < len1) {
        if(j == -1 || s1[i] == s2[j]) ++i, ++j;
        else j = Next[j];
        if(j == len2) {
            ans++;
            j = 0; // a copy was cut out: reset j to 0 and match from the start of s2 again
        }
    }
   	printf("%d\n",ans);
}


int main()
{
	while(scanf("%s", s1) == 1) // also stop if the input ends without a '#'
    {
    	if(strcmp(s1,"#")==0) break;
	    scanf("%s", s2);
	    len1 = strlen(s1), len2 = strlen(s2);
	    get_next();
	    kmp();
	}
    return 0;
}

Code from my second attempt

#include <bits/stdc++.h>
using namespace std;
const int N = 1e6 + 5; 

char a[N], b[N];
int Next[N], f[N]; // Next: a matched against itself; f: a matched against b
int n, m;
// Compute the Next array
void get_next()
{
	Next[1] = 0;
	for (int i = 2, j = 0; i <= n; i++) {
		while (j > 0 && a[i] != a[j + 1]) j = Next[j];
		if (a[i] == a[j + 1]) j++;
		Next[i] = j;
	}
}

void kmp()
{
	for (int i = 1, j = 0; i <= m; i++) {
		while (j > 0 && (j == n || b[i] != a[j + 1])) j = Next[j];
		if (b[i] == a[j + 1]) j++;
		f[i] = j;
		// if (f[i] == n), an occurrence of a in b ends at position i
	}
}

int main(void)
{
	// scanf returns EOF (a nonzero value) at end of input, so compare against 1
	while (scanf("%s", b + 1) == 1 && b[1] != '#') {
		scanf("%s", a + 1);
		n = strlen(a + 1);
		m = strlen(b + 1);
		get_next();
		kmp();
		int last = 0, res = 0;
		for (int i = 1; i <= m; i++) {
			if (f[i] == n && i - n + 1 > last) {
				last = i;
				res++;
			}
		} 
		cout << res << endl;
	}
	
	return 0;
}
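
The difference from problem B is only what happens after a full match. The sketch below shows the non-overlapping variant in a self-contained form; it assumes `std::string` inputs and uses a hypothetical function name `count_non_overlapping`, so it is an illustration rather than the code submitted above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Counts the maximum number of non-overlapping copies of piece in cloth
// by greedily taking the earliest match each time.
int count_non_overlapping(const std::string& cloth, const std::string& piece) {
    assert(!piece.empty());
    const int m = static_cast<int>(piece.size());

    // pi[i]: length of the longest proper prefix of piece[0..i]
    // that is also a suffix of piece[0..i].
    std::vector<int> pi(m, 0);
    for (int i = 1, j = 0; i < m; ++i) {
        while (j > 0 && piece[i] != piece[j]) j = pi[j - 1];
        if (piece[i] == piece[j]) ++j;
        pi[i] = j;
    }

    int count = 0;
    for (int i = 0, j = 0; i < static_cast<int>(cloth.size()); ++i) {
        while (j > 0 && cloth[i] != piece[j]) j = pi[j - 1];
        if (cloth[i] == piece[j]) ++j;
        if (j == m) {
            ++count;
            j = 0; // the matched characters are used up, so start from scratch
        }
    }
    return count;
}
```

Taking the earliest match each time is safe here because an earlier cut never leaves less cloth for the remaining cuts than a later one would.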

D - Cyclic Nacklace

Link: https://blog.csdn.net/weixin_43772166/article/details/89815466

E - Period

Link: https://blog.csdn.net/weixin_43772166/article/details/97313894

F - Power Strings

Link: https://blog.csdn.net/weixin_43772166/article/details/97397571

G - Seek the Name, Seek the Fame

Link: https://blog.csdn.net/weixin_43772166/article/details/108782141

I - Simpsons’ Hidden Talents

Homer: Marge, I just figured out a way to discover some of the talents we weren’t aware we had.
Marge: Yeah, what is it?
Homer: Take me for example. I want to find out if I have a talent in politics, OK?
Marge: OK.
Homer: So I take some politician’s name, say Clinton, and try to find the length of the longest prefix
in Clinton’s name that is a suffix in my name. That’s how close I am to being a politician like Clinton
Marge: Why on earth choose the longest prefix that is a suffix???
Homer: Well, our talents are deeply hidden within ourselves, Marge.
Marge: So how close are you?
Homer: 0!
Marge: I’m not surprised.
Homer: But you know, you must have some real math talent hidden deep in you.
Marge: How come?
Homer: Riemann and Marjorie gives 3!!!
Marge: Who the heck is Riemann?
Homer: Never mind.
Write a program that, when given strings s1 and s2, finds the longest prefix of s1 that is a suffix of s2.

Input

Input consists of two lines. The first line contains s1 and the second line contains s2. You may assume all letters are in lowercase.

Output

Output consists of a single line that contains the longest string that is a prefix of s1 and a suffix of s2, followed by the length of that prefix. If the longest such string is the empty string, then the output should be 0.
The lengths of s1 and s2 will be at most 50000.

Sample Input

clinton
homer
riemann
marjorie

Sample Output

0
rie 3

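Given two strings a and b, find the longest string that is both a prefix of a and a suffix of b.

By the definition of the f array, f[i] is the length of the longest prefix of a that ends at position i of b, so f[len(b)] is the length of the required string.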

#include <bits/stdc++.h>
using namespace std;
const int N = 1e6 + 5; 

char a[N], b[N];
int Next[N], f[N]; // Next: a matched against itself; f: a matched against b
int n, m;
// Compute the Next array
void get_next()
{
	Next[1] = 0;
	for (int i = 2, j = 0; i <= n; i++) {
		while (j > 0 && a[i] != a[j + 1]) j = Next[j];
		if (a[i] == a[j + 1]) j++;
		Next[i] = j;
	}
}

void kmp()
{
	for (int i = 1, j = 0; i <= m; i++) {
		while (j > 0 && (j == n || b[i] != a[j + 1])) j = Next[j];
		if (b[i] == a[j + 1]) j++;
		f[i] = j;
		// if (f[i] == n), an occurrence of a in b ends at position i
	}
}

int main(void)
{
	while (scanf("%s", a + 1) != EOF) {
		scanf("%s", b + 1);
		n = strlen(a + 1);
		m = strlen(b + 1);
		get_next();
		kmp();
		int len = f[m];
		if (len == 0) {
			cout << "0" << endl;
		} else {
			for (int i = 1; i <= len; i++) {
				cout << a[i];
			}
			cout << " " << len << endl;
		}
	}
	
	return 0;
}
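
A different route to the same answer, shown here only as a sketch and not as the method used above, is to compute the prefix function of the concatenation a + '#' + b. It assumes, as the statement says, that both strings contain only lowercase letters, so '#' cannot occur in either of them; the function name `longest_prefix_suffix` is hypothetical.

```cpp
#include <string>
#include <vector>

// Returns the longest string that is a prefix of a and a suffix of b.
// Assumes '#' occurs in neither string (the problem uses lowercase letters).
std::string longest_prefix_suffix(const std::string& a, const std::string& b) {
    const std::string s = a + '#' + b;
    // pi[i]: length of the longest proper prefix of s[0..i]
    // that is also a suffix of s[0..i].
    std::vector<int> pi(s.size(), 0);
    for (std::size_t i = 1; i < s.size(); ++i) {
        int j = pi[i - 1];
        while (j > 0 && s[i] != s[j]) j = pi[j - 1];
        if (s[i] == s[j]) ++j;
        pi[i] = j;
    }
    // The separator keeps any border from being longer than a, so the last
    // value is the length of the longest prefix of a that ends b.
    return a.substr(0, pi.back());
}
```

For the sample pair "riemann" and "marjorie" this yields "rie"; an empty result corresponds to the case where the program should print 0.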

J - Count the string

Link: https://blog.csdn.net/weixin_43772166/article/details/108784482
