# KMP

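KMP is a single-pattern matching algorithm: it searches a text string S of length $n$ for a pattern string P of length $m$. Its time complexity is $O(n+m)$, which is about the best this class of algorithms can achieve.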

## Principle

Take S = "abcabcabcd" and P = "abcd" as an example, where $i$ points at $S[i]$ and $j$ points at $P[j]$.

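The `next[]` array is obtained by preprocessing the pattern P. The value `next[j]` is the maximum length at which a prefix and a suffix of the string before the current character (that is, $P[0..j-1]$) are equal, not counting the whole string itself.

- Prefix: a substring that starts at the first character; its end point is free (as long as it stays in bounds).
- Suffix: a substring that ends at the last character before position $j$; its start point is free (as long as it stays in bounds).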

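For example, for P = "abab":

| j | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| P[j] | a | b | a | b |
| next[j] | -1 | 0 | 0 | 1 |

This table uses the convention `next[0] = -1`. The template in the next section stores the same values but sets `nex[0] = 0` instead.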

## Template

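```cpp
#include <cstring>

const int maxn = 1000006; // assumed upper bound on the pattern length; adjust per problem
int nex[maxn];            // nex[i] = length of the longest proper border of p[0..i-1]

// Build nex[] for the pattern p of length lp.
void getnext(char* p, int lp) {
    nex[0] = nex[1] = 0;
    for (int i = 1; i < lp; i++) {
        int j = nex[i];
        while (j && p[i] != p[j]) j = nex[j];
        nex[i + 1] = p[i] == p[j] ? j + 1 : 0;
    }
}

// Count how many times p occurs in s (overlapping occurrences included).
int kmp(char* s, char* p) {
    int ans = 0;
    int ls = strlen(s), lp = strlen(p);
    getnext(p, lp);
    for (int i = 0, j = 0; i < ls; i++) {
        while (j && s[i] != p[j]) j = nex[j]; // mismatch: fall back j
        if (s[i] == p[j]) j++;                // match: advance
        if (j >= lp) ans++;                   // full match; p[lp] == '\0' forces a fallback on the next step
    }
    return ans;
}
```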

## Example Problems

### HDU-1686 Oulipo

Problem Description
The French author Georges Perec (1936–1982) once wrote a book, La disparition, without the letter ‘e’. He was a member of the Oulipo group. A quote from the book:
Tout avait l’air normal, mais tout s’affirmait faux. Tout avait l’air normal, d’abord, puis surgissait l’inhumain, l’affolant. Il aurait voulu savoir où s’articulait l’association qui l’unissait au roman : sur son tapis, assaillant à tout instant son imagination, l’intuition d’un tabou, la vision d’un mal obscur, d’un quoi vacant, d’un non-dit : la vision, l’avision d’un oubli commandant tout, où s’abolissait la raison : tout avait l’air normal mais…
Perec would probably have scored high (or rather, low) in the following contest. People are asked to write a perhaps even meaningful text on some subject with as few occurrences of a given “word” as possible. Our task is to provide the jury with a program that counts these occurrences, in order to obtain a ranking of the competitors. These competitors often write very long texts with nonsense meaning; a sequence of 500,000 consecutive 'T’s is not unusual. And they never use spaces.
So we want to quickly find out how often a word, i.e., a given string, occurs in a text. More formally: given the alphabet {‘A’, ‘B’, ‘C’, …, ‘Z’} and two finite strings over that alphabet, a word W and a text T, count the number of occurrences of W in T. All the consecutive characters of W must exactly match consecutive characters of T. Occurrences may overlap.
Input
The first line of the input file contains a single number: the number of test cases to follow. Each test case has the following format:
One line with the word W, a string over {‘A’, ‘B’, ‘C’, …, ‘Z’}, with 1 ≤ |W| ≤ 10,000 (here |W| denotes the length of the string W).
One line with the text T, a string over {‘A’, ‘B’, ‘C’, …, ‘Z’}, with |W| ≤ |T| ≤ 1,000,000.
Output
For every test case in the input file, the output should contain a single number, on a single line: the number of occurrences of the word W in the text T.
Sample Input
3
BAPC
BAPC
AZA
AZAZAZA
VERDI
AVERDXIVYERDIAN
Sample Output
1
3
0

```cpp
#include <cstdio>
#include <cstring>
using namespace std;
const int maxn = 10004;
int nex[maxn];
char s[maxn * 100], p[maxn]; // text T (up to 1,000,000 chars) and word W (up to 10,000 chars)

void getnext() {
    int lp = strlen(p);
    nex[0] = nex[1] = 0;
    for (int i = 1; i < lp; i++) {
        int j = nex[i];
        while (j && p[i] != p[j]) j = nex[j];
        nex[i + 1] = p[i] == p[j] ? j + 1 : 0;
    }
}

// Count occurrences of p in s; overlapping occurrences are allowed.
int kmp() {
    int ans = 0;
    int ls = strlen(s), lp = strlen(p);
    getnext();
    for (int i = 0, j = 0; i < ls; i++) {
        while (j && s[i] != p[j]) j = nex[j];
        if (s[i] == p[j]) j++;
        if (j >= lp) ans++;
    }
    return ans;
}

int main() {
    int t;
    scanf("%d", &t);
    while (t--) {
        scanf("%s%s", p, s); // the word comes first, then the text
        printf("%d\n", kmp());
    }
    return 0;
}
```

### HDU-2087 剪花布条 (Cutting Patterned Cloth)

Problem Description

Input

Output

Sample Input
abcde a3
aaaaaa aa
#
Sample Output
0
3

```cpp
#include <cstdio>
#include <cstring>
using namespace std;
const int maxn = 1003;
int nex[maxn];
char p[maxn], s[maxn];

void getnext() {
    nex[0] = nex[1] = 0;
    int lp = strlen(p);
    for (int i = 1; i < lp; i++) {
        int j = nex[i];
        while (j && p[i] != p[j]) j = nex[j];
        nex[i + 1] = p[i] == p[j] ? j + 1 : 0;
    }
}

// Count non-overlapping occurrences of p in s.
int kmp() {
    int ls = strlen(s), lp = strlen(p);
    getnext();
    int last = -1; // index where the previously counted match ends
    int ans = 0;
    for (int i = 0, j = 0; i < ls; i++) {
        while (j && s[i] != p[j]) j = nex[j];
        if (s[i] == p[j]) j++;
        if (j >= lp) {            // full match ending at i
            if (i - last >= lp) { // and it does not overlap the previous counted match
                ans++;
                last = i;
            }
        }
    }
    return ans;
}

int main() {
    while (~scanf("%s", s)) {
        if (s[0] == '#') break; // a line starting with '#' ends the input
        scanf("%s", p);
        printf("%d\n", kmp());
    }
    return 0;
}
```

### POJ-2752 Seek the Name, Seek the Fame

Description
The little cat is so famous, that many couples tramp over hill and dale to Byteland, and asked the little cat to give names to their newly-born babies. They seek the name, and at the same time seek the fame. In order to escape from such boring job, the innovative little cat works out an easy but fantastic algorithm:
Step1. Connect the father’s name and the mother’s name, to a new string S.
Step2. Find a proper prefix-suffix string of S (which is not only the prefix, but also the suffix of S).
Example: Father=‘ala’, Mother=‘la’, we have S = ‘ala’+‘la’ = ‘alala’. Potential prefix-suffix strings of S are {‘a’, ‘ala’, ‘alala’}. Given the string S, could you help the little cat to write a program to calculate the length of possible prefix-suffix strings of S? (He might thank you by giving your baby a name:)
Input
The input contains a number of test cases. Each test case occupies a single line that contains the string S described above.
Restrictions: Only lowercase letters may appear in the input. 1 <= Length of S <= 400000.
Output
For each test case, output a single line with integer numbers in increasing order, denoting the possible length of the new baby’s name.
Sample Input
ababcababababcabab
aaaaa
Sample Output
2 4 9 18
1 2 3 4 5

```cpp
#include <cstdio>
#include <cstring>
#include <vector>
using namespace std;
const int maxn = 400005;
int nex[maxn], ls;
char s[maxn];

void getnext() {
    nex[0] = nex[1] = 0;
    for (int i = 1; i < ls; i++) {
        int j = nex[i];
        while (j && s[i] != s[j])
            j = nex[j];
        nex[i + 1] = s[i] == s[j] ? j + 1 : 0;
    }
}

int main() {
    while (~scanf("%s", s)) {
        ls = strlen(s);
        getnext();
        // Walk the chain ls -> nex[ls] -> nex[nex[ls]] -> ...;
        // every non-zero value on it is the length of a prefix that is also a suffix.
        int idx = ls;
        vector<int> ans;
        while (nex[idx]) {
            ans.push_back(nex[idx]);
            idx = nex[idx];
        }
        // The chain is decreasing, so print it in reverse, then the full length.
        int len = ans.size();
        for (int i = len - 1; i >= 0; i--)
            printf("%d ", ans[i]);
        printf("%d\n", ls);
    }
    return 0;
}
```

### POJ-2406 Power Strings

Description
Given two strings a and b we define ab to be their concatenation. For example, if a = “abc” and b = “def” then ab = “abcdef”. If we think of concatenation as multiplication, exponentiation by a non-negative integer is defined in the normal way: a^0 = “” (the empty string) and a^(n+1) = a*(a^n).
Input
Each test case is a line of input representing s, a string of printable characters. The length of s will be at least 1 and will not exceed 1 million characters. A line containing a period follows the last test case.
Output
For each s you should print the largest n such that s = a^n for some string a.
Sample Input
abcd
aaaa
ababab
.
Sample Output
1
4
3
Hint
This problem has huge input, use scanf instead of cin to avoid time limit exceed.

```cpp
#include <cstdio>
#include <cstring>
using namespace std;
const int maxn = 1000006;
int nex[maxn], ls;
char s[maxn];

void getnext() {
    nex[0] = nex[1] = 0;
    for (int i = 1; i < ls; i++) {
        int j = nex[i];
        while (j && s[i] != s[j])
            j = nex[j];
        nex[i + 1] = s[i] == s[j] ? j + 1 : 0;
    }
}

int main() {
    while (~scanf("%s", s)) {
        if (s[0] == '.') break; // a line containing a period ends the input
        ls = strlen(s);
        getnext();
        int len = ls - nex[ls]; // length of the smallest period of s
        if (len && ls % len == 0)
            printf("%d\n", ls / len);
        else puts("1");
    }
    return 0;
}
```
