Oulipo
Time Limit: 3000/1000 MS (Java/Others)
Memory Limit: 32768/32768 K (Java/Others)
Total Submission(s): 4002
Accepted Submission(s): 1579
Problem Description
The French author Georges Perec (1936–1982) once wrote a book, La disparition, without the letter 'e'. He was a member of the Oulipo group. A quote from the book:
Tout avait l’air normal, mais tout s’affirmait faux. Tout avait l’air normal, d’abord, puis surgissait l’inhumain, l’affolant. Il aurait voulu savoir où s’articulait l’association qui l’unissait au roman : sur son tapis, assaillant à tout instant son imagination, l’intuition d’un tabou, la vision d’un mal obscur, d’un quoi vacant, d’un non-dit : la vision, l’avision d’un oubli commandant tout, où s’abolissait la raison : tout avait l’air normal mais…
Perec would probably have scored high (or rather, low) in the following contest. People are asked to write a perhaps even meaningful text on some subject with as few occurrences of a given “word” as possible. Our task is to provide the jury with a program that counts these occurrences, in order to obtain a ranking of the competitors. These competitors often write very long texts with nonsense meaning; a sequence of 500,000 consecutive 'T's is not unusual. And they never use spaces.
So we want to quickly find out how often a word, i.e., a given string, occurs in a text. More formally: given the alphabet {'A', 'B', 'C', …, 'Z'} and two finite strings over that alphabet, a word W and a text T, count the number of occurrences of W in T. All the consecutive characters of W must exactly match consecutive characters of T. Occurrences may overlap.
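To pin down what "occurrences may overlap" means, here is a minimal brute-force sketch. It is not part of the original solution: the names countNaive and checkNaiveOnSamples are made up for illustration, and the checks use the sample data from the problem statement. It tries every start position, so its worst case is O(|W| * |T|), which is likely too slow for the stated limits. It is only meant as a reference to compare a faster solution against.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not from the original post): counts occurrences of
// word w in text t by testing every start position. Overlapping occurrences
// are counted because start advances by one character each time.
int countNaive(const std::string& w, const std::string& t)
{
    if (w.empty() || w.size() > t.size()) return 0;
    int count = 0;
    for (std::size_t start = 0; start + w.size() <= t.size(); ++start)
    {
        // compare(pos, len, str) returns 0 when t[start, start+len) equals w
        if (t.compare(start, w.size(), w) == 0) ++count;
    }
    return count;
}

// Checks the helper against the sample input/output of the problem.
void checkNaiveOnSamples()
{
    assert(countNaive("BAPC", "BAPC") == 1);
    assert(countNaive("AZA", "AZAZAZA") == 3);   // matches start at 0, 2 and 4
    assert(countNaive("VERDI", "AVERDXIVYERDIAN") == 0);
}
```

Calling checkNaiveOnSamples() from a main function runs the three sample cases.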
Input
The first line of the input file contains a single number: the number of test cases to follow. Each test case has the following format:
One line with the word W, a string over {'A', 'B', 'C', …, 'Z'}, with 1 ≤ |W| ≤ 10,000 (here |W| denotes the length of the string W).
One line with the text T, a string over {'A', 'B', 'C', …, 'Z'}, with |W| ≤ |T| ≤ 1,000,000.
Output
For every test case in the input file, the output should contain a single number, on a single line: the number of occurrences of the word W in the text T.
Sample Input
3
BAPC
BAPC
AZA
AZAZAZA
VERDI
AVERDXIVYERDIAN
Sample Output
1
3
0
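When I saw this problem during a contest I grinned, because I really do know KMP. Then I stopped grinning, because I could not work out how to apply it here. It reminded me of hdu 2087 (剪花布条, the cloth-cutting problem), which I had solved earlier. The two problems are very similar, except that in hdu 2087 characters of the text cannot be reused, so matches must not overlap.
Building the next array is identical in both problems. The difference lies in the matching loop, and specifically in when j is reset to 0, that is, when comparison restarts from the beginning of the short string. In hdu 1686, j becomes 0 only after the fallback has reached j == -1, meaning nothing matches at the current position. In the cloth-cutting problem, because characters cannot be reused, the reset to 0 happens once the short string has been matched completely and sum++ has run.
See my blog post on hdu 2087 for that problem.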
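The contrast between the two problems can be sketched in one function. This is an illustration and not the code from either solution: buildNext, countKmp and the allowOverlap flag are hypothetical names. The table uses the same convention as the next array in this post (next[0] = -1). The only line that differs between the two counting modes is the one that runs after a full match.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: failure table with nxt[0] = -1, where nxt[j] is the
// length of the longest proper prefix of w[0..j-1] that is also its suffix.
std::vector<int> buildNext(const std::string& w)
{
    std::vector<int> nxt(w.size() + 1);
    nxt[0] = -1;
    int k = -1;
    std::size_t j = 0;
    while (j < w.size())
    {
        if (k == -1 || w[k] == w[j]) { ++k; ++j; nxt[j] = k; }
        else k = nxt[k];
    }
    return nxt;
}

// Hypothetical helper: counts matches of w in t.
// allowOverlap = true  -> overlapping matches count (the hdu 1686 rule).
// allowOverlap = false -> matched characters are not reused (the hdu 2087 rule).
int countKmp(const std::string& w, const std::string& t, bool allowOverlap)
{
    if (w.empty()) return 0;
    const std::vector<int> nxt = buildNext(w);
    const int m = static_cast<int>(w.size());
    int j = 0;      // number of characters of w currently matched
    int count = 0;
    for (std::size_t i = 0; i < t.size(); ++i)
    {
        // Fall back until t[i] extends the match or nothing is left (j == -1).
        while (j != -1 && t[i] != w[j]) j = nxt[j];
        ++j;        // -1 becomes 0; otherwise the match grows by one
        if (j == m)
        {
            ++count;
            // The single point where the two problems differ:
            j = allowOverlap ? nxt[m] : 0;
        }
    }
    return count;
}
```

For example, countKmp("AZA", "AZAZAZA", true) counts the matches starting at positions 0, 2 and 4, while passing false skips the one at position 2 because it reuses the A at position 2.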
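#include <stdio.h>
#include <string.h>
// No using-directive here: with "using namespace std", the global array
// next declared below can be ambiguous with std::next on C++11 compilers.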
int i,j,k,t;
char s1[10010],s2[1000010];
int next[10010];
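/*
When a mismatch occurs, the new value of j is next[j].
It depends on the pattern prefix s1[0 .. j-1]:
next[j] is exactly the length of the longest proper prefix
of that part which is also its suffix.
*/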
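void getNext(int len)
{
int j,k;
next[0]=-1; // sentinel: nothing to fall back to from the first character
j=0;
k=-1;
while(j<len)
{
// k is a candidate border length of s1[0..j-1]; -1 means none is left to try
if(k==-1||s1[k]==s1[j])
{
k++;
j++;
next[j]=k; // the border grows by one character, or restarts at length 0
}
else
k=next[k]; // try the next shorter border
}
}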
int main()
{
/* Earlier hdu 2087 version (non-overlapping count), kept for comparison:
while(scanf("%s",s1))
{
if(s1[0]=='#')
break;
scanf("%s",s2);
la=strlen(s1);
lb=strlen(s2);
getNext();
int sum=0;
if(la<lb)
printf("0\n");
else
{
int i=0;
int j=0;
while(i<la)
{
if(j==-1||s1[i]==s2[j])
{
i++;
j++;
}
else
j=next[j];
if(j==lb)
{
j=0;
sum++;
}
}
printf("%d\n",sum);
}
}
return 0;
*/
scanf("%d",&t);
while(t--)
{
scanf("%s",s1);
scanf("%s",s2);
int i,j,k;
i=0;
j=0;
k=0;
int la=strlen(s2);
int lb=strlen(s1);
getNext(lb);
int sum=0;
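while(i<la)
{
if(s2[i]==s1[j])
{
// characters match: advance in both the text and the word
i++;
j++;
}
else
{
// mismatch: fall back to the longest border of s1[0..j-1]
j=next[j];
if(j==-1)
{
// nothing matches at this text position: move on and restart the word
i++;
j=0;
}
}
if(j==lb)
{
// full match: count it, then fall back so overlapping matches are still found
sum++;
j=next[j];
}
}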
printf("%d\n",sum);
}
return 0;
}
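The comment above getNext defines next[j] as the length of the longest prefix of the first j pattern characters that is also a suffix of them. To make that definition concrete, here is a small brute-force sketch that computes the table straight from the definition. The names longestBorder and nextByDefinition are hypothetical, and the code is far slower than getNext. It is only useful for checking a table by hand or in a test.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: length of the longest proper prefix of s that is
// also a suffix of s, found by trying every length from longest to shortest.
int longestBorder(const std::string& s)
{
    if (s.empty()) return 0;
    for (std::size_t len = s.size() - 1; len > 0; --len)
    {
        // Compare the first len characters with the last len characters.
        if (s.compare(0, len, s, s.size() - len, len) == 0)
            return static_cast<int>(len);
    }
    return 0;
}

// Hypothetical helper: builds the table with the convention used in this
// post, i.e. nxt[0] = -1 and nxt[j] = longest border of w[0..j-1].
std::vector<int> nextByDefinition(const std::string& w)
{
    std::vector<int> nxt(w.size() + 1);
    nxt[0] = -1;
    for (std::size_t j = 1; j <= w.size(); ++j)
        nxt[j] = longestBorder(w.substr(0, j));
    return nxt;
}
```

For the sample word AZA this gives -1, 0, 0, 1. The final 1 is what lets the matching loop keep the trailing A of one match as the first character of the next.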