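[HDU6153, 2017 China Collegiate Programming Contest - Online Qualifier, Problem D] [KMP or Extended KMP] A Secret: sum of the lengths of pattern prefixes occurring in the text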

A Secret

Time Limit: 2000/1000 MS (Java/Others)    Memory Limit: 256000/256000 K (Java/Others)
Total Submission(s): 796    Accepted Submission(s): 311


Problem Description
Today is the birthday of SF, so VS gives two strings S1, S2 to SF as a present, which have a big secret. SF is interested in this secret and asks VS how to get it. Here is what VS tells him:
  Suffix(S2,i) = S2[i...len]. Ni is the number of times that Suffix(S2,i) occurs in S1, and Li is the length of Suffix(S2,i). Then the secret is the sum of the products of Ni and Li.
  Now SF wants you to help him find the secret. The answer may be very large, so the answer should be taken mod 1000000007.
 

Input
Input contains multiple cases.
  The first line contains an integer T, the number of cases. Then T cases follow.
  Each test case contains two lines. The first line contains a string S1. The second line contains a string S2.
  1<=T<=10. 1<=|S1|,|S2|<=1e6. S1 and S2 consist only of lowercase and uppercase letters.
 

Output
For each test case, output a single line containing an integer, the answer for the test case.
  The answer may be very large, so the answer should be taken mod 1e9+7.
 

Sample Input
  
  
2
aaaaa
aa
abababab
aba
 

Sample Output
  
  
13
19
Hint
case 2: Suffix(S2,1) = "aba", Suffix(S2,2) = "ba", Suffix(S2,3) = "a". N1 = 3, N2 = 3, N3 = 4. L1 = 3, L2 = 2, L3 = 1. ans = (3*3+3*2+4*1)%1000000007.
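
The definition above can be checked directly on small inputs. The following is a minimal brute-force sketch, not the intended solution: the function name `secretBrute` is hypothetical, and it assumes occurrences may overlap (which is what the sample values imply). It runs in roughly O(|S1| * |S2|^2) time, so it is only useful for verifying a fast solution on tiny cases.

```cpp
#include <string>

// Brute-force reference (hypothetical helper, for tiny inputs only):
// for every suffix of s2, count its possibly overlapping occurrences in s1
// and add count * length, modulo 1e9+7.
long long secretBrute(const std::string& s1, const std::string& s2) {
    const long long MOD = 1000000007LL;
    long long ans = 0;
    for (std::size_t i = 0; i < s2.size(); ++i) {
        const std::string suf = s2.substr(i);  // Suffix(S2, i + 1) in the 1-based statement
        long long cnt = 0;
        // Advance by one position after each hit so overlapping matches are counted.
        for (std::size_t pos = s1.find(suf); pos != std::string::npos;
             pos = s1.find(suf, pos + 1)) {
            ++cnt;
        }
        ans = (ans + cnt % MOD * static_cast<long long>(suf.size())) % MOD;
    }
    return ans;
}
```

On the two sample cases, `secretBrute("aaaaa", "aa")` gives 13 and `secretBrute("abababab", "aba")` gives 19, matching the sample output.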
 


#include<stdio.h>
#include<iostream>
#include<string.h>
#include<string>
#include<ctype.h>
#include<math.h>
#include<set>
#include<map>
#include<vector>
#include<queue>
#include<bitset>
#include<algorithm>
#include<time.h>
using namespace std;
void fre() { freopen("c://test//input.in", "r", stdin); freopen("c://test//output.out", "w", stdout); }
#define MS(x, y) memset(x, y, sizeof(x))
#define ls o<<1
#define rs o<<1|1
typedef long long LL;
typedef unsigned long long UL;
typedef unsigned int UI;
template <class T1, class T2>inline void gmax(T1 &a, T2 b) { if (b > a)a = b; }
template <class T1, class T2>inline void gmin(T1 &a, T2 b) { if (b < a)a = b; }
const int N = 1e6 + 10, M = 0, Z = 1e9 + 7, inf = 0x3f3f3f3f;
template <class T1, class T2>inline void gadd(T1 &a, T2 b) { a = (a + b) % Z; }
int casenum, casei;

// 0-based KMP template. Here len[i] = length of the longest prefix of b that ends at a[i].
namespace KMP0
{
	int n, m;
	char a[N], b[N];
	int nxt[N];
	int len[N];
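	// Note: this KMP template needs a sentinel at b[lenb] and a[lena] ('\0' for char strings, a special value for number arrays); otherwise you must make sure j never runs out of bounds.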
	void getnxt(char b[])
	{
		int lenb = strlen(b);
		int j = -1; nxt[0] = -1;
		for (int i = 1; i < lenb; ++i)
		{
			while (j >= 0 && b[j + 1] != b[i])j = nxt[j];
			if (b[j + 1] == b[i])++j;
			nxt[i] = j;
		}
	}
	void kmp(char a[], char b[])
	{
		int lena = strlen(a), lenb = strlen(b);
		int j = -1;
		for (int i = 0; i < lena; ++i)
		{
			while (j >= 0 && b[j + 1] != a[i])j = nxt[j];
			if (b[j + 1] == a[i])++j;
			len[i] = j + 1;
		}
	}
	int f[N];
	void solve()
	{
		scanf("%s", a); n = strlen(a);
		scanf("%s", b); m = strlen(b);
		reverse(a, a + n);
		reverse(b, b + m);
		getnxt(b);
		kmp(a, b);
		int ans = 0;

		for (int i = 0; i < m; ++i)
		{
			int pre = nxt[i] == -1 ? 0 : f[nxt[i]];
			f[i] = (pre + i + 1) % Z;
		}
		for (int i = 0; i < n; ++i)
		{
			if (len[i] > 0)gadd(ans, f[len[i] - 1]);	// skip len[i] == 0, which would read f[-1]
		}
		printf("%d\n", ans);
	}
}

// 0-based extended KMP template. Here len[i] = length of the longest prefix of b that matches a starting at a[i].
namespace EXKMP0
{
	int n, m;
	char a[N], b[N];
	int nxt[N];
	int len[N];
	// Process the pattern string: nxt[i] = longest common prefix of b and b[i..]
	void getnxt(char b[])
	{
		int lenb = strlen(b);
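		nxt[0] = lenb;	// suffix starting at index 0 is b itself
		int i; for (i = 0; i + 1 < lenb && b[i] == b[i + 1]; ++i); nxt[1] = i;	// suffix starting at index 1
		int st = 1;	// start of the match that reaches furthest to the right
		for (int i = 2; i < lenb; ++i)	// suffix starting at index i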
		{
			if (i + nxt[i - st] < st + nxt[st])nxt[i] = nxt[i - st];
			else
			{
				int j = max(0, st + nxt[st] - i);
				for (; i + j < lenb && b[i + j] == b[j]; ++j);
				nxt[i] = j;
				st = i;
			}
		}
	}

	// Process the text string: len[i] = longest common prefix of b and a[i..]
	void EKMP(char a[], char b[])
	{
		int lena = strlen(a), lenb = strlen(b);
		int i; for (i = 0; i < lena && i < lenb && a[i] == b[i]; ++i); len[0] = i;	// match starting at index 0 of a
		int st = 0;
		for (int i = 1; i < lena; ++i)	// match starting at index i of a
		{
			if (i + nxt[i - st] < st + len[st])len[i] = nxt[i - st];
			else
			{
				int j = max(0, st + len[st] - i);
				for (; i + j < lena && j < lenb && a[i + j] == b[j]; ++j);
				len[i] = j;
				st = i;
			}
		}
	}
	void solve()
	{
		scanf("%s", a); n = strlen(a);
		scanf("%s", b); m = strlen(b);
		reverse(a, a + n);
		reverse(b, b + m);
		getnxt(b);
		EKMP(a, b);
		int ans = 0;
		for (int i = 0; i < n; ++i)
		{
			gadd(ans, (1ll + len[i]) * len[i] / 2);
		}
		printf("%d\n", ans);
	}
}

int main()
{
	scanf("%d", &casenum);
	for (casei = 1; casei <= casenum; ++casei)
	{
		KMP0::solve();
		//EXKMP0::solve();
	}
	return 0;
}
/*
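[Problem]
Given two strings a and b, compute the sum, over every suffix of b, of (length of the suffix) * (number of times the suffix occurs in a).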

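[Analysis]
First reverse both a and b. Every suffix of b then becomes a prefix of the reversed b, and its number of occurrences in a is unchanged, so the task is now about prefixes of b.

With extended KMP this problem is actually simpler,
because extended KMP computes exactly this: for every starting position of the text, the longest prefix of the pattern that matches there.
If the longest match starting at position i has length len[i], the prefixes of length 1..len[i] all occur starting at i, so position i contributes 1 + 2 + ... + len[i] = len[i] * (len[i] + 1) / 2.

Here, though, we think about how to do it with plain KMP.
As a reminder, KMP computes, for every ending position of the text, the longest prefix of the pattern that ends there.
That is not directly what we need, so some extra processing is required.
After building nxt[] for the pattern, let f[l] be the contribution to the answer when the text matches a pattern prefix of length l (the code is 0-based, so this value is stored in f[l - 1]).
What does f[] actually mean?

Back to the problem: after the double reverse, we are asking which substrings of the first string are prefixes of b.
Extended KMP enumerates the left end of such a substring, and the match length gives the range of possible right ends.
KMP enumerates the right end instead. What are the matching left ends? They come from following the fail pointers: every prefix reached by climbing the fail chain from the current match also ends at this position. In other words, walk up the fail pointers to the root and add up the prefix lengths along the way; that sum is the contribution.
So the program runs this DP: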
for (int i = 0; i < m; ++i)
{
	int pre = nxt[i] == -1 ? 0 : f[nxt[i]];
	f[i] = (pre + i + 1) % Z;
}

[Time complexity && optimization]
O(n + m), where n = |a| and m = |b|

*/
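
To make the two ideas in the analysis easier to compare, here is a compact sketch of both on `std::string`. It is an illustration under stated assumptions, not the author's template: the names `secretKmp` and `secretZ` are hypothetical, the fail array is indexed by prefix length (so `f[0] = 0` and no index can go negative), and the second function swaps the two-array extended KMP for the equivalent Z-function over `b + '#' + a`, assuming `'#'` never appears in the input (the statement allows letters only).

```cpp
#include <algorithm>
#include <string>
#include <vector>

const long long MOD = 1000000007LL;

// KMP variant: f[l] = l + f[fail[l]] is the sum of the lengths of all
// pattern prefixes that end where a matched prefix of length l ends.
long long secretKmp(std::string a, std::string b) {
    std::reverse(a.begin(), a.end());  // suffixes of b become prefixes
    std::reverse(b.begin(), b.end());
    const int n = (int)a.size(), m = (int)b.size();
    std::vector<int> fail(m + 1, 0);   // fail[l] = longest proper border of b[0..l)
    std::vector<long long> f(m + 1, 0);
    for (int l = 1, k = 0; l <= m; ++l) {
        if (l > 1) {
            while (k > 0 && b[k] != b[l - 1]) k = fail[k];
            if (b[k] == b[l - 1]) ++k;
            fail[l] = k;
        }
        f[l] = (l + f[fail[l]]) % MOD;
    }
    long long ans = 0;
    for (int i = 0, k = 0; i < n; ++i) {
        if (k == m) k = fail[k];       // after a full match, fall back before extending
        while (k > 0 && b[k] != a[i]) k = fail[k];
        if (b[k] == a[i]) ++k;
        ans = (ans + f[k]) % MOD;      // k = longest pattern prefix ending at a[i]
    }
    return ans;
}

// Z-function variant: a longest match of length k starting at some position
// means prefixes of length 1..k all start there, contributing k * (k + 1) / 2.
long long secretZ(std::string a, std::string b) {
    std::reverse(a.begin(), a.end());
    std::reverse(b.begin(), b.end());
    const std::string s = b + '#' + a; // assumption: '#' does not occur in a or b
    const int len = (int)s.size(), m = (int)b.size();
    std::vector<int> z(len, 0);
    for (int i = 1, l = 0, r = 0; i < len; ++i) {
        if (i < r) z[i] = std::min(r - i, z[i - l]);
        while (i + z[i] < len && s[z[i]] == s[i + z[i]]) ++z[i];
        if (i + z[i] > r) { l = i; r = i + z[i]; }
    }
    long long ans = 0;
    for (int i = m + 1; i < len; ++i) {
        const long long k = z[i];
        ans = (ans + k * (k + 1) / 2) % MOD;
    }
    return ans;
}
```

Both functions return 13 for `("aaaaa", "aa")` and 19 for `("abababab", "aba")`, in line with the sample output. Indexing the fail array by length costs one extra slot but removes the need for a `-1` sentinel and the special case for an empty match.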

