Manacher Template

m0_51864047

Article link: https://blog.csdn.net/m0_51864047/article/details/119685779

String hashing
Longest palindrome via string hashing
Manacher

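String hashing

Rabbits and Rabbits

Problem description

A long, long time ago, a group of rabbits lived in a forest. One day the rabbits decided to study their own DNA sequences. We first pick a very long DNA sequence (the rabbits are aliens, so the sequence may contain any of the $26$ lowercase English letters). Each query then selects two intervals and asks: if the DNA in each interval were used to produce a rabbit, would the two rabbits be exactly identical? Two rabbits are identical only if their DNA sequences are identical.

Input description:

The first line contains a DNA string $S$.
The next line contains an integer $m$, the number of queries.
Each of the following $m$ lines contains four integers $l_1, r_1, l_2, r_2$, the two intervals of that query. Positions in the string are numbered from $1$.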
Here $1 \leq length(S),\ m \leq 1000000$.

Output description:

For each query, print one line with the result: Yes if the two rabbits are identical, otherwise No (case-sensitive).

Example 1

Input

aabbaabb
3
1 3 5 7
1 3 6 8
1 2 1 2

Output

Yes
No
Yes

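Approach

Treat the string as a positive integer written in base $131$ (a prime is used as the base), and map every prefix of the string to a number of type unsigned long long. The hash of any substring can then be derived from two prefix hashes in $O(1)$. Two substrings whose hash values are equal are treated as equal strings.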
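Written out, the prefix recurrence and the substring formula look as follows. The symbols $h_i$ (hash of the prefix $S[1..i]$), $s_i$ (numeric value of the $i$-th character), $P$ (the base) and $H(l, r)$ (hash of $S[l..r]$) are notation introduced here for illustration:

$$
h_0 = 0, \qquad h_i = h_{i-1} \cdot P + s_i \pmod{2^{64}}
$$

$$
H(l, r) = h_r - h_{l-1} \cdot P^{\,r-l+1} \pmod{2^{64}}
$$

As a small sanity check in base $10$ with the digits themselves as values: the prefix hashes of 1234 are $1, 12, 123, 1234$, and $H(3, 4) = 1234 - 12 \cdot 10^2 = 34$, which is exactly the substring at positions $3$ to $4$.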

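The most familiar kind of hash treats the key as a base-$10$ number and takes it modulo a prime. String hashing instead relies on the wrap-around of unsigned long long arithmetic, which reduces every result modulo $2^{64}$ automatically and avoids an explicit modulo operation. Because $2^{64}$ is not prime, the base has to be chosen with care: it should be a prime (in particular an odd number, such as $131$), so that it shares no factor with $2^{64}$.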

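#include<bits/stdc++.h>
using namespace std;
const int N=1e6+10,P=131; // odd prime base; an even base such as 26 degenerates modulo 2^64
int m,l1,r1,l2,r2,len;
string s;
unsigned long long h[N],p[N]; // h[i]: hash of s[1..i], p[i]: P^i, both modulo 2^64

int main(){
	ios::sync_with_stdio(false);
	cin.tie(nullptr);
	cin>>s>>m;
	len=s.size();
	s=" "+s; // shift to 1-based positions
	p[0]=1;
	for(int i=1;i<=len;i++){
		h[i]=h[i-1]*P+(s[i]-'a'+1);
		p[i]=p[i-1]*P;
	}
	while(m--){
		cin>>l1>>r1>>l2>>r2;
		// hash of s[l..r] = h[r]-h[l-1]*P^(r-l+1)
		if(h[r1]-h[l1-1]*p[r1-l1+1]==h[r2]-h[l2-1]*p[r2-l2+1])
			cout<<"Yes\n";
		else cout<<"No\n";
	}
}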

Longest palindrome via string hashing

Palindrome

Problem description

Andy the smart computer science student was attending an algorithms class when the professor asked the students a simple question, “Can you propose an efficient algorithm to find the length of the largest palindrome in a string?”
A string is said to be a palindrome if it reads the same both forwards and backwards, for example “madam” is a palindrome while “acm” is not.
The students recognized that this is a classical problem but couldn’t come up with a solution better than iterating over all substrings and checking whether they are palindrome or not, obviously this algorithm is not efficient at all, after a while Andy raised his hand and said “Okay, I’ve a better algorithm” and before he starts to explain his idea he stopped for a moment and then said “Well, I’ve an even better algorithm!”.
If you think you know Andy’s final solution then prove it! Given a string of at most $1000000$ characters find and print the length of the largest palindrome inside this string.

Input description:

Your program will be tested on at most $30$ test cases, each test case is given as a string of at most $1000000$ lowercase characters on a line by itself. The input is terminated by a line that starts with the string “END” (quotes for clarity).

Output description:

For each test case in the input print the test case number and the length of the largest palindrome.

Example 1

Input

abcbabcbabcba
abacacbaaaab
END

Output

Case 1: 13
Case 2: 6

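Approach

Precompute the string hash once over the string in forward order and once in reverse order. Enumerate every center of symmetry (either a letter or the gap between two letters), binary-search the radius around it, and use the hashes to test whether the forward and the reversed reading of the candidate agree. Binary search is valid because shrinking a palindrome around its center yields another palindrome.

Time complexity: $O(n \log n)$.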

#include<bits/stdc++.h>
using namespace std;
const int N=1e6+10,P=131;
string s; int len,ca;
// pre[i]: hash of s[1..i] read left to right
// post[i]: hash of s[i..len] read right to left
unsigned long long pre[N],post[N],p[N];

// l<r: hash of s[l..r] read left to right
// l>=r: hash of s[r..l] read right to left
unsigned long long calc(int l,int r){
	if(l<r) return pre[r]-pre[l-1]*p[r-l+1];
	else{ swap(l,r); return post[l]-post[r+1]*p[r-l+1]; }
}

int main(){
	ios::sync_with_stdio(false);
	while(cin>>s){
		if(s.compare(0,3,"END")==0) break;
		int mx=0;
		len=s.size(); s=" "+s; // shift to 1-based positions
		p[0]=1; post[len+1]=0; // clear the value left by an earlier, longer test case
		for(int i=1;i<=len;i++){
			p[i]=p[i-1]*P;
			pre[i]=pre[i-1]*P+s[i]-'a'+1;
			post[len-i+1]=post[len-i+2]*P+s[len-i+1]-'a'+1;
		}
		for(int i=1;i<=len;i++){
			// odd length, centered on s[i]: radius l covers s[i-l+1..i+l-1]
			int l=1,r=min(i,len-i+1);
			while(l<r){
				int mid=(l+r+1)/2;
				if(calc(i,i+mid-1)==calc(i,i-mid+1)) l=mid;
				else r=mid-1;
			}
			mx=max(mx,l*2-1);
			// even length, centered on the gap between s[i] and s[i+1]
			l=0,r=min(i,len-i);
			while(l<r){
				int mid=(l+r+1)/2;
				if(calc(i+1,i+mid)==calc(i,i-mid+1)) l=mid;
				else r=mid-1;
			}
			mx=max(mx,l*2);
		}
		cout<<"Case "<<++ca<<": "<<mx<<"\n";
	}
}

Manacher

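Problem description

Given a string $S$ consisting only of the lowercase English letters $\texttt a,\texttt b,\texttt c,\ldots,\texttt y,\texttt z$, find the length of the longest palindromic substring of $S$.

The length of the string is $n$.

Input format

One line containing the string $S$, made up of the lowercase English letters $\texttt a,\texttt b,\texttt c,\ldots,\texttt y,\texttt z$.

Output format

A single integer, the answer.

Sample input and output

Input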

aaa

Output

3

Notes
$1\le n\le 1.1\times 10^7$.

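Approach

First insert the character '#' between every pair of adjacent characters and at both ends, so that every palindrome in the transformed string has odd length. Then add '^' at the very front and '$' at the very end (the two characters must differ from each other and from '#'). These sentinels prevent out-of-bounds access: when the expansion reaches either end, the sentinel matches no other character, so the loop while(s[i+p[i]]==s[i-p[i]]) p[i]++; terminates. Because a C string already ends with a '\0' character, the trailing '$' can also be omitted.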

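The main idea is to keep track of the palindrome $[l, r]$ whose right boundary is furthest to the right among those found so far. Given the palindromic radii of positions $[1, i - 1]$, the radius at position $i$ is computed in one of two cases:

$i \le r$: there is a point $j$ that mirrors $i$ about the center of $[l, r]$, and the neighborhoods of $i$ and $j$ are mirror images of each other as long as they stay inside $[l, r]$. The radius $p[i]$ is therefore at least $\min(p[j], r - i + 1)$; it may be larger if the palindrome around $i$ extends beyond $r$. So initialize $p[i]$ to $\min(p[j], r - i + 1)$ and then try to extend it by brute force.
$i > r$: nothing is known about the radius at $i$, so it is computed by brute force from scratch.

After each position, update $[l, r]$ whenever the palindrome around $i$ reaches further right.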

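[Figure]

For example, for the string in the figure, the radii of the first three positions are known and $[l, r]$ is $[0, 4]$. To compute the radius $p[i]$ of position $i$, note that it is at least $p[j] = 2$; brute-force extension then starts from radius $2$ and ends with a radius of $4$, after which $[l, r]$ is updated.

[Figure]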

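Why is this algorithm, which looks full of brute force, still $O(n)$? Observe that every successful extension step of $p[i]$ moves $r$ one position to the right, and whenever $[l, r]$ is not updated the very first comparison at $i$ fails. Since $r$ only ever moves right and $i$ only ever moves right, the total number of comparisons is at most twice the length of the transformed string, which is $O(n)$.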

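#include<bits/stdc++.h>
using namespace std;
const int N=1.2e7;
char s[N<<1]; int cnt,p[N<<1],mx; // s: transformed string, p[i]: palindromic radius at i

// read the input and build ^#a#b#...#$ in s[0..cnt]
void read(){
	s[0]='^',s[cnt=1]='#'; int ch=getchar(); // int, so that EOF can be detected
	while(ch!=EOF&&(ch>'z'||ch<'a')) ch=getchar();
	while(ch>='a'&&ch<='z') s[++cnt]=ch,s[++cnt]='#',ch=getchar();
	s[++cnt]='$';
}

int main(){
	read();
	// r: right end of the palindrome reaching furthest right, mid: its center
	for(int i=1,mid=0,r=0;i<=cnt-1;i++){
		if(i<=r) p[i]=min(p[(mid<<1)-i],r-i+1); // start from the mirror position
		while(s[i+p[i]]==s[i-p[i]]) p[i]++; // brute-force extension
		if(p[i]+i-1>r) r=i+p[i]-1,mid=i;
		mx=max(mx,p[i]);
	}
	cout<<(mx-1)<<"\n"; // radius in the transformed string minus 1 = length in the original
}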
