CSUOJ 2294 Hidden Anagrams (String Hash, Hash Table)

就算过了一载春秋

Categories: String Algorithms, ACM. Tags: Hash, hash table

Article link: https://blog.csdn.net/qq_40889820/article/details/88985490

Description

An anagram is a word or a phrase that is formed by rearranging the letters of another. For instance, by rearranging the letters of “William Shakespeare,” we can have its anagrams “I am a weakish speller,” “I’ll make a wise phrase,” and so on. Note that when A is an anagram of B, B is an anagram of A. In the above examples, differences in letter cases are ignored, and word spaces and punctuation symbols are freely inserted and/or removed. These rules are common but not applied here; only exact matching of the letters is considered.

For two strings s1 and s2 of letters, if a substring s1′ of s1 is an anagram of a substring s2′ of s2, we call s1′ a hidden anagram of the two strings, s1 and s2. Of course, s2′ is also a hidden anagram of them. Your task is to write a program that, for given two strings, computes the length of the longest hidden anagrams of them.

Suppose, for instance, that “anagram” and “grandmother” are given. Their substrings “nagr” and “gran” are hidden anagrams since by moving letters you can have one from the other. They are the longest since any substrings of “grandmother” of lengths five or more must contain “d” or “o” that “anagram” does not. In this case, therefore, the length of the longest hidden anagrams is four. Note that a substring must be a sequence of letters occurring consecutively in the original string and so “nagrm” and “granm” are not hidden anagrams.

Input

The input consists of a single test case in two lines.
s1
s2
s1 and s2 are strings consisting of lowercase letters (a through z) and their lengths are between 1 and 4000, inclusive.

Output

Output the length of the longest hidden anagrams of s1 and s2. If there are no hidden anagrams, print a zero.

Sample Input

anagram
grandmother

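Sample Output

4

Given two strings of lowercase letters, each of length at most 4000, find the greatest length of a substring of one that has the same letter composition as a substring of the other. In the sample, the matching substrings of anagram and grandmother are nagr and gran, respectively.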

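The time limit is 10 s, so brute force plus string hashing comes to mind naturally. During the multi-university training contest, however, I hashed the wrong way and got TLE.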

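The substrings in this problem only need the same letter composition, and the order of the letters may differ. An ordinary string hash (see POJ3461 && LOJ#10033 Oulipo (String HASH)) is therefore awkward here: every substring has to be sorted before its hash is computed, and the rolling-hash optimization does not carry over well.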
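To make the cost of the sort-based idea concrete, here is a minimal brute-force sketch. The function name `bruteForceLongest` is hypothetical and does not appear in the original code. It canonicalizes every substring by sorting its letters, so it is only practical for short strings, but it can serve as a reference checker on small inputs.

```cpp
#include <algorithm>
#include <set>
#include <string>

// Hypothetical helper: brute-force reference, practical only for short inputs.
// Two substrings are anagrams exactly when their sorted letters are equal.
int bruteForceLongest(const std::string& s, const std::string& t) {
    for (std::size_t len = std::min(s.size(), t.size()); len >= 1; --len) {
        std::set<std::string> seen;
        // Collect the sorted form of every length-len substring of s.
        for (std::size_t i = 0; i + len <= s.size(); ++i) {
            std::string key = s.substr(i, len);
            std::sort(key.begin(), key.end());  // O(len log len) per substring
            seen.insert(key);
        }
        // Look up the sorted form of every length-len substring of t.
        for (std::size_t i = 0; i + len <= t.size(); ++i) {
            std::string key = t.substr(i, len);
            std::sort(key.begin(), key.end());
            if (seen.count(key)) return static_cast<int>(len);
        }
    }
    return 0;  // no common letter at all
}
```

For the sample, `bruteForceLongest("anagram", "grandmother")` returns 4, matching the problem statement. Each substring is copied and sorted from scratch, which is the per-substring cost that the count-based hash avoids.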

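Keep an array in which num[ch-'a'+1] holds the number of occurrences of the character ch (ch>='a'&&ch<='z') in the current substring. The hash function can then be defined as $H(S)=\sum_{i=1}^{26} num[i]\cdot i\cdot Base^{26-i}$, which is evaluated in O(1) time per substring (a fixed 26 steps once the counts are maintained) and neatly sidesteps the question of character order.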
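The following is a small sketch of that hash on its own, assuming 0-based counts (index 0 for 'a') instead of the 1-based num[] above. The names `kBase`, `countLetters` and `countHash` are illustrative and not part of the original code. The sum is evaluated by Horner's rule, and unsigned 64-bit wraparound acts as an implicit modulus of 2^64.

```cpp
#include <array>
#include <cstdint>
#include <string>

// Illustrative constant with the same value as Base in the listings (1e9+7).
constexpr std::uint64_t kBase = 1000000007ULL;

// Count each lowercase letter of s; index 0 is 'a', index 25 is 'z'.
std::array<int, 26> countLetters(const std::string& s) {
    std::array<int, 26> cnt{};
    for (char ch : s) ++cnt[ch - 'a'];
    return cnt;
}

// Horner evaluation of sum_{i=1..26} num[i] * i * Base^(26-i).
// Overflow of the unsigned 64-bit value wraps, i.e. works modulo 2^64.
std::uint64_t countHash(const std::array<int, 26>& cnt) {
    std::uint64_t h = 0;
    for (int i = 0; i < 26; ++i)
        h = h * kBase + static_cast<std::uint64_t>(cnt[i]) * (i + 1);
    return h;
}
```

Because the hash depends only on the letter counts, `countHash(countLetters("nagr"))` and `countHash(countLetters("gran"))` are equal by construction. Distinct count vectors could in principle collide, so a match is a very likely rather than a proven anagram.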

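The answer is at most LL, the length of the shorter string, so the search starts at LL and works downward. For each length, insert the hash of every substring of that length from one string into a set, then compute the hash of every substring of that length from the other string; if one of them is found in the set, that length is the answer.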

AC code using set:

// CSDN blog: https://blog.csdn.net/qq_40889820
#include<iostream>
#include<sstream>
#include<fstream>
#include<algorithm>
#include<string>
#include<cstring>
#include<iomanip>
#include<vector>
#include<cmath>
#include<ctime>
#include<stack>
#include<queue>
#include<map>
#include<set>
#define mem(a,b) memset(a,b,sizeof(a))
#define random(a,b) (rand()%(b-a+1)+a)
#define ull unsigned long long
#define e 2.71828182
#define Pi 3.141592654
using namespace std;
const int MAXN=2e5+9;
const int Base=1e9+7;
static char S[MAXN],T[MAXN];
ull htmp; 
int num[30];
int LS,LT,LL;
set<ull> my;// hash values of the substrings of S for the current length
ull fun()// hash of the current letter counts in num[]
{
	ull ans=0;
	for(int i=1;i<=26;++i)
		ans=ans*Base+num[i]*i;
	return ans;
}

int main()
{
	ios::sync_with_stdio(false);
	cin.tie(0);cout.tie(0);
	cin>>S+1>>T+1;
	LS=strlen(S+1);
	LT=strlen(T+1);
	LL=min(LS,LT);
	
	for(int i=LL;i>=1;--i)// candidate length, longest first
	{
		mem(num,0);
		my.clear();
		for(int k=1;k<=i-1;++k)  num[S[k]-'a'+1]++;
		for(int j=i;j<=LS;++j)// j is the end index of the window
		{
			num[S[j]-'a'+1]++;
			htmp=fun();
			my.insert(htmp);
			num[S[j-i+1]-'a'+1]--; 
		}
		
		mem(num,0);
		for(int k=1;k<=i-1;++k) num[T[k]-'a'+1]++;
		for(int j=i;j<=LT;++j)// j is the end index of the window
		{
			num[T[j]-'a'+1]++;
			htmp=fun();
			if(my.count(htmp))// found a match
			{
				cout<<i;
				return 0;
			}
			num[T[j-i+1]-'a'+1]--; 
		} 
		
	}
	cout<<"0";
	return 0;
}

/**********************************************************************
	Problem: 2294
	User: wz1823636309
	Language: C++
	Result: AC
	Time:3124 ms
	Memory:2700 kb
**********************************************************************/

A set lookup costs O(log n); a hash table can be used instead. If that version times out, try adjusting the mod value.
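As one possible form of the hash-table idea, the sketch below swaps the ordered set for the standard `std::unordered_set`, which gives expected O(1) lookups without a hand-written table. It is not the code that was submitted, and `longestHiddenAnagram` is a hypothetical name; the indices are 0-based here, unlike the 1-based listings.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>
#include <string>
#include <unordered_set>

// Hypothetical function: same search order as the set-based solution,
// with std::unordered_set holding the hashes for the current length.
int longestHiddenAnagram(const std::string& s, const std::string& t) {
    const std::uint64_t base = 1000000007ULL;  // same value as Base

    // Hash of the letter counts, wrapping modulo 2^64.
    auto hashCounts = [&](const std::array<int, 26>& cnt) {
        std::uint64_t h = 0;
        for (int i = 0; i < 26; ++i)
            h = h * base + static_cast<std::uint64_t>(cnt[i]) * (i + 1);
        return h;
    };

    // Call visit(hash) for every window of length len in str, updating the
    // counts incrementally; stop early once visit returns true.
    auto scan = [&](const std::string& str, std::size_t len, auto&& visit) {
        std::array<int, 26> cnt{};
        for (std::size_t j = 0; j < str.size(); ++j) {
            ++cnt[str[j] - 'a'];
            if (j + 1 < len) continue;       // window not full yet
            if (visit(hashCounts(cnt))) return true;
            --cnt[str[j + 1 - len] - 'a'];   // drop the leftmost letter
        }
        return false;
    };

    std::unordered_set<std::uint64_t> seen;
    for (std::size_t len = std::min(s.size(), t.size()); len >= 1; --len) {
        seen.clear();
        scan(s, len, [&](std::uint64_t h) { seen.insert(h); return false; });
        if (scan(t, len, [&](std::uint64_t h) { return seen.count(h) > 0; }))
            return static_cast<int>(len);
    }
    return 0;
}
```

Clearing the set for each length keeps at most one length's worth of hashes in memory at a time. Whether this is faster in practice than the ordered set depends on the judge and the input, so treat it as something to measure rather than a guaranteed speedup.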

AC code using vectors as an adjacency-list style hash table:

// CSDN blog: https://blog.csdn.net/qq_40889820
#include<iostream>
#include<sstream>
#include<fstream>
#include<algorithm>
#include<string>
#include<cstring>
#include<iomanip>
#include<vector>
#include<cmath>
#include<ctime>
#include<stack>
#include<queue>
#include<map>
#include<set>
#define mem(a,b) memset(a,b,sizeof(a))
#define random(a,b) (rand()%(b-a+1)+a)
#define ull unsigned long long
#define e 2.71828182
#define Pi 3.141592654
using namespace std;
const int MAXN=2e5+9;
const int Base=1e9+7;
const int mod=2e5+7;
static char S[MAXN],T[MAXN];
ull htmp; 
int num[30];
int LS,LT,LL;
vector<ull> HashTable[MAXN];
void add_edge(ull h)
{
	int pos=h%mod;
	HashTable[pos].push_back(h);
}
bool find(ull h)
{
	int pos=h%mod;
	for(int i=0;i<HashTable[pos].size();++i)
		if(h==HashTable[pos][i]) return true;	
	return false;
}
ull fun()
{
	ull ans=0;
	for(int i=1;i<=26;++i)
		ans=ans*Base+num[i]*i;
	return ans;
}
inline void initial_hash()
{
	for(int i=LL;i>=1;--i)// every length, so all hashes of S go into the table up front
	{
		mem(num,0);
		for(int k=1;k<=i-1;++k) num[S[k]-'a'+1]++;
		
		for(int j=i;j<=LS;++j)// j is the end index of the window
		{
			num[S[j]-'a'+1]++; 
			htmp=fun();
			add_edge(htmp);
			num[S[j-i+1]-'a'+1]--;
		}
	}
}
int main()
{
	ios::sync_with_stdio(false);
	cin.tie(0);cout.tie(0);
	cin>>S+1>>T+1;
	//scanf("%s%s",S+1,T+1);
	LS=strlen(S+1);
	LT=strlen(T+1);
	LL=min(LS,LT);
	
	initial_hash();
	
	bool flag=false;
	
	for(int i=LL;i>=1;--i)// candidate length, longest first
	{
		mem(num,0);
		for(int k=1;k<=i-1;++k)  num[T[k]-'a'+1]++;
		for(int j=i;j<=LT;++j)// j is the end index of the window
		{
			num[T[j]-'a'+1]++;
			htmp=fun();
			if(find(htmp))
			{
				flag=true;cout<<i;
				return 0;
			}
			num[T[j-i+1]-'a'+1]--; 
		}
	}
	if(!flag) cout<<'0';
	return 0;
}

/**********************************************************************
	Problem: 2294
	User: wz1823636309
	Language: C++
	Result: AC
	Time:4924 ms
	Memory:115232 kb
**********************************************************************/

References:
https://www.cnblogs.com/caomingpei/p/9637396.html
https://blog.csdn.net/albertluf/article/details/79522958
