Keyword
Problem Description
Kevin has invented a new algorithm to crypt and decrypt messages, which he thinks is unbeatable. The algorithm uses a very large key-string, out of which a keyword is found out after applying the algorithm. Then, based on this keyword, the message is easily crypted or decrypted. So, if one would try to decrypt some messages crypted with this algorithm, then knowing the keyword would be enough. Someone has found out how the keyword is computed from the large key-string, but because he is not a very experienced computer programmer, he needs your help. The key-string consists of N characters from the set {‘a’,‘b’}. The keyword is the shortest non-empty string made up of the letters ‘a’ and ‘b’, which is not contained as a contiguous substring (also called subsequence) inside the key-string. It is possible that more than one such string exists, but the algorithm is designed in such a way that any of these strings can be used as a keyword. Given the key-string, your task is to find one keyword.
Input
The first line contains the integer number N, the number of characters inside the key-string (1 <= N <= 500 000). The next line contains N characters from the set {‘a’,‘b’} representing the string.
Output
The first line of output should contain the number of characters of the keyword. The second line should contain the keyword.
Sample Input
11
aabaaabbbab
Sample Output
4
aaaa
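Problem summary
Given a string of length n consisting only of the letters 'a' and 'b', find the shortest string that does not occur in it as a substring. If there are several, output the lexicographically smallest one (the original statement accepts any of them).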
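Solution:
There are $2^m$ strings of length m, while a string of length n has at most n-m+1 distinct substrings of length m. Since $n \le 500000 < 2^{19} = 524288$, for m = 19 there are more candidate strings than substrings, so at least one string of length 19 is missing and the answer has length $\le 19$.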
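The counting argument above can be checked numerically. The following is a minimal sketch, not part of the original solution; the function name `keywordLengthBound` is hypothetical. It returns the smallest m for which $2^m$ exceeds n-m+1, which is an upper bound on the keyword length.

```cpp
#include <cstdint>

// Hypothetical helper: smallest m with 2^m > n - m + 1.
// For that m, at least one string of length m cannot be a substring,
// so the keyword is at most m characters long.
int keywordLengthBound(std::int64_t n) {
    for (int m = 1; m < 62; ++m) {
        if ((std::int64_t{1} << m) > n - m + 1) return m;
    }
    return -1; // not reached for any n that fits the problem limits
}
```

For the maximum n = 500000 this gives 19, and for the sample n = 11 it gives 4, which happens to equal the sample answer's length.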
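Hash the key-string: enumerate every left endpoint and every length up to 20, compute the hash of that substring, and mark each hash value that occurs. Then search the candidate strings and check whether each candidate's hash has been marked; the shortest unmarked candidate is the answer. Note that the hash is reduced modulo 10000003, so two different strings of length 15 or more can share a hash value. A collision can only make an absent string look present, so in principle the reported keyword could be longer than the true one.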
#include <cstdio>

const int INF = 0x3f3f3f3f;
const int maxn = 500100;
const int mod = 10000003;

int alen;                        // length of the best keyword found so far
bool vis[mod + 10];              // vis[h] is true if some substring hashes to h
char str[maxn], a[40], ans[40];  // key-string, current candidate, best keyword

void dfs(int step, int num);

int main()
{
    int len;
    alen = INF;
    if (scanf("%d %s", &len, str) != 2) return 1;

    // Mark the hash of every substring of length 1..20.
    for (int i = 0; i < len; i++) {
        int num = 0;
        for (int j = 0; j < 20 && i + j < len; j++) {
            // Base-3 hash with digits 'a' -> 1, 'b' -> 2, reduced modulo mod.
            num = (num * 3 + str[i + j] - 'a' + 1) % mod;
            vis[num] = 1;
        }
    }
    vis[0] = 1;  // the empty string (hash 0) is not a valid keyword

    dfs(0, 0);
    ans[alen] = 0;
    printf("%d\n%s\n", alen, ans);
    return 0;
}

// Enumerate candidates in lexicographic order; a[0..step-1] is the current
// candidate and num is its hash.
void dfs(int step, int num)
{
    if (step > alen || step > 20) return;
    // Unmarked hash and strictly shorter than the best so far: record it.
    if (vis[num] == 0 && step < alen) {
        for (int i = 0; i < step; i++)
            ans[i] = a[i];
        alen = step;
    }
    a[step] = 'a';
    dfs(step + 1, (num * 3 + 1) % mod);
    a[step] = 'b';
    dfs(step + 1, (num * 3 + 2) % mod);
}
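Because the keyword is at most 19 characters long, every candidate of a fixed length m can also be encoded exactly as an m-bit integer ('a' as 0, 'b' as 1), which avoids the modular hash entirely. The following is a sketch of that alternative, not the author's method; the function name `shortestAbsent` is hypothetical, and it assumes the input contains only 'a' and 'b'.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical alternative: returns the lexicographically smallest among the
// shortest strings over {a, b} that do not occur in s as a substring.
std::string shortestAbsent(const std::string& s) {
    const int n = static_cast<int>(s.size());
    for (int m = 1; m <= 20; ++m) {
        std::vector<char> seen(std::size_t{1} << m, 0);
        const unsigned mask = (1u << m) - 1;
        unsigned code = 0;
        // Slide a window of m characters, keeping its m-bit encoding in code.
        for (int i = 0; i < n; ++i) {
            code = ((code << 1) | (s[i] == 'b' ? 1u : 0u)) & mask;
            if (i + 1 >= m) seen[code] = 1;
        }
        // Ascending codes correspond to lexicographic order for fixed m.
        for (unsigned v = 0; v <= mask; ++v) {
            if (!seen[v]) {
                std::string out(m, 'a');
                for (int k = 0; k < m; ++k)
                    if ((v >> (m - 1 - k)) & 1u) out[k] = 'b';
                return out;
            }
        }
    }
    return "";  // not reached when s has at most 500000 characters
}
```

On the sample key-string `aabaaabbbab` every string of length 1 to 3 occurs, and the first missing string of length 4 is `aaaa`, matching the sample output. Each length costs one pass over the string plus a scan of $2^m$ flags, so the total work stays small for the given limits.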