Eigen Substring
Problem Description
For any string $s$, we define the substring $s[l..r]$ to be an eigen substring of $s$ if and only if $s[l..r]$ appears only once in $s$.
You are given a string $s$. Please calculate the length of the shortest eigen substring for each prefix of $s$.
The notation $s[l..r]$ represents the substring of $s$ that spans from the $l$-th character in $s$ to the $r$-th character in $s$, inclusive.
Input
The input contains two lines.
The first line contains an integer $n$ ($1 \le n \le 10^6$), which is the length of string $s$.
The second line contains the string $s$. It’s guaranteed that all characters in $s$ are lowercase English letters.
Output
Your output should contain $n$ lines. The $i$-th line of your output should contain one integer which denotes the length of the shortest eigen substring of $s[1..i]$.
Sample Input
5
ababb
Sample Output
1
1
1
2
2
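The definition and the sample can be checked with a brute-force reference. The sketch below is only a cross-check for small inputs, not the intended solution; `count_occurrences` and `brute_force_answers` are hypothetical helper names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Count occurrences of pattern t inside text s (overlapping ones included).
static int count_occurrences(const std::string& s, const std::string& t) {
    int count = 0;
    for (std::size_t pos = s.find(t); pos != std::string::npos;
         pos = s.find(t, pos + 1))
        ++count;
    return count;
}

// For every prefix s[1..i], return the length of its shortest substring
// that occurs exactly once in that prefix. Polynomial time: tiny inputs only.
std::vector<int> brute_force_answers(const std::string& s) {
    std::vector<int> answers;
    for (std::size_t i = 1; i <= s.size(); ++i) {
        const std::string prefix = s.substr(0, i);
        int best = 0;
        // Try lengths in increasing order and stop at the first unique one.
        for (std::size_t len = 1; len <= i && best == 0; ++len) {
            for (std::size_t l = 0; l + len <= i; ++l) {
                if (count_occurrences(prefix, prefix.substr(l, len)) == 1) {
                    best = static_cast<int>(len);
                    break;
                }
            }
        }
        assert(best != 0);  // the whole prefix always occurs exactly once
        answers.push_back(best);
    }
    return answers;
}
```

Calling `brute_force_answers("ababb")` yields the five values of the sample output, which makes it a convenient oracle when testing a faster solution on short random strings.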
Problem Summary
For any string $S$, a substring $S[l..r]$ that occurs exactly once in $S$ is called an eigen substring. Given a string $S$, find the length of the shortest eigen substring of every prefix of $S$.
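Approach
A suffix automaton can track how many times each substring occurs. Since we need to query the minimum while inserting and deleting elements, we use a set to hold the lengths of the substrings that occur exactly once.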
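As a minimal sketch of that container choice, assuming entries are stored as (length, node index) pairs: `std::set` keeps pairs sorted by their first component, so the smallest length is always at `begin()`, and both insertion and deletion take logarithmic time. The wrapper name `CandidateSet` and its methods are hypothetical, introduced only for this illustration.

```cpp
#include <cassert>
#include <set>
#include <utility>

// Hypothetical wrapper around std::set<std::pair<int, int>>.
// first  = length of the shortest substring represented by a node,
// second = node index, which keeps entries of equal length distinct.
struct CandidateSet {
    std::set<std::pair<int, int>> items;

    void add(int length, int node) {
        items.insert(std::make_pair(length, node));
    }

    void remove(int length, int node) {
        items.erase(std::make_pair(length, node));
    }

    // Smallest stored length; pairs are ordered by length first.
    int min_length() const {
        assert(!items.empty());
        return items.begin()->first;
    }
};
```

Storing the node index next to the length is what lets one specific entry be removed even when several nodes share the same length.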
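Each inserted character creates one or two new nodes. Starting from the new node, walk along the link pointers. For each node st reached: if st is visited for the first time, add it to the set; if st is visited for the second time, remove it from the set; otherwise stop. A node st is stored under the key len(link(st)) + 1, which is the length of the shortest substring it represents.
After each insertion, output the first element of the set.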
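The key len(link(st)) + 1 can be sanity-checked offline. The sketch below is not the online method described above: it builds a suffix automaton for one whole string, counts occurrences by propagating them along the links, and takes the minimum key over nodes that occur exactly once. It therefore answers only for the full string. The names `State` and `shortest_unique_length` are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <numeric>
#include <string>
#include <vector>

struct State {
    int link = -1, len = 0;
    long long occ = 0;            // number of end positions of this state
    std::array<int, 26> next;
    State() { next.fill(-1); }
};

// Length of the shortest substring of s (lowercase letters) occurring once.
int shortest_unique_length(const std::string& s) {
    assert(!s.empty());
    std::vector<State> st(1);     // state 0 is the root
    st.reserve(2 * s.size() + 1);
    int last = 0;
    for (char ch : s) {
        const int c = ch - 'a';
        const int cur = static_cast<int>(st.size());
        st.emplace_back();
        st[cur].len = st[last].len + 1;
        st[cur].occ = 1;          // a non-clone state owns one end position
        int p = last;
        for (; p != -1 && st[p].next[c] == -1; p = st[p].link) st[p].next[c] = cur;
        if (p == -1) {
            st[cur].link = 0;
        } else {
            const int q = st[p].next[c];
            if (st[p].len + 1 == st[q].len) {
                st[cur].link = q;
            } else {
                const int clone = static_cast<int>(st.size());
                State copy = st[q];           // copy transitions and link of q
                copy.len = st[p].len + 1;
                copy.occ = 0;                 // clones own no end position
                st.push_back(copy);
                for (; p != -1 && st[p].next[c] == q; p = st[p].link) st[p].next[c] = clone;
                st[q].link = st[cur].link = clone;
            }
        }
        last = cur;
    }
    // Propagate occurrence counts from longer states to their links.
    std::vector<int> order(st.size());
    std::iota(order.begin(), order.end(), 0);
    std::sort(order.begin(), order.end(),
              [&](int a, int b) { return st[a].len > st[b].len; });
    for (int v : order)
        if (st[v].link != -1) st[st[v].link].occ += st[v].occ;
    int best = static_cast<int>(s.size());
    for (std::size_t v = 1; v < st.size(); ++v)
        if (st[v].occ == 1) best = std::min(best, st[st[v].link].len + 1);
    return best;
}
```

Running it on every prefix separately would be far too slow for the real limits, which is why the solution maintains the set incrementally instead.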
#include<cstdio>
#include<iostream>
#include<cstdlib>
#include<cmath>
#include<algorithm>
#include<cstring>
#include<map>
#include<vector>
#include<set>
#include<iterator>
#define dbg(x) cout<<#x<<" = "<<x<<endl;
#define INF 0x3f3f3f3f
#define LLINF 0x3f3f3f3f3f3f3f3f
#define eps 1e-6
using namespace std;
typedef long long LL;
typedef pair<int, int> P;
const int maxn = 1001000;
const int mod = 998244353;
struct node{
    int link, len, nex[28];
}st[maxn*2];
// tim[v] == 1 means node v is currently in the set.
int cnt, last, tim[maxn*2];
char str[maxn];
// To tell apart substrings of equal length, the set stores pairs:
// first is the length, second is the node index.
set<P> sett;
void init();
void sam(int x);
int main()
{
    int n;
    init();
    scanf("%d %s", &n, str);
    for(int i=0;i<n;i++){
        sam(str[i]-'a');
        // The smallest pair in the set holds the answer for this prefix.
        printf("%d\n", (sett.begin())->first);
    }
    return 0;
}
void init()
{
    cnt = 0, last = 0;
    // Node 0 is the root; a transition value of 0 means "no transition".
    memset(st[0].nex, 0, sizeof(st[0].nex));
    st[0].link = -1;
    st[0].len = 0;
}
void sam(int x)
{
    int p, q, cur = ++cnt;
    tim[cur] = 1;
    st[cur].len = st[last].len+1;
    for(p=last;p!=-1 && !st[p].nex[x];p=st[p].link)
        st[p].nex[x] = cur;
    if(p == -1)
        st[cur].link = 0;
    else{
        q = st[p].nex[x];
        if(st[p].len+1 == st[q].len){
            st[cur].link = q;
            // The new node links to q; if q is in the set, q is now visited
            // a second time, so remove q from the set.
            if(tim[q]) tim[q] = 0, sett.erase(P(st[st[q].link].len+1, q));
        }
        else{
            int clo = ++cnt;
            st[clo] = st[q];
            st[clo].len = st[p].len+1;
            // q's link changes; if q is in the set, its entry must be updated.
            if(tim[q]) sett.erase(P(st[st[q].link].len+1, q));
            for(;p!=-1 && st[p].nex[x] == q;p=st[p].link)
                st[p].nex[x] = clo;
            st[cur].link = st[q].link = clo;
            if(tim[q]) sett.insert(P(st[st[q].link].len+1, q));
        }
    }
    // A new node is always visited for the first time, so add it to the set.
    sett.insert(P(st[st[cur].link].len+1, cur));
    last = cur;
}